August 22nd - 24th in Toronto, Canada
LinuxCon+ContainerCon North America 2016
Monday, August 22 • 11:45am - 12:35pm
Monitoring the Linux Kernel at Facebook - Calvin Owens, Facebook

Collecting kernel logs from a fleet of servers is an inherently difficult problem, since the messages you're interested in often result from crashes or other conditions that render any userspace collection of those logs unreliable or impossible. The traditional approach is to scrape consoles, but that becomes unworkable at large scale, especially when the server fleet is composed of many different types of commodity and specialized hardware.

At Facebook, we use netconsole to solve this problem: since it emits kernel log messages synchronously over UDP, it catches nearly all possible crashes, and it is fantastically easy to deploy and run across a diverse server fleet. We use an open-source daemon called "netconsd" to process these messages on a very large scale.
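
To make the receiving side concrete, here is a minimal sketch of a UDP listener that accepts netconsole datagrams and records which host sent each one. It is not netconsd and does not reflect how netconsd is implemented; the function names are hypothetical, and the only assumption taken from the kernel's netconsole documentation is that the default target port is 6666.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket. 6666 is netconsole's default target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_lines(sock, max_datagrams):
    """Return (sender_ip, text) pairs for up to max_datagrams datagrams."""
    results = []
    for _ in range(max_datagrams):
        # One datagram carries one chunk of kernel log output.
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
        # Kernel messages are not guaranteed to be valid UTF-8.
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        results.append((sender_ip, text))
    return results


def run_forever():
    """Print every message received, tagged with the sending host."""
    listener = open_listener()
    while True:
        for sender_ip, text in receive_lines(listener, 1):
            print(f"{sender_ip}: {text}")
```

Calling `run_forever()` on a collector host would print messages from any machine whose netconsole target points at that host and port. Because the transport is UDP, a sketch like this can silently miss datagrams; a production collector has to handle loss, reordering, and much higher message rates.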

In this talk, we'll discuss how we collect, analyze, and visualize the data from this system at Facebook. We'll briefly discuss how to set up and configure netconsole and netconsd in your own datacenter. Finally, we'll discuss various problems, errors, and crashes we've seen in production over the past year or so, how we found them, and how we fixed them.

Speakers
Calvin Owens

Production Engineer, Facebook


Monday August 22, 2016 11:45am - 12:35pm
Pier 4