The 2016 Linux Storage, Filesystem, and Memory-Management Summit
The summit ran in one to three tracks, depending on the subjects under discussion.
The plenary track included developers from all three subsystems and covered issues relevant to the kernel as a whole. The sessions from this track were:
- Standards update: what the T10 (SCSI),
T13 (ATA), and NVM Express (NVMe) standards groups have been working on.
- Persistent-memory error handling: what
should the system do when persistent memory turns out to be less
persistent than it should be?
- Bulk memory-allocation APIs as a way
of addressing networking performance bottlenecks.
- reflink() and related topics:
further development of the reflink() system call, an online
scrubber for XFS, and more in a "plenary" session that the
memory-management developers were too busy to attend.
- Filesystems and containers /
Self-encrypting drives: two lightning talks to finish out the
first day.
- Multi-order radix trees: an
enhancement to radix-tree functionality that might be useful beyond
the memory-management subsystem.
- Performance-differentiated memory: how
to cope with systems featuring memory with varying performance
characteristics.
- DAX on BTT: how to make the DAX direct-access layer play well with BTT, which is inherently a layer of indirection.
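As a rough illustration of the reflink() topic above: on Linux, reflink-style cloning is exposed to user space through the FICLONE ioctl. The sketch below is not from the session itself, just a minimal example of the interface; cloning only succeeds on filesystems that support it (such as Btrfs, and XFS once the work discussed at the summit lands), so the helper treats "not supported" as a distinct, non-fatal outcome.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

#ifndef FICLONE  /* older kernel headers may lack the definition */
#define FICLONE _IOW(0x94, 9, int)
#endif

/* Returns 1 if dst now shares src's data blocks, 0 if the filesystem
 * doesn't support cloning, -1 on other errors. */
int try_reflink(const char *src_path, const char *dst_path)
{
    int src = open(src_path, O_RDONLY);
    if (src < 0)
        return -1;
    int dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (dst < 0) {
        close(src);
        return -1;
    }
    int ret;
    if (ioctl(dst, FICLONE, src) == 0) {
        ret = 1;  /* blocks are shared; copied only on write */
    } else if (errno == EOPNOTSUPP || errno == EINVAL ||
               errno == EXDEV || errno == ENOTTY) {
        ret = 0;  /* no reflink support on this filesystem */
    } else {
        ret = -1;
    }
    close(src);
    close(dst);
    return ret;
}
```

After a successful clone, the two files share extents on disk until one of them is written, at which point the filesystem copies the affected blocks.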
The memory-management track discussed the following topics:
- Two transparent huge page cache
implementations. Transparent huge pages don't currently work with
file-backed pages, but not for lack of trying: there are
currently two working implementations to choose between. In this
session, the memory-management developers spent two hours trying to
make that decision.
- Ideas for rationalizing GFP flags: how
to better define and improve the semantics around the ubiquitous
GFP_ memory-allocation flags.
- CMA and compaction: problems with and
solutions for the kernel's mechanisms for supporting large contiguous
allocations.
- Virtual machines as containers, and,
in particular, the memory-management challenges that come with packing
a lot of virtual machines into a host.
- Partial address-space mirroring: a
hardware feature intended to improve reliability. As the discussion
showed, though, the memory-management developers were not convinced it
will work as well as advertised.
- Heterogeneous memory management:
giving GPUs and other peripherals access to process address spaces.
- Memory-management testing: how can we
find more problems before they bite users?
- Memory control group fairness: how to
get control groups to make the right decisions when faced with
problematic workloads.
- TLB flush optimization: reducing the
performance cost that comes with flushing the translation lookaside
buffer too often.
- Improving the OOM killer: this year's
episode in the perennial discussion on how the kernel should handle
running out of memory.
- Memory-management subsystem workflow: how is development working in this subsystem, and how can it be made to run more smoothly?
See also: Rik van Riel's notes for a terse summary of the memory-management sessions.
The filesystem-only track had a relatively small number of discussions, which were:
- Parallel lookups: how to support
multiple simultaneous directory lookups.
- Network filesystems, and supporting
network filesystems with case-insensitive semantics in particular.
- The xstat() system call: how
can this long-desired functionality finally make its way into the
kernel?
- Exposing extent information to user space: the best way to let applications learn about the layout of a file on disk.
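For context on the extent-information session above: the mechanism applications have today is the FIEMAP ioctl. The sketch below (the helper name is made up) shows how a program can query a file's extents with the existing interface; not every filesystem implements FIEMAP, so that case is reported as zero extents rather than a hard failure.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/* Print a file's extents; returns the number printed, 0 if the
 * filesystem doesn't implement FIEMAP, -1 on other errors. */
int print_extents(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Room for 32 extents; real code would loop, advancing fm_start. */
    size_t sz = sizeof(struct fiemap) + 32 * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;            /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;  /* flush delayed allocation first */
    fm->fm_extent_count = 32;

    int mapped;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        mapped = (errno == EOPNOTSUPP || errno == ENOTTY) ? 0 : -1;
    } else {
        mapped = fm->fm_mapped_extents;
        for (int i = 0; i < mapped; i++)
            printf("extent %d: logical %llu physical %llu length %llu\n",
                   i,
                   (unsigned long long)fm->fm_extents[i].fe_logical,
                   (unsigned long long)fm->fm_extents[i].fe_physical,
                   (unsigned long long)fm->fm_extents[i].fe_length);
    }
    free(fm);
    close(fd);
    return mapped;
}
```

Part of the discussion is precisely that this interface is awkward for applications, which is why alternatives were on the table.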
The storage-only track also had a small number of sessions; James Bottomley has posted his notes from those discussions.
The combined filesystem and storage track had the following discussions:
- Persistent storage and remote data access
protocols: what changes are needed to support accessing fast
persistent devices over (relatively) slow remote access protocols.
- Block and filesystem interfaces:
better ways for the block and filesystem layers to work together.
- DAX, mmap(), and a "go faster"
flag: how to accommodate applications that are aware of
persistent-memory behavior and want the best performance possible.
- Partial drive depopulation: what to do
when a storage device goes partially bad, but you want to keep using
the part that still works?
- fallocate() and the block
layer: challenges in supporting the "bulk zero" functionality for
block I/O.
- Using the multiqueue block subsystem by
default: we've had multiqueue block support for a few years now,
maybe it's time to start phasing out the single-queue interface?
- Stream IDs and I/O hints: ways of
telling block devices which data belongs together.
- Background writeback: how to make it
great again, even if it has never been great before.
- Multipage bio_vecs: increasing the maximum size of an I/O operation in the block layer.
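As background for the fallocate() session above: from user space, "zero this range" is already expressible via the FALLOC_FL_ZERO_RANGE flag; the discussion concerned pushing that work down through the block layer to the device. A minimal sketch of the user-space side (the helper name is made up, and only some filesystems support the flag):

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/falloc.h>  /* FALLOC_FL_ZERO_RANGE */

/* Zero len bytes at offset off without writing pages of zeroes from
 * user space; the filesystem (and, ideally, the device) does the work.
 * Returns 0 on success, -1 if the filesystem doesn't support it. */
int zero_range(int fd, off_t off, off_t len)
{
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
        return 0;
    /* Not every filesystem supports ZERO_RANGE; report and move on. */
    fprintf(stderr, "FALLOC_FL_ZERO_RANGE: %s\n", strerror(errno));
    return -1;
}
```

When the filesystem supports it, the range reads back as zeroes without any data actually being written, which is exactly the behavior one would like the block layer to offer to devices as well.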
Note that these sessions are still being written up; they will be added to this page once they become available.
Group photo
This photo of the LSFMM 2016 group was provided by the Linux Foundation; more photos can be found on Flickr.
Acknowledgments
Thanks are due to LWN subscribers and the Linux Foundation for supporting
our travel to this event.
