The 2016 Linux Storage, Filesystem, and Memory-Management Summit
The summit ran in one to three tracks, depending on the subjects under discussion.
The plenary track included developers from all three subsystems and covered issues relevant to the kernel as a whole. The sessions from this track were:
- Standards update: what the T10 (SCSI),
T13 (ATA), and NVM Express (NVMe) standards groups have been working on.
- Persistent-memory error handling: what
should the system do when persistent memory turns out to be less
persistent than it should be?
- Bulk memory-allocation APIs as a way
of addressing networking performance bottlenecks.
- reflink() and related topics:
further development of the reflink() system call, an online
scrubber for XFS, and more in a "plenary" session that the
memory-management developers were too busy to attend.
- Filesystems and containers /
Self-encrypting drives: two lightning talks to finish out the
first day.
- Multi-order radix trees: an
enhancement to radix-tree functionality that might be useful beyond
the memory-management subsystem.
- Performance-differentiated memory: how
to cope with systems featuring memory with varying performance
characteristics.
- DAX on BTT: how to make the DAX direct-access layer play well with BTT, which is inherently a layer of indirection.
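As a rough illustration of the reflink() topic above: on Linux, reflink-style cloning is exposed to user space through the FICLONE ioctl. The sketch below is not from the session itself, just a minimal example of the interface; cloning only succeeds on filesystems that support it (such as Btrfs, and XFS once the work discussed at the summit lands), so the helper treats "not supported" as a distinct, non-fatal outcome.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

#ifndef FICLONE  /* older kernel headers may lack the definition */
#define FICLONE _IOW(0x94, 9, int)
#endif

/* Returns 1 if dst now shares src's data blocks, 0 if the filesystem
 * doesn't support cloning, -1 on other errors. */
int try_reflink(const char *src_path, const char *dst_path)
{
    int src = open(src_path, O_RDONLY);
    if (src < 0)
        return -1;
    int dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (dst < 0) {
        close(src);
        return -1;
    }
    int ret;
    if (ioctl(dst, FICLONE, src) == 0) {
        ret = 1;  /* blocks are shared; copied only on write */
    } else if (errno == EOPNOTSUPP || errno == EINVAL ||
               errno == EXDEV || errno == ENOTTY) {
        ret = 0;  /* no reflink support on this filesystem */
    } else {
        ret = -1;
    }
    close(src);
    close(dst);
    return ret;
}
```

After a successful clone, the two files share extents on disk until one of them is written, at which point the filesystem copies the affected blocks.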
The memory-management track discussed the following topics:
- Two transparent huge page cache
implementations. Transparent huge pages don't currently work with
file-backed pages, but not for lack of trying: there are
currently two working implementations to choose between. In this
session, the memory-management developers spent two hours trying to
make that decision.
- Ideas for rationalizing GFP flags: how
to better define and improve the semantics around the ubiquitous
GFP_ memory-allocation flags.
- CMA and compaction: problems with and
solutions for the kernel's mechanisms for supporting large contiguous
allocations.
- Virtual machines as containers, and,
in particular, the memory-management challenges that come with packing
a lot of virtual machines into a host.
- Partial address-space mirroring: a
hardware feature intended to improve reliability. As the discussion
showed, though, the memory-management developers were not convinced it
will work as well as advertised.
- Heterogeneous memory management:
giving GPUs and other peripherals access to process address spaces.
- Memory-management testing: how can we
find more problems before they bite users?
- Memory control group fairness: how to
get control groups to make the right decisions when faced with
problematic workloads.
- TLB flush optimization: reducing the
performance cost that comes with flushing the translation lookaside
buffer too often.
- Improving the OOM killer: this year's
episode in the perennial discussion on how the kernel should handle
running out of memory.
- Memory-management subsystem workflow: how is development working in this subsystem, and how can it be made to run more smoothly?
See also: Rik van Riel's notes for a terse summary of the memory-management sessions.
The filesystem-only track had a relatively small number of discussions, which were:
- Parallel lookups: how to support
multiple simultaneous directory lookups.
- Network filesystems, and supporting
network filesystems with case-insensitive semantics in particular.
- The xstat() system call: how
can this long-desired functionality finally make its way into the
kernel?
- Exposing extent information to user space: the best way to let applications learn about the layout of a file on disk.
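For context on the extent-information session above: the mechanism applications have today is the FIEMAP ioctl. The sketch below (the helper name is made up) shows how a program can query a file's extents with the existing interface; not every filesystem implements FIEMAP, so that case is reported as zero extents rather than a hard failure.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/* Print a file's extents; returns the number printed, 0 if the
 * filesystem doesn't implement FIEMAP, -1 on other errors. */
int print_extents(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Room for 32 extents; real code would loop, advancing fm_start. */
    size_t sz = sizeof(struct fiemap) + 32 * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;            /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;  /* flush delayed allocation first */
    fm->fm_extent_count = 32;

    int mapped;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        mapped = (errno == EOPNOTSUPP || errno == ENOTTY) ? 0 : -1;
    } else {
        mapped = fm->fm_mapped_extents;
        for (int i = 0; i < mapped; i++)
            printf("extent %d: logical %llu physical %llu length %llu\n",
                   i,
                   (unsigned long long)fm->fm_extents[i].fe_logical,
                   (unsigned long long)fm->fm_extents[i].fe_physical,
                   (unsigned long long)fm->fm_extents[i].fe_length);
    }
    free(fm);
    close(fd);
    return mapped;
}
```

Part of the discussion is precisely that this interface is awkward for applications, which is why alternatives were on the table.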
The storage-only track also had a small number of sessions; James Bottomley has posted his notes from those discussions.
The combined filesystem and storage track had the following discussions:
- Persistent storage and remote data access
protocols: what changes are needed to support accessing fast
persistent devices over (relatively) slow remote access protocols.
- Block and filesystem interfaces:
better ways for the block and filesystem layers to work together.
- DAX, mmap(), and a "go faster"
flag: how to accommodate applications that are aware of
persistent-memory behavior and want the best performance possible.
- Partial drive depopulation: what to do
when a storage device goes partially bad, but you want to keep using
the part that still works?
- fallocate() and the block
layer: challenges in supporting the "bulk zero" functionality for
block I/O.
- Using the multiqueue block subsystem by
default: we've had multiqueue block support for a few years now,
maybe it's time to start phasing out the single-queue interface?
- Stream IDs and I/O hints: ways of
telling block devices which data belongs together.
- Background writeback: how to make it
great again, even if it has never been great before.
- Multipage bio_vecs: increasing the maximum size of an I/O operation in the block layer.
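As background for the fallocate() session above: from user space, "zero this range" is already expressible via the FALLOC_FL_ZERO_RANGE flag; the discussion concerned pushing that work down through the block layer to the device. A minimal sketch of the user-space side (the helper name is made up, and only some filesystems support the flag):

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/falloc.h>  /* FALLOC_FL_ZERO_RANGE */

/* Zero len bytes at offset off without writing pages of zeroes from
 * user space; the filesystem (and, ideally, the device) does the work.
 * Returns 0 on success, -1 if the filesystem doesn't support it. */
int zero_range(int fd, off_t off, off_t len)
{
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
        return 0;
    /* Not every filesystem supports ZERO_RANGE; report and move on. */
    fprintf(stderr, "FALLOC_FL_ZERO_RANGE: %s\n", strerror(errno));
    return -1;
}
```

When the filesystem supports it, the range reads back as zeroes without any data actually being written, which is exactly the behavior one would like the block layer to offer to devices as well.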
Note that these sessions are still being written up; they will be added to this page once they become available.
Group photo
This photo of the LSFMM 2016 group was provided by the Linux Foundation; more photos can be found on Flickr.
Acknowledgments
Thanks are due to LWN subscribers and the Linux Foundation for supporting
our travel to this event.
