Brice Goglin's Blog - MMU notifiers bring into Linux what we've wanted for HPC for a while
Jul. 29th, 2008
19:06 - MMU notifiers bring into Linux what we've wanted for HPC for a while
After the addition of ioremap_wc() in 2.6.26, MMU notifiers have now been merged in 2.6.27-rc1. This means that everything we have wanted for HPC support is finally available upstream. We thought that getting IB merged (back in 2.6.11) would speed things up, but it looks like the importance of these features was not obvious to people who had not worked on HPC for a long time.
Back in 2004, I was trying to get a safe registration cache working in the kernel for distributed storage over Myrinet. User-space regcaches are known to be a mess because they need to intercept malloc/free/munmap to invalidate cached segments. It works sometimes, but it is fragile. In the kernel, you just can't intercept anything. So I wrote a patch called VMASpy, which allowed other subsystems to be notified when part of a "registered" VMA is unmapped or forked. I never submitted it, since it could not be accepted unless somebody in the kernel (i.e. IB) used it. Given posts like this, we see that IB people weren't aware of the problem (nowadays they are interested, but something in the IB specs apparently prevents them from using this).
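To make the user-space problem concrete, here is a minimal sketch of a registration cache that relies on interception. All the names (regcache_insert, regcache_munmap, and so on) are hypothetical and only for illustration; a real regcache would also pin the pages and talk to the NIC driver, which is left out here. The point is only that an unmap which does not go through the wrapper leaves a stale entry behind.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define REGCACHE_SLOTS 16

/* One cached registration covering [start, end). Hypothetical layout. */
struct regcache_entry {
    uintptr_t start;
    uintptr_t end;
    int valid;
};

static struct regcache_entry regcache[REGCACHE_SLOTS];

/* Remember that [addr, addr + len) is registered. Returns 0, or -1 if full. */
int regcache_insert(void *addr, size_t len)
{
    for (int i = 0; i < REGCACHE_SLOTS; i++) {
        if (!regcache[i].valid) {
            regcache[i].start = (uintptr_t)addr;
            regcache[i].end = (uintptr_t)addr + len;
            regcache[i].valid = 1;
            return 0;
        }
    }
    return -1;
}

/* Return 1 if one valid entry covers the whole range, 0 otherwise. */
int regcache_lookup(void *addr, size_t len)
{
    uintptr_t s = (uintptr_t)addr, e = s + len;

    for (int i = 0; i < REGCACHE_SLOTS; i++)
        if (regcache[i].valid && regcache[i].start <= s && e <= regcache[i].end)
            return 1;
    return 0;
}

/* Drop every entry overlapping [addr, addr + len). Returns how many were dropped. */
int regcache_invalidate(void *addr, size_t len)
{
    uintptr_t s = (uintptr_t)addr, e = s + len;
    int dropped = 0;

    for (int i = 0; i < REGCACHE_SLOTS; i++) {
        if (regcache[i].valid && s < regcache[i].end && regcache[i].start < e) {
            regcache[i].valid = 0;
            dropped++;
        }
    }
    return dropped;
}

/* The wrapper the application is supposed to call instead of munmap(). */
int regcache_munmap(void *addr, size_t len)
{
    regcache_invalidate(addr, len);
    return munmap(addr, len);
}

/* Returns the number of stale entries left when one munmap() bypasses the wrapper. */
int regcache_demo(void)
{
    size_t len = 4096;
    void *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    void *b = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    assert(a != MAP_FAILED && b != MAP_FAILED);
    regcache_insert(a, len);
    regcache_insert(b, len);

    regcache_munmap(a, len); /* intercepted: the entry is dropped */
    munmap(b, len);          /* not intercepted: the entry goes stale */

    return regcache_lookup(a, len) + regcache_lookup(b, len);
}
```

Calling regcache_demo() leaves one stale entry for the second buffer, even though its memory is gone. Any library that calls munmap() directly has the same effect, which is why this approach only works sometimes.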
KVM needed some kernel support for its shadow pages, so MMU notifiers were written by Andrea Arcangeli (thanks a lot to him for continuing to work on this despite many people not liking it). After a couple of months of trolling, here we go: with 2.6.27-rc1, we can now register a notifier per mm_struct and get a callback when part of the address space is unmapped. The implementation is very different from my VMASpy and of course much better :) But the final API provides similar features, so it should be great news for people working on registration caches and the like.
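For readers who have not seen the pattern, here is a rough user-space model of what a per-address-space notifier gives a registration cache. This is not the kernel API: mock_mm, mock_notifier, mock_unmap and pinned_region are made-up stand-ins so that the sketch compiles and runs outside the kernel. In the real thing, the corresponding pieces would be struct mmu_notifier, mmu_notifier_register() and callbacks such as invalidate_range_start from linux/mmu_notifier.h, with different signatures and locking rules.

```c
#include <assert.h>
#include <stddef.h>

struct mock_notifier;

/* Callbacks a subsystem provides. Hypothetical, loosely modelled on the idea only. */
struct mock_notifier_ops {
    void (*invalidate_range)(struct mock_notifier *mn,
                             unsigned long start, unsigned long end);
};

struct mock_notifier {
    const struct mock_notifier_ops *ops;
    struct mock_notifier *next;
};

/* Stand-in for an address space: it only carries its list of notifiers. */
struct mock_mm {
    struct mock_notifier *notifiers;
};

/* Attach a notifier to one address space. */
void mock_notifier_register(struct mock_notifier *mn, struct mock_mm *mm)
{
    mn->next = mm->notifiers;
    mm->notifiers = mn;
}

/* Stand-in for the unmap path: every notifier hears about [start, end). */
void mock_unmap(struct mock_mm *mm, unsigned long start, unsigned long end)
{
    for (struct mock_notifier *mn = mm->notifiers; mn; mn = mn->next)
        mn->ops->invalidate_range(mn, start, end);
    /* a real kernel would tear down the mappings here */
}

/* Driver side: one cached registration with its notifier embedded. */
struct pinned_region {
    struct mock_notifier mn;
    unsigned long start;
    unsigned long end;
    int valid;
};

/* Drop the cached registration if the unmapped range overlaps it. */
static void region_invalidate(struct mock_notifier *mn,
                              unsigned long start, unsigned long end)
{
    struct pinned_region *r = (struct pinned_region *)
        ((char *)mn - offsetof(struct pinned_region, mn));

    if (r->valid && start < r->end && r->start < end)
        r->valid = 0;
}

static const struct mock_notifier_ops region_ops = {
    .invalidate_range = region_invalidate,
};

/* Returns the region's valid flag after an overlapping unmap. */
int notifier_demo(void)
{
    struct mock_mm mm = { .notifiers = NULL };
    struct pinned_region r = {
        .mn = { .ops = &region_ops, .next = NULL },
        .start = 0x1000, .end = 0x3000, .valid = 1,
    };

    mock_notifier_register(&r.mn, &mm);

    mock_unmap(&mm, 0x8000, 0x9000); /* unrelated range: still cached */
    assert(r.valid == 1);

    mock_unmap(&mm, 0x2000, 0x4000); /* overlaps: the callback drops it */
    return r.valid;
}
```

The difference with interception is the direction of the call: the address space tells the cache that something changed, so no unmap can slip by unnoticed, whoever issued it.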