Swapping via NFS for Linux
News.
The Mayersche Buchhandlung is opening Germany's biggest book store in Cologne. The equipment includes 20 diskless Linux stations with -- wow! -- my NFS swap hacks. They are using patches/linux-2.0.32-nfs-swap.diff.gz with a slight modification needed to make it compile with Linux v2.0.35. This is first-hand information: one of the cellists of the Coll 'arco Chamber Orchestra was one of the guys who had to install the computer equipment.
Which of the patches are considered to be stable?
linux-2.0.32-nfs-swap.diff.gz -- yes, the others: no. I haven't tested linux-2.0.35-nfs-swap.diff.gz extensively yet, and the 2.1 stuff may or may not work.
How to apply the patch files
To apply the patches below, change to the location of your kernel source tree, probably
/usr/src/linux/
and run the patch command like this:
gunzip -c /usr/src/linux-2.0.35-nfs-swap.diff.gz | patch -p1 -l -s
where you have to replace /usr/src/linux-2.0.35-nfs-swap.diff.gz with the name of the patch file you are actually using. The example given above assumes that you have downloaded the patch file into /usr/src/.
After applying the patch you have to reconfigure your kernel. Answer yes to the Swapping via network question in the Networking options menu as well as to the NFS filesystem support and Allow swap files to be on NFS filesystems configuration options in the filesystem configuration menu.
How to enable swapping using the NFS protocol
After you have recompiled and installed the kernel in the usual way you have to reboot. Afterwards you can enable swapping to files located on an NFS server by using the commands dd, mkswap and swapon which are available on virtually any Linux machine. Proceed as follows (customize to your needs!!!). The example assumes that your machine also gets its root filesystem via NFS.
dd if=/dev/zero of=/SWAPFILE bs=1k count=20480
mkswap /SWAPFILE 20480
sync
swapon /SWAPFILE
That's it. You have created a 20 MB swap file and told your kernel to use it. Please refer to the man pages of the respective programs for more information (man 8 swapon, man 8 mkswap).
Swapping via NFS is really slow, but it should work.
Implementation notes.
There are several problems one easily runs into when trying to implement swapping via any network protocol, plus some additional traps when doing this via the sucking NFS (let's call it Network Failure System) protocol.
The main problem, however, is that receiving data packets via the network itself consumes memory. Each packet first has to be copied into the system's RAM by the network device layer. Only after it has been copied into a previously allocated memory block can one look at its contents and decide what to do with it.
This means that it would be possible to bring a machine using any network-based swapping mechanism down to its knees by simply flood-pinging it.
Therefore the patches implement all (v2.1) or some (v2.0) of the following hacks:
- Means to drop network packets not needed for swapping when the machine is running out of memory. This is safe: the senders of those packets will simply retry after a short timeout when our poor NFS-swap machine doesn't ack the packets.
- Yet another level of memory allocation priority:
  The GFP_ATOMIC level that used to be the highest priority is moved one level up (while making sure get_free_pages() won't sleep), and the highest, newly introduced level is now called GFP_NUCLEONIC. Maybe it should rather have been called GFP_SWAP.
  Normal interrupt handlers (i.e. those that aren't involved in swapping) still use GFP_ATOMIC. As soon as a network device is needed for swapping, it will use GFP_NUCLEONIC, which allows it to use up every page down to the very last one.
- /proc/sys/vm/freepages now contains a fourth component which gives the number of pages reserved for swapping purposes (GFP_ATOMIC won't get these pages).
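To make the interplay of these hacks a bit more tangible, here is a small user-space toy model in C. It is only a sketch and not code from the patches: the names model_gfp, page_pool, pool_alloc_page, pool_free_page and rx_packet are made up for illustration, and the real kernel logic is certainly more involved. It assumes that a non-swap packet is dropped whenever its buffer had to be taken from the reserved pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy allocation priorities. The names echo GFP_ATOMIC and GFP_NUCLEONIC
 * from the text; the enum itself is hypothetical, not kernel code. */
enum model_gfp { MODEL_GFP_ATOMIC, MODEL_GFP_NUCLEONIC };

struct page_pool {
    unsigned long free_pages;   /* pages currently free */
    unsigned long swap_reserve; /* pages only the swap path may take */
};

/* Take one page if the priority level allows it. ATOMIC requests must
 * leave the swap reserve untouched, NUCLEONIC requests may go down to 0. */
bool pool_alloc_page(struct page_pool *pool, enum model_gfp level)
{
    unsigned long min_free;

    assert(pool != NULL);
    min_free = (level == MODEL_GFP_NUCLEONIC) ? 0 : pool->swap_reserve;
    if (pool->free_pages <= min_free)
        return false;
    pool->free_pages--;
    return true;
}

/* Give one page back to the pool. */
void pool_free_page(struct page_pool *pool)
{
    assert(pool != NULL);
    pool->free_pages++;
}

/* Model of the receive path: the buffer has to be allocated before the
 * packet can be inspected. If the packet turns out not to be swap traffic
 * and its buffer came out of the reserve, drop it again; the sender will
 * retry after a timeout. Returns true if the packet is kept. */
bool rx_packet(struct page_pool *pool, bool dev_used_for_swap,
               bool pkt_is_swap_traffic)
{
    enum model_gfp level =
        dev_used_for_swap ? MODEL_GFP_NUCLEONIC : MODEL_GFP_ATOMIC;

    if (!pool_alloc_page(pool, level))
        return false;               /* no buffer: lost before inspection */
    if (!pkt_is_swap_traffic && pool->free_pages < pool->swap_reserve) {
        pool_free_page(pool);       /* ate into the reserve: drop it */
        return false;
    }
    return true;
}
```

In this model a flood-ping arriving on the swap device can never push free_pages below swap_reserve, while swap traffic may still use up the reserve down to the last page.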
If you dare to use these patches, then please try to break them. Stress test the nfs client with flood-pings, possibly run several tcpspray instances as well, and keep the network-swapping machine busy with some memory-consuming computations. Several concurrent kernel compilations are probably a nice way to do that.
Of course, if you congest the network with tcpspray and flood-pings the poor nfs client won't be very responsive any more, as the performance of network swapping depends on the network bandwidth that can be used for that purpose.
The patches are a little bit ugly, but they should basically work. What is especially painful with the 2.1 patches is the resurrection of the ->pg_swap_entry component of struct page, which disappeared a long time ago. Nowadays, page->offset is used both for the file offset of a page and for the swap entry (in an un-patched kernel). I probably should start creating some special struct swap_ops structure and stop using file->write_page and file->read_page.
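Just to illustrate what such a struct swap_ops could look like, here is a purely hypothetical user-space sketch in C. Nothing of this exists in the patches: the member names, the signatures, the SKETCH_PAGE_SIZE constant and the in-memory backend (mem_backend, mem_read_page, mem_write_page) are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size for the sketch */
#define SKETCH_NR_PAGES  4      /* tiny backing store, illustration only */

/* Hypothetical operations table: the swap code would call through this
 * instead of going via file->read_page and file->write_page. */
struct swap_ops {
    int (*read_page)(void *backend, unsigned long entry, void *buf);
    int (*write_page)(void *backend, unsigned long entry, const void *buf);
};

/* A trivial in-memory backend standing in for NFS, nbd or a local disk. */
struct mem_backend {
    unsigned char pages[SKETCH_NR_PAGES][SKETCH_PAGE_SIZE];
};

/* Copy the page stored under 'entry' into buf; -1 if out of range. */
int mem_read_page(void *backend, unsigned long entry, void *buf)
{
    struct mem_backend *mb = backend;

    assert(mb != NULL && buf != NULL);
    if (entry >= SKETCH_NR_PAGES)
        return -1;
    memcpy(buf, mb->pages[entry], SKETCH_PAGE_SIZE);
    return 0;
}

/* Store buf under 'entry'; -1 if out of range. */
int mem_write_page(void *backend, unsigned long entry, const void *buf)
{
    struct mem_backend *mb = backend;

    assert(mb != NULL && buf != NULL);
    if (entry >= SKETCH_NR_PAGES)
        return -1;
    memcpy(mb->pages[entry], buf, SKETCH_PAGE_SIZE);
    return 0;
}

const struct swap_ops mem_swap_ops = {
    .read_page  = mem_read_page,
    .write_page = mem_write_page,
};
```

The point of such a table would be that the swap entry is handed to the backend explicitly, so a page would not need to carry it in a field that is also used as a file offset.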
Where to download the patch files from
Protocol  Location
HTTP      patches/
FTP       iris1.math1.rwth-aachen.de
Related work
Visit Pavel Machek's Network Block Device patches at Pavel's nbd page. His hacks to allow swap files to be located on nbd volumes are partly based on my nfs-swap patches. His approach is probably superior to mine as the network block device protocol he has developed is faster than the NFS protocol.
Apologies
I have to apologize to Pavel Machek for blaming him publicly for having forgotten my name. Thinking of myself, I have to admit that I forget names (and email addresses and phone numbers) quite often ...
Credits.
- Pavel Machek
Fruitful discussions about one year ago when I started to really implement the thing and to make it more or less stable ...
- Volker Seebode
For testing and using the stuff in professional environments (thinking about Germany's biggest book store ...).
- Many others ...
In particular, there was one guy who tried to implement an (IMHO never working) hack for swapping via NFS. Unfortunately, I have forgotten his name ...