[LWN.net]


Kernel development


The current development kernel release is 2.5.7. The current 2.5.8 prepatch from Linus is 2.5.8-pre3; it includes a big PowerPC64 update, a FireWire update, the new system calls for setting process CPU affinity, a bunch of USB updates, a great deal of merging from the "dj" series, and more.

Dave Jones's latest prepatch is 2.5.7-dj3. There's not much new in it; Dave appears to be concentrating more on feeding changes to Linus at the moment.

The latest 2.5 status summary from Guillaume Boissiere came out on April 2.

The current stable kernel release is 2.4.18. The current 2.4.19 prepatch is 2.4.19-pre6. It includes a long list of networking fixes, a netfilter update, lots of USB updates, and a vast number of other changes. Significantly, this patch also includes a few pieces of Andrea Arcangeli's VM update, as reworked by Andrew Morton. Much of the reworked VM code remains outside of the main 2.4 kernel, however.

Alan Cox's latest prepatch is 2.4.19-pre5-ac3. The most interesting part of this prepatch is the inclusion of Pavel Machek's software suspend code. If you want to actually play with that code, though, you'll also need to apply this patch from Pavel.

Alan has also released 2.2.21-rc3, the third 2.2.21 release candidate.

Reorganizing USB. It's all Lineo's fault. The company announced the contribution of its "USB Device Software" to the Linux kernel. This code allows a Linux system to behave as a device (not the host) on a USB bus; it is used in the Sharp Zaurus PDA. The code was welcomed by all, but it led quickly to the inevitable question: "where do we put all that code?"

After some discussion, it was decided that the USB source tree needed to be reorganized. The final organization looks like this (everything under drivers/usb, of course):

core     The core USB code (including device-side code)
host     Controller code for USB hosts
device   Controller code for USB device systems
class    Drivers for USB devices with defined 'class' specifications
net      Network drivers
image    Scanner drivers
input    Input drivers
media    Media drivers (e.g. cameras)
serial   Serial drivers
storage  Storage drivers
misc     Everything else

The reorganization was merged in 2.5.8-pre3, producing a huge patch that, for the most part, just moves files around. The Lineo code has not yet been merged, but it's on the list of things to do.

kbuild 2.5 is back. We last heard from the kbuild 2.5 project, which is mostly the work of Keith Owens, some months ago. At that point, the project had a much improved, cleaner, and more accurate kernel build process which provided some interesting new features. There was just one little problem: a full kernel build took twice as long. That kind of bad news does not get you very far with kernel hackers, who spend a lot of time as it is waiting for kernel builds; Keith was essentially told, politely, to come back when the performance problems had been dealt with. (See the January 3 LWN Kernel Page).

Keith is back. Kbuild 2.5 version 2.0 is now available for 2.4.16, with a version for the 2.5 kernel available as well. While previous versions of kbuild worked with a text file that was read at every step in the process, the new kbuild uses a memory-mapped database implementation borrowed from BitKeeper. The database code, like a few other pieces of BitKeeper, has been released under the GPL, so there should be no licensing objections here.

The new code has made a difference. On Keith's system, a full kernel build with the traditional kbuild code takes a full 15 minutes (with everything configured in). With the new code, that time drops to just under nine minutes. If you immediately run a second make on the fully-built tree, things look even better. The old kbuild recompiles a bunch of stuff unnecessarily, resulting in a "build" time of just over two minutes. The new kbuild, instead, takes 14 seconds to figure out that nothing needs to be done. Says Keith:

More accurate kernel build, easier to write and understand Makefiles, 30% faster than kbuild 2.4. Now the nay-sayers will have to find something else to complain about!

Keith has no plans to try to get the new code into the 2.4 kernel tree ("Changing the kernel build on a stable kernel is a bad idea"), but there will probably be a renewed push to see it incorporated into 2.5. The "nay-sayers" may have to scramble if they want to keep it out.

EVMS 1.0 released. The news is a bit stale (due to the Kernel Page taking last week off), but still worth a mention: the Enterprise Volume Management System team has announced the release of EVMS 1.0, the first full release. EVMS is a high-end system for the management of disk drives, partitions, and volumes; in addition to the usual nice volume management features it supports snapshots, bad block handling, and more. See the EVMS web page for more information.

Tagged command queueing for IDE drives. SCSI drives have supported tagged command queueing (TCQ) for many years. TCQ allows a device driver to attach an identifying "tag" to each request passed to a drive; the drive will then use that tag when reporting on the status of an operation. This tagging allows the drive to have multiple requests outstanding, and to satisfy them in any order it chooses. TCQ improves performance in a couple of ways:

  • Having multiple operations outstanding reduces idle time by ensuring that the drive always has work to do. In a single-request mode, the drive must wait after signalling completion until the system gets around to handing it another request. In the tagged mode, that next request is already available.

  • The drive can optimize the ordering of requests for the best performance. The Linux filesystem and driver code already tries to perform this optimization, but there are limits to how successful the host system can be in this regard. The simple cylinders / heads / sectors model of a disk drive's block layout has been an approximate fiction for years; blocks may not actually be close to where the host system thinks they should be. And it is hard for the host system to know the current head and platter positions. The drive (one hopes) is better informed, and can make better decisions.
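
The two points above can be sketched with a toy model. This is only an illustration of the tagging idea, not the code in Jens's patch or the behavior of any real drive: the Request and TaggedDrive names, the queue depth of 32, and the "nearest block first" ordering policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    tag: int    # identifier attached by the driver (hypothetical field)
    block: int  # block number the request targets


class TaggedDrive:
    """Toy drive that holds several outstanding requests at once."""

    def __init__(self, head=0, depth=32):
        self.head = head    # assumed current head position, in blocks
        self.depth = depth  # assumed maximum number of outstanding tags
        self.pending = {}   # tag -> Request

    def submit(self, req):
        # The driver may queue a request without waiting for earlier ones.
        if len(self.pending) >= self.depth:
            raise RuntimeError("queue full")
        if req.tag in self.pending:
            raise ValueError("tag already in use")
        self.pending[req.tag] = req

    def complete_next(self):
        # The drive picks the order; here it simply serves the request
        # closest to the head (ties broken by tag) and reports its tag.
        if not self.pending:
            raise RuntimeError("no outstanding requests")
        req = min(self.pending.values(),
                  key=lambda r: (abs(r.block - self.head), r.tag))
        del self.pending[req.tag]
        self.head = req.block
        return req.tag


def run(requests, head=0):
    """Submit every request up front, then collect completion tags."""
    drive = TaggedDrive(head=head)
    for req in requests:
        drive.submit(req)
    order = []
    while drive.pending:
        order.append(drive.complete_next())
    return order
```

Because every completion carries a tag, the driver can match it to the original request even though the drive finished the work in a different order than it was submitted.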

TCQ support has been a justification for SCSI user smugness for years. IDE is catching up, however, and Linux is almost ready: Jens Axboe has released a patch which uses TCQ on IDE drives which support that feature. With the release of the second version of the patch, Jens states: "The code has taken quite a lot of beating, so I'm ready to call this beta and ask for more testers. No malfunctions have been detected here."

Note that the patch is still a little way from being ready for widespread enterprise deployment - among other things, no real performance testing has been done yet. Jens has been most concerned with issues like data integrity so far - something that most Linux users will likely appreciate. It's also worth taking a look at this note from Andre Hedrick on the (dismal) state of TCQ support in most IDE hardware.

Nonetheless, the TCQ code has begun to find its way into Martin Dalecki's IDE patch set, and will thus likely show up in a 2.5 prepatch before too long.

Dealing with discontiguous memory. Most computers out there organize their memory as a single, contiguous array of bytes - or something close to that. If there are gaps (such as the x86 memory hole at 640K), they tend to be small and easily worked around. Linux on most systems takes advantage of this contiguous nature by treating memory as a simple, linear array.

But what do you do if your hardware is not so reasonable? The Linux kernel has had discontiguous memory support for some time, but the implementation has not been considered satisfactory by all. Its performance is suboptimal, and the code tends to be strongly tied to specific architectures.

Daniel Phillips has set out to apply an old computer science axiom to this challenge: any problem can be solved by adding another layer of indirection. He has posted a patch which makes some interesting changes to how the Linux kernel sees the memory it runs on.

In kernel space, there is a fundamental distinction between "virtual" and "physical" addresses. Kernel virtual addresses are different from user-space virtual addresses; most of the code treats them as if they were really physical, hardware addresses. In fact, on most architectures, the only difference between (most) kernel virtual addresses and the corresponding physical addresses is a constant offset. The kernel usually works with virtual addresses, translating them to physical addresses only when it is really necessary.
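
As a rough model of that constant-offset relationship: the sketch below assumes an offset of 0xC0000000, the usual value on 32-bit x86 with a 3GB/1GB split (other architectures and configurations differ). The function names echo the kernel's helpers, but this is a Python illustration, not kernel code.

```python
# Assumed offset between kernel virtual and physical addresses
# (typical for 32-bit x86; an assumption, not taken from the article).
PAGE_OFFSET = 0xC0000000


def virt_to_phys(vaddr):
    """Translate a directly-mapped kernel virtual address to physical."""
    if vaddr < PAGE_OFFSET:
        raise ValueError("not a directly-mapped kernel virtual address")
    return vaddr - PAGE_OFFSET


def phys_to_virt(paddr):
    """Translate a physical address to its kernel virtual address."""
    return paddr + PAGE_OFFSET
```

The translation is a single subtraction or addition, which is why kernel code can afford to treat these virtual addresses almost as if they were physical ones.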

With Daniel's patch, the kernel works with a third address type, called a "logical" address. The characteristics of the three address types, from lowest-level to highest, now are:

physical  Addresses recognized directly by the hardware. They may have significant gaps.
logical   Pseudo-physical addresses as seen by the kernel; the logical address space is contiguous.
virtual   Virtual addresses used by most kernel code.

The establishment of the logical address space is handled at the lowest levels of the kernel; most of the rest of the system is unaware of it. By setting up the logical address tables properly, the patch takes a system with randomly-organized, discontiguous memory and makes that memory look like a nice, linear array. As a result, most of the kernel code need not be aware of the real arrangement of the hardware.
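
The idea of the extra layer can be sketched as a small lookup table. This is not the mechanism used in Daniel's patch, whose details are not described here; the LogicalMap class, the (start, size) bank description, and the binary-search lookup are assumptions chosen only to show how scattered physical ranges can be presented as one contiguous logical range.

```python
import bisect


class LogicalMap:
    """Toy mapping from a contiguous logical space to physical banks."""

    def __init__(self, banks):
        # banks: iterable of (phys_start, size) pairs describing
        # non-overlapping physical memory ranges (hypothetical layout).
        self.phys_starts = []
        self.logical_starts = []
        self.sizes = []
        logical = 0
        for start, size in sorted(banks):
            self.phys_starts.append(start)
            self.logical_starts.append(logical)
            self.sizes.append(size)
            logical += size  # banks are packed back to back logically
        self.total = logical

    def logical_to_phys(self, laddr):
        if not 0 <= laddr < self.total:
            raise ValueError("logical address out of range")
        i = bisect.bisect_right(self.logical_starts, laddr) - 1
        return self.phys_starts[i] + (laddr - self.logical_starts[i])

    def phys_to_logical(self, paddr):
        i = bisect.bisect_right(self.phys_starts, paddr) - 1
        if i < 0 or paddr - self.phys_starts[i] >= self.sizes[i]:
            raise ValueError("physical address falls in a hole")
        return self.logical_starts[i] + (paddr - self.phys_starts[i])
```

For example, two banks at physical 0x00000000 (4KB) and 0x10000000 (8KB) appear as a single 12KB logical range; code working with logical addresses never sees the gap between them.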

This patch is a fundamental change in how Linux deals with its memory. Despite that, it is relatively small in size, and it makes it easy for the kernel to deal with complicated hardware arrangements. That extra layer of indirection hides the complexity of the underlying system. Maybe the old axiom is right.

(Here is the latest version of Daniel's patch as of this writing).

SUBTERFUGUE needs a new maintainer. As if in response to the project's having been mentioned in NTK, SUBTERFUGUE maintainer Mike Coleman has announced that he can no longer maintain the project. Have a look if you think you might like to take on this interesting tool.

Other patches and updates released this week include:

Kernel trees:

  • Andrea Arcangeli: 2.4.19-pre6-aa1; a number of fixes and performance patches.

  • Greg Kroah-Hartman: 2.5.7-gregkh-1; includes a great many USB patches.

  • Jörg Prante: 2.4.19-pre5-jp9; the kitchen sink is missing but not much else.

  • Marc-Christian Petersen: 2.4.18-WOLK3.3; also includes the kitchen sink.

  • Christoph Hellwig: 1.0.9-hch1. "After all the discussions about VFS races and VM problems and growing bloat in all areas of the kernel people seem to have forgotten the good old days of the small and simple linux kernels. Even more important the ego of a young kernel developer will suffer in the long term if he doesn't have his own kernel patchkit, so here it is." Yes, it really is based on 1.0.9.

  • J.A. Magallon: 2.4.19-pre5-jam2; updated to the latest Arcangeli VM.

  • Paul P Komkoff Jr: 2.4.19-pre5-ac3-s43.

Networking:

  • Dmitry Kasatkin: Affix 0.98 (Bluetooth stack for Linux).

  • Kazunori Miyazawa: USAGI 3.1, a stable release from this project, which is working to improve Linux IPv6 support.

Section Editor: Jonathan Corbet


April 11, 2002

Eklektix, Inc. Linux powered! Copyright © 2002 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds