Sunday, October 25, 2009

Organization of the Material












Some aspects of networking code require as many as seven chapters, while for other aspects one chapter is sufficient. When the topic is complex or big enough to span different chapters, the part of the book devoted to that topic always starts with a concept chapter that covers the theory necessary to understand the implementation, which is described in another chapter. All of the reference and secondary material is usually located in one miscellaneous chapter at the end of the part. No matter how big the topic is, the same scheme is used to organize its presentation.


For each topic, the implementation description includes:


  • The big picture, which shows where the described kernel component falls in the network stack.

  • A brief description of the main data structures and a figure that shows how they relate to each other.

  • A description of which other kernel features the component interfaces with (for example, by means of notification chains or data structure cross-references). The firewall is an example of such a kernel feature, given the numerous hooks it has all over the networking code.

  • Extensive use of flow charts and figures to make it easier to go through the code and extract the logic from big and seemingly complex functions.


The reference material always includes:


  • A detailed description of the most important data structures, field by field

  • A table with a brief description of all functions, macros, and data structures, which you can use as a quick reference

  • A list of the files mentioned in the chapter, with their location in the kernel source tree

  • A description of the interface between the most common user-space tools used to configure the topic of the chapter and the kernel

  • A description of any file in /proc that is exported


The Linux kernel's networking code is not just a moving target, but a fast runner. The book does not cover all of the networking features. New ones are probably being added right now while you are reading. Many new features are driven by the needs of single users or organizations, or as university projects, but they find their way into the official kernel when they're considered useful for a large audience. Besides detailing the implementation of a subset of those features, I try to give you an idea of what the generic implementation of a feature might look like. This will help you greatly in understanding changes to the code and learning how new features are implemented. For example, given any feature, you need to take the following points into consideration:


  • How do you design the data structures and the locking semantics?

  • Is there a need for a user-space configuration tool? If so, is it going to interact with the kernel via an existing system call, an ioctl command, a /proc file, or the Netlink socket?

  • Is there any need for a new notification chain, and is there a need to register to an already existing chain?

  • What is the relationship with the firewall?

  • Is there any need for a cache, a garbage collection mechanism, statistics, etc.?
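To make the notification chain question more concrete, here is a toy user-space model of the idea, written in Python rather than kernel C. It is only a sketch under stated assumptions: the class names, the return-code values, and the use of the string "NETDEV_DOWN" as an event label are illustrative choices, not the kernel's actual definitions. What it tries to capture is the general pattern: subscribers register a callback with a priority, the chain calls them in priority order, and a subscriber can ask the chain to stop.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Toy return codes, loosely modeled on the kernel's NOTIFY_* idea.
# The numeric values here are made up for this sketch.
NOTIFY_DONE = 0  # subscriber was not interested in the event
NOTIFY_OK = 1    # subscriber handled the event
NOTIFY_STOP = 2  # subscriber handled the event; skip remaining subscribers


@dataclass
class NotifierBlock:
    """One subscriber: a callback plus a priority (hypothetical model)."""
    callback: Callable[[str, object], int]
    priority: int = 0


@dataclass
class NotifierChain:
    """An ordered list of subscribers to one class of events."""
    blocks: List[NotifierBlock] = field(default_factory=list)

    def register(self, block: NotifierBlock) -> None:
        self.blocks.append(block)
        # Highest priority first; the sort is stable, so subscribers with
        # equal priority keep their registration order.
        self.blocks.sort(key=lambda b: b.priority, reverse=True)

    def unregister(self, block: NotifierBlock) -> None:
        self.blocks.remove(block)

    def call_chain(self, event: str, data: object = None) -> int:
        """Notify every subscriber in order; return the last result."""
        result = NOTIFY_DONE
        # Iterate over a copy so a callback may unregister itself safely.
        for block in list(self.blocks):
            result = block.callback(event, data)
            if result == NOTIFY_STOP:
                break
        return result


if __name__ == "__main__":
    chain = NotifierChain()

    def routing_handler(event: str, data: object) -> int:
        print(f"routing saw {event} on {data}")
        return NOTIFY_OK

    def firewall_handler(event: str, data: object) -> int:
        print(f"firewall saw {event} on {data}")
        return NOTIFY_OK

    chain.register(NotifierBlock(routing_handler, priority=0))
    chain.register(NotifierBlock(firewall_handler, priority=10))
    chain.call_chain("NETDEV_DOWN", "eth0")
```

In this model the firewall handler runs before the routing handler because it registered with a higher priority. A new feature that only needs to react to existing events would register on an existing chain, while a feature that produces events of its own would create a new one.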


Here is the list of topics covered in the book:




Interface between user space and kernel


In Chapter 3, you will get a brief overview of the mechanisms that networking configuration tools use to interact with their counterparts inside the kernel. It will not be a detailed discussion, but it will help you to understand certain parts of the kernel code.




System initialization


Part II describes the initialization of key components of the networking code, and how network devices are registered and initialized.




Interface between device drivers and protocol handlers


Part III offers a detailed description of how ingress (incoming or received) packets are handed by the device drivers to the upper-layer protocols, and vice versa.




Bridging


Part IV describes transparent bridging and the Spanning Tree Protocol, the L2 (Layer two) counterpart of routing at L3 (Layer three).




Internet Protocol Version 4 (IPv4)


Part V describes how packets are received, transmitted, forwarded, and delivered locally at the IPv4 layer.




Interface between IPv4 and the transport layer (L4) protocols


Chapter 20 shows how IPv4 packets addressed to the local host are delivered to the transport layer (L4) protocols (TCP, UDP, etc.).




Internet Control Message Protocol (ICMP)


Chapter 25 describes the implementation of ICMP, the only transport layer (L4) protocol covered in the book.




Neighboring protocols


Neighboring protocols resolve an L3 address, such as an IP address, to the corresponding link-layer (L2) address on the local network. Part VI describes both the common infrastructure of the various protocols and the details of the ARP neighboring protocol used by IPv4.




Routing


Part VII, the biggest one of the book, describes the routing cache and tables. Advanced features such as Policy Routing and Multipath are also covered.




What Is Not Covered


For lack of space, I had to select a subset of the Linux networking features to cover. No selection would make everyone happy, but I think I covered the core of the networking code, and with the knowledge you can gain with this book, you will find it easier to study on your own any other networking feature of the kernel.


In this book, I decided to focus on the networking code, from the interface between device drivers and the protocol handlers, up to the interface between the IPv4 and L4 protocols. Instead of covering all of the features with a compromise on quality, I preferred to keep quality as the first goal, and to select the subset of features that would represent the best start for a journey into the kernel networking implementation.


Here is a partial list of the features I could not cover for lack of space:




Internet Protocol Version 6 (IPv6)


Even though I do not cover IPv6 in the book, the description of IPv4 can help you a lot in understanding the IPv6 implementation. The two protocols share naming conventions for functions and often for variables. Their interface to Netfilter is also similar.




IP Security protocol


The kernel provides a generic infrastructure for cryptography along with a collection of both ciphers and digest algorithms. The first interface to the cryptographic layer was synchronous, but the latest improvements are adding an asynchronous interface to allow Linux to take advantage of hardware cards that can offload the work from the CPU.


The protocols of the IPsec suite, namely Authentication Header (AH), Encapsulating Security Payload (ESP), and IP Compression (IPcomp), are implemented in the kernel and make use of the cryptographic layer.




IP multicast and IP multicast routing


Multicast functionality was implemented to conform to versions 2 and 3 of the Internet Group Management Protocol (IGMP). Multicast routing support is also present, conforming to versions 1 and 2 of Protocol Independent Multicast (PIM).




Transport layer (L4) protocols


Several L4 protocols are implemented in the Linux kernel. Besides the two well-known ones, UDP and TCP, Linux has the newer Stream Control Transmission Protocol (SCTP). A good description of the implementation of those protocols would require a new book of this size, all on its own.




Traffic Control


This is the Quality of Service (QoS) layer of Linux, another interesting and powerful component of the kernel's networking code. Traffic control is implemented as a general infrastructure and as a collection of traffic classifiers and queuing disciplines. I briefly describe it and the interface it provides to the main transmission routine in Chapter 11. A great deal of documentation is available at http://lartc.org.




Netfilter


The firewall code infrastructure and its extensions (including the various NAT flavors) are not covered in the book, but I describe its interaction with most of the networking features I cover. At the Netfilter home page, http://www.netfilter.org, you can find some interesting documentation about its kernel internals.




Network filesystems


Several network filesystems are implemented in the kernel, among them NFS (versions 2, 3, and 4), SMB, Coda, and Andrew. You can read a detailed description of the Virtual File System layer in Understanding the Linux Kernel, and then delve into the source code to see how those network filesystems interface with it.




Virtual devices


The use of a dedicated virtual device underlies the implementation of some networking features. Examples include 802.1Q, bonding, and the various tunneling protocols, such as IP-over-IP (IPIP) and Generic Routing Encapsulation (GRE). Virtual devices need to follow the same guidelines as real devices and provide the same interface to other kernel components. In different chapters, where needed, I compare real and virtual device behaviors. The only virtual device that is described in detail is the bridge interface, which is covered in Part IV.




DECnet, IPX, AppleTalk, etc.


These have historical roots and are still in use, but are much less commonly used than IP. I left them out to give more space to topics that affect more users.




IP virtual server


This is another interesting piece of the networking code, described at http://www.linuxvirtualserver.org/. This feature can be used to build clusters of servers using different scheduling algorithms.




Simple Network Management Protocol (SNMP)


No chapter in this book is dedicated to SNMP, but for each feature, I give a description of all the counters and statistics kept by the kernel, the routines used to manipulate them, and the /proc files used to export them, when available.
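As a rough illustration of how such exported counters can be consumed from user space, the sketch below parses text laid out like /proc/net/snmp, where each protocol contributes a line of field names followed by a line of values. It is a minimal example under stated assumptions: the function name parse_snmp is hypothetical, and the sample text is a shortened, made-up excerpt used so that the code runs without access to a live /proc filesystem.

```python
from typing import Dict

# Shortened, made-up sample in the paired-line layout of /proc/net/snmp:
# one line of field names, then one line of values, per protocol.
SAMPLE = """\
Ip: Forwarding DefaultTTL InReceives
Ip: 1 64 1200
Icmp: InMsgs InErrors
Icmp: 7 0
"""


def parse_snmp(text: str) -> Dict[str, Dict[str, int]]:
    """Turn paired header/value lines into {protocol: {counter: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected header/value line pairs")

    stats: Dict[str, Dict[str, int]] = {}
    # Even-indexed lines are headers, odd-indexed lines are their values.
    for header, values in zip(lines[0::2], lines[1::2]):
        h_proto, h_fields = header.split(":", 1)
        v_proto, v_fields = values.split(":", 1)
        if h_proto != v_proto:
            raise ValueError(f"mismatched pair: {h_proto!r} vs {v_proto!r}")
        names = h_fields.split()
        numbers = [int(v) for v in v_fields.split()]
        if len(names) != len(numbers):
            raise ValueError(f"field count mismatch for {h_proto!r}")
        stats[h_proto] = dict(zip(names, numbers))
    return stats


if __name__ == "__main__":
    for proto, counters in parse_snmp(SAMPLE).items():
        print(proto, counters)
```

On a Linux system, the same function could be given the contents of the real file instead of SAMPLE, assuming the file follows the paired-line layout described above.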




Frame Diverter


This feature allows the kernel to kidnap ingress frames not addressed to the local host. I will briefly mention it in Part III. Its home page is http://diverter.sourceforge.net.



Plenty of other network projects are available as separate patches to the kernel, and I can't list them all here. One that I find particularly fascinating and promising, especially in relation to the Linux routing code, is the highly configurable Click router, currently offered at http://pdos.csail.mit.edu/click/.


Because this is a book about the kernel, I do not cover user-space configuration tools. However, for each topic, I describe the interface between the most common user-space configuration tools and the kernel.












