Sunday, October 25, 2009

Hack 56. Centralize Resources Using NFS










Hack 56. Centralize Resources Using NFS



Make recovering from disasterand preparing for itsimpler by centralizing shared resources and service configuration.


A key goal of all system administrators is to maximize the availability of the services they maintain. With an unlimited budget you could create a scenario where there are two or three "hot standby" machines for every one machine in production, waiting to seamlessly take over in the event of a problem. But who has an unlimited budget?


Standalone machines that store their own local copies of configuration and data can be nice, if you have lots of them, and you have load balancers, and you have a good cloning mechanism so you don't spend all your time making sure all of your mail servers (for example) are identical. Oh yeah, and when you make a configuration change to one, you'll need a system to push it out to the other clones. This could take quite a bit of time and/or money to get rightand this doesn't even touch on the expense of putting backup software on every single machine on your network. I'm sure there are some smaller sites using standard Unix and Linux utilities for backup and nothing else, but the majority of sites are using commercial products, and they're not cheap!


Wouldn't it be nice if a test box could be repurposed in a matter of minutes to take over for a server with a failed drive? Wouldn't it be great if you only needed to back up from a couple of file servers instead of every single service machine? NFS, the Network File System, can get you to this place, and this hack will show you how.


Admins new to Linux, particularly those coming from Microsoft products, may not be familiar with NFS, the file-sharing protocol used in traditional Unix shops. What's great about NFS is that it allows you to store configuration files and data in a centralized location and transparently access that location from multiple machines, which treat the remote share as a local filesystem.


Let's say you have five Apache web servers, all on separate hardware. One is the main web presence for your company, one is a backup, and the other three perform other functions, such as hosting user home pages, an intranet site, and a trouble-ticket system. They're all configured to be standalone machines right now, but you want to set things up so that the machine that's currently just a hot standby to the main web server can serve as a standby for pretty much any web server.


To do this, we'll create an NFS server with mountable partitions that provide the configuration information, as well as the content, to the web servers. The first step is to configure the NFS server.



6.2.1. Configuring the NFS Server


To configure the NFS server, you must first create a directory hierarchy to hold Apache configurations for all of your different web servers, since it's hubris to assume they're all configured identically. There are numerous ways to organize the hierarchy. You could try to emulate the native filesystem as closely as possible, using symlinks to get it all perfect. You could also create a tree for each web server to hold its configuration, so that when you add another web server you can just add another directory on the NFS server for its configuration. I've found the latter method to be a bit less taxing on the brain.


The first thing to do on the NFS server is to create the space where this information will live. Let's say your servers are numbered web1 through web5. Here's an example of what the directory structure might look like:



/servconf
mail/
common/
web/
web1/
conf/
httpd.conf
access.conf
modules.conf
conf.d/
php4.conf
web2/
conf/
httpd.conf
access.conf
modules.conf
conf.d/
php4.conf
python.conf
mod_auth_mysql.conf



This sample hierarchy illustrates a few interesting points. First, notice the directories mail/ and common/. As these show, the configuration tree doesn't need to be limited to a single service. In fact, it doesn't actually have to be service-specific at all! For example, the common/ tree can hold configuration files for things like global shell initialization files that you want to be constant on all production service machines (you want this, believe me) and the OpenSSH server configuration file, which ensures that the ssh daemon acts the same way on each machine.


That last sentence brings up another potential benefit of centralized configuration: if you want to make global changes to something like the ssh daemon, you can make the changes in one place instead of many, since all of the ssh daemons will be looking at the centralized configuration file. Once a change is made, the daemons will need to be restarted or sent a SIGHUP to pick up the change. "Execute Commands Simultaneously on Multiple Servers" [Hack #29]) shows a method that will alow you to do this on multiple servers quickly.


All of this is wonderful, and some sites can actually use a hierarchy like this to have a single NFS server provide configuration to all the services in their department or business. However, it's important to recognize that, depending on how robust your NFS deployment is, you could be setting yourself up with the world's largest single point of failure. It's one thing to provide configuration to all your web servers, in which case a failure of the NFS server affects the web servers. It's quite another to use a single NFS server to provide configuration data to every production service across the board. In this case, if there's a problem with the file server, you're pretty much dead in the water, all owing to a glitch in a single machine! It would be smart to either invest in technologies that ensure the availability of the NFS service, or break up the NFS servers to lessen the impact of a failure of any one server.


Now it's time to export our configuration tree. It's important to note that some NFS daemons are somewhat "all or nothing" in the sense that they cannot export a subdirectory of an already exported directory. The exception to that rule is if the subdirectory is actually living on a separate physical device on the NFS server. For safety's sake, I've made it a rule never to do this anyway, in the event that changes in the future cause the subdirectory to share a device with its parent. Note that the same rule applies to exporting a subdirectory and then trying to export a parent directory separately.


Some implementations of the nfsd server do allow subdirectory exports, but for the sake of simplicity I avoid this, because it has implications as to the rules applied to a particular exported directory and can make debugging quite nightmarish.


Let's see how this works. Using the above "best practices," you cannot export the whole /servconf tree in our example to one server, and then export mail/ separately to the mail servers. You can export each of the directories under /servconf separately if /servconf itself is not exported, but that would make it slightly more work to repurpose a server, because you'd have to make sure permissions were in place to allow the mount of the new configuration tree, and you'd have to make sure the /etc/fstab file on the NFS client was updatedotherwise, a reboot would cause bad things to happen.


It's easier just to export the entire /servconf tree to a well-defined subset of the machines, so that /etc/fstab never has to be changed and permissions are not an issue from the NFS server side of the equation. That's what we'll do here. The file that tells the NFS server who can mount what is almost always /etc/exports. After all this discussion, here's the single line we need to accomplish the goal of allowing our web servers to mount the /servconf directory:



/servconf 192.168.198.0/24(ro,root_squash) @trusted(rw,no_root_squash)



The network specified above is a DMZ where my service machines live. Two important things to note here are the options applied to the export. The ro option ensures that changes cannot be made to the configuration of a given machine by logging into the machine itself. This is for the sake of heightened security, to help guarantee that a compromised machine can't be used to change the configuration files of all the other machines. Also to that end, I've explicitly added the root_squash option. This is a default in some NFS implementations, but I always state it explicitly in case that default ever changes (this is generally good practice for all applications, by the way). This option maps UID 0 on the client to nobody on the server, so even root on the client machine cannot make changes to files anywhere under the mount point.


The second group of hosts I'm exporting this mount point to are those listed in an NIS netgroup named trusted. This netgroup consists of two machines that are locked down and isolated such that only administrators can get access to them. I've given those hosts read/write (rw) access, which allows administrators to make changes to configuration files from machines other than the NFS server itself. I've also specified the no_root_squash option here, so that admins can use these machines even to change configuration files on the central server owned by root.


For the Apache web server example, we can create a very similar hierarchy on our NFS server to store content served up by the servers, and export it in the exact same way we did for the configuration. However, keep in mind that many web sites assume they can write in the directories they own, so you'll need to make sure that you either export a writable directory for these applications to use, or export the content tree with read/write privileges.




6.2.2. Configuring the NFS Clients



Getting NFS clients working is usually a breeze. You'll need to decide where you want the local Apache daemon to find its configuration and content, create the mount points for any trees you'll need to mount, and then edit the /etc/fstab file to make sure that the directory is always mounted at boot time.


Generally, I tend to create the local mount points under the root directory, mainly for the sake of consistency. No matter what server I'm logged in to, I know I can always run ls -l / and see all of the mount points on that server. This is simpler than having to remember what services are running on the machine, then hunting around the filesystem to check that the mount points are all there. Putting them under / means that if I run the mount command to see what is mounted, and something is missing, I can run one command to make sure the mount point exists, which is usually the first step in troubleshooting an NFS-related issue.


I also attempt to name the mount point the same as the exported directory on the server. This makes debugging a bit simpler, because I don't have to remember that the mount point named webstuff on the client is actually servconf on the server. So, we create a mount point on the NFS client like this:



# mkdir /servconf



Then we add a line like the following to our /etc/fstab file:



mynfs:/servconf /servconf nfs ro,intr,nfsvers=3,proto=tcp 0 0



Now we're assured that the tree will be mounted at boot time. The other important factor to consider is that the tree is mounted before the service that needs the files living there is started. It should be safe to assume that this will just work, but if you're trying to debug services that seem to be ignoring configuration directives, or that fail to start at all, you'll want to double check, just in case!




6.2.3. Configuring the Service



We've now mounted our web server configuration data to all of our web servers. Let's assume for now that you've done the same with the content. What we've essentially accomplished is a way to have one hot spare machine, which also mounts all of this information, that can take over for any failed web server in the blink of an eye. Two ways to get it to work are to use symlinks or to edit the service's initialization script.


To use the symlink method, you consult the initialization script for the service. In the case of Apache, the script will most likely be /etc/init.d/apache or /etc/init.d/httpd. This script, like almost all service initialization scripts, will tell you where the daemon will look for its configuration file(s). In my case, it looks under /etc/apache. The next thing to do is to move this directory out of the way and make a symlink to the directory that will take its place. This is done with commands like the following:



# mv /etc/apache /etc/apache.DIST
# ln -s /servconf/web/web1 /etc/apache



Now when the service starts up, it will use whichever configuration files are pointed to by the symlink. The critical thing to make sure of here is that the files under the mount point conform to what the initialization script expects. For example, if the initialization script for Apache in this case was looking for /etc/apache/config/httpd.conf, it would fail to start at all, because the /etc/apache directory is now a symlink to a mount point that has put the file under a subdirectory called conf/, not config/. These little "gotchas" are generally few, and are worked out early in the testing phase of any such deployment.


Now, if we want to make our hot spare look like web3 instead of web1, we can simply remove the symlink we had in place, create a new symlink to point to web3's configuration directory, and restart the service. Note that if all of the web servers mount the content in the same way under the same mount points, you don't have to change any symlinks for content, since the configuration file in Apache's case tells the daemon where to find the content, not the initialization script! Here are the commands to change the personality of our hot spare to web3:



# rm /etc/apache; ln -s /servconf/web/web3 /etc/apache
# /etc/init.d/apache restart



The commands used to restart Apache can vary depending on the platform. You might run the apachectl program directly, or you might use the service command available on some Linux distributions.




6.2.4. A Final Consideration


You can't assume that you're completely out of the woods just because a server looks and acts like the one it replaces. In the case of Apache, you'll also want to make sure that your hot spare is actually reachable by clients without them having to change any of their bookmarks. This might involve taking down the failed web server and assigning its IP address to the hot spare or making the DNS record for the failed web server point to the hot spare.













No comments:

Post a Comment