Thursday, November 12, 2009

Using Ports














Installing software via the ports system takes longer than installing it via packages, and the ports system requires a live Internet connection. The ports system can produce better results for a given situation than packages, however, which more than compensates for these issues in most cases.


What makes ports so interesting is the level of automation they implement. With one command a port can find the source code for a program, fetch that code over the Internet, verify the integrity of the downloaded source code, patch the code to run properly on OpenBSD, integrate any changes required by your system setup, build the code into actual program binaries, and install it, without further human intervention. If you have compiled software on other platforms, you'll quickly realize what a time-saver this is.


Let's take a look at a port. Here's one of my non-negotiable requirements for comfortable systems administration, the tcsh shell.




# cd /usr/ports/shells/tcsh/
# ls
1 CVS 2 Makefile 3 README.html 4 distinfo 5 patches 6 pkg
#


As you'll see everywhere else, the 1 CVS directory contains revision control information.


The 2 Makefile contains the basic instructions needed to build this port. If you were to look at the Makefile, you would find that it's pretty minimal compared to the Makefiles found in almost all other software. Rather than target definitions, it contains variable definitions and an instruction to include another Makefile, bsd.port.mk. You'll find the real guts of the ports system in bsd.port.mk. (While you don't have to understand bsd.port.mk, or even look at it, when you have some free time you might want to peruse this file and see just how the ports system hangs together.) This Makefile only contains settings of interest to this particular port, rather than global settings used by the ports system as a whole.



3 README.html is a basic description of the port, generated by the "make readmes" command in /usr/ports (see "Browsing the Ports Tree").


The 4 distinfo file contains a variety of checksums for the source code of the port. The ports system uses this information to verify the integrity of downloaded source code. We discuss checksums in Chapter 10, but the ports system performs automatic checksum verification.


The 5 patches directory contains any patches necessary for the software to compile and run properly on this particular release of OpenBSD.


Finally, the 6 pkg directory contains various information about the port itself. Let's take a look.




# ls pkg/
CVS DEINSTALL DESCR INSTALL PLIST SECURITY
#


The DEINSTALL file contains a message that will be displayed when you uninstall the software, either through the ports mechanism's uninstall tool or pkg_delete(1).


DESCR contains the original long description of the port in plain text, including the home page and any flavors the port supports (see "Flavors").


INSTALL contains a message that is displayed when the port or the package built from the port finishes installing.


PLIST contains a list of all the files contained in the completed port.



Finally, the SECURITY file contains a list of known security issues with the port. The OpenBSD team does not subject ports to the same level of security scrutiny that it inflicts upon the base system, but does audit third-party software as time permits. Because add-on software is maintained outside OpenBSD, the project team's ability to secure distributed software is limited by the producers of the add-on software.


Combined, the files in the ports directory create the tools and instructions needed to build a port.




Installing a Port


You have probably noticed that we didn't see any actual source code in the port directory. Sure, there are patches to apply to source code, and scripts to run on source code, and notes about source code, but no actual code! You might rightly ask how this is supposed to run without the source code.


When you activate a port, your system automatically downloads the appropriate source code from an approved Internet site. It then checks the downloaded code for integrity errors, extracts the code to a working directory, patches it, builds it, creates a package, and installs the package. If the port has dependencies that are not already installed, it will interrupt the build of this software to build those dependencies. To trigger all this, all you have to do is go to a port directory and type




# make install


When you do, you'll see lots of text scroll down your terminal window, ending with an "Installing" message. If you have a good Internet connection, this can be easier than using packages!





What the Port Install Does


Let's dissect a port installation. I need tcsh much in the same way I need oxygen or caffeine, so we'll build that.




# make install
===> Checking files for tcsh-6.12.00
>> tcsh-6.12.00.tar.gz doesn't seem to exist on this system.
>> Attempting to fetch /usr/ports/distfiles/tcsh-6.12.00.tar.gz from ftp://ftp.astron.com/pub/tcsh/.
...


If the source code for this particular version of tcsh had been in the /usr/ports/distfiles directory, make would have found it. Because it isn't there, make(1) tries to download the source from a list of approved sites stored in the ports system or in the port itself. You will see various chunks of FTP output where the ports system downloads the file, then continues with the building process.





...
1 >> Checksum OK for tcsh-6.12.00.tar.gz. (sha1)
2 ===> Extracting for tcsh-6.12.00
3 ===> Patching for tcsh-6.12.00
4 ===> Configuring for tcsh-6.12.00
creating cache ./config.cache
checking host system type... i386-unknown-openbsd3.2
...


Make 1 compares the downloaded source code with the integrity information available in the distinfo file and finds that the downloaded file matches the one the OpenBSD port maintainer used. That means that the port will treat the file as intact and correct — presumably the port maintainer verified that the code they used did not contain any backdoors! It then 2 uncompresses the source code, 3 applies any local OpenBSD patches, and 4 starts the build process. (Observant and knowledgeable readers know that "configure" is not the same as "build," but if you keep your eyes open you'll also see a "build" statement later.) You will see many lines of build output — this particular program contains a few dozen, but a large program such as KDE can build for hours and have thousands of lines of make output. Eventually, however, you'll see a message like this:




...
===> Faking installation for tcsh-6.12.00
install -c -s -o root -g bin -m 555 /usr/ports/shells/tcsh/w-tcsh-6.12.00/tcsh-6.12.00/tcsh /usr/ports/shells/tcsh/w-tcsh-6.12.00/fake-i386/usr/local/bin/tcsh
...


The ports system is installing the software in a temporary location, so as to correctly build a clean package. This is called a "fake" installation because, well, it isn't the real installation that will end up on your hard drive. Once the fake install is complete, you'll see packages being built and then the real install.




...
===> Building package for tcsh-6.12.00
Creating package /usr/ports/packages/i386/All/tcsh-6.12.00.tgz
Using SrcDir value of /usr/ports/shells/tcsh/w-tcsh-6.12.00/fake-i386/usr/local
Creating gzip'd tar ball in '/usr/ports/packages/i386/All/tcsh-6.12.00.tgz'
===> Installing tcsh-6.12.00 from 1 /usr/ports/packages/i386/All/tcsh-6.12.00.tgz
...


The important thing to notice here is the 1 location of the package file. You may want to grab this package to install on other machines. This package will contain any local optimizations you may have added to the system, so you can use it to quickly install an utterly identical version of the software on any other OpenBSD systems of the same release and architecture.


Finally, the port spits out a message before returning you to a command prompt.





+--------------
| For proper use of tcsh-6.12.00 you should notify the system
| that /usr/local/bin/tcsh is a valid shell by adding it to the
| the file /etc/shells. If you are unfamiliar with this file
| consult the shells(5) manual page
+--------------


This message is important enough that an overworked OpenBSD ports developer spent a few moments to make it appear. Read it. In this case, you have to make some manual changes to /etc/shells (see Chapter 14) for this port to work correctly. In general, OpenBSD ports do not make drastic system changes that can affect system integrity or security; they require the sysadmin to do that himself. Remember, any piece of software you install impacts system security in some way, and even something as innocuous as a shell program might have programming errors that provide a back door into a system. OpenBSD may provide a large-caliber gun and high-explosive armor-piercing bullets, but if you want to shoot yourself in the foot you must pull the trigger yourself.
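The manual change that message asks for can be sketched as a small shell function. This is only an illustration, not something the port ships: the name add_shell is made up here, and the function simply appends a line to the file you name unless that exact line is already present.

```shell
# add_shell FILE SHELL_PATH   (hypothetical helper, not part of OpenBSD)
# Append SHELL_PATH to FILE unless an identical line is already there.
add_shell() {
    shells_file=$1
    shell_path=$2
    # -F: fixed string, -x: match the whole line, -q: no output
    if grep -Fxq -e "$shell_path" "$shells_file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$shell_path" >> "$shells_file"
}

# Typical use, as root:
#   add_shell /etc/shells /usr/local/bin/tcsh
```

Running it twice leaves a single entry, so it is safe to repeat after reinstalling the port.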


The interesting thing about this process is that the port build process can actually be stopped at any of these steps. If you want to do some custom work on a port as it builds, you can carefully control the build process.





Port Build Stages


The port installation process includes several stages, which can all be called separately. Each stage performs all of the stages before it — the final stage, "make install," calls all of these. We'll discuss each stage and some of the customizations that can be made during this process.
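To make the "each stage performs all of the stages before it" rule concrete, here is a rough sketch that prints which stages a given target implies. The stage list follows the order in which this section describes them; the function name stages_through is invented for illustration and has nothing to do with how bsd.port.mk is actually written.

```shell
# Stage names, in the order this section walks through them.
PORT_STAGES="fetch checksum depends extract patch configure build fake package install"

# stages_through TARGET   (hypothetical helper)
# Print every stage that "make TARGET" implies, one per line.
# Returns 1 if TARGET is not one of the stages above.
stages_through() {
    target=$1
    case " $PORT_STAGES " in
        *" $target "*) ;;
        *) echo "unknown stage: $target" >&2; return 1 ;;
    esac
    for stage in $PORT_STAGES; do
        printf '%s\n' "$stage"
        [ "$stage" = "$target" ] && return 0
    done
}

# Example: stages_through patch
```

Asking for "patch" lists fetch, checksum, depends, extract, and patch, which is why you can stop at "make patch" and still have verified, extracted source to look at.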


Many port-building customizations are performed via variables. You can set these variables in the environment of the person building the program or on the command line, or, if you want them to be used by all users or by every port, in /etc/mk.conf. You can set a variable on any target that includes the stage the variable affects. For example, if you need to use a special command to download files, you can set that command while running "make install."




Make Fetch


The "make fetch" process checks to see if the source code for the port is available locally. This source file is called a distfile. The location it will check is defined by the environment variable DISTDIR, or /usr/ports/distfiles if DISTDIR is not set. For example, if you have a central software source code repository on your network mounted over NFS as /central/sourcerepo, you could set "DISTDIR=/central/sourcerepo" and use that location instead. This allows you to share the downloads among as many machines as you like and reduces external bandwidth usage.
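The DISTDIR fallback described above is easy to mimic in the shell. The two functions below are a sketch only; their names are made up, and they merely report where a distfile would be looked for and whether it is already there, without fetching anything.

```shell
# distfile_path NAME   (hypothetical helper)
# Print where a distfile called NAME would be looked for:
# under $DISTDIR if it is set, otherwise under /usr/ports/distfiles.
distfile_path() {
    printf '%s/%s\n' "${DISTDIR:-/usr/ports/distfiles}" "$1"
}

# have_distfile NAME   (hypothetical helper)
# Succeed if the distfile is already present locally.
have_distfile() {
    [ -f "$(distfile_path "$1")" ]
}

# Example:
#   DISTDIR=/central/sourcerepo
#   have_distfile tcsh-6.12.00.tar.gz && echo "no download needed"
```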


If the software is not available locally, "make fetch" tries to download it from the Internet. The source code location is specified in the port's Makefile as the variable MASTER_SITES. (See "Customizing Download Sources" for some more hints on this variable.) By default, make fetch uses ftp(1) to grab the software.



The ftp(1) program might have problems in certain environments, however. If you need to change the command used, you can do that with the FETCH_CMD variable. For example, I frequently download software from behind a SOCKS5 proxy; I can set FETCH_CMD='/usr/local/bin/runsocks /usr/bin/ftp' to have things work transparently. Alternatively, if I'm in a location where a simple ftp program does not work properly, I could use wget (/usr/ports/net/wget) and try FETCH_CMD='/usr/local/bin/wget --passive-ftp' with whatever other wget options (usernames, passwords, etc.) are necessary to grab my source code files.


I said earlier that you could use this sort of variable during any make command, and it would be used at the proper stage. Here's an example of that in action:




# make FETCH_CMD='/usr/local/bin/wget --passive-ftp' install


We're trying to run the "make install" command, which runs "make fetch" at an earlier stage. Setting FETCH_CMD or DISTDIR will affect "make fetch," but not interfere with later commands.


The "make fetch" command is very useful when you have certain times that you can download more easily than others. For example, I have a laptop which has a great deal of bandwidth available at certain locations and almost no bandwidth elsewhere. I can run "make fetch" on several ports where bandwidth is available and then unplug from the network and let them build.
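Fetching for several ports in one go might look something like the loop below. Treat it as a sketch: fetch_ports is a made-up name, and the PORTSDIR and MAKE overrides are assumptions added here so the loop can be tried outside a real ports tree.

```shell
# fetch_ports PORT...   (hypothetical helper)
# Run the fetch stage for each port, given as category/name.
# PORTSDIR defaults to /usr/ports and MAKE to make; both overrides are
# assumptions of this sketch.
fetch_ports() {
    ports_root=${PORTSDIR:-/usr/ports}
    make_cmd=${MAKE:-make}
    failed=0
    for port in "$@"; do
        # The subshell keeps the cd from leaking into the caller.
        if ! (cd "$ports_root/$port" && $make_cmd fetch); then
            echo "fetch failed for $port" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example, while the bandwidth is good:
#   fetch_ports shells/tcsh net/wget
```

A failed fetch for one port does not stop the loop, so you find out about every missing distfile before you unplug.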





Make Checksum


The "make checksum" command confirms that the distfile has not been corrupted, either maliciously or during download. A checksum is the result of a mathematical computation on a file and is discussed in great detail in Chapter 10. If you change the file, the checksum will be changed. Because it is theoretically possible to "pad" an altered file so that a particular checksum will be matched, OpenBSD checks each distfile against three different sorts of checksum: MD5, RMD160, and SHA1. A hacker could alter a bunch of source code and pad the files so that the MD5 checksum would be unchanged, but he couldn't change it so that all three checksums would be met!


After downloading the software, "make checksum" computes all three checksums for the downloaded distfiles. If the checksums match those given in the "distinfo" file in the port, the port build continues. If the checksums do not match, the build immediately aborts and does not continue until you find a distfile that matches the checksums. This might seem overly paranoid, but checksum matching quickly alerted the public when an intruder placed a Trojan in the official OpenSSH source code distribution.
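The idea of requiring every checksum to match can be sketched in a few lines of shell. This is not the ports system's own code: the function names are invented, the expected digests are passed on the command line rather than read from distinfo, and the sketch assumes either GNU-style md5sum/sha1sum tools or BSD-style md5/sha1 tools that accept -q.

```shell
# digest ALGO FILE   (hypothetical helper)
# Print the hex digest of FILE. ALGO names a digest tool, e.g. md5 or sha1.
digest() {
    algo=$1
    file=$2
    if command -v "${algo}sum" >/dev/null 2>&1; then
        # GNU-style tools print "<digest>  <file>"
        "${algo}sum" "$file" | cut -d ' ' -f 1
    else
        # Assumption: BSD-style tools print only the digest with -q
        "$algo" -q "$file"
    fi
}

# verify_distfile FILE ALGO:DIGEST...   (hypothetical helper)
# Succeed only if every listed digest matches the file.
verify_distfile() {
    distfile=$1
    shift
    [ "$#" -gt 0 ] || return 1
    for pair in "$@"; do
        want_algo=${pair%%:*}
        want=${pair#*:}
        got=$(digest "$want_algo" "$distfile") || return 1
        if [ "$got" != "$want" ]; then
            echo "$want_algo mismatch for $distfile" >&2
            return 1
        fi
    done
}
```

A single mismatch is enough to fail the whole check, which mirrors the way the build aborts rather than carrying on with a suspect distfile.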


If the checksum match fails, it might be because the software distributor updated the source file without changing the version number. This is endemic in the free software world. It can also happen if an intruder compromises a mirror site where the software is distributed. You can set the REFETCH variable to "yes" to make the ports system try to fetch a distfile from ftp.OpenBSD.org, if the original file fails the checksum match. This will get around both of these problems.



If you have downloaded the file, and you are certain it is correct, but the checksums still do not match, you can define NO_CHECKSUM to have OpenBSD skip the checksum computation and continue. This is an extraordinarily bad idea. At the time the port was made, the files had the checksum given in the port. If the checksum doesn't match, that means that the software is changed in some manner. Perhaps the file was corrupted, or changed by a hacker, or updated by the distributor, or any number of other problems. The simple issue is, the source code does not match what the port was made for. Perhaps the patches included in the port will not apply cleanly any more, or maybe the software will not run on OpenBSD, or you could even be installing a back door inviting a hacker to store his kiddy porn collection on your hard drive. You're on your own if you insist on compiling software after a checksum mismatch.




Make Depends


The make depends target confirms that all the dependencies of the port are available and builds them if they are not. For example, our WindowMaker package required two other packages. If we had built WindowMaker from ports, the WindowMaker port build would have triggered builds of those other ports at this stage. It will recurse through any dependencies and build them all.


You can set NO_DEPENDS to "yes" to skip this stage, but your port may not compile or run if you do so.





Make Extract


The ports system needs to uncompress and extract the source code from the distfile before the port can be built. Source code is extracted in the port directory, in a subdirectory named with a w- prefix followed by the software's name and version. For example, the tcsh port we built earlier, tcsh version 6.12.00, was extracted in the directory /usr/ports/shells/tcsh/w-tcsh-6.12.00. Look in this directory for the actual code that is being compiled.
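If you script around ports, the naming rule for the work directory is simple enough to spell out. The helpers below are a sketch with made-up names; the source subdirectory layout is taken from the install line in the tcsh build output shown earlier, and other ports may lay things out differently.

```shell
# port_workdir PORT_DIR NAME VERSION   (hypothetical helper)
# Print the work directory: PORT_DIR/w-NAME-VERSION
port_workdir() {
    printf '%s/w-%s-%s\n' "$1" "$2" "$3"
}

# port_srcdir PORT_DIR NAME VERSION   (hypothetical helper)
# Print where the extracted source sat for the tcsh build above:
# a NAME-VERSION directory inside the work directory.
port_srcdir() {
    printf '%s/%s-%s\n' "$(port_workdir "$1" "$2" "$3")" "$2" "$3"
}

# Example: port_srcdir /usr/ports/shells/tcsh tcsh 6.12.00
```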





Make Patch


Now that you have the source code, the ports system can apply the local OpenBSD patches from the "patches" directory.


If you want to apply your own patches to the code, or if you want to review the code before compiling it, run "make patch" first. Your patches might conflict with the OpenBSD patches if you apply them first, or cause compilation failures, or any number of other things. By running "make patch" first, you get to see the code as OpenBSD will compile it.





Make Configure


The "make configure" command runs any precompilation configure scripts included in the software. If you want to edit the configure script, do so before letting this step run! If there is no configure script, the port system silently skips it.






Make Build


This step actually builds the extracted, patched, and configured software. Any customizations you want to make need to be finished before you run this step! When it is completed, you will have the finished program binaries in the port's work directory.




Make Fake


This command installs the software in a subdirectory of the work directory, laid out exactly as it would be if it were actually installed. This directory is generally named with the word "fake" and the name of the architecture it was built for, e.g., "fake-i386." You can look through this directory to see what will be installed.
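One way to look through the fake directory is to list its files with the fake prefix stripped, so the paths read as they will on the real system. This is a sketch with an invented function name; it assumes the fake-ARCH naming described above and defaults to i386.

```shell
# list_fake_files WORKDIR [ARCH]   (hypothetical helper)
# Print the files under WORKDIR/fake-ARCH as absolute install paths.
list_fake_files() {
    fake_dir=$1/fake-${2:-i386}
    if [ ! -d "$fake_dir" ]; then
        echo "no fake directory: $fake_dir" >&2
        return 1
    fi
    # Run find from inside the fake directory so paths come out relative
    # to it, then turn the leading "." into the root "/".
    (cd "$fake_dir" && find . -type f | sed 's|^\.||' | sort)
}

# Example: list_fake_files /usr/ports/shells/tcsh/w-tcsh-6.12.00
```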




Make Package


This command bundles up the port's fake directory, adds in packing and installation instructions, and ties it all up in a package exactly like those available on CD-ROM. The package will be stored under /usr/ports/packages, in a subdirectory by architecture, in a further subdirectory by port category. For example, the package built from the tcsh port is stored in /usr/ports/packages/i386/shells/tcsh-6.12.00.tgz. You can build a package on a machine without installing the software on that machine.


You can then install this package on other machines of the same release and architecture, or use it to verify the integrity of installed software.
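To grab a freshly built package for another machine you first have to find it. The build output earlier showed the tcsh package landing in an All subdirectory, while the text above names a per-category subdirectory, so this sketch simply checks both. The function name and the PKG_ROOT override are inventions for illustration.

```shell
# find_package PKGNAME CATEGORY [ARCH]   (hypothetical helper)
# Print the path of PKGNAME.tgz under the package tree, checking the
# All directory first and then the category directory.
# PKG_ROOT is an assumption of this sketch; it defaults to /usr/ports/packages.
find_package() {
    repo=${PKG_ROOT:-/usr/ports/packages}/${3:-i386}
    for dir in "$repo/All" "$repo/$2"; do
        if [ -f "$dir/$1.tgz" ]; then
            printf '%s\n' "$dir/$1.tgz"
            return 0
        fi
    done
    return 1
}

# Example: find_package tcsh-6.12.00 shells
```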





Make Install


This final step performs some sanity checks and runs pkg_add(1) to install the package compiled from the port.




Tracking Port-Building Information


How does the ports system know what has been run before? If you run "make patch," edit the source files, and then run "make install" to complete the port installation, why doesn't the port start over at the beginning? The ports system keeps track of completed make targets with hidden files in the port's work directory. These files begin with a period so they don't show up in a normal directory listing, but they show up easily with "ls -a". Let's look at the tcsh work directory after the build is done.




# ls -a
. 1 .configure_done 2 .patch_done pkg
.. 3 .extract_done bin tcsh-6.12.00
4 .build_done 5 .extract_started fake-i386
#


The 1 .configure_done file indicates that "make configure" has been successfully run. Similarly, 2 .patch_done means that make patch has been run, 3 .extract_done means that the make extract has finished, 4 .build_done means that the port has been built, and 5 .extract_started means that the make extract process has been started.
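Since the marker files are ordinary files, a few lines of shell can summarize how far a build has gotten. This sketch only knows about the four _done files that appear in the listing above; the function name is made up.

```shell
# port_stage_status WORKDIR   (hypothetical helper)
# Report which of the marker files seen above exist in WORKDIR.
port_stage_status() {
    for stage in extract patch configure build; do
        if [ -e "$1/.${stage}_done" ]; then
            printf '%s: done\n' "$stage"
        else
            printf '%s: not yet\n' "$stage"
        fi
    done
}

# Example: port_stage_status /usr/ports/shells/tcsh/w-tcsh-6.12.00
```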



You can use this to your advantage. Suppose you don't want to apply the OpenBSD patches to a port. (This isn't a good idea, but presumably you have reasons other than wanting a bullet in your foot.) You could run "make extract," create a file called ".patch_done" in the work directory, and then run "make install." The "make install" process would then think that the patches had been applied and continue on its merry way. You can do something similar to run your own configure process or make any other changes you like. Of course, if you do this you're completely on your own if the software breaks or has a security issue.
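The skip-the-patches trick described above boils down to three steps, sketched here as a function. The name is invented, and the MAKE override is an assumption added so the sequence can be rehearsed without a real port; all the same caveats about being on your own apply.

```shell
# build_without_patches PORT_DIR   (hypothetical helper)
# Extract, pretend the patch stage already ran, then install.
build_without_patches() {
    (
        cd "$1" || exit 1
        ${MAKE:-make} extract || exit 1
        # After extract, the work directory is the w-* directory here.
        for workdir in w-*; do
            if [ ! -d "$workdir" ]; then
                echo "no work directory under $1" >&2
                exit 1
            fi
            touch "$workdir/.patch_done"
        done
        ${MAKE:-make} install
    )
}

# Example: build_without_patches /usr/ports/shells/tcsh
```

The body runs in a subshell so the cd does not change the caller's directory.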












