December 17, 2003 Edition

By Amit "Prototyped" Gurdasani (mailto:amitg@alumni.cmu.edu), Ron "Soko" Sokoloski (mailto:rsokoloski1@cogeco.ca), Matt "XWRed1" Thrailkill (mailto:xwred1@modestolan.com)

Welcome back to another holiday edition of Linux.Ars. In this issue, we delve beneath the surface to show you just how your x86 machine pulls itself up by its bootstraps to start up Linux. x86 hardware has been around for a long time, and along the way, has accumulated lots of quirks and complexities in its boot process. We hope to make the picture a bit clearer. If that isn't enough, we introduce you to FreeBSD's unique package management system and finally, we've got your GUI firewall solution that makes it easy to protect yourself from those evil h4x0rs. Read on . . .

 

The Linux boot process on IA32 hardware

Normally, when an x86 computer is started up or the CPU reset is invoked, the first thing that the CPU does is to put itself in a known state, and load a specific address in its program counter. This address starts the execution of the Basic Input/Output System (BIOS), which is firmware (firmware is software that resides in memory that's read-only during most operations, usually in an EEPROM, hence "firm"). The BIOS's first task is to verify that it has enough RAM to properly start up, then check that a video device is available in the system. Once found, it initializes it, allocating resources using the Plug-and-Play protocol, if appropriate, and runs the BIOS found on a ROM chip on the video device. This provides a low-level interface (the so-called "int 0x10" interrupt handler) to allow the BIOS and subsequent stages to use the display.

The BIOS knows how to do a set of cursory tests, called the Power On Self Test (POST). These typically include detecting the CPU model and testing RAM. After performing these sanity checks, it looks for plug-and-play devices (AGP, PCI/PCI-X and ISA devices, typically), and allocates resources for them.

In modern BIOSes, the decisions taken at each of these stages are configurable via a setup program that is usually invoked during the boot process itself. These settings are stored in a non-volatile RAM (NVRAM) chip. (This NVRAM is usually just low-power CMOS static RAM whose contents are preserved by a battery of some sort, like the coin battery some motherboards have.)

After setting up resources for various system devices, the BIOS probes for "option ROMs" on various peripheral devices. (This is the case both for add-in cards on the PCI bus and for devices integrated on the motherboard, such as a Serial ATA controller chip, a SCSI host bus adapter, or an integrated Ethernet device that is accompanied by a ROM chip.)

The BIOS then looks for devices to boot off, as specified in the NVRAM. Usually, this entails looking for bootable removable media first, then the hard disks, if any, and then the option ROMs. If no appropriate boot device can be found attached to the onboard controllers, the BIOS will leave it to the option ROMs to attempt booting. (Of course, you can often cause the BIOS to try the option ROMs first through a configuration setting.)

The boot process is handled by the BIOS or the option ROM that knows about the device that is chosen to boot from. This means that if the boot device is on an onboard device that the BIOS knows about, the BIOS will handle booting; if it is attached to a peripheral with an option ROM, the option ROM will boot off it.

To boot off hard disks, floppy disks, Zip disks, LS-120 floptical disks and the like, typically, the first sector of the disk is loaded into a fixed position in memory (physical addresses 0x7C00 through 0x7DFF, often written in segment:offset form as 0x07C0:0x0000) and run. On floppies, this is the boot sector. On hard disks, this is the MBR (Master Boot Record) that also includes the partition table, which is a set of four 16-byte records that describe the primary and extended partitions on the hard disk. (Logical drives within the extended partitions are not described here; they're described within the extended partitions themselves, in a structure not dissimilar to the partition table.)
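The boot sector's load address is conventionally written in real-mode segment:offset form as 0x07C0:0x0000 (or, equivalently, 0x0000:0x7C00). A segment value is multiplied by 16 and added to the offset to get a physical address. The arithmetic can be sketched in a few lines of Python; the function name here is made up purely for illustration.

```python
SECTOR_SIZE = 512  # bytes in one boot sector


def real_mode_physical(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair to a physical address."""
    return (segment << 4) + offset


# Where a 512-byte boot sector loaded at 0x07C0:0x0000 starts and ends.
load_start = real_mode_physical(0x07C0, 0x0000)
load_end = load_start + SECTOR_SIZE - 1
print(hex(load_start), hex(load_end))  # 0x7c00 0x7dff
```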

A default MBR starts with a program whose job it is to scan the partition table to find a partition marked Active for Boot. Once such a partition is found, the MBR uses BIOS disk services (called "int 0x13" services) to load the partition's boot sector into memory and passes control to it.
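To make the scan concrete, here is a rough Python sketch of what such an MBR program does logically: walk the four 16-byte records and pick the one flagged active. It assumes the conventional layout (table at byte offset 446, the two-byte 0x55 0xAA signature at the end of the sector, status byte 0x80 meaning active). The helper names and the demo sector are invented for this example, and a real MBR does this in a few hundred bytes of 16-bit assembly rather than in Python.

```python
import struct
from typing import List, NamedTuple, Optional

PARTITION_TABLE_OFFSET = 446   # conventional offset of the table in the MBR
ENTRY_SIZE = 16                # each of the four records is 16 bytes
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR


class PartitionEntry(NamedTuple):
    index: int
    active: bool
    type_id: int
    lba_start: int
    sector_count: int


def parse_partition_table(mbr: bytes) -> List[PartitionEntry]:
    """Decode the four primary partition records from a 512-byte MBR."""
    if len(mbr) != 512 or mbr[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a 512-byte sector ending in 0x55 0xAA")
    entries = []
    for i in range(4):
        offset = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        # status byte, 3 CHS bytes (skipped), type byte, 3 CHS bytes (skipped),
        # then little-endian 32-bit starting LBA and sector count
        status, type_id, lba_start, count = struct.unpack_from(
            "<B3xB3xII", mbr, offset)
        entries.append(PartitionEntry(i, status == 0x80, type_id, lba_start, count))
    return entries


def find_active(entries: List[PartitionEntry]) -> Optional[PartitionEntry]:
    """Return the first non-empty entry marked active, or None."""
    for entry in entries:
        if entry.active and entry.type_id != 0:
            return entry
    return None


def make_demo_mbr() -> bytes:
    """Build a synthetic MBR whose second record is an active type-0x83 partition."""
    mbr = bytearray(512)
    struct.pack_into("<B3xB3xII", mbr, PARTITION_TABLE_OFFSET + ENTRY_SIZE,
                     0x80, 0x83, 63, 1000)
    mbr[510:512] = BOOT_SIGNATURE
    return bytes(mbr)


if __name__ == "__main__":
    print(find_active(parse_partition_table(make_demo_mbr())))
```

On a real machine, the same parser could be fed the first 512 bytes of a disk device (which normally requires root privileges to read).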

However, Linux bootloaders such as LILO and GRUB are often configured to take over the MBR for their own purposes, preserving the partition table, but placing their own first stage into it, so that the boot sector is never loaded. Instead, the bootloader's first stage focuses on retrieving its next stage off the disk and passing control to it.

The subsequent stages of the bootloader (a program designed to set up the system environment for the operating system kernel to be loaded and started) typically locate the kernel image in the appropriate partition (usually defined beforehand, but with some bootloaders, able to be provided at boot time by the user via a bootprompt) and load it off the hard disk into memory, and pass control to it. Advanced bootloaders such as GRUB are even able to read the filesystem on which the kernel image resides. (The likes of LILO have the location written into one of the bootloader stages as a physical location on the disk; this is why LILO users must run the lilo utility to recreate it every time they install a new kernel image.) The Linux kernel has the ability to boot off a compressed initial RAM disk image (initrd); the bootloader is what retrieves this image off the hard disk into memory, so that the kernel can uncompress it and boot off it. An initrd is necessary when you use a modular kernel that does not carry the drivers for, say, a SCSI device that the root filesystem is on. Since the kernel won't be able to get the module for the SCSI card until the SCSI disk is mounted — a Catch-22 situation — the module is placed in the initrd, which is loaded by the bootloader using the services of the SCSI card's option ROM that extend the standard BIOS disk services. The initrd will have the required drivers, and allow booting to proceed.

ISO 9660 filesystems as used on most bootable CD-ROMs and DVD-ROMs are special. An El Torito–compliant CD image can carry a floppy or hard disk image that the BIOS will load into memory and treat as though it really was a floppy disk or a hard disk. Alternatively, it can load a set number of 2KB sectors into memory and run them without emulating any device. Such a "no-emulation" system would consist of (surprise) the bootloader.
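For the curious, a bootable image can be recognized by its Boot Record Volume Descriptor. The following Python sketch assumes the layout given in the El Torito specification as we understand it: the descriptor lives in 2KB sector 17, starts with a zero byte and the string "CD001", carries the identifier "EL TORITO SPECIFICATION", and stores the boot catalog's sector number as a little-endian 32-bit value at offset 0x47. The function names and the synthetic demo image are hypothetical.

```python
import struct
from typing import Optional

CD_SECTOR_SIZE = 2048
BOOT_RECORD_SECTOR = 17                    # assumed location per El Torito
EL_TORITO_ID = b"EL TORITO SPECIFICATION"


def boot_catalog_sector(image: bytes) -> Optional[int]:
    """Return the boot catalog's sector number, or None if not El Torito."""
    start = BOOT_RECORD_SECTOR * CD_SECTOR_SIZE
    desc = image[start:start + CD_SECTOR_SIZE]
    if len(desc) < CD_SECTOR_SIZE:
        return None                        # image too short to hold sector 17
    if desc[0] != 0 or desc[1:6] != b"CD001":
        return None                        # not a boot record descriptor
    if desc[7:39].rstrip(b"\x00") != EL_TORITO_ID:
        return None                        # some other boot system
    (catalog_sector,) = struct.unpack_from("<I", desc, 0x47)
    return catalog_sector


def make_demo_image(catalog_sector: int = 19) -> bytes:
    """Build a tiny synthetic image containing only a boot record descriptor."""
    image = bytearray((BOOT_RECORD_SECTOR + 1) * CD_SECTOR_SIZE)
    start = BOOT_RECORD_SECTOR * CD_SECTOR_SIZE
    image[start + 1:start + 6] = b"CD001"
    image[start + 6] = 1                   # descriptor version
    image[start + 7:start + 7 + len(EL_TORITO_ID)] = EL_TORITO_ID
    struct.pack_into("<I", image, start + 0x47, catalog_sector)
    return bytes(image)


if __name__ == "__main__":
    print(boot_catalog_sector(make_demo_image()))
```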

Some network cards incorporate a boot ROM that has a TCP/IP protocol stack, and is able to receive a BOOTP or DHCP lease off the network. BOOTP and DHCP are protocols used to automatically configure a network host's IP address, network mask, default route, name server, etc., and can also specify the location of a file for such a boot ROM to retrieve and run. The boot ROM will try to retrieve this file using the Trivial File Transfer Protocol (TFTP), and then run it. This file can be a network bootloader such as pxelinux, which comes as part of H. Peter Anvin's excellent SYSLINUX (ftp://ftp.kernel.org/pub/linux/utils/boot/syslinux/) bootloader suite. (The name SYSLINUX may seem familiar to you, since most recent distro installers and rescue floppy sets use variants of it to boot.) pxelinux uses the network boot BIOS's resources to load its configuration script using TFTP, and to load the Linux kernel indicated in it, and then to pass control to it.
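TFTP really is trivial. The read request a boot ROM sends (to UDP port 69 on the server) is just a two-byte opcode followed by two NUL-terminated strings, as described in RFC 1350. A minimal Python sketch of building and decoding such a request follows. The function names are ours, and "pxelinux.0" is used only as an example filename; the actual name comes from whatever the BOOTP/DHCP server hands out.

```python
import struct

TFTP_PORT = 69     # well-known UDP port for TFTP
OPCODE_RRQ = 1     # read request opcode from RFC 1350


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OPCODE_RRQ)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def parse_rrq(packet: bytes):
    """Decode a read request back into (filename, mode)."""
    (opcode,) = struct.unpack_from("!H", packet, 0)
    if opcode != OPCODE_RRQ:
        raise ValueError("not a TFTP read request")
    filename, mode, _rest = packet[2:].split(b"\x00", 2)
    return filename.decode("ascii"), mode.decode("ascii")


if __name__ == "__main__":
    request = build_rrq("pxelinux.0")
    print(request)
    print(parse_rrq(request))
```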

As an alternative to such a BIOS (usually conforming to the PXE [Pre-Execution Environment] specification) on a network adapter, it is possible to use bootable removable media that achieves a similar effect. Vendors such as emBoot and Argon provide PXE-on-disk implementations (for-pay) that can be used to boot off a network adapter.

An alternative to using a PXE BIOS and pxelinux is to use a (free) Etherboot (http://www.etherboot.org/) floppy image (http://www.rom-o-matic.net/5.2.2) that can be written to a floppy disk (using dd or rawrite or some such tool) or a bootable CD-ROM image (using floppy emulation, prepared using a tool such as mkisofs or Ahead's Nero Burning ROM or similar). In each case, starting up the computer with no other bootable media will cause the system to attempt to boot off the network. Etherboot is not a PXE client implementation, and does not currently work with pxelinux. However, it has the ability to load and execute kernel NBIs (network boot images) prepared with the accompanying mknbi tool.

Code at the beginning of the kernel image rearranges things in memory and uncompresses the image. Next, the kernel sets up memory page tables and switches to protected-mode paged virtual addressing from the 8088-compatibility real mode that the CPU starts up in. Next, the CPU type and count are determined. Then, the kernel sets up interrupt handlers, which are routines that the hardware runs to notify the software of events. (These include things like timer ticks, completion of peripheral I/O, software or hardware failures, bad memory accesses [most of which are intentional], and so on, and are central to the operation of a modern computer.) It also sets up the terminal driver and a virtual terminal. It sets up various system caches. Next, it initializes interprocess communications and other infrastructure to support user tasks. It then starts initializing the peripheral buses (such as the PCI bus) and starts the device drivers built into it, where each driver looks at the peripheral bus device information and initializes the hardware it controls as appropriate. The network stack is also initialized. The root filesystem is identified and mounted. Finally, it loads and spawns off the init program, as well as the kernel idle thread. The scheduler (invoked by the interrupt handler for the timer interrupt) kicks off the init program.

If an initial RAM disk was specified as the root device, it is decompressed, and running a /linuxrc program is attempted. Then the real root device is mounted, and the RAM disk is either freed up, or mounted at some mount point on the real root device, as specified.

If the root device is an export off an NFS server, the kernel attempts to configure the TCP/IP stack, using either parameters provided to it by the bootloader at boot time, or by BOOTP, DHCP or RARP. (Of course, this requires the device driver for the network card to have been initialized.) Then it attempts to contact the NFS server and mount the root filesystem. After this, init is run on the NFS root filesystem as usual.
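In practice, the bootloader tells the kernel about an NFS root through kernel command-line parameters. The sketch below, in Python purely for illustration, assembles the commonly documented ones (root=/dev/nfs, nfsroot=server:path, and ip= with an autoconfiguration method). The server address and export path are hypothetical, and the exact set of parameters your kernel version accepts should be checked against its own documentation.

```python
def nfs_root_cmdline(server_ip: str, export_path: str,
                     autoconf: str = "dhcp") -> str:
    """Assemble kernel parameters for booting with an NFS root filesystem."""
    if autoconf not in ("dhcp", "bootp", "rarp"):
        raise ValueError("autoconf must be one of dhcp, bootp or rarp")
    return " ".join([
        "root=/dev/nfs",                         # tell the kernel the root is on NFS
        f"nfsroot={server_ip}:{export_path}",    # which export to mount
        f"ip={autoconf}",                        # how to configure the IP stack
    ])


if __name__ == "__main__":
    # Hypothetical server and export, for illustration only.
    print(nfs_root_cmdline("192.168.1.1", "/export/client1"))
```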

init loads the /etc/inittab configuration file and runs the command (usually a script) specified in it. That command will then run other scripts and commands as needed to bring up the system. init has the concept of runlevels: 0, 1, 2, 3, 4, 5 and 6. Of these, runlevels 0, 1 and 6 have special meaning. Runlevel 0 is used to bring tasks down to the halted state. Runlevel 1 is "single-user mode", an administrative mode where a single console is available, and a minimum number of tasks are running. Runlevel 6 is used to bring down tasks and reboot the system. The rest are distro-dependent.
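Each non-comment line in a System V–style /etc/inittab has four colon-separated fields: id, runlevels, action and process. As a rough illustration of how init interprets the file, here is a small Python parser. The sample entries are invented in the style of a Red Hat–like system and are not taken from any particular distro; real init implementations handle many more actions than this sketch does.

```python
from typing import List, NamedTuple, Optional

# Illustrative sample only; paths and ids vary between distributions.
SAMPLE_INITTAB = """\
# default runlevel
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
l5:5:wait:/etc/rc.d/rc 5
1:2345:respawn:/sbin/mingetty tty1
"""


class InittabEntry(NamedTuple):
    id: str
    runlevels: str
    action: str
    process: str


def parse_inittab(text: str) -> List[InittabEntry]:
    """Parse id:runlevels:action:process lines, skipping blanks and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":", 3)   # the process field may itself contain colons
        if len(fields) == 4:
            entries.append(InittabEntry(*fields))
    return entries


def default_runlevel(entries: List[InittabEntry]) -> Optional[str]:
    """Return the runlevel named by the initdefault entry, if any."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


def commands_for(entries: List[InittabEntry], level: str) -> List[str]:
    """List the processes configured for a given runlevel."""
    return [e.process for e in entries
            if level in e.runlevels and e.action != "initdefault"]


if __name__ == "__main__":
    parsed = parse_inittab(SAMPLE_INITTAB)
    level = default_runlevel(parsed)
    print(level, commands_for(parsed, level))
```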

/etc/inittab specifies what init needs to run when the runlevel changes. Typically, a script is run for each runlevel change, and that stops or starts programs depending on the system configuration and the runlevel being switched to. On System V–style setups, there are scripts (or symbolic links to them) in /etc/rc.d/rc[0-6].d, one directory per runlevel, named in a certain way, that are run with the parameter "start" or "stop" when the runlevel changes. Other setups will run scripts in other patterns.
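The naming scheme on most System V–style setups is a letter (K for "kill" or S for "start"), a two-digit sequence number, and the service name; the K links are typically run first with "stop", then the S links with "start", each group in sorted order. The Python sketch below models that ordering. The link names are hypothetical, and individual distributions differ in the details (for example, some skip services that are already in the desired state).

```python
import re
from typing import List, Tuple

# K or S, a two-digit sequence number, then the service name.
LINK_NAME = re.compile(r"^([SK])(\d\d)(.+)$")


def plan_runlevel(link_names: List[str]) -> List[Tuple[str, str]]:
    """Return (link, argument) pairs in the order an rc script would run them."""
    kills, starts = [], []
    for name in link_names:
        match = LINK_NAME.match(name)
        if not match:
            continue                      # ignore files that don't follow the scheme
        (kills if match.group(1) == "K" else starts).append(name)
    return ([(name, "stop") for name in sorted(kills)]
            + [(name, "start") for name in sorted(starts)])


if __name__ == "__main__":
    # Hypothetical directory contents for one runlevel.
    for link, arg in plan_runlevel(
            ["S55sshd", "K20nfs", "S10network", "README", "S90crond"]):
        print(link, arg)
```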

Usually, inittab will specify that getty or one of its variants be run on various terminals to allow interactive login. (This can include a getty on a serial port, a modem device, etc., for specialized uses.) At this point, the system is ready for interactive use.

In the next issue, we'll explore how to take advantage of this flexible boot process in order to have the computer boot off the network, without requiring any disks at all.

 

Overview of the FreeBSD ports and packages collection

Package management on *nix systems is pretty important and stirs up nearly as much religious fervor as the operating systems themselves. People usually take issue with binary- vs. source-based packaging, optimizations, quality assurance on packages and the package system, the sheer number of packages, and of course the plain flexibility and robustness of the system.

How does FreeBSD's Ports system compare to other package management systems? If you use Gentoo Linux, you'll be pretty accustomed to the way Ports works: Gentoo modeled its Portage system on FreeBSD Ports, hence its name. The Ports system is a lot more developed and mature than Portage, and far more comprehensive (9,662 ports at the time of this writing). On FreeBSD, the Ports system resides in /usr/ports by default. It is optional, and can be omitted entirely at install time (for instance, if one is setting up an embedded or otherwise customized system, or one only intends to install binary packages; more on this later). This is possible because the base FreeBSD system is maintained separately from the Ports collection, in contrast to most Linux distributions (which track the base OS install in the package system).

This is a double-edged sword. On the one hand, you can update tools such as Samba, Apache or XFree86 with impunity and be confident that it won't disturb your core OS install with unnecessary library upgrades or anything of that sort. On the other, you might want to use the automatic package-updating tools to track exploits and bug fixes in the base system. Staying on the security mailing lists will usually help you stay safe, but won't be an assurance of any kind of immediate exploit patching.

Another quality of the Ports system is that it is source-based, like Portage. This would tend to turn off people who strongly prefer the use of prebuilt binary packages for quick and consistent deployment. However, FreeBSD plays it both ways, satisfying proponents of both source-based packaging and binary package management: it provides binary and source packages at the same time. How does it do this?

Normally when you build a port, it is put into a package not unlike the tgz packages that Slackware uses. Then the system installs that package, and does some bookkeeping indicating that the package has been installed, what files were installed from it, what scripts should be run on removal, etc. That's handy enough for most administrators who may want to perform upgrades and installations among a fleet of machines. Dependency resolution and remote fetching comparable to the capabilities of apt and yum are still present when you install these packages. The FreeBSD team goes beyond this though: when they make a release, such as 5.2 due in the next couple months, they build packages of most of the things in Ports. And they host them on the mirrors. And the system will remotely fetch and install them if you ask it to. This provides you with the option of compiling — with customized optimization settings — ports where things like compiler optimizations do matter, and which are not expected to take a long time to build, and installing prebuilt binary packages of the rest.

If you are concerned about non-optimal compiler optimizations in these binary packages, don't fret; when you are building from source, you can still optimize away all you want. The settings are even stored in /etc/make.conf, just like in Gentoo Linux. There is an analog to Gentoo's USE flags as well, to allow you to control compile-time options. Usually it will tell you about important flags and you set them as environment variables when you build. More complex programs, like Samba, will actually give you a nice menu like so:

[Image: Portmenu.png]

How else does it differ from Gentoo's Portage system? The anatomy of a port is a bit different from a Portage ebuild. The compilation and installation of a port is controlled by a Makefile, with various description files, readmes, manifests, and so on. Building a port is as simple as cd-ing into the directory of that port (e.g. /usr/ports/audio/rhythmbox) and issuing a make command. Just running make will pull down any dependencies and build those as well. You install the built port with a make install. You can save a step and just do a make install from the start. It'll be smart enough and notice it needs to be compiled for that to work.

How are updates handled? You typically pull down the latest copy of the Ports tree from a FreeBSD CVS mirror with CVSup (http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/cvsup.html). The use of CVSup is comparable with the use of rsync in Portage. Once that pulls down the latest tree, you can install the latest things. You can't update existing things — not automatically, anyway. This is one of users' most frequent complaints about the system — the complex task of tracking updates to packages isn't handled by the system itself — you have to install a helper utility named Portupgrade (http://www.freebsd.org/cgi/man.cgi?query=portupgrade&apropos=0&sektion=0&manpath=FreeBSD+5.1-RELEASE+and+Ports&format=html). Anyone using Ports should install this tool. Almost any shortcoming in fancy features is handled by it. Once you have it, you can upgrade away and do other things.

Outside the package system, CVSup is also used to upgrade the base OS. You can choose to pull down from many different versions of the system. Mainly, there are four types you'd want to pull: CURRENT, STABLE, RELENG and RELEASE.

Both CURRENT and STABLE are loosely versioned branches of development. CURRENT represents the bleeding edge of FreeBSD. Only the most adventurous users and system developers would want to pull this down.

STABLE follows the major version releases. There is 4-STABLE and eventually, there is an intended 5-STABLE branch (when they feel it is truly stable enough for the sticklers). STABLE is endorsed for production use if you really know what you are doing, though RELEASEs (cut from the STABLE branch periodically) are strongly recommended.

RELENG stands for Release Engineering (http://www.freebsd.org/releng/index.html). You would tell CVSup to pull down, say, RELENG_5_1 for you. This would get you FreeBSD 5.1 as it was released, plus the latest security patches for the base system.

A RELEASE is the version you ask for precisely as it was released. If you were duplicating systems or perhaps stamping custom ISOs of a given version, this is what you'd want.

The Ports system has been around for a while now and is still undergoing evolution, in a sense. The system talked about here, what you get with FreeBSD, is pretty mature — it just gets new packages and otherwise stays the same. OpenBSD uses a variant of it, but their Ports tree is smaller and they encourage everyone to use the prebuilt binaries, unless there is a need to build your own — they have similar advice regarding their kernel, too. Gentoo's Portage is the most prominent variant, although it is more inspired by Ports than directly related to it. Anyone who runs Gentoo and has read this far can see the similarities. Likewise, Gentoo has made some improvements on the concept: the functionality provided by portupgrade comes with the base system, ebuilds are perhaps easier to write from scratch than Makefiles, and Portage runs primarily on Linux. With time, we might expect Portage to become as mature and powerful as the Ports system is on FreeBSD. Portage development continues to be inspired by Ports — there is some talk about switching to CVSup and building some sort of workalike to the menu system for choosing USE flags.

We find that the Ports and Packages system in FreeBSD is unique in its flexibility. It satisfies both optimization daredevils and administrators seeking to streamline package installation.

 

Cool app of the week

It has been said many times and many ways that security is a risk management process, not a technological solution. One of the first steps in managing this risk and maintaining a secure environment is having a proper firewall between your private network and the public Internet — it's the equivalent of locking your front door.

Linux is an excellent system to use as a firewall, since the kernel has the necessary plumbing integrated in the form of the netfilter/iptables firewall (http://www.netfilter.org/). Using iptables means you don't require an overly powerful machine or an expensive OS to serve on the front line, but you do need to know some arcane firewall rules syntax in order to implement a properly functioning firewall. Fortunately for those not well-versed in network theory or command line editing, Firestarter (http://firestarter.sourceforge.net/) provides an easy and rather foolproof way of implementing an iptables rule set that gives you the functionality you want.

Firestarter is a GNOME app that leads you through setting up the netfilter/iptables firewall setup with a wizard. Here, I have an old 400MHz AMD K6-III machine with 128MB of RAM, a 6GB disk and two NICs (Network Interface Cards) all rescued from the scrap heap. I'll be deploying this at a friend's home as a firewall and Samba server — I added a new 200GB disk as well for sharing files to his Windows XP machine. After installing a fresh copy of Fedora Core 1 and applying all available security updates to the OS (don't skip this step, since an insecure OS means an insecure firewall) I installed the Firestarter package through yum. There are packages available for other RPM-based distros and, of course, apt-gettable packages for Debian. Once installed, Firestarter had my firewall configured and running properly in about 5 minutes.

There is an excellent installation manual (http://firestarter.sourceforge.net/manual/wizard.php) available on the Firestarter web site, so I won't rehash that info here, but will provide a helpful hint or two for each step in the Setup Wizard. First off, all of your network connections should be set up and functioning properly before you run Firestarter. The first time Firestarter runs, it goes through a seven-step setup wizard that assumes all of the interfaces are up and running and your connections all work. The first screen of the wizard is just informational, so let's move right into the second screen:

[Image: Firestarter2.png]

Firestarter uses the term "Internet Connected Device", which sounds straightforward enough. The term in firewall parlance is "Untrusted Interface", and it is the network interface that is exposed to the Internet, say plugged into your DSL or cable modem. This interface is what the firewall will apply its access rules to, so make sure you know which interface is which, or you'll be letting everyone on the Internet in while blocking your internal machines entirely. As the tip on the screen says, be mindful of PPPoE if you're on DSL, since you need to select the PPPoE device — not the NIC that PPPoE is bound to — as the Untrusted Interface. The firewall can do dial-on-demand for Internet access if you need it, so that selection is there. Most users will need to select "IP Address is Assigned via DHCP" — unless your ISP has blessed you with a static public IP address. That done, onto the next screen by clicking Forward.

[Image: Firestarter3.png]

This screen is a little confusing, as there are two rather important decisions to make. It asks to enable or disable NAT (Network Address Translation) and to "select your internal network device." If you are sharing an internet connection on a network within a private IP space (192.168.0.0/24 for example) you must enable NAT, or only one internal machine will be able to use the firewall at a time. NAT is the technical term for "masquerading", or having the firewall substitute the IP address of its Untrusted Interface out there on the Internet in place of your internal, trusted IP address. Speaking of which, that is the next selection on this screen: which network device to use on the internal LAN. This is the "Trusted Interface", the interface to the network you're trying to protect. This should be an Ethernet device using a static IP address in the same address space as your network clients. The address of the firewall is something you'll need to know in order to set up the client machine(s), since most or all of them will use the firewall as their default IP gateway. The next button will autodetect the range of IP addresses that it should listen to. You can also click the bottom radio button and type in the address range that you want your firewall using in nnn.nnn.nnn.nnn/XX notation. The four dot-separated numbers are the IP network address you want (say 192.168.1.0). The number after the slash specifies the number of bits in the netmask; /24 translates to 255.255.255.0, for example. This feature is for multiple routed LANs all using the same firewall, so most users won't need to change this. Once NAT has been enabled and the interface chosen, we can click Forward and move on.
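If the slash notation is unfamiliar, the relationship between the prefix length and the dotted netmask is easy to check. Here is a short Python sketch using the standard ipaddress module; the helper function name is ours, and the addresses are just the examples used in this article.

```python
import ipaddress


def prefix_to_netmask(prefix_len: int) -> str:
    """Convert a prefix length such as 24 into a dotted netmask."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    # Set the top prefix_len bits of a 32-bit value.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))


network = ipaddress.ip_network("192.168.1.0/24")

if __name__ == "__main__":
    print(prefix_to_netmask(24))                             # 255.255.255.0
    print(network.netmask, network.num_addresses)            # 255.255.255.0 256
    print(ipaddress.ip_address("192.168.1.20") in network)   # True
```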

[Image: Firestarter4.png]

This screen will let you set up network services that are available through your firewall. Note that Firestarter lacks one bit of functionality: support for a DMZ (De-Militarised Zone) (http://www.webopedia.com/TERM/D/DMZ.html), which would add a larger measure of safety when allowing public access to services on your network. (Some other tools, such as gShield (http://muse.linuxmafia.org/gshield/), do support one, although gShield in particular does not have a GUI or any monitoring capability.) For most home users, the lack of a DMZ feature in Firestarter isn't of grave concern and you can allow through whatever services you wish. (Please read your ISP's Terms of Service to make sure that this is allowed by your provider.) It would be advisable to leave everything closed at first. That way, you can make sure the firewall is functional and fix any problems before you add more things to debug.

[Image: Firestarter5.png]

TOS filtering can prioritize network traffic according to the applications being served through your firewall. You can do things like assign a higher priority to the Citrix ICA protocol than the HTTP protocol in order to make sure the accounting app you're running on a corporate Citrix server stays alive while your family is surfing for Christmas carol lyrics. It's especially useful if you have a lot of FTP traffic through your firewall, as FTP can flood an entire connection and block access to other applications. In most cases, you can just click the Forward button and leave TOS disabled.

[Image: Firestarter6.png]

The ICMP filter will help you if you're under a DoS attack, or want to try and avoid having your machine discovered. This is an advanced feature, so most times you can just click Forward again.

[Image: Firestarter7.png]

This looks like another informational screen, but it's what saves your settings and starts your firewall. Once you click the Save button, you should be able to get to the Internet from your internal network — safely.

The functionality of Firestarter doesn't end with the setup of your iptables firewall — it's useful for starting and stopping the firewall service, monitoring who or what's trying to gain access to your network and for adding or changing firewall access rules as your needs change. You can even rerun the wizard if you don't like the rule set you've created the first time around.

[Image: Firestarter8.png. Caption: Firestarter up and running]

The machine that this screen shot was taken on was connected to the internet for less than 4 hours, yet you can see 2 separate attempts to access SWAT via the public internet. SWAT is the web-based application that helps you set up Samba (http://us1.samba.org/samba/samba.html) for sharing files to your Windows client machines — it is most certainly something that you do not want exposed to the public internet. Having the firewall in place means that this service (if you have it running) is pretty much safe from being compromised.

Clicking on the Rules tab allows you to change the access rules of your firewall. If you decide to change the access rules, you should be sure of several things:

[Image: Firestarter9.png. Caption: Rules tab]

The function that will be most commonly used will be the "Forwarded Ports" function. This lets you tell the firewall to pass packets destined for a particular service through the firewall to a particular internal host. As an example, say I have a web server running on an internal machine with a private IP of 192.168.1.20. In order to keep the existence of this server somewhat quiet, I want to access it on port 8080 instead of the default port of 80. Double clicking the "Forwarded Ports" on the above screen brings up the window below:

[Image: Firestarter10.png]

I've filled in the rule fields on the screen to provide the function I described above. Clicking OK will make the internal web server accessible through port 8080 at the firewall's external, public IP address. The firewall daemon has to be stopped and restarted when you change your rules, since they are only read when the daemon launches. You can do this easily by using the Stop and Start buttons on the Firestarter main window. This procedure can be repeated for other protocols too, such as ssh (port 22) and FTP (port 21).
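For those wondering what a forwarded port looks like underneath, it boils down to a destination NAT rule in the nat table's PREROUTING chain. The Python sketch below only assembles such a command line as text and does not run it. It is not the rule Firestarter itself generates, which we have not reproduced here. The interface name eth0 and the assumption that the internal web server listens on port 80 are ours, and a working setup also needs the forwarded traffic to be allowed through the FORWARD chain.

```python
from typing import List


def dnat_rule(external_if: str, public_port: int,
              internal_ip: str, internal_port: int,
              proto: str = "tcp") -> List[str]:
    """Build (but do not execute) an iptables DNAT rule as an argument list."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-i", external_if,                  # packets arriving on the untrusted side
        "-p", proto, "--dport", str(public_port),
        "-j", "DNAT",
        "--to-destination", f"{internal_ip}:{internal_port}",
    ]


if __name__ == "__main__":
    # eth0 and internal port 80 are assumptions made for this example.
    print(" ".join(dnat_rule("eth0", 8080, "192.168.1.20", 80)))
```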

You can see that having a proper firewall protecting your LAN makes a lot of sense, and using Firestarter to set up a netfilter/iptables firewall makes it easy and inexpensive to do. Firestarter provides a wealth of functionality in a pretty, easy to use interface, making the installation and management of a Linux firewall a much less daunting task.

 

/dev/random