Disk Management

 


Disk Architecture

A hard drive is composed of a stack of platters, each with a read/write head. Each platter is further divided into tracks which are concentric circles where data can be written, and each track is divided into sectors where the data is written with various kinds of management information. This architecture is shown below.

A cylinder is the set of tracks at the same position on all of the platters. The terms track and cylinder are often used interchangeably.

For example, a Maxtor 91303D6 13 gigabyte drive has 16383 cylinders, 16 heads and 63 sectors per track, for a total of 16,514,064 sectors. With the standard 512-byte sector this geometry accounts for only about 8.4 gigabytes; 16383/16/63 is the largest geometry an ATA drive reports, and the rest of the capacity is reachable because drives can be addressed differently due to onboard translation.

IDE (Integrated Device Electronics) drives can show a variety of different structures by interpreting the addresses.

The addressing for these drives can be determined arbitrarily, but we will assume that when addressing a particular sector, counting will begin with C(ylinder) 0, H(ead) 0, and S(ector) 1, or CHS = 0,0,1. Then, succeeding sectors will be CHS = 0,0,2, then 0,0,3, ... 0,1,1, 0,1,2, ... 1,0,1 and so on.
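The counting order above amounts to a simple formula. Here is a minimal sketch in Python, assuming the 16-head, 63-sector geometry of the Maxtor example; the function name chs_to_lba is made up for illustration.

```python
def chs_to_lba(c, h, s, heads=16, sectors_per_track=63):
    """Return the zero-based linear sector number for a CHS address.

    Cylinders and heads count from 0, sectors count from 1.
    """
    if not (0 <= h < heads and 1 <= s <= sectors_per_track):
        raise ValueError("head or sector out of range for this geometry")
    return (c * heads + h) * sectors_per_track + (s - 1)


print(chs_to_lba(0, 0, 1))   # first sector on the disk
print(chs_to_lba(0, 1, 1))   # first sector under the second head
print(chs_to_lba(1, 0, 1))   # first sector of the second cylinder
print(16383 * 16 * 63)       # total sectors for the example geometry
```

The sector number changes fastest, then the head, then the cylinder, which is exactly the 0,0,1, 0,0,2, ... 0,1,1, ... 1,0,1 sequence described above.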

Partition Tables and Master Boot Records

Each operating system organizes a disk differently, but the schemes are all similar. The first sector on a disk is usually special. In the PC world, it is called the Master Boot Record (MBR); its 512 bytes contain the partition table and the boot program.

When a machine is started, the firmware (BIOS) knows where to find this information. It loads the sector into memory and executes the master boot program. The layout of the MBR is:

The LILO boot program is an example of a program that might be found here.

The partition table has a maximum of four 16-byte entries of the form:
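As a rough sketch of how these entries can be read, the Python below parses a 512-byte sector. It assumes the conventional PC layout: 446 bytes of boot code, four 16-byte entries, and a two-byte 0x55 0xAA signature, with each entry holding a boot flag, a CHS start, a type byte, a CHS end, an LBA start and a sector count. The names parse_mbr and sample are made up, and the sample sector is synthetic rather than read from a real disk.

```python
import struct

ENTRY_FORMAT = "<B3sB3sII"   # flag, CHS start, type, CHS end, LBA start, sector count
TABLE_OFFSET = 446           # the partition table follows the boot code
SIGNATURE_OFFSET = 510       # the last two bytes hold 0x55 0xAA


def parse_mbr(sector):
    """Return the non-empty partition entries of a 512-byte MBR."""
    if len(sector) != 512 or sector[SIGNATURE_OFFSET:] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + 16 * i
        flag, _chs_start, ptype, _chs_end, lba_start, count = struct.unpack(
            ENTRY_FORMAT, sector[start:start + 16])
        if ptype != 0:                       # type 0 marks an unused slot
            entries.append({"index": i + 1, "active": flag == 0x80,
                            "type": ptype, "lba_start": lba_start,
                            "sectors": count})
    return entries


# Build a synthetic MBR holding one active Linux (type 0x83) partition.
entry = struct.pack(ENTRY_FORMAT, 0x80, b"\x01\x01\x00", 0x83,
                    b"\xfe\xff\xff", 63, 204800)
sample = bytes(446) + entry + bytes(48) + b"\x55\xaa"
for part in parse_mbr(sample):
    print(part)
```

On a real system the sector would come from reading the first 512 bytes of the disk device as root; that step is left out so the sketch can be run anywhere.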

LBA is important because it removed the 1024 cylinder limit for the location of a partition. ATA-2 incorporated LBA and rid the industry of a significant limitation. When you are partitioning a hard drive and see a warning about a partition that exceeds the 1024 cylinder limit, you only need to worry if it is using ATA-1 or earlier hard drive technology.

A simple boot program looks through the partition table for the active partition (there can only be one) and boots that partition. More sophisticated boot managers (LILO, GRUB, NT Boot Manager) can handle multiple boots by reading disk files that contain additional information concerning bootable partitions.

After the boot sector, the remainder of the first track, CHS = 0,0,2 through 0,0,n, is unused, and the first sector on the second track, CHS = 0,1,1, begins the operating system portion of the disk. Depending on the operating system this can vary, but in general it contains a boot sector for the operating system itself, information about the file system, and finally, data.


The Linux Disk Layout

There are a number of possible file systems that you might use under Linux, and each has its own disk layout. (Layout, not format: format is a term usually reserved for the low-level disk structure.) Here, we will talk about the ext2 file system, which is the Linux default for hard drives. ext3 is identical to ext2, except that it adds a journal file.

Like every file system, ext2 has its own on-disk layout. While there is some variation, a Linux file system looks basically like this:

In modern systems, it is much too inefficient to keep all of the inodes at the beginning of a partition, so the partition is divided into block groups, each of which keeps the inodes for its blocks nearby. A block group structure is:


Inodes and Directories

The basic unit of structure for an ext2 filesystem is an inode which has the following format:

There is actually quite a bit of additional information stored in the inodes of an ext2 file system which is more relevant to a course in operating systems.

Directory Structures

A directory in an ext2 file system is simply a file with the directory mode bit set. So when you create a directory, it allocates everything just like a regular file. When you add files to the directory, each file is provided an entry in the directory file with this format:
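As a rough sketch of how such entries sit in a directory block, the Python below packs and walks a few of them. It assumes the entry layout used by current ext2 implementations: a 4-byte inode number, a 2-byte record length, a 1-byte name length, a 1-byte file type, then the name padded to a 4-byte boundary. The function names, file names and inode numbers are made up for illustration.

```python
import struct

HEADER = "<IHBB"   # inode, record length, name length, file type


def pack_dir_entry(inode, name, file_type=0):
    """Pack one directory entry, padded to a 4-byte boundary."""
    raw_name = name.encode("ascii")
    rec_len = (8 + len(raw_name) + 3) & ~3
    header = struct.pack(HEADER, inode, rec_len, len(raw_name), file_type)
    return (header + raw_name).ljust(rec_len, b"\0")


def unpack_dir_entries(block):
    """Walk a directory block and yield (inode, name) pairs."""
    offset = 0
    while offset + 8 <= len(block):
        inode, rec_len, name_len, _ftype = struct.unpack_from(HEADER, block, offset)
        if rec_len < 8:          # zero padding: no more entries
            break
        name = block[offset + 8: offset + 8 + name_len].decode("ascii")
        yield inode, name
        offset += rec_len        # the record length leads to the next entry


block = pack_dir_entry(12, "example.txt", 1) + pack_dir_entry(13, "notes", 1)
print(list(unpack_dir_entries(block)))
```

On a real disk the last record is stretched so that the entries fill the whole block; that detail is left out here.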

Consider the following directory tree, where the partition is mounted on /test.

Assuming that nothing else is going on with the disk, that the files were created in breadth-first order, and that the first data block is block 1000, the inodes will be:

Inode   Type   Size     Direct Pointers   Indirect
1       dir    1024     1000
2       dir    1024     1001
3       dir    1024     1002
4       file   20       1003
5       file   15,814   1004-1015         1016
6       dir    1024     1021

Block 1000 is the top level directory pointed to by inode 1. Note that inode 0 indicates that a reference is beyond the top of the partition mount point. Abbreviating the inode entries, the block contents are:

Similarly block 1001 is the dirone directory:

Block 1002 is the stuff directory:

Block 1003 is the data block for the file junk. Junk contains only 20 bytes, so 1003 has 20 bytes of data and nothing else.

Blocks 1004 through 1015 are the first 12 data blocks for the file named file1. Since it is longer than 12 times 1024 bytes, it also has an indirect block, 1016, which holds pointers to the additional data blocks. Since it will take an additional four blocks to store the remaining 15,814 - 12,288 = 3,526 bytes, it will have four addresses in it:

Each of the blocks 1017 through 1019 will have 1024 bytes of data, and 1020 will have the remaining 454 bytes.
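The block arithmetic for file1 can be checked with a short Python sketch. It assumes 1024-byte blocks and 12 direct pointers, as in the example, and it only covers files small enough for a single indirect block; the function name blocks_for_file is made up.

```python
BLOCK_SIZE = 1024      # bytes per block, as in the example
DIRECT_POINTERS = 12   # direct block pointers held in the inode


def blocks_for_file(size, block_size=BLOCK_SIZE):
    """Return (direct data blocks, data blocks listed in the indirect block,
    bytes used in the last data block)."""
    total = -(-size // block_size)                 # ceiling division
    direct = min(total, DIRECT_POINTERS)
    indirect = total - direct
    last = size - (total - 1) * block_size if total else 0
    return direct, indirect, last


print(blocks_for_file(15814))   # file1
print(blocks_for_file(20))      # junk
```

For file1 this gives 12 direct blocks, 4 blocks reached through the indirect block, and 454 bytes in the last one. The indirect block itself (1016) is an extra block that holds only addresses.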

Finally, block 1021 contains the directory for morestuff:

Normally, directory trees don't get built sequentially like this, and files have a tendency to grow and shrink, so the block assignments for a file are not usually in order.

Questions:

  1. Show the changes needed for the following sequence of changes:
    1. A file named newfile is added to the directory stuff with a size of 1200 bytes.
    2. The file file1 is reduced to 13,500 bytes.
    3. Newfile is increased in size to 6400 bytes.
    4. The directory dirone is removed.
    5. A directory evenmorestuff is added to morestuff.


Common Operations

Mounting a File System

Labels

A new trick in the 2.4 and later Linux kernels is the use of volume labels. Instead of mounting /dev/hda5 by device name, you can give /dev/hda5 a volume label, such as /usr, and then mount using the label. For example, mount /usr instead of mount /dev/hda5. These labels are set using the e2label command or when building the file system with mke2fs. By default, Red Hat 8.0 uses labels in the fstab file as well, so if you need to see the actual devices, enter mount without any parameters.

NFS mounts

To do an NFS mount, you need to export the particular file system from the host machine, and then mount it on the client. For example,

To export a file system so that it can be NFS mounted, you edit the file /etc/exports and enter a record of the form:

For example,

Unmounting a File System

Unmounting is very simple. You can specify either the device or the mount point, if the mount point is listed in /etc/fstab.

Remember that under Linux, data written to a removable device may not actually reach the device unless you unmount it properly or mount it with the sync option. So, mount the device, write to it and then umount it before removing it. If you remove it without unmounting, plan on doing the process again, possibly with a reboot, because the system thinks the device is busy. The eject command will unmount and then eject the device for most removable devices.

Automounting systems may help here, but only if the system is able to control the actions associated with the mechanical eject.

Miscellaneous

fstab

Linux can mount your devices automatically because they are described in /etc/fstab. Each line of the fstab file has six fields:

  1. Device specification
  2. Mount point
  3. File system type
  4. Options
  5. Dump flag - should this device be backed up
  6. Check order - what order should they be checked at start up
An example is:

Device              Mount Point      Type     Options                               Dump  Order
/dev/hda1           /                ext2     defaults                              1     1
/dev/hda6           /home            ext2     defaults                              1     2
/dev/cdrom          /mnt/cdrom       iso9660  noauto,owner,ro,user,exec,dev,suid    0     0
/dev/hda5           /usr             ext2     defaults                              1     2
/dev/hda7           swap             swap     defaults                              0     0
/dev/fd0            /mnt/floppy      ext2     noauto,owner,user,exec,dev,suid       0     0
/dev/fd0            /mnt/dosfloppy   vfat     noauto,owner,user,exec,dev,suid       0     0
none                /proc            proc     defaults                              0     0
none                /dev/pts         devpts   gid=5,mode=620                        0     0
/dev/hdd4           /mnt/doszip      vfat     noauto,owner,rw,user,exec,dev,suid    0     0
/dev/hdd4           /mnt/zip         ext2     noauto,owner,rw,user,exec,dev,suid    0     0
clowns:/home/bozo   /home/bozo/nfs   nfs      defaults                              0     0
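To make the six fields concrete, here is a small Python sketch that splits fstab-style lines into named fields. The names FstabEntry, parse_fstab and SAMPLE are made up, and the sample text is a shortened copy of the table above rather than a real /etc/fstab.

```python
from collections import namedtuple

FstabEntry = namedtuple(
    "FstabEntry", "device mount_point fs_type options dump check_order")

SAMPLE = """\
# device           mount point      type     options          dump order
/dev/hda1          /                ext2     defaults         1    1
/dev/cdrom         /mnt/cdrom       iso9660  noauto,owner,ro  0    0
clowns:/home/bozo  /home/bozo/nfs   nfs      defaults
"""


def parse_fstab(text):
    """Return one FstabEntry per non-comment line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError("too few fields: " + line)
        fields += ["0"] * (6 - len(fields))     # dump and order default to 0
        device, mount_point, fs_type, options, dump, order = fields[:6]
        entries.append(FstabEntry(device, mount_point, fs_type,
                                  options.split(","), int(dump), int(order)))
    return entries


for item in parse_fstab(SAMPLE):
    print(item.mount_point, item.fs_type, item.options)
```

Fields are separated by any run of spaces or tabs, so the column alignment in the file is only for the reader.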

User Mounts

If you want to allow ordinary users to mount devices, you can put that information in fstab, even if the devices are not always mounted. For example, if you wanted a user to be able to do an NFS mount when needed, you could put the following in your fstab:

Any user with permission to access /users/bozo/csfiles can perform the mount by entering:

Automounting

If you want to automount removable storage devices like zip disks and CD-ROMs, you need to run the autofs daemon at system startup. It reads /etc/auto.master, which describes the mount points that are to be automounted. If an access is attempted on one of those mount points, autofs will attempt to mount the device. If the device is not used for a certain period of time, autofs will unmount it.

auto.master sets up the mountpoints for autofs. Each line describes a mountpoint and the file that describes the specific file systems for the mountpoint. For example,

Describes the /vol mountpoint, which is further described in /etc/auto.vol. The timeout value is the wait time until an attempt to automount is aborted. For example, if you put a CD in your reader and attempt to read it, after 60 seconds the automounter will give up. The automounter also unmounts devices after a period of inactivity (typically 600 seconds). If a device has been auto-unmounted and is then accessed again, autofs will try to remount it.

auto.vol describes the things that will be mounted on /vol. For example,

describes the mount properties for a DVD, CD-RW, zip and floppy disk. Why not simply use /mnt? There isn't a specific reason, but /mnt is used for other things, and in order to prevent conflicts, it's not a bad idea to put the auto mountpoints somewhere else.

Other Files

Two other files of interest are /etc/mtab and /proc/mounts. Both of these files contain the information about the currently mounted file systems and are very similar. It appears that at some point in time the traditional /etc/mtab file will be completely replaced by /proc/mounts.

/proc/mounts has a list of all of the mounts, starting with the root of your filesystem tree. Resolving any absolute pathname (one starting with a /) begins by finding the correct filesystem to use, which is the one whose mount point covers the path.
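Finding the filesystem for a pathname can be sketched as a longest-prefix match against the mount table. The Python below is only an illustration of that idea, not how the kernel does it; the name find_mount is made up, and the sample table imitates the format of /proc/mounts using devices from the fstab example.

```python
import posixpath

SAMPLE_MOUNTS = """\
/dev/hda1 / ext2 rw 0 0
/dev/hda5 /usr ext2 rw 0 0
/dev/hda6 /home ext2 rw 0 0
none /proc proc rw 0 0
"""


def find_mount(path, mounts_text=SAMPLE_MOUNTS):
    """Return (device, mount_point, fs_type) for the mount that holds path."""
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fs_type = fields[:3]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # Prefer the longest mount point; later lines win ties.
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point, fs_type)
    return best


print(find_mount("/usr/bin/ls"))
print(find_mount("/etc/fstab"))
```

On a Linux machine the second argument could be the text read from /proc/mounts instead of the sample.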


Creating File Systems

One of the tasks that system administrators are charged with is storage management. This includes:

Partitioning

sfdisk - a more sophisticated partition program based on DRDOS, a popular MSDOS product. More options, and basically the same results. However, it requires that you unmount the drive before starting.

cfdisk - a curses based (sort of graphical) partitioning tool. Easier to use and easier to screw up if you aren't careful. It displays a list of partitions and their parameters and then provides commands to:

Bootable   -   Make a partition bootable
Delete   -   Delete a partition
Maximize   -   Maximize the size of a partition
Print   -   Print to screen or file
Type   -   Set the partition type
Units   -   Change the display units
Write   -   Write partition table

The interface looks like this:

parted

People familiar with the Windows partitioning program Partition Magic will appreciate parted, which performs the same functions for free. Theoretically partitions are not operating system dependent, so it's really a matter of what the software runs on.

parted comes from the fine folks at GNU and will allow you to create, destroy, resize, move and copy ext2, FAT and FAT32 partitions without the loss of data (obviously not if you destroy something). They always warn you to back up your data, and since they wrote the software, that's probably good advice. parted is easy to use and very fast. Like Partition Magic, it saves a bunch of operations and then updates all at once. Unlike PM, parted does not have a graphical interface, and in fact, is the backend processor for DiskDruid, which you used to partition your hard drive during installation.

If you use Partition Magic (probably from a DOS boot floppy) be a little careful. PM saves a list of operations and performs them all when you apply the changes (although it appears in the interface that they are being done in real time). If you are going to do quite a few things, it is best to do a couple, let PM perform the operations and then do a couple more things and so on, rather than building up a backlog of operations. Experience says that this will prevent potential problems resulting in data loss, heartbreak, unemployment, depression and/or someone tattooing moron on your forehead.


Creating a File System

A file system is created with the mkfs command, which has the format:

There are a number of specific commands to make this easier, such as:

These commands have a wide variety of specific options for the file system type. Normally, the defaults are reasonable.

Some examples are:

Once the file system is created, it is ready to be mounted and used.

ext2 and ext3

ext3 is simply ext2 with a journal. This journal file is typically hidden so you can't see it, but all operations on the partition are written to this file. If the system crashes (or maybe even the disk), the journal file can be used to bring the file system back to a consistent state. Because the remainder of the file system is the same, you can mount an ext3 filesystem as ext2 if you need to do so. Likewise, it is easy to convert from ext2 to ext3 with the command:


File Permissions

File permissions are the mechanism used by Unix in general, and Linux in particular, to control file access. The file permissions are organized by types of user and types of access.

These permissions represent a 12-bit value that is organized in the following way:

  • Execute for a directory means it can be searched.
  • Set-uid means that anyone that executes an executable with this permission takes on the uid of the owner while executing. Is this dangerous?
  • Set-gid is ditto, but for gid, rather than uid.
  • The sticky bit can force an executable to stay in memory. This is no longer of much use.

    Setting file permissions

    The program for setting file permissions is chmod. The syntax is:

    Because the permissions are a bit map, you can set the permissions with an octal number:

    Command      Owner  Group  User  Uid  Gid  Sticky
    chmod 644    rw_    r__    r__   _    _    _
    chmod 755    rwx    r_x    r_x   _    _    _
    chmod 6700   rwx    ___    ___   s    s    _
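    Or you can use key letters. In chmod's notation u is the owner, g is the group and o is everyone else (the User column here), and the clauses are separated by commas:

    Command               Owner  Group  User  Uid  Gid  Sticky
    chmod u=rw,go=r       rw_    r__    r__   _    _    _
    chmod u=rwx,go=rx     rwx    r_x    r_x   _    _    _
    chmod u=rwxs,g=s,o=   rwx    ___    ___   s    s    _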

    When files are listed with ls -l, as shown below, the permissions are shown as foooggguuu, where f indicates whether a file is a regular file, a link, a device or a directory. The other groups are the permissions for owner, group and user respectively. There is no room for the setuid and setgid bits, so they replace the x bits for owner or group as shown in the last entry.
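As a sketch of how the 12 bits map onto what ls -l prints, the Python below decodes an octal mode by hand and also shows the ten-character string produced by the standard library's stat.filemode. The helper name describe_mode is made up for illustration.

```python
import stat


def describe_mode(mode):
    """Split the low 12 permission bits into their named parts."""
    bits = mode & 0o7777

    def rwx(three):
        return "".join(ch if three & mask else "-"
                       for ch, mask in (("r", 4), ("w", 2), ("x", 1)))

    return {
        "setuid": bool(bits & stat.S_ISUID),
        "setgid": bool(bits & stat.S_ISGID),
        "sticky": bool(bits & stat.S_ISVTX),
        "owner": rwx((bits >> 6) & 7),
        "group": rwx((bits >> 3) & 7),
        "other": rwx(bits & 7),
    }


for mode in (0o644, 0o755, 0o6700):
    # S_IFREG marks a regular file, which filemode shows as a leading "-".
    print(oct(mode), stat.filemode(stat.S_IFREG | mode), describe_mode(mode))
```

For 6700 the listing string is -rws--S---: the owner's x is replaced by s, and the group shows a capital S because the set-gid bit is on while group execute is off.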

    Setting File Ownership

    File ownership has two parts - the owner and the group of the owner. If you list your files with the command ls -l, you will see something of the form:

    After the permissions and the link count there are two names - the first is the owner identifier and the second is the owner group. You cannot change your user id, but you can belong to several groups and you can set your current group with the newgrp command.

    Only the root user can change the ownership of a file. This is done with the command:

    For example,

    Users can change the group, but only for files that they own, and only to a group that they belong to, as described in /etc/group.

    File Permission Policies


    The Linux File Layout

    Every operating system has a standard layout that you need to be familiar with. For Linux and most Unix systems, it is:


    Debugging and Analyzing File Systems

    There are a few tools that can be quite useful in solving problems with file systems and disk drives, or just for collecting information.

    file

    The file command attempts to determine the type of a file. To do this, it tests what the filesystem knows, then it checks for magic numbers in the files, and finally, it reads the file data to try to figure it out. This is a fairly complicated process, but it is accurate a large percentage of the time. The operation of file depends on the contents of /usr/share/magic and /usr/share/magic/magic.mime, which contain information about the various file types that file recognizes.
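The magic number test can be sketched in a few lines of Python. This is not how file itself is implemented; the table below holds only a handful of well-known signatures, and the name guess_type is made up.

```python
MAGIC = [
    (b"\x7fELF", "ELF executable or object"),
    (b"\x1f\x8b", "gzip compressed data"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"#!", "script with interpreter line"),
]


def guess_type(data):
    """Guess a type from the leading bytes, falling back to a text test."""
    for magic, description in MAGIC:
        if data.startswith(magic):
            return description
    if not data:
        return "empty"
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return "data"
    return "ASCII text"


print(guess_type(b"\x1f\x8b\x08\x00"))
print(guess_type(b"#!/bin/sh\necho hello\n"))
print(guess_type(b"plain words\n"))
```

The order matters in the same way it does for file: the signature checks come first, and reading the data as text is the last resort.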

    Syntax:   file [options] file [file] ...

    Important Options:

    Option    Meaning
    -b    Don't prepend the file name
    -f namefile   Read the names from a namefile (one per line)
    -i    Output the names as MIME types
    -n    Flush stdout after each name (useful when piping output)
    -z    Try to look inside compressed files

    Example

    stat

    stat is the foundation program for files, as it returns the information that the filesystem keeps about a file.
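The same information is available to programs. As a rough equivalent of what the stat command reports, this Python sketch creates a small temporary file and reads back its inode data with os.stat; the helper name summarize is made up for illustration.

```python
import os
import stat
import tempfile
import time


def summarize(path):
    """Collect the main fields that the filesystem keeps for one file."""
    info = os.stat(path)
    return {
        "size": info.st_size,
        "inode": info.st_ino,
        "links": info.st_nlink,
        "mode": stat.filemode(info.st_mode),
        "uid": info.st_uid,
        "gid": info.st_gid,
        "modified": time.ctime(info.st_mtime),
    }


# Create a 20-byte scratch file so the sketch does not depend on any path.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"x" * 20)
    sample_path = handle.name

os.chmod(sample_path, 0o644)
result = summarize(sample_path)
print(result)
os.remove(sample_path)
```

Everything printed here comes from the inode; the file name is not part of it, which is why the name lives in the directory entry instead.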

    Syntax:   stat [options] filename [filename] ...

    Important Options:

    Option   Meaning
    -f   stat the filesystem where the file is located instead of the file
    -t    Terse output mode

    Example

    fsck or e2fsck

    fsck is the general purpose file system check and repair utility, and e2fsck is fsck for a file system of type ext2 (and ext3) but with more options. Since ext2 filesystems are the most common, e2fsck will be discussed here. e2fsck and fsck will check and possibly repair a file system. The fsck programs should only be run on unmounted file systems; forcing a repair could damage the system, as mounted file systems are dynamic.

    Syntax:   e2fsck [options] [fsck-option] filesys [...]

    Important Options:

    Example

    dumpe2fs

    This command prints the superblock and block group information for the file system on a device. The dumpe2fs program should only be run on unmounted file systems.

    Syntax:   dumpe2fs [options] device

    Important Options:

    Example

    debugfs

    debugfs allows you to debug a file system. It is dangerous in that it allows you to manually change the state of an ext2 file system, but it is quite handy when you have to have it. You must be superuser to use debugfs.

    Syntax:   debugfs [options] device

    Important Options:

    debugfs is an interactive program with the following set of commands.

    Important Commands:

    As you can see, debugfs is a powerful tool for fixing (or destroying) file systems. Use it with care. The examples show a few of the possible uses.

    Example

    Related Tools

    tune2fs

    tune2fs allows you to modify the tuneable parameters on a file system of type ext2. You must be superuser to run tune2fs and the filesystem should be unmounted if you make changes.

    Syntax:   tune2fs [options] device

    Important Options:

    Example

    smartmon

    SMART


    Backup and recovery

    Using tar for Backup

    The ubiquitous tar (tape archive) command has been the standard method for backing up systems since the early days of Unix.
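As a small illustration of the idea, the Python sketch below uses the standard tarfile module to do roughly what tar -czf and tar -tzf do from the command line. The helper names backup and list_backup are made up, and the directory being archived is a scratch directory created just for the example.

```python
import os
import shutil
import tarfile
import tempfile


def backup(source_dir, archive_path):
    """Write source_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=os.path.basename(source_dir))


def list_backup(archive_path):
    """Return the names of the members stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


# Build a scratch directory with one file, archive it, and list the result.
work = tempfile.mkdtemp()
source = os.path.join(work, "home")
os.mkdir(source)
with open(os.path.join(source, "notes.txt"), "w") as handle:
    handle.write("important data\n")

archive_file = os.path.join(work, "home.tar.gz")
backup(source, archive_file)
names = list_backup(archive_file)
print(names)
shutil.rmtree(work)
```

Listing the archive right after writing it is a cheap sanity check; it does not replace an actual test restore.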

    Using dump for Backup