Linking/Loading

Last Updated 07/23/2024 17:13:36
Welcome to my presentation on linking and loading. This presentation is designed to help students understand the basics of loading and linking in Linux. It focuses mainly on Linux's dynamic linking facilities.

It begins with a short discussion of terms and definitions, then covers loading and linking in more detail. It should give you a basic-to-intermediate understanding of the Linux run-time linker and the ELF binary format. It focuses on the Linux 2.4 kernel and the run-time linker code from glibc 2.3.3.

Here is a brief overview of the presentation:

Run-time linking allows Linux to resolve undefined symbol references either while a program is being loaded or while it is executing. This is not required for statically linked binaries: in that case, the portions of library code required for execution are copied into the binary itself by the linker when the program is built. Loading moves a binary into main memory, after which the loader passes control from itself to the binary.
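As a small illustration of resolving a symbol by name at run time, the sketch below uses Python's standard ctypes module, which sits on top of the same dynamic loader facilities. It assumes a Linux (or other POSIX) system where the C library is already mapped into the process; the helper name runtime_strlen is made up for this example.

```python
import ctypes


def runtime_strlen(text):
    """Look up strlen by name at run time and call it."""
    # CDLL(None) asks the dynamic loader for a handle covering the
    # objects already loaded into this process (like dlopen(NULL)).
    libc = ctypes.CDLL(None)
    # Attribute access performs the by-name symbol lookup.
    strlen = libc.strlen
    strlen.argtypes = [ctypes.c_char_p]
    strlen.restype = ctypes.c_size_t
    return strlen(text.encode())
```

Nothing in the script refers to the address of strlen ahead of time; the name is turned into an address only when the lookup runs.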

At build time, symbol reference fixing, data relocation, and object combination are done by ld, the linker. At run time the necessary operations are completed by ld.so, the dynamic linker and loader. ld.so can be configured in many ways; one important setting is the LD_BIND_NOW environment variable. If it is set to a non-empty string, ld.so resolves all symbols at the time a given shared object library is loaded. If it is not set, which is the default, symbol resolution is lazy.

Lazy binding means that a symbol is not resolved and bound until the binary first calls it, even though the libraries are all located and loaded at load time. Lazy binding uses, among others, two specific sections of the ELF format: the .plt (procedure linkage table) and the .got (global offset table). When ld.so loads a library, it sets up the read-write .got so that its entries hold addresses pointing back into the .plt, the section through which a particular procedure is called. The first time a procedure is called, the .plt entry jumps to the address stored in the .got, which points back into the .plt. The instructions that follow end up calling a fixer function in ld.so, aptly named fixup, which rewrites the .got entry so that the next call to that procedure goes directly to the procedure's code.
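The patch-on-first-call idea can be modelled in a few lines. This is only a toy sketch, not how ld.so is implemented: the ToyLinker class, the LIBRARY table, and the bind_now flag are all invented for illustration, with a Python list standing in for the .got and a closure standing in for the .plt stub that reaches the resolver.

```python
# Toy model of lazy binding. All names here are hypothetical.

# Stand-in for the symbols exported by a shared library.
LIBRARY = {
    "puts": lambda s: "puts:" + s,
    "strlen": lambda s: len(s),
}


class ToyLinker:
    def __init__(self, symbols, bind_now=False):
        self.fixups = 0                  # how many times the resolver ran
        self.symbols = list(symbols)
        # Every GOT slot starts out pointing at a stub that resolves it.
        self.got = [self._make_stub(i) for i in range(len(self.symbols))]
        if bind_now:                     # analogous to a non-empty LD_BIND_NOW
            for i in range(len(self.symbols)):
                self._fixup(i)

    def _fixup(self, index):
        # Resolve the symbol and rewrite the GOT slot with its real target.
        self.fixups += 1
        target = LIBRARY[self.symbols[index]]
        self.got[index] = target
        return target

    def _make_stub(self, index):
        def stub(*args):
            # First call only: resolve, patch the slot, then run the target.
            return self._fixup(index)(*args)
        return stub

    def call(self, name, *args):
        # PLT-like entry: an indirect call through the GOT slot.
        return self.got[self.symbols.index(name)](*args)
```

With ToyLinker(["puts", "strlen"]), the fixups counter stays at 0 until the first call("puts", ...), becomes 1 after it, and does not change on later calls to the same symbol. Passing bind_now=True resolves every slot up front, which is the trade-off LD_BIND_NOW makes: more work at load time, no resolver detour later.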

The ELF binary file format, briefly mentioned above, is currently the standard binary format on most Linux systems. An ELF file is built up of segments and sections, and there are three main types of ELF object files: relocatable object files (.o's), shared object files (.so's), and executable object files (the binaries). The three types share some sections and segments and differ in others. Certain binutils programs can be used to inspect and learn about the ELF format; the two most notable are readelf and objdump. Using the output of those utilities, an enquirer should, for example, be able to follow the instruction path of a particular function call before and after its 'fixup'.
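To make the three file types concrete, here is a minimal sketch that reads the first 18 bytes of an ELF header and reports the word size, byte order, and file type. The function name describe_elf and the returned dictionary layout are choices made for this example; the field offsets and type codes (ET_REL = 1, ET_EXEC = 2, ET_DYN = 3) come from the ELF specification.

```python
import struct

# e_type values for the three file types discussed above.
ELF_TYPES = {
    1: "relocatable (ET_REL)",
    2: "executable (ET_EXEC)",
    3: "shared object (ET_DYN)",
}


def describe_elf(data):
    """Summarise the start of an ELF header given as bytes."""
    if len(data) < 18 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]                   # 1 = 32-bit, 2 = 64-bit
    ei_data = data[5]                    # 1 = little-endian, 2 = big-endian
    byte_order = "<" if ei_data == 1 else ">"
    # e_type is the 16-bit field immediately after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(byte_order + "H", data, 16)
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),
        "little_endian": ei_data == 1,
        "type": ELF_TYPES.get(e_type, "other (%d)" % e_type),
    }
```

To try it on a real file, read the first 18 bytes in binary mode and pass them to describe_elf. Note that on current distributions many ordinary programs are built as position-independent executables and therefore report ET_DYN, the same type code as a shared library.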

 

Links

Basic Definitions/Acronyms

  • Linking: The combination of multiple code pieces into a single piece of executable code
  • Loading: The process of locating an application, potentially on disk, and loading it into main memory followed by passing it control to execute
  • ELF: Executable and Linking Format
  • PIC: Position Independent Code

Kernel File References

  • [linux_src]/fs/exec.c
  • [linux_src]/include/linux/binfmts.h
  • [linux_src]/fs/binfmt_elf.c
  • [linux_src]/include/asm-i386/processor.h

glibc File References

  • [glib_src]/elf/rtld.c
  • [glib_src]/elf/dl-deps.c
  • [glib_src]/elf/dl-load.c
  • [glib_src]/elf/dynamic-link.h
  • [glib_src]/elf/do-lookup.h
  • [glib_src]/elf/dl-runtime.c
  • [glib_src]/sysdeps/i386/dl-machine.h

Helpful Commands

  • readelf -a sharedlib.so
    (produces all the readelf information from sharedlib.so for the header and sections/segments)
  • objdump -D -z -j .got -j .plt somebinary
    (produces the disassembled code for the .got and .plt sections, with blocks of zeroes included rather than skipped)