Introduction to ELF (Executable & Linkable Format)

Iheb Yahyaoui
Jan 26, 2023
Figure 1: A brief example of an ELF file and the data it contains

Every general-purpose operating system needs a binary format. Today we are going to dig into Linux and its well-known format, ELF.
Linux does not mandate a file extension for ELF files (they can end in .bin, .so and more, or have no extension at all). Executable & Linkable Format files are usually used for executables, kernel modules, shared libraries, core dumps and object files.
Furthermore, ELF is very flexible: it is not bound to any particular processor or instruction set architecture.

Generally speaking, ELF files are composed of three major components: the ELF header, sections and segments. Each of these elements plays a different role in the linking/loading process of ELF executables. Let’s become familiar with the structure of each component:

ELF Header

Figure 2: ELF header structure

The ELF header is described by an Elfxx_Ehdr structure (Elf32_Ehdr or Elf64_Ehdr, depending on the file class). It mainly contains general information about the binary. The structure’s fields are defined as follows:

  • e_ident: Array of 16 bytes containing identification flags about the file, which serve to decode and interpret the file’s contents. The indices into this array include:
    - EI_MAG0–3: ELF magic (0x7f followed by the characters 'E', 'L', 'F').
    - EI_CLASS: File class (32-bit or 64-bit).
    - EI_DATA: File’s data encoding (little-endian or big-endian).
    - EI_VERSION: File’s version.
    - EI_OSABI: OS/ABI identification.
    - EI_ABIVERSION: ABI version.
    - EI_PAD: Start of padding bytes.
    - EI_NIDENT: Size of e_ident (16).
  • e_type: Object file type (relocatable, executable, shared object or core).
  • e_machine: File’s architecture.
  • e_version: Object file version.
  • e_entry: Entry point of application.
  • e_phoff: File offset of the Program Header Table.
  • e_shoff: File offset of the Section Header Table.
  • e_flags: Processor-specific flags associated with the file.
  • e_ehsize: ELF header size.
  • e_phentsize: Program Header entry size in Program Header Table.
  • e_phnum: Number of Program Headers.
  • e_shentsize: Section Header entry size in Section Header Table.
  • e_shnum: Number of Section Headers.
  • e_shstrndx: Index in the Section Header Table of the section that holds the section names.

To inspect these fields for a given ELF binary, we can use any ELF parser. A common tool for quickly parsing ELF files is the readelf utility from GNU binutils.

To display the contents of the ELF header for a given executable, we can use the following command:

readelf -h <executable>
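
To make the field list above more concrete, here is a minimal Python sketch that decodes the same header from raw bytes using only the standard struct module. It is an illustration rather than a replacement for readelf: it assumes a 64-bit ELF file (Elf64_Ehdr layout), and the function name parse_elf64_header is a hypothetical name chosen for this example.

```python
import struct

# Layout of the Elf64_Ehdr fields that follow the 16-byte e_ident array.
# Assumption: 64-bit ELF. 32-bit files use 4-byte e_entry/e_phoff/e_shoff.
_EHDR64_REST = "HHIQQQIHHHHHH"
_FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
           "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
           "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf64_header(data: bytes) -> dict:
    """Decode e_ident and the remaining Elf64_Ehdr fields (hypothetical helper)."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file (bad magic or too short)")
    ei_class, ei_data, ei_version, ei_osabi, ei_abiversion = data[4:9]
    if ei_class != 2:
        raise ValueError("this sketch only handles ELFCLASS64")
    if ei_data not in (1, 2):
        raise ValueError("unknown EI_DATA value")
    # EI_DATA selects the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if ei_data == 1 else ">"
    values = struct.unpack_from(endian + _EHDR64_REST, data, 16)
    header = dict(zip(_FIELDS, values))
    header["e_ident"] = {
        "EI_CLASS": ei_class,
        "EI_DATA": ei_data,
        "EI_VERSION": ei_version,
        "EI_OSABI": ei_osabi,
        "EI_ABIVERSION": ei_abiversion,
    }
    return header
```

To try it, read the first 64 bytes of a 64-bit executable opened in binary mode, pass them to parse_elf64_header, and compare the values with the output of readelf -h.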

Sections

Figure 3: ELF sections structure

Sections comprise all the information needed to link a target object file into a working executable. (It’s important to highlight that sections are needed at link time but not at run time.) An ELF file normally contains a Section Header Table, although it is optional in executables and can be stripped. This table is an array of Elfxx_Shdr structures, with one Elfxx_Shdr entry per section. The structure’s fields are:

  • sh_name: index of section name in section header string table.
  • sh_type: section type.
  • sh_flags: section attributes.
  • sh_addr: virtual address of section.
  • sh_offset: section offset in the file.
  • sh_size: section size.
  • sh_link: section link index.
  • sh_info: section extra information.
  • sh_addralign: section alignment.
  • sh_entsize: size of entries contained in section.

Some common sections are the following:

  • .text: code.
  • .data: initialized data.
  • .rodata: initialized read-only data.
  • .bss: uninitialized data.
  • .plt: PLT (Procedure Linkage Table), the stubs used to call dynamically linked functions (loosely comparable to the IAT in Windows PE files).
  • .got: GOT entries dedicated to dynamically linked global variables.
  • .got.plt: GOT entries dedicated to dynamically linked functions.
  • .symtab: full symbol table (local and global symbols).
  • .dynamic: Holds all needed information for dynamic linking.
  • .dynsym: symbol tables dedicated to dynamically linked symbols.
  • .strtab: string table of .symtab section.
  • .dynstr: string table of .dynsym section.
  • .interp: path of the program interpreter (the runtime dynamic linker, RTLD), stored as a string.
  • .rel.dyn: global variable relocation table.
  • .rel.plt: function relocation table.
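
The relationship between a symbol table and its string table (.symtab with .strtab, or .dynsym with .dynstr) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it expects the raw bytes of both sections to have been extracted already, it assumes the little-endian Elf64_Sym layout, and parse_symbols is a hypothetical helper name.

```python
import struct

# Elf64_Sym, little-endian: st_name, st_info, st_other, st_shndx, st_value, st_size
SYM64 = struct.Struct("<IBBHQQ")


def parse_symbols(symtab: bytes, strtab: bytes) -> list:
    """Resolve each symbol's st_name offset against the string table."""
    symbols = []
    for off in range(0, len(symtab) - SYM64.size + 1, SYM64.size):
        st_name, st_info, _other, st_shndx, st_value, st_size = \
            SYM64.unpack_from(symtab, off)
        # Names are NUL-terminated strings starting at offset st_name.
        end = strtab.index(b"\x00", st_name)
        symbols.append({
            "name": strtab[st_name:end].decode("ascii", "replace"),
            "bind": st_info >> 4,    # e.g. 0 = local, 1 = global
            "type": st_info & 0xF,   # e.g. 1 = object, 2 = function
            "shndx": st_shndx,
            "value": st_value,
            "size": st_size,
        })
    return symbols
```

The symbol table itself stores no names, only offsets, which is why each symbol table section is paired with a string table section.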

To display the sections with readelf, we can use the following command:

readelf -S <executable>
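
As a rough idea of what readelf does for this listing, the sketch below walks the Section Header Table and resolves each sh_name through the section named by e_shstrndx. It assumes a little-endian 64-bit ELF file held entirely in memory, and read_sections is a hypothetical name used only for this example.

```python
import struct

# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def read_sections(data: bytes) -> list:
    """Return (name, sh_type, sh_offset, sh_size) for every section header."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch expects a little-endian ELF64 file")
    # Fixed offsets inside Elf64_Ehdr: e_shoff at 0x28, e_shentsize at 0x3A.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)
    headers = [SHDR64.unpack_from(data, e_shoff + i * e_shentsize)
               for i in range(e_shnum)]
    if e_shstrndx >= len(headers):
        return []  # no section name string table (e.g. stripped headers)
    strtab_off = headers[e_shstrndx][4]  # sh_offset of the name table
    sections = []
    for sh in headers:
        start = strtab_off + sh[0]        # sh_name is an offset into it
        end = data.index(b"\x00", start)  # names are NUL-terminated
        name = data[start:end].decode("ascii", "replace")
        sections.append((name, sh[1], sh[4], sh[5]))
    return sections
```

Calling read_sections on the bytes of an object file should give the same names, offsets and sizes that readelf -S prints.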

Segments

Figure 4: ELF segments structure

Segments, which are described by Program Headers, break the structure of an ELF binary into suitable chunks so that the executable can be loaded into memory. In contrast to Section Headers, Program Headers are not needed at link time.

On the other hand, just as with Section Headers, every ELF executable contains a Program Header Table, which consists of one Elfxx_Phdr structure per segment. The structure’s fields are defined as follows:

  • p_type: Segment type.
  • p_flags: Segment attributes.
  • p_offset: File offset of segment.
  • p_vaddr: Virtual address of segment.
  • p_paddr: Physical address of segment.
  • p_filesz: Size of segment on disk.
  • p_memsz: Size of segment in memory.
  • p_align: Segment alignment in memory.

There is a wide range of segment types. Some common types are the following:

Figure 5: Legal values for p_type
  • PT_NULL: Unused entry, ignored by the loader.
  • PT_LOAD: Loadable segment.
  • PT_INTERP: Segment holding .interp section.
  • PT_TLS: Thread Local Storage segment (Common in statically linked binaries).
  • PT_DYNAMIC: Segment holding the .dynamic section.

Something important to highlight about segments is that only PT_LOAD segments are mapped into memory by the loader. Therefore, any other segment whose contents are needed at run time (for example PT_INTERP or PT_DYNAMIC) lies within the memory range of one of the PT_LOAD segments.
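
The containment claim can be checked with a short Python sketch that reads the Program Header Table and tests whether a segment's virtual address range falls inside some PT_LOAD segment. It assumes a little-endian 64-bit ELF file held in memory; read_segments and covered_by_load are hypothetical helper names, and the name table lists only the types discussed here.

```python
import struct

# Elf64_Phdr, little-endian: p_type, p_flags, p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz, p_align (note: p_flags sits elsewhere in Elf32_Phdr)
PHDR64 = struct.Struct("<IIQQQQQQ")
PT_LOAD = 1
PT_NAMES = {0: "PT_NULL", 1: "PT_LOAD", 2: "PT_DYNAMIC",
            3: "PT_INTERP", 7: "PT_TLS"}


def read_segments(data: bytes) -> list:
    """Return one dict per program header of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch expects a little-endian ELF64 file")
    # Fixed offsets inside Elf64_Ehdr: e_phoff at 0x20, e_phentsize at 0x36.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    keys = ("p_type", "p_flags", "p_offset", "p_vaddr",
            "p_paddr", "p_filesz", "p_memsz", "p_align")
    segments = []
    for i in range(e_phnum):
        seg = dict(zip(keys, PHDR64.unpack_from(data, e_phoff + i * e_phentsize)))
        seg["name"] = PT_NAMES.get(seg["p_type"], hex(seg["p_type"]))
        segments.append(seg)
    return segments


def covered_by_load(seg: dict, segments: list) -> bool:
    """True if seg's address range lies inside at least one PT_LOAD segment."""
    start, end = seg["p_vaddr"], seg["p_vaddr"] + seg["p_memsz"]
    return any(s["p_type"] == PT_LOAD
               and s["p_vaddr"] <= start
               and end <= s["p_vaddr"] + s["p_memsz"]
               for s in segments)
```

On a typical dynamically linked executable, one would expect covered_by_load to hold for the PT_INTERP and PT_DYNAMIC entries returned by read_segments.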

As previously mentioned, sections gather all the information needed to link a given object file and build an executable. The Program Headers then split that executable into segments with different attributes, which are eventually loaded into memory.

To explore ELF files further, Linux provides us with three commands (all part of GNU binutils), as shown below:

readelf: displays information about one or more ELF format object files. The options control what particular information to display.

objdump: displays information about one or more object files. The options control what particular information to display. This information is mostly useful to programmers who are working on the compilation tools, as opposed to programmers who just want their program to compile and work.

nm: lists the symbols from the given object files. If no object files are listed as arguments, nm assumes the file a.out.

Note: readelf performs a similar function to objdump, but it goes into more detail and exists independently of the BFD library, so if there is a bug in BFD then readelf will not be affected.

References & Resources:

https://www.intezer.com/blog/research/executable-linkable-format-101-part1-sections-segments/
https://medium.com/ax1al/a-brief-introduction-to-executable-linkable-format-1ed9a3fdcc91
https://man7.org/linux/man-pages/man1/readelf.1.html
https://linux.die.net/man/1/objdump
https://linux.die.net/man/1/nm

Thank you for reading.

If you have any questions, feel free to comment, and if you would like to see more articles like this one, please clap 👏👏👏
