GNU Compiler Collection


A set of notes and recipes about the GNU Compiler Collection, mainly related to the C frontend

Basics

Compilation is a multi-stage process involving the compiler itself, the assembler, and the linker. This complete set of tools is referred to as a toolchain. We may manually go through this multi-stage process like this:

  1. Use cpp to expand each source’s preprocessor directives

    cpp file.c > file.i
  2. Convert each resulting source file into assembly code

    gcc -Wall -S file.i -o file.s
  3. Convert assembly code into machine code

    as file.s -o file.o
  4. Create a final executable using the linker. This command is usually very complex and system-specific. We can check the command that gcc uses on a specific system by running gcc in its verbose mode (-v). On my particular system, this is:

/usr/bin/ld \
  -dynamic \
  -arch x86_64 \
  -macosx_version_min 10.13.5 \
  -weak_reference_mismatches non-weak \
  -o main \
  -L/usr/local/Cellar/gcc/7.3.0_1/lib/gcc/7/gcc/x86_64-apple-darwin17.3.0/7.3.0 \
  -L/usr/local/Cellar/gcc/7.3.0_1/lib/gcc/7/gcc/x86_64-apple-darwin17.3.0/7.3.0/../../.. \
  main.o \
  -no_compact_unwind \
  -lSystem \
  -lgcc_ext.10.5 \
  -lgcc \
  -lSystem

Common Options

GCC exposes many frontend-independent options, some of which are platform-dependent. Machine-specific options share the -m prefix (for example, -m32 or -march=native).

Search Paths

The various default search paths may include system-dependent or installation-specific directories.

Include Path

The include path determines where the preprocessor will look for header files. We can prepend a directory to the include path by using the -I option, which may be used multiple times. Alternatively, we can use the C_INCLUDE_PATH environment variable (or CPLUS_INCLUDE_PATH for C++), where we may put multiple paths separated by colons. Note that -I takes precedence over the environment variables, which in turn take precedence over the system include directories.

We can inspect the default include path by running:

echo | gcc -E -Wp,-v -

From https://unix.stackexchange.com/a/77781/43448.

Library Search Path

Also called the link path, this determines where the linker will look for static and shared libraries. We may use the -L option, or the LIBRARY_PATH environment variable, to prepend directories to this search path. The same precedence rules as with the include path apply.

Load Library Path

The load library path determines where the dynamic linker looks when resolving shared libraries at runtime. We may tweak this directory list using the LD_LIBRARY_PATH environment variable (DYLD_LIBRARY_PATH on macOS). GNU systems may have a /etc/ld.so.conf configuration file as well.

On macOS, we can inspect the linker's default library search path by running:

ld -v 2

Compiler Warnings

GCC enables only a handful of warnings by default. It's recommended to always enable -Wall, which checks for the most common issues. The -Werror option turns all warnings into errors. Some warnings depend on data-flow analysis, which is not performed unless you compile with optimizations, so the optimization level -O2 is recommended to get the best warnings.

Some of the most important options enabled by -Wall are -Wcomment, -Wformat, -Wimplicit, -Wreturn-type and -Wunused-variable.

The -Wextra option (formerly spelled -W) is another general option like -Wall, which warns about a further selection of common programming errors. In practice, -Wextra and -Wall are used together.

Other additional warnings, enabled by neither of these options, include -Wconversion, -Wshadow, -Wcast-qual and -Wwrite-strings.
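
To see these options in action, here is a minimal sketch with a deliberately broken printf call. The file name warn.c is hypothetical, and the program is only compiled, never run, because the call has undefined behavior.

```shell
#!/bin/sh

cat > warn.c <<'EOF'
#include <stdio.h>

int main(void) {
    /* Bug on purpose: %s expects a string, but an int is passed. */
    printf("value: %s\n", 42);
    return 0;
}
EOF

# With -Wall the format mismatch is reported, but an object file is
# still produced.
gcc -Wall -Wextra -c warn.c -o warn.o

# With -Werror the same warning aborts the build.
if gcc -Wall -Wextra -Werror -c warn.c -o warn_strict.o; then
    echo "unexpected: build succeeded"
else
    echo "build rejected by -Werror"
fi
```

The diagnostic comes from -Wformat, which is part of -Wall.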

Optimizations

The compiler can optimize for speed or binary size, usually at the expense of the other. GCC provides various optimization levels, as well as some individual options for specific types of optimizations.

Optimizing a program makes debugging more complex, and increases resource usage during compilation. In most cases, we can use -O0 for debugging, and -O2 for development and releases. Optimizations may have a negative impact on certain programs, so the rule of thumb is to always measure before committing to any optimization options.

Enabling any optimization level triggers data flow analysis, which may result in further warnings.

Preprocessor

We may set a preprocessor macro during compilation with the -D option. For example: gcc -Wall -DFOO main.c, or gcc -Wall -DBAR=BAZ main.c.

When including headers, the only difference between #include "file.h" and #include <file.h> is that the former looks in the directory of the including file before falling back to the directories searched for #include <file.h>.

The compiler usually defines some macros in the reserved namespace (prefixed with double underscores) by default, along with a small number of system-specific macros. We can check these by running:

$ cpp -dM /dev/null
#define OBJC_NEW_PROPERTIES 1
#define _LP64 1
#define __APPLE_CC__ 6000
#define __APPLE__ 1
#define __ATOMIC_ACQUIRE 2
#define __ATOMIC_ACQ_REL 4
#define __ATOMIC_CONSUME 1
#define __ATOMIC_RELAXED 0
#define __ATOMIC_RELEASE 3
#define __ATOMIC_SEQ_CST 5
#define __BIGGEST_ALIGNMENT__ 16
#define __BLOCKS__ 1
...
#define __x86_64 1
#define __x86_64__ 1

The system-specific macros that fall outside the reserved namespace may be disabled with the -ansi option.

We can print the result of running the preprocessor over a source file with:

gcc -E file.c

The resulting source code is printed to standard output.

Compatibility

We can force GCC to adhere to specific language standards using the -std option.
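
For instance, the sketch below compiles one file against two standards. The file name stddemo.c is hypothetical, and -pedantic-errors is added here (it is not required by -std) so that constructs outside the selected standard are reported as errors.

```shell
#!/bin/sh

cat > stddemo.c <<'EOF'
#include <stdio.h>

int main(void) {
    /* Declaring i inside the for statement requires C99 or later. */
    for (int i = 0; i < 3; i++) {
        printf("%d\n", i);
    }
    return 0;
}
EOF

# Accepted when targeting C99.
gcc -Wall -std=c99 -pedantic-errors stddemo.c -o stddemo_c99 && ./stddemo_c99

# Rejected when targeting C90 with strict diagnostics.
if gcc -Wall -std=c90 -pedantic-errors stddemo.c -o stddemo_c90; then
    echo "unexpected: accepted as C90"
else
    echo "rejected as C90"
fi
```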

GNU Extensions

The GNU C Library provides feature test macros to control POSIX extensions (_POSIX_C_SOURCE), BSD extensions (_BSD_SOURCE), SVID extensions (_SVID_SOURCE), XOPEN extensions (_XOPEN_SOURCE) and GNU extensions (_GNU_SOURCE).

The _GNU_SOURCE macro enables all these extensions together, with the POSIX ones taking precedence over the others in case there are conflicts. This macro will enable the C library extensions even when compiling with the -ansi option.

See https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html.

Static Libraries

A static library is a collection of precompiled object files that may be copied into the final executable using static linking. These object files are joined using the ar archive utility, into libNAME.a files.

Given a static library libfoo.a, we can include it directly in the final executable: gcc -Wall main.c libfoo.a -o main, or use the -l shortcut: gcc -Wall main.c -L. -lfoo -o main. The -l option will default to a shared library if one is available.

In order to create a static library:

  1. Compile the source files into objects:

    gcc -Wall file1.c file2.c fileN.c -c
  2. Create an archive out of the object files:

    ar cr libfoo.a file1.o file2.o fileN.o

We can inspect an archive’s “table of contents” with ar t:

$ ar t libfoo.a
__.SYMDEF SORTED
file1.o
file2.o
file3.o
...

Shared Libraries

A shared library is a collection of precompiled object files that are resolved at runtime. In order to create a shared library with GCC:

  1. Compile the source files into objects:

    gcc -fPIC -Wall file1.c file2.c fileN.c -c

The -fPIC option tells GCC to generate Position Independent Code, which means the linker will not need to relocate anything from the .text section (because it uses relative jumps), and thus the library’s code can be loaded as read-only, and shared among different programs.

  2. Create a shared library using the -shared option:

    gcc -shared -o libfoo.so file1.o file2.o fileN.o
  3. Link against the shared library as usual (here -L. assumes the library sits in the current directory):

    gcc -Wall test.c -L. -lfoo -o test

The library will be dynamically loaded provided it is present on the load library path. We can inspect the test binary to see the shared libraries it points to (for example with otool -L test on macOS, or ldd test on GNU systems):

test:
        libfoo.so (compatibility version 0.0.0, current version 0.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)

Code Coverage

First, compile a program with -fprofile-arcs and -ftest-coverage, which adds additional instructions to record what lines have been executed, and then run the binary as many times as needed.

GCC creates .gcno files at compile time and .gcda files when the binary runs. Together they record the code path information, which we can parse by calling gcov with the name of the source file:

$ gcov main.c
File 'main.c'
Lines executed:85.71% of 7
Creating 'main.c.gcov'

We can then inspect the .gcov files, which are annotated versions of the source files stating what code paths have not been executed.

Resources