Diffstat (limited to 'chapter05/toolchaintechnotes.xml')
-rw-r--r-- | chapter05/toolchaintechnotes.xml | 214 |
1 files changed, 213 insertions, 1 deletions
diff --git a/chapter05/toolchaintechnotes.xml b/chapter05/toolchaintechnotes.xml index afc621085..9c2951387 100644 --- a/chapter05/toolchaintechnotes.xml +++ b/chapter05/toolchaintechnotes.xml @@ -7,6 +7,218 @@ <title>Toolchain Technical Notes</title> <?dbhtml filename="toolchaintechnotes.html"?> -<para>See testing</para> +<para>This section explains some of the rationale and technical +details behind the overall build method. It is not essential to +immediately understand everything in this section. Most of this +information will be clearer after performing an actual build. This +section can be referred back to at any time during the process.</para> + +<para>The overall goal of <xref linkend="chapter-temporary-tools"/> is +to provide a temporary environment that can be chrooted into and from +which a clean, trouble-free build of the target LFS system can be +produced in <xref linkend="chapter-building-system"/>. Along the way, we +separate from the host system as much as possible, and in doing so, +build a self-contained and self-hosted toolchain. It should be noted +that the build process has been designed to minimize the risks for new +readers and provide maximum educational value at the same time. In +other words, more advanced techniques could be used to build the +system.</para> + +<important> +<para>Before continuing, be aware of the name of the working platform, +often referred to as the target triplet. In many cases, the target +triplet will be <emphasis>i686-pc-linux-gnu</emphasis>. A +simple way to determine the name of the target triplet is to run the +<command>config.guess</command> script that comes with the source for +many packages.
Unpack the Binutils sources and run the script: +<userinput>./config.guess</userinput> and note the output.</para> + +<para>Also be aware of the name of the platform's dynamic linker, +often referred to as the dynamic loader (not to be confused with the +standard linker <command>ld</command> that is part of Binutils). The +dynamic linker provided by Glibc finds and loads the shared libraries +needed by a program, prepares the program to run, and then runs it. +The name of the dynamic linker will usually be +<filename class="libraryfile">ld-linux.so.2</filename>. On platforms that are less +prevalent, the name might be <filename class="libraryfile">ld.so.1</filename>, +and on newer 64-bit platforms it might be named something else entirely. The name of +the platform's dynamic linker can be determined by looking in the +<filename class="directory">/lib</filename> directory on the host +system. A sure-fire way to determine the name is to inspect a random +binary from the host system by running: <userinput>readelf -l &lt;name +of binary&gt; | grep interpreter</userinput> and noting the output.
+The authoritative reference covering all platforms is in the +<filename>shlib-versions</filename> file in the root of the Glibc +source tree.</para> +</important> + +<para>Some key technical points of how the <xref linkend="chapter-temporary-tools"/> build +method works:</para> + +<itemizedlist> +<listitem><para>The process is similar in principle to +cross-compiling, whereby tools installed in the same prefix work in +cooperation, and thus utilize a little GNU +<quote>magic</quote></para></listitem> + +<listitem><para>Careful manipulation of the standard linker's library +search path ensures programs are linked only against chosen +libraries</para></listitem> + +<listitem><para>Careful manipulation of <command>gcc</command>'s +<filename>specs</filename> file tells the compiler which target dynamic +linker will be used</para></listitem> +</itemizedlist> + +<para>Binutils is installed first because the +<command>./configure</command> runs of both GCC and Glibc perform +various feature tests on the assembler and linker to determine which +software features to enable or disable. This is more important than +one might first realize. An incorrectly configured GCC or Glibc can +result in a subtly broken toolchain, where the impact of such breakage +might not show up until near the end of the build of an entire +distribution. A test suite failure will usually highlight this error +before too much additional work is performed.</para> + +<para>Binutils installs its assembler and linker in two locations, +<filename class="directory">/tools/bin</filename> and <filename +class="directory">/tools/$TARGET_TRIPLET/bin</filename>. The tools in +one location are hard linked to the other. An important facet of the +linker is its library search order. Detailed information can be +obtained from <command>ld</command> by passing it the +<parameter>--verbose</parameter> flag.
For example, <userinput>ld +--verbose | grep SEARCH</userinput> will show the current search +paths and their order. To find out which files are linked by +<command>ld</command>, compile a dummy program and pass the +<parameter>--verbose</parameter> switch to the linker. For example, +<userinput>gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep +succeeded</userinput> will show all the files successfully opened +during the linking.</para> + +<para>The next package installed is GCC. An example of what can be +seen during its run of <command>./configure</command> is:</para> + +<screen><computeroutput>checking what assembler to use... + /tools/i686-pc-linux-gnu/bin/as +checking what linker to use... /tools/i686-pc-linux-gnu/bin/ld</computeroutput></screen> + +<para>This is important for the reasons mentioned above. It also +demonstrates that GCC's configure script does not search the PATH +directories to find which tools to use. However, during the actual +operation of <command>gcc</command> itself, the same +search paths are not necessarily used. To find out which standard +linker <command>gcc</command> will use, run: <userinput>gcc +-print-prog-name=ld</userinput>.</para> + +<para>Detailed information can be obtained from <command>gcc</command> +by passing it the <parameter>-v</parameter> command line option while +compiling a dummy program. For example, <userinput>gcc -v +dummy.c</userinput> will show detailed information about the +preprocessor, compilation, and assembly stages, including +<command>gcc</command>'s include search paths and their order.</para> + +<para>The next package installed is Glibc. The most important +considerations for building Glibc are the compiler, binary tools, and +kernel headers. The compiler is generally not an issue since Glibc +will always use the <command>gcc</command> found in a +<envar>PATH</envar> directory. +The binary tools and kernel headers can be a bit more complicated.
+Therefore, take no risks and use the available configure switches to +enforce the correct selections. After the run of +<command>./configure</command>, check the contents of the +<filename>config.make</filename> file in the <filename +class="directory">glibc-build</filename> directory for all important +details. Note the use of <parameter>CC="gcc -B/tools/bin/"</parameter> +to control which binary tools are used and the use of the +<parameter>-nostdinc</parameter> and <parameter>-isystem</parameter> +flags to control the compiler's include search path. These items +highlight an important aspect of the Glibc package: it is very +self-sufficient in terms of its build machinery and generally does not +rely on toolchain defaults.</para> + +<para>After the Glibc installation, make some adjustments to ensure +that searching and linking take place only within the <filename +class="directory">/tools</filename> prefix. Install an adjusted +<command>ld</command>, which has a hard-wired search path limited to +<filename class="directory">/tools/lib</filename>. Then amend +<command>gcc</command>'s specs file to point to the new dynamic linker +in <filename class="directory">/tools/lib</filename>. This last step +is vital to the whole process. As mentioned above, a hard-wired path +to a dynamic linker is embedded into every Executable and Linkable Format +(ELF) shared executable. This can be inspected by running: +<userinput>readelf -l &lt;name of binary&gt; | grep +interpreter</userinput>. Amending gcc's specs file +ensures that every program compiled from here through the end of this +chapter will use the new dynamic linker in <filename +class="directory">/tools/lib</filename>.</para> + +<para>The need to use the new dynamic linker is also the reason why +the Specs patch is applied for the second pass of GCC.
Failure to do +so will result in the GCC programs themselves having the name of the +dynamic linker from the host system's <filename +class="directory">/lib</filename> directory embedded into them, which +would defeat the goal of getting away from the host.</para> + +<para>During the second pass of Binutils, we are able to utilize the +<parameter>--with-lib-path</parameter> configure switch to control +<command>ld</command>'s library search path. From this point onwards, +the core toolchain is self-contained and self-hosted. The remainder of +the <xref linkend="chapter-temporary-tools"/> packages all build +against the new Glibc in <filename +class="directory">/tools</filename>.</para> + +<para>Upon entering the chroot environment in <xref +linkend="chapter-building-system"/>, the first major package to be +installed is Glibc, due to its self-sufficient nature mentioned above. +Once this Glibc is installed into <filename +class="directory">/usr</filename>, perform a quick changeover of the +toolchain defaults, then proceed in building the rest of the target +LFS system.</para> + +<sect2> +<title>Notes on Static Linking</title> + +<para>Besides their specific task, most programs have to perform many +common and sometimes trivial operations. These include allocating +memory, searching directories, reading and writing files, string +handling, pattern matching, arithmetic, and other tasks. Instead of +obliging each program to reinvent the wheel, the GNU system provides +all these basic functions in ready-made libraries. The major library +on any Linux system is Glibc.</para> + +<para>There are two primary ways of linking the functions from a +library to a program that uses them: statically or dynamically. When +a program is linked statically, the code of the functions it uses is +included in the executable, resulting in a rather bulky program.
When +a program is dynamically linked, it includes a reference to the +dynamic linker, the name of the library, and the name of the function, +resulting in a much smaller executable. A third option is to use the +programming interface of the dynamic linker (see the +<emphasis>dlopen</emphasis> man page for more information).</para> + +<para>Dynamic linking is the default on Linux and has three major +advantages over static linking. First, only one copy of the executable +library code is needed on the hard disk, instead of having multiple +copies of the same code included in several programs, thus saving +disk space. Second, when several programs use the same library +function at the same time, only one copy of the function's code is +required in memory, thus saving memory space. Third, when a library +function gets a bug fixed or is otherwise improved, only the one +library needs to be recompiled instead of recompiling all programs +that make use of the improved function.</para> + +<para>If dynamic linking has several advantages, why then do we +statically link the first two packages in this chapter? The reasons +are threefold: historical, educational, and technical. The +historical reason is that earlier versions of LFS statically linked +every program in this chapter. Educationally, knowing the difference +between static and dynamic linking is useful. The technical benefit is +a gained element of independence from the host, meaning that those +programs can be used independently of the host system. However, it is +worth noting that an overall successful LFS build can still be +achieved when the first two packages are built dynamically.</para> + +</sect2> </sect1> + |