author Archaic <archaic@linuxfromscratch.org> 2005-12-26 19:00:06 +0000
committer Archaic <archaic@linuxfromscratch.org> 2005-12-26 19:00:06 +0000
commit 5536f7440f2f4a12782e8d741cbbba5f1c3cfea8 (patch)
tree a0f3a22ecc9c0eedfd891d54e9acf06502d2b07f /chapter07
parent 2550494b85359ff42bb95a72911c7751cbce14d4 (diff)
Applied Alexander Patrakov's patch which adds UTF-8 capability to the
development branch of the LFS Book.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@7235 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689
Diffstat (limited to 'chapter07')
-rw-r--r-- chapter07/bootscripts.xml 6
-rw-r--r-- chapter07/console.xml 261
-rw-r--r-- chapter07/profile.xml 53
3 files changed, 230 insertions, 90 deletions
diff --git a/chapter07/bootscripts.xml b/chapter07/bootscripts.xml
index 775215e7e..6e884ac77 100644
--- a/chapter07/bootscripts.xml
+++ b/chapter07/bootscripts.xml
@@ -47,6 +47,12 @@
<screen><userinput>make install</userinput></screen>
+ <para>The <command>console</command> script that comes with
+ LFS-Bootscripts-&lfs-bootscripts-version; doesn't support Unicode. Install
+ a replacement version:</para>
+
+<screen><userinput>install -m755 ../console /etc/rc.d/init.d</userinput></screen>
+
</sect2>
<sect2 id="contents-bootscripts" role="content">
diff --git a/chapter07/console.xml b/chapter07/console.xml
index 315112366..b0b9417a3 100644
--- a/chapter07/console.xml
+++ b/chapter07/console.xml
@@ -17,96 +17,207 @@
<para>This section discusses how to configure the <command>console</command>
bootscript that sets up the keyboard map and the console font. If non-ASCII
- characters (e.g., the British pound sign and Euro character) will not be used
- and the keyboard is a U.S. one, skip this section. Without the configuration
- file, the <command>console</command> bootscript will do nothing.</para>
+ characters (e.g., the copyright sign, the British pound sign and Euro symbol)
+ will not be used and the keyboard is a U.S. one, skip this section. Without
+ the configuration file, the <command>console</command> bootscript will do
+ nothing.</para>
<para>The <command>console</command> script reads the
<filename>/etc/sysconfig/console</filename> file for configuration information.
Decide which keymap and screen font will be used. Various language-specific
- HOWTO's can also help with this (see <ulink
- url="http://www.tldp.org/HOWTO/HOWTO-INDEX/other-lang.html"/>. A pre-made
- <filename>/etc/sysconfig/console</filename> file with known settings for several
- countries was installed with the LFS-Bootscripts package, so the relevant
- section can be uncommented if the country is supported. If still in doubt, look
- in the <filename class="directory">/usr/share/kbd</filename> directory for valid
- keymaps and screen fonts. Read <filename>loadkeys(1)</filename> and
- <filename>setfont(8)</filename> to determine the correct arguments for
- these programs. Once decided, create the configuration file with the following
- command:</para>
-
-<screen><userinput>cat &gt;/etc/sysconfig/console &lt;&lt;"EOF"
-<literal>KEYMAP="<replaceable>[arguments for loadkeys]</replaceable>"
-FONT="<replaceable>[arguments for setfont]</replaceable>"</literal>
+ HOWTOs can also help with this; see <ulink
+ url="http://www.tldp.org/HOWTO/HOWTO-INDEX/other-lang.html"/>. If still in
+ doubt, look in the <filename class="directory">/usr/share/kbd</filename>
+ directory for valid keymaps and screen fonts. Read the
+ <filename>loadkeys(1)</filename> and <filename>setfont(8)</filename> manual
+ pages to determine the correct arguments for these programs.</para>
+
+ <para>The <filename>/etc/sysconfig/console</filename> file should contain lines
+ of the form: VARIABLE="value". The following variables are recognized:</para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>KEYMAP</term>
+ <listitem>
+ <para>This variable specifies the arguments for the
+ <command>loadkeys</command> program, typically the name of the keymap
+ to load, e.g., "es". If this variable is not set, the bootscript will
+ not run the <command>loadkeys</command> program, and the default kernel
+ keymap will be used.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>KEYMAP_CORRECTIONS</term>
+ <listitem>
+ <para>This (rarely used) variable
+ specifies the arguments for the second call to the
+ <command>loadkeys</command> program. This is useful if the stock keymap
+ is not completely satisfactory and a small adjustment has to be made. E.g.,
+ to add the Euro sign to a keymap that normally doesn't have it,
+ set this variable to "euro2".</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>FONT</term>
+ <listitem>
+ <para>This variable specifies the arguments for the
+ <command>setfont</command> program. Typically, this includes the font
+ name, "-m", and the name of the application character map to load.
+ E.g., in order to load the "lat1-16" font together with the "8859-1"
+ application character map, set this variable to "lat1-16 -m 8859-1".
+ If this variable is not set, the bootscript will not run the
+ <command>setfont</command> program, and the default VGA font will be
+ used together with the default application character map.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>UNICODE</term>
+ <listitem>
+ <para>Set this variable to "1", "yes" or "true" in order to put the
+ console into UTF-8 mode. This is useful in UTF-8 based locales and
+ harmful otherwise.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>LEGACY_CHARSET</term>
+ <listitem>
+ <para>For many keyboard layouts, there is no stock Unicode keymap in
+ the Kbd package. The <command>console</command> bootscript will
+ convert an available keymap to UTF-8 on the fly if this variable is
+ set to the encoding of the available non-UTF-8 keymap. Note, however,
+ that dead keys and composing will not work in UTF-8 mode without a
+ special kernel patch.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>BROKEN_COMPOSE</term>
+ <listitem>
+ <para>Set this to "0" if you are going to apply that kernel patch in
+ Chapter 8. Note that you also have to add the character set expected
+ by composition rules in your keymap to the FONT variable after the
+ "-m" switch.</para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>Support for compiling the keymap directly into the kernel has been
+ removed because there were reports that it leads to incorrect results.</para>
+
+ <para>Some examples:</para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>For a non-Unicode setup, only the KEYMAP and FONT variables are
+ generally needed. E.g., for a Polish setup, one would use:</para>
+
+<screen role="nodump"><userinput>cat &gt; /etc/sysconfig/console &lt;&lt; "EOF"
+<literal># Begin /etc/sysconfig/console
+
+KEYMAP="pl2"
+FONT="lat2a-16 -m 8859-2"
+
+# End /etc/sysconfig/console</literal>
EOF</userinput></screen>
+ </listitem>
- <para>For example, for Spanish users who also want to use the Euro
- character (accessible by pressing AltGr+E), the following settings are
- correct:</para>
+ <listitem>
+ <para>As mentioned above, it is sometimes necessary to adjust a
+ stock keymap slightly. The following example adds the Euro symbol to the
+ German keymap:</para>
-<screen role="nodump"><userinput>cat &gt;/etc/sysconfig/console &lt;&lt;"EOF"
-<literal>KEYMAP="es euro2"
-FONT="lat9-16 -u iso01"</literal>
+<screen role="nodump"><userinput>cat &gt; /etc/sysconfig/console &lt;&lt; "EOF"
+<literal># Begin /etc/sysconfig/console
+
+KEYMAP="de-latin1"
+KEYMAP_CORRECTIONS="euro2"
+FONT="lat0-16 -m 8859-15"
+
+# End /etc/sysconfig/console</literal>
EOF</userinput></screen>
+ </listitem>
- <note>
- <para>The <envar>FONT</envar> line above is correct only for the ISO 8859-15
- character set. If using ISO 8859-1 and, therefore, a pound sign
- instead of Euro, the correct <envar>FONT</envar> line would be:</para>
+ <listitem>
+ <para>Here is a Unicode-enabled example for Bulgarian, where a stock
+ UTF-8 keymap exists and defines no dead keys or composition rules:</para>
-<screen role="nodump"><userinput>FONT="lat1-16"</userinput></screen>
- </note>
+<screen role="nodump"><userinput>cat &gt; /etc/sysconfig/console &lt;&lt; "EOF"
+<literal># Begin /etc/sysconfig/console
+
+UNICODE="1"
+KEYMAP="bg_bds-utf8"
+FONT="LatArCyrHeb-16"
- <para>If the <envar>KEYMAP</envar> or <envar>FONT</envar> variable is not set,
- the <command>console</command> initscript will not run the corresponding
- program.</para>
-
- <para>In some keymaps, the Backspace and Delete keys send characters different
- from ones in the default keymap built into the kernel. This confuses some
- applications. For example, Emacs displays its help (instead of erasing the
- character before the cursor) when Backspace is pressed. To check if the keymap
- in use is affected (this works only for i386 keymaps):</para>
-
-<screen role="nodump"><userinput>zgrep '\W14\W' <replaceable>[/path/to/your/keymap]</replaceable></userinput></screen>
-
- <para>If the keycode 14 is Backspace instead of Delete, create the
- following keymap snippet to fix this issue:</para>
-
-<screen role="nodump"><userinput>mkdir -pv /etc/kbd &amp;&amp; cat &gt; /etc/kbd/bs-sends-del &lt;&lt;"EOF"
-<literal> keycode 14 = Delete Delete Delete Delete
- alt keycode 14 = Meta_Delete
- altgr alt keycode 14 = Meta_Delete
- keycode 111 = Remove
- altgr control keycode 111 = Boot
- control alt keycode 111 = Boot
-altgr control alt keycode 111 = Boot</literal>
+# End /etc/sysconfig/console</literal>
+EOF</userinput></screen>
+ </listitem>
+
+ <listitem>
+ <para>Because the previous example uses the 512-glyph LatArCyrHeb-16
+ font, bright colors are no longer available on the Linux console unless
+ a framebuffer is used. If one wants to have bright colors without a
+ framebuffer and can live without characters not belonging to one's own
+ language, it is still possible to use a language-specific 256-glyph font,
+ as illustrated below. This would, however, also break single quotes in
+ manual pages.</para>
+
+ <!-- And even with the LatArCyrHeb-16 font, copying-and-pasting produces
+ non-ASCII variants of opening and closing single quote instead of ` and '.
+ Maybe another sed has to be added to groff instructions that will remove
+ both issues. -->
+
+<screen role="nodump"><userinput>cat &gt; /etc/sysconfig/console &lt;&lt; "EOF"
+<literal># Begin /etc/sysconfig/console
+
+UNICODE="1"
+KEYMAP="bg_bds-utf8"
+FONT="cyr-sun16"
+
+# End /etc/sysconfig/console</literal>
EOF</userinput></screen>
+ </listitem>
- <para>Tell the <command>console</command> script to load this
- snippet after the main keymap:</para>
+ <listitem>
+ <para>The following example illustrates keymap autoconversion from
+ ISO-8859-15 to UTF-8 and enabling dead keys in Unicode mode:</para>
-<screen role="nodump"><userinput>cat &gt;&gt;/etc/sysconfig/console &lt;&lt;"EOF"
-<literal>KEYMAP_CORRECTIONS="/etc/kbd/bs-sends-del"</literal>
+<screen role="nodump"><userinput>cat &gt; /etc/sysconfig/console &lt;&lt; "EOF"
+<literal># Begin /etc/sysconfig/console
+
+UNICODE="1"
+KEYMAP="de-latin1"
+KEYMAP_CORRECTIONS="euro2"
+LEGACY_CHARSET="iso-8859-15"
+BROKEN_COMPOSE="0"
+FONT="LatArCyrHeb-16 -m 8859-15"
+
+# End /etc/sysconfig/console</literal>
EOF</userinput></screen>
+ </listitem>
+
+ <listitem>
+ <para>For Chinese, Japanese, Korean and some other languages, the Linux
+ console cannot be configured to display the needed characters. Users
+ who need such languages should install the X Window System, fonts that
+ cover the necessary character ranges, and the proper input method (e.g.,
+ SCIM, which supports a wide variety of languages).</para>
+ </listitem>
- <para>To compile the keymap directly into the kernel instead of
- setting it every time from the <command>console</command> bootscript,
- follow the instructions given in <xref linkend="ch-bootable-kernel" role="."/>
- Doing this ensures that the keyboard will always work as expected,
- even when booting into maintenance mode (by passing
- <parameter>init=/bin/sh</parameter> to the kernel), because the
- <command>console</command> bootscript will not be run in that
- situation. Additionally, the kernel will not set the screen font
- automatically. This should not pose many problems because ASCII characters
- will be handled correctly, and it is unlikely that a user would need
- to rely on non-ASCII characters while in maintenance mode.</para>
-
- <para>Since the kernel will set up the keymap, it is possible to omit
- the <envar>KEYMAP</envar> variable from the
- <filename>/etc/sysconfig/console</filename> configuration file. It can
- also be left in place, if desired, without consequence. Keeping it
- could be beneficial if running several different kernels where it is
- difficult to ensure that the keymap is compiled into every one of
- them.</para>
+ </itemizedlist>
+
+ <!-- Added because folks keep posting their console file with X questions
+ to blfs-support list -->
+ <note>
+ <para>The <filename>/etc/sysconfig/console</filename> file only controls
+ Linux text console localization. It has nothing to do with setting the proper
+ keyboard layout and terminal fonts in the X Window System.</para>
+ </note>
</sect1>
diff --git a/chapter07/profile.xml b/chapter07/profile.xml
index dd53a5141..ae7617ba7 100644
--- a/chapter07/profile.xml
+++ b/chapter07/profile.xml
@@ -69,17 +69,19 @@
for the desired language (e.g., <quote>en</quote>) and
<replaceable>[CC]</replaceable> with the two-letter code for the appropriate
country (e.g., <quote>GB</quote>). <replaceable>[charmap]</replaceable> should
- be replaced with the canonical charmap for your chosen locale.</para>
+ be replaced with the canonical charmap for your chosen locale. Optional
+ modifiers such as <quote>@euro</quote> may also be present.</para>
<para>The list of all locales supported by Glibc can be obtained by running
the following command:</para>
<screen role="nodump"><userinput>locale -a</userinput></screen>
- <para>Locales can have a number of synonyms, e.g. <quote>ISO-8859-1</quote>
+ <para>Charmaps can have a number of aliases, e.g. <quote>ISO-8859-1</quote>
is also referred to as <quote>iso8859-1</quote> and <quote>iso88591</quote>.
- Some applications cannot handle the various synonyms correctly, so it is
- safest to choose the canonical name for a particular locale. To determine
+ Some applications cannot handle the various aliases correctly (e.g., they
+ require that "UTF-8" is written as "UTF-8", not "utf8"), so it is safest in
+ most cases to choose the canonical name for a particular locale. To determine
the canonical name, run the following command, where <replaceable>[locale
name]</replaceable> is the output given by <command>locale -a</command> for
your preferred locale (<quote>en_GB.iso88591</quote> in our example).</para>
@@ -115,6 +117,7 @@ LC_ALL=[locale name] locale int_prefix</userinput></screen>
Further instructions assume that there are no such error messages from
Glibc.</para>
+ <!-- FIXME: the xlib example will become obsolete real soon -->
<para>Some packages beyond LFS may also lack support for your chosen locale. One
example is the X library (part of the X Window System), which outputs the
following error message:</para>
@@ -139,23 +142,43 @@ LC_ALL=[locale name] locale int_prefix</userinput></screen>
<screen><userinput>cat &gt; /etc/profile &lt;&lt; "EOF"
<literal># Begin /etc/profile
-export LANG=<replaceable>[ll]</replaceable>_<replaceable>[CC]</replaceable>.<replaceable>[charmap]</replaceable>
+export LANG=<replaceable>[ll]</replaceable>_<replaceable>[CC]</replaceable>.<replaceable>[charmap]</replaceable><replaceable>[@modifiers]</replaceable>
export INPUTRC=/etc/inputrc
# End /etc/profile</literal>
EOF</userinput></screen>
+ <para>The <quote>C</quote> (default) and <quote>en_US</quote> (the recommended
+ one for United States English users) locales are different. <quote>C</quote>
+ uses the US-ASCII 7-bit character set, and treats bytes with the high bit set
+ as invalid characters. That's why, e.g., the <command>ls</command> command
+ substitutes them with question marks in that locale. Also, an attempt to send
+ mail with such characters from Mutt or Pine results in non-RFC-conforming
+ messages being sent (the charset in the outgoing mail is indicated as "unknown
+ 8-bit"). So you can use the <quote>C</quote> locale only if you are sure that
+ you will never need 8-bit characters.</para>
+
+ <para>UTF-8 based locales are not supported well by many programs. E.g., the
+ <command>watch</command> program displays only ASCII characters in UTF-8
+ locales and has no such restriction in traditional 8-bit locales like en_US.
+ Without patches and/or installing software beyond BLFS, in UTF-8 based locales
+ you will not be able to do such basic tasks as printing plain-text files from
+ the command line, recording Windows-readable CDs with filenames containing
+ non-ASCII characters, viewing ID3v1 tags in MP3 files and so on. It is also
+ impossible (without damaging non-ASCII characters) to connect using ssh from
+ the system using a UTF-8 based locale to a host that still uses a traditional
+ 8-bit locale, and vice versa. In short, use UTF-8 only if you are going to
+ use KDE or GNOME and never open a terminal, or if you are going to tolerate
+ bugs.</para>
+ <!-- All abovementioned problems except "watch" have a known fix beyond BLFS -->
+
<note>
- <para>The <quote>C</quote> (default) and <quote>en_US</quote> (the
- recommended one for United States English users) locales are different.</para>
+ <para>Bug reports that are reproducible only in UTF-8 locales, and for which
+ no patch or other fix is mentioned in the report, will be closed immediately,
+ without investigation, with the "WONTFIX" resolution and a "don't use this
+ program or revert to non-UTF-8 locale" comment. Patches that have ill
+ effects in non-UTF-8 locales (other than replacement of translated program
+ messages with English ones) will be rejected.</para>
</note>
- <para>Setting the keyboard layout, screen font, and locale-related environment
- variables are the only internationalization steps needed to support locales
- that use ordinary single-byte encodings and left-to-right writing direction.
- More complex cases (including UTF-8 based locales) require additional steps
- and additional patches because many applications tend to not work properly
- under such conditions. These steps and patches are not included in the LFS
- book and such locales are not yet supported by LFS.</para>
-
</sect1>
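
The variable semantics documented in the console.xml hunk above can be sketched as a small dry-run shell helper. This is an illustration only and is not the replacement console bootscript shipped with the patch: the function names unicode_enabled and console_plan are hypothetical, the "utf8-mode on/off" labels are invented for this sketch, and LEGACY_CHARSET and BROKEN_COMPOSE are left out because the diff does not show how the script implements them. The sketch only prints which of loadkeys and setfont would be invoked for a given set of values.

```shell
#!/bin/sh
# Dry-run sketch (assumption: this is NOT the real LFS console bootscript).

# Hypothetical helper: succeed when the UNICODE value requests UTF-8 mode.
# The accepted values "1", "yes" and "true" are the ones the text lists.
unicode_enabled() {
    case "$1" in
        1|yes|true) return 0 ;;
        *)          return 1 ;;
    esac
}

# Hypothetical helper: print the steps implied by the settings.
# Arguments: KEYMAP, KEYMAP_CORRECTIONS, FONT, UNICODE (empty string = unset).
console_plan() {
    if unicode_enabled "$4"; then
        echo "utf8-mode on"
    else
        echo "utf8-mode off"
    fi
    # An unset KEYMAP means loadkeys is not run for the main keymap.
    if [ -n "$1" ]; then
        echo "loadkeys $1"
    fi
    # KEYMAP_CORRECTIONS feeds a second loadkeys call.
    if [ -n "$2" ]; then
        echo "loadkeys $2"
    fi
    # An unset FONT means setfont is not run.
    if [ -n "$3" ]; then
        echo "setfont $3"
    fi
    return 0
}

# Example: the German setup with the Euro correction from the patch.
console_plan "de-latin1" "euro2" "lat0-16 -m 8859-15" ""
```

Passing the four values as positional arguments rather than sourcing /etc/sysconfig/console keeps the sketch free of side effects, so it can be tried on any system without touching the console.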