From 5536f7440f2f4a12782e8d741cbbba5f1c3cfea8 Mon Sep 17 00:00:00 2001
From: Archaic
Date: Mon, 26 Dec 2005 19:00:06 +0000
Subject: Applied Alexander Patrakov's patch which adds UTF-8 capability to the development branch of the LFS Book.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@7235 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689
---
 chapter07/profile.xml | 53 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 15 deletions(-)

(limited to 'chapter07/profile.xml')

diff --git a/chapter07/profile.xml b/chapter07/profile.xml
index dd53a5141..ae7617ba7 100644
--- a/chapter07/profile.xml
+++ b/chapter07/profile.xml
@@ -69,17 +69,19 @@
 for the desired language (e.g., en) and
 [CC] with the two-letter code for the appropriate
 country (e.g., GB). [charmap] should
-be replaced with the canonical charmap for your chosen locale.
+be replaced with the canonical charmap for your chosen locale. Optional
+modifiers such as @euro may also be present.
 
 The list of all locales supported by Glibc can be obtained by running
 the following command:
 
 locale -a
 
-Locales can have a number of synonyms, e.g. ISO-8859-1
+Charmaps can have a number of aliases, e.g. ISO-8859-1
 is also referred to as iso8859-1 and iso88591.
-Some applications cannot handle the various synonyms correctly, so it is
-safest to choose the canonical name for a particular locale. To determine
+Some applications cannot handle the various synonyms correctly (e.g. require
+that "UTF-8" is written as "UTF-8", not "utf8"), so it is safest in most
+cases to choose the canonical name for a particular locale. To determine
 the canonical name, run the following command, where [locale
 name] is the output given by locale -a for
 your preferred locale (en_GB.iso88591 in our example).
@@ -115,6 +117,7 @@ LC_ALL=[locale name] locale int_prefix
 Further instructions assume that there are no such error messages from
 Glibc.
 
+
 Some packages beyond LFS may also lack support for your chosen locale. One
 example is the X library (part of the X Window System), which outputs the
 following error message:
@@ -139,23 +142,43 @@ LC_ALL=[locale name] locale int_prefix
 cat > /etc/profile << "EOF"
 # Begin /etc/profile
 
-export LANG=[ll]_[CC].[charmap]
+export LANG=[ll]_[CC].[charmap][@modifiers]
 export INPUTRC=/etc/inputrc
 
 # End /etc/profile
 EOF
 
+The C (default) and en_US (the recommended
+one for United States English users) locales are different. C
+uses the US-ASCII 7-bit character set, and treats bytes with the high bit set
+as invalid characters. That's why, e.g., the ls command
+substitutes them with question marks in that locale. Also, an attempt to send
+mail with such characters from Mutt or Pine results in non-RFC-conforming
+messages being sent (the charset in the outgoing mail is indicated as "unknown
+8-bit"). So you can use the C locale only if you are sure that
+you will never need 8-bit characters.
+
+UTF-8 based locales are not supported well by many programs. E.g., the
+watch program displays only ASCII characters in UTF-8
+locales and has no such restriction in traditional 8-bit locales like en_US.
+Without patches and/or installing software beyond BLFS, in UTF-8 based locales
+you will not be able to do such basic tasks as printing plain-text files from
+the command line, recording Windows-readable CDs with filenames containing
+non-ASCII characters, viewing ID3v1 tags in MP3 files and so on. It is also
+impossible (without damaging non-ASCII characters) to connect using ssh from
+the system using a UTF-8 based locale to a host that still uses a traditional
+8-bit locale, and vice versa. In short, use UTF-8 only if you are going to
+use KDE or GNOME and never open the terminal, or if you are going to tolerate
+bugs.
+
+
-The C (default) and en_US (the
-recommended one for United States English users) locales are different.
+Bug reports reproducible only in UTF-8 locales and for which there
+is no patch or other fix mentioned in the report, will be closed immediately,
+without investigation, with the "WONTFIX" resolution and a "don't use this
+program or revert to non-UTF-8 locale" comment. Patches that have ill
+effects in non-UTF-8 locales (other than replacement of translated program
+messages with English ones) will be rejected.
 
-Setting the keyboard layout, screen font, and locale-related environment
-variables are the only internationalization steps needed to support locales
-that use ordinary single-byte encodings and left-to-right writing direction.
-More complex cases (including UTF-8 based locales) require additional steps
-and additional patches because many applications tend to not work properly
-under such conditions. These steps and patches are not included in the LFS
-book and such locales are not yet supported by LFS.
-
-- 
cgit v1.2.3-54-g00ecf