Diffstat (limited to 'chapter06')
-rw-r--r-- | chapter06/man-db.xml | 196 |
1 files changed, 119 insertions, 77 deletions
diff --git a/chapter06/man-db.xml b/chapter06/man-db.xml
index 5be9fa4e1..729abbaf7 100644
--- a/chapter06/man-db.xml
+++ b/chapter06/man-db.xml
@@ -111,71 +111,95 @@
 <screen><userinput remap="install">make install</userinput></screen>
- <para>Some packages provide UTF-8 manual pages, which previous versions of
- <application>Man-DB</application> were unable to display. This limitation
- has been fixed in recent versions, and <application>Man-DB</application>
- can now convert manual pages from legacy encodings to UTF-8
- (and vice-versa) on the fly. This used to be a rather annoying
- problem across different distributions, as packages written for one
- distribution would require changes to work on another. The following
- script will allow you to convert manual pages to and from legacy and UTF-8
- encodings.</para>
-
-<screen><userinput remap="install">cat >> convert-mans << "EOF"
-<literal>#!/bin/sh -e
-FROM="$1"
-TO="$2"
-shift ; shift
-while [ $# -gt 0 ]
-do
- FILE="$1"
- shift
- iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
- mv .tmp.iconv "$FILE"
-done</literal>
-EOF
-install -m755 convert-mans /usr/bin</userinput></screen>
-
- <para>Additional information regarding the compression of
- man and info pages can be found in the BLFS book at
- <ulink url="&blfs-root;view/cvs/postlfs/compressdoc.html"/>.</para>
-
  </sect2>
  <sect2>
  <title>Non-English Manual Pages in LFS</title>
+<!--
+ <para>Some packages provide UTF-8 manual pages, which previous versions of
+ <application>Man-DB</application> were unable to display correctly because
+ the expected (8-bit) encoding for each language was hard-coded in the
+ source of <application>Man-DB</application>.
+ <application>Man-DB</application> now uses the extension of the directory
+ name in order to determine the encoding of the manual pages stored within.
+ If no extension exists, <application>Man-DB</application> uses a built-in
+ table (see below) to determine the encoding. E.g., because of "UTF-8" in
+ the directory name, it knows that all manual pages residing in
+ <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
+ encoded and, according to the built-in table, expects all manual pages
+ residing in <filename class="directory">/usr/share/man/ru</filename> to
+ be encoded using KOI8-R.</para>
  <para>Linux distributions have different policies concerning the character
  encoding in which manual pages are stored in the filesystem. E.g., RedHat
  stores all manual pages in UTF-8, while Debian previously used
- language-specific (mostly 8-bit) encodings. As mentioned above, this leads
- to incompatibility of packages with manual pages designed for different
- distributions.</para>
-
- <para>LFS previously used the same convention as Debian. This was chosen
- because <application>Man-DB</application> did not understand manual pages
- stored in UTF-8 at the time of its introduction into LFS. For our purposes
- at that time, <application>Man-DB</application> was preferable to
- <application>Man</application> as it worked without any additional
- configuration in any locale. This is still true today as
- <application>Man-DB</application> with Debian patched
- <application>Groff</application> will now dynamically convert UTF-8 encoded
- manual pages to the user's locale. Additionally, this combination provides
- support for Chinese and Japanese locales, and limited support for Korean,
- whereas <application>Man</application> does not. The current offering of
- <application>Man</application> as used in RedHat requires major
- modifications to both the <application>Man</application> and
- <application>Groff</application> packages, and still falls short on
- Chinese, Japanese, and Korean encodings.</para>
-
- <para>Finally, most distributions, including Debian, are rapidly migrating
- to all UTF-8 encoded manual pages. Upstream packagers will very likely drop
- legacy encodings in favor of UTF-8, though adoption has been slow due to
- the hacks required to make the current <application>Man</application> and
- <application>Groff</application> packages work correctly together.</para>
-
- <para>The relationship between language codes and the expected encoding
- of legacy manual pages is listed below.</para>
+ language-specific (mostly 8-bit) encodings. Many other distributions simply
+ ignore the problem all together. LFS also used the legacy encodings in
+ previuos versions of the book. This was chosen because of the ease of
+ configuration associated with <application>Man-DB</application>.
+ Additionally, <application>Man-DB</application> provided support for
+ Chinese and Japanese locales, and limited support for Korean, whereas
+ <application>Man</application> did not at that time.</para>
+
+ <para>In contrast, the setup in Fedora Core expects all manual pages
+ to be UTF-8 encoded, and stored in directories without suffixes.
+ Disagreement about the expected encoding of manual pages amongst
+ distribution vendors, has led to confusion for upstream package maintainers.
+ Some packages contain, UTF-8 manual pages, while others ship with manual
+ pages in legacy encodings. Unlike the
+ <application>Man</application>/<application>Groff</application> setup in
+ Fedora Core, <application>Man-DB</application> can make very good decisions
+ about the on disk encoding and present the information to the user in their
+ prefered format, without complex configurations.</para>
+
+ <para><application>Man-DB</application> has, for the most part, made this
+ problem completely transparent to end users, as long as the manual pages
+ are installed into the correct directory. There may be times, however,
+ where one encoding is preferred over the other. For this purpose, the
+ <command>convert-mans</command> script was written. It will convert manual
+ pages to another encoding before (or after) installation. Install the
+ <command>convert-mans</command> script with the following
+ instructions:</para>
+-->
+ <para>Some packages provide non-English manual pages. They are displayed
+ correctly only if their location and encoding matches the expectation of
+ the "man" program. However, different Linux distributions have different
+ policies (expressed in the choice of the <command>man</command> program,
+ its configuration and patches applied to it) concerning the character
+ encoding in which manual pages are stored in the filesystem.</para>
+
+ <para>E.g., Debian previously required Russian manual pages to be encoded
+ in KOI8-R and to be placed in
+ <filename class="directory">/usr/share/man/ru</filename>. Now, in addition,
+ their <command>man</command> program (<application>Man-DB</application>)
+ searches for UTF-8 encoded Russian manual pages in
+ <filename class="directory">/usr/share/man/ru.UTF-8</filename>. On the
+ other hand, Fedora uses UTF-8 encoded manual pages exclusively. Russian
+ manual pages are found in
+ <filename class="directory">/usr/share/man/ru</filename> and their
+ <command>man</command> program doesn't acknowledge
+ <filename class="directory">/usr/share/man/ru.UTF-8</filename>. Many
+ other distributions ignore the on disk encodings completely, leaving the
+ end user with a mix of improperly encoded manual pages for their
+ configuration. When <command>man</command> processes the requtested page,
+ it will display the contents as configured, resulting in completely
+ unreadable text if the on disk encoding is not what is expected for that
+ configuration.</para>
+
+ <para>Disagreement about the expected encoding of manual pages amongst
+ distribution vendors, has led to confusion for upstream package
+ maintainers. One package may contain UTF-8 manual pages, while another
+ ships with manual pages in legacy encodings. <command>man</command>
+ searches for manual pages based on the user's locale settings.
+ <application>Man-DB</application> uses a built-in table (see below) to
+ determine the on disk encoding of manual pages found for a user's
+ locale, only if the directories found do not have an extension that
+ describes the encoding. E.g., because of ".UTF-8" in the directory name,
+ <application>Man-DB</application> knows that all manual pages residing in
+ <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
+ encoded and, according to the built-in table, expects all manual pages
+ residing in <filename class="directory">/usr/share/man/ru</filename> to
+ be encoded using KOI8-R.</para>
  <!-- Origin: man-db-2.5.2/src/encodings.c -->
  <table>
@@ -308,7 +332,7 @@ install -m755 convert-mans /usr/bin</userinput></screen>
  <entry>GBK</entry>
  </row>
  <row>
- <entry>Simplified Chinese,Singapore} (zh_SG)</entry>
+ <entry>Simplified Chinese, Singapore (zh_SG)</entry>
  <entry>GBK</entry>
  </row>
  <row>
@@ -330,12 +354,36 @@ install -m755 convert-mans /usr/bin</userinput></screen>
  Norwegian does not work because of the transition from no_NO to nb_NO
  locale, and will be fixed in the next release of
  <application>Man-DB</application>. Korean is currently non functional
- because of incomplete fixes in the Groff patch.</para>
+ because of incomplete fixes in the Debian
+ <application>Groff</application> patch applied in LFS.</para>
  </note>
+ <para>Packages may install manual pages into an improperly named directory,
+ depending on which distributions the author develops the package for. To
+ assist in the conversion of the manual pages to the proper encoding for the
+ directory in which they are installed, the <command>convert-mans</command>
+ script was written. It will convert manual pages to another encoding before
+ (or after) installation. Install the <command>convert-mans</command>
+ script with the following instructions:</para>
+
+<screen><userinput remap="install">cat >> convert-mans << "EOF"
+<literal>#!/bin/sh -e
+FROM="$1"
+TO="$2"
+shift ; shift
+while [ $# -gt 0 ]
+do
+ FILE="$1"
+ shift
+ iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
+ mv .tmp.iconv "$FILE"
+done</literal>
+EOF
+install -m755 convert-mans /usr/bin</userinput></screen>
+
- <para>If upstream distributes the manual pages in a legacy encoding,
- the manual pages can simply be copied to
+ <para>If upstream distributes the manual pages in a legacy encoding, the
+ manual pages can simply be copied to
  <filename class="directory">/usr/share/man/<replaceable><language code></replaceable></filename>.
  For example, <ulink
  url="http://www.infodrom.org/projects/manpages-de/download/manpages-de-0.5.tar.gz">
@@ -353,26 +401,20 @@ cp -rv man? /usr/share/man/de</userinput></screen>
  code></replaceable>.UTF-8</filename>.</para>
  <para>For example, to install <ulink
- url="http://ditec.um.es/~piernas/manpages-es/man-pages-es-1.55.tar.bz2">
- Spanish manual pages</ulink> in the legacy encoding, use the following
+ url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
+ French manual pages</ulink> in the legacy encoding, use the following
  commands:</para>
-<screen role="nodump"><userinput>mv man7/iso_8859-7.7{,X}
-convert-mans UTF-8 ISO-8859-1 man?/*.?
-mv man7/iso_8859-7.7{X,}
-make install</userinput></screen>
+<screen role="nodump"><userinput>convert-mans UTF-8 ISO-8859-1 man?/*.?
+mkdir -p /usr/share/man/fr
+cp -rv man? /usr/share/man/fr</userinput></screen>
- <note>
- <para>The <filename>man7/iso_8859-7.7</filename> file needs to be
- exclueded from the conversion process because it is already in
- ISO-8859-1 format. This is a packaging bug in man-pages-es-1.55.
- Future versions should not require this workaround.</para>
- </note>
+ <note><para>The French manual pages ship with ready made scripts to do the
+ same conversion. The above instructions are used only as an example for
+ use of the <command>convert-mans</command> script.</para></note>
- <para>Finally, as an example installation of UTF-8 manual pages, the <ulink
- url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
- French manual pages</ulink> can be installed with the following
- commands:</para>
+ <para>Finally, as an example installation of UTF-8 manual pages, again, the
+ French manual pages could be installed with the following commands:</para>
 <screen role="nodump"><userinput>mkdir -p /usr/share/man/fr.UTF-8
 cp -rv man? /usr/share/man/fr.UTF-8</userinput></screen>