From 9f405998bbba6a2a88328d9d807eab07d8937a40 Mon Sep 17 00:00:00 2001
From: Matthew Burgess
Date: Sat, 16 May 2009 13:35:05 +0000
Subject: Resolve several man-db encoding configuration issues. Fixes #2298.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@8871 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689
---
 chapter06/man-db.xml | 278 ++++++++++++++-------------------------------------
 1 file changed, 73 insertions(+), 205 deletions(-)

diff --git a/chapter06/man-db.xml b/chapter06/man-db.xml
index bc33c4a57..79e817d83 100644
--- a/chapter06/man-db.xml
+++ b/chapter06/man-db.xml
@@ -41,12 +41,11 @@
 Installation of Man-DB
- LFS creates /usr/man and
- /usr/local/man as symlinks. Remove them from the
- man_db.conf file to prevent redundant
- results when using programs such as whatis:
+ Apply a patch to fix a problem with the testsuite, which doesn't
+ expect col to be UTF-8 aware, which Util-Linux-NG's
+ version is:
-sed -i -e '\%\t/usr/man%d' -e '\%\t/usr/local/man%d' src/man_db.conf.in
+patch -Np1 -i ../&man-db-testsuite-patch;
 Prepare Man-DB for compilation:
@@ -88,7 +87,9 @@
 make
- This package does not come with a test suite.
+ To test the results, issue:
+
+make check
 Install the package:
@@ -99,47 +100,13 @@
 Non-English Manual Pages in LFS
- Some packages provide non-English manual pages. They are displayed
- correctly only if their location and encoding matches the expectation of
- the "man" program. However, different Linux distributions have different
- policies (expressed in the choice of the man program,
- its configuration and patches applied to it) concerning the character
- encoding in which manual pages are stored in the filesystem.
-
- E.g., Debian previously required Russian manual pages to be encoded
- in KOI8-R and to be placed in
- /usr/share/man/ru. Now, in addition,
- their man program (Man-DB)
- searches for UTF-8 encoded Russian manual pages in
- /usr/share/man/ru.UTF-8. On the
- other hand, Fedora uses UTF-8 encoded manual pages exclusively. Russian
- manual pages are found in
- /usr/share/man/ru and their
- man program doesn't acknowledge
- /usr/share/man/ru.UTF-8. Many
- other distributions ignore the on disk encodings completely, leaving the
- end user with a mix of improperly encoded manual pages for their
- configuration. When man processes the requested page,
- it will display the contents as configured, resulting in completely
- unreadable text if the on disk encoding is not what is expected for that
- configuration.
-
- Disagreement about the expected encoding of manual pages amongst
- distribution vendors, has led to confusion for upstream package
- maintainers. One package may contain UTF-8 manual pages, while another
- ships with manual pages in legacy encodings. man
- searches for manual pages based on the user's locale settings.
- Man-DB uses a built-in table (see below) to
- determine the on disk encoding of manual pages found for a user's
- locale, only if the directories found do not have an extension that
- describes the encoding. E.g., because of ".UTF-8" in the directory name,
- Man-DB knows that all manual pages residing in
- /usr/share/man/fr.UTF-8 are UTF-8
- encoded and, according to the built-in table, expects all manual pages
- residing in /usr/share/man/ru to
- be encoded using KOI8-R.
-
-
+ The following table shows the character set that Man-DB assumes
+ manual pages installed under
+ /usr/share/man/<ll> will be
+ encoded with. In addition to this, Man-DB correctly determines if manual
+ pages installed in that directory are UTF-8 encoded.
+
+
 Expected character encoding of legacy 8-bit manual pages
@@ -164,38 +131,44 @@
 Danish (da)
 ISO-8859-1
- Bulgarian (bg)
- CP1251
+ Croatian (hr)
+ ISO-8859-1
 German (de)
 ISO-8859-1
- Czech (cs)
+ Hungarian (hu)
 ISO-8859-2
 English (en)
 ISO-8859-1
- Croatian (hr)
- ISO-8859-2
+ Japanese (ja)
+ EUC-JP
 Spanish (es)
 ISO-8859-1
- Hungarian (hu)
- ISO-8859-2
+ Korean (ko)
+ EUC-KR
+
+
+ Estonian (et)
+ ISO-8859-1
+ Lithuanian (lt)
+ ISO-8859-13
 Finnish (fi)
 ISO-8859-1
- Japanese (ja)
- EUC-JP
+ Latvian (lv)
+ ISO-8859-13
 French (fr)
 ISO-8859-1
- Korean (ko)
- EUC-KR
+ Macedonian (mk)
+ ISO-8859-5
 Irish (ga)
@@ -206,117 +179,88 @@
 Galician (gl)
 ISO-8859-1
+ Romanian (ro)
+ ISO-8859-2
+
+
+ Indonesian (id)
+ ISO-8859-1
 Russian (ru)
 KOI8-R
- Indonesian (id)
+ Icelandic (is)
 ISO-8859-1
 Slovak (sk)
 ISO-8859-2
- Icelandic (is)
+ Italian (it)
 ISO-8859-1
- Serbian (sr)
- ISO-8859-5
+ Slovenian (sl)
+ ISO-8859-2
- Italian (it)
+ Norwegian Bokmal (nb)
 ISO-8859-1
- Turkish (tr)
- ISO-8859-9
+ Serbian Latin (sr@latin)
+ ISO-8859-2
 Dutch (nl)
 ISO-8859-1
- Simplified Chinese (zh_CN)
- GBK
+ Serbian (sr)
+ ISO-8859-5
+
+
+ Norwegian Nynorsk (nn)
+ ISO-8859-1
+ Turkish (tr)
+ ISO-8859-9
- Norwegian (no)
 ISO-8859-1
- Simplified Chinese, Singapore (zh_SG)
- GBK
+ Ukrainian (uk)
+ KOI8-U
- Portuguese (pt)
 ISO-8859-1
- Traditional Chinese (zh_TW)
- BIG5
+ Vietnamese (vi)
+ TCVN5712-1
 Swedish (sv)
 ISO-8859-1
- Traditional Chinese, Hong Kong (zh_HK)
- BIG5HKSCS
-
-
-
-
+ Greek (el)
+ ISO-8859-7
+
+
+
+
@@ -324,75 +268,9 @@
- Manual pages in languages not in the list are not supported.
- Norwegian does not work because of the transition from no_NO to
- nb_NO locale, and will be fixed in the next release of
- Man-DB. Korean is currently non functional
- because of incomplete fixes in the Debian
- Groff patch applied in LFS.
+ Manual pages in languages not in the list are not supported.
- Packages may install manual pages into an improperly named directory,
- depending on which distributions the author develops the package for. To
- assist in the conversion of the manual pages to the proper encoding for the
- directory in which they are installed, the convert-mans
- script was written. It will convert manual pages to another encoding before
- (or after) installation. Install the convert-mans
- script with the following instructions:
-
-cat >> convert-mans << "EOF"
-#!/bin/sh -e
-FROM="$1"
-TO="$2"
-shift ; shift
-while [ $# -gt 0 ]
-do
- FILE="$1"
- shift
- iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
- mv .tmp.iconv "$FILE"
-done
-EOF
-install -v -m755 convert-mans /usr/bin
-
-
- If upstream distributes the manual pages in a legacy encoding, the
- manual pages can simply be copied to
- /usr/share/man/<language
- code>. For example,
- German manual pages can be installed with the following
- commands:
-
-mkdir -p /usr/share/man/de
-cp -rv man? /usr/share/man/de
-
- If upstream distributes manual pages in UTF-8 (i.e., for
- RedHat) instead of the encoding listed in the table above, they
- can either be converted from UTF-8 to the encoding listed in the table
- above, or they can be installed directly into
- /usr/share/man/<language
- code>.UTF-8.
-
- For example, to install
- French manual pages in the legacy encoding, use the following
- commands:
-
-convert-mans UTF-8 ISO-8859-1 man?/*.?
-mkdir -p /usr/share/man/fr
-cp -rv man? /usr/share/man/fr
-
- The French manual pages ship with ready made scripts to do the
- same conversion. The above instructions are used only as an example for
- use of the convert-mans script.
-
- Finally, as an example installation of UTF-8 manual pages, again, the
- French manual pages could be installed with the following commands:
-
-mkdir -p /usr/share/man/fr.UTF-8
-cp -rv man? /usr/share/man/fr.UTF-8
-
@@ -445,16 +323,6 @@ cp -rv man? /usr/share/man/fr.UTF-8
-
- convert-mans
-
- Reformats manual pages into the chosen encoding.
-
- convert-mans
-
-
-
-
 lexgrog