Translations (was Re: Ray's Euro Visits)



Hello all!

On Tue, 16 Jan 2001, Paul wrote:

> > 1. We need to be able to configure the language displayed in the
> > menus, buttons, and boxes of a gui.
> 
> There would appear to be a mechanism used by the various linux distros to do
> this sort of thing. Mandrake has a set of files buried in the install tree with
> a file extension of .po - Perhaps the same system could be used with emc.
> 

I know a bit about the system you are speaking of -- it's called gettext.
I only know how to use it with C/C++.  I don't know how it works with 
tcl/tk.

This message turned out to be a bit longer than I expected.  I hope you
all don't mind :)


Here is a list of the basic steps you need for translating a program
with gettext:

1. All the text strings in the source code to be translated are
   "marked" with a special macro.

   This macro is usually called either gettext or just an underscore.
   Printing "Hello, World" would look something like this:
   printf( _("Hello, world") ); 

2. A small utility program is used to scan all of the source code and
   extract all of the strings marked by the gettext macro from step #1 
   into a single file. 

   This file is called the template file (conventionally given the
   extension .pot) and contains each of the text strings to be
   translated beside an empty translation.  The utility that creates
   this file is called xgettext.
   
3. Somewhere near the start of the program, gettext needs to be
   initialized and the translations need to be loaded.

   This is accomplished by calling several functions: setlocale,
   bindtextdomain and textdomain.  The setlocale function selects the
   current language (normally from the environment), the
   bindtextdomain function tells gettext which directory holds the
   translated strings for a "domain", and the textdomain function sets
   the current "domain" that the translations are coming from.

4. The template file created in step #2 is then given to a translator,
   who fills in all of the empty translations.

   This newly created file is called the ".po" file and has the
   extension .po.  There is one .po file for each language.

   An already existing .po file can be updated with a new template
   by a utility called msgmerge.  It is customary to end this new file 
   with the extension .pox which gets renamed back to .po when the
   translations have been made.

5. The translated file from step #4 which contains all the text
   strings and their translations is run through another utility
   program to generate a binary file for runtime.

   These binary files are called the ".mo" files and have the
   extension .mo.  The utility that generates the .mo files is called
   msgfmt.

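   These .mo files, again one per language, get installed in their
   respective directories under /usr/share/locale or
   /usr/local/share/locale (for example,
   /usr/share/locale/fr/LC_MESSAGES/emc.mo for a French translation of
   a package called "emc").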

6. The $LANG environment variable is set to the new language, and the
   program is run. It should now be translated!

   This environment variable is usually set up by the distro.
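
Here is a rough sketch of what steps #1 and #3 might look like in C.
The names PACKAGE, LOCALEDIR, init_translations, binding_matches and
greeting are ones I made up for illustration, and the two macro values
are only placeholders for whatever the Makefile would really pass in.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Step 1: the usual shorthand for marking translatable strings. */
#define _(String) gettext(String)

/* Placeholder values; normally the Makefile passes these in with -D. */
#ifndef PACKAGE
#define PACKAGE "emc"
#endif
#ifndef LOCALEDIR
#define LOCALEDIR "/usr/share/locale"
#endif

/* Step 3: call once near the start of main().  Returns 0 on success. */
int init_translations(void)
{
    /* Pick up the language from the environment ($LANG and friends).
       A NULL return means the requested locale is not available; the
       program then simply keeps running in the "C" locale. */
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "warning: locale not supported, using \"C\"\n");

    /* Tell gettext where the .mo files for our package live ... */
    if (bindtextdomain(PACKAGE, LOCALEDIR) == NULL)
        return -1;
    /* ... and make our package the current domain for gettext(). */
    if (textdomain(PACKAGE) == NULL)
        return -1;
    return 0;
}

/* Debugging aid: passing NULL as the directory only queries the
   current binding, so this checks where gettext will really look. */
int binding_matches(const char *expected_dir)
{
    assert(expected_dir != NULL);
    const char *bound = bindtextdomain(PACKAGE, NULL);
    return bound != NULL && strcmp(bound, expected_dir) == 0;
}

/* A marked string is looked up at run time; when no translation is
   found, gettext() hands back the original string unchanged. */
const char *greeting(void)
{
    return _("Hello, world");
}
```

A program would call init_translations() first thing in main() and
then use _() everywhere else, for example puts(greeting()).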

For more information and perhaps a better explanation, see the info
pages on gettext.  You should be able to get to these pages from
gnome-help-browser or kdehelp.


Here are some notes and gotchas about using gettext:

* The bindtextdomain function needs two strings: The name of the
  "package" [for example, "emc"] and the directory that the gettext
  files are in [usually "/usr/share/locale"].  The textdomain function
  needs only the package name.  These strings are usually passed in
  as macros from the Makefile.

  If these strings do not match the current installation, the program
  will not get translated.  This can be difficult to debug.

* The Makefile rules can be tricky -- especially considering the
  xgettext utility needs a list of all of the source files.

* When called with LC_ALL, the setlocale function also sets up things
  like how numbers are formatted -- i.e., 1000.45 vs 1,000.45 vs
  1 000,45.  This might cause problems with input routines that do not
  take this variation into consideration.  Yet another friend of the
  Y2K Bug :)

* For libraries, the textdomain function call from step #3 cannot be
  used as it sets a global variable.  Because of this global not being
  set, the gettext macro will not know where the translations are
  coming from.

  Instead of using the gettext macro, a different function called
  dgettext is used, which specifies the domain that the translations
  are coming from each time a string is accessed.

  An easy fix is to set the underscore macro to dgettext instead of
  gettext.  If the underscore macro is used all the time then nobody
  will even have to think about it.

* When using msgmerge to update an already existing .po file, watch
  out for "fuzzy" entries -- ones with the word fuzzy in the comment
  above them.

  These are generated when the original text string changes but is
  still close enough to the old one that msgmerge can find the
  matching translation.

  These strings will not get translated until they are fixed and the
  fuzzy mark has been removed.

* For initializers that must be constant, the normal gettext macro
  cannot be used because it makes a function call.  Instead, a macro
  called gettext_noop or N_ is used.  This macro does nothing except
  allow xgettext to find the text string.  The real macro is then
  called when the translated string is actually used.

* For platforms that do not support gettext, all of the macros can
  simply be defined to do nothing.
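
For the number formatting gotcha above, one possible workaround
(assuming the program wants translated messages but always reads its
numbers with a period as the decimal point) is to put just the numeric
category back to "C" after the main setlocale call.  The function
names here are my own invention.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: take everything from the environment, then
   force plain "C" rules for numbers only. */
void init_locale_c_numbers(void)
{
    setlocale(LC_ALL, "");       /* language, dates, etc. */
    setlocale(LC_NUMERIC, "C");  /* always '.' as the decimal point */
}

/* Parse text such as "1000.45".  Returns 0 on success, or -1 if the
   text is empty or has anything left over after the number. */
int parse_number(const char *text, double *out)
{
    char *end;

    assert(text != NULL && out != NULL);
    double value = strtod(text, &end);
    if (end == text || *end != '\0')
        return -1;
    *out = value;
    return 0;
}
```

With this in place, input like "1 000,45" is rejected instead of being
silently read as 1, whatever $LANG happens to be.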


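The last two points in the list can be sketched together.  ENABLE_NLS
is an assumption on my part (it is the name autoconf-based projects
commonly use for this switch), and the axis names are just example
data.

```c
#include <assert.h>
#include <stddef.h>

/* ENABLE_NLS is an assumed build flag.  Without it, the underscore
   macro collapses to a no-op and gettext is not needed at all. */
#ifdef ENABLE_NLS
#include <libintl.h>
#define _(String) gettext(String)
#else
#define _(String) (String)
#endif

/* N_ only marks the string so xgettext can find it.  It never makes
   a function call, so it is safe inside a constant initializer. */
#define N_(String) String

static const char *const axis_names[] = {
    N_("X axis"),
    N_("Y axis"),
    N_("Z axis"),
};

#define AXIS_COUNT (sizeof axis_names / sizeof axis_names[0])

/* The real lookup happens here, when the string is actually used. */
const char *axis_label(size_t index)
{
    assert(index < AXIS_COUNT);
    return _(axis_names[index]);
}
```

For this to work, xgettext has to be told about both markers, for
example with --keyword=_ and --keyword=N_ on its command line.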
I hope this helps,
John.



