Other neat stuff

From Goireasan Akerbeltz

Here are some other neat tools and scripts that might come in handy for people working on l10n tools in under-resourced languages.

Alphabetical words

These are words whose letters appear in alphabetical order (like almost in English). You need a text file with one word per line; then run the following command:

$ while read -r x; do echo "$x $(echo "$x" | sed 's/./\n&/g' | sort -u | tr -d '\n')"; done < FILENAME.txt | grep -E '^(.+) \1$'

Because sort -u drops repeated letters, this only finds words with no doubled letters. If you want to allow double letters such as nn (as in beefily), change | sort -u | to | sort |. If there's a lot of output, paste it into a spreadsheet such as LibreOffice Calc or Microsoft Excel and sort it by word length.
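The same idea can be wrapped in two small shell functions, which also lets you sort by length without a spreadsheet. This is only a sketch: the names alphabetical_words and sort_by_length are made up for illustration, grep -o is assumed to be available (GNU and BSD grep have it), and awk's length may count bytes rather than characters depending on the awk and locale in use.

```shell
# Hypothetical helper: read one word per line on stdin and print only the
# words whose letters are already in sorted order.
# Pass -u as the first argument to reject doubled letters (strict order).
alphabetical_words() {
    while IFS= read -r word; do
        [ -n "$word" ] || continue
        # Split into one character per line, sort, and glue back together.
        sorted=$(printf '%s\n' "$word" | grep -o . | sort ${1:-} | tr -d '\n')
        [ "$word" = "$sorted" ] && printf '%s\n' "$word"
    done
    return 0
}

# Hypothetical helper: order lines on stdin from shortest to longest.
sort_by_length() {
    awk '{ print length($0), $0 }' | sort -s -k1,1n | cut -d' ' -f2-
}
```

For example, alphabetical_words -u < FILENAME.txt | sort_by_length lists the strict matches from shortest to longest.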

Clear translations from a po file

If you have a po file with bad translations and can't get a clean pot file, try this:

$ cat old.po | LC_ALL=C sed '1,/^$/!{/^msgstr /,/^$/{/^msgstr /s/.*/msgstr ""/; /^msgstr /!{/./d}}}' | LC_ALL=C sed '${/msgstr\[/s/.*/&\n/}' | LC_ALL=C sed '/^msgstr\[/,/^$/{/./d;/^$/{s/^/msgstr[0] ""\nmsgstr[1] ""\n/}}' | msgattrib --no-obsolete --clear-fuzzy --clear-previous > new.po

Alternatively, msgen builds an English catalog: it gives every untranslated entry a msgstr identical to its msgid, rather than emptying the translations:

$ msgen old.po > new.po
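To check whether a po file still carries translations after cleaning, one rough approach is to count the msgstr lines that still contain text. The function below is a sketch, not part of gettext: the name count_translated is invented here, and it assumes the file starts with the usual header entry followed by a blank line, which it skips.

```shell
# Hypothetical check: print how many msgstr lines (plural forms and
# continuation lines included) still hold non-empty text in the po file $1.
count_translated() {
    awk '
        /^$/              { in_str = 0; header_done = 1; next }
        !header_done      { next }          # skip the header entry
        /^msgstr/         { in_str = 1 }
        /^msgid|^msgctxt/ { in_str = 0 }
        # A quote followed by anything but a closing quote means real text.
        in_str && /"[^"]/ { n++; in_str = 0 }
        END               { print n + 0 }
    ' "$1"
}
```

Running count_translated new.po should print 0 for a fully cleared file; obsolete entries (lines starting with #~) are not counted.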

Converting a .lang file to .pot

If the file is UTF-16 encoded and has DOS-style line endings, try:

$ cat English.lang | iconv -f utf16 -t utf8 | tr -d "\015" | sed 's/["\\]/\\&/g' | sed 's/^\([^=]*\)=\(.*\)$/msgctxt "\1"\nmsgid "\2"\nmsgstr ""\n/' > English.pot
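To see what the conversion does to a single entry, the text-processing steps can be put into a function and fed a sample line. This is a sketch under some assumptions: the name lang_to_pot and the sample key menu.open are invented for illustration, the input is already UTF-8 (so the iconv step is left out), and \n in the sed replacement relies on GNU sed.

```shell
# Hypothetical wrapper around the steps above: read key=value lines (UTF-8)
# on stdin and print one msgctxt/msgid/msgstr entry per line.
lang_to_pot() {
    tr -d '\r' |                      # drop DOS carriage returns
        sed 's/["\\]/\\&/g' |         # escape quotes and backslashes
        sed 's/^\([^=]*\)=\(.*\)$/msgctxt "\1"\nmsgid "\2"\nmsgstr ""\n/'
}

# Sample run: the key becomes the context, the value becomes the msgid.
printf 'menu.open=Open "file"\r\n' | lang_to_pot
```

The sample prints msgctxt "menu.open", then msgid "Open \"file\"", then msgstr "" and a blank line. The key is split off at the first = sign, so values may themselves contain = characters.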

Releasing all .ts files in a directory to .qm

$ lrelease *.ts
