Other neat stuff

O Goireasan Akerbeltz
Jump to navigation Jump to search

Here are some other neat tools and scripts that might come in handy for people working on l10n tools in under-resourced languages.

Alphabetical words

As in, words in which all letters are in the order of the alphabet (like beefily in English). You need a text file with one word on each line, then run the following command:

$ cat FILENAME.txt | while read x; do echo $x `echo $x | sed 's/./\n&/g' | sort -u | tr -d "\n"`; done | egrep '^(.+) \1$'

If you want to allow double letters such as nn, chance | sort -u | to | sort |. If there's a lot of output, paste it into a spreadsheet like LibreOffice Calc or Microsoft Excel and use a sort-by-length function.

Clear translations from a po file

$ cat old.po | LC_ALL=C sed '1,/^$/!{/^msgstr /,/^$/{/^msgstr /s/.*/msgstr ""/; /^msgstr /!{/./d}}}' | LC_ALL=C sed '${/msgstr\[/s/.*/&\n/}' | LC_ALL=C sed '/^msgstr\[/,/^$/{/./d;/^$/{s/^/msgstr[0] ""\nmsgstr[1] ""\n/}}' | msgattrib --no-obsolete --clear-fuzzy --clear-previous > new.po

l10n for Humans
Basics - Projects - Gear - Terminology - Other neat stuff