.src and .ui gettext migration

So, I'm pretty happy with my latest iteration of migrating to gettext
and I've got it up and running on all supported platforms.

Backstory is
https://lists.freedesktop.org/archives/libreoffice/2017-May/077818.html
with the modification that we retain a msgctxt, albeit a shorter one
than the current pseudo msgctxt that is generated for the current .po
files.

To try and minimize disruption I've a script available in master as
solenv/bin/update-for-gettext to update our current translations to
give them a new msgctxt (and update their keyid comment)

I've successfully run this over the sample tarball of .pos extracted
from pootle that cloph provided. i.e.
python2 /path/solenv/bin/update-for-gettext translations/libo_ui
It shrinks and normalizes the msgctxt and updates the keyid comment for
.src and .ui strings and moves them into a per-module messages.po.

For .src strings the msgctxt typically becomes the 2nd line of the
current msgctxt, i.e. the #define of the string or stringarray

For .ui strings the msgctxt typically becomes uifilename|widgetname

(caolanm->cloph: will this script suffice for getting pootle updated ?)

In our code I've autogenerated matching initial contexts for the
matching .ui and .hrc source strings.

(newly added strings can have arbitrary meaningful context strings,
e.g. NC_("papersize", "A4"); it's just the existing ones that will have
ones mechanically generated from the existing metadata available)

Are there any outstanding concerns or questions?

Hi Caolán, *,

[snip]
In our code I've autogenerated matching initial contexts for the
matching .ui and .hrc source strings.

Could you clarify what you mean by that?

Best regards,
Mihkel
Estonian team

I'll take a pair of real examples:

Currently a .hrc/.src entry of SV_APP_CPUTHREADS

//in vcl/inc/svids.hrc
#define SV_APP_CPUTHREADS                           10800

//in vcl/source/src/app.src
String SV_APP_CPUTHREADS
{
    Text [en-US] = "CPU threads: ";
};

corresponding .po entry where the msgctxt is currently autogenerated on
extraction from .src to .po

msgctxt ""
"app.src\n"
"SV_APP_CPUTHREADS\n"
"string.text"
msgid "CPU threads: "
msgstr "something or other"

turns into

//in our source where the 1st arg is the context and all existing
//entries contexts are simply derived from the define name
#define SV_APP_CPUTHREADS NC_("SV_APP_CPUTHREADS", "CPU threads: ")

corresponding .po entry
msgctxt "SV_APP_CPUTHREADS"
msgid "CPU threads: "
msgstr "something or other"

and the msgctxt is no longer generated on export to .po just taken from
the source entry

The other case is .ui entries, they go from e.g.

//vcl/uiconfig/ui/printdialog.ui
<property name="label" translatable="yes">Options</property>

with a .po entry of

msgctxt ""
"printdialog.ui\n"
"label21\n"
"label\n"
"string.text"
msgid "Options"
msgstr "Optionen"

where the msgctxt is similarly autogenerated on export to .po

to...

//vcl/uiconfig/ui/printdialog.ui
<property name="label" translatable="yes"
context="printdialog|label21">Options</property>

where the context/msgctxt for existing strings has been one-time
derived from the .ui filename and the widget/object the string belongs to

with a corresponding .po entry of

msgctxt "printdialog|label21"
msgid "Options"
msgstr "Optionen"
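
The two msgctxt rewrites in these examples can be sketched roughly as
follows. This is only an illustration in Python, not the code of
solenv/bin/update-for-gettext: the function name shorten_msgctxt is made
up, and it only covers the two "typical" shapes shown above (a .src
context whose 2nd line is the define, and a .ui context whose 1st and
2nd lines are the filename and the widget name).

```python
def shorten_msgctxt(old_msgctxt):
    """Derive a short msgctxt from an old autogenerated multi-line one.

    Hypothetical helper covering only the two typical cases from the
    examples: .src strings and .ui strings.
    """
    parts = old_msgctxt.split("\n")
    if len(parts) < 2:
        # Not one of the autogenerated multi-line contexts, leave it alone.
        return old_msgctxt
    filename = parts[0]
    if filename.endswith(".ui"):
        # .ui case: uifilename|widgetname, with the ".ui" extension dropped.
        return filename[:-len(".ui")] + "|" + parts[1]
    # .src case: the 2nd line, i.e. the #define of the string.
    return parts[1]


old_src = "app.src\nSV_APP_CPUTHREADS\nstring.text"
old_ui = "printdialog.ui\nlabel21\nlabel\nstring.text"
print(shorten_msgctxt(old_src))
print(shorten_msgctxt(old_ui))
```

Anything that does not fit these two shapes would need its own
handling, which this sketch does not attempt.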

Hi Caolán,

So, I'm pretty happy with my latest iteration of migrating to gettext
and I've got it up and running on all supported platforms.

First: yay \ó/

Are there any outstanding concerns or questions?

Questions:
* where does the new implementation live that determines the actual
  "resource" to be used for the current UI language, including possible
  language fallbacks?
* does std::locale that replaces ResMgr and Translate::Create() that
  uses boost::locale::generator with the new
  LanguageTag::getGlibcLocaleString() fully handle BCP47?

Thanks
  Eike

Questions:
* where does the new implementation live that determines the actual
  "resource" to be used for the current UI language, including
possible language fallbacks?

We pass GetUILanguageTag to boost, and that's always one of the
languages we translate to, as opposed to GetLanguageTag which could be
basically anything. So if someone's desktop locale is, say "de_AT",
then boost is just going to get "de_DE" from LibreOffice as the UI
language. boost itself in boost/libs/locale/src/shared/message.cpp (or
somewhere like that) will then try de_DE/module.mo and fall back to
de/module.mo
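
As a rough illustration of that lookup order (a Python sketch, not
boost's actual implementation; the function name candidate_catalogs is
made up, and encodings and @variants are ignored):

```python
def candidate_catalogs(ui_locale, module):
    """List the catalogs to try for a locale, most specific first.

    Hypothetical helper: only handles plain "language_COUNTRY" or
    "language" names, in the simplified path form used in this mail.
    """
    candidates = [ui_locale]
    language = ui_locale.split("_", 1)[0]
    if language != ui_locale:
        # Fall back from e.g. "de_DE" to the bare language "de".
        candidates.append(language)
    return ["%s/%s.mo" % (loc, module) for loc in candidates]


print(candidate_catalogs("de_DE", "module"))
```

The first catalog in the list that exists would be the one used.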

* does std::locale that replaces ResMgr and Translate::Create() that
  uses boost::locale::generator with the new
  LanguageTag::getGlibcLocaleString() fully handle BCP47?

No, boost::gettext takes a posix locale string as its argument when it
builds a std::locale to use as input to the translate methods, see
http://www.boost.org/doc/libs/1_48_0/libs/locale/doc/html/rationale.html#why_posix_names
which is a bit sucky. But I updated liblangtag to handle the only one of
the locales we translate to (ca-valencia) which it didn't already know
how to map to a posix/glibc locale, so we should have a valid
posix/glibc locale string for each of the bcp-47 language tags that
identify a UI translation target
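
To make the kind of mapping involved concrete, here is a small Python
sketch. It is not what liblangtag or LanguageTag::getGlibcLocaleString()
actually do: the function name to_posix_locale and the SPECIAL_CASES
table are made up, and the ca_ES@valencia value is an assumption based
on the glibc locale of that name.

```python
# Tags that cannot be converted mechanically. The value for ca-valencia
# is an assumption taken from the glibc locale name ca_ES@valencia.
SPECIAL_CASES = {"ca-valencia": "ca_ES@valencia"}


def to_posix_locale(tag):
    """Map a simple BCP 47 tag to a posix/glibc style locale name.

    Hypothetical helper: handles "language" and "language-COUNTRY"
    plus the special cases above, and refuses everything else.
    """
    if tag in SPECIAL_CASES:
        return SPECIAL_CASES[tag]
    subtags = tag.split("-")
    if len(subtags) == 1:
        return subtags[0]
    if len(subtags) == 2 and len(subtags[1]) == 2:
        # language-COUNTRY becomes language_COUNTRY
        return subtags[0] + "_" + subtags[1].upper()
    raise ValueError("no known posix mapping for %r" % tag)


print(to_posix_locale("de-DE"))
print(to_posix_locale("ca-valencia"))
```

Script subtags and other variants are deliberately left out; the sketch
raises for them rather than guessing.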

Caolan,

you should test this really extensively, with the Pootle content as well as
with the non-Pootle l10n teams.

I can help you use the Slovenian translation, as I use a separate
localization platform.
Thus we could test if this change leads to translation corruption or loss
of translated content.

Lp, m.

Caolan,

you should test this really extensively, with the Pootle content as well as
with the non-Pootle l10n teams.

Hi,
changing the msgctxt is problematic from Pootle point of view since:

- Pootle will detect these as new strings that will be imported with
no translation, and
- Pootle will mark old strings as obsolete.

Please hold, and work out a plan to push these strings to Pootle in
order to avoid losing translations. How many strings will be affected
by this change?

Thanks

The original email has...
"
To try and minimize disruption I've a script available in master as
solenv/bin/update-for-gettext to update our current translations to
give them a new msgctxt (and update their keyid comment)
...
I've ... run this over the sample tarball of .pos extracted
from pootle that cloph provided. i.e.
python2 /path/solenv/bin/update-for-gettext translations/libo_ui
It shrinks and normalizes the msgctxt and updates the keyid comment for
.src and .ui strings and moves them into a per-module messages.po.

caolanm->cloph: will this script suffice for getting pootle updated ?
"

so that's what I'm hoping to do here: update the existing translations
in pootle that have the old autogenerated msgctxt to have the new
"static" msgctxt so that (most) translations are not considered
obsolete
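
A minimal Python sketch of that idea, assuming a translation unit is
identified by its (msgctxt, msgid) pair as in gettext. The rekey
function and the dict-based catalog are made up for illustration, this
is not how the script or pootle store things:

```python
def rekey(translations, ctxt_map):
    """Return a copy of translations with old msgctxts replaced.

    translations maps (msgctxt, msgid) to msgstr; ctxt_map maps an old
    msgctxt to its new one. Units without a mapping are kept unchanged.
    """
    updated = {}
    for (msgctxt, msgid), msgstr in translations.items():
        new_ctxt = ctxt_map.get(msgctxt, msgctxt)
        updated[(new_ctxt, msgid)] = msgstr
    return updated


old_ctxt = "printdialog.ui\nlabel21\nlabel\nstring.text"
old = {(old_ctxt, "Options"): "Optionen"}
ctxt_map = {old_ctxt: "printdialog|label21"}

new = rekey(old, ctxt_map)
print(new[("printdialog|label21", "Options")])
```

Because the existing msgstr travels with the unit to its new key, a
later import of the source strings with the new msgctxt finds a
translated match instead of an empty new string.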