Question about translating language names

I was translating various language names used in LO and have a question about the English strings, specifically the names involving Congo. Being Chinese, I know nothing about the languages spoken in Africa, so I'm writing to the list in the hope that someone more knowledgeable can answer.

For the language names shown in LO, the parentheses usually [1] denote the country/region where the language is spoken. In
https://translations.documentfoundation.org/translate/libo_ui-master/svtoolsmessages/zh_Hans/?checksum=c7f58abceaa60b71
the Democratic Republic of the Congo, also known as Congo-Kinshasa, is noted as such.

However there is also
https://translations.documentfoundation.org/translate/libo_ui-master/svtoolsmessages/zh_Hans/?checksum=4abc603904d1d7fa
where the country notation is only "Congo". Does this mean the language is spoken in the Republic of the Congo (a.k.a. Congo-Brazzaville)? If so, it would probably be better to spell that out explicitly, as "Congo" is ambiguous.

There are also four other strings with "(Congo)" notations:
https://translations.documentfoundation.org/translate/libo_ui-master/svtoolsmessages/zh_Hans/?q=source%3A"(Congo)"&offset=2
which also need clarification.

Thanks in advance,
Ming

1. One exception I am familiar with is Chinese (simplified) and Chinese (traditional). They should probably be "Chinese, simplified" and "Chinese, traditional" if the same notation style were applied as for the other language strings. I fully understand how they came to be the way they are now and am not by any means proposing changes.

Hi Ming,

In fact, the "(Congo)" in parentheses is part of the language name and not
only the name of the country. Look here for Kituba:
https://glottolog.org/resource/languoid/id/kitu1245
But what is strange is that Aka has no area attached to it:
https://glottolog.org/resource/languoid/id/akaa1242
https://iso639-3.sil.org/code/soh
I wonder where it comes from; I'll ask Eike.

Cheers
Sophie

Hi Sophie,

------------------ Original ------------------
Send time: Friday, May 29, 2020 5:54 PM

> In fact, this is part of the language name and not only the name of the
> country. Look here for Kituba:
> https://glottolog.org/resource/languoid/id/kitu1245

I respectfully disagree. These are just names that linguists use to
distinguish dialects/variants of the same language. Linguists may like to
call them different languages, just as they differentiate all the dialects
of English:
https://glottolog.org/resource/languoid/id/stan1293
But to a layperson, they are just the same language, spoken with slight
differences in different countries.

Also note that although this website uses names like "Indian English", LO
uses "English (India)", in the style I described.

> But what is strange is that Aka has no area attached to it
> https://glottolog.org/resource/languoid/id/akaa1242
> https://iso639-3.sil.org/code/soh

In my original link
https://translations.documentfoundation.org/translate/libo_ui-master/svtoolsmessages/zh_Hans/?q=source%3A"(Congo)"&offset=2
you can see three other languages as well (click the "Next" button):
Koongo, Njyem, and Yombe.

> I wonder where it comes from, I'll ask Eike

Eike is definitely the expert on these locale issues. It would be great to
hear from him.

Regards,
Ming

Hi Sophie,

------------------ Original ------------------
From: <sophi@libreoffice.org>;
Send time: Friday, May 29, 2020 5:54 PM
Subject: Re: [libreoffice-l10n] Question about translating language names

> > In fact, this is part of the language name and not only the name of the
> > country. Look here for Kituba:
> > https://glottolog.org/resource/languoid/id/kitu1245
>
> I respectfully disagree. They are just names linguists use to distinguish
> dialects/variants of the same language. Linguists may like to call them
> different languages like they differentiate all those dialects of English:
> https://glottolog.org/resource/languoid/id/stan1293
> But to an ordinary layman, they are just the same language, spoken with
> slight differences in different countries.

I agree, but what we usually take is the name as given for the ISO code:
https://iso639-3.sil.org/code/mkw

> And note that although this website uses names like "Indian English", LO
> uses "English (India)" in the style I described.

There is no ISO code for Indian English, so I guess it's just an
indication in this case.

> > But what is strange is that Aka has no area attached to it
> > https://glottolog.org/resource/languoid/id/akaa1242
> > https://iso639-3.sil.org/code/soh
>
> In my original link
> https://translations.documentfoundation.org/translate/libo_ui-master/svtoolsmessages/zh_Hans/?q=source%3A"(Congo)"&offset=2
> you can see three other languages as well (click the "Next" button):
> Koongo, Njyem, and Yombe.
>
> > I wonder where it comes from, I'll ask Eike
>
> Eike is definitely the expert on these locale issues. It would be great to
> hear from him.

Yes, so let's see what he'll say :)
Cheers
Sophie

Hi,

> However there is also
> https://translations.documentfoundation.org/translate/libo_ui-master/svtoolsmessages/zh_Hans/?checksum=4abc603904d1d7fa
> where the country notation is only "Congo". Does this mean it's spoken in Republic of the Congo (a.k.a. Congo-Brazzaville)? If yes, it's probably better to be explicitly spelled out, as "Congo" is ambiguous.

Yes, "(Congo)" is the Republic of the Congo, Congo-Brazzaville.
IIRC those languages/locales were simply supported earlier, and the
"(Democratic Republic of the Congo)" locales were added only later.

> But what is strange is that Aka has no area attached to it
> https://glottolog.org/resource/languoid/id/akaa1242
> https://iso639-3.sil.org/code/soh
> I wonder where it comes from, I'll ask Eike

Geez.. that's a mess. We currently have

LANGUAGE_USER_AKA 0x0677 axk-CF - no entry in language list
LANGUAGE_USER_AKA_CONGO 0x8277 axk-CG "Aka (Congo)"

But the 'axk' ISO 639 code actually is Yaka, not Aka,
https://iso639-3.sil.org/code/axk

'soh' Aka seems to have only 300 native speakers according to
https://en.wikipedia.org/wiki/Sillok_language and is spoken in Sudan.
That is not the language we meant to introduce here, and we also have no
entries for 'soh'.

I'll adjust that to "Yaka (Congo)" and also add axk-CF to the language
list with "Yaka (Central African Republic)".

Thanks for pointing that out.

  Eike
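
As an aside, the two IDs listed above differ only in their upper bits. Assuming they follow the Windows LANGID layout (low 10 bits for the primary language, high 6 bits for the sublanguage), which the thread itself does not state, a minimal sketch of the decomposition could look like this. The helper name `split_langid` is hypothetical.

```python
# Assumption (not stated in the thread): the IDs use the Windows LANGID
# layout, i.e. bits 0-9 hold the primary language, bits 10-15 the sublanguage.
PRIMARY_MASK = 0x03FF
SUB_SHIFT = 10


def split_langid(langid: int) -> tuple[int, int]:
    """Return (primary, sub) for a 16-bit language ID."""
    return langid & PRIMARY_MASK, langid >> SUB_SHIFT


# IDs and tags exactly as quoted in the message above.
ENTRIES = {
    "axk-CF": 0x0677,  # LANGUAGE_USER_AKA
    "axk-CG": 0x8277,  # LANGUAGE_USER_AKA_CONGO
}

for tag, langid in ENTRIES.items():
    primary, sub = split_langid(langid)
    print(f"{tag}: primary=0x{primary:03X} sub=0x{sub:02X}")
```

Under that assumption both entries share the same primary value and differ only in the sublanguage part, which would fit both being tagged 'axk'.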

Hi,

> LANGUAGE_USER_AKA 0x0677 axk-CF - no entry in language list
> LANGUAGE_USER_AKA_CONGO 0x8277 axk-CG "Aka (Congo)"
>
> But the 'axk' ISO 639 code actually is Yaka, not Aka,
> https://iso639-3.sil.org/code/axk

And to add to the confusion, there's also

LANGUAGE_USER_YAKA 0x0683 iyx-CG "Yaka"

https://iso639-3.sil.org/code/iyx

> I'll adjust that to "Yaka (Congo)" and also add axk-CF to the language
> list with "Yaka (Central African Republic)".

... which somewhat complicates things; we probably want

LANGUAGE_USER_YAKA_CAR 0x0677 axk-CF "Yaka (Central African Republic)"
LANGUAGE_USER_YAKA_CONGO 0x0683 iyx-CG "Yaka (Congo)"

and remap existing document content from 0x8277/axk-CG to 0x0683/iyx-CG.

I'll try to come up with something.

  Eike
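
The remap proposed in this message could be sketched roughly as below. The IDs, tags and names are the ones quoted above; the table layout and the function names `remap_langid` and `describe` are hypothetical and only illustrate the idea. The thread does not show how LibreOffice would implement such a remap, and the proposal is reconsidered later in the thread.

```python
# Hypothetical illustration of the proposed remap; not LibreOffice code.
# Proposed entries: ID -> (language tag, display name), as quoted above.
PROPOSED_LOCALES = {
    0x0677: ("axk-CF", "Yaka (Central African Republic)"),
    0x0683: ("iyx-CG", "Yaka (Congo)"),
}

# Retired ID that existing documents may still contain -> replacement ID.
REMAP = {0x8277: 0x0683}


def remap_langid(langid: int) -> int:
    """Return the replacement for a retired ID, or the ID unchanged."""
    return REMAP.get(langid, langid)


def describe(langid: int) -> str:
    """Show which proposed entry a stored ID would resolve to."""
    tag, name = PROPOSED_LOCALES[remap_langid(langid)]
    return f"0x{langid:04X} -> {tag} {name!r}"


print(describe(0x8277))
print(describe(0x0677))
```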

Hi,

> https://iso639-3.sil.org/code/axk
> https://iso639-3.sil.org/code/iyx

Oh great, there's even Yaka (Democratic Republic of Congo) with yet
another ISO code 'yaf'
https://iso639-3.sil.org/code/yaf

Fun..

  Eike

Hi Eike,

Thanks for the reply.

------------------ Original ------------------

Hi,

(Jeremy Brown as the original contributor of Congolese locales on Cc)

> LANGUAGE_USER_AKA 0x0677 axk-CF - no entry in language list
> LANGUAGE_USER_AKA_CONGO 0x8277 axk-CG "Aka (Congo)"
>
> But the 'axk' ISO 639 code actually is Yaka, not Aka,
> https://iso639-3.sil.org/code/axk

> LANGUAGE_USER_YAKA 0x0683 iyx-CG "Yaka"
> https://iso639-3.sil.org/code/iyx
>
> > I'll adjust that to "Yaka (Congo)" and also add axk-CF to the language
> > list with "Yaka (Central African Republic)".
>
> ... which somewhat complicates things, we probably want
>
> LANGUAGE_USER_YAKA_CAR 0x0677 axk-CF "Yaka (Central African Republic)"
> LANGUAGE_USER_YAKA_CONGO 0x0683 iyx-CG "Yaka (Congo)"
>
> and remap existing document content from 0x8277/axk-CG to 0x0683/iyx-CG.

To make matters worse, a whole bunch of CG Congo locales was added with

    commit c69221a2f12331cadee4dbed50de30bf8aa230b0

        Add/modify locales & language list entries for Congolese languages

    https://git.libreoffice.org/core/+/c69221a2f12331cadee4dbed50de30bf8aa230b0

which even (now IMHO wrongly) came with axk-CG locale data and (correct)
iyx-CG locale data. Apparently I tried to disentangle some of the mess
already in

    commit 66308dd049b98476127265e9cc9ac32f19dfccaf

        changes to Congolese locales

    https://git.libreoffice.org/core/+/66308dd049b98476127265e9cc9ac32f19dfccaf

introducing LANGUAGE_USER_AKA_CONGO instead of plain LANGUAGE_USER_AKA
(which would mainly be Central African Republic). Aka actually is
a synonym of Yaka in this context, not the other Aka language. Still,
I now wonder whether there was a reasonable intention behind having both
axk-CG and iyx-CG or whether that was an oversight.

Any clue, someone? In particular Jeremy?

Thanks
  Eike

Hi,

> Hi,
>
> https://iso639-3.sil.org/code/axk
>
> https://iso639-3.sil.org/code/iyx
>
> Oh great, there's even Yaka (Democratic Republic of Congo) with yet
> another ISO code 'yaf'
> https://iso639-3.sil.org/code/yaf
>
> Fun..

Thanks a lot for chasing that, Eike
Cheers
Sophie

Hi,

> (Jeremy Brown as the original contributor of Congolese locales on Cc)

Unfortunately no answer from Jeremy, but having dug deeper into this, the
current situation makes sense.

> > LANGUAGE_USER_AKA 0x0677 axk-CF - no entry in language list
> > LANGUAGE_USER_AKA_CONGO 0x8277 axk-CG "Aka (Congo)"

The confusing part here is the name Aka. Aka is the autonym (what speakers
call their language themselves) for Yaka, the ISO 639-3 name.
This is also the language described at https://en.wikipedia.org/wiki/Aka_language
and it is spoken in both the Central African Republic and Congo.

> > But the 'axk' ISO 639 code actually is Yaka, not Aka,
> > https://iso639-3.sil.org/code/axk
>
> LANGUAGE_USER_YAKA 0x0683 iyx-CG "Yaka"
> https://iso639-3.sil.org/code/iyx

That entry exists in addition; it is not Yaka/Aka but a separate language, "Yaka (Congo)".

So I will not remap axk-CG to iyx-CG as proposed earlier. Instead I'll keep
the locales, add a language list entry for
axk-CF "Aka (Central African Republic)"
and change iyx-CG to "Yaka (Congo)", so that we can also add
yaf-CD "Yaka (Democratic Republic of Congo)"
https://iso639-3.sil.org/code/yaf

  Eike
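
The outcome described in the message above can be summarized as a small lookup table. This is only a sketch: the dictionary and helper names are hypothetical, numeric IDs are left out because the thread gives none for yaf-CD, and keeping axk-CG as "Aka (Congo)" is an inference from "keep the locales". The check simply confirms that no two entries end up with the same display name.

```python
from collections import Counter

# Language tag -> display name, as read from the final message.
# Assumption: axk-CG keeps its existing name "Aka (Congo)".
LANGUAGE_LIST = {
    "axk-CF": "Aka (Central African Republic)",
    "axk-CG": "Aka (Congo)",
    "iyx-CG": "Yaka (Congo)",
    "yaf-CD": "Yaka (Democratic Republic of Congo)",
}


def duplicate_names(entries: dict[str, str]) -> list[str]:
    """Return display names used by more than one tag (ambiguous entries)."""
    counts = Counter(entries.values())
    return sorted(name for name, n in counts.items() if n > 1)


def iso_codes(entries: dict[str, str]) -> list[str]:
    """Return the distinct ISO 639-3 codes (the part before the hyphen)."""
    return sorted({tag.split("-")[0] for tag in entries})


print(duplicate_names(LANGUAGE_LIST))  # an empty list means no ambiguity
print(iso_codes(LANGUAGE_LIST))
```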