Workflow based on master

khaledhosny · December 14, 2014, 6:19am

So, what are translators supposed to translate then?

But that was not my point; I was complaining about people who think that
consistency, following linguistic rules and proper typography are useless
cosmetics. Regardless of how localisation will be done or what language is
the source, there always will be changes of this kind that localisations
need to follow; failure to do so would mean that we have different UX
between localised and unlocalised versions of LibreOffice, which is
unacceptable, but many people here seem to be advocating for that.

Regards,
Khaled

khaledhosny · December 14, 2014, 6:35am

I don't understand why people are fixated on the case changes! These are
not the only “cosmetic” changes people were complaining about. The changes
themselves were never my point, but the way people are complaining about
them is what I'm worried about, as it shows that some localisers seem to
have total disregard for things that are intrinsic to the quality of any
textual material, let alone localisation.

Regards,
Khaled

Yury_Tarasievich · December 14, 2014, 6:41am

...

But that was not my point, I was complaining about people who think that
consistency, following linguistic rules and proper typography are useless
cosmetics. Regardless of how localisation will be done or what language is

...

Then you completely misunderstood the point of localisers, and defeated a strawman.

It's commendable to strive for proper typography in the source etc.

But the translation may have had the proper typography (for its language context) first time, couldn't it?

However, localisers (why did you quote the term, anyway?) have to redo the work already (properly) done, repeatedly. Reviewing and approving 1k of strings isn't peanuts, whatever one may think.

To put it into a context assuredly familiar to you: how would you like to have to redo from scratch one specific curve in a font, verbatim, several times? Would it strike you as not quite an optimal way to spend the time you dedicate to open projects?

And I have yet to see those technical marvels we've been promised will compensate for this problem (promised with a lot of eff-ing at silly localisers, by the way).

Yury

toki · December 14, 2014, 7:31am

So, what are translators supposed to translate then?

Whatever is in that source. That language, with whatever spelling,
grammatical, semantic markup, and presentation markup it includes, is
what everybody uses.

If you want an en_US localization, then you have an en_US l10n team.
If you want an en_UK localization, then you have an en_UK l10n team.

It doesn't matter what language, much less what dialect, or even cant,
of that language is found in that source. The only people that see it
are people that read the source code, and people that translate from one
language into another language.

Regardless of how localisation will be done or what language is
the source there always will be changes of this kind that localisations
need to follow,

In a properly run project, those changes will only affect the
specific localization team. A change made by the en_UK team won't make a
scrap of difference to the en_US l10n team. Changes made by either
l10n team won't affect the af_ZA team, or the de_NA team, or the uk_UK
team.

failure to do so would mean that we have different UX between localised and unlocalised versions of LibreOffice,

That implies that LibO is using i18n and l10n tools and techniques that
were obsolete back in 1990.

which is unacceptable, but many people here seem to be advocating for that.

The unlocalized version is not used, much less shipped.

The only versions that are shipped are versions that have an l10n team
behind them.

jonathon


toki · December 14, 2014, 8:00am

Look at it this way.

There are 10,000 strings to translate.
At current rates, a professional translator will charge between
US$20,000 and US$250,000, depending upon the target language.

Convert 10,000 strings in the source to sentence case.
LibO is available in 111 languages, dialects, or other localizations.

That would cost a minimum of US$2,220,000 just to ensure that the
already correctly translated text is still correctly translated, if The
Document Foundation was being billed for it.
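
The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch of the estimate above: the price range and the locale count are the numbers quoted in this post, and it assumes (as the estimate does) that every locale pays the full per-locale price again for the re-review.

```python
# Figures quoted in the post above; the price range is the poster's
# estimate for 10,000 strings, not a verified quote.
LOCALES = 111
COST_PER_LOCALE_LOW = 20_000     # US$, cheapest target language
COST_PER_LOCALE_HIGH = 250_000   # US$, most expensive target language


def review_cost(locales: int, cost_per_locale: int) -> int:
    """Total bill if every locale has the whole string set reviewed again."""
    return locales * cost_per_locale


minimum = review_cost(LOCALES, COST_PER_LOCALE_LOW)
maximum = review_cost(LOCALES, COST_PER_LOCALE_HIGH)
print(f"minimum: US${minimum:,}")   # minimum: US$2,220,000
print(f"maximum: US${maximum:,}")   # maximum: US$27,750,000
```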

I don't know if any l10n projects still hire and pay professional
translators. I do know that without professional translators, at least a
dozen l10n projects would never have had their first release.

jonathon

sophi · December 14, 2014, 8:07am

Hi Yury, all

...

But that was not my point, I was complaining about people who think that
consistency, following linguistic rules and proper typography are useless
cosmetics. Regardless of how localisation will be done or what language is

...

Then you completely misunderstood the point of localisers, and defeated
a strawman.

It's commendable to strive for proper typography in the source etc.

But the translation may have had the proper typography (for its language
context) first time, couldn't it?

However, localisers (why did you quote the term, anyway?) have to redo
the work already (properly) done, repeatedly. Reviewing and approving 1k
of strings isn't peanuts, whatever one may think.

To put it into a context assuredly familiar to you: how would you like to
have to redo from scratch one specific curve in a font, verbatim,
several times? Would it strike you as not quite an optimal way to
spend the time you dedicate to open projects?

And I have yet to see those technical marvels we've been promised will
compensate for this problem (promised with a lot of eff-ing at silly
localisers, by the way).

Hey, those scripts are written by people to help us, so don't shout at
them. We will discuss these heavy changes with developers and designers
after the code/string freeze. We have to find a way to maintain the
sources without impacting the targets when it's not needed, so let's
try to find a solution and keep working in a good mood.

Cheers
Sophie

Yury_Tarasievich · December 14, 2014, 8:24am

...

And I have yet to see those technical marvels we've been promised will
compensate for this problem (promised with a lot of eff-ing at silly
localisers, by the way).

Hey, those scripts are written by people to help us, so don't shout at
them. We will discuss these heavy changes with developers and designers
after the code/string freeze. We have to find a way to maintain the
sources without impacting the targets when it's not needed, so let's
try to find a solution and keep working in a good mood.

But I do not shout at anybody. Do I?..

It's just that I'll believe there are positive changes in this part of the workflow when I see an announcement along the lines of (hyperbolised): "We've corrected the fixed-space use in 10k strings, but don't worry, your translations won't be kicked out of the release if you don't redo your 10k translations by tomorrow".

Right now, I'd risk stating that for small teams the task is beginning to look like more trouble than it's worth.

Yury

Rimas_Kudelis · December 14, 2014, 8:59am

Hi,

2014.12.14 02:43, jonathon wrote:

Changes made in one dialect of one language should neither affect, nor
effect changes in other dialects of the same language, much less other
languages.

Huh? English (US) is the “source” language,

Treating en_US, en_DE, en_UK, or any variant thereof as the "source"
language is, at best, translation mismanagement.

No. Translation mismanagement was propagating these cosmetic changes to
Pootle without having automatically updated localized Gettext files.

The more fundamental error is assuming that what is in source is
consistently en_US, or any other en_* variant.

It should be. You can look at it the other way around: anything that
gets in the source should consistently be en_US, not just
whatever_lingo_the_developer_had_in_mind.

Rimas

Yury_Tarasievich · December 14, 2014, 9:38am

2014.12.14 02:43, jonathon wrote:

...

The more fundamental error is assuming that what is in source is
consistently en_US, or any other en_* variant.

It should be. You can look at it the other way around: anything that
gets in the source should consistently be en_US, not just
whatever_lingo_the_developer_had_in_mind.

Are the "sources" mentioned by Jonathon and by Rimas one thing or two different things?

I would say the "source of translation" only needs to be semantically correct with regard to the functionality of the entity/activity it refers to. The en_US is just a translation in this regard, and activities regarding, say, its typographic tradition should be excluded from the lifecycle of non-en_US translations.

That won't change anything with respect to the Help pages (in which the text itself is the entity), but it definitely relates to UI strings, which are sort of a primary step in the localisation process.

Yury

khaledhosny · December 14, 2014, 9:47am

...

But that was not my point, I was complaining about people who think that
consistency, following linguistic rules and proper typography are useless
cosmetics. Regardless of how localisation will be done or what language
is

...

Then you completely misunderstood the point of localisers, and defeated a
strawman.

It's commendable to strive for proper typography in the source etc.

But the translation may have had the proper typography (for its language
context) first time, couldn't it?

However, localisers (why did you quote the term, anyway?) have to redo
the work already (properly) done, repeatedly. Reviewing and approving 1k of
strings isn't peanuts, whatever one may think.

To put it into context assuredly familiar to you

I have been localising software for much longer than I have been making
fonts (or even writing software), and I know that reviewing a few hundred
strings that were trivially changed is not the end of the world. Usually
the tool I'm using (be it Pootle or Virtaal) presents me with the
translation memory for the string, which shows the old source string and
highlights the differences from the current one, so it takes just a few
seconds to review, and one can review hundreds of strings this way in a
couple of hours. Believe me, I have done it countless times and I don't
understand all the whining.
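
The kind of difference highlighting described here can be approximated with Python's standard `difflib`. This is a minimal sketch, not how Pootle or Virtaal actually implement it; the function name `highlight_changes` and the `[-old-]{+new+}` marker style are assumptions chosen for illustration.

```python
import difflib


def highlight_changes(old: str, new: str) -> str:
    """Mark removed text as [-...-] and added text as {+...+}."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(old[i1:i2])
            continue
        if i2 > i1:  # text present only in the old source string
            parts.append(f"[-{old[i1:i2]}-]")
        if j2 > j1:  # text present only in the new source string
            parts.append(f"{{+{new[j1:j2]}+}}")
    return "".join(parts)


print(highlight_changes("Opening text documents", "Opening Text Documents"))
# prints: Opening [-t-]{+T+}ext [-d-]{+D+}ocuments
```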

Regards,
Khaled

igaidhlig · December 14, 2014, 9:53am

Khaled,

Khaled Hosny wrote the following on 14/12/2014 at 06:35:

but the way people are complaining about them is what I'm worried about, as it shows that some localisers seem to have total disregard for things that are intrinsic to the quality of any textual material, let alone localisation.

I resent that, I actually make a living off this and I'm very good at it. But perhaps it's just that a newsgroup as a medium can be somewhat deficient for human communication. Please let me therefore spell it out to you how I respect l10n and how l10n sometimes disrespects me (Sophie, keeping the good mood is hard - if it happens once, it's a mistake, no probs, don't do it again, let's move on. The second time, I scratch my head and ask 'weren't we here before'. The third time I feel like someone is taking the pish).

So... real examples...

en-US
You opened %d files > You closed %d files
My head: ok, content change, let's fix it, no worries

en-US
[new string]
My head: ok cool, let's localize

en-US
%Y-%m-%d > %Y-%M-%D
gd-GB
%d-%m-%Y > %D-%M-%Y
My head: ok, maybe there is more space or something, let's get it localized

en-US
Opening text documents > Opening Text Documents
gd-GB
A' fosgladh sgrìobhainnean teacsa > A' fosgladh sgrìobhainnean teacsa
My head: Stop messing me around. My locale has strict rules about caps (only proper nouns are capped), you're making unnecessary work for me.

en-US
Opening ~text documents > Opening _text documents
gd-GB
A' fosgladh ~sgrìobhainnean teacsa > A' fosgladh _sgrìobhainnean teacsa
My head: Maybe there is a technical need for this change but can't you automate that? You're making unnecessary work for me, this is not l10n work because as a localizer I have no control over what marks a hotkey. This is a developer problem.

en-US
The "Terms and Conditions" > The “Terms and Conditions”
gd-GB
Na "teirmichean is cumhaichean" > Na "teirmichean is cumhaichean"
My head: Stop messing me around. The choice of formatted vs unformatted " is locale-specific stuff and not governed by the en-US source. If a locale decides to go from unformatted to formatted, then that is decided by the locale (e.g. due to a change in linguistic practice in the country). This is not governed by what en-US does or does not do. Case in point, some locales use «» so the question of en-US going " > ” is even more pointless.

en-US
The Terms\n\n and Conditions > The Terms\n\n and Conditions
My head: This is one of the most useless ways of making layout anyway because the localizer has to *guess* how much space there is. To make it worse, you now make me sit through the developers playing with \n or \n\n. Would be nice if someone just wrapped that automatically or gave me some guidance as to the space available.

en-US
The "Trems adn Cnditions" > The "Terms and Conditions"
gd-GB
Na "teirmichean is cumhaichean" > Na "teirmichean is cumhaichean"
My head: Ok so the person who texted this string was drunk. Happens. But why do *I* have to look at this again? The en-US typo does not change the content of the source so my translation is ok as it is, it's just a source typo being fixed. Argh.

THAT'S what we're moaning about...

Michael

igaidhlig · December 14, 2014, 10:03am

Khaled,

Well, then your locale is lucky to have you and/or your team. Probably a language with several million speakers. Lucky you. Unfortunately many teams are tiny and rely on the same localizers, who usually handle multiple projects. And before you say grow your team, there's only so much growing you can do when fewer than 60,000 people speak your language, especially if you're passionate about quality.

So perhaps you don't care about someone wasting your time. That's dandy. But that doesn't apply to most of us who are strapped for time. Microsoft does that kind of thing, true enough. But they pay me.

Michael

Khaled Hosny wrote the following on 14/12/2014 at 09:47:

Yury_Tarasievich · December 14, 2014, 10:29am

I consider that haughty disdain somewhat misplaced. Anybody is free to allocate a couple of hours of their life as they please. Maybe to spend those on unpaid (quality) translation work.

However, it's not nice to treat "a couple of hours" of somebody else's life as a throwaway resource.

Nor is it the case that "localisers" (I see now your use of quotes wasn't accidental) are some low-level plebes, allowed to play at translation and meanwhile ride on the coattails of the sky-high (LibO) popularity. These people do hard work and create the product (or product enhancement, if you please).

I think that attitude originates in the widespread misconception that anybody, by virtue of speaking the language, is automatically an expert in all related issues, (technical) translation included. An unlimited pool of free labour, as it were.

Mind you, I do not witness this light-hearted attitude to unsolicited work in the software developer community, the present product team included, for a very good reason, and I at least understand this completely.

Yury

ohallot · December 14, 2014, 11:48am

Hi

It should be. You can look at it the other way around: anything that
gets in the source should consistently be en_US, not just
whatever_lingo_the_developer_had_in_mind.

Rimas

You're right and that is the way it has to be.

We face the issue that most LibreOffice developers are not native
English speakers, and even fewer hold a degree in English
literature. Mistakes and poor clarity creep into their strings
quite naturally and often. That is the way it is, and we have lived
with it since OpenOffice.org.

Reviewing en-US is a good thing once in a while, even if it gives us
more work, and I expect once fixed it will not change anymore.

As a side note, devs don't even write help pages to explain their new
features, and this doesn't help our translation job either.

Rimas_Kudelis · December 14, 2014, 12:19pm

hi,

2014.12.14 13:48, Olivier Hallot wrote:

It should be. You can look at it the other way around: anything that
gets in the source should consistently be en_US, not just
whatever_lingo_the_developer_had_in_mind.

Rimas

You're right and that is the way it has to be.

We face the issue that most LibreOffice developers are not native
English speakers, and even fewer hold a degree in English
literature. Mistakes and poor clarity creep into their strings
quite naturally and often. That is the way it is, and we have lived
with it since OpenOffice.org.

Which is why I'm advocating a string review process, if that is at all
possible. It's perfectly normal that some of us have trouble writing
concise and typographically correct English. But isn't it similar to
writing buggy code? We do have code reviews, where someone else with the
right set of knowledge reviews code patches. So why not have a similar
process for string changes? Why not ask somebody who maybe can't code,
but knows English (including typographical stuff) well to review the
strings that are being changed or newly introduced?

Now that I think of it, if we have UX reviews (and if we don't, we
should), strings might just fall into that category.

Reviewing en-US is a good thing once in a while, even if it gives us
more work, and I expect once fixed it will not change anymore.

Reviewing en-US is a good thing for sure. I believe however, that when
we have massive changes, which can be automatically transferred to
localized resources without degrading l10n quality, we should transfer
them automatically (maybe on a per locale opt-in or opt-out basis). What
has been happening lately (and is explained in Michael's examples) is
indeed a major mess-up (mismanagement, if you like). A few more cases like
this, and LibO will start losing localizers. This thread is a clear warning.

As a side note, devs don't even write help pages to explain their new
features, and this doesn't help our translation job either.

Devs don't have to write help pages. However, when others write these
help pages, they could be the ones raising a flag about string quality
issues.

Rimas

Tom_Davies1 · December 14, 2014, 1:05pm

Hi
My last "hare-brained" idea was blatantly flawed. Thanks to Yury (I think)
and someone else for shooting it down quickly before it went anywhere!
Sorry about that!

It sounds like there is scope for a lot of automation. There might already
be ways of doing it.

1. Can fuzzy strings be accepted "en masse", preferably by large-scale
selection? (I'm guessing there isn't at the moment)
2. Is there an "undo"?
3. Can individual strings "undo" to get single strings back to being fuzzy?

Also, is there a way of getting an extremely large selection of strings
grouped in some way so that people can see the whole group had 1 specific
change? Even fairly small groupings might help a bit!
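
One possible way to get such groupings, sketched outside of Pootle: compare each old and new source string and label the kind of change, then collect the pairs by label. Everything here is hypothetical (the function names, the labels, and the three rules); it only covers the case, accelerator and quote changes mentioned in this thread.

```python
from collections import defaultdict

# Typographic quotes mapped to their plain ASCII counterparts.
QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"',
                           "\u2018": "'", "\u2019": "'"})


def change_kind(old: str, new: str) -> str:
    """Label a source-string change with a coarse, hypothetical category."""
    if old == new:
        return "unchanged"
    if old.replace("~", "_") == new:
        return "accelerator marker"
    if old.translate(QUOTE_MAP) == new.translate(QUOTE_MAP):
        return "quote style"
    if old.casefold() == new.casefold():
        return "case only"
    return "other"


def group_by_change(pairs):
    """Collect (old, new) source pairs under the label of their change."""
    groups = defaultdict(list)
    for old, new in pairs:
        groups[change_kind(old, new)].append((old, new))
    return dict(groups)


pairs = [
    ("Opening text documents", "Opening Text Documents"),
    ("Opening ~text documents", "Opening _text documents"),
    ('The "Terms and Conditions"', "The \u201cTerms and Conditions\u201d"),
    ("You opened %d files", "You closed %d files"),
]
for kind, members in group_by_change(pairs).items():
    print(f"{kind}: {len(members)} string(s)")
```

Anything that lands in "other" would still need to be looked at one string at a time.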

Regards from
Tom

Yury_Tarasievich · December 15, 2014, 10:08am

The "fair" way of automating the solution to this problem would involve analysing the differences between the former and the new variants of the source. Only differences that go beyond the source grammar (!) and punctuation (including technical use, for macro vars and such) should ever be marked as requiring revision (fuzzy). The same goes for simple moves of strings from one part of the source set to another.

Technical use of punctuation should also be auto-corrected, to the point of extending the process to translations.

All this, however, doesn't have any grounding in the current technological setup of OOO localisation, as far as I know it.
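
To make the idea concrete, here is a rough sketch of such a check in Python. It is not part of any existing localisation tool: `skeleton` and `needs_review` are hypothetical names, the `%x` placeholder pattern is an assumption about what the macro variables look like, and "~" is treated as an accelerator marker. The approach reduces both source variants to their bare words and flags the string only if those differ.

```python
import re
import unicodedata

# Assumption: variables look like printf/strftime placeholders (%d, %Y, ...).
PLACEHOLDER = re.compile(r"(%[A-Za-z])")


def is_ignorable(ch: str) -> bool:
    """Punctuation of any kind, plus the "~" accelerator marker."""
    return ch == "~" or unicodedata.category(ch).startswith("P")


def skeleton(text: str) -> str:
    """Bare words of a string: no punctuation, folded case, single spaces."""
    parts = []
    for chunk in PLACEHOLDER.split(text):
        if PLACEHOLDER.fullmatch(chunk):
            parts.append(chunk)  # keep variables verbatim, case included
        else:
            kept = "".join(ch for ch in chunk if not is_ignorable(ch))
            parts.append(kept.casefold())
    return " ".join("".join(parts).split())


def needs_review(old_source: str, new_source: str) -> bool:
    """True only if the change goes beyond case, punctuation and spacing."""
    return skeleton(old_source) != skeleton(new_source)


print(needs_review("Opening text documents", "Opening Text Documents"))  # False
print(needs_review("You opened %d files", "You closed %d files"))        # True
```

A source typo fix would still be flagged by this sketch, since the words themselves change; telling such a fix apart from a real content change needs more than string comparison.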

Yury

It sounds like there is scope for a lot of automation. There might already
be ways of doing it.

...

Rimas_Kudelis · December 15, 2014, 8:44pm

Hi,

2014.12.14 15:04, Tom Davies wrote:

My last "hare-brained" idea was blatantly flawed. Thanks to Yury (I think)
and someone else for shooting it down quickly before it went anywhere!
Sorry about that!

It sounds like there is scope for a lot of automation. There might already
be ways of doing it.

1. Can fuzzy strings be accepted "en masse", preferably by large-scale
selection? (I'm guessing there isn't at the moment)

2. Is there an "undo"?

3. Can individual strings "undo" to get single strings back to being fuzzy?

Also, is there a way of getting an extremely large selection of strings
grouped in some way so that people can see the whole group had 1 specific
change? Even fairly small groupings might help a bit!

As far as I know, there is no Undo functionality in Pootle. Once you
submit a string, its old version is gone for good. Same goes for
grouping strings by changes: while this might be an interesting idea, I
don't think it is currently possible in Pootle.

And now to my main point (your first question)

As far as I know, Pootle operates on single strings, not on groups. But
even if it did operate on groups, it would be hard to tell whether or
not a particular fuzzy string should be approved without looking at it,
so it wouldn't really help that much. On top of that, it would still
waste someone's time.

I still believe that the right way to implement typography improvements
(and case improvements in most cases) is to transplant such changes to
localized files invisibly. Ideally, it would be done with prior notice
such as "please upload all your work by day X hour Y so we can base this
migration on your latest work and none of it is lost".

Same applies to accelerator character change (~ to _).

Even changes like adding a colon at the end of a string (about which people
also complained) should be ported to most locales automatically,
allowing locales to opt out if they want to do that manually (although I
guess locales should be allowed to opt out in all cases anyway).
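
As an illustration of what such an automatic port might look like, here is a small Python sketch. It is not an existing LibreOffice or Pootle script; `port_change`, the `OPTED_OUT` set and the locale codes are hypothetical, and only the two mechanical changes named above (the accelerator marker and a trailing colon) are handled.

```python
from typing import Optional

# Hypothetical list of locales that prefer to handle such changes by hand.
OPTED_OUT = {"fr"}


def port_change(old_src: str, new_src: str,
                translation: str, locale: str) -> Optional[str]:
    """Return the updated translation, or None if a human has to look at it."""
    if locale in OPTED_OUT:
        return None
    # Accelerator marker changed from "~" to "_" and nothing else.
    if old_src.replace("~", "_") == new_src:
        return translation.replace("~", "_")
    # A colon was appended to the source and nothing else.
    if new_src == old_src + ":":
        return translation if translation.endswith(":") else translation + ":"
    return None


updated = port_change(
    "Opening ~text documents",
    "Opening _text documents",
    "A' fosgladh ~sgrìobhainnean teacsa",
    "gd",
)
print(updated)  # A' fosgladh _sgrìobhainnean teacsa
```

Returning `None` for everything else keeps the sketch conservative: any change it does not recognise stays with the localizer.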

I don't really follow this mailing list too closely, so I might be
wrong, but I would speculate that perhaps the main reason why we ended
up all whining and pointing fingers and sometimes speaking badly of
other people in this thread is the lack of internal communication,
proper planning and preparation. Someone came up with a great idea to
easily improve a massive amount of source strings, but maybe that person
didn't even consider the amount of workload that such change would bring
to the L10n teams. Someone else liked the idea, but did not consider
L10n as well. I can actually relate quite easily to these people – they
are developers, not localizers, it's quite likely that they don't even
know how our L10n process works. So, they liked the idea, and their
changes hit the fan, splashing "typographical nonsense" on everyone.

Had the plans to change such a big amount of strings been communicated
to the localizers well in advance, a red flag would probably have been
raised, and then better preparation (in the form of automatic conversion
scripts) would have been made in anticipation of these changes. Then
everything would be much smoother and we wouldn't waste our time
repeatedly expressing how annoyed and underappreciated we are feeling.

I guess things like these have to be learnt by developers the hard way.
Which is why it's very important that our dissatisfaction with these
issues is communicated to the dev team properly and that they note it
and don't forget it in future. This is also where at least a few L10n
teams working directly on master would help: noticing such unexpected
massive changes early enough should open the door for backing them out
and delaying their re-landing until the necessary preparations have been made.

Anyway, I think I've already said all this a couple times before. Time
to sleep.

Cheers,
Rimas

Yury_Tarasievich · December 16, 2014, 6:04am

Will all this (any of this?) be actually implemented?

And...

everything would be much smoother and we wouldn't waste our time
repeatedly expressing how annoyed and underappreciated we are feeling.

...these are real people, not robots. They *would* "whine" (who "whined", anyway?) and express various sentiments, instead of just going on with their work quietly.

Preaching to the choir: every project is about people, really. Doubly so for an OSS project. Forget that at the project's peril.

Yury

sophi · December 16, 2014, 7:23am

Hi Yury, all,

<lots of true and good things>

Will all this (any of this?) be actually implemented?

And...

everything would be much smoother and we wouldn't waste our time
repeatedly expressing how annoyed and underappreciated we are feeling.

...these are real people, not robots. They *would* "whine" (who
"whined", anyway?) and express various sentiments, instead of just going
on with their work quietly.

Preaching to the choir: every project is about people, really. Doubly
so for an OSS project. Forget that at the project's peril.

Yes, you're right and that's why we will discuss it with the developers
and the UX team. It's not the time right now, because each team is
reaching deadlines for 4.4. I've already told the board that we have a
workload issue again and sent a mail to devs and UX to be careful with
our workflow. UX has set up a page with string changes here:
https://wiki.documentfoundation.org/Documentation/RecentStringChanges

As you said, our project is about people, and a lot of different people
are contributing, so it's not always easy to keep track of what each one
is doing. We should have a better way to figure it out. It's a
communication issue, but on the other hand, any time I try to organize a
cross-communication meeting, only the marketing project participates...
I wrote to Dwayne again, but he is now out of his office until the 5th
of January. For your information, TDF is willing to pay for further
development of Pootle, but the team's availability is very low.

It's part of my job to smooth things in the project, and I'll really try
to do my best here.
Cheers
Sophie