Pootle migration

Hi,

right now I'm trying to migrate all the data from our current Pootle
server to the new one. This means that if you make any changes in Pootle
past this point, they will not be reflected in the new installation, so
please don't edit your translations in Pootle until further notice, or
at least make yourself a backup to restore from when done.

Once I'm finished setting Pootle up, we'll all have a few days to test
it and decide whether or not its performance is acceptable. If it is,
we'll just continue using it, and if not, we'll see how to proceed from
there.

So, the bottom line is: if you find your latest changes missing at
some point, don't be too surprised – it's sort of expected.

Regards,
Rimas

Hi Rimas

For the Portuguese language, Pootle doesn't have the strings merged from
OpenOffice (UI and help) or the extensions. Is it possible to add them on the new
server?

Regards

Hi Sérgio,

2011.03.29 00:12, Sérgio Marques wrote:

For the Portuguese language, Pootle doesn't have the strings merged from
OpenOffice (UI and help) or the extensions. Is it possible to add them on the new
server?

Yes. We'll ask Andras to do the merging stuff, and I'll upload the
resulting files to Pootle.

Darn, we (I?) should keep a TODO on the wiki...

Good night :-)
Rimas

Yes, when Rimas tells me that the new Pootle server is ready, I'll process
all the backlog (Lao, Portuguese etc.). Thanks for your patience. :-)

Andras

Dear Andras/Rimas,

Please add Gujarati (gu) to your list as well for the LibreOffice UI and
Help migration; it isn't available on the current Pootle server.

Moreover, I would suggest having all languages configured by default
with at least LibreOffice UI and Help, as this is the LibreOffice
Pootle translation server.

Thanks!

Hi Rimas

Thanks for your reply. When merged, I'll try to update all strings for 3.4.

Regards

May I suggest that you close the old server during this period?

Cheers,
Leif Lodahl

Hello Andras,
When the new server is ready, Breton should be set up in UI 3.4.
Should I do it? Do I have the rights? Or do you?
Is the new server already working?
Let us know when it's ok.
Thanks
Denis

Rimas knows... I'm not involved in the server migration. It would be
great, if we could start to work on 3.4 translations next week.

Andras

On 2011.03.31 at 13:21, Andras Timar wrote:

Hello Andras ,
When the new server is ready Breton should be set in UI 3.4.
Should I do it ? Do I have rights ? Or do you ?
Is the new server already working ?
Let us know when it's ok.
Thanks
Denis

Rimas knows... I'm not involved in the server migration. It would be
great, if we could start to work on 3.4 translations next week.

Hi,

What is the status of this migration? It seems that it happened. But it
was not announced. Please let me know, because we need to start 3.4
translations ASAP.

Thanks,
Andras

Hi,

2011.04.04 15:35, Andras Timar wrote:

On 2011.03.31 at 13:21, Andras Timar wrote:

Hello Andras ,
When the new server is ready Breton should be set in UI 3.4.
Should I do it ? Do I have rights ? Or do you ?
Is the new server already working ?
Let us know when it's ok.
Thanks
Denis

Rimas knows... I'm not involved in the server migration. It would be
great, if we could start to work on 3.4 translations next week.

What is the status of this migration? It seems that it happened. But it
was not announced. Please let me know, because we need to start 3.4
translations ASAP.

Right. I've migrated the database and files, and updated the database again at a later point, before switching the DNS entries to point to the virtual machine.

Now we need to import 3.4 files into the new Pootle and then start localizing (aka testing the installation). How can we proceed with this?

Stuff to consider: Pootle may run slower on this machine, and we may experience other problems, at least for now. I wasn't able to convince the admin that Pootle needs more resources, so we have what we have. Maybe if we manage to give enough load to the server, he'll change his mind (or we'll find other ways to deal with the problem).

So, let's start importing files for 3.4, shall we?

Rimas

I think the first step should be to migrate those languages that already
have 'full' translations. Rimas, is it possible to copy the libo33x_ui and
libo33x_help files to libo34x_ui and libo34x_help respectively, and
update them using the libo34x_ui and libo34x_help templates? I can add
other languages via the web interface later. I have now updated the
templates for 3.4.

Thanks,
Andras

Hi *,

Stuff to consider: Pootle may run slower on this machine, and we may
experience other problems, at least for now. I wasn't able to convince the
admin that Pootle needs more resources, so we have what we have.

Again, as you apparently still don't understand what I already wrote many times:
* Adding more resources will /NOT/ make pootle run faster than it does now.
The VM already has way more resources assigned than necessary. It is
/idle/ almost all the time.
* The only thing that is slow (when executed the first time) is
generation of the zips. So when you as translator request a zip: Don't
click the link multiple times because you don't immediately get the
zip. It can take 10 seconds for the files to be generated. Again:
* Adding more resources will /not/ make that time shorter. It is a
single-threaded process that can only use one single CPU, thus
assigning more CPUs won't help at all (the VM has 4 CPUs assigned
already).
Requesting that same zip another time (or different zips of the
project belonging to the same language) is fast/instant, but requesting
the zip for another language again may take some seconds for the first
request (or again after the files did change in between).
* Pootle has a memory leak when creating the zips. It won't release
memory after processing the files.
This would be the only time where the assigned resources may run out
(the VM has 1GB of RAM assigned): Multiple different languages request
the zip at the same time. Then memory usage increases, memory runs out
and either it is crawling along or the process gets killed.
* I will NOT assign RAM to a VM (and thus block that ram for other
use) to satisfy a memory leak, when that RAM is unused 99% of the
time.
* The effects of the memory leak can be nullified by just restarting
the worker processes more frequently. Thus again:
* Adding more resources will /NOT/ make the VM run faster, it will
/NOT/ allow it to handle more requests

Pootle is idling almost all of the time. There is less than one apache
request per second on average (and for regular requests (i.e.
non-"generate-a-zip" actions) it can easily serve >>50 simultaneous
requests per second.)

Maybe if we
manage to give enough load to the server, he'll change his mind (or we'll
find other ways to deal with the problem).

No, I won't change my mind, but depending on the load/effects of the
memory leak I'll reduce lifetime of the server-processes further.

Again:
The only time where pootle is "slow" is:
* Creation of zips for the first time / after files have changed.
This is a CPU-intensive process, and the CPU cannot be made faster by
assigning more resources. Live with it. Redesign pootle to use another
method (i.e. a multithreaded one that can use multiple CPUs at once)
or whatever, but this one problem is not solvable by assigning more
resources.

* This also is only noteworthy when the files are big/numerous.
* Requesting the other zips in the same project & language is fast.

So there is no point in clicking all zip-URLs on a page at once (on
the contrary, then the requests will all cause CPU to be burnt, while
all that is only needed one single time). If you want to download more
than one zip of a page, click the first one, wait until it is
generated and handed over to your browser, then click as many others
on the same page and get all of them quickly.

* Any other time, doing in-place-translation, just browsing along is
supposed to be fast.
If it is not, it needs to be investigated - i.e. please report it with
a description of what you did, when the problem occurred (and of
course what your definition of "slow" is).

The server is under close monitoring using munin, so bottlenecks
should be easily identified.

But a (daily) average load of 0.09 and an average of 0.08 apache
requests/second (to which munin also contributes) are in no way a reason to
assign more resources.
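To put those figures in perspective, here is a rough back-of-the-envelope calculation. It only uses the numbers quoted in this message: the measured average of 0.08 requests/second and the capacity of 50 requests/second, which was given as a lower bound (">>50"). The variable names are made up for illustration.

```python
# Figures quoted in the message; the names are illustrative only.
avg_requests_per_second = 0.08  # measured daily average (includes munin's own polling)
capacity_per_second = 50        # stated lower bound for regular (non-zip) requests

seconds_per_day = 24 * 60 * 60
requests_per_day = avg_requests_per_second * seconds_per_day
utilisation = avg_requests_per_second / capacity_per_second

print(f"about {requests_per_day:.0f} requests per day")
print(f"about {utilisation:.2%} of the stated request capacity")
```

This says nothing about peaks or about the zip generation case, which is discussed separately; it only restates the averages in more tangible units.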

Again:
* The VM has plenty of resources, with much reserves for more traffic/accesses.
* Creation of ZIPs is CPU-intensive and thus takes some seconds (for the
large packages up to 10 seconds). Be patient when requesting a zip for
the first time.
* When you see "premature end of script headers" or similar error
message - report it. This means the memory-leak did exceed the limits
and the lifetime parameters need to be tweaked.
* If you experience any "slowness" on other operations than requesting
zips: Report them.

(for users in Europe, it might now even be faster, as the data
doesn't need to cross the Atlantic anymore - well, not that humans
will notice that difference, but still :-))

ciao
Christian

Hi Christian,

On 2011.04.04 at 22:28, Christian Lohmaier wrote:

* If you experience any "slowness" on other operations than requesting
zips: Report them.

Many thanks for your insightful explanations. I find the new server more
responsive than the old one. However, some operations are still slow.
When I add a new language to a project (e.g. LibreOffice 3.4.x UI) and
click Overview tab of that new language, I have to wait ~2-3 minutes.
Then I upload translations to that language and I have to wait another
~5 minutes. This is not a convenient way to upload 100 languages. Can it
be done in a batch process from the shell?

Thanks,
Andras

Hi all,

I'm importing files for 3.4 into Pootle now. To make this process faster, please don't work on it just yet. We'll post here when the import is finished.

Rimas

Hi Andras, *,

On 2011.04.04 at 22:28, Christian Lohmaier wrote:

* If you experience any "slowness" on other operations than requesting
zips: Report them.

Many thanks for your insightful explanations. I find the new server more
responsive than the old one. However, some operations are still slow.
When I add a new language to a project (e.g. LibreOffice 3.4.x UI) and
click Overview tab of that new language, I have to wait ~2-3 minutes.

You don't write when exactly you did it, but from munin stats I see
there is high CPU utilization around the time you wrote that
post (i.e. between 9 and 10). But that high utilization is again only
on one single core, i.e. only 100% out of 400% is used.
This is Pootle's processing that is slow here (CPU-bound, thus
nothing fixable unless you rewrite Pootle to use a multithreaded
approach).

Then I upload translations to that language and I have to wait another
~5 minutes. This is not a convenient way to upload 100 languages. Can it
be done in a batch process from the shell?

Oh, large-scale updates surely should be possible using the shell, but
I don't really know Pootle, thus I cannot tell for sure (but I
guess Rimas is doing just that right now...)

ciao
Christian

Hi *,

I'm importing files for 3.4 into Pootle now. To make this process faster,
please don't work on it just yet. We'll post here when the import is
finished.

This import also is not very efficient resource-wise. It stresses disk
I/O, so I/O wait (wa) time makes up much of the load.

Maybe (I don't know) it is more efficient to run multiple manage
processes in parallel, with the task to update one language only
instead of starting one and giving it the task to update all languages
at once.
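A minimal sketch of what "one manage process per language, several in parallel" could look like. Whether this is actually faster is exactly the open question above. It assumes that manage.py is started from the Pootle directory and that update_stores accepts a --language option; the helper names build_command and run_per_language are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(language, command="update_stores"):
    """Build the argv for one per-language manage.py run (assumed invocation)."""
    return ["python", "manage.py", command, f"--language={language}"]


def run_per_language(languages, max_workers=2, runner=subprocess.call):
    """Run one manage.py process per language, at most max_workers at a time.

    Returns a dict mapping each language code to its exit status.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each thread only waits for its child process, so threads are enough here.
        statuses = pool.map(lambda lang: runner(build_command(lang)), languages)
        return dict(zip(languages, statuses))


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    def show(argv):
        print(" ".join(argv))
        return 0

    run_per_language(["pt", "gu", "br"], runner=show)
```

Since the import is said to be bound by disk I/O, a small max_workers value is probably the sensible starting point; too many parallel processes would just compete for the same disk.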

ciao
Christian

...or, if it does similar weird stuff as when creating zips, request
one language only at first, and then see whether processing an
additional one is (much) faster than the initial language.

ciao
Christian

Hi Andras, *,

On 2011.04.04 at 22:28, Christian Lohmaier wrote:

* If you experience any "slowness" on other operations than requesting
zips: Report them.

Many thanks for your insightful explanations. I find the new server more
responsive than the old one. However, some operations are still slow.
When I add a new language to a project (e.g. LibreOffice 3.4.x UI) and
click Overview tab of that new language, I have to wait ~2-3 minutes.

You don't write when exactly you did it, but from munin stats I see
there is high CPU utilization around the time you wrote that
post (i.e. between 9 and 10). But that high utilization is again only
on one single core, i.e. only 100% out of 400% is used.
This is Pootle's processing that is slow here (CPU-bound, thus
nothing fixable unless you rewrite Pootle to use a multithreaded
approach).

When adding a new language we don't do any caching until you go to the overview page. Thus when first viewing it we are doing all the updating for search and check failures. A way around this would be to run the refresh_stats manage.py command.

I doubt multithreading would help here because of Python's poor multithreading ability. So more invasive changes to display stats as they are available would probably be the correct way to make the UI more responsive. We're relying on Apache to start each mod_wsgi instance so while this is slow and hogging that one core it wouldn't prevent someone from translating elsewhere in Pootle.

Then I upload translations to that language and I have to wait another
~5 minutes. This is not a convenient way to upload 100 languages. Can it
be done in a batch process from the shell?

Oh, large-scale updates surely should be possible using the shell, but
I don't really know Pootle, thus I cannot tell for sure (but I
guess Rimas is doing just that right now...)

When uploading translations a few things are happening.

   1. Unzipping if needed
   2. Parsing the files that you supplied
   3. Merging your uploads with the current data in the db
   4. Refreshing the stats

We do this to prevent any data loss, but as you can imagine it's a lot of work. One thing worth trying on the LO Pootle server is to make use of the C PO parser; it's much faster but not as widely tested as the Python PO parser.

You can do this from the command line with update_stores followed by refresh_stats. This of course only applies if you know that the files on the file system are the ones that you want, as it will override anything in the database.
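As a rough sketch of that order of operations for one project and language: the two command names come from the message above, but the --project/--language options, the "python manage.py" invocation and the helper names are assumptions, so check them against the installed Pootle version before relying on this.

```python
import subprocess


def sync_commands(project, language):
    """Return the two manage.py invocations in the order described above."""
    scope = [f"--project={project}", f"--language={language}"]
    return [
        ["python", "manage.py", "update_stores", *scope],  # files on disk -> database
        ["python", "manage.py", "refresh_stats", *scope],  # rebuild the cached statistics
    ]


def sync_from_disk(project, language, runner=subprocess.check_call):
    """Run both steps in order.

    check_call raises if a step fails, so the stats are not refreshed
    after a failed update.
    """
    for argv in sync_commands(project, language):
        runner(argv)


if __name__ == "__main__":
    # Only print the commands; nothing is executed here.
    for argv in sync_commands("libo34x_ui", "pt"):
        print(" ".join(argv))
```

Because update_stores overrides what is in the database, it is worth printing and reviewing the commands like this before running them for real.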

Sent to the list on behalf of Friedel:

Hi Christian, everybody

Please CC me on any replies as I'm not on the list.

I'm one of the developers in the Translate project, and have been trying
to help Rimas a bit with this deployment of Pootle. Thanks for your work
on the server for Pootle! Please allow me a few comments:

Background:
Pootle simply isn't coded to run in small 16MB processes. It is a full
featured web application written on a heavy framework (Django) in a
programming language that isn't very frugal with memory use (Python). We
didn't specifically optimise for memory use over other things when we
programmed Pootle, and LibreOffice is running a system with over 5000
files for the libo34x_ui project alone. When looking at help, the files
are very big by any standard in the world of FOSS. Of course, the help
has many of these big files per language. This is all fine. Pootle runs
fine with loads like this, as (obviously) visible from the OOo project.

The rest of my comments are inline...

> Subject: Re: [libreoffice-l10n] Pootle migration
> Date: Mon, 4 Apr 2011 22:28:43 +0200
> From: Christian Lohmaier
>
> Hi *,
>
> >
> > Stuff to consider: Pootle may run slower on this machine, and we may
> > experience other problems, at least for now. I wasn't able to convince the
> > admin that Pootle needs more resources, so we have what we have.
>
> Again, as you apparently still don't understand what I already wrote many times:
> * Adding more resources will /NOT/ make pootle run faster than it does now.
> The VM already has way more resources assigned than necessary. It is
> /idle/ almost all the time.

As far as I know the server hasn't really been used yet, so I guess
we'll be collecting data from now on to see how things go. During the
setup of the server, we make tradeoffs between performance and memory
use. If there is no memory available, we'll obviously try to optimise at
all cost for minimising memory use, and that is what I understand that
Rimas said: things might be slower than necessary, since we are not
optimising for performance, but for memory use.

> * The only thing that is slow (when executed the first time) is
> generation of the zips. So when you as translator request a zip: Don't
> click the link multiple times because you don't immediately get the
> zip. It can take 10 seconds for the files to be generated. Again:
> * Adding more resources will /not/ make that time shorter. It is a
> single-threaded process that can only use one single CPU, thus
> assigning more CPUs won't help at all (the VM has 4CPUs assigned
> already)
> Requesting that same zip another time (or different zips of the
> project belonging to the same language) is fast/instant, but requesting
> the zip for another language again may take some seconds for the first
> request (or again after the files did change in between).
> * Pootle has a memory leak when creating the zips. It won't release
> memory after processing the files.
> This would be the only time where the assigned resources may run out
> (the VM has 1GB of RAM assigned): Multiple different languages request
> the zip at the same time. Then memory usage increases, memory runs out
> and either it is crawling along or the process gets killed.

Some stuff that is slow to load is cached for later use. This is done
for performance optimisation. This is one of the reasons you won't see
the memory use go down immediately after generating a ZIP file. Another
reason is the way the garbage collector works in Python.

Deciding to cache something is a tradeoff. So we can disable or minimize
some of the caching, which will simply make a few things slower,
hopefully not by much, but we're guessing while the server hasn't been
used much yet.

I suggested some customisations to the parse pool (to do exactly this).
That affects the number of cached files and search indexes, both of
which are very large on your server.
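The tradeoff can be illustrated with a small, generic example of a size-limited cache. This is not Pootle's parse pool, just Python's standard functools.lru_cache standing in for it, with a fake parse function and made-up file names.

```python
from functools import lru_cache

parse_count = 0  # counts how often the "expensive" parse actually runs


@lru_cache(maxsize=2)  # keep at most two parsed files in memory
def parse_file(path):
    """Stand-in for an expensive parse; a real one would read and parse a PO file."""
    global parse_count
    parse_count += 1
    return f"parsed:{path}"


# Three different files requested in turn, with only two cache slots available.
for path in ["a.po", "b.po", "a.po", "c.po", "a.po", "b.po"]:
    parse_file(path)

print(parse_count, parse_file.cache_info())
```

With room for only two entries, the sequence above triggers four parses; a cache big enough for all three files would need only three. A smaller cache saves memory and costs repeated work, which is the kind of slowdown meant here.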

> * I will NOT assign RAM to a VM (and thus block that ram for other
> use) to satisfy a memory leak, when that RAM is unused 99% of the
> time.

I believe what you are seeing is the caching, not a memory leak.

We haven't seen the server used much yet. My educated guess from having
worked on a few Pootle installations is that the RAM isn't enough, but
let's keep an eye on things and see how it goes. I assumed we'll want a
nice fast server supporting several concurrent users during the build-up
to a release, but we can still tune things down a bit more, I guess.

> * The effects of the memory leak can be nullified by just restarting
> the worker processes more frequently. Thus again:

...at the cost of making things slower, since more stuff needs to be
loaded in memory afresh every time you restart a process.

> * Adding more resources will /NOT/ make the VM run faster

It most probably will, since we are sacrificing performance to minimise
memory use. For example: we opted for more threads rather than
processes, an approach that is known not to perform as well in Python,
especially for CPU-intensive tasks.
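For readers who want to see the effect themselves, here is a small generic comparison, unrelated to Pootle's code. It runs the same pure-Python, CPU-bound function through a thread pool and a process pool from the standard library. On a standard CPython build with the global interpreter lock, the threaded run is typically not faster than running the jobs one after another, while the process pool can use several cores; the actual timings depend on the machine, so none are given. The function names and job sizes are arbitrary.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_task(n):
    """Pure-Python, CPU-bound work: sum of the squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(executor_cls, jobs, workers=4):
    """Run cpu_task over jobs with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(cpu_task, jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [2_000_000] * 4
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, seconds = timed_run(cls, jobs)
        print(f"{cls.__name__}: {seconds:.2f}s")
```

The price of the process-based variant is memory: every worker process carries its own copy of the application, which is the constraint on this server.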

> it will
> /NOT/ allow it to handle more requests

It most probably will, since we reduced the number of processes to
minimise memory use, and slower serving of requests necessarily affects
the number of requests you can serve in any given time.

> Pootle is idling almost all of the time. There is less than one apache
> request per second on average (and for regular requests (i.e.
> non-"generate-a-zip" actions) it can easily serve >>50 simultaneous
> requests per second.)

Let's keep an eye on things when people actually start to use the
server.

> > Maybe if we
> > manage to give enough load to the server, he'll change his mind (or we'll
> > find other ways to deal with the problem).
>
> No, I won't change my mind, but depending on the load/effects of the
> memory leak I'll reduce lifetime of the server-processes further.

I hope you will be reasonable and look at the data as it becomes
available, and at least consider changing your mind. As mentioned, I'm
pretty sure there is no memory leak. If you reduce the lifetime of the
server processes, you are just making performance worse, which is all
that Rimas warned the users about.

> Again:
> The only time where pootle is "slow" is:
> * Creation of zips for the first time / after files have changed.
> This is a CPU-intensive process, and the CPU cannot be made faster by
> assigning more resources. Live with it. Redesign pootle to use another
> method (i.e. a multithreaded one that can use multiple CPUs at once)
> or whatever, but this one problem is not solvable by assigning more
> resources.

I agree. Generating the ZIP files is slow. Doing it multithreaded would
limit the performance for more users while it is running.

> * This also is only noteworthy when the files are big/numerous.

Which is the norm on this server, unfortunately.

> * Requesting the other zips in the same project & language is fast.
>
> So there is no point in clicking all zip-URLs on a page at once (on
> the contrary, then the requests will all cause CPU to be burnt, while
> all that is only needed one single time). If you want to download more
> than one zip of a page, click the first one, wait until it is
> generated and handed over to your browser, then click as many others
> on the same page and get all of them quickly.

Yes, we have optimised for several cases here that are likely, and I
suggested some workarounds for some of the issues we're likely to hit
with the little bit of RAM, which Rimas has already started applying, as
far as I know.

> * Any other time, doing in-place-translation, just browsing along is
> supposed to be fast.

... as long as we're not hitting the current imposed limits of
concurrency.

Let's see how well we can make this work.

Keep well
Friedel