Advice on translating large Odt documents

Hi :slight_smile:  
Sorry, this is a bit off-topic for this list but it seems to be the only place that might have relevant experience and expertise about this issue.

Are there any good tools to help people translate fairly large documents produced by LibreOffice?  There is a group that has some Odt files of up to 60 pages per "chapter", and those chapters get combined to form books of 300-600 pages.

At the moment their only way of translating them means they can't access most of the tools you folks use and it's difficult to find a good work-flow too.

Does anyone here translate office documents or books and have any good suggestions for tools and/or work-flow?  
Regards from 
Tom :slight_smile:

Hi Tom,

> Hi :slight_smile: Sorry, this is a bit off-topic for this list but it seems to be
> the only place that might have relevant experience and expertise
> about this issue.
>
> Are there any good tools to help people translate fairly large
> documents produced by LibreOffice? There is a group that has some
> Odt files of up to 60 pages per "chapter" and those chapters get
> combined to form books of 300-600 pages.
>
> At the moment their only way of translating them means they can't
> access most of the tools you folks use and it's difficult to find a
> good work-flow too.

They can use OmegaT; it's a very good tool that has been translated
into several languages, see
http://www.omegat.org/

> Does anyone here translate office documents or books and have any
> good suggestions for tools and/or work-flow?

The workflow depends on the size of the team, the number of translators,
proofreaders, etc.

Kind regards
Sophie

Hi Tom,

We've just successfully completed a proof-of-concept test of translating the LO guides using OmegaT. It works, even in a team setup in which the translators share a translation memory over a Subversion (or git) repository.

The only problem we have encountered (and solved) is oversegmentation of the text caused by direct formatting. Even single words were split by it, which made translation nearly impossible. We had to remove the formatting (replacing it with styles) and then clean the files with a script (which removed all direct formatting). The script is available (but it is not an enterprise-grade one :slight_smile:).
I wrote to this list about that a few days ago.
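
For readers wondering what such a cleanup involves, here is a minimal sketch in Python. It is not the script mentioned above, just an illustration under stated assumptions: direct character formatting is assumed to show up in an ODT's `content.xml` as `text:span` elements that point at automatic styles of family `text` (names such as `T1`), while spans using named character styles are left alone. The function names and the sample document are made up for this example. Reading `content.xml` out of the .odt archive and writing it back are left out, and so is anything stored in `styles.xml` (headers, footers).

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs as used in content.xml.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

SPAN = f"{{{TEXT}}}span"
SPAN_STYLE = f"{{{TEXT}}}style-name"


def automatic_text_styles(root):
    """Collect the names of automatic character styles (assumed to be direct formatting)."""
    names = set()
    for block in root.iter(f"{{{OFFICE}}}automatic-styles"):
        for style in block.iter(f"{{{STYLE}}}style"):
            if style.get(f"{{{STYLE}}}family") == "text":
                names.add(style.get(f"{{{STYLE}}}name"))
    return names


def unwrap(parent, index):
    """Remove parent[index] but keep its text, children and tail in place."""
    span = parent[index]
    kids = list(span)

    def glue(text, at):
        # Append text to whatever precedes position `at` inside parent.
        if not text:
            return
        if at == 0:
            parent.text = (parent.text or "") + text
        else:
            parent[at - 1].tail = (parent[at - 1].tail or "") + text

    del parent[index]
    glue(span.text, index)
    for offset, kid in enumerate(kids):
        parent.insert(index + offset, kid)
    glue(span.tail, index + len(kids))


def strip_direct_spans(parent, direct):
    """Recursively unwrap every span whose style name is in `direct`."""
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag == SPAN and child.get(SPAN_STYLE) in direct:
            unwrap(parent, i)  # stay at index i: it now holds different content
        else:
            strip_direct_spans(child, direct)
            i += 1


def clean_content_xml(content_xml):
    """Parse content.xml (bytes) and return the tree without direct-formatting spans."""
    root = ET.fromstring(content_xml)
    strip_direct_spans(root, automatic_text_styles(root))
    return root


# Hypothetical sample: the word "Translation" is split by a directly formatted span.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:automatic-styles>
    <style:style style:name="T1" style:family="text"/>
  </office:automatic-styles>
  <office:body><office:text>
    <text:p>Trans<text:span text:style-name="T1">la</text:span>tion <text:span text:style-name="Emphasis">memory</text:span></text:p>
  </office:text></office:body>
</office:document-content>"""

root = clean_content_xml(SAMPLE)
para = root.find(f".//{{{TEXT}}}p")
print("".join(para.itertext()))  # prints: Translation memory
```

After the cleanup the paragraph starts with the unbroken text "Translation " and only the span with the named style `Emphasis` remains, so a segmenting tool no longer sees a tag in the middle of the word. A real tool would also have to preserve namespace prefixes when serializing and repack the archive correctly, which this sketch does not attempt.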

Milos

On 10.10.2013 11:47, Sophie wrote:

Hi :slight_smile:

Thanks Sophie and Milos! :)  That neatly solves the whole problem for me.

If I do have further problems I might ask about them separately but, for me, this case is closed already! :smiley:
Many thanks and regards from

Tom :slight_smile:

Hi Sophie, all,

> They can use OmegaT; it's a very good tool that has been translated
> into several languages, see
> http://www.omegat.org/

Whilst OmegaT seems pretty stable on Linux, my experience with it on Mac has been... less than satisfactory: it never seems to stay running for more than 10-15 minutes before dying unceremoniously on me.

Alex

Hi Milos,

> The only problem we have encountered (and solved) is oversegmentation of
> translation segments owing to direct formatting. Even words were split
> by this, making translation nearly impossible. We had to remove the
> formatting (replaced by styles) and then had to clean the files by a
> script (which removed all direct formatting). The script is available
> (but it is not an enterprise grade one :slight_smile:)

I would be interested in being able to use that script to try and clean up the Base guides I'm attempting to translate at the moment.

Alex

Hi Alex,

You can download the script from http://ubuntuone.com/1LSDBsRaraP5CHDXMRPjSW
Besides the script, there is also the whole Getting Started guide: the original files, the files with direct formatting removed manually, and the files cleaned by the script.
The styles were modified (colored background) so that one can distinguish between direct formatting and styles.

The script currently does not accept any switches and can be meaningfully used only with the LO guides, since some stuff is hardcoded inside. I plan, however, to improve it a bit :slight_smile: See the corr.sh script in the archive.

What is important: once the text is cleaned up, it should be sent to the authors to replace the originals. Otherwise we will never get rid of the direct formatting mess.
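
One way to see whether a file still carries direct formatting before sending it back is to count the affected spans. The following is a rough sketch in Python, unrelated to corr.sh; the function names are invented for this example. It assumes, as a simplification, that direct character formatting appears in `content.xml` as `text:span` elements referring to automatic styles of family `text`. Direct formatting applied to whole paragraphs is not counted.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# ODF namespace URIs as used in content.xml.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def count_direct_spans(content_xml):
    """Count spans in content.xml (bytes) that use an automatic character style."""
    root = ET.fromstring(content_xml)
    # Automatic styles of family "text" are assumed to come from direct formatting.
    direct = {
        style.get(f"{{{STYLE}}}name")
        for block in root.iter(f"{{{OFFICE}}}automatic-styles")
        for style in block.iter(f"{{{STYLE}}}style")
        if style.get(f"{{{STYLE}}}family") == "text"
    }
    return sum(
        1
        for span in root.iter(f"{{{TEXT}}}span")
        if span.get(f"{{{TEXT}}}style-name") in direct
    )


def report(paths):
    """Print the number of directly formatted spans for each .odt file."""
    for path in paths:
        # An .odt file is a zip archive; the document body lives in content.xml.
        with zipfile.ZipFile(path) as odt:
            count = count_direct_spans(odt.read("content.xml"))
        print(f"{path}: {count} directly formatted span(s)")


if __name__ == "__main__":
    report(sys.argv[1:])
```

Saved under a name of your choice, it would be run with the .odt files as arguments; a count of zero suggests the chapter is ready to replace the original.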

Best
Milos

On 11.10.2013 11:25, Alex Thurgood wrote:

Hi :slight_smile:

I think it is fairly easy to get a Virtual Machine onto Mac and then install some Gnu&Linux in as default a configuration as you have time for, just to run the odd 1 or 2 apps.  If you want an extremely light-weight, minimalist Gnu&Linux then SliTaz is good and is designed in French rather than English (although it's possible to change to quite a few different languages).

Also there is Wine for running Windows apps almost as though they are native apps.

Since Mac and Gnu&Linux don't hog resources so much, surely those sorts of options are more viable than they are on Windows.  I'm not sure that really helps but I hope it might.  Also I'm fairly sure Alex has already tried things like that.

Regards from

Tom :slight_smile:

Hi, Alex,

I use OmegaT on OSX without any problems. I am using the latest betas and
it is quite stable, even with a large LO translation memory and LT checking
enabled. Which OS, OmegaT, and Java versions are you using?

Best regards, m.

Hi Martin,

> I use OmegaT on OSX without any problems. I am using the latest betas and
> it is quite stable, even with a large LO translation memory and LT checking
> enabled. Which OS, OmegaT, and Java versions are you using?

I'll check and get back to you. As for the JVM, in theory it is the
Apple-provided JDK 1.6_29, I think.

Alex

On Thursday, 10 October 2013 at 11:47:05, Sophie wrote:

> Hi Tom,
>
> > Hi :slight_smile: Sorry, this is a bit off-topic for this list but it seems to be
> > the only place that might have relevant experience and expertise
> > about this issue.
> >
> > Are there any good tools to help people translate fairly large
> > documents produced by LibreOffice? There is a group that has some
> > Odt files of up to 60 pages per "chapter" and those chapters get
> > combined to form books of 300-600 pages.
> >
> > At the moment their only way of translating them means they can't
> > access most of the tools you folks use and it's difficult to find a
> > good work-flow too.
>
> They can use OmegaT; it's a very good tool that has been translated
> into several languages, see
> http://www.omegat.org/
>
> > Does anyone here translate office documents or books and have any
> > good suggestions for tools and/or work-flow?
>
> The workflow depends on the size of the team, the number of translators,
> proofreaders, etc.

I can fully support this suggestion... :slight_smile:

Ciao