Translating ODF files using OmegaT

Milos_Sramek · September 13, 2013, 11:52am

Hi,

I would like to translate the book Getting Started with LO 4.0 into Slovak. Translating the odt file directly is perhaps not a good idea, so I would like to use a tool with translation memory. I tried OmegaT. There is, however, a problem with tags:

For example, the sentence

*Quickstarter is installed in the Windows system tray and is automatically loaded during system startup.*

is displayed in the document without any formatting, using just the Default character style.

In OmegaT, however, it is displayed like this:

*<f10>Quickstarter is installed in </f10><f11>the Windows system tray </f11><f12>and is automatically loaded </f12><f13>during system startup.</f13>*

(<fxx> are tag replacements and should not be touched.)

This suggests that there are some tags inside this sentence, in spite of the fact that I do not see any. Indeed, in the content.xml file inside the document we can see them:

*<text:span text:style-name="T65">Quickstarter is installed in </text:span><text:span text:style-name="T21">the Windows system tray </text:span><text:span text:style-name="T65">and is automatically loaded </text:span><text:span text:style-name="T21">during system startup. </text:span>*

These tags in fact do not carry any meaningful formatting:

*<style:style style:name="T21" style:family="text"><style:text-properties fo:language="en" fo:country="GB"/></style:style>*
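
One way to check which automatic styles are language-only, like T21 above, is to scan content.xml for text styles whose only properties are fo:language and fo:country. The following is a minimal sketch with Python's standard library, not a tested tool. The function name, the sample document and the style T99 are made up for illustration; only T21 comes from the real file.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO_NS = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

STYLE = f"{{{STYLE_NS}}}style"
TEXT_PROPS = f"{{{STYLE_NS}}}text-properties"
NAME = f"{{{STYLE_NS}}}name"
FAMILY = f"{{{STYLE_NS}}}family"
LANGUAGE_ONLY = {f"{{{FO_NS}}}language", f"{{{FO_NS}}}country"}

# Hypothetical sample: T21 is copied from the post, T99 is invented.
SAMPLE = f"""<styles xmlns:style="{STYLE_NS}" xmlns:fo="{FO_NS}">
  <style:style style:name="T21" style:family="text">
    <style:text-properties fo:language="en" fo:country="GB"/>
  </style:style>
  <style:style style:name="T99" style:family="text">
    <style:text-properties fo:font-weight="bold"/>
  </style:style>
</styles>"""


def language_only_styles(root):
    """Return the names of text styles that set nothing but language/country."""
    names = set()
    for style in root.iter(STYLE):
        # Skip styles with extra attributes (for example a parent style),
        # because they might inherit real formatting.
        if set(style.attrib) != {NAME, FAMILY} or style.get(FAMILY) != "text":
            continue
        children = list(style)
        if len(children) != 1 or children[0].tag != TEXT_PROPS:
            continue
        props = set(children[0].attrib)
        if props and props <= LANGUAGE_ONLY:
            names.add(style.get(NAME))
    return names


if __name__ == "__main__":
    print(language_only_styles(ET.fromstring(SAMPLE)))
```

The check is deliberately conservative: a style with any other attribute or property is left alone, so it may miss some spans that are also redundant.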

Since there are several such tags in nearly every sentence, translating with OmegaT is not practical. I checked whether removing them manually from the content.xml file changes anything: it does not, the document looks the same.
So there is a way to use OmegaT: one has to clean up the XML by removing all these useless tags. Does anybody have an idea how to do that? Do you know of a tool for this?
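
For what it is worth, the cleanup described here could be scripted. The sketch below is only an illustration under several assumptions: the set of style names to unwrap is supplied by hand, the function names are hypothetical, and it has not been tried on the real Getting Started file. It replaces each matching text:span with its own content and writes a cleaned copy of the ODT. ElementTree may drop namespace declarations it considers unused, so work on a copy and check the result in Writer.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SPAN = f"{{{TEXT_NS}}}span"
STYLE_NAME = f"{{{TEXT_NS}}}style-name"

# The sentence from the post, wrapped in a paragraph for the demo.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    '<text:span text:style-name="T65">Quickstarter is installed in </text:span>'
    '<text:span text:style-name="T21">the Windows system tray </text:span>'
    '<text:span text:style-name="T65">and is automatically loaded </text:span>'
    '<text:span text:style-name="T21">during system startup. </text:span>'
    '</text:p>'
)


def _append_text(parent, index, text):
    """Append text to whatever precedes position `index` inside `parent`."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def unwrap_spans(parent, removable):
    """Replace every <text:span> whose style is in `removable` by its content."""
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag == SPAN and child.get(STYLE_NAME) in removable:
            promoted = list(child)
            del parent[i]
            _append_text(parent, i, child.text)
            for offset, node in enumerate(promoted):
                parent.insert(i + offset, node)
            _append_text(parent, i + len(promoted), child.tail)
            # Do not advance: promoted children now sit at position i.
        else:
            unwrap_spans(child, removable)
            i += 1


def clean_odt(src, dst, removable):
    """Copy an ODT archive, unwrapping removable spans in content.xml."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "content.xml":
                # Keep the original prefixes (text:, style:, ...) on output.
                for _, (prefix, uri) in ET.iterparse(io.BytesIO(data), events=["start-ns"]):
                    ET.register_namespace(prefix, uri)
                root = ET.fromstring(data)
                unwrap_spans(root, removable)
                data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
            # The "mimetype" entry of an ODF package is stored uncompressed.
            method = zipfile.ZIP_STORED if item.filename == "mimetype" else zipfile.ZIP_DEFLATED
            zout.writestr(item.filename, data, compress_type=method)


if __name__ == "__main__":
    paragraph = ET.fromstring(SAMPLE)
    unwrap_spans(paragraph, {"T21", "T65"})
    print(paragraph.text)
```

A call such as clean_odt("guide.odt", "guide-clean.odt", {"T21", "T65"}) would then produce the copy to put into the OmegaT source folder; both file names are placeholders. The unused style definitions stay in the file, which should be harmless.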

Thanks
Milos

Tom_Davies · September 13, 2013, 11:58am

Hi :)
It might be worth asking on the L10n (international translators) mailing list and the Documentation Team mailing list.

The international translators focus on the in-built help but still do quite a bit to translate the official guides too. The Documentation Team work on the official guides and might appreciate a bit of help tidying things up, or else be able to explain why the tags are there (if there is a reason).
Regards from
Tom

iplaw67 · September 13, 2013, 12:59pm

Hi,

Having tried to use OmegaT for the very same purpose, and not having
found a way around that problem or similar ones in other ODT
documents I was translating, I have given up on using it for the moment.
There may be others more knowledgeable who do know how to solve this problem.

Alex

Felmon_Davis · September 15, 2013, 6:37pm

Hi,

I do not know this program, though I took a quick look at the manual. I wondered what happens if you copy and paste your text into it, or if you save your file as a text file and load that into the program? Maybe you have already tried all of this.
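
For the plain-text route, the text can also be pulled out of the ODT with a short script instead of going through Writer. This is a rough sketch with the Python standard library; the function names are hypothetical, and it ignores special elements such as tabs and line breaks. All formatting is lost, so the translated text would still have to be put back into the document some other way.

```python
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH_TAGS = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}

# The sentence from the original post, as it appears in content.xml.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    '<text:span text:style-name="T65">Quickstarter is installed in </text:span>'
    '<text:span text:style-name="T21">the Windows system tray </text:span>'
    '<text:span text:style-name="T65">and is automatically loaded </text:span>'
    '<text:span text:style-name="T21">during system startup. </text:span>'
    '</text:p>'
)


def paragraphs(content_xml):
    """Yield the plain text of each paragraph and heading, without tags."""
    root = ET.fromstring(content_xml)
    for elem in root.iter():
        if elem.tag in PARAGRAPH_TAGS:
            text = "".join(elem.itertext()).strip()
            if text:
                yield text


def odt_to_text(odt_path, txt_path):
    """Write one line per paragraph of the ODT file to a UTF-8 text file."""
    with zipfile.ZipFile(odt_path) as odt:
        content = odt.read("content.xml")
    with open(txt_path, "w", encoding="utf-8") as out:
        for line in paragraphs(content):
            out.write(line + "\n")


if __name__ == "__main__":
    for line in paragraphs(SAMPLE):
        print(line)
```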

F.

valtermura · September 15, 2013, 7:15pm

Hi Milos

The problem is not OmegaT; the problem is the tags inside the odt file. Don't
forget that some tags could be "hidden". Anyway, some suggestions:

- create a copy of the file
- remove all tags from the new file: Ctrl+A, then Ctrl+M (Clear Direct Formatting); save it
- use it for translation: put it in the project's source folder
- translate it and save (this creates the tmx translation memory)

Now put the original file in the source folder and reopen the project: you'll
get suggestions from the tmx that can easily be inserted into the segments.
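
To illustrate why the memory built from the stripped copy still helps with the tagged original, here is a toy sketch. It is not OmegaT's matching algorithm, only a stand-in using Python's difflib. The function names, the second memory entry and the target strings are placeholders invented for the example.

```python
import re
from difflib import SequenceMatcher

# OmegaT-style placeholder tags such as <f10> and </f10>.
TAG = re.compile(r"</?f\d+>")

TAGGED = (
    "<f10>Quickstarter is installed in </f10><f11>the Windows system tray </f11>"
    "<f12>and is automatically loaded </f12><f13>during system startup.</f13>"
)
PLAIN = (
    "Quickstarter is installed in the Windows system tray "
    "and is automatically loaded during system startup."
)

# Hypothetical memory from translating the tag-free copy (targets are placeholders).
MEMORY = {
    PLAIN: "<Slovak translation of the Quickstarter sentence>",
    "LibreOffice is a free office suite.": "<another Slovak translation>",
}


def strip_tags(segment):
    """Remove the <fN> and </fN> placeholders from a segment."""
    return TAG.sub("", segment)


def best_match(segment, memory):
    """Return (score, source, target) of the memory entry closest to the segment."""
    plain = strip_tags(segment)
    return max(
        (SequenceMatcher(None, plain, source).ratio(), source, target)
        for source, target in memory.items()
    )


if __name__ == "__main__":
    score, source, target = best_match(TAGGED, MEMORY)
    print(round(score, 2), target)
```

Once the tags are ignored, the tagged segment and the untagged memory entry are the same text, so the stored translation can be offered as a suggestion and the tags placed by hand.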

Subscribe to the OmegaT mailing list (if not already done) and ask there:
http://groups.yahoo.com/group/OmegaT/join

Ciao

iplaw67 · September 16, 2013, 4:39am

Hi Valter,

My understanding from the OmegaT documentation was that it does indeed insert tags to represent certain style formats and attributes found in the source document on import, so that it can re-render that information in the target. Isn't that what we are talking about here, or is it more a case of overloaded or excessive use of (perhaps redundant) styles in the original document?

Alex