Printing PO file without tag

Tadele_Assefa · December 8, 2012, 7:27pm

Dear All,

We are translating LibreOffice into Sidama (in Ethiopia). Our
problem is this: some of our team members are language experts who are
not familiar with the tags in the PO files (or, in fact, with computers
in general). As the Help PO files are so large, we propose to print them
without the tags, as pure text. After the language experts finish the
translation on paper, we will type it in and add the tags. Is there
a tool we can use to accomplish this?

Janis · December 8, 2012, 7:43pm

If those are classic .po files, you can use Poedit or Virtaal (both available for Linux and Windows). In the case of XML, I have no idea.

Janis

yaron · December 9, 2012, 8:25am

Hey guys!
This is great news!
If you need, we can probably compose some sort of regex that eliminates the
tags. After that, we can use Poedit to produce an HTML
output.

Kind regards,
Yaron Shahrabani.

Kostas_Mousafiris · December 9, 2012, 8:37am

Hi Yaron and good morning to everyone!
Let me just express my appreciation for this idea!
I am also an enthusiastic translator into my mother tongue (Greek), but as
I am not a really tech-savvy person,
I quickly get tired and annoyed by the intense and continuous effort I
have to make to avoid messing up those tags...
It would be a real blessing for all volunteers like myself to be
freed from this business.
If this "regex" thingy can give us a hand, then kudos!

Constantine

yaron · December 9, 2012, 9:58am

This is what I have come up with so far.
It has several disadvantages, and I would also like to add an option to
extract the "alt" text and put it in parentheses (instead of removing it).

Any help is appreciated; feel free to comment and fix. Once we come
up with an appropriate solution, we can make a nice script out of it for
every platform required:
http://regexpal.com/?flags=g&regex=<[%2F]%3F[\w\s\%3D\"]*>&input=<data>This%20text<%2Fdata> <just%20some%20tag%3D"altt">
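
For anyone who wants to try this outside regexpal, here is a minimal Python sketch of the same idea. The TAG pattern is the one decoded from the link above. The ALT pattern and the strip_tags helper are hypothetical additions that assume alt text is written as alt="..." inside a tag. Note that the TAG pattern only allows word characters, whitespace, '=' and '"' inside a tag, so tags whose attributes contain other punctuation are left untouched.

```python
import re

# Pattern decoded from the regexpal link: an optional slash, then any run of
# word characters, whitespace, '=' and '"' before the closing bracket.
TAG = re.compile(r'<[/]?[\w\s="]*>')

# Assumption (not from the thread): alt text appears as alt="..." in a tag.
ALT = re.compile(r'<[^<>]*\balt="([^"]*)"[^<>]*>')


def strip_tags(text):
    """Replace tags carrying alt text with "(alt text)", then drop other tags."""
    text = ALT.sub(lambda m: "(" + m.group(1) + ")", text)
    return TAG.sub("", text)


sample = '<data>This text</data> <just some tag="altt">'
print(repr(strip_tags(sample)))
```

The sample string is the test input from the link; it contains no alt attribute, so only the plain tag removal applies to it.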

Yaron Shahrabani

Tadele_Assefa · December 10, 2012, 8:21am

Yaron,

If it can be made to clear the tags, I think your 'regex' idea is
brilliant. I was thinking of converting the PO files to CSV, opening
them in a text editor, and removing similar tags by search and replace
and the leftovers by hand... which is a rather tiresome job for so many
files.

Please expand the regex as much as possible and release it.

Thanks,

yaron · December 10, 2012, 9:29am

I still need some info from the translation maintainers:
What type of tags are there in the translation? (I'm not sure about the
type of tags).

And Tadele: could you please show me some examples of the tags you want to
eliminate?

(Please include the entire original text and what the output should be.)

Kind regards,

Yaron Shahrabani

Tadele_Assefa · January 19, 2013, 7:57pm

Hi Everybody,

Lately, I returned to the 'tag elimination' idea because some of our
translators cannot cope with the tags they see in Virtaal. (We are doing
the translation offline.) With a little regex from Stack Overflow
<http://stackoverflow.com/a/8784436/1993440>,
I produced the Python script below and got 'clean' text. Here is my entire
'process':

1) po2csv -i x.po -o y.csv => get CSV
2) Open y.csv in LO and delete columns A and C
3) Run clear-tags.py => get y-clean.csv
4) Open y-clean.csv in LO, enter =MOD(ROW(L2),2) in a separate column, then
filter out the empty rows created by po2csv and delete them.

Here <https://docs.google.com/file/d/0B1xCocIHKT5jS3FELXhCdVJ6Mm8/edit> is
a sample file I produced this way.

As you can see, this is a lot of work for so many PO files. But if the
approach is found to be good and correct, I think we can turn it into a
bigger script that can do this repeatedly for a given directory.

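"clear-tags.py"
import re

# Read the CSV produced by po2csv (assumes the file is UTF-8 encoded).
with open('y.csv', encoding='utf-8') as f:
    text = f.read()

# Replace every <...> tag with a single space.
clean = re.sub(r'<[^>]+>', ' ', text)

# Write the tag-free text to a new file.
with open('y-clean.csv', 'w', encoding='utf-8') as f:
    f.write(clean)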
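
The "bigger script" for a whole directory could look roughly like the sketch below. It reuses the pattern from clear-tags.py. The function name clean_directory, the NAME-clean.csv naming scheme, and the UTF-8 encoding are assumptions for illustration, not something agreed in this thread.

```python
import re
from pathlib import Path

TAG = re.compile(r'<[^>]+>')  # same pattern as in clear-tags.py


def clean_directory(directory):
    """Write NAME-clean.csv next to every NAME.csv found in directory."""
    written = []
    # sorted() collects the file list up front, before any output is created.
    for src in sorted(Path(directory).glob('*.csv')):
        if src.stem.endswith('-clean'):
            continue  # skip output left over from an earlier run
        text = src.read_text(encoding='utf-8')  # assumption: UTF-8 files
        dst = src.with_name(src.stem + '-clean.csv')
        dst.write_text(TAG.sub(' ', text), encoding='utf-8')
        written.append(dst)
    return written
```

Calling clean_directory('.') would process every CSV file in the current directory; the po2csv step and the column clean-up in LO would still have to be done beforehand.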

Chris_Leonard · January 19, 2013, 8:20pm

Tadele,

I'm wondering if pocommentclean from the Translate Toolkit is the
droid you are looking for.

http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/pocommentclean.html

cjl

Zeki · January 19, 2013, 9:38pm

I do not recommend this type of work. XML tags are not that hard, and
translating them in Pootle is easy.

Also, re-placing the tags could be a nightmare: strings with wrong tags are
not included in the translations.
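
To illustrate the risk, here is a rough sketch of a check that could be run after the tags have been typed back in. It only compares which tags occur in the source and in the translation, and how often. The helper name tags_match is hypothetical, and this is not the check that Pootle or LibreOffice actually performs.

```python
import re
from collections import Counter

TAG = re.compile(r'<[^>]+>')


def tags_match(source, translation):
    """Return True if both strings contain the same tags, equally often."""
    return Counter(TAG.findall(source)) == Counter(TAG.findall(translation))


# 'XXX' stands in for translated text.
print(tags_match('Choose <emph>File</emph>', 'XXX <emph>XXX</emph>'))
print(tags_match('Choose <emph>File</emph>', 'XXX <emph>XXX'))
```

The check ignores tag order, since word order can legitimately differ between languages.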

Regards,
Zeki