Hi Everybody,
Lately I returned to the 'tag elimination' idea because some of our
translators cannot cope with the tags they see in Virtaal. (We are doing
translation offline.) With a little regex from Stack Overflow
<http://stackoverflow.com/a/8784436/1993440>,
I produced the Python script below and got 'clean' text. Here is my entire
'process':
1) po2csv -i x.po -o y.csv => get the CSV
2) Open y.csv in LibreOffice (LO) and delete columns A and C
3) Run clear-tags.py => get y-clean.csv
4) Open y-clean.csv in LO, run =MOD(ROW(L2),2) in a separate column, filter
on it to select the empty rows created by po2csv, and delete those.
Here <https://docs.google.com/file/d/0B1xCocIHKT5jS3FELXhCdVJ6Mm8/edit>
is a sample file I produced this way.
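Steps 2 to 4 could probably be done in Python rather than by hand in LO. The following is only a rough sketch under some assumptions: it is written for Python 3, it assumes the text to keep is in column B (index 1) of the po2csv output, and it treats any row whose kept cell is empty after tag removal as one of the empty rows to drop. The names strip_tags, clean_rows and clean_csv are made up for this example.

```python
import csv
import re

# Same pattern as in clear-tags.py: anything between < and >.
TAG_RE = re.compile(r'<[^>]+>')


def strip_tags(text):
    """Replace every <...> run with a single space."""
    return TAG_RE.sub(' ', text)


def clean_rows(rows, keep_column=1):
    """Yield one-cell rows with tags stripped, skipping empty rows.

    keep_column=1 (column B) is an assumption about the CSV layout.
    """
    for row in rows:
        if len(row) <= keep_column:
            continue  # row too short to have the wanted column
        cell = strip_tags(row[keep_column]).strip()
        if cell:
            yield [cell]


def clean_csv(in_path, out_path, keep_column=1):
    """Read in_path, keep one cleaned column, write it to out_path."""
    with open(in_path, newline='', encoding='utf-8') as src, \
         open(out_path, 'w', newline='', encoding='utf-8') as dst:
        csv.writer(dst).writerows(clean_rows(csv.reader(src), keep_column))
```

Calling clean_csv('y.csv', 'y-clean.csv') would then stand in for the manual column deletion and row filtering. Working cell by cell through the csv module also keeps the regex from matching across cell boundaries, which can happen when it is run over the raw file text.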
As you can see, this is a lot of work to do for so many PO files. But if the
approach is found to be good and correct, I think we can turn it into a bigger
script that can do this repeatedly for a given directory.
"clear-tags.py"
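import re

# Read the whole CSV exported by po2csv.
with open('y.csv') as f:
    text = f.read()

# Replace every <...> tag with a single space.
clean = re.sub(r'<[^>]+>', ' ', text)

# Write the tag-free text to a new file.
with open('y-clean.csv', 'w') as f:
    f.write(clean)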
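
For the 'bigger script' idea, here is a minimal sketch of how the same substitution might be run over every CSV file in a directory. It assumes Python 3, UTF-8 files, and that all the .csv files sit directly in one directory; the function name clear_tags_in_dir and the '-clean' suffix handling are only illustrative.

```python
import re
from pathlib import Path

# Same pattern as in clear-tags.py: anything between < and >.
TAG_RE = re.compile(r'<[^>]+>')


def clear_tags_in_dir(directory):
    """Write X-clean.csv next to every X.csv in directory.

    Returns the list of files written.
    """
    written = []
    # sorted() collects the file list before any new file is created.
    for src in sorted(Path(directory).glob('*.csv')):
        if src.stem.endswith('-clean'):
            continue  # skip output left over from an earlier run
        dst = src.with_name(src.stem + '-clean.csv')
        text = src.read_text(encoding='utf-8')
        dst.write_text(TAG_RE.sub(' ', text), encoding='utf-8')
        written.append(dst)
    return written
```

Something like clear_tags_in_dir('csv-out') would then replace step 3 for a whole batch; the po2csv conversion and the row clean-up would still have to be handled separately.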