Pootle is working again + a new check!

Hi,

All of the struggle in the past 2 days was because I wanted to install
a new check in Pootle. It is "Invalid XML", written by Tamas Zolnai.

The Pootle downtime was caused by a misconfiguration that sent Pootle
into an endless loop. Thanks to julen from the #pootle channel, it has
been fixed. And now I know how to do it better next time. :)

The Invalid XML check is really important. It detects XML errors in the
readme and help. Segments that contain XML errors are ignored by the build
system, so please double-check flagged segments; otherwise those sentences
will be in English in your localization. There are 3 known false positives
that you should ignore (<empty> in Calc help, and <ref> and
<reference> in MediaWiki extension help). Still, it is much better
than the "XML tags" check, which has hundreds of false positives in every
language. I would like to introduce more useful checks in the future.
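
To illustrate why a check like this is strict and why it can still produce false positives, here is a minimal sketch. It assumes the check simply tries to parse each segment as an XML fragment; it is not Tamas Zolnai's actual code, and the function name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(segment: str) -> bool:
    """Return True if the segment parses as an XML fragment.

    Hypothetical helper, not the real Pootle check.
    """
    try:
        # Wrap the segment in a dummy root element so that text with
        # several top-level tags still parses as a single document.
        ET.fromstring("<root>" + segment + "</root>")
        return True
    except ET.ParseError:
        return False


# A properly closed help tag passes.
print(is_well_formed('<ahelp hid=".">Click OK</ahelp>'))
# A missing closing tag is a real error.
print(is_well_formed('<ahelp hid=".">Click OK'))
# Text that merely talks about HTML tags looks broken to the parser.
print(is_well_formed('marked by <h1> and </h1>, or <br> for a line break'))
```

The last call shows how a sentence that only mentions tags such as <br> ends up flagged even though the translation is fine.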

Best regards,
Andras

Hi Andras, could you tell me what the problem is with this string?

<link href="text/shared/00/00000002.xhp#html" name="HTML">HTML</link> pages
contain certain structural and formatting instructions called tags. Tags
are code words enclosed by brackets in the document description language
HTML. Many tags contain text or hyperlink references between the opening
and closing brackets. For example, titles are marked by the tags <h1> at
the beginning and </h1> at the end of the title. Some tags only appear on
their own such as <br> for a line break or <img ...> to link a graphic.

As páginas <link href="text/shared/00/00000002.xhp#html"
name="HTML">HTML</link> contêm determinadas instruções estruturais e de
formatação designadas por controlos. Os controlos são palavras de código,
delimitadas por parênteses, na linguagem de descrição de documentos HTML.
Muitos controlos contêm referências a hiperligações e texto entre o
parêntese de abertura e o parêntese de fecho. Por exemplo, os títulos estão
assinalados pelos controlos <h1> no início e </h1> no fim do título. Alguns
controlos aparecem isolados, tais como <br> para uma quebra de linha, ou
<img ...> para ligar a um objeto gráfico.

Regards

Nothing is wrong with this. One more false positive. You can remove
the flag by clicking on it at the "Failing checks:" area, and you are
done.

Best regards,
Andras

Great. Thanks.

Hi Andras,

I have a string too:

https://translations.documentfoundation.org/tr/libo36x_help/translate.html#unit=24473556

In this string there should be no errors (unless my eyes deceive me).
However, it gives an XML error, and this error is also shown in the
error.log for tr which you sent me.

I see a closing tag for switchinline in the outdated German and
English (UK) translations. Could it be an error in the English source
string?

Best regards,
Zeki

Hi Andras

Thank you and Julen for your hard work!

Pootle is a critical tool for our small team, and I think it's important to
let you know how much we appreciate the people who "make it happen".

Regards,

On 05/12/2012 08:07 PM, Andras Timar wrote:

Hi,

There are 3 known false positives,
that you should ignore (<empty> in Calc help, and <ref> and
<reference> in MediaWiki extension help). Still, it is much better
than "XML tags" check, which has hundreds of false positives in every
language. I would like to introduce more useful checks in the future.

In UI I can see the following critical errors:
1) For invalid XML
Note: The transformation uses the new style of footnotes with <ref> and <references> tags that requires the Cite.php extension to be installed into MediaWiki. If those tags occur as plain text in the transformation result, ask the Wiki administrator to install this extension.
Σημείωση: Η μεταμόρφωση χρησιμοποιεί τη νέα τεχνοτροπία υποσημειώσεων με ετικέτες <ref> και <references> που απαιτούν την επέκταση Cite.php για να εγκατασταθούν στο MediaWiki. Εάν αυτές οι ετικέτες εμφανίζονται ως απλό κείμενο στο αποτέλεσμα του μετασχηματισμού, ζητήστε από τον διαχειριστή να εγκαταστήσει την επέκταση.
Is <references> also among the tags to be ignored?
2) In placeholders
#NAME? - #ΟΝΟΜΑ?, #MACRO? - #ΜΑΚΡΟΕΝΤΟΛΗ?
Action [Time]: [1]. [2] - Ενέργεια [Χρόνος]: [1]. [2]
[None] - [Κανένα] [User] - [Χρήστης]
Are they wrong?

Hi Zeki,

I don't see the XML error in this segment; you have probably corrected
it in the meantime. But I don't see the +C in the translation. The shortcut is
Ctrl+Alt+C.

English source never contains XML errors. The help would not compile otherwise.

Best regards,
Andras

Hi Dimitris,

In UI I can see the following critical errors:
1) For invalid XML
Note: The transformation uses the new style of footnotes with <ref> and
<references> tags that requires the Cite.php extension to be installed into
MediaWiki. If those tags occur as plain text in the transformation result,
ask the Wiki administrator to install this extension.
Σημείωση: Η μεταμόρφωση χρησιμοποιεί τη νέα τεχνοτροπία υποσημειώσεων με
ετικέτες <ref> και <references> που απαιτούν την επέκταση Cite.php για να
εγκατασταθούν στο MediaWiki. Εάν αυτές οι ετικέτες εμφανίζονται ως απλό
κείμενο στο αποτέλεσμα του μετασχηματισμού, ζητήστε από τον διαχειριστή να
εγκαταστήσει την επέκταση.
Is <references> also among the tags to be ignored?
2) In placeholders
#NAME? - #ΟΝΟΜΑ?, #MACRO? - #ΜΑΚΡΟΕΝΤΟΛΗ?
Action [Time]: [1]. [2] - Ενέργεια [Χρόνος]: [1]. [2]
[None] - [Κανένα] [User] - [Χρήστης]
Are they wrong?

Pootle's checks are simple (and stupid). For example, in some contexts
text between [ ] brackets is a placeholder, while in other contexts [ ]
brackets have no special meaning; they are just plain text. In .ulf
files everything between [ ] should be left in English. So do not
translate [Time], [Date] etc. On the other hand, [None] and [User]
come from .src files, and they are translatable. #NAME? and #MACRO?
are spreadsheet error codes, and they are translatable.

It would be good to collect the most annoying false positives and
design new checks (in the Wiki, for example). When there are 1000 false
positives, no one will go through them. We need to make Pootle's checks
useful. It involves some programming work, but the main task is the
design. If we work out the rules, anyone with medium-level Python
skills can implement them.
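
As an example of such a rule, here is a rough sketch of a bracket check that takes the file type into account. It assumes the check is told which kind of source file a string came from; the function name and the `filetype` parameter are hypothetical and not part of Pootle.

```python
import re

# Anything between [ ] that does not itself contain brackets.
BRACKETED = re.compile(r"\[[^\[\]]+\]")


def bracket_check_passes(source: str, target: str, filetype: str) -> bool:
    """Hypothetical context-aware placeholder check.

    In .ulf strings the bracketed tokens must be copied unchanged.
    In other files (for example .src) brackets are plain text, so
    nothing is flagged.
    """
    if filetype != "ulf":
        return True
    return sorted(BRACKETED.findall(source)) == sorted(BRACKETED.findall(target))


# [Time] was translated in a .ulf string: flagged.
print(bracket_check_passes("Action [Time]: [1]. [2]",
                           "Ενέργεια [Χρόνος]: [1]. [2]", "ulf"))
# [None] comes from a .src file and is translatable: not flagged.
print(bracket_check_passes("[None]", "[Κανένα]", "src"))
```

With a rule like this, the translatable [None] and [User] strings would no longer show up as failing, while a translated [Time] in a .ulf file still would.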

Best regards,
Andras

On 06 Dec 2012 09:26, "Andras Timar" <timar74@gmail.com> wrote:

Hi Zeki,

I don't see the XML error in this segment, probably you have corrected
it meanwhile. But I don't see the +C in translation. The shortcut is
Ctrl+Alt+C.

Oh!

I still haven't gotten used to the new interface. I missed the horizontal
toolbar in the source text box.

Sorry for the inconvenience :)
Zeki

On 06/12/2012 09:44 AM, Andras Timar wrote:

Some more problems
In help I can see critical errors like these.
1. In XML Tags
1.1
<ahelp hid=".">Select a field and click < to remove it from the list of primary key fields. The primary key is created as a concatenation of the fields in this list, from top to bottom.</ahelp>
<ahelp hid=".">Επιλέξτε ένα πεδίο και πατήστε < για να το αφαιρέσετε από τη λίστα με τα πεδία του πρωτεύοντος κλειδιού. Το πρωτεύον κλειδί δημιουργείται ως συνένωση των πεδίων σε αυτή τη λίστα, από πάνω μέχρι κάτω.</ahelp>
I suppose it is the <. Should I leave it as is?
1.2 Click the Date Area and move the time and date field. Select the <date/time> field and apply some formatting to change the format for the date and time on all slides. The same applies to the Footer Area and the Slide Number Area.
Πατήστε στην περιοχή ημερομηνίας και μετακινήστε το πεδίο της ώρας και της ημερομηνίας. Επιλέξτε το πεδίο <ημερομηνία/ώρα> και εφαρμόστε κάποια μορφοποίηση για να αλλάξετε τη μορφή για την ημερομηνία και την ώρα σε όλες τις διαφάνειες. Το ίδιο εφαρμόζεται και στις περιοχές υποσέλιδο και αριθμός διαφάνειας.
I suppose it is <date/time> - <ημερομηνία/ώρα>. Should I leave it as is?
2. In Invalid XML
2.1
<link href="text/shared/00/00000002.xhp#html" name="HTML">HTML</link> pages contain certain structural and formatting instructions called tags. Tags are code words enclosed by brackets in the document description language HTML. Many tags contain text or hyperlink references between the opening and closing brackets. For example, titles are marked by the tags <h1> at the beginning and </h1> at the end of the title. Some tags only appear on their own such as <br> for a line break or <img ...> to link a graphic.
<link href="text/shared/00/00000002.xhp#html" name="HTML">HTML</link> Οι σελίδες περιέχουν συγκεκριμένες δομικές και μορφοποιητικές οδηγίες που ονομάζονται ετικέτες. Οι ετικέτες είναι κωδικές λέξεις που περικλείονται από γωνιακές αγκύλες στη γλώσσα περιγραφής εγγράφου HTML. Πολλές ετικέτες περιέχουν παραπομπές κειμένου ή υπερσύνδεσης μεταξύ της αρχικής και της τελικής γωνιακής αγκύλης . Για παράδειγμα, οι τίτλοι επισημαίνονται με τις ετικέτες <h1> στην αρχή και </h1> στο τέλος του τίτλου. Ορισμένες ετικέτες εμφανίζονται μόνο από μόνες τους όπως η <br> για την αλλαγή γραμμής ή η <img ...> για τη σύνδεση με γραφικό.
Is it the <img ...>? The <h1> at the beginning and </h1>? What should I do?
3. In placeholders
3.1
Blue Function [Runtime] Συνάρτηση Blue [Χρόνου εκτέλεσης] (problem with brackets, too). There are dozens of them.
4.
The last one refers to help in 3.6 in Greek. I can see the Contents in Greek, but nothing works in Greek with index and Find.

Some more problems
In help I can see critical errors like these.
1. In XML Tags
1.1
<ahelp hid=".">Select a field and click < to remove it from the list
of primary key fields. The primary key is created as a concatenation
of the fields in this list, from top to bottom.</ahelp>
<ahelp hid=".">Επιλέξτε ένα πεδίο και πατήστε < για να το αφαιρέσετε
από τη λίστα με τα πεδία του πρωτεύοντος κλειδιού. Το πρωτεύον κλειδί
δημιουργείται ως συνένωση των πεδίων σε αυτή τη λίστα, από πάνω μέχρι
κάτω.</ahelp>
I suppose it is the <. Should I leave it as is?
1.2 Click the Date Area and move the time and date field. Select the
<date/time> field and apply some formatting to change the format for
the date and time on all slides. The same applies to the Footer Area
and the Slide Number Area.
Πατήστε στην περιοχή ημερομηνίας και μετακινήστε το πεδίο της ώρας και
της ημερομηνίας. Επιλέξτε το πεδίο <ημερομηνία/ώρα> και εφαρμόστε
κάποια μορφοποίηση για να αλλάξετε τη μορφή για την ημερομηνία και την
ώρα σε όλες τις διαφάνειες. Το ίδιο εφαρμόζεται και στις περιοχές
υποσέλιδο και αριθμός διαφάνειας.
I suppose it is <date/time> - <ημερομηνία/ώρα>. Should I leave it as is?

In the help source, < and > in the text are encoded as &lt; and &gt;. When we
extract the strings, we convert them to readable characters, and on merge
we convert them back. Pootle cannot understand the help XML syntax, so it
thinks that <date/time> is a tag. We could teach Pootle to understand
our files better, but then we would lose simplicity and speed.
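
The round trip can be pictured roughly as follows. This is only a sketch under assumptions: the real extract and merge tools are not shown in this thread, the names `extract` and `merge` are hypothetical, and the tag list is just the handful of help tags mentioned here, not the complete set.

```python
import re

# Illustrative subset of help tags seen in this thread (assumption).
KNOWN_TAGS = ("ahelp", "link", "switchinline")
TAG_RE = re.compile(r"</?(?:%s)\b[^<>]*>" % "|".join(KNOWN_TAGS))


def extract(source_text: str) -> str:
    """Turn encoded angle brackets into readable characters for translators."""
    return source_text.replace("&lt;", "<").replace("&gt;", ">")


def _escape(text: str) -> str:
    return text.replace("<", "&lt;").replace(">", "&gt;")


def merge(translated: str) -> str:
    """Re-encode angle brackets, leaving known help tags untouched."""
    parts, pos = [], 0
    for match in TAG_RE.finditer(translated):
        parts.append(_escape(translated[pos:match.start()]))  # plain text
        parts.append(match.group(0))                          # real tag
        pos = match.end()
    parts.append(_escape(translated[pos:]))
    return "".join(parts)


source = '<ahelp hid=".">Select the &lt;date/time&gt; field.</ahelp>'
shown_in_pootle = extract(source)
print(shown_in_pootle)
print(merge(shown_in_pootle) == source)
```

Once the string is extracted, <date/time> looks exactly like a tag to any generic tag check, which is why the "XML tags" check flags it.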

2. In Invalid XML
2.1
<link href="text/shared/00/00000002.xhp#html" name="HTML">HTML</link>
pages contain certain structural and formatting instructions called
tags. Tags are code words enclosed by brackets in the document
description language HTML. Many tags contain text or hyperlink
references between the opening and closing brackets. For example,
titles are marked by the tags <h1> at the beginning and </h1> at the
end of the title. Some tags only appear on their own such as <br> for
a line break or <img ...> to link a graphic.
<link href="text/shared/00/00000002.xhp#html" name="HTML">HTML</link>
Οι σελίδες περιέχουν συγκεκριμένες δομικές και μορφοποιητικές οδηγίες
που ονομάζονται ετικέτες. Οι ετικέτες είναι κωδικές λέξεις που
περικλείονται από γωνιακές αγκύλες στη γλώσσα περιγραφής εγγράφου
HTML. Πολλές ετικέτες περιέχουν παραπομπές κειμένου ή υπερσύνδεσης
μεταξύ της αρχικής και της τελικής γωνιακής αγκύλης . Για παράδειγμα,
οι τίτλοι επισημαίνονται με τις ετικέτες <h1> στην αρχή και </h1> στο
τέλος του τίτλου. Ορισμένες ετικέτες εμφανίζονται μόνο από μόνες τους
όπως η <br> για την αλλαγή γραμμής ή η <img ...> για τη σύνδεση με
γραφικό.
Is it the <img ...>? The <h1> at the beginning and </h1>? What should I do?

False positive, ignore.

3. In placeholders
3.1
Blue Function [Runtime] Συνάρτηση Blue [Χρόνου εκτέλεσης] (problem
with brackets, too). There are dozens of them.

Help has very few placeholders, $[officename] and %PRODUCTNAME come to
my mind. [Runtime] is not a placeholder, it is an adjective.
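
A help-specific placeholder check could therefore be much narrower than the generic one. The sketch below only knows the two placeholders named above and is an illustration, not an existing Pootle check; the function name is hypothetical.

```python
import re

# Only the two help placeholders mentioned above (assumption: there may be more).
HELP_PLACEHOLDER = re.compile(r"\$\[officename\]|%PRODUCTNAME")


def missing_placeholders(source: str, target: str) -> list:
    """Return the help placeholders present in source but absent from target."""
    wanted = set(HELP_PLACEHOLDER.findall(source))
    present = set(HELP_PLACEHOLDER.findall(target))
    return sorted(wanted - present)


# [Runtime] is ordinary text, so translating it is not reported.
print(missing_placeholders("Blue Function [Runtime]",
                           "Συνάρτηση Blue [Χρόνου εκτέλεσης]"))
# A dropped $[officename] is reported.
print(missing_placeholders("$[officename] Help", "Βοήθεια"))
```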

4.
The last one refers to help in 3.6 in Greek. I can see the Contents in
Greek, but nothing works in Greek with index and Find.

Please report this to Bugzilla.

Thanks,
Andras