[ANNOUNCE] Locale data date acceptance patterns, localizers HEADS UP please :)

Hi Eike,

I just noticed that locale data apart from Date Patterns also needs to
be updated for the language (Gujarati - gu_IN) I co-ordinate. Will it be
possible for you to update everything together? I can pass the updated
gu_IN.xml file.

Thanks!

Hi, Eike,

For zh-CN, these patterns are commonly used by typists:

M-D
M.D
M/D
Y-M-D
Y.M.D
Y/M/D

Best wishes,
Dean (@xslidian)

For Belarusian, D.M with no more than two digits per part might do (is the two-digit limit "enforceable"?).

Actually, it'd be better to have possibility of switching off the feature altogether, "across the installation", as the traditional fractional part separator /comma/ tends quite often to be substituted by the /dot/, in these times.

Historically, it was often D/M (D in Arabic numerals, M in Roman).

Yury

In order to get rid of the annoying "accept every input as date that
might resemble some date in almost any locale" behavior I recently
implemented locale dependent date acceptance patterns that need to be

...

Hi Yury,

For Belarusian, D.M with no more than two digits per part might do
(is the two-digit limit "enforceable"?).

The pattern is just a prerequisite, if the number input doesn't form
a valid date it doesn't lead to a date even if the pattern was matched,
so anything like 32.13 already wouldn't be a valid date.
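
To illustrate the two-step check described above (pattern match first, date validity second), here is a minimal Python sketch. It is not the LibreOffice implementation; the function names and the idea of filling missing parts from "today" are assumptions made purely for illustration.

```python
import re
from datetime import date

def pattern_to_regex(pattern):
    """Turn an acceptance pattern such as 'D.M' into a regex with named groups."""
    parts = []
    for ch in pattern:
        if ch in "DMY":
            parts.append(f"(?P<{ch}>\\d+)")   # one numeric part
        else:
            parts.append(re.escape(ch))       # literal separator
    return re.compile("".join(parts))

def parse_with_pattern(text, pattern, today):
    """Return a date only if text matches the pattern AND forms a valid date."""
    m = pattern_to_regex(pattern).fullmatch(text)
    if m is None:
        return None                           # pattern not matched at all
    g = m.groupdict()
    # Assumption: parts missing from the pattern are taken from 'today'.
    year = int(g["Y"]) if "Y" in g else today.year
    month = int(g["M"]) if "M" in g else today.month
    day = int(g["D"]) if "D" in g else today.day
    try:
        return date(year, month, day)
    except (ValueError, OverflowError):
        return None                           # matched, but not a valid date

today = date(2012, 1, 12)
print(parse_with_pattern("11.2", "D.M", today))
print(parse_with_pattern("32.13", "D.M", today))
```

In this sketch 32.13 matches D.M but is rejected in the second step, so no explicit two-digit limit is needed.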

Actually, it'd be better to have possibility of switching off the
feature altogether, "across the installation", as the traditional
fractional part separator /comma/ tends quite often to be
substituted by the /dot/, in these times.

I'm not sure I understand. If you're saying that people tend to input
decimal numbers as #.# instead of #,# then better not define the D.M
date acceptance pattern to prevent confusion, and a #.# input will just
be a textual string. Is that what you meant?

Historically, it was often D/M (D in Arabic numerals, M in Roman).

With roman numbers that wouldn't work. We could define a D/M pattern,
but input would have to be in Arabic numerals.

  Eike

Hi Ankitkumar,

I just noticed that locale data apart from Date Patterns also needs to
be updated for the language (Gujarati - gu_IN) I co-ordinate. Will it be
possible for you to update everything together? I can pass the updated
gu_IN.xml file.

Yes, of course, that's fine. If possible I'd prefer a diff against the
latest revision in the repository though instead of the entire file.
Always better in case other changes went in in the meantime.

  Eike

Hi Serg,

I think for ru_RU we need "D/M/" and "D.M.".

Note that ru-RU doesn't define any D/M/Y format, however, I added the
D/M/ input pattern as requested.
http://cgit.freedesktop.org/libreoffice/core/commit/?id=054b910f72de25e085f1fd54e37118503cd5a527

Thanks
  Eike

Hi Modestas,

please add M-D pattern for Lithuanian (lt). This should be converted into
two-digit numbers and result in YYYY-MM-DD. Thanks.

Done
http://cgit.freedesktop.org/libreoffice/core/commit/?id=87b12c42717a3488cc179d07cf42ace2335aced4

Thanks
  Eike

Hi Michael,

For gd, the patterns are D.M, D/M and D-M (technically gd-GB but I
think right now we've only got gd and aren't separating gd-GB and
gd-CA)

We only have gd-GB.

Note that D.M can't be used as it would be a decimal number. D.M. would
be possible, but I don't know if that's used. I added D/M and D-M for
now.
http://cgit.freedesktop.org/libreoffice/core/commit/?id=197c0ef041fb92ba08d6440258e59485c81214a2
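
As a rough illustration of why D.M is ruled out here while D.M. is not, the following Python sketch checks whether a pattern consists of exactly two numeric parts joined by the locale's decimal separator. The function name is hypothetical and the rule is a simplification of the reasoning above, not code taken from LibreOffice.

```python
def looks_like_decimal(pattern, decimal_separator):
    """True if every input matching the pattern would also read as a decimal number.

    Simplified rule (an assumption of this sketch): that is the case when the
    pattern is exactly two numeric parts joined by the decimal separator.
    """
    parts = pattern.split(decimal_separator)
    return len(parts) == 2 and all(p in ("D", "M", "Y") for p in parts)

# Locale with '.' as decimal separator: D.M collides, the others do not.
for p in ("D.M", "D.M.", "D/M", "D-M", "Y.M.D"):
    print(p, looks_like_decimal(p, "."))

# Locale with ',' as decimal separator: D.M does not collide.
print("D.M", looks_like_decimal("D.M", ","))
```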

Thanks
  Eike

Hi Freek,

For nl, if needed both nl_NL and nl_BE there are two patterns:

D/M and D-M

Added D-M to nl-NL
http://cgit.freedesktop.org/libreoffice/core/commit/?id=741282c6e8d82c31c534b2e30dbc4ba8ca21299e
and D/M to nl-BE
http://cgit.freedesktop.org/libreoffice/core/commit/?id=abfbda01c8faa06f7c97692c25be0e8419cd2c48

Thanks
  Eike

Hi Dean,

For zh-CN, these patterns are commonly used by typists:

M-D
M.D
M/D
Y-M-D
Y.M.D
Y/M/D

Note that "M.D" is not possible as it would be the input of a decimal
number, "Y.M.D" is fine. "Y-M-D" is the always accepted ISO 8601 input
format and doesn't need to be explicitly mentioned. Added all others.
I also added "M/D" with U+FF0F FULLWIDTH SOLIDUS as that is used as the
locale's date separator. And I added "Y年M月D日" and "M月D日" as well as
they are defined in the date formats, I hope that was correct..
http://cgit.freedesktop.org/libreoffice/core/commit/?id=605707652afebf0e5c90311adcc7767ebe807e45

Thanks
  Eike

Hi Milos,

I found several incorrect items in sk_SK.xml. When corrected, may I send
it to you, too?

Yes, of course.

Is it somehow possible to test such file before sending it?

Only by building LibO, as the xml files are converted to binary data
during the build process. The mere syntactical XML correctness can be
checked with a validator, but that doesn't detect any problems regarding
content, see i18npool/source/localedata/data/locale.dtd

Regarding Acceptance patterns. Currently, we have in Slovak:
"11." converts to 11.02.12
"11.2" converts to 11.02.12
Which is OK, I just would prefer 11.02.2012

Then you need a different default medium format. Currently FormatElement
with formatindex="20" has default="true"; that would need to be changed to
default="false", and FormatElement with formatindex="21" set to
default="true" instead.

For "11." and "11.2" you'd need to add

<DateAcceptancePattern>D.</DateAcceptancePattern>
<DateAcceptancePattern>D.M</DateAcceptancePattern>

though past user experience was that the D. input converted to date
sometimes was considered annoying, also D.M, but of course that differs
by cultural background.

"11-2" also converts to 11.02.12, but I would prefer 2012-01-11
Would that be possible?

No, only a full ISO 8601 Y-M-D date input will display as YYYY-MM-DD if
the locale's format is different.

Both forms (with . and -) are OK according to
Slovak standard, so it makes sense to have both these possibilities.
In fact, no format is defined with the <DateAcceptancePattern> tag.
Which one is used then? The default one?

As written in my blog article the default full date acceptance pattern
is generated from FormatElement formatindex="21", for sk-SK that's
DD.MM.YYYY so the pattern is D.M.Y
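
The derivation of the default pattern from the format code can be pictured with a small Python sketch. It only handles simple numeric format codes like the ones mentioned in this thread, the function name is made up for illustration, and the real conversion happens inside LibreOffice and may differ in detail.

```python
import re

def acceptance_pattern_from_format(format_code):
    """Collapse each run of D, M or Y in a numeric date format code to one letter."""
    return re.sub(r"D+|M+|Y+", lambda m: m.group(0)[0], format_code)

print(acceptance_pattern_from_format("DD.MM.YYYY"))
print(acceptance_pattern_from_format("DD/MM/YYYY"))
```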

In fact, can we have different default formats for Calc and for Writer?

No.

This makes sense for me - in Calc I find 11.02.2012 more appropriate,
but in Writer
"11. február 2012" should be preferred.

And the last one: In "11. február 2012", "február" is in the nominative
case. However we should use genitive: "februára". Is it possible to
specify that?
Currently, we misuse the Month tag
<Month>
          <MonthID>feb</MonthID>
          <DefaultAbbrvName>február</DefaultAbbrvName>
          <DefaultFullName>februára</DefaultFullName>
</Month>
but it leads often to nonsense.

That's now possible :-) see
http://erack.org/blog/archives/2-LibreOffice-possessive-genitive-case-and-partitive-case-month-names.html

  Eike

Hi

I'm currently looking into the Swedish (sv-SE) date patterns. I'm not quite sure how to deal
with a few things. In Sweden the official standard for time is "H.MM" (or "HH.MM").
It is also very common and accepted to use the international standard with : as a
separator. It is so common that it was actually changed to this in OpenOffice.org and
inherited by LibreOffice.

Is it possible to accept both . and : as separators?

Common dateformats that LibreOffice doesn't support for Swedish include:
D/M
D/M YYYY

EU standard (not seen that often, but since we're in the EU and all EU documents
use this format it would be a good thing to support):
D.M
D.M.YYYY

The SS-ISO 8601 standard YYYY-MM-DD is the most frequently used format and
the current standard for Swedish in LibreOffice. Is it possible to support all
these formats?

When it comes to date acceptance I'd say that maybe D/M might be good to accept
besides the ISO standard of course. I do not think that it is a good idea to accept
the D.M though.

I did a quick test modifying the sv_SE.xml but got stuck building afterwards.

I suppose my later questions all depend if there is any possibility to support
multiple separators.

Thank you for any guidance
Niklas Johansson

Eike Rathke wrote on 2012-01-12 14:43:

Hello,

I don't understand ...

What I need to change for kab_DZ ???

Actually, it'd be better to have possibility of switching off the
feature altogether, "across the installation", as the traditional
fractional part separator /comma/ tends quite often to be
substituted by the /dot/, in these times.

I'm not sure I understand. If you're saying that people tend to input
decimal numbers as #.# instead of #,# then better not define the D.M
date acceptance pattern to prevent confusion, and a #.# input will just
be a textual string. Is that what you meant?

No, no. Just that these days some people (who've taken pains to learn computers, for example) tend to use *both* the traditional /comma/ and the non-traditional /dot/ indiscriminately. Sort of, putting /dot/ in the decimal separator place, because "it's so 'in computer'".

E.g., once upon a time I myself wanted a quick spreadsheet or something, and began to enter numbers with /dot/ ("what blessed difference does it make?"). Imagine my annoyance when my inputs were one by one converted to dates (what about a user working on an installation without a favourite locale?). And then I had to hunt for the switch-off, because the first place I turned to was Preferences->OOO->General, and there wasn't anything about this besides the "interpret two-digit year as".

Now, I've noticed that guys from ru_RU team requested "D/M/" and "D.M.", precisely for that reason, I believe (/slash/ is, of course, a traditional symbol for a fraction or division sign; single /dot/ is handier to have just as a fractional part separator).

But, in fact, both forms are somewhat counterproductive, because these 1) are not guessable and will require separate learning by the user (nobody without specifically learning those will enter one of the sequences deliberately), and 2) are anyway two digits away from the short form of the date (DD.MM.YY).

Of course, if the functionality is there, anyway, and has to be "fed" something, even such not-quite-intuitive forms will do. *In fact, I hereby request "D/M/" and "D.M." for the be_BY, please.*

But it would be ever so better to have a possibility for computer to not second-guess at all, as such guesses might even be culturally irrelevant.

Historically, it was often D/M (D in Arabic numerals, M in Roman).

With roman numbers that wouldn't work. We could define a D/M pattern,
but input would have to be in Arabic numerals.

No need to, as it was historical example, anyway.

Yury

Hi Jean-Baptiste,

I think for fr_FR we need "D/M".
Asked for other FR variants on discuss@fr ML.

Any news on that? If not, I'll just add D/M

  Eike

Hi Yury,

Of course, if the functionality is there, anyway, and has to be
"fed" something, even such not-quite-intuitive forms will do.

It doesn't _have_ to be fed something, without a specific pattern only
input of a full date (here D.M.Y) yields a date and input of incomplete
dates will not be possible.

*In fact, I hereby request "D/M/" and "D.M." for the be_BY, please.*

No problem, would do, but ...

But it would be ever so better to have a possibility for computer to
not second-guess at all, as such guesses might even be culturally
irrelevant.

... I'm confused now, does be-BY want incomplete date patterns, yes or
no?

  Eike

Hi Niklas,

[... time separator ...]
Is it possible to accept both . and : as separators?

No, only one is possible.

Common dateformats that LibreOffice doesn't support for Swedish include:
D/M
D/M YYYY

EU standard (not seen that often, but since we're in the EU and all EU documents
use this format it would be a good thing to support):
D.M
D.M.YYYY

Which standard would that be? I only know of EN 28601 that followed ISO 8601.

The SS-ISO 8601 standard YYYY-MM-DD is the most frequently used format and
the current standard for Swedish in LibreOffice. Is it possible to support all
these formats?

You can define as many date formats (to display dates) as you want by just
adding them to the locale data file starting with formatindex="50"; too
many formats in the dialog may just confuse users though.

For date acceptance patterns, as long as the DMY order stays the same
within one locale also multiple separators can be used, ISO 8601 and its
YMD order is always supported internally, no need to define anything for
that.
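
One way to read the "same DMY order" rule is that day and month must appear in the same relative order in every pattern of a locale. The Python sketch below encodes that reading; it is an interpretation for illustration only, with hypothetical function names, and is not taken from the LibreOffice sources.

```python
def day_before_month(pattern):
    """True if D comes before M in the pattern (both must be present)."""
    return pattern.index("D") < pattern.index("M")

def consistent_order(patterns):
    """True if all patterns containing both D and M agree on their relative order."""
    orders = {day_before_month(p) for p in patterns if "D" in p and "M" in p}
    return len(orders) <= 1

# Several separators with one order are fine under this reading.
print(consistent_order(["D.M.Y", "D/M", "D-M"]))
# Mixing a Y-M-D locale format with a D/M pattern changes the order.
print(consistent_order(["Y-M-D", "D/M"]))
```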

  Eike

Hi Aferkiw,

I don't understand ...

What I need to change for kab_DZ ???

kab-DZ uses DD/MM/YYYY as edit date format, so D/M/Y is generated as
date acceptance pattern. If additionally the input of incomplete dates,
consisting of day and month only, shall be accepted, it would need
a defined D/M or D/M/ pattern, probably D/M

  Eike

...

But it would be ever so better to have a possibility for computer to
not second-guess at all, as such guesses might even be culturally
irrelevant.

... I'm confused now, does be-BY want incomplete date patterns, yes or
no?

Yes. Sorry.

And also it wants a possibility to switch off "incomplete date recognition" completely? Is this doable?

Thanks!

Yury

Hi Eike

[... time separator ...]
Is it possible to accept both . and : as separators?

No, only one is possible.

OK, good to know.

EU standard (not seen that often, but since we're in the EU and all EU documents
use this format it would be a good thing to support):
D.M
D.M.YYYY

Which standard would that be? I only know of EN 28601 that followed ISO 8601.

Sorry, bad choice of words. The source of information that I used are the recommendations
that Språkrådet (the Swedish language council) gives. In their book "Svenska skrivregler"
they state that the format D.M.YYYY is used in all EU-documents regardless of language.
After some searching on the net I found this link:
http://publications.europa.eu/code/en/en-4100500en.htm
And a few more (Swedish and English) that said something similar to
"Dates in the text should always be given in their full form (6 June 1992), whereas in footnotes they should always be abbreviated, i.e. 6.6.1992, not 6.6.92"

You can define as many date formats (to display dates) as you want by just
adding them to the locale data file starting with formatindex="50"; too
many formats in the dialog may just confuse users though.

Thank you, I'll have a closer look. Did a quick test build and was able to add D/M YYYY.
I'll discuss things further on the Swedish list and get back to you as soon as possible.

For date acceptance patterns, as long as the DMY order stays the same
within one locale also multiple separators can be used, ISO 8601 and its
YMD order is always supported internally, no need to define anything for
that.

OK, in other words D/M might be a problem for us since the most common format is YYYY-MM-DD.
I don't think it really matters that much, the most important (commonly used) formats
are already supported.