Accents in spelling check

I was dismayed to discover recently (and surprised that I never noticed
before) that Libre Office disregards the difference between, say, á and
a in spelling checks, and in find and replace. (Dismayed also to
imagine that someone thinks this is a good idea.)

I wonder if anyone knows how (or whether) I could fix this. Or perhaps
it’s not in Libre Office at all but in Hunspell, or otherwise deep down
in the system. I’ve done some searching but can’t find any relevant
information.

In case it matters, I’m using Libre Office 6.0.5.2 on Ubuntu Mate 18.04.

Many thanks for any help.

On Thu, 2 Aug 2018 21:55 Séamas Ó Brógáin, <sob@leabhair.ie> wrote:

I was dismayed to discover recently (and surprised that I never noticed
before) that Libre Office disregards the difference between, say, á and
a in spelling checks, and in find and replace. (Dismayed also to
imagine that someone thinks this is a good idea.)

I wonder if anyone knows how (or whether) I could fix this. Or perhaps
it’s not in Libre Office at all but in Hunspell, or otherwise deep down
in the system. I’ve done some searching but can’t find any relevant
information.

In case it matters, I’m using Libre Office 6.0.5.2 on Ubuntu Mate 18.04.

Many thanks for any help.

If I enter the string 'éeéeéeéeéeéeéeée' and do Find (or Find and Replace) for either 'é' or 'e', it picks up only the appropriate form. And if I enter the text 'hélp help', then 'hélp' is flagged by the spell checker but 'help' is not. So I don't seem to be seeing what you describe? I'm using LO 5.2.7.2 under Debian Linux with the language set to English (Canada).
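The same check can be reproduced outside LibreOffice. The following is a minimal Python sketch, not LibreOffice's search code: it contrasts an exact (accent-sensitive) count with an accent-folding count that strips combining marks first. The function names are made up for illustration, and the sample string only follows the pattern of the test string above.

```python
import unicodedata


def strip_marks(text):
    """Remove combining marks after canonical decomposition (é becomes e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def count_exact(haystack, needle):
    """Accent-sensitive count: é and e are treated as different characters."""
    h = unicodedata.normalize("NFC", haystack)
    n = unicodedata.normalize("NFC", needle)
    return h.count(n)


def count_folded(haystack, needle):
    """Accent-insensitive count: both sides are folded before comparing."""
    return strip_marks(haystack).count(strip_marks(needle))


sample = "ée" * 8  # alternating é and e, as in the test string
print(count_exact(sample, "é"), count_exact(sample, "e"), count_folded(sample, "e"))
```

A search that behaves like `count_exact` matches what Robert sees; one that behaves like `count_folded` would match the behaviour described in the original post.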

- Robert

Well, there's a Hunspell feature triggered by the ICONV replacement table, but caveats exist.

For entry

ICONV á a

Hunspell (and LO, since it uses Hunspell) will treat aáa the same as aaa and will not flag it as misspelled.
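To make the effect concrete, here is a toy Python model of what an input-conversion table does before the dictionary lookup. It is an illustration under stated assumptions, not Hunspell's implementation: the one-word word list, the `ICONV_TABLE` dict and the `is_known` helper are all hypothetical.

```python
# Toy model: an ICONV-style table rewrites the input word before the
# dictionary lookup, so accented and unaccented spellings collapse.
ICONV_TABLE = {"á": "a"}  # mirrors the entry "ICONV á a"
WORDLIST = {"aaa"}        # hypothetical dictionary containing one word


def convert_input(word, table):
    """Apply each replacement in the table to the input word."""
    for pattern, replacement in table.items():
        word = word.replace(pattern, replacement)
    return word


def is_known(word, use_iconv=True):
    """Return True if the (optionally converted) word is in the word list."""
    candidate = convert_input(word, ICONV_TABLE) if use_iconv else word
    return candidate in WORDLIST


print(is_known("aáa"))                   # checked after á is converted to a
print(is_known("aáa", use_iconv=False))  # checked as typed
```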

The letter á should be added to the TRY section of the Hunspell dictionary file, and words with á should be added to the word list, so that a (possibly misspelled) aaa can be corrected to aáa.

You can't get perfect recognition, as that would mean adding an accented form of every word in your language to the dictionary, and that's unbearable in terms of dictionary maintenance.

What you could do is add that ICONV replacement table to your dictionary so that accented words get ignored and aáa is not flagged, but for aáb you'll still get the suggestion aaa when what you really want is aáa.

If words in your language are not generally written with accents, then you need to find a workaround. An accented letter is still a different letter from the plain one (a ~ á), and Hunspell can't tell whether it's an accent or a letter. In Croatian, c ~ č ~ ć, and neither of the last two is accented; mixing them up is a legitimate (potential) spelling mistake.
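The point that software cannot tell an "accent" from a "letter" also shows up at the Unicode level. A small Python sketch using the standard `unicodedata` module: á, č and ć all decompose into a base letter plus a combining mark, so nothing in the character data says whether the mark is optional in a given language. The `decompose` helper is only for illustration.

```python
import unicodedata


def decompose(ch):
    """Return the Unicode names of the code points in the NFD form of ch."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]


# U+00E1 is á, U+010D is č, U+0107 is ć (escapes avoid encoding surprises).
for letter in "\u00e1\u010d\u0107":
    print(letter, decompose(letter))
```

Whether the mark matters is a property of the language and its dictionary, which is why the decision ends up in the .aff and .dic files.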

Don't know if this helped.

Kruno

The most reasonable solution is to locate the dictionary on your system and add an ICONV replacement table, as in

https://github.com/krunose/hr-hunspell/blob/master/hr_HR.aff

so that at least aáa is not flagged (the accent gets ignored), but again, you still get the suggestion aaa if you misspell an accented word: aáb > aaa when you'd expect aáa.
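For anyone trying this, here is a hedged Python sketch of generating such a table. It assumes the layout described in the Hunspell manual, where a header line `ICONV <count>` precedes the `ICONV <pattern> <replacement>` lines. The `build_iconv_table` helper and the example pairs are hypothetical, and the output is best pasted into a copy of the .aff file (saved in the encoding named on its SET line) rather than into the original.

```python
def build_iconv_table(pairs):
    """Format accent -> plain-letter pairs as an ICONV block for an .aff file.

    Assumed layout: a count line, then one line per replacement pair.
    """
    lines = ["ICONV {}".format(len(pairs))]
    for accented, plain in pairs.items():
        lines.append("ICONV {} {}".format(accented, plain))
    return "\n".join(lines) + "\n"


# Example pairs only; choose the letters that are true accents in your language.
print(build_iconv_table({"á": "a", "é": "e"}), end="")
```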

A spelling checker is meant for writing regular texts; it is not designed for highly technical linguistic text with lots of accents and other special symbols.

Feel free to ask if you have any further questions, but this is the best possible solution (actually a workaround).

Kruno

Thank you, all.

The .aff file is somewhat outside my comfort zone, but I’ll certainly
experiment with it.

I may have caused some confusion with my use of the word “accents,”
which I used (for simplicity’s sake) instead of “diacritical marks.”
I’m concerned with words in which the acute accent is not a temporary
mark to show accent or stress but in which áéíóú are distinct letters,
not interchangeable with aeiou. Exchanging one for the other in such a
word is therefore a spelling error.

But I think Kruno has put me on the right track, and that the ICONV
feature in the .aff file is the place to start.

Míle buíochas!

In that case, what you should do is check whether áéíóú are listed in the TRY section of the .aff file (usually right at the top) and add any words you need (which use these letters) to the .dic file as regular words (as in your case that's what they really are).
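A rough Python sketch of both steps, working on file contents as strings. It assumes the usual Hunspell layout (a `TRY` line in the .aff file, and a .dic file whose first line is an approximate word count followed by one entry per line, optionally with flags after a slash). The helper names, the `/S` flag and the sample entries are invented for illustration.

```python
def missing_try_letters(aff_text, letters):
    """Return the letters that do not appear on the TRY line of an .aff file."""
    for line in aff_text.splitlines():
        if line.startswith("TRY "):
            try_chars = line[len("TRY "):]
            return [ch for ch in letters if ch not in try_chars]
    return list(letters)  # no TRY line at all: every letter is missing


def add_words(dic_text, new_words):
    """Append words not already present and rewrite the count on line 1."""
    lines = dic_text.splitlines()
    entries = lines[1:]                         # line 1 holds the word count
    known = {e.split("/")[0] for e in entries}  # ignore flags after "/"
    for word in new_words:
        if word not in known:
            entries.append(word)
            known.add(word)
    return "\n".join([str(len(entries))] + entries) + "\n"


aff = "SET UTF-8\nTRY aeiou\n"   # made-up .aff fragment
dic = "2\nfear\nbean/S\n"        # made-up .dic content
print(missing_try_letters(aff, "áéíóú"))
print(add_words(dic, ["féar", "bean"]), end="")
```

Working on a copy of the files and restarting LibreOffice afterwards is the cautious way to try this.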

From what I can see in your follow-up, your dictionary is missing some proper words, and you have the same situation as with s ~ š, z ~ ž, c ~ č ~ ć in Croatian. My previous answer was really about accents.

Feel free to ask if you need any further advice or assistance.

Kruno

Just make sure you're not editing the dictionary for a one-time situation, as in that case my advice would be to just bear with it. If the words you plan to add are proper words in your language and you feel the dictionary is incomplete in that regard, you should contact whoever maintains the dictionary for your language so it can be updated.

Kruno

Well, I simply reinstalled a back-up .aff file that I had, and it seems
to work as it should (and as it did, I think). Perhaps I had a
corrupted .aff file.

But I’m intrigued by some of the options in the .aff file, and I intend
to go into it more deeply as soon as I get the time.

In the meantime, many thanks for all the help.