Complex Text Layout Confusion

For the record, I’m using Writer Version: 4.3.3.2, Build ID:
9bb7eadab57b6755b1265afa86e04bf45fbfc644 running on Ubuntu 14.04 LTS.

This is a question about the dialog box resulting from Tools > Options >
Language Settings > Languages in LibreOffice, and very specifically the
check box for “Complex text layout (CTL)” under “Default languages for
Documents” as well as its interaction with the check box “For the current
document only.”

The help facility and “official” documentation shy away from any discussion
of how all this works, although the LibreOffice 4.2 Writer Guide (page 67)
does explicitly say “If you want the language setting to apply to the
current document only, instead of being the default for all new documents,
select For the current document only,” which is consistent with what I would
expect it to mean.

What seems to happen, however, is that regardless of whether I check “For
the current document only,” the status of the check box for “Complex text
layout (CTL)” then seems to be set for any and all documents I open (whether
in the same or different sessions) thereafter. What am I missing? The box
doesn’t seem to do anything.

The documentation also states that this check box enables “support for CTL
(complex text layout) languages such as Hindi, Thai, Hebrew, and Arabic.” I
can’t comment on Hindi, Hebrew, or Arabic, but when using Thai, all this
setting does is permit me to “correct” what seems to be odd behavior with
Writer, but I’ll save that for last.

Several things seem clear when using Thai:

– the status of the Complex Text Layout check box (checked or unchecked)
doesn’t seem to have any effect on correct positioning of single and
multiple diacritics (see the snippet below; every text editor I have does
this correctly as well, although obviously there needs to be one or more
suitable fonts installed);
– clicking within any Thai text correctly identifies and displays the
language in use in the bar at the bottom of the screen;
– double-clicking within any Thai text correctly selects individual words,
such as ใคร from within the phrase โม่มีใครอยู่ทีนั่น (Thai doesn’t use
spaces between words);
– and finally, longer sections of Thai text wrap lines at word breaks (these
last two of course also require an installed Thai dictionary).

So, is CTL irrelevant?

Now to the odd and rather annoying behavior:

If the Complex Text Layout check box is UNCHECKED, and my default paragraph
style indicates that the FreeSerif font is in use, when I begin typing
English, FreeSerif is indeed used. If I switch to Thai (I use iBus as an
input method), Writer doesn’t use the perfectly good Thai characters (U+0E00
etc.) from within FreeSerif, but instead indicates that it is using
Liberation Serif which, interestingly, contains no Thai characters at all.
This can be confirmed by placing the cursor within the Thai text and using
the Insert > Special Character command. So what font is it actually using? I
exported the file as both an fodt and a pdf to see what I could find.
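
One way to picture what seems to be happening: before choosing a font, the text is split into runs by script class, and each class gets its own font setting. The Python sketch below only illustrates that idea. The two-way "western"/"ctl" split and the names `script_class` and `split_runs` are assumptions made for illustration, not Writer's actual code; the only hard fact used is the Unicode Thai block range, U+0E00 to U+0E7F.

```python
# Illustrative sketch only: NOT LibreOffice code.
# Splits mixed text into runs so each run could be given its own font.
from itertools import groupby

THAI_BLOCK = range(0x0E00, 0x0E80)  # Unicode Thai block, U+0E00..U+0E7F


def script_class(ch):
    """Return 'ctl' for a character in the Thai block, else 'western'."""
    return "ctl" if ord(ch) in THAI_BLOCK else "western"


def split_runs(text):
    """Group consecutive characters that share the same script class."""
    return [(cls, "".join(chars)) for cls, chars in groupby(text, key=script_class)]


if __name__ == "__main__":
    for cls, run in split_runs("abc ใคร xyz"):
        print(cls, repr(run))
```

If something like this is going on, it would explain why the Thai run ends up with a different font from the Latin run even though a single font covers both: the font is chosen per script class, not per glyph coverage.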

In the fodt, the only clue I can find is in the following section:

<office:font-face-decls>
  <style:font-face style:name="FreeSerif" svg:font-family="FreeSerif"
    style:font-family-generic="roman"/>
  <style:font-face style:name="Liberation Serif"
    svg:font-family="&apos;Liberation Serif&apos;"
    style:font-family-generic="roman"/>
  <style:font-face style:name="FreeSans" svg:font-family="FreeSans"
    style:font-family-generic="swiss"/>
  <style:font-face style:name="Liberation Sans"
    svg:font-family="&apos;Liberation Sans&apos;"
    style:font-family-generic="swiss"/>
  <style:font-face style:name="DejaVu Sans"
    svg:font-family="&apos;DejaVu Sans&apos;"
    style:font-family-generic="system" style:font-pitch="variable"/>
</office:font-face-decls>
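
For anyone who wants to repeat this check without reading the XML by eye, here is a minimal Python sketch that lists the fonts declared in a flat ODF file. It uses only the standard library; the `SAMPLE` string is a cut-down copy of the section above wrapped in a root element, and the function name `declared_fonts` is made up for this example.

```python
# Sketch: list the font-face declarations found in a flat ODF (.fodt) document.
import xml.etree.ElementTree as ET

# OpenDocument namespace URIs for the prefixes used in the excerpt.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "svg": "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0",
}

# Cut-down sample built from the excerpt; a real .fodt has far more content.
SAMPLE = """<office:document
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0">
  <office:font-face-decls>
    <style:font-face style:name="FreeSerif" svg:font-family="FreeSerif"
      style:font-family-generic="roman"/>
    <style:font-face style:name="Liberation Serif"
      svg:font-family="&apos;Liberation Serif&apos;"
      style:font-family-generic="roman"/>
    <style:font-face style:name="DejaVu Sans"
      svg:font-family="&apos;DejaVu Sans&apos;"
      style:font-family-generic="system" style:font-pitch="variable"/>
  </office:font-face-decls>
</office:document>"""


def declared_fonts(root):
    """Return (name, generic family) for every style:font-face element."""
    name_key = "{%s}name" % NS["style"]
    generic_key = "{%s}font-family-generic" % NS["style"]
    return [
        (el.get(name_key), el.get(generic_key))
        for el in root.iterfind(".//style:font-face", NS)
    ]


if __name__ == "__main__":
    for name, generic in declared_fonts(ET.fromstring(SAMPLE)):
        print(name, generic)
```

To run it against a real export, replace `ET.fromstring(SAMPLE)` with `ET.parse("document.fodt").getroot()`, where the file name is a placeholder. Note that these declarations only say which fonts the document mentions; as far as I understand the ODF format, the per-script choice is recorded separately in text-property attributes such as style:font-name-complex, so those may be worth searching for as well.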

The only font which isn’t explicitly used in this document is DejaVu Sans;
could Writer be getting the Thai glyphs from there? No, the only Thai glyph
in DejaVu Sans is the Baht (Thai currency) sign, so that isn’t the font
being used.

Hmmm....

In the pdf export, Adobe Reader informs me that Kinnari is embedded. Kinnari
is a perfectly good Latin/Thai font, but certainly not the best match for
FreeSerif that I have installed (one can’t argue matters of taste, of
course, but the Thai characters in FreeSerif seem to be a decent enough
match to the Latin characters in FreeSerif).

If the Complex Text Layout check box is CHECKED, and I set the CTL font also
to FreeSerif, things seem to work as I would expect, but where does Writer
come up with its default Lohit Hindi font (which also has no Thai glyphs and
isn’t even installed on my system)? And why do I need to set anything at all?

I can’t find any straightforward explanation of how all this is supposed to
work (or actually works), so while attempting to figure this out (I do
actually mix multiple languages within single documents and it’s a pain), I
began keeping notes, having a vague idea that I could pass them on to Jean
Hollis Weber and her crew to incorporate into the documentation, but I’ve
become hopelessly lost.

The ability to specify a substitute for a font that doesn’t contain the
Unicode blocks you need (or contains ones you don’t care for) is good and
useful, but I can’t come to grips with the idea that if I have a font that’s
been chosen to meet the needs of a particular document and contains all the
required glyphs in all the required languages, I can’t just start typing
away, switching keyboard layouts as I wish.

I fully understand, of course, that languages and fonts are not related,
that some Unicode blocks are used by more than one language, and that
therefore Writer may need help to figure out what language is intended, but
for the most part (at least in my experience) this is fairly rare.

Can anyone explain what’s going on? I hate to file a bug, even about the
behavior of the “For the current document only” check box if it’s simply
that I may be making this all too complicated, but it’s starting to become
annoying.

Thanks ahead of time for any comments or guidance.

Hi :)
Blimey!! Is it possible to break it down into smaller "bite sized"
problems? Also any chance of doing summaries of the main points?

I am copying this to the international translators mailing list but I doubt
anyone is going to be able to read through the whole thing there.
Hopefully some people might be attracted to one or two points that are
their own pet issues too and may be able to help a bit!!

Good luck all!
Regards from
Tom :)

Ahhh, Tom ... if I could break it apart, I would.

The problem is that there seems to be no documentation covering this
facility/subject/whatever. The result is that I'm chasing my tail and can't
figure out which are separate pieces and which are simply the same thing in
another guise. Or how they're inter-related.

I suspect that, buried in the message I wrote, is an actual bug, but it's
ephemeral enough that I can't figure out which is the salient symptom and
what other aspects are merely red herrings. So I thought if this got into
the hands of someone who actually knew how all this "hung together" I'd
better include enough detail to suggest that I did make every attempt (well,
every one that I could think of) to figure out what was going on.

I have, in the past, presented a few separate ("broken apart"?) pieces, but
the responses (in the few cases where there were any) were always along the
line of "try setting this or that," which of course didn't seem to "fix"
anything, but simply altered the symptoms. Font substitution is of course
well understood, as is language selection, and so forth. The problem
(whether it's my problem or LO's) seems to be in how they all tie together:
thus, my effort to throw in everything I could think of.

I realize that makes reading it in one swell foop (as some say) rather
tedious, but my reasoning is that if someone wrote all the code for this
stuff, someone somewhere therefore likely understands what's going on. If,
on the other hand, multiple people wrote the different pieces without
realizing how they all interact, there may be some internal inconsistencies
that need to be sorted out. So, rather than a "bug" per se, there may be
some overall architectural issue that needs attention.

So much progress has been made over the past two decades in the ability to
use multiple languages, scripts, fonts, and so forth simultaneously in the
same document, that perhaps it's time to figure out just where things
stand to see if LO is keeping up. That's sort of what I'm attempting to do -
so eventually it isn't necessary to jump through all the hoops currently
required unless one is attempting to do something really off-the-wall.

BUT: thanks for reading and passing the message to those who are likely more
in tune with what I'm asking. I had also thought of copying it to the Thai
forum (since that's the example I'm using), but apparently, no one but me
has visited that forum in several years, so I didn't bother.

Have a great day ...

Frank

My initial thought is that the bug is the Thai-specific one that was allegedly fixed back around OOo 2.0 - 2.2. My laptop is acting up, otherwise I'd test it out.
(The fix is to reinstall Linux, which I hope I'll be able to do by the end of the month.)

The developer's documentation for OOo 1.x describes how it is supposed to work.
There are some developer specifications for 2.x that describe the changes.
I have no idea where to find either set, though.

My other question is what fonts are listed in the default style?

As far as fallback fonts go, the specifications are _not_ implemented.
That part of LibO needs a complete rewrite, to perform to specification.
There are a couple of known bugs there, that could be your problem.

jonathon