RTF files rendering: huge differences in LO 3.4.2 and MSO 2010 -- bug?

Hi all!

There's a problem opening a simple RTF file in LO 3.4.2. It opens, but
its rendering is incorrect. The same file is opened correctly by MSO 2010.
Could someone provide feedback on the issue?

For reference:

Image of LO 3.4.2 with opened file 96.rtf
http://nabble.documentfoundation.org/file/n3355881/file_96_in_libreoffice_342.jpg

Image of MSO 2010 with opened file 96.rtf
http://nabble.documentfoundation.org/file/n3355881/file_96_in_mso_2010.jpg

Original RTF file:
http://nabble.documentfoundation.org/file/n3355881/96.rtf 96.rtf

Another problematic RTF file (incorrectly rendered table in LO 3.4.2):
http://nabble.documentfoundation.org/file/n3355881/requisites_table_in_russian.rtf
requisites_table_in_russian.rtf

Has anybody else encountered this issue?

I'm puzzled.

Tom, are you saying the differences that are being talked about here are attributable to differences between the ODF 1.1 and ODF 1.2 specification? (I.e., Microsoft Office supports the former, and not the latter at this time.)

Or are they discretionary differences in implementations, where later versions of LibreOffice do it better than earlier OpenOffice.org/LibreOffice versions?

There doesn't seem to be any consideration that the problem could be in conversion of RTF on input. Is it known how the RTF was produced? Could it be an RTF version problem?

Arkady, can you tell us how the RTF was produced?

- Dennis

Hi :slight_smile:
I don't know what the problem is or whether it's a format problem or not.  I have noticed that images saved in doc or docX format often move around when opened in a different office suite.  LibreOffice ones opened in MS Office often look a bit wrong.  MS Office ones opened in LibreOffice or even a different version of MS Office tend to look a bit wrong.  I tried Odt 1.1 and then opened in Word and that seemed about fine but i've only tried a tiny sample.

I noticed the anchor point seemed to make a difference but lost track of which ones seemed better in terms of compatibility.  Writer using Odt format seemed to have a lot more flexibility in placing images.  Draw would probably be a lot better but MS Office doesn't really have anything similar

Images in documents are one problem area and a good reason for spreading LibreOffice around to other people so that they can produce documents that look about the same to everyone.  Either that or use Pdf, even though most people can't edit them.
Regards from
Tom :slight_smile:

> Hi all!
>
> There's a trouble opening simple RTF file in LO 3.4.2. It is opened, but
> it's rendering is incorrect. Same file is correctly opened by MSO 2010. May
> someone provide feedback on the issue?

RTF has always been an issue in OOo and LO. While some recommend using
it, OOo & LO opinion has always been otherwise. I'm not sure exactly
why, as Microsoft has had a published RTF spec for many years. But the
spec seems to be a moving target[1]. Perhaps it's because Microsoft
doesn't always follow their own spec? Samples:

1.
<http://social.msdn.microsoft.com/Forums/ar/innovateonoffice/thread/1cc049b9-d63e-4f5e-b98e-e48e8ee78e94>

[1]
http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=7105
[Word 2003: Rich Text Format (RTF) Specification, version 1.8]
http://www.microsoft.com/download/en/details.aspx?id=10725
[Word 2007: Rich Text Format (RTF) Specification, version 1.9.1]

> For reference:
>
> Image of LO 3.4.2 with opened file 96.rtf
> http://nabble.documentfoundation.org/file/n3355881/file_96_in_libreoffice_342.jpg
>
> Image of MSO 2010 with opened file 96.rtf
> http://nabble.documentfoundation.org/file/n3355881/file_96_in_mso_2010.jpg
>
> Original RTF file:
> http://nabble.documentfoundation.org/file/n3355881/96.rtf 96.rtf

My guess is that because it's a drawing it's an issue.

http://en.wikipedia.org/wiki/Rich_Text_Format
<quote>
However, RTF drawing objects are not supported in many RTF
implementations, such as OpenOffice.org[53], LibreOffice, KWord,
Abiword[54] or IBM Lotus Symphony (up to version 1.3 only some limited
support[55]; improved in later versions). When a RTF document with
drawing objects is opened in a software that does not support RTF
drawing objects, they are not displayed at all. Some implementations
will also not display any texts inside drawing objects.[56][57]
Similarly, when a document with drawing objects is saved as RTF in a
software that does not support RTF drawing objects, these are not
preserved in the RTF file. (For example, OpenOffice.org supports drawing
objects in some file formats (e.g. in ODF, SXW, DOC), but do not support
RTF drawing objects.)
...
Each of RTF implementations usually implements only some versions or
subsets of RTF specification. Many of the available RTF converters
cannot understand all new features in the latest RTF specifications.
</quote>

I tested with MS Word 2003 and it opens fine. However with OOo versions
(linux and Windows) 3.3.x - 3.4-dev, and LO 3.3.4 and LO 3.4.3 the issue
is as you show in your .jpg.

> Another problematic RTF file (incorrectly rendered table in LO 3.4.2):
> http://nabble.documentfoundation.org/file/n3355881/requisites_table_in_russian.rtf
> requisites_table_in_russian.rtf

You are correct; the second table isn't rendered as a table (ending with
the last data ...@genesis.ru), but is instead converted from table to text.

You have a valid bug/issue, but my guess is that you'll be waiting a
*very* long time before the issue(s) get resolved (if at all). I'm not
posting this to discourage you from using LO, but I would discourage you
from using RTF in general.
...

Hi :slight_smile:
Hmm, sorry for the double post.

When you save with LibreOffice it doesn't seem to make much difference which format you use as long as the recipient opens the document in any version of LibreOffice.  Even other programs such as google-docs seem to be fine with it.  It's just when you try to open with MS Office that images get moved around.

With Word there can often be differences between the way MS Office 2003, 2007 or 2010 displays documents even if the original document was saved in MS Office in the first place.  You seem to need to be using the same version as whoever you collaborate with.

Of course there will always be some variation depending on printer settings such as page size (A4 vs US letter etc).

Rtf 'should' be the best cross-suite format.  I thought that was one reason MS designed it?  But it's so rarely used that problems with implementation or in the original spec have not been entirely fixed.  Oddly the .doc format seems to be the best format for exchanging files between MS Office and other programs because it's been used so much for so long that most issues have been dealt with.

One reason for preferring Rtf instead of Doc is the amount of clutter stored in Doc making file sizes quite a lot larger in most cases.  Sometimes there is hidden info in there about who first created the document, what type of machine, some revision/changes, info about fonts that have been used, personal info that may have been stored in a documents properties.  A lot of people don't start a fresh new document but just take an old one and delete almost everything to start something that looks fresh but still contains lots of hidden info.  Rtf still contains quite a lot about fonts but removes personal info that Doc would keep. 
http://en.wikipedia.org/wiki/Rich_Text_Format
http://msdn.microsoft.com/en-us/library/aa140277(v=office.10).aspx

Regards from
Tom :slight_smile:

Hi :slight_smile:
Best answer so far by a long way.  Thanks NoOp :)  The wikipedia page was interesting.  Apparently MS are stopping developing the spec for rtf and MS Office 2010 no longer supports it!

Doc (without the X at the end) is better. 
File - "Save As ..." - "MS Word (97/2000/Xp)"
It's about the same on most word-processors.  I think in MSO 2007 & 2010 it's the golden globe at top left instead of "File".  OpenDocument Formats are better but almost no-one can use the most recent spec (unless they have a non-MS suite) so Doc wins at the moment. 
Regards from
Tom :slight_smile:

Hi all

> Hi all!
>
>
> There's a trouble opening simple RTF file in LO 3.4.2. It is opened, but
> it's rendering is incorrect. Same file is correctly opened by MSO 2010. May
> someone provide feedback on the issue?

> RTF has always been an issue in OOo and LO. While some recommend using
> it, OOo & LO opinion has always been otherwise. I'm not sure exactly
> why, as Microsoft has had a published RTF spec for many years. But the
> spec seems to be a moving target[1]. Perhaps it's because Microsoft
> doesn't always follow their own spec? Samples:

The problem with RTF is that the specification is revised with each
new MSO version. Thus which version is being used could be an issue. The
later versions support more features.

Just for note:

Bug 41109 (https://bugs.freedesktop.org/show_bug.cgi?id=41109) was posted on
bugzilla.

Hi :slight_smile:
Thanks Arkady :slight_smile:
Regards from
Tom :slight_smile:

Gary,

Thanks for the links and analysis.

I haven't attempted the obvious test, which is to save the
RTF that fails in LO back to RTF and then compare to see what
got left by the side of the road.
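
A rough sketch of that comparison in Python, assuming the original file and the copy re-saved from LO have both been read into strings. The function names are made up for illustration, and the regular expression is a crude scan for control words, not a real RTF parser.

```python
import re
from collections import Counter

# Crude scan: a backslash followed by letters. It will also match inside
# binary or escaped data, so treat the result as a hint, not proof.
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)")

def control_word_counts(rtf_text):
    """Count how often each control word appears in the RTF source."""
    return Counter(CONTROL_WORD.findall(rtf_text))

def dropped_control_words(original, round_tripped):
    """Control words present in the original but absent after the round trip."""
    before = control_word_counts(original)
    after = control_word_counts(round_tripped)
    return sorted(word for word in before if word not in after)

# Tiny inline example standing in for the two files.
original = r"{\rtf1{\shp{\*\shpinst x}}\par}"
resaved = r"{\rtf1\par}"
print(dropped_control_words(original, resaved))
```

For real files, something like open("96.rtf", encoding="latin-1").read() should be enough to load each one, since RTF source is normally 7-bit text. The name of the re-saved copy is whatever you chose in LO's "Save As" dialog.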

My superficial impression of the 96.rtf and the RTF specifications
revealed to me that there are extensive provisions in the RTF
specification (and, specifically, how the 96.rtf is coded), to
accommodate all manner of up-/down-level adjustments between
different versions and capabilities of software. I wonder if
this is not being handled well in the currently-implemented
import-export features, but I did not dig in deep enough to
determine that one way or the other.

- Dennis

DEEPER ANALYSIS

I did waste a fun evening becoming re-acquainted with RTF though.
My first romance with the format was around 1989 by tricking
Borland Paradox (MS-DOS version) to produce text files of reports
that were actually RTF documents that I could import into a Xerox
workstation desktop publisher and make nice, paginated documents
from. (I was compiling a glossary built in a database.)

In examining the actual RTF of the 96.rtf example, it was very
interesting to see how little of the RTF file is actually needed
to accomplish the result. (There is a ton of overhead material.)

I also started looking through the RTF specifications, using RTF
Specification 1.9.1 (Office 2007 level). It reminded me what a
fascinating format the underlying RTF structure is. And there's
sample code in the specification, although I am tempted to see
how fast I could make a processor of my own to serve as the basis
for an RTF forensic analysis and validation tool.

The number of control-word (sort of like an XML element tag) details
is immense, of course, but very little is needed to make a simple
document.

One important feature: Since the *1987* specification the "\*"
prefix feature has been used to identify control words whose data
should be completely ignored if the control word is not recognized
or supported. For a control word that is not recognized and that
lacks the prefix, the content is presumably to be kept in-line
assuming it is in a place where content is being expressed. Drawing
objects tend to be introduced by {\*\do ...}, for example, and those
are used in important ways in 96.rtf.
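
The "\*" rule described above can be sketched in a few lines of Python. This is a toy reader written for illustration, not how LO or Word actually do it: the function name and the "known" set are assumptions, it expects balanced braces, and it ignores hex escapes, Unicode keywords and all formatting.

```python
import re

# Control word (optional numeric parameter, one delimiting space),
# control symbol, brace, or a run of plain text.
TOKEN = re.compile(r"\\([A-Za-z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)", re.S)

def visible_text(rtf, known=frozenset()):
    """Return the in-line text, dropping unrecognized \\* destinations."""
    out = []
    saved = []          # skip state of each enclosing group
    skipping = False
    starred = False     # True right after a "\*" control symbol
    for match in TOKEN.finditer(rtf):
        word, _param, symbol, brace, text = match.groups()
        if symbol == "*":
            starred = True
            continue
        if brace == "{":
            saved.append(skipping)
        elif brace == "}":
            skipping = saved.pop()
        elif word is not None:
            # An unknown word after "\*" means: ignore the whole group.
            if starred and word not in known:
                skipping = True
        elif text is not None and not skipping:
            # Raw line breaks in RTF source are not content.
            out.append(text.replace("\r", "").replace("\n", ""))
        # Other control symbols are simply dropped in this sketch.
        starred = False
    return "".join(out)

sample = r"{\rtf1 Hello{\*\do drawing data} world{\b !}}"
print(visible_text(sample))
```

A reader that does not know \do drops the whole drawing group, while an unknown control word without the prefix (say a made-up \foo) only loses the keyword and keeps the text that follows it.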

There are other interesting facets in the specification. For
example, an RTF can have non-Unicode and Unicode-based alternatives
of the same content, for selective use depending on the capabilities
of the processor that is consuming the RTF.
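
One form this takes, as far as I understand the specification, is the \uN keyword: a decimal, signed 16-bit code unit followed by a fallback character for readers that do not understand Unicode. A minimal Python sketch, assuming the default of one fallback character per \uN (that is, \uc1) and ignoring surrogate pairs; the function name is made up:

```python
import re

# \uN, an optional delimiting space, then one fallback: either a
# \'hh hex escape or a single ordinary character.
UNICODE_ESCAPE = re.compile(r"\\u(-?\d+) ?(?:\\'[0-9a-fA-F]{2}|[^\\{}])?")

def decode_unicode_escapes(fragment):
    """Replace each \\uN (plus its fallback) with the Unicode character."""
    def replace(match):
        code = int(match.group(1))
        if code < 0:            # values above 32767 are written as negatives
            code += 65536
        return chr(code)
    return UNICODE_ESCAPE.sub(replace, fragment)

print(decode_unicode_escapes(r"\u1055?\u1088?"))
```

A Unicode-aware reader keeps the decoded character and discards the "?", while an older reader skips the unknown \u keyword and shows the fallback instead.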

In addition, there are Word97-2007 shape objects in {\shp ...} and
those also figure significantly in 96.rtf. These also permit an
optional Word 6.0/95 alternative {*\shprslt ...} and I see that
those are present as well in 96.rtf.

Finally, although some of OOXML is mapped into RTF (for example,
MATHML), other parts of OOXML that are newer than the binary formats
are included as XML and the OOXML specification applies. (The way
XML is embedded in RTF is a bit gnarly - the XML is coded in hex
streams so the RTF parser is not confused.) This may be one way that
it has not been necessary to update the RTF specification for Office
2010.
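
To make the hex-stream idea concrete, here is a small Python sketch that turns such a stream back into text. It assumes the hex digits have already been pulled out of the RTF group and that the payload is UTF-8; both are assumptions for the sake of the example, not something checked against 96.rtf.

```python
def decode_hex_stream(hex_text):
    """Decode a run of hex digits (possibly wrapped across lines) to text."""
    compact = "".join(hex_text.split())   # drop line breaks and spaces
    return bytes.fromhex(compact).decode("utf-8")

# "3c612f3e" is the hex form of the four bytes "<a/>".
print(decode_hex_stream("3c61\n2f3e"))
```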

There are many other provisions for up- and down-level compatibility
and soft adjustment to the capabilities of a given RTF consumer. It
is rather remarkable, though it depends on the quality of the producer
that such material is included, and on the quality of the consumer that
such material is exploited.