MS Word .docx formatting in Libre Office

Hi,
Am a newbie to Libre Office so apologies if this has been answered elsewhere.

When I load .docx files into LibreOffice 3 to view or edit them, the formatting is shot to pieces and the Table of Contents included in the original is not picked up.
It appears that .doc files are loading OK.
I thought that .docx files were supported by LO. Apologies if this assumption is incorrect.

Is there anything that I can do to address these formatting issues, or must I convert these .docx files to .doc format before reading them in LO?
Thanks

Hi :slight_smile:
Yes, it does crop up quite often but it is still a legitimate question.

Ideally, convert things to .doc format if that is possible. DocX is designed to
be incompatible with non-MS products, imo. However, LibreOffice 3.4.2 often
does a better job of reading and writing .docx files than 3.3.3, and you can
have both installed if you follow this guide:
http://wiki.documentfoundation.org/Installing_in_parallel
Still, neither is perfect at it, so sticking with .doc is better.

Regards from
Tom :slight_smile:
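
For anyone who has many .docx files to convert to .doc, as suggested above, the conversion can be scripted instead of going through File - Save As for each file. The sketch below is only an illustration under assumptions: it assumes the soffice binary is on the PATH and that the installed build accepts the --headless, --convert-to and --outdir options (current LibreOffice releases do; the 3.3/3.4 builds discussed in this thread may not, or may spell them with a single dash). The function names and the file name thesis.docx are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format="doc", outdir=".", binary="soffice"):
    """Return the argument list for a headless LibreOffice conversion.

    Assumption: the installed soffice understands --headless,
    --convert-to and --outdir.
    """
    return [
        binary,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def convert(source, target_format="doc", outdir="."):
    """Run the conversion and return the path the output is expected at."""
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH")
    subprocess.run(build_convert_command(source, target_format, outdir), check=True)
    return Path(outdir) / (Path(source).stem + "." + target_format)


# Example (hypothetical file name); should leave thesis.doc in the current directory:
# convert("thesis.docx")
```

This presumably goes through the same import and export filters as opening and saving by hand, so it saves clicks but is unlikely to fix any of the formatting problems themselves.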

Some people use big words for small things, that's why things like
"fully compatible" are mentioned way too often everywhere. I just
wonder what, using the same terminology, "partly compatible" would
mean. Maybe that it recognizes the file suffix and then a dialogue
pops up telling you it can't be opened…?

Just never use anything other than ODF and everything will be fine. You
can export to PDF without any problems in most cases, though. If
people cannot read them, they can install LibreOffice or
OpenOffice.org and it won't cost them anything other than disk space
and time. They don't have to buy anything.

Kind regards

Johnny Rosenberg
ジョニー・ローゼンバーグ

There are considerable issues with OOXML/MOX[1] files:
https://bugs.freedesktop.org/buglist.cgi?quicksearch=docx

Then again, Microsoft has issues as well:
http://en.wikipedia.org/wiki/Office_Open_XML
<http://office.microsoft.com/en-us/word-help/open-a-word-2007-document-in-an-earlier-version-of-word-HA010044473.aspx>

Your best bet is to ask users to use .doc rather than .docx (preferably
.odt, but we all know that isn't realistic).

[1]
http://wiki.documentfoundation.org/LibreOffice_OOXML

________________________________
From: Rob Harriman <rob.harriman@btinternet.com>
To: users@global.libreoffice.org
Sent: Fri, 12 August, 2011 19:59:56
Subject: [libreoffice-users] MS Word .docx formatting in Libre Office

-- For unsubscribe instructions e-mail to: users+help@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted

My experience with docx and xlsx documents has been that those that are not
very complex open very accurately in 3.4.2, and saving back works
well. I recommend that you save an ODF copy just in case. Complex
documents are trickier; it depends on the features used and whether
there are macros. Macros are at best very erratic, and often they do not
translate easily into StarBasic.

I consider a simple text box not to be very complex:
https://bugs.freedesktop.org/show_bug.cgi?id=34617

Open LO. Make a simple text box with some text in it. Save it as a .docx &
let us know how that works out for you.

Sorry, forgot to mention that there is a commercial package that does a
very nice job with .docx. Even the export to .odt included the notes
from the file that I tested:

<http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=11942>

http://www.softmaker.com/english/ofl_en.htm

You can try a working linux demo for 30 days:
http://www.softmaker.de/cgi-local/of10trial.py

I just installed it (deb version) and it does a pretty nice job. It looks
quite like LO/OOo/StarOffice. Interesting that it comes out of Germany,
the home of StarOffice, and they've been around for quite some time. Also
interesting (albeit old, from 2006) is this interview:
<http://www.consortiuminfo.org/standardsblog/article.php?story=2006070509040463>
Louis Suarez-Potts.

I'm not so sure the bug report being discussed is anything so simple, but I took on your little experiment.

CONCLUSION

It would seem, in this simple case, that no one has much to be proud of concerning interop between the different native formats. It is odd that the worst case here is the DOCX output from LibreOffice read back into LibreOffice. But Word 2010 doesn't round-trip to and from ODT a lot better (although it is a lot easier to adjust manually if desired).

To get this right, there needs to be a more-systematic effort that determines what discretionary provisions are being handled differently and what other features at the format level would be more successful.

DETAILS

I started in LO 3.3.2 Writer by inserting a frame around a paragraph of text, then saving it in Microsoft Office Word 2007 DOCX format. The text box is centered at the top of the page and I adjusted its width so the text "This is going to be a simple text box" was on two lines. The box was taller than two lines, though. (The automatically-created initial frame was narrower horizontally.)

The DOCX opened in Microsoft Office Word 2010 just fine, exactly as formatted in LO3.3.2 with the extra space at the bottom of the text box but with a thinner border line (looks like 0.5pt instead of 1pt).

However, when I opened it in LibreOffice 3.3.2, it produced a different, page-wide paragraph with a border around it. I can find no text frame.

Because I didn't keep a copy in ODT format, I did it again.

The LibreOffice 3.3.2 ODT saves and reloads just fine, just as I formatted it. I also see that the 1pt line is preserved and is heavier than what I see in Word 2010. I examined the ODT Zip package and confirmed there is a <draw:frame> containing a <draw:text-box>. In addition, it sets a minimum height for the box. When I open that ODT in Word 2010, it opens fine except the text box is shrunk vertically and the outline is not so heavy.
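
For anyone who wants to repeat the package inspection described above without unzipping by hand, here is a minimal sketch using only the Python standard library. It assumes the file is an ordinary ODF text package with a content.xml member, and it looks for the ODF elements draw:frame and draw:text-box and the fo:min-height attribute. The function name text_boxes is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs as defined by the OpenDocument format.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
SVG_NS = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
FO_NS = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"


def text_boxes(odt_file):
    """Yield (frame width, minimum box height) for each text box in content.xml.

    odt_file may be a path or a file-like object. Either value is None
    when the corresponding attribute is absent.
    """
    with zipfile.ZipFile(odt_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    for frame in root.iter(f"{{{DRAW_NS}}}frame"):
        for box in frame.findall(f"{{{DRAW_NS}}}text-box"):
            yield (
                frame.get(f"{{{SVG_NS}}}width"),
                box.get(f"{{{FO_NS}}}min-height"),
            )
```

An empty result for a file that was round-tripped through DOCX would be consistent with the frame having been turned into a bordered paragraph, though it does not by itself show what replaced it.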

I also saved the second ODT from LO 3.3.2 in OOXML DOCX (not the Word 2007 DOCX flavor), just to see what difference there is, if any.

It opens back in LibreOffice 3.3.2 once-again with the frame lost and there being a single-line paragraph with a 1pt border around it (full page width, one line high). I inspected the ODT Zip package and confirmed that LO turned the text frame into a paragraph with a solid border.

Word 2010 still has the centered box with the original height (blank space below the text) but it looks like the border is a 0.5 pt line.

For fun, I used Word 2010 to save the opened .DOCX document as a Word 97-2003 DOC and as an ODF Text .ODT document. Word 2010 reopens the .DOC just fine. On re-opening the .ODT, Word 2010 shows a drawn text frame that has been reduced vertically to surround the text.

On reopening the ODT from Word 2010 in LibreOffice 3.3.2, the frame is preserved but the border shrinks vertically to just fit the two lines of text, although I can see that the text box is vertically deeper than that still. The border is heavier again.

When I open the DOC from Word 2010 in LibreOffice 3.3.2, the frame is preserved and its border is the heavier one that we started out with.

Must be getting old/tired... last line was meant to read:
"comments from Louis Suarez-Potts are interesting".

"Text Box":
View>Toolbars>Drawing
Click on the 'T'|write something|exit the textbox
etc.

Hi :slight_smile:
+1
I didn't know about the links and don't support the idea of using proprietary
products if it's possible to avoid them. I agree it would be great to use .odt
but that it is more realistic to use .doc.

You can set the default format to .doc and similar by clicking on
Tools - Options - +Load/Save - General
Then there are 2 drop-downs beside each other at the bottom. With "Document
Type" set to "Text Document", scroll the other one back up one place to "Microsoft
Word (97/2000/XP)". For spreadsheets and presentations, scroll back 2 places to
avoid the "templates" options.
Regards from
Tom :slight_smile:

Hi Tom,
Thanks to you and the others who responded to my query. It made interesting reading.
As suggested below I have installed Libre Office 3.4.2 in parallel with the default version of LO supplied under Ubuntu 11.04.
I understand that things aren't always going to be right with .docx files so I have taken the thesis document I'm currently interested in and have converted it to .doc as suggested.
Now when I look at this in LO (either version) it is not picking up all of the Table of Contents entries. The original TOC was set up to pick up all Heading 1 through Heading 4 entries.
I have tried using Tools - Outline Numbering - Numbering to define Heading 1 through Heading 4 entries, but this appears to have made no difference. Can anyone suggest what I should be using here please, or how I use this process to get these headings picked up?
Also, does this mean that I have to set this each time I look at a different document, or is there a way of setting LO so that it automatically picks up all headings, e.g. 1 through 9, as a matter of course?
Basically I am not really sure what Tools - Outline Numbering is supposed to do, or what the thinking is behind how TOCs are handled in LO.
Can anyone enlighten me please?
Thanks
Rob

Hello Rob,

If I remember correctly, the default setting in LibreOffice (and OpenOffice.org) is that only headings 1 to 3 are considered for the table of contents. It is possible to change this. You will find a much better description of this here:
http://wiki.documentfoundation.org/cgi_img_auth.php/9/9b/0212WG3-TOCsIndexesBiblios.odt

Using Tools > Outline Numbering > Numbering only affects how your headings are displayed, e.g. whether they have a number or not.
Something like
1 First chapter
2 Second chapter
2.1 First sub chapter
2.2 Second sub chapter

and so on... It does not have any effect on which headings appear in your TOC.

Hope I could help a bit.

Sigrid
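
If the document has been saved as .odt, one way to check which heading levels a table of contents is set to evaluate is to look inside the file itself. The sketch below is a minimal illustration, assuming an ordinary ODF text package: it reads the text:outline-level attribute of each text:table-of-content-source element in content.xml. The function name toc_outline_levels is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI for text elements as defined by the OpenDocument format.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def toc_outline_levels(odt_file):
    """Return the text:outline-level value of every table of contents.

    Values are returned as strings exactly as stored in the file, or
    None where a table of contents does not carry the attribute.
    """
    with zipfile.ZipFile(odt_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    sources = root.iter(f"{{{TEXT_NS}}}table-of-content-source")
    return [source.get(f"{{{TEXT_NS}}}outline-level") for source in sources]
```

A value of "3" for a document whose original TOC covered Heading 1 through Heading 4 would point at the level limit rather than at the heading styles, but that is only a guess about what happened in the conversion.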

I've not seen tables-of-content move between Microsoft Word and ODF documents very well, in either direction. (But I have not dug into it enough to conquer whatever the limitations are, either.)

Say more about what you are doing and where you need to end up.

Are you attempting to edit a dissertation using LibreOffice exclusively? Then you might want to have it as an .odt (not .doc) until the last possible minute. If you need to turn it in as a .doc, you might need to find a copy of Microsoft Office for a final edit, adjustment of the table of contents, etc.

Or are you at the front end where you are starting with a model in .doc and want to replicate its table of content styles, etc., for continuing in LibreOffice?

- Dennis

Hello all,

Is there a way to open a PDF file in Draw, or another LibreOffice program, without having the program use OCR? I have some PDF files that are basically just pictures of book pages and when LibreOffice tries to parse the mediocrely-scanned text it becomes unreadable.

Thanks in advance for any help!
Derek

Hi :slight_smile:
Many apologies. I had assumed you were sharing much shorter documents with
office users in an office type environment. Since you are writing a longer work
and can retain control of it then sticking to odt is probably best.

I think it might be a good idea to read about using styles
http://wiki.documentfoundation.org/Documentation#Getting_Started_with_LibreOffice

and also Chapter 12 in the Writer Guide
http://wiki.documentfoundation.org/Documentation#LibreOffice_Writer_Guide

When you want other people to proof-read it or make suggestions you can always
save-as .doc or export to pdf. Pdf keeps things looking exactly the same but
it's harder for people to edit, which might be a double win.

I have Bcc'd this to one of the main people in the documentation team in case
she has better guidance but she is extremely busy right now and Dennis is a star
anyway (and others in the users list too of course).

Regards from
Tom :slight_smile:

Hi

  Excuse me, but it is important to know what you need to do with the PDF
file: edit it, copy something from it...? That is because you might use other
programs specialized for that. An OCR program is used for scanned images of
text, for example. If your PDF document doesn't contain text as images, you
don't need an OCR program to access the PDF document.

Regards,

Jorge Rodríguez

Hi Jorge,

Basically I want to do some minor editing of the pictures in the PDF. Delete some pages, move some from one document to another, delete parts of pages. The problem is that LibreOffice is trying to interpret the text in the images as text and that makes it unreadable. I just want to edit them as images.

Thanks,
Derek

Hi Derek,

Here is a link to an extension to open PDF files in LO:
http://extensions.services.openoffice.org/en/project/pdfimport

Why do you need to do that in LibreOffice? There is other software
out there, you know… Tried PDFedit? If you have Ubuntu it's in the
repositories (or "sudo apt-get install pdfedit" in a terminal).

Regards

Johnny Rosenberg
ジョニー・ローゼンバーグ