How should I import the contents of a *.pdf file into LibreOffice?
Thanks, Tom (moderator)
Open LibreOffice, and from within LibreOffice open the PDF file using
the File > Open menu. The PDF should open with Draw, but if damaged it
will open with Writer as a text file.
LOSING JUSTIFY
In at least one case, I lost justification when opening a PDF in LibreOffice Draw. The original PDF had two columns, each justified. In Draw, the text was left-aligned, and some of the lines in the left column overlapped the right column.
I got this using "Print/export" > "Download as PDF" from "Effective defense and ISIL" in Wikiversity (https://en.wikiversity.org/wiki/Effective_defense_and_ISIL) with LibreOffice 5.0.3.2 and 5.0.4.2 just now.
FORCE READING IN WRITER?
Is there a way to force LO to open it in Writer?
Alternatively, is there other software (preferably free and open-source) that can read the text (and numbers) and make them available to Writer or Calc that's easier than copying and pasting from Draw?
The text in Draw is all single lines, which makes it inconvenient to work from.
Thanks,
Spencer Graves
"Spencer Graves":
Is there a way to force LO to open it in Writer?
No. PDF documents are graphic files and cannot be easily converted to plain text.
is there other software (preferably free and
open-source) that can read the text (and numbers) and make them
available to Writer or Calc
pdftotext from the xpdf package is quite good.
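If you want to script it, here is a minimal Python sketch, assuming pdftotext is installed and on your PATH. The helper names and file names are made up for illustration. By default pdftotext writes the text in reading order, which is probably what you want for Writer; the -layout option keeps the physical page layout instead, which may suit tables headed for Calc.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, txt_path, keep_layout=False):
    """Return the argument list for a pdftotext call (hypothetical helper)."""
    command = ["pdftotext"]
    if keep_layout:
        # -layout preserves the physical page layout instead of reading order
        command.append("-layout")
    command += [pdf_path, txt_path]
    return command


def pdf_to_text(pdf_path, txt_path, keep_layout=False):
    """Run pdftotext; raises if the tool is missing or the conversion fails."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    subprocess.run(
        build_pdftotext_command(pdf_path, txt_path, keep_layout),
        check=True,
    )
    return txt_path
```

Something like pdf_to_text("article.pdf", "article.txt") would then give a plain text file that Writer can open directly.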
Spencer Graves-2 wrote
FORCE READING IN WRITER?
Is there a way to force LO to open it in Writer?
Yes, and this is actually trivial to do. Under Tools -> Options -> General,
select the "Use LibreOffice dialogs" option. With that enabled, when the
File > Open dialog appears, go to the "File type:" dropdown list and,
instead of the default "All files", scroll to and select the "PDF - Portable
Document Format (Writer) (.pdf)" entry. Then choose the PDF document to
open. It will be imported through the Writer filter into a Writer document
rather than into the default Draw module.
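The same filter choice can likely be made from the command line with the --infilter option of soffice. A rough Python sketch follows; it assumes soffice is on your PATH and that the Writer PDF filter is named "writer_pdf_import" in your version, which you should verify. The helper names are hypothetical.

```python
import subprocess

# Assumed internal name of the "PDF (Writer)" import filter; check your version.
WRITER_PDF_FILTER = "writer_pdf_import"


def build_open_in_writer_command(pdf_path):
    """Argument list to open a PDF interactively with the Writer filter."""
    return ["soffice", f"--infilter={WRITER_PDF_FILTER}", pdf_path]


def build_convert_to_odt_command(pdf_path, out_dir):
    """Argument list to convert a PDF to .odt without opening a window."""
    return [
        "soffice",
        "--headless",
        f"--infilter={WRITER_PDF_FILTER}",
        "--convert-to", "odt",
        "--outdir", out_dir,
        pdf_path,
    ]


def open_pdf_in_writer(pdf_path):
    """Launch LibreOffice on the PDF; raises if soffice exits with an error."""
    subprocess.run(build_open_in_writer_command(pdf_path), check=True)
```

The result has the same limitations as the dialog route: the text still arrives without the original paragraph structure.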
The text in Draw is all single lines, which makes it inconvenient
to work from.
LibreOffice is *not* a PDF editor and makes no claim to be one (for that
matter, neither does Adobe Acrobat). The loss of layout and linkage on PDF
import to LibreOffice occurs simply because a general PDF import filter is
used. It is neither designed nor intended to parse the PDF back into its
original source document or web page structure. Rather, the PDF import
filter is intended to render the various pages as a reasonable facsimile of
the PostScript document layout contained in the PDF. Most text flow and
paragraph structure is not described in the PDF and cannot magically be
recreated when the PDF is read for import.
So, unfortunately, a fair amount of restructuring of the result is always
going to be needed to work with the document. Often, a copy and paste into a
new document with styles applied is going to be more efficient.
The LibreOffice hybrid-PDF format can be used to embed the ODF source
document into the PDF, so that the structure of the original is exchanged
along with it. But that won't help for this use case.