convert xlsx to CSV

I'm trying to convert a file from .xlsx to CSV. After googling until my eyes were bleeding, the below is the best I could come up with. I've tried various incarnations of the below; this is just the latest. Any ideas what's wrong here? The input file clearly exists. Ubuntu 16.04, LibreOffice 5.1.6.2 10m0(Build:2)

Thank you in advance!

What happens if you just open the file in LibreOffice?

It opens just fine.

Some things seem wrong on that command line:

- On most systems you can directly call "libreoffice" instead of providing
the full path
- I'm not sure the -env:UserInstallation part is needed, unless you have
some specific requirements
- the \(encoded\):UTF8 part is not linked to anything, and thus is used
as an input filename. This is most likely not what you want.
- the --infilter might not be needed, as xlsx files should have enough
information about themselves to load properly.

I was able to convert an xlsx to a csv in UTF-8 using the following simple
command:

$ libreoffice --headless --nolockcheck --convert-to csv \
    --infilter=CSV:44,34,76,1 a.xlsx

The "76" is responsible for generating a UTF-8 CSV output. If that is not
one of your requirements, you can slim this down even more:

$ libreoffice --headless --nolockcheck --convert-to csv a.xlsx

And if your document isn't opened by someone else at the same time, you can
even remove the --nolockcheck.

Sorry, I misread. I thought you were trying to read a CSV. LibreOffice
can read and write CSV. To save, just use Save As and select CSV for
the filter.

Thank you for responding. Yes it opens without issue.

Yet another look at the conversion details made me notice I needed changes. Below are two incarnations. For one thing I had left out column formatting, which I guess is required? No difference in any event.

apb@yellow:/usr/local/src/greetonix/src$ /usr/lib/libreoffice/program/soffice.bin -env:UserInstallation=file:///tmp/libreoffice-1 --headless --nolockcheck --convert-to txt:Text \(encoded\):UTF8 --infilter=MS Excel 97:44,34,76,1,1/2/2/2,1033,true,false KGI_Discontinued.xlsx
Error: source file could not be loaded
convert /usr/local/src/greetonix/src/KGI_Discontinued.xlsx -> /usr/local/src/greetonix/src/KGI_Discontinued.txt using filter : Text
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///usr/local/src/greetonix/src/KGI_Discontinued.txt> failed: 0xc10)

Also tried:

apb@yellow:/usr/local/src/greetonix/src$ /usr/lib/libreoffice/program/soffice.bin -env:UserInstallation=file:///tmp/libreoffice_1 --infilter='MS Excel 97:44,34,76,1,1/2/2/2,1033,true,false' --headless --nolockcheck --convert-to 'csv:Text - txt - csv (StarCalc):44,34,76,1' KGI_Discontinued.xlsx
*Error: source file could not be loaded*

apb@yellow:/usr/local/src/greetonix/src$ ll KGI_Discontinued.xlsx
-rw-rw-r-- 2 apb apb 88334 Aug 18 16:00 KGI_Discontinued.xlsx

Which is an improvement, but not the solution. All I really did was rearrange the order of the parameters, change a hyphen to an underscore in the tmp file path and change double quotes to single quotes. I think I recall reading somewhere that order matters so that's what likely eliminated the "verify input parameters" message. You may also notice I changed output from utf8 to csv. I need *both* utf8 & csv.

Maybe I need yet another rearrangement to eliminate the file load error?

Thank you!

http://mooedit.sourceforge.net/
lets you "Save As" with a different encoding.

On 28/08/2017 19:21, A wrote:

Have you looked at the 'unoconv' command? Specifically designed for
conversion of office documents from one format to another.

(I've come to the thread late, so apologies if this has already been
suggested.)

Regards,
Tony.

Thank you. I may end up doing that, but I'm trying to avoid it. I picked one direction (before I knew about unoconv) and I want to see it through to the end. It's supposed to work. I may be forced to revisit this solution if all else fails.

Out of curiosity, did you try my solution? Or did I miss something?

I'm trying to convert a file from .xlsx to CSV. After googling until my
eyes were bleeding, the below is the best I could come up with. I've tried
various incarnations of the below; this is just the latest. Any ideas
what's wrong here? The input file clearly exists. Ubuntu 16.04,
LibreOffice 5.1.6.2 10m0(Build:2)

Thank you in advance!

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

apb@yellow:/usr/local/src/greetonix/src$ /usr/lib/libreoffice/program/soffice.bin -env:UserInstallation=file:///tmp/libreoffice-1 --headless --nolockcheck --convert-to csv:Text \(encoded\):UTF8 --infilter=MS Excel 97:44,34,76,1,1033,true,true,false,false KGI_Discontinued.xlsx
Error: source file could not be loaded
convert /usr/local/src/greetonix/src/KGI_Discontinued.xlsx -> /usr/local/src/greetonix/src/KGI_Discontinued.csv using filter : Text
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///usr/local/src/greetonix/src/KGI_Discontinued.csv> failed: 0xc10)
apb@yellow:/usr/local/src/greetonix/src$

Some things seem wrong on that command line:

- On most systems you can directly call "libreoffice" instead of providing
the full path

Yes, but then I get /usr/bin/libreoffice since /usr/bin is in my path. /usr/bin/libreoffice is a symlink to ../lib/libreoffice/program/soffice*

Note the file sizes and timestamps. I don't know what the difference is between the two or which is correct to use.

ll /usr/lib/libreoffice/program/soffice*
-rwxr-xr-x 1 root root 6012 Apr 28 07:59 /usr/lib/libreoffice/program/soffice*
-rwxr-xr-x 1 root root 6304 Apr 28 12:36 /usr/lib/libreoffice/program/soffice.bin*
-rw-r--r-- 1 root root 789 Apr 28 13:32 /usr/lib/libreoffice/program/sofficerc

- I'm not sure the -env:UserInstallation part is needed, unless you have
some specific requirements

Yes. I typically have LO open 24/7 with other documents.

- the \(encoded\):UTF8 part is not linked to anything, and thus is used
as an input filename. This is most likely not what you want.

Could you elaborate what you mean by not linked to anything? This is probably the key to what I'm doing wrong. To what would I link it, and how?

- the --infilter might not be needed, as xlsx files should have enough
information about themselves to load properly.

OK, I'm all for simplicity. But I couldn't find anything in the docs that specifies that, which is why I used it. Plus the fact that all of the examples scattered around the Net use it, else I likely wouldn't have been able to decipher the docs.

I was able to convert an xlsx to a csv in UTF-8 using the following simple
command:

$ libreoffice --headless --nolockcheck --convert-to csv \
    --infilter=CSV:44,34,76,1 a.xlsx

I tried exactly as you stated (I of course replaced with the proper file to convert) but for me, that had no result. There were no messages of any kind, and the file was not created.

infilter does refer to the source file, yes? As my input/source file is .xlsx, I tried changing CSV in your --infilter, to MS Excel 97, but that made no difference in the result.

The "76" is responsible for generating a UTF-8 CSV output. If that is not
one of your requirements, you can slim this down even more:

I do in fact prefer UTF-8 CSV output.

Hi :)
Prolly best to avoid one direction surely? Anyway didn't they split up and
go different ways?
Regards from
Tom :)

OK, I looked at the link... It says, "medit is a programming and around-programming text editor".

Thank you, but I'm not looking for a text editor.

Read .xlsx, write/convert to .csv format with utf8 characters. From the command line.

Thank you, but I don't understand.

Would it be possible for list users to use "Reply List" and not "Reply All"? The latter provides two identical messages; one to the list which is of course copied to me plus another sent to me directly. Thank you for your consideration.

Hi :)
Prolly best to avoid one direction surely? Anyway didn't they split up and
go different ways?
Regards from
Tom :)

I have no idea. The point I'm trying to make is that I investigated one option. unoconv was not that option.

I am trying to avoid unoconv. I picked one direction (before I knew about unoconv) and I want to see it through to the end before I traipse off in another direction. I may be forced to revisit unoconv if all else fails. Thank you for suggesting it.

Some things seem wrong on that command line:

- On most systems you can directly call "libreoffice" instead of providing
the full path

Yes, but then I get /usr/bin/libreoffice since /usr/bin is in my path.
/usr/bin/libreoffice is a symlink to ../lib/libreoffice/program/soffice*

Note the file sizes and timestamps. I don't know what the difference is
between the two or which is correct to use.

ll /usr/lib/libreoffice/program/soffice*
-rwxr-xr-x 1 root root 6012 Apr 28 07:59 /usr/lib/libreoffice/program/soffice*
-rwxr-xr-x 1 root root 6304 Apr 28 12:36 /usr/lib/libreoffice/program/soffice.bin*
-rw-r--r-- 1 root root 789 Apr 28 13:32 /usr/lib/libreoffice/program/sofficerc

The "libreoffice" command/link is there to avoid issues if the way
LibreOffice is installed changes. Suppose a future release renames the
binary to loffice: you could still use the libreoffice command without
issue.
The "soffice" script manages some extra command-line parameters and launches
the actual LibreOffice program. For your purpose it should make no
difference which one you use, except that directly calling things in
/usr/lib/libreoffice is not as future-proof.
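
To see which file the "libreoffice" name ends up running on a given machine, something like the following may help. It is only a sketch: resolve_cmd is a made-up helper name, and it assumes a readlink that supports -f (GNU coreutils does).

```shell
#!/bin/sh
# resolve_cmd is a hypothetical helper: print the file that a command name
# finally resolves to after following symlinks.
# Assumes readlink supports -f (GNU coreutils and most current systems).
resolve_cmd() {
    cmd_path=$(command -v "$1") || return 1
    readlink -f "$cmd_path"
}

# Only inspect LibreOffice if it is installed on this machine.
if target=$(resolve_cmd libreoffice); then
    echo "libreoffice resolves to: $target"
    # Show what kind of file the launcher is, if file(1) is available.
    if command -v file >/dev/null 2>&1; then
        file "$target"
    fi
else
    echo "libreoffice not found in PATH"
fi
```

On the system described above this would be expected to end at the soffice script, since /usr/bin/libreoffice is a symlink to it.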

- I'm not sure the -env:UserInstallation part is needed, unless you have
some specific requirements

Yes. I typically have LO open 24/7 with other documents.

It should not matter when doing a convert-to. This option is useful if
you want to launch an instance of LibreOffice that stores its
configuration etc. in a different place. You can have many documents open in
LibreOffice and use the command line at the same time.

- the \(encoded\):UTF8 part is not linked to anything, and thus is used
as an input filename. This is most likely not what you want.

Could you elaborate what you mean by not linked to anything? This is
probably the key to what I'm doing wrong. To what would I link it, and how?

The command line is parsed argument by argument. An argument is a single
string, and arguments are separated by spaces. Sometimes an argument
expects a parameter, which it takes from the next string on the command
line. This is common practice for command lines.

In your case, you had: --convert-to csv:Text \(encoded\):UTF8
That gives three strings: "--convert-to", "csv:Text" and
"(encoded):UTF8". "--convert-to" is the argument, "csv:Text" is its
parameter, and "(encoded):UTF8" belongs to nothing, so it is
interpreted as an input filename: your initial command was trying to
open a file named "(encoded):UTF8".
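
The splitting described here can be checked in any POSIX shell without involving LibreOffice at all. A minimal sketch; count_args is a made-up helper name, not part of any tool:

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many separate
# arguments the shell passed to it.
count_args() {
    echo "$#"
}

# Unquoted: the space after "csv:Text" ends the argument, so the shell
# hands over three strings. The backslashes only protect the parentheses.
unquoted=$(count_args --convert-to csv:Text \(encoded\):UTF8)

# Quoted: the whole specification stays attached to --convert-to.
quoted=$(count_args --convert-to 'csv:Text (encoded):UTF8')

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

Whether "Text (encoded)" is the right filter name for a spreadsheet is a separate question; the sketch only shows how quoting changes what the program receives.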

- the --infilter might not be needed, as xlsx files should have enough
information about themselves to load properly.

OK, I'm all for simplicity. But I couldn't find anything in the docs that
specifies that, which is why I used it. Plus the fact that all of the
examples scattered around the Net use it, else I likely wouldn't have been
able to decipher the docs.

I was able to convert an xlsx to a csv in UTF-8 using the following simple
command:

$ libreoffice --headless --nolockcheck --convert-to csv \
    --infilter=CSV:44,34,76,1 a.xlsx

I tried exactly as you stated (I of course replaced with the proper file
to convert) but for me, that had no result. There were no messages of any
kind, and the file was not created.

infilter does refer to the source file, yes? As my input/source file is
.xlsx, I tried changing CSV in your --infilter, to MS Excel 97, but that
made no difference in the result.

--infilter is not necessarily related to the input; order is important.

It's hard to say what went wrong, but here's my own result (with
LibreOffice version):

$ ls -l
total 16
-rw-rw-r-- 1 cleyfaye cleyfaye 5374 août 29 10:05 a.xlsx
$ file a.xlsx
a.xlsx: Microsoft OOXML
$ libreoffice --headless --nolockcheck --convert-to csv \
    --infilter=CSV:44,34,76,1 a.xlsx
$ ls -l
total 28
-rw-rw-r-- 1 cleyfaye cleyfaye 25 août 29 10:13 a.csv
-rw-rw-r-- 1 cleyfaye cleyfaye 5374 août 29 10:05 a.xlsx
$ file a.csv
a.csv: UTF-8 Unicode text
$ cat a.csv
"a",
"b",
,"c"
,"héhé"
$ libreoffice --version
LibreOffice 5.3.1.2 30m0(Build:2)

There's no need for other parameters to do an xlsx->csv(utf8) conversion.
If that simple command doesn't work, maybe there's another issue.
You could even remove the --nolockcheck if you're sure that the file isn't
open anywhere else, and remove the --headless if you're not running this
command on a server; it should still work.

The "76" is responsible for generating a UTF-8 CSV output. If that is not
one of your requirements, you can slim this down even more:

I do in fact prefer UTF-8 CSV output.

For information, the "44,34,76,1" string comes from this page (the CSV part
is still applicable to LibreOffice):
https://wiki.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Filter_Options
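
For what it's worth, the first two numbers in that string are simply character codes, so the string can be assembled rather than memorised. A rough sketch follows: ascii_code is a made-up helper, the value 76 for UTF-8 is taken from this thread, and the filter name is copied from the second attempt posted earlier. It only prints the argument and does not run LibreOffice.

```shell
#!/bin/sh
# ascii_code is a hypothetical helper: print the decimal character code of
# a single ASCII character (POSIX printf treats a leading quote this way).
ascii_code() {
    printf '%d' "'$1"
}

field_sep=$(ascii_code ',')    # token 1: field separator
text_delim=$(ascii_code '"')   # token 2: text delimiter
charset=76                     # token 3: character set (76 = UTF-8, per this thread)
first_line=1                   # token 4: first line to process

options="$field_sep,$text_delim,$charset,$first_line"
echo "$options"

# Filter name copied from the second attempt earlier in this thread.
convert_arg="csv:Text - txt - csv (StarCalc):$options"
echo "--convert-to '$convert_arg'"
```

The single quotes in the printed line matter, because the filter name contains spaces and parentheses.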

Dohhh again !

The "libreoffice" command/link is there to avoid issues if the way
LibreOffice is installed changes. Suppose a future release renames the
binary to loffice: you could still use the libreoffice command without
issue.
The "soffice" script manages some extra command-line parameters and launches
the actual LibreOffice program. For your purpose it should make no
difference which one you use, except that directly calling things in
/usr/lib/libreoffice is not as future-proof.

I use the full path because it's the only one that gives output as you can see:

apb@yellow:/usr/local/src/greetonix/src$ /usr/lib/libreoffice/program/soffice -h
apb@yellow:/usr/local/src/greetonix/src$ /usr/bin/libreoffice -h
apb@yellow:/usr/local/src/greetonix/src$ /usr/lib/libreoffice/program/soffice.bin -h
LibreOffice 5.1.6.2 10m0(Build:2)

Usage: soffice [options] [documents...]
[snip]

- I'm not sure the -env:UserInstallation part is needed, unless you have
some specific requirements

Yes. I typically have LO open 24/7 with other documents.

I don't recall for sure, but if I'm not mistaken, the conversion silently fails without it. I resolved that one awhile ago, I no longer remember the details. I suppose I should mention I had this working at one point. I moved the file to a subdir, made minor changes in a script to accommodate the subdir and it broke. I put it all back the way it was and it still refused to work and that's where I am now. Baffled.

In your case, you had: --convert-to csv:Text \(encoded\):UTF8
That gives three strings: "--convert-to", "csv:Text" and
"(encoded):UTF8". "--convert-to" is the argument, "csv:Text" is its
parameter, and "(encoded):UTF8" belongs to nothing, so it is
interpreted as an input filename: your initial command was trying to
open a file named "(encoded):UTF8".

I copied it from a web page, unfortunately I can't find the page again now. Since you're saying it's not needed (as demonstrated by you below), then I guess there's no need to explore this particular argument at this time.

--infilter is not necessarily related to the input; order is important.

If order is important, then I presume the docs are clear on what the ordering should be. Can you link to the docs where it elaborates on the ordering priorities please?

There's no need for other parameters to do an xlsx->csv(utf8) conversion.
If that simple command doesn't work, maybe there's another issue.

I guess there must be another issue. I'm all ears for ideas on determining what that issue is.
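
One way to narrow it down might be a small wrapper that refuses to fail silently. This is only a sketch under assumptions: xlsx_to_csv and the SOFFICE variable are made-up names, mktemp -d must be available, and the throwaway profile is there because of the suspicion raised in this thread that the conversion quietly does nothing when it shares a profile with an instance that is already open.

```shell
#!/bin/sh
# SOFFICE is an assumption: override it if the binary lives elsewhere.
SOFFICE=${SOFFICE:-libreoffice}

# xlsx_to_csv is a hypothetical wrapper, not part of LibreOffice.
# Usage: xlsx_to_csv INPUT.xlsx OUTPUT_DIR
# Returns 1 if the input is unreadable, 2 if no CSV appeared.
xlsx_to_csv() {
    input=$1
    outdir=$2
    [ -r "$input" ] || { echo "cannot read $input" >&2; return 1; }

    # A throwaway profile keeps the run independent of any LibreOffice
    # instance that is already open under the normal profile.
    profile=$(mktemp -d) || return 1

    "$SOFFICE" "-env:UserInstallation=file://$profile" \
        --headless --convert-to csv --outdir "$outdir" "$input"
    rm -rf "$profile"

    # The converter names the output after the input, with a .csv extension.
    expected="$outdir/$(basename "${input%.*}").csv"
    if [ ! -s "$expected" ]; then
        echo "conversion produced no output: $expected" >&2
        return 2
    fi
    echo "$expected"
}
```

Called as xlsx_to_csv KGI_Discontinued.xlsx /tmp/out, it prints the path of the CSV on success and complains on stderr otherwise, which at least separates "the file could not be read" from "LibreOffice ran but wrote nothing".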

For information, the "44,34,76,1" string comes from this page (the CSV part
is still applicable to LibreOffice):
https://wiki.openoffice.org/wiki/Documentation/DevGuide/Spreadsheets/Filter_Options

Yes, thank you. I studied it in detail before posting. Very dense page. Took a very long time to figure out how to construct a filter string, especially token 5. There was zero information regarding what was required and what was optional. Your strings imply everything after token 4 is optional, but the docs fail to state that. So it's a great big frustrating mystery as to what's required and what's not.