Mixing LibreOffice Calc formats with ASCII text

Here are three lines copied from a LibreOffice Calc page
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

The first two columns are dates formatted using numbers/Date/Format 01/01/04 Default to English (USA) Format code MM/DD/YY. They are formatted that way for the entire length of possible columns. It's two columns to allow for sent and deposited dates.

The third column is empty. The fourth column is a type of payment, here Electronic Payment. The fifth column is one letter, here Paid, the sixth column F is formatted as number-currency. the rest is two more comment columns formatted as text.

I'm going to copy those three rows and install the contents right here:
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

Now I'm going to copy the information from the Ubuntu text editor "gimp" and paste them back into the spreadsheet. I select the cell at the upper left just below where I started. I copy from the text editor and ask the spreadsheet to re-enter the data.

I get an "Import" window. It suggests separator options fixed width and tab. I don't understand the "fixed width". But it does show the data with little arrows separating the columns. If I copy from the "Text Import" window I get something that copies back in the text editor exactly as I would have expected. I will spare you seeing the same thing as above.

I click the button that seems to be correct for closing the Import window. The spreadsheet seems to have placed the entire block data into the three cells in the A column. Selecting the three A cells and doing a copy and replace into the text editor I see this:

"03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular" (All in column A)
"03/24/17 03/24/17 EPMT p $159.64 HOUS Waste Management"
"03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks"

I get the same result if I first select a columns A through H before I do the paste.

Note that the entire lines have been honored with quote marks. The $ signs are part of the text but it matters not much because it's in the wrong column anyway. Note that the spaces are still the tab characters that they have been all along.

I have been trying a bunch of schemes that involve copying lines of data from a bank into cells over on the right side and then attempting to use formulas that move the data to the columns I need. I see one of those = signs in the front of the dates that just makes the date into text instead of the coded date it was. The $ sign gets left in the currency formatted column which declares it's just text and will fail to add with other data already present.

Using a multiple step procedure involving the Value() function and removing the $ signs can be made to work but I have to move the data into the spreadsheet one column at a time. Perl5 can handle the modifications to remove the $ and = signs when the actual source is a bank but I still have to use =right(8) on the dates to persuade the spreadsheet to accept what it put out in the first place. The likes of 03/24/17 as text with nothing at the ends always gets something added that makes it into text. (Right now I'm safe from some bank offering 3/24/17 without the leading zero. But. . .)

I'm pretty good with perl5. Does anybody have some experience in getting bank information into text that LibreOffice Calc will accept? It just takes too long to do everything one column at a time. Is the whole problem a bug in the spreadsheet code?

Here are three lines copied from a LibreOffice Calc page
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste
Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

The first two columns are dates formatted using numbers/Date/Format
01/01/04 Default to English (USA) Format code MM/DD/YY. They are formatted
that way for the entire length of possible columns. It's two columns to
allow for sent and deposited dates.

The third column is empty. The fourth column is a type of payment, here
Electronic Payment. The fifth column is one letter, here Paid, the sixth
column F is formatted as number-currency. the rest is two more comment
columns formatted as text.

I'm going to copy those three rows and install the contents right here:
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste
Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

Now I'm going to copy the information from the Ubuntu text editor "gimp"

Do you mean Gedit? Gimp is not a text editor.

and paste them back into the spreadsheet. I select the cell at the upper
left just below where I started. I copy from the text editor and ask the
spreadsheet to re-enter the data.

I get an "Import" window. It suggests separator options fixed width and
tab. I don't understand the "fixed width". But it does show the data with
little arrows separating the columns. If I copy from the "Text Import"
window I get something that copies back in the text editor exactly as I
would have expected. I will spare you seeing the same thing as above.

I click the button that seems to be correct for closing the Import window.
The spreadsheet seems to have placed the entire block data into the three
cells in the A column. Selecting the three A cells and doing a copy and
replace into the text editor I see this:

"03/24/17 03/24/17 EPMT p $16.96 COMM Consumer
Cellular" (All in column A)
"03/24/17 03/24/17 EPMT p $159.64 HOUS Waste
Management"
"03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks"

I get the same result if I first select a columns A through H before I do
the paste.

Are those spaces between each column or are they TAB (↹) characters?

Note that the entire lines have been honored with quote marks. The $ signs
are part of the text but it matters not much because it's in the wrong
column anyway. Note that the spaces are still the tab characters that they
have been all along.

I have been trying a bunch of schemes that involve copying lines of data
from a bank into cells over on the right side and then attempting to use
formulas that move the data to the columns I need. I see one of those =
signs in the front of the dates that just makes the date into text instead
of the coded date it was. The $ sign gets left in the currency formatted
column which declares it's just text and will fail to add with other data
already present.

Using a multiple step procedure involving the Value() function and
removing the $ signs can be made to work but I have to move the data into
the spreadsheet one column at a time. Perl5 can handle the modifications to
remove the $ and = signs when the actual source is a bank but I still have
to use =right(8) on the dates to persuade the spreadsheet to accept what it
put out in the first place. The likes of 03/24/17 as text with nothing at
the ends always gets something added that makes it into text. (Right now
I'm safe from some bank offering 3/24/17 without the leading zero. But. . .)

I'm pretty good with perl5. Does anybody have some experience in getting
bank information into text that LibreOffice Calc will accept? It just takes
too long to do everything one column at a time. Is the whole problem a bug
in the spreadsheet code?

I actually use a macro to automatically copy my clipboard to the first free
row of a certain spreadsheet. I go to my bank's web page, hit Ctrl+a
Ctrl+c, then I hit a fat button in my spreadsheet and it puts everything at
the right place. But the information on my bank's page has TAB separators,
so it was very easy. In your case it seems like there are spaces.

However, I experimented a bit with your data. This is what I did, I don't
know if it's relevant in your case:

1. I copied your three lines from this message into the clipboard (select
the text → Ctrl+c).
2. I then opened a new, empty spreadsheet in LibreOffice Calc. Actually I
did it before step 1, but that shouldn't matter…
3. Now I right clicked a cell, A1 in this case, and clicked Paste Special
(or similar – I use the Swedish user interface, I didn't bother to switch
to English…)
4. I can now choose between ”Unformatted text” and ”HTML”. I chose the
first one, ”Unformatted text” and hit OK.
5. Now the Import dialogue pops up.
6. Since your date column are a bit odd here, I selected English (USA), I'm
not sure that's necessary, I think it is in my case but not in yours
(default date format here is the ISO-8601 ”YYYY-MM-DD”).
7. I select Space as separator and I also select ”Merge delimiters” and
”Detect special numbers” .
8. After I hit OK, all the data end up in their different rows and columns,
like expected.

One odd thing is that even though I selected English (USA) in atep 6, the
currency column was formatted as my local currency, not USD. I don't know
if that's a bug or if I did something wrong, but it's easy to correct. Just
pre-format those columns or format them afterwards.

Kind regards

Johnny Rosenberg

Here are three lines copied from a LibreOffice Calc page
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

And what exactly are you trying to do with them? It would help to start by explaining that!

The first two columns are dates formatted using numbers/Date/Format 01/01/04 Default to English (USA) Format code MM/DD/YY. They are formatted that way for the entire length of possible columns. It's two columns [blah, blah]. The third column is [blah, blah]. The fourth column is [blah, blah]. The fifth column is [blah, blah], the sixth column F is [blah, blah]. the rest is two more [blah, blah].

I'm going to copy those three rows and install the contents right here:
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

Hold on: how is that any different from your first copy above?

Now I'm going to copy the information from the Ubuntu text editor "gimp" and paste them back into the spreadsheet.

You say you copied the information from a spreadsheet. (But I'm beginning to think that's not what you are trying to do.) You are now trying to paste it into a spreadsheet. You can copy information from a spreadsheet cell range from one place to another, from one sheet to another, or even from one document to another directly.

I select the cell at the upper left just below where I started. I copy from the text editor and ask the spreadsheet to re-enter the data. I get an "Import" window.

Good: that's what you need.

It suggests separator options fixed width and tab. I don't understand the "fixed width".

If you are importing plain text, you may have information in fixed-width columns. In that case, you will want to choose "Fixed width" and move the column boundaries around appropriately in the Fields view at the bottom.

But it does show the data with little arrows separating the columns.

Are those tab characters? It's worth saying that no-one can really see what your data really looks like from what appears in your mail message - which has only multiple blank characters.

I will spare you seeing the same thing as above.

Phew!

I click the button that seems to be correct for closing the Import window. The spreadsheet seems to have placed the entire block data into the three cells in the A column. Selecting the three A cells and doing a copy and replace into the text editor I see this:
"03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular" (All in column A)
[etc.]

So you've got your import options wrong? If your data is really separated by tabs, you would want to select "Separated by" and "Tab". You can experiment with the options, watching what happens in the Fields display until you see what you need.

I see one of those = signs in the front of the dates that just makes the date into text instead of the coded date it was.

The equals sign indicates a formula. By itself it does not make the change you suggest.

... the Value() function ...
Perl5 can handle the modifications ...
... I still have to use =right(8) ...

Are these more red herrings? Why not simply paste the data?

Does anybody have some experience in getting bank information into text that LibreOffice Calc will accept?

Aha! - at last, some indication of what you are trying to do. Is the original information - your first table - a plain text file? Or is the information in a *single* spreadsheet column, not the multiple columns you suggest?
o Copy the information from your text editor (or wherever it first appears).
o In your spreadsheet, use Edit | Paste Special... (or right-click | Paste Special..., or Ctrl+Shift+V).
o Select "Unformatted text".
o In the Text Import window, tinker with the options until you see what you need in the Fields display. You will probably want Space (or Tab?) and "Merge delimiters" selected.
o It is sometimes necessary to select the headers of various columns in the Fields display and to use the "Column type" drop-down to modify the data type. But this shouldn't be necessary in your case.

Is the whole problem a bug in the spreadsheet code?

Nope.

See also Chapter 1 of the Calc Guide.

I trust this helps.

Brian Barker

Hi,

I tried your example with the Fedora text editor and I was able to
reproduce the behaviour you are describing. The way out is to tell Calc
what type of data you have in the columns before you hit the "OK"
button of the import panel. To do so, make sure the arrows are placed
at the right place (you can move them with the mouse or click on them
to make them disappear) and then click on where it says "Standard" in
the column headings and select the type of data you have from the
pulldown just above (by default it should say "Standard"). My only
issue was with the numbers (I use a "," as a decimal separator) so I
had to perform a search and replace to change the decimal points to a
comma (you can do this in the text editor or in Calc) - you must do
this if you are not using the locale-defined separator in your text
because otherwise Calc will see the "numbers" as pure text.

To answer your other question (fixed-width vs. Tab), "fixed-width" is a
way for you to define where the fields are in your text (this is why
you have the little arrows); tab-delimited (you can also select other
delimiters) implies the fields on a line are separated by a tab (\t)
character (or comma, semi-colon, space, etc from the selection).

I hope this helps.

Rémy Gauthier.

Here are three lines copied from a LibreOffice Calc page
> 03/24/17    03/24/17        EPMT    p    $16.96    COMM Consumer

Cellular

> 03/24/17    03/24/17        EPMT    p    $159.64    HOUS    Waste

Management

> 03/24/17    03/29/17        EPMT    p    $330.65    COMM    Pair

Networks

The first two columns are dates formatted using numbers/Date/Format 
01/01/04 Default to English (USA) Format code MM/DD/YY.  They are 
> formatted that way for the entire length of possible columns. It's

two

columns to allow for sent and deposited dates.

> The third column is empty. The fourth column is a type of payment,

here

> Electronic Payment. The fifth column is one letter, here Paid, the

sixth

> column F is formatted as number-currency. the rest is two more

comment

columns formatted as text.

> I'm going to copy those three rows and install the contents right

here:

> 03/24/17    03/24/17        EPMT    p    $16.96    COMM Consumer

Cellular

> 03/24/17    03/24/17        EPMT    p    $159.64    HOUS    Waste

Management

> 03/24/17    03/29/17        EPMT    p    $330.65    COMM    Pair

Networks

> Now I'm going to copy the information from the Ubuntu text editor

"gimp"

> and paste them back into the spreadsheet. I select the cell at the

upper

> left just below where I started. I copy from the text editor and ask

the

spreadsheet to re-enter the data.

> I get an "Import" window.  It suggests separator options fixed width

and

tab. I don't understand the "fixed width". But it does show the data 
with little arrows separating the columns. If I copy from the "Text 
Import" window I get something that copies back in the text editor 
> exactly as I would have expected. I will spare you seeing the same

thing

as above.

I click the button that seems to be correct for closing the Import 
> window. The spreadsheet seems to have placed the entire block data

into

> the three cells in the A column. Selecting the three A cells and

doing a

copy and replace into the text editor I see this:

"03/24/17    03/24/17        EPMT    p    $16.96    COMM Consumer 
Cellular"    (All in column A)
"03/24/17    03/24/17        EPMT    p    $159.64    HOUS    Waste 
Management"
> "03/24/17    03/29/17        EPMT    p    $330.65    COMM    Pair

Networks"

> I get the same result if I first select a columns A through H before

I

do the paste.

Note that the entire lines have been honored with quote marks. The $ 
> signs are part of the text but it matters not much because it's in

the

> wrong column anyway. Note that the spaces are still the tab

characters

that they have been all along.

> I have been trying a bunch of schemes that involve copying lines of

data

> from a bank into cells over on the right side and then attempting to

use

Wow. The Sunday response is warmly appreciated.

Re-copying from the spreadsheet into gedit
03/24/17 03/24/17 EPMT p $16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p $159.64 HOUS Waste Management
03/24/17 03/29/17 EPMT p $330.65 COMM Pair Networks

The copy and paste used tabs between items. There are no spaces except in column H. Column B is empty and there are two tabs. Any multiple spaces are probably coming through mail processing.

Selecting A418, a bare area, with a simple paste, the text import window comes up.
Unicode (UTF-16)
Default-English (USA)
From row 1 -+ (The -+ pair seems to be part of the background. Mousing them changes the "1" to another integer. I left it as 1.
Separator options Fixed width (set by the Text import window)

Separator options show "fixed width" and "tab"
I'm going to change the fixed width to "tab separated". The tab spot got brighter in the Text import window.
The little arrows disappeared and I now see columns easily including the blank C column. The import window will respond to a larger rectangle using the mouse.

There is also something called "text delimiter" which has been set to a " character. I'm going to turn that off. But I can't, it's either ' or ".

Other options are "quoted field as text" and "detect special numbers". I have no idea but they are blank.

There is another option "Field column type" that doesn't seem to do anything.

Poking "OK"

It almost works! The only remaining problem seems to be that the sum of the dollar amounts is zero. But taking the $ sign off is simple and the worksheet will put it back on if I repeat the currency display format.

03/24/17 03/24/17 EPMT p 16.96 COMM Consumer Cellular
03/24/17 03/24/17 EPMT p 159.64 HOUS Waste Management
03/24/17 03/29/17 EPMT p 330.65 COMM Pair Networks

That worked OK.

OK. I'll try to find a newer version of LibreOffice. I thought I would be notified by Ubuntu who wants to change stuff every morning.

In any case I think I'm happy. Thanks to all who helped.
And yes. I would like to see "yyyy MM dd hh mm ss" here in the US of A. My real work is in physics where the MSB always comes first.

Wow. The Sunday response is warmly appreciated.

No probs!

I'm going to change the fixed width to "tab separated".

As advised. Good.

There is also something called "text delimiter" ...

If your input data had text which looked like numbers (or whatever) but needed to be interpreted as text, the *original data* might have had quoted strings. In that case, you would need to tell LibreOffice whether those quotes were single or double.

There is another option "Field column type" that doesn't seem to do anything.

As previously explained, if you first select the header of a column in the Fields display, you can then choose a type. This is useful, for example, if dates are in dd/mm when you need them in mm/dd and the like.

The only remaining problem seems to be that the sum of the dollar amounts is zero.

That means that you have imported the currency strings as text, not values. To avoid this, tick "Detect special numbers". (Some people think this should be ticked by default.)

In any case I think I'm happy.

Good-oh!

Have you read "Opening CSV files" in Chapter 1 of the Calc Guide?

Brian Barker