free text editor (was: [libreoffice-users] delete hidden formated text in odt)

Thanks for this information;
some on this list might enjoy knowing about it.

Since I think you sent this to me by mistake, I am sending it on to the list;
I am not interested in changing.

Do you have some advice how to remove the hidden (hard-)formatted text in a
simple way?

File > Save :: Save as type = Text (.txt) OR Text - Choose Encoding (.txt)
Thus, LO can be used to read, create and edit unformatted text files.

Hallo,

> Do you have some advice how to remove the hidden (hard-)formatted text in a
> simple way?

> File > Save :: Save as type = Text (.txt) OR Text - Choose Encoding (.txt)
> Thus, LO can be used to read, create and edit unformatted text files.

Yes, you are right, but it does not solve my problem. The text file will
contain the hidden formatted text.

To make it very clear:
I do not care about the format information in the file.
I do not want the content (the hidden formatted characters) in the file before
shipping.

Using Linux, the command-line program xpath might do the job. It is capable of
removing certain XML tags in a file, like

<text:p text:style-name="P2">blabla</text:p>

Regards

Walther
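
A minimal sketch of that idea in Python, for readers without the xpath tool. It assumes the hidden formatting is stored in content.xml as an automatic style whose text properties carry text:display="none" (the attribute named later in this thread), and that the hidden text sits in text:span, text:p or text:h elements using such a style. The function names are illustrative, not part of any LibreOffice API. Hidden spans are removed; hidden paragraphs are kept but emptied. ElementTree renames namespace prefixes it has not been told about, so treat this as a demonstration on a fragment rather than a finished tool.

```python
import xml.etree.ElementTree as ET

# ODF namespace URIs used below.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Keep the familiar prefixes when serializing; prefixes that are not
# registered here come out as ns0, ns1, ... (still valid XML).
for prefix, uri in (("text", TEXT_NS), ("style", STYLE_NS), ("office", OFFICE_NS)):
    ET.register_namespace(prefix, uri)

STYLE_NAME = "{%s}style-name" % TEXT_NS
SPAN = "{%s}span" % TEXT_NS
BLOCKS = ("{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS)


def hidden_style_names(root):
    """Names of styles whose text properties set text:display="none"."""
    names = set()
    for style in root.iter("{%s}style" % STYLE_NS):
        for props in style.findall("{%s}text-properties" % STYLE_NS):
            name = style.get("{%s}name" % STYLE_NS)
            if name and props.get("{%s}display" % TEXT_NS) == "none":
                names.add(name)
    return names


def prune(parent, hidden):
    """Remove hidden spans and empty hidden paragraphs below `parent`."""
    prev = None
    for child in list(parent):
        is_hidden = child.get(STYLE_NAME) in hidden
        if is_hidden and child.tag == SPAN:
            # Keep the visible text that follows the span (its tail).
            tail = child.tail or ""
            if prev is None:
                parent.text = (parent.text or "") + tail
            else:
                prev.tail = (prev.tail or "") + tail
            parent.remove(child)
            continue
        if is_hidden and child.tag in BLOCKS:
            # Keep the paragraph element itself but delete its content.
            child.text = None
            for grandchild in list(child):
                child.remove(grandchild)
        else:
            prune(child, hidden)
        prev = child


def strip_hidden(content_xml):
    """Return content.xml (given as a string) without hidden text."""
    root = ET.fromstring(content_xml)
    prune(root, hidden_style_names(root))
    return ET.tostring(root, encoding="unicode")
```

Hidden formatting that comes from a named character style defined in styles.xml would not be found this way, since only content.xml is inspected.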

Walther -

Could you open your content.xml with Emacs and record a macro to find and remove the hidden text? (I'm not sure that qualifies as simple.)

- Robert

Perhaps LO Find & Replace using Regular expressions would work.
Not an LO solution, but sed can be useful for such tasks.
Can you post a sample of the file contents?

Notepad was still in Windows distributions the last time I was aware.
Windows 7 included it, and I think Windows 10 still includes Notepad.
If not, there is a free program named Notepad++ .

Another fairly independent way to distribute documents is by outputting to
PDF. Every platform that I know of supports PDF.

I tend to have NoteTab and Notepad++ installed on most of the Windows systems - Win7 through Win10.

Both work fine for me. I used one or the other when I was hand-coding web pages back in the WinXP days, before I switched to Ubuntu
Linux.

Hi Robert

> Walther -
>
> Could you open your content.xml with Emacs and record a macro to
> find and remove the hidden text? (I'm not sure that qualifies as
> simple.)

Unfortunately, I am not used to emacs, although I know it is a very powerful
editor.

Walther

Walther Koehler wrote:

> Hi Robert
>
> > Walther -
> >
> > Could you open your content.xml with Emacs and record a macro to
> > find and remove the hidden text? (I'm not sure that qualifies as
> > simple.)
> Unfortunately, I am not used to emacs, although I know it is a very powerful
> editor.

Here is a simple recipe that you can use with any text editor (including sed):
Please do this on a copy of your original document, or make a backup copy first.

Unzip from your document the file content.xml:

unzip HiddenText.odt content.xml

Edit content.xml, replace all occurrences of text:display="none" with fo:color="#800080"
This replaces the hidden text with a colored text. I have chosen magenta (#800080) here; you should choose a color that is not used in your document.

sed -e 's/text:display="none"/fo:color="#800080"/g' content.xml > content2.xml

Rename the new file to content.xml

mv content2.xml content.xml

Replace content.xml in the ODT file

zip HiddenText.odt content.xml
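
The shell steps above could also be done in one pass with Python's standard zipfile module. This is only a sketch under the same assumptions as the recipe: recolor_hidden and rewrite_content are illustrative names, and the output file name in the usage line is hypothetical. The package is copied member by member so that the entry order and each member's compression method are kept, and only content.xml is changed.

```python
import zipfile


def recolor_hidden(content_xml):
    """Same substitution as the sed command, applied to every occurrence."""
    return content_xml.replace('text:display="none"', 'fo:color="#800080"')


def rewrite_content(src, dst, transform):
    """Copy the ODT package `src` to `dst`, passing content.xml through `transform`.

    `src` and `dst` may be file names or binary file objects; they must
    not refer to the same file.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # infolist() follows the archive order, so "mimetype" stays first.
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == "content.xml":
                data = transform(data.decode("utf-8")).encode("utf-8")
            # Passing the ZipInfo keeps the member name and compression method.
            zout.writestr(info, data)


# Example call (the output name is made up for illustration):
# rewrite_content("HiddenText.odt", "HiddenText-colored.odt", recolor_hidden)
```

Writing to a second file leaves the original document untouched, which fits the advice to work on a copy.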

Now open the file in LibreOffice, and replace the magenta-colored text with empty text.

Edit > Find & Replace
Other Options > Format > Font Effects > Font color > Magenta

I found that Replace All did not work. It said that it replaced the text but it didn't (LO 5.1.2.2 on Mac OS X). The alternative Replace and Find extension, however, did work. There you choose
[:::CharColor=8388736::] as the search criterion (colors are given in decimal).
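
If you pick a different color, the decimal value for the search expression is just the hexadecimal RGB code read as one number. A small Python sketch of the conversion (the function name is illustrative):

```python
def color_to_decimal(hex_color):
    """Convert an RGB color such as "#800080" to its decimal value."""
    return int(hex_color.lstrip("#"), 16)


print(color_to_decimal("#800080"))  # 8388736
```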

Hallo,
thanks for your reply.
But I don't want to use another editor.

Walther

Walther Koehler wrote:

> > Walther Koehler wrote:
> > > Thank you.
> > > Your procedure is in principle what I need.
> > > However, it is quite complicated for routine work.
> > >
> > > Walther
> >
> > The first part (replacing hidden text with colored text) could be done with
> > a script.
> yes, and the whole procedure could be packed into a Basic macro.
> The script can be called by a shell command within a macro, and the replace
> functions can be realized with dispatcher commands.
>
> Let us try it.
>
> Walther

Even simpler:

Here is a Basic macro that just walks through the document, and deletes all hidden text. No unzipping, editing, etc. It just works inside the ODT document in LO.

It only considers normal plain text, i.e. not inside tables, sections, frames, footnotes, etc.
If you want that, these have to be specially coded.

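REM ***** BASIC *****

Sub Main
    ' Deletes every hidden text portion in the body text of the current
    ' Writer document.

    Dim oEnum        ' enumeration over the paragraphs (com.sun.star.container.XEnumeration)
    Dim oPar         ' current element of the body text
    Dim oSecEnum     ' enumeration over the text portions of one paragraph
    Dim oParSection  ' current text portion

    oEnum = ThisComponent.Text.createEnumeration()
    Do While oEnum.hasMoreElements()
        oPar = oEnum.nextElement()
        ' The enumeration can also return tables; only handle real paragraphs.
        If oPar.supportsService("com.sun.star.text.Paragraph") Then
            oSecEnum = oPar.createEnumeration()
            Do While oSecEnum.hasMoreElements()
                oParSection = oSecEnum.nextElement()
                ' CharHidden is True for text formatted as Hidden.
                If oParSection.TextPortionType = "Text" And oParSection.CharHidden Then
                    oParSection.setString("")
                End If
            Loop
        End If
    Loop

End Sub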

Are you saying that there are still 'hidden formatted text' things in the text file that is created when one does 'File/SaveAs/.txt'?

?

Luuk wrote:
>
> Are you saying that there are still 'hidden formatted text' things in the
> text file that is created when one does 'File/SaveAs/.txt'?
>
Save as .txt does save all text, including hidden text. Just try it out.

I think I have a different meaning of 'hidden text' than the average reader here .... :)

Hello Luuk : )

> Save as .txt does save all text, including hidden text. Just try it out.

I have, Piet.

I think I have a different meaning of 'hidden text' than the average
reader here .... :)

It would seem so.
Re-inventing the wheel is neither smart nor efficient:

I wrote a file containing just:

"This an ODT file written in LibreOffice 5.1.
"

I saved it in LO native format: an_odt_file.odt
Then I saved it as text (plain, flat, ascii): an_odt_file.txt

I listed the directory where I saved them and:

  bytes date file
  8628 may 5 20:07 an_odt_file.odt
      49 may 5 20:07 an_odt_file.txt

I opened the txt file with various text editors (nano, leafpad, gedit,
even notepad through wine) and have not been able to find any "hidden
text" ... Just 49 bytes including spaces and Line Feed.
What do people understand by "hidden text"? The coding put there
by a word processor?
Saving a file as text is just that: saving it as text (plain, flat, ascii)
with nothing else.

M$Win seems to produce strange things in how some users conceive files:
macros, scripts, programs, etc. just to strip the good old codes that WordStar,
WordPerfect and many others used, and that nowadays are simply _hidden away_
so the user cannot see them. But they are there, doing their tasks.

Text is text. Just like this I've just written is text and nothing more.

Felipe : )

Felipe -

The hidden text being discussed here is described at https://help.libreoffice.org/Writer/Hiding_Text
That doesn't seem to be what you're talking about?

- Robert

Does this sound like a workable approach if you were familiar with
Emacs? If you were able to identify exactly what needs to be removed
from content.xml, and sent me a fragment of a file to illustrate it, I
might be able to specify a sequence of commands that you could use.

I posted the tag which I think is the relevant one before:

>>> <text:p text:style-name="P2">blabla</text:p>

I assume "P2" refers to printable, suppressing output on paper.
if you have a simple command line

Hi Piet,

superb, a macro! I will try it later on.

Walther

Hallo,

look at your *.txt file: the previously hidden formatted text appears as a plain
string.

enjoy LO

Robert Funnell wrote:

> Felipe -
>
> The hidden text being discussed here is described at
> https://help.libreoffice.org/Writer/Hiding_Text
> That doesn't seem to be what you're talking about?
>
> - Robert
>
That page is missing one important case, the one I think we were talking about.

It is normal text, that has been formatted with
Format > Character > Font Effects > Hidden.