formatting emails

Interesting problem: An email text is received which you need to format for publication.
The email uses a carriage return at the end of each line, and a double carriage return
for paragraph spacing. What I'd like to do is remove the cr's at the end of each line, so
the text can be justified, and turn the double cr's into paragraph controls, or just leave
as double cr's. Obviously all this can be done a line at a time by hand, but it's a pain.
Is there any way someone can suggest to automate this?

Thanx--doug

Hi :slight_smile:
Which OS?

How about search&replace the double first and then deal with the remaining singles?
Regards from
Tom :slight_smile:

Well, if I knew how to replace the doubles with a paragraph marker, I could do that.
Then if I knew how to delete the singles, I could do that also.
I'm not trying to be a wiseguy, I don't know how to do that.
The os is PCLinuxOs.

--doug

Search and replace #1 - search for double carriage return and replace
with manual page break

Search and replace #2 - search for carriage return and replace with space

Search and replace #3 - search for manual page break and replace with
(double) carriage return
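
The three passes above can also be tried outside the word processor. Here is a minimal sketch in Python, assuming the email has been saved as plain text with "\n" line endings. The function name unwrap_with_placeholder and the placeholder character are illustrative choices, not anything from this thread.

```python
# Placeholder standing in for the "manual page break"; assumed never to
# occur in the email text itself.
PLACEHOLDER = "\x00"


def unwrap_with_placeholder(text: str) -> str:
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    # Pass 1: protect paragraph breaks (two consecutive line endings).
    text = text.replace("\n\n", PLACEHOLDER)
    # Pass 2: every remaining line ending is a wrap; turn it into a space.
    text = text.replace("\n", " ")
    # Pass 3: restore the protected paragraph breaks.
    return text.replace(PLACEHOLDER, "\n\n")


sample = "first line\nsecond line\n\nnext paragraph\ncontinues here"
print(unwrap_with_placeholder(sample))
```

The order matters: if the single line endings were replaced first, the doubles would be lost along with them.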

On 23.03.2012 20:59, Doug wrote:

Interesting problem: An email text is received which you need to format for publication.
The email uses a carriage return at the end of each line, and a double carriage return
for paragraph spacing. What I'd like to do is remove the cr's at the end of each line, so
the text can be justified, and turn the double cr's into paragraph controls, or just leave
as double cr's. Obviously all this can be done a line at a time by hand, but it's a pain.
Is there any way someone can suggest to automate this?

Thanx--doug

Open...

File type: Text Encoded
Choose encoding and line feed.

What I had done previously was to highlight the email text and paste it.
Now instead I tried to follow your instructions.

I found the file-type Text Encoded.
I couldn't find "encoding" or "line feed" either as one or two lines.

I assume for this purpose, I should save the email to a file, which I did.
I chose file type "Text Encoded" and searched out the file (.eml) which
looked like raw html when I imported it. I deleted all the html stuff on
top, and was left with a file that could not be justified or anything, and
had all the apostrophes and quotation marks replaced by some letter
and symbol crap.

Obviously, I'm not doing what you suggest, but I guess I just don't
understand.

--doug
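
Stray letters and symbols in place of apostrophes and quotation marks are what one would typically see when a raw .eml file is read without undoing its transfer encoding (for example quoted-printable) or with the wrong character set. That is only a guess at the cause here. A hedged sketch of pulling out the decoded plain-text body with Python's standard email package follows; the function name plain_text_body is made up for illustration.

```python
from email import policy
from email.parser import BytesParser


def plain_text_body(raw: bytes) -> str:
    """Return the decoded text/plain body of a raw .eml message, or ""."""
    # policy.default gives the modern EmailMessage API (get_body, get_content).
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    # Ask for the plain-text part only; HTML-only messages yield None.
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        return ""
    # get_content() undoes the transfer encoding and decodes the charset.
    return part.get_content()
```

The bytes would come from the saved file, opened in binary mode. The returned text can then be pasted or opened as ordinary text, without the header and HTML clutter.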

Yes. First recombining the lines into paragraphs:
o Go to Tools | AutoCorrect Options... | Options, and tick "Combine single line paragraphs if length greater than 50%".
o Go to Format | AutoCorrect > | Apply.

Notes:

o You may need to reduce that value of 50% to make the facility work properly in your case. Click on the text of the option and the Edit... button below will light up and enable you to edit the 50% value to a suitable smaller one.

o Applying AutoCorrect will make other changes to your text - which you may not want. You can avoid this in two ways: either switch off unwanted changes in the AutoCorrect dialogue or - more easily - use Format | AutoCorrect > | Apply and Edit Changes instead. Then choose Edit Changes when challenged. Now click the Comment column header in the Accept or Reject AutoFormat Changes panel; this sorts all the changes of a particular type together. Select all the "Combine paragraphs" lines together (using Shift-click or Shift-arrow in the usual way) and click Accept. Now click Reject All to reject all other changes.

Now to remove empty paragraphs:
o In the Find & Replace dialogue, click More Options, and then tick "Regular expressions".
o Search for ^$ - that's circumflex-dollar - and replace with nothing.

I trust this helps.

Brian Barker

This is quickly performed by using regular expressions, either in LO
or more easily in any decent text editor.
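
As an illustration of the regular-expression route outside LO, here is a small Python sketch. It assumes the text is plain text in which a single line ending means a wrapped line and two or more mean a paragraph break; the function name unwrap is illustrative.

```python
import re


def unwrap(text: str) -> str:
    """Replace lone line endings with a space; keep paragraph breaks."""
    # Normalise Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline with no newline directly before or after it is a wrap.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Joining may leave doubled spaces; collapse them.
    return re.sub(r" {2,}", " ", text)


print(unwrap("one\r\ntwo\r\n\r\nthree \nfour"))
```

The lookbehind and lookahead do in one pass what otherwise needs a temporary placeholder.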

On a separate issue, what is the syntax in LO to select paragraph
breaks (pilcrow (¶) sign) in the 'find and replace' dialogue window?

"$", see LibreOffice Help (F1) -> Index -> regular expressions.

Nino

As far as I know, that can't be done (only ↵, that is Shift+Enter, can be
found), and that's why Search And Replace won't work in this case. I
think there is a workaround, but I don't remember how…

Kind regards

Johnny Rosenberg
ジョニー・ローゼンバーグ

There isn't one. LibreOffice doesn't think that way: you don't search for the paragraph break as such. Instead, you do what you need using ^ (circumflex) to lock your search term to the beginning of a paragraph or $ (dollar) to lock it to the end of a paragraph. Perversely, you can search for a line break (as entered with Shift+Enter) using \n (backslash-en). Oh, but if you use \n in a "Replace with" expression, it means a paragraph break instead.

I trust this helps.

Brian Barker

See if this works using Find & Replace with "Regular expressions"
checked (ticked):
1) Enter ^$ into the Search for box.
2) Enter " zzz " (space, then zzz, then space) in the Replace with box.
3) Click the Replace All button. This replaces the second paragraph break
   used for spacing between paragraphs.
4) Enter $ into the Search for box.
5) Enter a space in the Replace with box.
6) Click the Replace All button.
7) Enter " zzz " in the Search for box.
8) Enter \n in the Replace with box.
9) Click the Replace All button.
     The first three steps replace the empty paragraphs used for
spacing with the word "zzz" (the space before and after the zzz allows a
later search for it). The middle three steps replace the rest of the
paragraph breaks, which are used as line breaks, with a space. (This
guarantees a space between the last word of one line and the first word
of the next line.) The last three steps replace the word "zzz" with a
paragraph break.
     One more search might need to be made: replace any double spaces
with a single space.

--Dan

Dan,

a very good step-by-step instruction. Did it make it into ask.libreoffice.org?
IMHO you should post it there, since tagging and voting can make it more
visible there than on the list.

(just my 2¢ suggestion)

Nino

Subsequent posts indicate why it is much easier to solve this question
by use of a text editor instead of LO.

For example, it seems that a double carriage return is equivalent to an
empty line. As for the single line feeds, they can be removed with a
simple find and replace in jEdit:

find
\n

replace with
[a single space]

This would remove the newlines and replace each with a space ( ) character.

... and perhaps even in the shell using html2text, grep, sed, awk...
(as the OP asked for *automation* of the process)

:wink:
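
For the automation angle, here is a hedged sketch of a small script, written in Python rather than sed or awk. It assumes the emails are already saved as plain-text files; the function names reflow and reflow_file and the UTF-8 default are assumptions for illustration only.

```python
import re
from pathlib import Path


def reflow(text: str) -> str:
    """Turn hard-wrapped paragraphs into one line each, blank line between."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Paragraphs are separated by one or more blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # split()/join() joins the wrapped lines and collapses repeated spaces.
    joined = [" ".join(p.split()) for p in paragraphs if p.strip()]
    return "\n\n".join(joined) + "\n"


def reflow_file(src: str, dst: str, encoding: str = "utf-8") -> None:
    """Read src, reflow it, and write the result to dst."""
    Path(dst).write_text(
        reflow(Path(src).read_text(encoding=encoding)), encoding=encoding
    )
```

Calling reflow_file in a loop over a folder of saved emails would handle a whole batch in one go.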

But still we should show that it's also doable with LibreOffice (and it's not
that much more complicated).

Nino

No, and it is not likely to make it there because I don't like the
idea of an Open ID (personal reasons). Well, it would be alright if they
included my user name/password for either the LibreOffice wiki or
Alfresco used by the documentation team writing guides for LO. (But then
this is OT.)
    However, anyone else who can sign in to the site has my permission
to post it for others to see.

--Dan

I wrote a simple macro (just a draft, feel free to modify it in
any way) that does it all. It's based on Dan's post above:

REM ***** BASIC *****

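Sub Main
  Dim sFind As String, sReplace As String

  ' Mark the empty paragraphs (the paragraph spacing) with a placeholder.
  sFind="^$"
  sReplace="¶"
  SearchAndReplace(sFind, sReplace)

  ' Join the remaining paragraphs (the wrapped lines) with a space.
  sFind="$"
  sReplace=" "
  SearchAndReplace(sFind, sReplace)

  ' Turn each placeholder back into a paragraph break.
  sFind="¶"
  sReplace="\n"
  SearchAndReplace(sFind, sReplace)

  ' Collapse runs of two or more spaces into a single space.
  sFind=" {2,}"
  sReplace=" "
  SearchAndReplace(sFind, sReplace)

  ' Strip a leading space from each paragraph.
  sFind="^ "
  sReplace=""
  SearchAndReplace(sFind, sReplace)

  ' Strip a trailing space from each paragraph.
  sFind=" $"
  sReplace=""
  SearchAndReplace(sFind, sReplace)
End Sub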

Sub SearchAndReplace(a As String, b As String)
  Dim Descriptor As Object
  Descriptor=ThisComponent.createReplaceDescriptor()
  With Descriptor
    .SearchString=a
    .ReplaceString=b
    .SearchRegularExpression=True
    .SearchAll=True
  End With
  ThisComponent.replaceAll(Descriptor)
End Sub

You might find this helpful:
<http://extensions.openoffice.org/en/project/AltSearch>
Help file (the same one that's included with the extension):
http://www.volny.cz/macrojtb/HelpAltSearch_en.html
When you open the extension, you can click on
'Regular/Extended/Properties' and you'll be presented with basic
choices. Example: 'Search for: Regular'
...
Series of empty paragraphs ^$\p*
...
Manual line break \n
...
Click the one(s) you want to search on.
Click on the box next to 'Replace:' and you'll be presented with common
choices.

There is also a batch mode where you can edit & save the search/replace
parameters. See 'Batch mode using: [ Batch >> ]' in the help file.