Filtered html document

Good evening everybody

I am trying to make something available through Kindle.
For some reason Amazon recommends preparing the original document as
".doc" or ".docx" file
and in a final step:
save as -> "Filtered html document"
(in order to remove MS office codes)

What would be the "equivalent" for saving a file in LO when I, as it
happens to be the case, do not like working with Word?
And would that option produce the results Amazon is looking for?

Thank you in advance.
Thomas

Thomas Blasejewicz wrote:

> Good evening everybody

It's morning. ;-)

> I am trying to make something available through Kindle.
> For some reason Amazon recommends preparing the original document as
> ".doc" or ".docx" file
> and in a final step:
> save as -> "Filtered html document"
> (in order to remove MS office codes)
>
> What would be the "equivalent" for saving a file in LO when I, as it
> happens to be the case, do not like working with Word?
> And would that option produce the results Amazon is looking for?

Have you tried saving in HTML? It's one of the formats that you can
choose from.

Hi :)
+1
Go straight to HTML and miss out the intermediate step of using MS formats.

If they really insist on you using an MS format and then converting that into HTML, rather than just going straight to HTML, then Doc without the X is the best bet, but I can't see why they would want that. It would be interesting to hear what happens when you give them the HTML. Do they really need that middle step?!

Regards from
Tom :)

(As for the "good evening": here in Japan it is late evening.
Just don't mind that; it is only that I prefer to use a polite opening over exploding with: "I have a problem ...")

(2013/03/20 20:20), Tom Davies wrote:

> Hi :)
> +1
> Go straight to HTML and miss out the intermediate step of using MS formats.
>
> If they really insist on you using an MS format and then converting that into HTML, rather than just going straight to HTML, then Doc without the X is the best bet, but I can't see why they would want that. It would be interesting to hear what happens when you give them the HTML. Do they really need that middle step?!

Well, I do not know. I am not that good with computers.
But the "official guide" from Amazon = "Building your book for Kindle" (PDF file) says so.

Do I understand you correctly ...
I write my text, with pictures and whatever it takes, and then save it as
>> HTML Document (Writer) html <<
from the drop-down list?
This would make me a lot more comfortable than fumbling around with Word, clutching the instruction manual in one hand ...

Thank you.

For the Kindle app on my Android-based tablet, I make the document a PDF file with a small page size, instead of any other format. That was the only way it would read a document from the "external" microSD card.

So, what I am thinking is, you may need to take your Kindle document and set the page and font sizes to those of a Kindle document. I know that there are page tags for HTML.

Can you find out what the difference is between Filtered HTML and "standard" HTML? Also, is version 4 or the newer version 5 of HTML needed?

I would love to know what the final format for Kindle documents is; then I would not have to deal with the whole conversion process I have been using to get my Kindle app to work with a document on the microSD card. With the PDF file, I am given a choice between Kindle and Adobe Reader.

The simplest method would probably be to use a simple .txt [as
notepad] for the document;
           then import it to the kindle program.
       That's worked for me ;-)

Good evening everybody

I have played around a lot with converting documents to Kindle format. I have used a couple different methods.

1. With LO, I use the "Save as" command to save the document in an HTML format. Do NOT use the Export to HTML feature as, for whatever reason, the resulting HTML file is not nearly as clean as when using the "Save as" command. Once saved in the HTML format, I then load it into Calibre, an open source e-book converter. Calibre works really well to convert the HTML document into a MOBI Kindle format, and it has many features I'm only beginning to explore.

Using this method, I get a fairly good Kindle format, with a couple of exceptions. When LO converts to HTML, it inserts an extra space between paragraphs, even though I don't have it in the LO document. This is not unusual with HTML documents. Also, with numbered and bulleted lists, it inserts both an <li> and a <p> tag for each paragraph. It is apparently good HTML coding, and displays properly in a browser, but when it gets converted to MOBI format and downloaded to the Kindle, you get something like this:

I published a shell script to automate conversion with LO from
{.doc,.odt...} to CLEAN HTML here:

http://www.techrepublic.com/blog/opensource/how-to-convert-doc-and-odf-files-to-clean-and-lean-html/3708

HTH
Marco

(2013/03/21 15:07), M. Fioretti wrote:

>> 1. With LO, I use the "Save as" command to save the document in an
>> HTML format. Do NOT use the Export to HTML feature as, for whatever
>> reason, the resulting HTML file is not nearly as clean as when using
>> the "Save as" command. Once saved in the HTML format, I then load
>> it...
>
> I published a shell script to automate conversion with LO from
> {.doc,.odt...} to CLEAN HTML here:
>
> http://www.techrepublic.com/blog/opensource/how-to-convert-doc-and-odf-files-to-clean-and-lean-html/3708

That sounds very promising ... BUT ...
as an ordinary mortal man I am NOT CAPABLE of understanding that article.
The technical language eludes me completely.
So does the technique itself. I have absolutely no idea what to do with that "script".

Is there any chance of selecting a certain file, clicking somewhere, and waiting until the conversion process completes (automatically)?

I agree. Clearly he understands what needs to be done, but what he has written is for people with a high-level understanding of programming. Such is not the case with the average person subscribed to this list who would like to use the script. Two things seem to be missing: a more in-depth explanation of the parts of the script, and examples where .odt and .doc files are converted to clean HTML (one for .odt and one for .doc).
      Examples:

soffice --headless --convert-to output_file_extension[:output_filter_name] [--outdir output_dir] files

soffice --headless (This part I understand.)

--convert-to output_file_extension[:output_filter_name] [--outdir output_dir] files

I have no idea what the components of this are. What part goes with what? The only thing that I do understand is that the things contained in brackets are optional. What is this?:

output_file_extension[:output_filter_name]

What is this? What is its purpose?

[--outdir output_dir]

What is the purpose for ending the entire command line with the term "files"? What files? Can several files be listed? Can * be used in place of "files" to batch convert all the files in a folder? Examples please!
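
For readers with the same questions, here is one hedged reading of that usage line. output_file_extension is the format you want out (html, pdf and so on). The optional :output_filter_name picks a particular export filter when a format has more than one. --outdir names the folder that receives the results (the current folder if it is left out). "files" stands for the documents to convert; several can be listed, and a shell wildcard such as *.odt expands into exactly such a list. The sketch below only prints the command it would run, so nothing is converted. build_convert_cmd is a made-up helper for illustration, not part of LibreOffice.

```shell
#!/bin/sh
# Print (do not run) a soffice conversion command so each part can be inspected.
# build_convert_cmd is a hypothetical helper, not something LibreOffice provides.
build_convert_cmd() {
    ext="$1"      # output_file_extension, e.g. html
    outdir="$2"   # folder passed to the optional --outdir
    shift 2       # whatever is left is the "files" list
    printf 'soffice --headless --convert-to %s --outdir %s' "$ext" "$outdir"
    for f in "$@"; do
        printf ' %s' "$f"
    done
    printf '\n'
}

build_convert_cmd html converted chapter1.odt chapter2.doc
# prints: soffice --headless --convert-to html --outdir converted chapter1.odt chapter2.doc
```

Typing the printed line into a terminal, with your own file names, is what would perform the actual conversion.
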
Another problem line in the article:

  convert_doc_to_html.sh SOURCE_DIR TARGET_DIR

As I understand script files, "convert_doc_to_html.sh" is the name of a script file. Source directory and target directory of what? Here a simple explanation would be helpful. For example, add this to the line:

(SOURCE_DIR is where the files to be converted are located, and TARGET_DIR is where you want the converted HTML files to be created.)
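
To picture what a script called with a SOURCE_DIR and a TARGET_DIR might do, here is a rough sketch. It is not the script from the article: convert_dir and the DRY_RUN switch are invented here for illustration, and it assumes the soffice program is installed and on the PATH when DRY_RUN is turned off.

```shell
#!/bin/sh
# Hypothetical stand-in for a SOURCE_DIR/TARGET_DIR converter; NOT the article's script.
# With DRY_RUN=1 (the default here) the commands are printed instead of executed.
convert_dir() {
    source_dir="$1"   # folder holding the .odt/.doc files to convert
    target_dir="$2"   # folder that receives the .html results
    mkdir -p "$target_dir"
    for f in "$source_dir"/*.odt "$source_dir"/*.doc; do
        [ -e "$f" ] || continue   # skip a pattern that matched no file
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "soffice --headless --convert-to html --outdir $target_dir $f"
        else
            soffice --headless --convert-to html --outdir "$target_dir" "$f"
        fi
    done
}

# Example call (prints one line per document found):
#   convert_dir "$HOME/book_src" "$HOME/book_html"
```

Running it first in the printing mode lets you check which files would be picked up before anything is written; setting DRY_RUN=0 beforehand makes it do the real conversion.
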

Another suggestion: Describe the script file before listing the code for it. Include directions for creating a temporary folder (directory) to contain the .doc or .odt files to be converted. This way lines 4 and 5 can be kept as is: the folder is after all temporary. Also include directions for creating the folder to contain the converted HTML files.

Include more detailed instructions on how to create the /tidy_options.conf/ file and where to save it.
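
As an illustration only, and assuming tidy_options.conf is a configuration file for the HTML Tidy program, such a file is plain text with one "option: value" pair per line. The sketch below writes one into the temporary folder. The options shown are real Tidy options, but choosing these particular ones is a guess, not the settings from the article; write_tidy_conf is a made-up helper name.

```shell
#!/bin/sh
# Write an example Tidy configuration file.
# The option choices are illustrative guesses, NOT the settings used in the article.
write_tidy_conf() {
    cat > "$1" <<'EOF'
clean: yes
bare: yes
word-2000: yes
drop-empty-paras: yes
char-encoding: utf8
EOF
}

conf_file="${TMPDIR:-/tmp}/tidy_options.conf"
write_tidy_conf "$conf_file"

# Later use (needs the tidy program installed):
#   tidy -config "$conf_file" -output clean.html raw.html
```

The file can live anywhere, as long as the path given after -config points to it.
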

I must admit that having to reread your article several times while writing this email has given me a better understanding of what you wrote. It has taken that long for me to be able to piece together what you wrote. Even so, I may still miss some parts because I do not understand even some of the fundamentals of programming languages. (I wonder how many others don't either.)

--Dan

yes, you're so right - these computer guys tend to speak their own
language;
            in person, I stare at them in disbelief and they'll many times
'speak English' ...
               on this list - and in writing - I can't do that ;-(
                   so I ignore, hoping to find someone who can speak both
computer jargon as well as simple English ;-)

       I learned from the beginning these 'manuals' were not made for the
layman ;-)

       Shucks, changing the time on a vehicle once was simple .. then they
came up with a menu ... then they hide this menu so it's not findable;
reading the manual merely goes 'round 'n 'round saying go to menu or click
on this button or ... ... ... but nary a word as to how to find these
buttons ;-)

       And I've stayed confused as to where the vehicles' horn has been
located - once was in the center of the wheel;
            or bright lights button - once was on the floor board, now can
be anywhere;
                and to have the cruise and radio buttons on steering wheels
must have been the design of some non-driver ;-)

       I think I belong in a previous generation -
            where horsepower meant using horses ...
                where communication meant face to face ...
                    where reading was holding the actual book in a
comfortable chair ... ... ...

anne-ology wrote:
<snip>

>        I think I belong in a previous generation -
>             where horsepower meant using horses ...
>                 where communication meant face to face ...
>                     where reading was holding the actual book in a
> comfortable chair ... ... ...

+2
It's not a generation, it is a state of mind.
Girvin Herr
<snip>

and my state of mind is ... ... ... [image: Inline image 1]

       yikes, there's a mouse on the desktop; what should I do [image:
Inline image 2]
           no mouse ever sat here next to the typewriter; but this
glorified typewriter seems infested with bytes [image: Inline image 2]

       and all those CDs aren't recognized at the bank ...
           and the monitor doesn't seem to see any typos ...
               and I liked that computer scene in 'Bridget' ...
                   oops, seems the less memory there is, whenever the
memory increases ...
                       ah ah ah choo - there goes another virus flitting
through the internet waves ...

[here's a list compiled a while back -
https://www.dropbox.com/s/oqborwd6xqyckgl/terms.txt]

Thomas wrote:

> Is there any chance of selecting a certain file, clicking somewhere, and
> waiting until the conversion process completes (automatically)?

That is essentially what Calibre does. <http://calibre-ebook.com>

It converts a variety of formats, including EPUB, HTML and ODT directly to Kindle Mobi format. The results are varied depending on how complex the original document is. But, at least it's easy to use.

Virgil
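
For anyone who would rather type a command than click, Calibre also installs a command-line program named ebook-convert, which takes an input file followed by an output file. The sketch below only prints the command for a given document, so it can be checked before being run; mobi_cmd is a made-up helper for illustration, and the file name is an example.

```shell
#!/bin/sh
# Print the Calibre command-line call for one document.
# ebook-convert ships with Calibre; mobi_cmd is a hypothetical helper for illustration.
mobi_cmd() {
    src="$1"
    out="${src%.*}.mobi"   # same base name, .mobi extension
    echo "ebook-convert $src $out"
}

mobi_cmd mybook.html
# prints: ebook-convert mybook.html mybook.mobi
```
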