I agree. Clearly he understands what needs to be done, but what he has written is aimed at people with a high level of understanding of programming. That is not the case for the average person subscribed to this list who would like to use the script. Two things seem to be missing: a more in-depth explanation of the parts of the script, and examples in which .odt and .doc files are converted to clean HTML (one for .odt and one for .doc).
Examples:
soffice --headless --convert-to output_file_extension[:output_filter_name] [--outdir output_dir] files
soffice --headless (This part I understand.)
--convert-to output_file_extension[:output_filter_name] [--outdir output_dir] files
I have no idea what the components of this are. What part goes with what? The only thing that I do understand is that the things contained in brackets are optional. What is this?:
output_file_extension[:output_filter_name]
What is this? What is its purpose?
[--outdir output_dir]
What is the purpose for ending the entire command line with the term "files"? What files? Can several files be listed? Can * be used in place of "files" to batch convert all the files in a folder? Examples please!
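For illustration, an example of the kind being asked for might look like the sketch below. It is only a guess at a typical use, not the article author's own example. The folder name html_out and the file names letter.odt and report.doc are made up, and the optional :output_filter_name part is left out. The command is only printed, not run, so nothing is converted by accident.

```shell
# All names below (html_out, letter.odt, report.doc) are made up for illustration.
EXT="html"          # goes with --convert-to: the extension the converted files get
OUTDIR="html_out"   # goes with --outdir: the folder that receives the converted files

# "files" at the end of the command stands for the documents to convert.
# Here two are listed, one .odt and one .doc, separated by a space.
CMD="soffice --headless --convert-to $EXT --outdir $OUTDIR letter.odt report.doc"

# Print the command instead of running it.
echo "$CMD"
```

If a pattern such as *.odt is typed in place of the file names, the shell replaces it with the list of matching files in the current folder before soffice ever sees it, so several files are handed over in one go.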
Another problem line in the article:
convert_doc_to_html.sh SOURCE_DIR TARGET_DIR
As I understand script files, "convert_doc_to_html.sh" is the name of a script file. Source directory and target directory of what? Here a simple explanation would be helpful. For example, add this to the line:
(SOURCE_DIR is where the file to be converted is located, and TARGET_DIR is where you want the converted HTML file to be created.)
Another suggestion: Describe the script file before listing the code for it. Include directions for creating a temporary folder (directory) to contain the .doc or .odt files to be converted. This way lines 4 and 5 can be kept as is: the folder is, after all, temporary. Also include directions for creating the folder to contain the converted HTML files.
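The folder directions suggested above could be as short as the following sketch. The folder names docs_to_convert and converted_html are hypothetical, and a scratch location from mktemp is assumed here only so that the example does not touch anyone's real files. The script name and the order of its two arguments are taken from the line quoted from the article.

```shell
# Hypothetical folder names under a scratch location; substitute your own.
WORK_DIR=$(mktemp -d)
SOURCE_DIR="$WORK_DIR/docs_to_convert"   # put the .doc or .odt files here
TARGET_DIR="$WORK_DIR/converted_html"    # the converted HTML files end up here

# -p creates a folder if it is missing and does not complain if it already exists.
mkdir -p "$SOURCE_DIR" "$TARGET_DIR"

# After copying the documents into "$SOURCE_DIR", the article's script would be
# started like this. "echo" only prints the command; remove it to run the script.
echo ./convert_doc_to_html.sh "$SOURCE_DIR" "$TARGET_DIR"
```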
Include more detailed instructions on how to create the /tidy_options.conf/ file and where to save it.
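As a rough idea of what such instructions might contain, the sketch below writes a small options file from the command line. The options listed are real HTML Tidy settings, but which ones the article's own file uses is not known here, so the selection is an assumption, as is the location of the file (a scratch folder from mktemp). The file names converted.html and clean.html are made up.

```shell
# Hypothetical location; the article's script may expect the file somewhere else.
CONF_DIR=$(mktemp -d)
CONF_FILE="$CONF_DIR/tidy_options.conf"

# Write a few HTML Tidy options, one "name: value" pair per line.
# This selection is a guess, not the contents of the article's file.
cat > "$CONF_FILE" <<'EOF'
clean: yes
bare: yes
word-2000: yes
wrap: 0
tidy-mark: no
char-encoding: utf8
EOF

# Tidy is pointed at the file with its -config option, and -o names the output.
# "echo" only prints the command; remove it to run Tidy for real.
echo tidy -config "$CONF_FILE" -o clean.html converted.html
```

The same file can of course be typed into any plain-text editor instead; what matters is that it is saved as plain text and that its path matches whatever the script passes to Tidy.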
I must admit that having to reread your article several times while writing this email has given me a better understanding of what you wrote. It has taken that long for me to be able to piece together what you wrote. Even so, I may still miss some parts because I do not understand even some of the fundamentals of programming languages. (I wonder how many others don't either.)
--Dan