About Unicode txt documents:

Hi everyone.
Someone gave me some txt documents that were saved from Microsoft Word with
the "Unicode" encoding.
But when I use the LibreOffice Open dialog, select the txt file and choose
an encoding, LibreOffice does not offer plain "Unicode"; it only has
"Unicode (UTF-8)".
Because of this I can't open the documents properly. Opened as Unicode
(UTF-8), the documents saved from Microsoft Word are completely unreadable
and the characters are not displayed in Persian!
What should I do to solve this?
The problem is very critical for me, and these documents are very valuable to me.
Thanks for your help, and God bless you all.

-- O people! there has come to you indeed an admonition from your Lord
and a healing for what is in the breasts and a guidance and a mercy
for the believers.
Say: In the grace of Allah and in His mercy-- in that they should
rejoice; it is better than that which they gather.
holy quran, chapter 10 verses 57 and 58.

please visit al-islam.org

Are you sure that the file is actually Unicode? MS Word has a history of
not doing what it's told with file encodings...

If it is, you might want to try UTF-16. A quick search
(http://www.herongyang.com/Unicode/Word-Save-File-in-Unicode.html) suggests
that Word's unlabelled "Unicode" setting can be UTF-16 with a BOM (byte
order mark). Now, BOM support in LibreOffice is another matter. The UTF-16
encoding should be right below UTF-8 in the list; at least it is with
LibreOffice 5.0.5.2.
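
If opening the file as UTF-16 in LibreOffice does not work, one workaround is to re-save it as UTF-8 outside the office suite. Below is a minimal Python sketch, assuming the file really does start with a BOM; the function names `guess_encoding` and `convert_to_utf8` are made up for this example, and a file without a BOM would simply be treated as UTF-8 here.

```python
import codecs

# BOM signatures. UTF-32 comes first because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a Python codec name based on the BOM, or `default` if none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


def convert_to_utf8(src_path: str, dst_path: str) -> str:
    """Re-save a text file as UTF-8; returns the encoding that was guessed."""
    with open(src_path, "rb") as f:
        data = f.read()
    encoding = guess_encoding(data)
    # The "utf-16", "utf-32" and "utf-8-sig" codecs consume the BOM themselves.
    text = data.decode(encoding)
    # newline="" writes line endings exactly as they appear in the text.
    with open(dst_path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    return encoding


# In-memory demonstration: Python's "utf-16" codec writes a BOM when encoding.
raw = "\u0633\u0644\u0627\u0645".encode("utf-16")  # the word "salam"
print(guess_encoding(raw))
```

Calling `convert_to_utf8("in.txt", "out.txt")` (both paths are placeholders) would then give a file that can be opened with the "Unicode (UTF-8)" choice in the dialog.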

(Also, your signature is ill-formed; it should begin with "-- ", two dashes
and a space, followed by a newline. This way people who don't care about
signatures can easily filter them out).

Nasrin:

Just in case: your default font is one that has the Persian characters,
right?

If you try "Insert | Special Character ..." and don't see any of the
characters you need when you scroll down, you'll need to change the font for
your Default Style. Free-* and Liberation-* fonts might not be to your
liking, but they have all the characters you'll need to test.

Hi.
Yes. My documents are in the Persian language and the encoding is Unicode.

Hi Nasrin:

Your comment that the "encoding is Unicode" is rather meaningless, since
Unicode simply assigns a unique number (a code point) to almost every
character used to write something down. The actual "encoding" you care
about is the specific method used to represent these numbers as bytes,
which is typically UTF-8 for most uses, but can be UTF-16 or UTF-32 for
special cases.
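
To make the difference between the Unicode number and its encoding concrete, here is a small Python sketch (the helper name `encoded_forms` is invented for this example). It takes one character, غ, and shows the bytes each encoding produces for the same code point. The big-endian variants are used only so that no BOM is added to the output.

```python
def encoded_forms(ch: str) -> dict:
    """Map encoding name -> hex string of the bytes for one character."""
    encodings = ("utf-8", "utf-16-be", "utf-32-be")
    return {enc: ch.encode(enc).hex() for enc in encodings}


ghain = "\u063a"  # ARABIC LETTER GHAIN
print(f"code point: U+{ord(ghain):04X}")
for enc, hex_bytes in encoded_forms(ghain).items():
    print(f"{enc:10} {hex_bytes}")
```

The code point is the same in all three cases; only the byte representation differs, which is why a file written in one encoding looks like garbage when read as another.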

If you go to BUG #92655
(https://bugs.documentfoundation.org/show_bug.cgi?id=92655), and then to the
second comment (marked as Comment #1) and download the attachment (117160)
listed there, you'll find a 32 page pdf document I created some time back
titled "Exploring Complex Text Layout." This document covers a lot of what
you need to understand to use other scripts in LibreOffice Writer,
particularly if you are using more than one script/language in the same
document.

Since you are presumably writing Farsi, a language which uses the Arabic
script, you'll be interested (maybe) that this script, along with Thai,
Devanagari (used for Hindi), and Hebrew, is used in some of the examples in
my pdf.

Unfortunately, I am unfamiliar with Farsi, so my Arabic script examples are
in the Arabic language (well, one flavor of that), but I'm sure you'll find
the discussion informative, as it covers a lot of the niceties such as
contextual alteration of the character forms, kashideh justification, issues
with using right-to-left scripts in Writer, and so forth.

Beginning on page 33 of the document, there is an explanation of how Unicode
values may be converted into one, two, three, or four bytes in standard
UTF-8 encoding, and why these options are all needed. While UTF-32
characters are always 32 bits (4 bytes) long, UTF-8 character sizes vary
with the code point value *of each individual character*. While the غ or ي
characters are each two bytes in length, a space or a carriage return is
still only one byte in length. Although it seems on the surface to be a
complicated way of doing things, it's actually a really cool way of
achieving a rather difficult objective.
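
A quick way to see the variable lengths for yourself is the following Python sketch. The helper names and the Thai and emoji samples are additions for illustration only; the range boundaries in `expected_length` are the standard UTF-8 ones.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def expected_length(ch: str) -> int:
    """Length predicted from the code point alone, per the UTF-8 ranges."""
    cp = ord(ch)
    if cp <= 0x7F:
        return 1
    if cp <= 0x7FF:
        return 2
    if cp <= 0xFFFF:
        return 3
    return 4


samples = [
    ("space", " "),
    ("carriage return", "\r"),
    ("Arabic ghain", "\u063a"),
    ("Arabic yeh", "\u064a"),
    ("Thai ko kai", "\u0e01"),
    ("emoji", "\U0001F600"),
]

for label, ch in samples:
    print(f"{label:16} U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
```

The length depends only on how large the code point is, so plain ASCII text costs nothing extra while Arabic-script letters take two bytes each.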

There is also a section in the pdf about various ways of entering
characters; having a keyboard mapped to another language is great until you
need to switch back and forth on a regular basis.

The bottom line is that the font that is in use must contain the characters
from the Unicode Arabic block (U+0600 to U+06FF) in order to reproduce
Persian/Farsi, which is why I suggested the fonts that I did, as I know that
those include these characters.
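
If you want to know exactly which characters from that block a given text needs, a short Python sketch like the one below can list them. Note that it does not inspect any font; it only reports which characters of the text fall inside U+0600 to U+06FF and which do not. The function name `arabic_block_report` and the sample string are made up for this example.

```python
import unicodedata

ARABIC_BLOCK = range(0x0600, 0x0700)  # U+0600 through U+06FF inclusive


def arabic_block_report(text: str):
    """Return (inside, outside): the distinct non-whitespace characters of
    `text` that are in the Arabic block, and those that are not, both sorted."""
    chars = {c for c in text if not c.isspace()}
    inside = sorted(c for c in chars if ord(c) in ARABIC_BLOCK)
    outside = sorted(c for c in chars if ord(c) not in ARABIC_BLOCK)
    return inside, outside


# The word "parsi" in Persian, followed by two Latin letters.
sample = "\u067e\u0627\u0631\u0633\u06cc ok"
inside, outside = arabic_block_report(sample)
for c in inside:
    print(f"U+{ord(c):04X} {unicodedata.name(c, '<unnamed>')}")
print("outside the block:", outside)
```

Each code point printed here can then be looked up under "Insert | Special Character ..." to confirm the chosen font has it.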

You are stepping into a very interesting area, particularly as LibreOffice
is not particularly good with right-to-left languages if you attempt some
things that are trivial in English (e.g. rotating text in a table cell among
other things). If you are interested in things to look out for with Writer,
go back to the same bug referenced above and download the first attachment
#117159 for a tour of the issues you might face.

I hope this helps you get an idea of what you are stepping into.

Good Luck,

Frank