Problem removing duplicate strings in Calc

I have a column of terms that I pasted from two text files. I have tried to
remove duplicate terms from the list, but some of the duplicates are still
not removed. One example is the term "Adrenal hyperplasia", whose duplicate
is "adrenal hyperplasia". I used the standard filter with the condition
"Col A = Not Empty" and checked the option for "remove duplicates". I left
the "Case sensitive" option unchecked, assuming that would allow 'Adrenal' to
be the same as 'adrenal'. However, that appears not to be working and I
don't understand why. If anyone could help me understand what I am doing
incorrectly, I would appreciate it.

Both lists were taken from simple text files and pasted into Calc. I
combined them into one long column and then tried to remove the duplicates
from it.

Thank you in advance.
Leon

Hi Leon

leon244 wrote

However, that appears not to be working and I don't understand why. If
anyone could help me understand what I am doing incorrectly, I would
appreciate it.

The good news is that you aren't doing anything incorrectly. The bad news
is that you have just found a bug in the Advanced filter. I did a quick test,
and previous versions of LibreOffice and OpenOffice have the same (or
worse!) problems. Your only option to solve the problem is to use Excel or
maybe some other spreadsheet. Unfortunately, Gnumeric doesn't have a
case-sensitive option.

Could you please report your finding at
https://bugs.freedesktop.org/enter_bug.cgi?product=LibreOffice

Cheers,
Pedro

This does not address your actual query, but a workaround might be to create a new column with =LOWER(An), =UPPER(An), or =PROPER(An). This will create copies of your text items where the examples you give would become identical. You could then filter based on that column instead of the originals. The extra column could be hidden or deleted later if you prefer. Indeed, you might be happiest using one of these forms of the converted data in place of the originals.
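For anyone who would rather do the same normalisation outside Calc, here is a minimal Python sketch of the idea, not a description of how Calc works internally. The function name `dedupe_lowercased` and the sample terms are illustrative assumptions. It lower-cases each term, as =LOWER(An) would, skips empty entries, and keeps only the first copy of each result.

```python
def dedupe_lowercased(terms):
    """Lower-case every term, then drop repeats while keeping first-seen order."""
    seen = set()
    result = []
    for term in terms:
        key = term.lower()  # same role as the =LOWER(An) helper column
        if key and key not in seen:  # skip empty entries and repeats
            seen.add(key)
            result.append(key)
    return result


if __name__ == "__main__":
    # Hypothetical sample data, modelled on the example in this thread.
    sample = ["Adrenal hyperplasia", "adrenal hyperplasia", "Addison disease"]
    print(dedupe_lowercased(sample))
```

As with the helper column, the output is the converted text rather than the original spelling.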

I trust this helps.

Brian

Brian,
Thank you, I will try this. I think it is the answer.
Leon

Pedro,
Thank you. I have reported the bug as you suggested.
Leon

Just a follow-up. This worked with one extra step. After creating the
column with the function LOWER(cell content), the resulting column was all
formulas and would not filter. After I copied it over as text, it filtered
correctly. (There is probably an easier way to do this, but it was quick and
simple.) The extra columns are easily discarded or hidden.
Thanks for the workaround. It is actually quite simple. I should have thought
of it.
Ciao
Leon

Hi Leon

leon244 wrote

Thank you. I have reported the bug as you suggested.

Thank *you*. Could you please post the number of the bug on this thread?

Apparently this bug is already known, so the new report you filed is a
duplicate of:
https://bugs.freedesktop.org/show_bug.cgi?id=70925

BTW, I'm glad Brian's workaround solved your problem. In fact, changing the
case to your liking is better than using the filter, which will simply pick
the first occurrence regardless of the case. Alternatively (when the filter
works again), you can fix the case only for the shorter output list :)
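
To illustrate what "pick the first occurrence regardless of the case" means in practice, here is a small Python sketch. It is only a model of the behaviour described above, not Calc's actual implementation, and the name `dedupe_keep_first` and the sample terms are hypothetical.

```python
def dedupe_keep_first(terms):
    """Case-insensitive de-duplication that keeps the first spelling seen."""
    seen = set()
    kept = []
    for term in terms:
        key = term.casefold()  # compare without regard to case
        if key not in seen:
            seen.add(key)
            kept.append(term)  # the original spelling is preserved
    return kept


if __name__ == "__main__":
    # Hypothetical sample data: mixed capitalisation, in arbitrary order.
    sample = [
        "Adrenal hyperplasia",
        "adrenal hyperplasia",
        "addison disease",
        "Addison disease",
    ]
    print(dedupe_keep_first(sample))
```

The result mixes capitalised and lower-case entries, depending on which spelling came first, which is why the case may still need fixing on the shorter list afterwards.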

Cheers,
Pedro