Number of locale users?

Is there some way to see the number of users using your locale?
For example, how many users use LibreOffice in the HR locale?
It would be nice to know, for example:
100 users using LO in Croatia
Of those:
58 with the Croatian UI
22 with English
5 with Serbian
5 with Bosnian
etc.

What kind of data do you get from automatic update pings?

Best regards,
Mihovil

Hi :slight_smile:
That sort of data is notoriously inaccurate in open source.

It's not as though we can track the number of licenses sold or anything like
that. Such things are often considered intrusive! It might be good to
know, but on the other hand we respect people's privacy far more.

One person might download an update many times before it gets applied.
Others might download it once and then carry it to many machines (across a
network for example).

A classic example is that every desktop machine that runs on Linux gets
counted as a Windows machine, by MS. Very few begin as Macs and very very
very few were sold with Linux pre-installed by their OEMs.

The FSF have pushed for an annual day for people to get refunds on unused
Windows licenses. It does happen but most people can't be bothered. MS
make it difficult to reclaim and even if you manage it the amount of money
bears no resemblance to the cost of a new license to an average consumer.

Presumably something similar would happen with MS Office. MS Office is
supplied on new machines, and therefore the user bought MS Office, so they
count as an MS Office user even if they never use it.

It's neither fair nor rational, but it's just the way it is.
Regards from
Tom :slight_smile:

Firefox counts the number of times the local software pings the Mozilla servers for updates to the blacklist. I've found this a fairly reliable figure because I can tell (sadly) when the Gaelic college is on holiday. The number drops by about 20 users and it always coincides with the holidays.

So perhaps the thing we could count would be the pings an installed version sends to check if there is an update available. That way it doesn't matter how many times you re-use the installation file, as you're counting the installations which are pinging for updates. Dunno if that includes the locale data, but I'd be interested too, even if it isn't 100% accurate.

Michael

11/10/2014 17:11, Tom Davies wrote:

Exactly what I was talking about. :slight_smile:

11.10.2014 19:21, Michael Bauer wrote:

May I ask why you want to violate people’s privacy just to get a number?

2014.10.12 22:42, Adolfo Jayme Barrientos wrote:

May I ask why you want to violate people’s privacy just to get a number?

May I ask you where in your opinion the line lies between people's
privacy and plain statistics?

Rimas

I'm not sure what info Mozilla is collecting from update pings, but I don't see any private information there.
Country of origin, current version and UI/keyboard locale don't seem like private info to me.

I'm guessing the weekly update pings in LibreOffice are already sending version info now, and from the IP the ping is coming from you can tell which country it's coming from. The only thing missing is locale info.
Anyway, users can always disable the automatic update check, which will disable all this.

Why do you think that will violate user privacy?

Best regards,
Mihovil

P.S. Please quote the email you are referring to.

12.10.2014 21:42, Adolfo Jayme Barrientos wrote:

Apart from the fact we're not talking about private information (such as name, age, address, phone or even IP etc) but just 'locale(s) installed' and 'country', perhaps 'version', this is a useful metric to measure uptake. For example, until we got usage/locale data on Adaptxt (a predictive texting tool) and Firefox, we had not been aware that there is a massive discrepancy between people willing to use a tool FOR Gaelic as opposed to tools IN Gaelic. It also allows us to monitor roughly the success of different campaigns to increase uptake.

Michael

12/10/2014 20:42, Adolfo Jayme Barrientos wrote:

Hi all,

I'm not sure what info Mozilla is collecting from update pings, but I
don't see any private information there.
Country of origin, current version and UI/keyboard locale don't seem
like private info to me.

I'm guessing the weekly update pings in LibreOffice are already sending
version info now, and from the IP the ping is coming from you can tell
which country it's coming from. The only thing missing is locale info.
Anyway, users can always disable the automatic update check, which will
disable all this.

You should ask on the private marketing list, I know the marketing team
is following download stats, but I'm not sure they do that per language.

Cheers
Sophie

I think they are using RedMine now which requires one more unused account...
Next suggestion, implement one account for all TDF/LO services. :slight_smile:

Best regards,
Mihovil

13.10.2014 8:43, Sophie wrote:

I'm thinking we should file a feature request bug and see what happens?

Michael

13/10/2014 07:43, Sophie wrote:

Hi *,

Is there some way to see the number of users using your locale?

Nope, at least not yet/not for historic data.

What kind of data do you get from automatic update pings?

As the Windows build includes all languages, and Windows users are by
far the largest share of users (and most Linux users very likely
use their distro's version), there's no way to distinguish which
languages are actually used.

The info in an update ping is based on the UpdateUserAgent string
that's taken from versionrc/version.ini and looks like this, for
example:

LibreOffice 4.1.0.4

→ Version used (human readable)

(89ea49ddacd9aa532507cbf852f2bb22b1ace28;

→ the commit hash the build was based on

Windows; x86;

→ OS and architecture/variant

BundledLanguages=en-US af am ar as ast be bg bn bn-IN bo br brx bs ca
ca-XV cs cy da de dgo dz el en-GB en-ZA eo es et eu fa fi fr ga gd gl
gu he hi hr hu id is it ja ka kk km kn ko kok ks ku lb lo lt lv mai mk
ml mn mni mr my nb ne nl nn nr nso oc om or pa-IN pl pt pt-BR ro ru rw
sa-IN sat sd sh si sid sk sl sq sr ss st sv sw-TZ ta te tg th tn tr ts
tt ug uk uz ve vi xh zh-CN zh-TW zu)"

→ List of languages, in the case of Windows build → all (doesn't
distinguish between actually installed languages/languages in the
installer)

The BundledLanguages parameter is actually pretty useless in this regard.
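
To make the field breakdown above concrete, here is a minimal sketch of splitting such a string into its parts. The function name and the regular expression are illustrative assumptions derived only from the example shown in this thread; this is not how the update server actually handles the string.

```python
import re

# Layout assumed from the example in this thread:
#   <product> <version> (<commit hash>; <OS>; <arch>; BundledLanguages=<tags>)
UA_PATTERN = re.compile(
    r"^(?P<product>\S+)\s+(?P<version>\S+)\s+"
    r"\((?P<commit>[0-9a-f]+);\s*(?P<os>[^;]+);\s*(?P<arch>[^;]+);\s*"
    r"BundledLanguages=(?P<langs>[^)]*)\)"
)


def parse_update_user_agent(ua):
    """Hypothetical helper: split an UpdateUserAgent string into its fields.

    Returns a dict, or None if the string does not follow the assumed layout.
    """
    match = UA_PATTERN.match(ua.strip())
    if match is None:
        return None
    return {
        "product": match.group("product"),
        "version": match.group("version"),
        "commit": match.group("commit"),
        "os": match.group("os").strip(),
        "arch": match.group("arch").strip(),
        # Languages shipped in the installer, not the ones actually installed.
        "bundled_languages": match.group("langs").split(),
    }
```

For a Windows build the language list comes back the same for every user, which is why it says nothing about the UI language in use.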

However, the request is also sent with a HTTP_ACCEPT_LANGUAGE header
that corresponds to the UI language, but that's not logged anywhere,
so we cannot create data from that right now (it would be possible to
add in future, though):
http://opengrok.libreoffice.org/xref/core/extensions/source/update/feed/updatefeed.cxx#359

The only approximation we have in regards to what languages are used
are the download numbers for the helppacks on Windows and/or the
languagepacks on Mac/Linux - but helppacks are often not downloaded at
all (either because they are not available for a language, or because
people just use wikihelp instead), and Mac/Linux have too small a market
share compared to Windows to be a trustworthy indicator.

So the only thing we can do right now is use geolocation of the
requesting IP to see where in the world LO is used (and has update-check
enabled) - but not in what language the UI is used.

So using the update-check pings, we could evaluate the Accept-Language
header to create ratios (but of course no absolute numbers, as there
are no unique IDs, one client pings multiple times, and the
polling frequency is also configurable), which would probably satisfy
the request.

I.e. in future it would be possible to create reports for
* geolocation
* UI-locale from request-header

but not
* list of actually installed locales
For that, the BundledLanguages parameter would need to change or
an additional one would have to be introduced.
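
As a rough illustration of the two reports that would be possible, the sketch below tallies the preferred UI language per country and turns the tallies into ratios. It assumes the pings have already been reduced to (country code, Accept-Language value) pairs, with the country coming from IP geolocation done elsewhere; the function names and that input shape are hypothetical, not an existing part of the update infrastructure.

```python
from collections import Counter


def primary_language(accept_language):
    """Return the highest-q language tag of an Accept-Language value, or None."""
    best_tag, best_q = None, -1.0
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        tag = tag.strip().lower()
        params = params.strip()
        q = 1.0  # a tag without an explicit q-value counts as q=1
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        if tag and tag != "*" and q > best_q:
            best_tag, best_q = tag, q
    return best_tag


def locale_ratios(pings):
    """pings: iterable of (country_code, accept_language) pairs.

    Returns {country: {language: share of that country's pings}}.
    These are ratios only; without unique IDs they are not user counts.
    """
    counts = {}
    for country, header in pings:
        lang = primary_language(header)
        if lang is None:
            continue
        counts.setdefault(country, Counter())[lang] += 1
    return {
        country: {lang: n / sum(c.values()) for lang, n in c.items()}
        for country, c in counts.items()
    }
```

Called on a log sample, `locale_ratios` would give each locale team the share of pings per UI language in their country, which is the kind of breakdown asked for at the start of the thread.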

ciao
Christian

Hi Christian,

If I understood you correctly, you already have all the data, you just haven't set up the infrastructure to collect it (IP geolocation and UI locale from the header).
If/when you decide to collect that data, I don't think the real number of users can deviate much from it.
The default setup is weekly pings. My guess is that very few people change that; it cannot be more than 1-2%.
If you calculate your data on a weekly (Monday to Sunday) basis, then after a few months of collecting data you should know your user count to within a few percent of statistical error.

The only big difference in numbers could come from large deployments behind a firewall (government?), which will block all update ping requests.
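
The weekly calculation described above could be sketched roughly as follows. It assumes each logged ping carries a date and that every installation pings exactly once per week, which is the default setup but not guaranteed (a client may ping more than once, and the interval is configurable); the function names are made up for illustration.

```python
from collections import Counter
from datetime import date


def weekly_ping_counts(ping_dates):
    """Count pings per ISO (year, week); ISO weeks run Monday to Sunday."""
    counts = Counter()
    for d in ping_dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return counts


def estimate_installations(ping_dates):
    """Average pings per week.

    Only a rough estimate of installations, valid under the assumption
    that every installation pings exactly once a week.
    """
    counts = weekly_ping_counts(ping_dates)
    if not counts:
        return 0.0
    return sum(counts.values()) / len(counts)
```

Averaging over several months smooths out holidays and one-off spikes, but installations that cannot reach the update server are still not counted.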

Anyway, do I need to suggest this somewhere or open a bug?

Best regards,
Mihovil

13.10.2014 14:34, Christian Lohmaier wrote:

Hi Mihovil,

I think they are using RedMine now which requires one more unused
account...
Next suggestion, implement one account for all TDF/LO services. :slight_smile:

It's already on the to-do list, and work has already begun.

Regards,

Dennis Roczek

Hi all,
Wouldn't the most obvious thing be to add a Piwik server to our web
infrastructure and start picking up some detailed statistics about both
downloads but also user behaviour in general?

Cheers,
Leif Lodahl

We have decided to avoid releasing download figures, and use update
figures instead (which need some number crunching by developers: the last
figure was 80 million unique IPs pinging for updates, but it goes back
to late spring). Download figures are not representative of the level of
adoption, and in addition are extremely easy to tweak.

SourceForge, for instance, is advertising based, which makes it tempting
to raise download numbers in order to ask for higher rates. The fact
that most projects hosted on SourceForge sport above-average download
figures - for what it's worth - raises a question mark.

Back at OOo, download figures proved to be wildly inaccurate.

Hi :slight_smile:
80 million sounds pretty good. I tend to try to stop most machines from
updating unattended for most things, and anyway most probably appear to come
from a single IP address. Also I might download an update in one place and
then use it on many different machines in several different places, as not
all have internet access.

I suspect it's similar for many other people too. So I suspect the 80
million figure is a nice safe low estimate. That makes it easier to quote
without qualms :))

I also think it safely avoids counting people who really want to remain
anonymous and probably avoids causing any undue issues wrt privacy for the
rest of us too.
Regards from
Tom :slight_smile:

I know of at least one installation with more than 200 desktops using LibreOffice and connecting to the internet through the same IP.
Unique IP counting isn't flawless, but it's better than downloads.
Please consider counting the UI locale header of each update request and making that data available to the locale teams.

Best regards,
Mihovil

14.10.2014 15:01, Italo Vignoli wrote:

Hi Leif, *,

Hi all,
Wouldn't the most obvious thing be to add a Piwik server to our web
infrastructure

We already have Piwik running, so...

and start picking up some detailed statistics about both
downloads but also user behaviour in general?

Download counts are hard to map to the actual user base (but for
downloads we have other parsing in place).

From my point of view that would be the most simple solution.

For getting a value of what UI languages are actually used: yes,
that's more or less the only way that's possible at all without
reinventing the wheel or changing the update-query code in LO itself
(which would miss all older versions).

And hooking it up to Piwik is what I meant by "evaluate the
Accept-Language header".

ciao
Christian