Archive for January, 2010

Spirit

Sunday, January 31st, 2010

(Click on the image to see it at full width.)

Spirit

On January 26, the 2274th Martian day of its mission, NASA declared Spirit a „stationary research station“, expected to stay operational for several more months until the buildup of dust on its solar panels causes a final shutdown.

(Taken from http://xkcd.com/695/ under the terms of the Creative Commons Attribution-Noncommercial 2.5 Unported license.)

Goran Obradović helped me with the translation; it was not translated on the comic translation wiki because the wiki does not yet support everything that is needed.

The Idyll of the Ottoman Empire

Saturday, January 30th, 2010

The Ottoman Empire was home to many peoples and faiths, a cultural mosaic that was shattered by nationalism and war in the 20th century.[1]

And what is one supposed to say to this? Are they serious? Do they expect to get away with it? Can they come up with anything more stupid? Can they sink any lower? Have they no shame?

Almost 50% of the Web Is in Unicode

Friday, January 29th, 2010

Google recently announced on its official blog that Unicode is approaching a 50% share of the web. There is also a chart showing how steep its growth has been:

Image: Google

There have been comments that the „culprit“ is the default configuration of servers, and that a large part of the content is in fact still ASCII. To my mind even that is good: whenever some American needs é or something similar, they will use Unicode (and not ISO-8859-1). I am also glad to see that people have the sense to make Unicode the default setting :) The dominance of Unicode will also mean that more tools for working with Unicode will appear… Good news, for every reason :)
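The ASCII point can be made concrete with a minimal Python sketch. The sample strings are made up for illustration and have nothing to do with Google's data. It shows that pure ASCII text has exactly the same bytes in UTF-8, which is why pages served as Unicode by default may still be plain ASCII underneath, while é is one byte in ISO-8859-1 and two bytes in UTF-8.

```python
# Sample strings are illustrative only, not taken from Google's data.
plain = "resume"
accented = "résumé"

# Pure ASCII text is byte-for-byte identical in ASCII, ISO-8859-1 and UTF-8.
same_bytes = (
    plain.encode("ascii") == plain.encode("iso-8859-1") == plain.encode("utf-8")
)

# é is a single byte (0xE9) in ISO-8859-1 but two bytes (0xC3 0xA9) in UTF-8.
latin1_bytes = accented.encode("iso-8859-1")
utf8_bytes = accented.encode("utf-8")

print(same_bytes)         # True
print(len(latin1_bytes))  # 6
print(len(utf8_bytes))    # 8
```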

Another interesting thing that can be seen in the chart is that „exotic“ encodings (which include the Cyrillic ones) are declining more slowly than ASCII or ISO-8859-1. I assume this is because their decline due to Unicode coincides with growth in usage.

(Here I wanted to write „иначим растом коришћености“, roughly „the growth in usage that would happen otherwise“. Is there any suitable word at all that could be used this way?)

Red Spiders

Thursday, January 28th, 2010
Red Spiders

Those are six-legged spiders.

(Taken from http://xkcd.com/8/ under the terms of the Creative Commons Attribution-Noncommercial 2.5 Unported license.)

Webcomic Translation wiki grows!

Saturday, January 23rd, 2010
Ny bil

(Taken from http://xkcd.com/570/ under the conditions of Attribution-Noncommercial 2.5 Generic license.)

Translation: Jon Harald Søby on the WebcomicTranslation wiki.

New Car

Thursday, January 21st, 2010
New Car

Some company really has found a way to enlarge penises, but has no way of reaching potential customers.

(Taken from http://xkcd.com/570/ under the terms of the Creative Commons Attribution-Noncommercial 2.5 Unported license.)

Translation: Goran Obradović on the comic translation wiki.

Wikipedia Page Views By Country – Breakdown with Wikipedia Size and Quality

Sunday, January 17th, 2010

This post tries to assess how the readership of Wikipedias in various languages depends on Wikipedia size or quality.

Two days ago, I analysed Erik Zachte’s statistics about Wikipedia page views per country by replacing page views per person with page views per Internet user. In the meantime, Erik did the same in his original report, so in future it’s better to refer to it for that.

Now I am taking a look at another of Erik’s statistics, namely Wikipedia Page Views Per Country – Breakdown, and adding to the data each Wikipedia’s number of articles, its depth, and its score by sample of articles.

The results are at Wikipedia Page Views By Country – Breakdown with Wikipedia Size and Quality (HTML), but before you see them, you will want to know what they are and how they are interpreted. I will explain them using the example of the United States, with expanded column names:

United States (31.1% share of global total)

Expanded column names: Wikipedia, Percentage of views, Number of articles, Score by number of articles, Depth, Score by depth, Sample score, Score by sample score

Wikipedia   | %V    | №A        | S/№A  | Depth | S/Depth | Sample | S/Sample
English Wp  | 89.0% | 3,159,745 | 0.028 | 481   | 0.19    | 84.36  | 1.06
Japanese Wp | 1.6%  | 645,920   | 0.002 | 46    | 0.03    | 42.64  | 0.04
Spanish Wp  | 1.1%  | 551,472   | 0.002 | 142   | 0.01    | 60.84  | 0.02

Here,

  • Wikipedia: the name of the Wikipedia;
  • Percentage of views: the percentage of views from Erik’s original data;
  • Number of articles: the number of articles on the Wikipedia (from the List of Wikipedias), a statistic I believe everyone is familiar with :) ;
  • Score by number of articles: the percentage of views divided by the number of articles (itself divided by 1,000), which gets a better explanation below;
  • Depth: the Depth from the List of Wikipedias, explained in detail on that page;
  • Score by depth: the percentage of views divided by the depth;
  • Sample score: the Score from the List of Wikipedias by sample of articles, explained in detail on that page; and,
  • Score by sample score: confusingly named, the percentage of views divided by the sample score.
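The three scores defined in the list above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the code used for the report; the function name `scores` is made up for this example.

```python
def scores(view_pct, articles, depth, sample_score):
    """Return (S/№A, S/Depth, S/Sample) for one Wikipedia in one country."""
    by_articles = view_pct / (articles / 1000)  # view percentage per 1,000 articles
    by_depth = view_pct / depth
    by_sample = view_pct / sample_score
    return by_articles, by_depth, by_sample


# English Wikipedia as read from the United States (figures from the table above).
s_articles, s_depth, s_sample = scores(89.0, 3_159_745, 481, 84.36)
print(round(s_articles, 3), round(s_depth, 2), round(s_sample, 2))  # 0.028 0.19 1.06
```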

What exactly is this score I am talking about? It’s a measure of the popularity of a Wikipedia relative to one of its parameters. For example, if two Wikipedias were read equally in some country, but one of them were half the size of the other, it would have twice the score (because readers go to it despite the fact that they are half as likely to find what they need; and so, in a way, they like it twice as much). I hope this makes some sense to the reader; if not, I hope it will make sense by the end of the post.

Scores are a quick way for readers to estimate the „actual popularity“ of a Wikipedia regardless of its depth or breadth, as opposed to its „measured popularity“, measured by the number of visits (which is influenced by depth and breadth). Note that scores shouldn’t be compared between countries.

Also note that scores by number of articles seem to roughly agree with scores by depth; but scores by the sample score seem not to. Not sure how to interpret this.

Now, in the example of the United States, scores by number of articles are 0.028 for English, 0.002 for Japanese and 0.002 for Spanish Wikipedia. What this means is that Japanese Wikipedia is as liked as Spanish, and is more visited only because it has more articles; if Spanish Wikipedia were an exact translation of Japanese Wikipedia into Spanish, it would be visited roughly as much as Japanese Wikipedia is now. However, Spanish Wikipedia has a score 14 times ( 0.028 / 0.002 ) smaller than English Wikipedia; if it were an exact translation of English Wikipedia into Spanish, it still wouldn’t be read equally, but 14 times less (right now it is being read 81 times less).

My mind has just been blown by the discovery that this matches the ratio of people in the United States who speak only English at home to people who speak Spanish and ‘Speak English less than „very well“‘[1], as the US Census Bureau would put it ( 227,365,509 / 16,156,584 ≈ 14 ). In other words: in the US, the English Wikipedia is read as much more than the Spanish Wikipedia as would be expected from the numbers of English-only and Spanish-only speakers in the US and the size difference between the two Wikipedias. It remains to be seen whether this holds for other cases.
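For anyone who wants to check the arithmetic, here is a small Python sketch using only the figures quoted in this post (the variable names are made up for this example):

```python
# Figures quoted in the post.
english_only_speakers = 227_365_509
spanish_limited_english = 16_156_584

score_ratio = 0.028 / 0.002   # English vs Spanish score by number of articles
views_ratio = 89.0 / 1.1      # English vs Spanish share of US page views
census_ratio = english_only_speakers / spanish_limited_english

print(round(score_ratio), round(views_ratio), round(census_ratio, 2))  # 14 81 14.07
```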

In order to further explain what all these numbers mean, I will use a better example, that of the Faroe Islands:

Wikipedia  | %V    | №A        | S/№A
English Wp | 74.8% | 3,159,745 | 0.024
Danish Wp  | 11.0% | 122,013   | 0.09
Faroese Wp | 2.8%  | 3,872     | 0.723

Assumption: if Danish Wikipedia had the same number of articles as English Wikipedia, its readership would rise proportionally. Question: how much exactly would it then be?

The percentages are recalculated thus:

P = 3,159,745 / 122,013 ≈ 26 (Danish Wikipedia would be read 26 times as much.)
X = P * 11.0 ≈ 285 (We don’t know its actual readership, but we assume it rises 26 times in proportion, hence we have 285 „percent“.)
Xp = ( X + ( 100 - 11.0 ) ) / 100 ≈ 3.7 (We add the remaining 89.0% and rescale to 100%, i.e. each „new percent“ is worth 3.7 „old percent“.)
E = E0 / Xp = 19.5 (E0 is the old percentage of English Wikipedia; F0 below is the old percentage of Faroese Wikipedia.)
D = X / Xp = 76.1
F = F0 / Xp = 0.7

I was not 100% sure of this math, but Goran Obradovic verified it to be correct. I would have felt much more comfortable if I had actual readership numbers instead of percentages, though :) The results are:

Wikipedia  | %V    | №A        | S/№A
English Wp | 19.5% | 3,159,745 | 0.006
Danish Wp  | 76.1% | 3,159,745 | 0.024
Faroese Wp | 0.7%  | 3,872     | 0.18

Note that the scores remained in proportion: 0.723 / 0.09 ≈ 0.18 / 0.024 and 0.09 / 0.024 ≈ 0.024 / 0.006

Meanwhile, if Faroese Wikipedia had the same number of articles as Danish Wikipedia, the results would be:

Wikipedia  | %V    | №A        | S/№A
English Wp | 40.3% | 3,159,745 | 0.013
Danish Wp  | 5.9%  | 122,013   | 0.048
Faroese Wp | 47.6% | 122,013   | 0.39

And if it had the same number of articles as English Wikipedia, the results would be:

Wikipedia  | %V    | №A        | S/№A
English Wp | 3.1%  | 3,159,745 | 0.001
Danish Wp  | 0.5%  | 122,013   | 0.004
Faroese Wp | 95.9% | 3,159,745 | 0.03

Since percentages this high are usual in countries with large national Wikipedias, I pronounce the results correct :D
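The recalculation behind the Faroese scenarios above can be sketched in Python as follows. This is a minimal illustration of the P, X and Xp steps, not the source code used for the post; the function name `rescale_shares` and its parameters are made up for this example, and it assumes that every Wikipedia other than the rescaled one keeps its absolute readership.

```python
def rescale_shares(shares, target, factor):
    """Recalculate view percentages if `target` were read `factor` times as much.

    `shares` maps Wikipedia name -> percentage of views in one country.
    Assumption: everything outside `target` (including Wikipedias that are
    not listed) keeps its absolute readership.
    """
    x = shares[target] * factor               # X: inflated share in "old percent"
    xp = (x + (100 - shares[target])) / 100   # Xp: old percent per new percent
    return {name: (x if name == target else pct) / xp
            for name, pct in shares.items()}


faroe = {"English": 74.8, "Danish": 11.0, "Faroese": 2.8}

# Faroese Wikipedia grown to the size of Danish Wikipedia (P = 122,013 / 3,872).
result = rescale_shares(faroe, "Faroese", 122_013 / 3_872)
print({name: round(pct, 1) for name, pct in result.items()})
# {'English': 40.3, 'Danish': 5.9, 'Faroese': 47.6}
```

Passing `3_159_745 / 3_872` as the factor instead reproduces the second Faroese scenario (3.1%, 0.5% and 95.9%).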

Finally: what if both Faroese and Danish Wikipedias had the same number of articles as English Wikipedia?

Wikipedia  | %V    | №A        | S/№A
English Wp | 2.8%  | 3,159,745 | 0.001
Danish Wp  | 11.5% | 3,159,745 | 0.003
Faroese Wp | 85.3% | 3,159,745 | 0.027

It appears that Statistics Faroe Islands does not have any data about languages spoken in the Faroe Islands. If anyone could find such data, it could serve to confirm (or refute) these results.

By the way, you don’t have to compare Wikipedias only with each other. You can also ask a question like: if Faroese Wikipedia were twice as large as it is now, how many readers would it have?

In case someone else would like to toy further with these data, here are the source code and source data I used to produce them (as well as those of the previous post).

Wikipedia Page Views Per Country with Internet users

Friday, January 15th, 2010

Recently, Erik Zachte released statistics of Wikipedia page views per country & language. Excellently made and much needed, they tell us how Wikipedias should be promoted at the local level.

However, the statistics in Wikipedia Page Views Per Country – Overview show us page views per person, while I believe a much more useful statistic is page views per Internet user. After all, Wikimedia could do little to increase global Internet usage. And so, as I promised, I merged the statistics with data from the List of countries by number of Internet users. Here they are:

These statistics paint quite a different picture. There are some anomalies that point to bad geolocation or bad data on number of Internet users; however, it’s clear that there are hundreds of millions of people who could benefit from Wikipedia if only it were more popular in their countries. My conclusions:

  • Iran and China are very low, but there’s not much we can do about it.
  • There are a number of countries (African and Arab countries, India, Indonesia…) with tens of millions of Internet users and relatively few Wikipedia readers. I assume it’s a case of low foreign-language knowledge combined with little material on local Wikipedias. Some bootstrapping might help here, such as digitization of already existing encyclopedias (in the public domain or by outright buying copyrights), or bot creation of relevant articles.
  • Some former Soviet republics are very low, and it’s difficult to see why, as a number of people in them know Russian, and Russian Wikipedia is fairly large.
  • There are also hard-to-explain differences between countries that should be similar; for example, Germany has almost twice as many views as France.
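The difference between page views per person and page views per Internet user can be sketched in Python. All countries and numbers below are made-up placeholders, not real statistics, and the helper name `per_capita` is hypothetical; the sketch only shows how the ranking can flip once the denominator changes.

```python
# All numbers below are made-up placeholders, NOT real statistics.
page_views = {"Country A": 1_200_000, "Country B": 300_000}
population = {"Country A": 10_000_000, "Country B": 10_000_000}
internet_users = {"Country A": 8_000_000, "Country B": 1_000_000}


def per_capita(views, people):
    """Divide views by head count for every country present in both tables."""
    return {c: views[c] / people[c]
            for c in views if c in people and people[c] > 0}


views_per_person = per_capita(page_views, population)
views_per_user = per_capita(page_views, internet_users)

for country in page_views:
    print(country, views_per_person[country], views_per_user[country])
```

With these placeholder figures Country B looks weaker per person but stronger per Internet user, which is the kind of shift the merged statistics are meant to reveal.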

TODO: Install the Language Switcher plugin.

Perspective

Thursday, January 14th, 2010
Perspective

I wonder what I was dreaming about that brought that thought to me. I hope it wasn’t Richard Stallman again.

(Taken from http://xkcd.com/198/ under the terms of the Creative Commons Attribution-Noncommercial 2.5 Unported license.)

Translation: Goran Obradović on the comic translation wiki.

Extrapolation

Thursday, January 7th, 2010
Extrapolation

By the third trimester you will be carrying hundreds of babies.

(Taken from http://xkcd.com/605/ under the terms of the Creative Commons Attribution-Noncommercial 2.5 Unported license.)

Translation: Goran Obradović.

Having fallen victim to a successful conspiracy, I was made to build a system for translating comics. On it, anyone who wishes can freely correct this translation to their heart’s content, or translate new comics themselves. Criticism of the translation posted here will not serve much purpose.