Monday, October 20, 2014

#Charkop - a Vidhan Sabha constituency

Data about politics and politicians regularly finds its way to Wikidata. When an item gets my attention, I often add all associated items to Wikidata as well. Charkop is a constituency in Maharashtra; according to an associated category, there are many more.

Given that the software I use is broken at this time, I can blog about one dilemma.

Charkop is a Vidhan Sabha constituency; it is part of the Mumbai North Lok Sabha constituency. The question is whether Charkop "is in the administrative territorial entity" of Mumbai North or of Maharashtra.

#Google - Let us #share in the sum of all #knowledge

Dear Google, in our own ways, we share the aspiration to share in the sum of all knowledge. We are really happy to share everything we have with you. Our licenses are designed to share widely.

Dear Google, could you please help us make sure that our Labs webservices survive your bots? We do not want your bots to stop running. What we want is for our webservers to serve our own needs first and use all the spare capacity for you. As it is, our software dies.

We really want you to have our data, and there are several other ways in which you can get all our data anyway. For this reason, please help us with our software so that we can continue to share the sum of all our available knowledge with you.

Sunday, October 19, 2014

#Wikidata - P1472, the #Commons #Creator #Template

The work of many artists is represented in Commons. Having great information available for all of them is a Herculean job. Having all that information and more available in all the languages that are supported by the Wikimedia Foundation is very much an aspiration. Once Commons is wikidatified, all information needs to be understood in all our languages.

France Prešeren is one of 13,481 people who currently have a Creator template and are known as such in Wikidata. All the data in those templates can be harvested and included in a Wikidata item. For all the templates NOT known in Wikidata, an item can be found or created to make them known in Wikidata as well.
A lot is already known about Mr Prešeren in Wikidata and much of that data can be expressed in multiple languages. The same can be said for the Creator template itself; as you can see, the template already shows its labels in multiple languages. With Wikidata we can show the information in all our languages as well.

Realising this will introduce the Commons community to Wikidata in a positive way and remove one obstacle that needs to be overcome during the wikidatification of Commons.

Saturday, October 18, 2014

Bringing #Wikidata to #Commons, one step at a time

There is this big project that is to bring structured data to the 23,422,581 media files that make up one of the biggest resources of freely usable media files.

It is to bring many different benefits to the users of Commons. To accomplish this, many steps have to be taken. Many of these steps can already be taken, and they will indicate why this project is done and what its benefits are.

Take for instance Mr Daniel Havell. He is an English engraver born in Reading. There is no Wikipedia article about him but there is information about him in Wikidata. It includes all the information that is in his "Creator" template and the category about him on Commons.

Having such information for all the "Creators" on Wikidata is easy and obvious. Having all those templates refer to Wikidata builds an anticipation of things to come. Next steps are making sure that the information looks good on Wikidata and is informative. Currently the best we can offer is to show the information in Reasonator.

Using tools like Reasonator for now establishes that the WMF and the Wikidata team appreciate all the efforts that promote the use of Wikidata and accept them as indicative of the type of information Wikidata will have to bring.

This can all be done today. No waiting is necessary and it makes data from Commons available in multiple languages. This is Mr Havell in Russian. Bringing the benefits of Wikidata to Commons today helps. It brings awareness to our public of the inherent benefits. It allows them to comment and get involved slowly but surely. It will prevent a "big bang" announcement of "this is it, take it or leave it". It will even bring more information in more languages to Commons sooner rather than later.

Sunday, October 12, 2014

#MediaWiki is about sharing the sum of all #knowledge

The organisational structure of the Wikimedia Foundation has been completed with the hiring of Mr Damon Sicore. In his first IRC #Wikimedia-Office chat the ugly head of Wikipedia centrism was found to be alive and well.

Mr Sicore made some important statements: "The most urgent issue seems to be software quality and shipping what we say we are going to ship, on time." and also "this urgency is compounded by the fact that we must be able to compete in mobile".

Wikidata is firmly part of us sharing in the sum of all knowledge and it is increasingly important at that. So far Wikidata was mostly about linking Wikipedia articles about the same subject. Increasingly, available data is used in info-boxes. Once the wikidatification of multimedia files happens, Wikidata needs to become editable from mobile phones and it needs to be easy and obvious in any and all languages.

Currently it is neither easy nor obvious in any language.

This is not to say that it is not possible to make it increasingly easy and obvious in all languages. It is important because it is a requirement if the wikidatification of multimedia files is to succeed. This is however only one use case where improved usability of Wikidata is essential for us to continue to share the sum of all the data we have available to us.

The extent to which Wikidata will make a difference is only one challenge for Mr Sicore. There are many more he faces. I wish him well because his success is our success.

#Wikidata - the maintenance of #awards

Mrs Kizer died. She won several awards. One of the awards she won was the Robert Frost medal, another award was the Theodore Roethke Memorial Poetry Prize. Two other awards, the John Masefield Memorial Award and the Borestone Award are not linked in the article yet.

The funny thing with awards is that they have a habit of being awarded regularly. This has several consequences:
  • you can predict how many winners there may have been
  • you can predict when the next winner is likely to be known
Given that many awards are not maintained as well as, for instance, the Nobel Prize or the Pulitzer Prize for Poetry, it should not be that hard to produce something that lists all the awards that have no winner yet for a given year. Wikidata already provides most of the main elements; these are all the awards, for instance, and it shows how many Wikipedias have an article for them.

By adding a statement about the frequency of the award it becomes possible to find the awards that have no recorded winner for a given year. It will stimulate adding awards, it can be the basis for a tool that shows lists of winners on Wikipedias and it would stimulate me to indicate that Mrs Kizer won the Pulitzer Prize for Poetry in 1985.
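As a rough illustration of how a frequency statement could drive such a list, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Award` class, its field names and the sample awards are hypothetical and do not correspond to actual Wikidata properties or items. It only shows the reasoning: when the frequency predicts a winner for a year and none is recorded, the award shows up as needing attention.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Award:
    # Hypothetical structure; these fields do not mirror real Wikidata properties.
    name: str
    first_year: int        # year the award was first given
    interval_years: int    # frequency: 1 = annual, 2 = biennial, ...
    winners: dict[int, str] = field(default_factory=dict)  # year -> winner

    def expected_in(self, year: int) -> bool:
        """True when the frequency predicts a winner for this year."""
        if year < self.first_year:
            return False
        return (year - self.first_year) % self.interval_years == 0


def awards_missing_winner(awards: list[Award], year: int) -> list[str]:
    """Names of awards that should have a winner in `year` but record none."""
    return [a.name for a in awards if a.expected_in(year) and year not in a.winners]


# Made-up sample data, for illustration only.
awards = [
    Award("Example Annual Prize", 1980, 1, {1984: "Winner A", 1985: "Winner B"}),
    Award("Example Biennial Medal", 1981, 2, {1983: "Winner C"}),
    Award("Example Triennial Award", 1980, 3),
]

print(awards_missing_winner(awards, 1985))
```

With this sample data, only the biennial medal is reported for 1985: the annual prize has a recorded winner and the triennial award is not due that year.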

Thursday, October 09, 2014

#Wikidata - #Statistics are a #data game

The Wikidata statistics are a marvel. They exist in their own little corner of the Wikiverse and rely on the dumps that are regularly produced. When everything is fine, a refresh is generated automatically. Some crazy people find them of interest and go over the numbers trying to understand what is happening. Every now and again, they are amazed or appalled.

Recently the dumps that are available in JSON changed their format in the midst of a dump. The resulting hodgepodge of data made the statistics unrealistic. Magnus was on a holiday. Yes, he has a real life, so it took a bit of time before he reasoned his way out of the mess.

It is wonderful that our community has people like Erik Zachte and Magnus Manske. They spend so much time and effort in providing us with meaningful statistics. It is important to remember that they rely on underlying data and it is their skills that ensure that the data remains comparable over time.

NB Currently 56.83% of the Wikidata items have 0, 1 or 2 statements. :)

Wednesday, October 08, 2014

#Wikidata - Does Mr Ulibarri live and if he does, then what?

According to the #Portuguese #Wikipedia, Mr Ulibarri died. The date of his demise was given as June 1, 2014. He was placed in a category of people who died, this was picked up by tools and consequently Mr Ulibarri was marked as dead in Wikidata.

According to some, unsourced facts should not be in Wikidata and a Wikipedia is not a source. It is part of a blame game; I was accused of entering wrong information.

I prefer to live by the motto that I am proud of the mistakes I make; they prove that I am productive. Realistically, Wikidata has hardly any sources when you remove all the Wikipedias from the equation. Errors will be included all the time by me and by countless others. There is no helping that.

For those Wikipedias that expect sources for all statements: tough. It won't happen any time soon. The best that can be expected is that comparisons are made. Differences will be found in that way and they can be fixed where needed. In the case of Mr Ulibarri it is suggested that it is a case of mistaken identity. A Mr Marinho Chagas died; he was also a soccer star. Mr Ulibarri's full name however is Mario Peres Ulibarri, and he is also known as Marinho Peres.

An anonymous user edited the Portuguese Wikipedia and made Mr Ulibarri live again. It was commented that there are no sources for his demise. I am happy for Mr Ulibarri that it turned out all right for him.

Sunday, October 05, 2014

#Wikipedia - Ümit Yaşar Toprak commander of al #Nusra and #NPOV

The "Neutral Point of View" is one of the guiding principles of Wikipedia. In science it is defined as:
the concept of a position formed without incorporating one's own prejudice
According to the article about him, Mr Toprak died in an air strike inside Syria. The problem with the article however is in several of the categorisations: 20th-century criminals, 21st-century criminals, War crimes committed by Islamist militant groups. They imply both that Mr Toprak was a criminal and that he was personally responsible for war crimes. The article does not support this in any way.

There is no need to appreciate Mr Toprak but the arguments to include him in such categories are obviously partisan. As these claims are not supported in the text, they make Wikipedia partisan as a consequence. This undermines Wikipedia's validity as a source for this conflict and it removes the legitimacy of NPOV claims in other domains as well.