Thursday, September 29, 2016

Trust

I read an article and found what was written astounding; it signalled that I had to read it again to really understand what is said and what it implies. The article was published in a quality newspaper, the Independent. The reply that I got was: "Indeed. And it's Fisk, so you can't just pretend it is an obscure journalist talking about something that may have happened..."

As I did not know Robert Fisk, I looked him up. I checked his Wikipedia article and found that he indeed has a really good reputation. He received many more awards than were known at Wikidata, so I added several, and it is fun to establish the quality of their sources. For the Lannan Cultural Freedom Prize the Lannan website says it all. It is linked on the item for the award and that should suffice. For the Amnesty International UK Media Award it is not so obvious. It is conferred by the UK branch of Amnesty International and there is no dedicated page for the award. I added the award and the chapter, and had a look at the pages for the award ceremony for each year. These Wikipedia articles refer to webpages that no longer exist.

For the Lannan Cultural Freedom Prize I added the other recipients because it gives some insight into the relevance of the award. I did not do this for the Martha Gellhorn Prize for Journalism.

The point of all this is that reputation amounts to trust in the message that is written. Read the article; it is likely that you are not familiar with the Wahhabi belief, a subset of Sunni Islam that is practiced in Saudi Arabia. The article is about 200 Sunni scholars who denounce the Wahhabi belief. Several major scholars are involved. Have a read and have a think: the article is by a major journalist, published in a major newspaper, about something that is not without consequences.
Thanks,
       GerardM

Thursday, September 08, 2016

#Wikimedia - the need for #scepticism

It is all over the news: another psychology study debunked. With two thirds of the repeated studies being debunked, a lot in the literature of psychology is no longer valid. The source for the article I read is Eric-Jan Wagenmakers, professor at the University of Amsterdam.

The NWO, the Netherlands Organisation for Scientific Research, is providing 3 million euro to repeat key research. The problem is that science is in love with what is new and with quick results. Three million is at best a start.

When science cannot be relied on, collaboration with scientists and universities easily becomes controversial. The programs taught inherently have a point of view and often a conflict of interest is easily established. Consider: when doctors prescribe substances that are FDA approved, it seems obvious that these substances have a positive effect on patients. Then consider that we have a Wikipedian in Residence at Cochrane, an organisation that makes its reputation from debunking much of the use of such substances. We provide end user information and it seems obvious that just repeating the list of FDA approved substances without further information is not at all in our users' best interest. It is even likely that we are liable for misinformation in several jurisdictions.

There is a need to be sceptical about sources. It is important that we not only improve the technology behind our sources; we also need an ability to mark information as debunked and have that information filter through our projects and into the information we provide. Remember, debunked is not a POV; it comes with sources of its own.
Thanks,
       GerardM

Sunday, September 04, 2016

#Diversity - A Women's Hall of Fame

Wikipedia has a category of some 40 Women's Halls of Fame. They honour women from the past and the present who are seen as exemplary. For all the women who have an English article there is now a statement indicating that they are seen as such.

For many women who are on these lists there is no article. Obviously when the objective is to have quality articles on notable women, it is good when there are lists with articles that could be written.

There are such lists and the best thing is that there is some form of automated maintenance. The Women in Red project has such lists. Many of their lists find their basis in Wikidata and it is therefore possible to add people to their lists by adding key data.

All the women who have articles are now known as such. The next thing is to add the missing articles, the red links. So far I have added items for them one by one and stated what they are known for. Obviously this is a stub. More information is needed to state what they are known for, where they lived, why they are notable. It is not only how you enrich the data, it is also how you increase diversity.
Thanks,
      GerardM

#Wikidata - the conflict of interest in medical information

According to the Clinical Evidence handbook only 12% of the 2500 substances and treatments most prescribed by doctors are not proven effective. There is a massive conflict of interest when unsubstantiated facts are allowed in Wikidata. Arguments like "it is NPOV" are used to defend the practice, or "it is harmful for patients" when they can find out that a substance is no better than a placebo but does have negative side effects.

When an external source knows about a substance, it is fine to link to that source. This is not the same as importing the data wholesale, particularly when the data is so obviously categorically problematic.

The Wikimedia Foundation has a responsibility and it is not in indicating what substances are prescribed. When we are to include information, it is not on the basis that it has been approved for use but on the basis that it is actually proven to be beneficial. An error rate of 12% on such vital information is not acceptable.
Thanks,
      GerardM

Sunday, August 28, 2016

#Wikidata - La Galería de las Mujeres de Costa Rica

#Marketing is something the #Wikimedia Foundation does not do. It does not mean that concepts like KPI are foreign to the WMF. Take this list from the English article "La Galería de las Mujeres de Costa Rica"; the women listed are "women who have broken gender stereotypes and advanced human rights principals".

A lot of effort goes into fighting for a diverse Wikipedia where women are given proper attention. If I were a marketing man, I would say that lists like this provide pointers to people who want to help. I would be happy with a list that shows all the current people with an article and I would be ecstatic when I had a list that would show all the missing articles and that would auto update.

The funny thing is that technically it is not that hard to produce. It is not even that hard to include the technology in MediaWiki, but it takes a marketing man to drive the point home that you have to engage people and that it shows the quality of a Wikipedia project when we know where we are lacking and where we should concentrate.
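
To illustrate how little is needed, here is a minimal Python sketch of such a list. It assumes the people on a list are already available as simple records; the `Inductee` type, its fields and the function names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Inductee:
    """One person on a hall-of-fame style list (hypothetical record type)."""
    name: str
    article_title: Optional[str] = None  # None means no article yet: a red link


def split_by_article(people: List[Inductee]) -> Tuple[List[Inductee], List[Inductee]]:
    """Return (with_article, missing), each sorted by name."""
    with_article = sorted((p for p in people if p.article_title), key=lambda p: p.name)
    missing = sorted((p for p in people if not p.article_title), key=lambda p: p.name)
    return with_article, missing


def coverage(people: List[Inductee]) -> float:
    """Share of listed people who already have an article, from 0.0 to 1.0."""
    if not people:
        return 0.0
    with_article, _ = split_by_article(people)
    return len(with_article) / len(people)
```

Rerunning this whenever the underlying data changes is what would make the list "auto update"; the coverage figure is the kind of number a marketing man could use as a KPI.
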
Thanks,
     GerardM

Tuesday, August 23, 2016

#Wikidata - Colorado Women's Hall of Fame

There is a continuous effort underway in #Wikipedia to celebrate notable women. When women are seen as role models, it is obvious that they deserve attention.

The Colorado Women's Hall of Fame is an organisation that celebrates women and every year 10 more women are included. The article on the organisation includes a list and it includes many red links. So more can be done, not only in Wikipedia but also in Wikidata.

As Wikidata is maturing, SPARQL is now of sufficient quality that many of the tools developed by Magnus are transitioning to SPARQL. This takes time and in the meantime some tools are discontinued or do not fully function any more. Linked Items is one such tool. It creates a list of the items that are found in a Wikipedia text. It is ideal when a text based file full of wiki links exists. It is just a matter of copying in the links and it will generate a list with Wikidata items for you. It is then necessary to restrict the items that are used, and for this it was possible to use WDQ, the engine that could when SPARQL for Wikidata was a distant dream. Sadly it does not work anymore.

A solution is taking the list of items and copying it to Petscan, the tool Magnus favours. It uses SPARQL and it is something of a Swiss army knife for data. When you are used to earlier tools like Autolist, many of your assumptions are wrong and it takes time to discover how the tool works. It does work, and that is why there are now a large number of women who are known to be in the Colorado Women's Hall of Fame.
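
For those who want to see what such a SPARQL query could look like, here is a minimal Python sketch that only builds the query text. It assumes that membership of a hall of fame is recorded with "award received" (P166); that modelling choice, the function name and any item identifier passed in are assumptions for the illustration, not a description of how Petscan works internally.

```python
import re


def build_award_query(award_qid: str) -> str:
    """Build a SPARQL query listing people with the given award (P166),
    plus their English Wikipedia article when one exists."""
    # Refuse anything that is not a plain Wikidata item identifier.
    if not re.fullmatch(r"Q[1-9][0-9]*", award_qid):
        raise ValueError(f"not a Wikidata item identifier: {award_qid!r}")
    return f"""
SELECT ?person ?personLabel ?article WHERE {{
  ?person wdt:P166 wd:{award_qid} .
  OPTIONAL {{
    ?article schema:about ?person ;
             schema:isPartOf <https://en.wikipedia.org/> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY ?personLabel
""".strip()
```

The resulting text can be pasted into the Wikidata Query Service or into the SPARQL field of Petscan. Rows without an `?article` value are the people who still lack an English article.
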
Thanks,
      GerardM

Sunday, August 14, 2016

#Wikidata - #quality is not abstract

There is a new "Request for Comments" on quality for Wikidata. It is an attempt to describe quality in a top down approach. It is about words, it is abstract and well, I wish them well.

Wikidata has qualities. When you understand Wikidata by what it is and what it does, you understand the not so abstract qualities it has. Its principal aim is to bring structure to the data that is in the Wikimedia projects.

The first quality that Wikidata brought was that it replaced the text based interwiki links. The improvement was important; in a short space of time the quality of these interwiki links improved and the associated number of edits went down. The quality of the interwiki links is not absolute but there has been no research on the follow up.

Interwiki links represent a connection between articles of Wikimedia projects that are about the same subject. Within a Wikipedia or a Wikisource there are links that are in essence similar to Wikidata statements. When a university is mentioned, the subject may be a student or staff at that university, and when the statement has been made there is a reason for inclusion in categories. We can research the concurrence of such statements and wiki links. Quality improves when the concurrence improves.
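
One way to make "concurrence" measurable is sketched below in Python. It assumes concurrence means the share of an article's wiki links that are also the target of a statement on the article's item; that definition and the function names are my assumption for the illustration, not an established metric.

```python
from typing import Iterable, Set


def unmatched_links(wikilinks: Iterable[str], statement_targets: Iterable[str]) -> Set[str]:
    """Linked items that no statement on the subject's item points to."""
    return set(wikilinks) - set(statement_targets)


def concurrence(wikilinks: Iterable[str], statement_targets: Iterable[str]) -> float:
    """Share of linked items that are also statement targets, from 0.0 to 1.0."""
    links = set(wikilinks)
    if not links:
        return 0.0
    return len(links & set(statement_targets)) / len(links)
```

Measured before and after a round of editing, a rising number would indicate that more links are backed by statements; the unmatched links are the candidates for new statements.
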

When enough data is available, it becomes possible to use Wikidata statements in templates. Templates and info boxes expect high quality data in Wikidata and the available data is typically not good enough. When it is easy to make statements to wiki links and red links, the data in an info box will grow with the added statements.

We do need to work on the quality for our readers. This is done best by leveraging the data we have and by engaging our communities not only to link articles together but also to expand these links with the statements that bind them together.

Yes, we will have to solve abstract issues but the reality is that they are not so abstract. Issues have their basis in what Wikidata is, and we have to understand them in the light of what we hope to achieve: serving the world with the sum of all our available knowledge.
Thanks,
       GerardM

Monday, August 08, 2016

Is convergence between #Wikipedia and #Wikidata possible?

Wikidata is piggybacking on Wikipedia, I was told. This is true; much data is imported from any and all of the Wikipedias and thereby Wikidata changes for the better. It improves in quality and becomes much more than what any single Wikipedia has to offer. At the same time Wikidata is rather awkward in its use, and there has been too much thinking in terms of what people know and expect for their own project.

Perspectives evolve. I tend to think of Wikidata as not yet good enough for most purposes. It is incomplete and its quality is inconsistent when we consider statements about its items. The remedy is obvious; work on the areas that are relevant and where Wikidata can easily make a difference.

That is a fine road plan for me, but Wikipedians also use Wikidata; they even need to use Wikidata. When they add an article about a person, the authority control data is served from Wikidata and they have to add the information to Wikidata if it is to show. So what can be done to make this easy, so that the use of Wikidata and Wikipedia may converge?

One aspect that seems important is that Wikidata information needs to function in whatever edit mode. The biggest motivational handicap I found is that most of what I did does not have a visible effect. It is much more rewarding when effects are more noticeable. All wiki links in an article link to other articles that have items of their own. Why not have a toggle that either shows these links with relations or not? For the brave hearts that take an interest it is cool; the others do not even have to notice.

When such links are annotated, they result in statements, and such statements may even imply categories or other subsequent functionality. Currently bots only harvest from Wikipedia, but why not have them add to the Wikipedias in a predetermined way? It makes for a much more dynamic editing process and it will definitely improve quality.

What do you think?
Thanks,
      GerardM