Thursday, February 22, 2018

#Wikimedia - The George Polk Award winners; how to catch them all

It is this time of year again; the George Polk Awards have been announced. Last year I spent a lot of time adding information to and cleaning the data at Wikidata. There are over 500 award winners known, so Reasonator does not catch them all. Listeria shows more data but multiple entries are an issue.

There are multiple reasons to complete a list like this. In this way celebrated journalists like Michael Winerip or Michael Schwirtz finally get their presence in the Wikimedia world. It is a way to celebrate journalism, important enough in this time of fake news, and it is a demonstration of how data at Wikidata can extend the quality of Wikipedia's information.

Given the number of award winners, it takes too much time to do all the work in one go. It is now largely a matter of adding the red and "black" linked award winners. At this time it is the 2014 award winners who are being added.

The problem is that time spent on one award takes away time from other projects that are equally deserving. Projects like completing information on US governors or British governors. How to register information like epidemics, because their impact is not fully appreciated. How to make plain that a source has a negative impact when it is actually retracted.

Anyway, congratulations to the George Polk Award winners for 2017; may their careers blossom with this recognition.
Thanks,
     GerardM

Sunday, February 11, 2018

#Wikidata - William Gorges, first colonial governor of the Province of Maine

Mr Gorges was born in Britain and he died in Britain. He was tasked by a nephew to oversee investments for two years and as a result he was the first colonial governor of the Province of Maine. Consequently he is said to be a citizen of the USA (he died in February 1658).

The problem with nationality and citizenship is that we tend to adopt people as belonging to something that did not exist at the time, and consequently it is a falsehood. It is the same with all these generals and governors of the Confederacy; they did not identify with the United States of America, they had their own state they swore allegiance to, so why call them citizens of the USA? How dare we?

It is the same for people from Wales, Scotland, Northern Ireland. They may have opposed the Brits but from a nationality point of view, their behaviour was judged by British law.

Associating people with states / nations that may not even have existed at the time produces false facts, pure and simple.
Thanks,
       GerardM

Friday, February 09, 2018

#Wikimedia and #Cochrane - sharing resources and sharing results

Jane Falconer, a medical librarian, wrote a really interesting blog post. She writes about the importance of reliable information to front line health professionals and stresses the importance of systematic reviews that are conducted according to recognised and tested methodologies.

The big problem: what to include in the systematic review, and what to exclude, in projects like PRISMA and Cochrane. This is the same problem we face when we seek sources for Wikipedia articles, and the Wikimedia solution to provide sources is "The Wikipedia Library Card Platform".

Cochrane and the Wikimedia Foundation are partners and one scenario I can see is one where this partnership is intensified. When Cochrane shares its results with Wikidata (they can have all the data of Wikidata anyway), the quality and the relevance of the Wikidata data improve substantially. When Cochrane volunteers can share the Library Card Platform, it will be a major contribution to their work. This in turn will help us verify the content of medical information and the quality of the sources in all our Wikipedias.
Thanks,
     GerardM

Saturday, February 03, 2018

#Wikidata - Just another award; the 2018 Newberry Library Award


I read it on Twitter; Mrs Carla Hayden received the 2018 Newberry Library Award. There are many awards that do not keep track of all the awardees. Personally I found only one other award winner. I asked on Twitter for more information and a friend found several more.

This is one of the awards that I want to keep track of. So I added a Listeria page on my user page. Every time the underlying data changes, Listeria will pick it up.

In the meantime, Mrs Hayden, congratulations.
Thanks,
       GerardM

Saturday, January 20, 2018

#Wikipedia - entering the rabbit hole

When you start reading Wikipedia, when you continue with a next article and the next, you become part of a click stream identifying what people read and how they get there. It is hugely interesting and dumps for this click stream are available for the English, Russian, German, Spanish, and Japanese Wikipedias.

Just consider; all articles on the same subject share a Wikidata identifier. This makes it possible to aggregate these click streams. When a particular link between articles is popular in multiple Wikipedias, there is a good chance that the target article, where it is still missing, will be popular as well once it is added.
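
The aggregation described above could be sketched as follows. This is a minimal illustration, not how any existing tool works: the sample rows, the titles, the placeholder identifiers ("Q_A", "Q_B", "Q_C") and the function names are all made up, and a real run would read the published click stream dumps and the Wikidata sitelinks instead.

```python
from collections import defaultdict

# Hypothetical sample rows: (wiki, source title, target title, clicks).
clicks = [
    ("enwiki", "Source A", "Target B", 120),
    ("dewiki", "Quelle A", "Ziel B", 80),
    ("eswiki", "Fuente A", "Destino B", 40),
    ("enwiki", "Source A", "Target C", 10),
]

# Hypothetical sitelinks: (wiki, title) -> placeholder Wikidata identifier.
sitelinks = {
    ("enwiki", "Source A"): "Q_A",
    ("dewiki", "Quelle A"): "Q_A",
    ("eswiki", "Fuente A"): "Q_A",
    ("enwiki", "Target B"): "Q_B",
    ("dewiki", "Ziel B"): "Q_B",
    ("eswiki", "Destino B"): "Q_B",
    ("enwiki", "Target C"): "Q_C",
}


def aggregate(clicks, sitelinks):
    """Sum clicks per (source item, target item) and note the contributing wikis."""
    totals = defaultdict(int)
    wikis = defaultdict(set)
    for wiki, source, target, n in clicks:
        src = sitelinks.get((wiki, source))
        dst = sitelinks.get((wiki, target))
        if src is None or dst is None:
            continue  # no Wikidata item known for one of the two pages
        totals[(src, dst)] += n
        wikis[(src, dst)].add(wiki)
    return totals, wikis


def suggestions(totals, wikis, sitelinks, target_wiki, min_wikis=2):
    """List target items that are popular in several wikis but missing in target_wiki."""
    have = {qid for (wiki, _title), qid in sitelinks.items() if wiki == target_wiki}
    found = [
        (dst, src, n)
        for (src, dst), n in totals.items()
        if len(wikis[(src, dst)]) >= min_wikis and dst not in have
    ]
    return sorted(found, key=lambda row: row[2], reverse=True)
```

Calling `suggestions(*aggregate(clicks, sitelinks), sitelinks, "nlwiki")` lists the items a Wikipedia without these articles might want to write first. The threshold `min_wikis` is an arbitrary choice for the sketch.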

It is always a question whether suggestions like this will be taken up, and whether the resulting articles indeed prove to be read more than just an average new article in a domain. That is however the subject of follow-up research. In the meantime it provides an argument to collect the click streams for any and all Wikipedias. Providing educated guesses of what will be popular stimulates people to write what will be read.
Thanks,
     GerardM

Thursday, January 18, 2018

#Wikimedia - #Personal - there is no silver bullet

For everything that ails any of the #Wikimedia projects, there is no silver bullet. To complicate things, there is no agreement on what it is that ails these projects, mostly because there is hardly any collaboration.

I am not a Wikipedian. I love Wikipedia but I do not identify with it. I have been involved in many projects including Wikipedia and my global account is testament to that. My involvements have been substantial and central to my motivation is the question: how can we share the sum of all knowledge, how will we reach the biggest audience and have the biggest effect?

I have been called "monomaniacal with my silver bullet du jour". Over time several topics have occupied me and this has resulted in an evolving understanding of what I perceive as issues with what we do and how we do it. If you are interested in how my opinions evolved, read my blog; it runs from 2005.

The English Wikipedia is Wikimedia's success. Its biggest problem: over 50% of its target audience does not speak English. On top of that, organisational attention in any project is mostly for English. There are several possible solutions that help us "share the sum of the knowledge that is available" more widely:
  • localisation of the user interface makes our software more usable and more user friendly
  • the user interface of Wikidata makes it easy and obvious to add labels in *your* languages
  • the data of Wikidata is used to generate texts that are cached, not saved, when there is no Wikipedia article on the subject
  • advertising the information we have; things like finished books in Wikisource
I do promote translatewiki.net for the localisation of the MediaWiki software and I would love to see the Internet Archive and the OCLC use translatewiki.net and have their services localised in all the languages that Wikipedia supports.

Reasonator is still the best interface to the Wikidata data. Data becomes informative and it makes it easy to add labels in *your* language. In essence this is again all about "sharing in the sum of all available knowledge". Hidden gems are the "Concept cloud" and the QR code available on every Reasonator page. Reasonator is just one of the many tools by Magnus that make Wikidata usable.

My main motto is "what is the purpose". When I was particularly involved in Wiktionary, I collaborated with many people in many Wiktionaries and this is where I learned to appreciate the lack of coordination that exists between projects. Thanks to wonderful people like Sabine Cretella, I developed the ideas and in the end a data model for a project that became the basis for OmegaWiki. This data model was inspected and approved by, among others, Alan K. Melby. Thanks to Jimbo I got into contact with Barend Mons and became involved in bio-medical data and science. The development of OmegaWiki happened in parallel with the main work in Wikiproteins.

At this time Wikidata and the opportunities it presents have my interest. Contrary to some, I am not an apologist for everything Wikidata and, contrary to what some say, I do not blame the development team but the group pressures that so often result in unhappy compromises and decisions. It is for instance an acknowledged fact that Wikidata descriptions are problematic and that automated descriptions are superior. "Never mind; it is what we do" is the prevailing sentiment (as always).

There is no silver bullet and consequently a result is only achieved after a lot of work. I want functionality that mimics an Algerian project I blogged about way back in 2013. To achieve this I am adding dates to the governorships of all USA states. It allows for queries like this. A next stage will be when a map of the USA is shown with all its states and a slider to move in time. It is then easy to show the governors at that time.
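
Once every governorship has a start date and an end date, the lookup behind such a slider is small. The sketch below is only an illustration under assumptions: the records are placeholders ("State X", "Governor A"), not real data, and the function name `governors_on` is made up. A real version would read the dated statements from Wikidata.

```python
from datetime import date

# Hypothetical records: (state, governor, start of term, end of term).
# An end of None means the term has no end date recorded.
terms = [
    ("State X", "Governor A", date(1900, 1, 1), date(1904, 1, 1)),
    ("State X", "Governor B", date(1904, 1, 1), date(1908, 1, 1)),
    ("State Y", "Governor C", date(1902, 6, 1), None),
]


def governors_on(terms, day):
    """Return {state: governor} for every term that covers the given day.

    A term covers its start date but not its end date, so a handover
    day belongs to the incoming governor.
    """
    result = {}
    for state, governor, start, end in terms:
        if start <= day and (end is None or day < end):
            result[state] = governor
    return result
```

Moving the slider then amounts to calling `governors_on(terms, day)` with a new day and redrawing the map. It also shows why the dates matter: without them there is nothing to filter on.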

I am not sorry that I keep on returning to issues mentioned in the past; what some people miss is the amount of continuous effort that goes into achieving them.
Thanks,
      GerardM

#Wikipedia - Cebuano; be inspired

In an answer to the Wikimedia blogpost "Inspire New Readers campaign: Raise awareness of Wikipedia where you live" I replied: 
Make sure there is a lot to read. It is counter intuitive but the Cebuano Wikipedia approach with a twist could make a huge difference. The difference; caching generated content and not saving it. Do not mistake the absence of information in hand written articles as preferable over providing no information.
I was asked to expand on it.

My comment was not intended to dwell on the past (given the overload of acrimony, not that inspiring) but on a future where we share in "the sum of knowledge that is available to us". The Cebuano Wikipedia is one of the biggest Wikipedias because a bot started with publicly available data on a subject, built a text with variables for the data and built Wikipedia articles from the data. As a result all that data had to be linked in Wikidata and there were a lot of complaints. The one undeniable point: there are errors in the data and even when we fix them in Wikidata, they are not fixed in that Wikipedia.

The twist: the public data is imported into Wikidata first. The text is generated in the same way, but it is cached and not saved as an article. It follows that when the data is in error and is corrected, the cache will expire and the new text will have the latest and greatest. In this way we do provide information in the local language to the best of our knowledge and ability.
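
To make the difference between caching and saving concrete, here is a minimal sketch. Everything in it is an assumption for the sake of illustration: the class `GeneratedTextCache`, the one-sentence template, the field names and the idea of passing in a revision number are not taken from any existing bot or extension.

```python
import time

# Hypothetical per-language templates with variables for the data.
TEMPLATES = {
    "en": "{label} is a {type} in {country}.",
}


def render(data, language):
    """Fill the template for this language with the current data."""
    return TEMPLATES[language].format(**data)


class GeneratedTextCache:
    """Keeps generated text until the data changes or the entry expires."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # (item_id, language) -> (revision, expires_at, text)

    def get(self, item_id, language, revision, data, render):
        key = (item_id, language)
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] == revision and hit[1] > now:
            return hit[2]  # same data, still fresh: reuse the text
        # New revision or expired entry: generate again from the current data.
        text = render(data, language)
        self._store[key] = (revision, now + self.ttl, text)
        return text
```

The text is never stored as an article. A corrected item arrives with a new revision, the old entry no longer matches and the reader gets text built from the corrected data.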

Wikidata has an ever growing amount of data on subjects that are unlikely to generate Wikipedia articles in any language. It does not mean that we could not provide information. What it takes is accepting counterintuitive arguments: use tools like Reasonator, make use of the LSJbot and accept that search results should include what is stored elsewhere, something that has been in production on several Wikipedias for years now.

Our objective is to "share the sum of all knowledge". I will be happy when we share the knowledge that is available to us.
Thanks,
      GerardM

Wednesday, January 10, 2018

#Wikipedia - fiduciary responsibilities for #Wikipedia #Medical

Retraction Watch has a very relevant article for one of the most important resources for medical information: Wikipedia. Its title: “A concerning – largely unrecognised – threat to patient safety:” Nursing reviews cite retracted trials. It is a follow-up interview with Richard Gray, the principal author of an article in the International Journal of Nursing Studies.

Given that Wikipedia is the most read resource by medical practitioners, the interview has many relevant pointers on ensuring safe practices. I quote them from the paper and with some modifications they apply to any and all sources used in Wikimedia content.

  1. A retraction filter (or whatever mechanism the database in question allows) must be applied to the end output of any search strategy.
  2. Journals/databases must make retractions more visible (step 1 above depends on it).
  3. Collaborations (e.g. Cochrane, Campbell, The JBI) need to incorporate into their handbooks directives around retraction. For example, a scan for retractions after data sourcing; a scan for retractions before data extraction; a scan for retractions before review submission.
  4. The reporting guidelines for systematic reviews (Preferred Reporting Items for Systematic Reviews and Meta-Analyses, PRISMA) needs to include an item stating that authors have checked if any included studies have been retracted.
  5. Journal editors should require authors, when submitting manuscripts, to confirm that they have checked that none of the included studies have been retracted. Authors should also include a statement in the paper stating they have done this.
  6. Proofreaders may also have an important role to play. For example, authors of one review included in their reference list a citation that clearly indicated the reference was for a retracted paper. Proofreaders could be trained to spot and report these anomalies.
Registering retractions in Wikidata would be a start.
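
Point 1 of the list, a retraction filter on the end output of a search, could look like the sketch below. It is an illustration under assumptions: the DOIs and titles are made up, the function name `split_retracted` is hypothetical, and where the set of retracted DOIs comes from (Wikidata, a journal database) is left open.

```python
def normalise(doi):
    """DOIs are case-insensitive; compare them trimmed and in lower case."""
    return doi.strip().lower()


def split_retracted(results, retracted_dois):
    """Split search results into usable records and records to flag as retracted."""
    retracted = {normalise(doi) for doi in retracted_dois}
    kept, flagged = [], []
    for record in results:
        doi = record.get("doi")
        if doi and normalise(doi) in retracted:
            flagged.append(record)
        else:
            kept.append(record)  # records without a DOI cannot be checked here
    return kept, flagged


# Made-up example data; none of these DOIs refers to a real paper.
results = [
    {"title": "Trial one", "doi": "10.1234/example.1"},
    {"title": "Trial two", "doi": "10.1234/EXAMPLE.2"},
    {"title": "Trial three", "doi": None},
]
retracted_dois = {"10.1234/example.2"}
```

With this data `split_retracted(results, retracted_dois)` keeps "Trial one" and "Trial three" and flags "Trial two". Records without a DOI pass through unchecked, which is exactly why step 2, making retractions more visible, matters.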
Thanks,
     GerardM

#Wikidata - Rachael E Jack; Spearman medal winner

On Facebook I mentioned a 2016 blog post about the Spearman Medal. I checked for missing entries; they were the two 2017 award winners, Mrs Claire Haworth and Mrs Rachael E Jack.

Adding award winners to Wikidata is something I do regularly. It always starts with a search. Mrs Jack was known as "Rachael Jack" on Wikipedia and by drilling down into the ORCID information I found confirmation that this is indeed the same person.

Mrs Haworth is known to ORCID as well, and through a link to a profile, there was a confirmation that it was the same person; the award winner of the Spearman medal.

Typically I do not spend that much time on red links. What I wanted to know is the value of the network. Given the titles of publications known at ORCID, some of the publications of Mrs Haworth could already be found in Wikidata and were linked.

Thanks to all the work done on scholarly publications, scaffolding information for Wikipedia articles becomes available. These two ladies are notable if only because of being recipients of the Spearman medal.
Thanks,
     GerardM