Wikidata quality: a data consumers' perspective

Alessandro Piscopo

Data quality is an important topic for Wikidata, as the number of initiatives and projects around it testifies. To name just a few, the Item quality campaign relied on the work of the community to evaluate Items using a single-grading label scheme. The CoolWD project focuses on a particular dimension of quality, providing users with information about the completeness of the results of a query and allowing them to add this information to Wikidata. Furthermore, I have previously sought, in a Request for Comment (Data quality framework for Wikidata), to gather opinions from the Wikidata community in order to create an appropriate data quality framework for this platform, one rooted in prior scientific literature and distinguishing several quality dimensions.

All these projects focus either on measuring data quality from various viewpoints or on generating a conceptualisation of data quality in Wikidata. They are essential to our understanding of Wikidata, as they explore different aspects of its quality. Nevertheless, data quality is most commonly defined as "fitness for purpose". As such, it is seen from the point of view of data consumers. A degree of completeness or accuracy that is acceptable for, e.g., providing tourist information is not enough when the data is used to provide medical advice. Therefore, for a comprehensive understanding of what data quality means in Wikidata, we need a clear overview of how Wikidata is used as a resource.

Specifically, the aims of this session are to: identify the types of data consumers of Wikidata; gain an overview of the needs of each data consumer type and of the quality issues they experience.

This session is open to everyone interested in Wikidata. However, it would be ideal to have a mixed audience, with members of the Wikidata community and professionals using this project as a data resource, in order to facilitate the exchange of different points of view.
The presence of practitioners who use Wikidata as individuals, as well as those who use it as members of organisations, would be highly beneficial.

The session will be structured in three parts: a short introduction by the author of the submission about data quality-related projects concerning Wikidata (10-15 min.); an open discussion, in which attendees will be invited to report their experiences and express their ideas about the topic (35 min.); a summary of the discussion and final remarks (10-15 min.).