The future of the Global Open Data Index: assessing the possibilities

Open Knowledge International - November 1, 2017 in Global Open Data Index, godi, GODI16, Open Government Data, open-government

In the last couple of months we have received questions from a few members of our Network about the status of the new Global Open Data Index (GODI). This blogpost updates everyone on the status of GODI and what comes next.

But first, some context: GODI is one of the biggest assessments of the state of open government data globally, alongside the Web Foundation’s Open Data Barometer. Year after year we notice persistent obstacles for open data. High-income countries regularly secure top rankings, yet overall there is little to no development in many countries. As our latest State Of Open Government Data in 2017 report shows, data is often not made publicly available at all. Where it is, we see many issues around findability, quality, processability, and licensing. Individual countries are notable exceptions to the rule. The Open Data Barometer made similar observations in its latest report, mentioning a slow uptake of policy, as well as persistent data quality issues in countries that provide open data. So there is still a lot of work to be done.

To resolve issues like engagement with our community, we started to explore alternative paths for GODI. This includes a shift in focus from a mere measurement tool to a stronger conversational device between our user groups throughout the process. We understand that we need to speak to new audiences and focus more on measurement as a tool in real-world applications. We want to understand the use cases of the Open Data Survey (the tool that powers GODI and the Open Data Census) in different contexts and with different goals. So far we have seen only a few of the possible uses of the tool in the open data sphere, and we want to see more.

In order to learn more about how GODI is taken up by different user groups, we are also currently exploring GODI’s effects on open data policy and publication. We wish to understand more systematically how individual elements of the GODI interface (such as country rankings, dataset results, and discussion forum entries) help mobilise support for open data among different user groups. Our goal is to understand how to improve our survey design and workflow so that they more directly support action around open data policy and publication.

In addition, we are developing a new vision for the Open Data Index to measure open data either at a regional and city level or by topical area. We will elaborate on this vision in a follow-up blogpost soon.

Taking this all into account, we have decided to focus on the aforementioned use cases and a regional Index during 2018. In the meantime, we will still work with our community to define a vision that will make GODI a sustainable measurement tool: we understand that tracking the changes in government data publication is crucial for activists and governments themselves. We know that progress around open data is slower than we would like it to be, which is all the more reason to ensure that discussions around open data do not end. Please do not hesitate to submit new discussions around country entries on our forum or reach out to us if you have any ideas on how to take GODI forward and improve it.

If you’re running an Open Data Census, we’ll continue to support you in the measurement you’re currently working on, whether it’s local or regional, or a new idea for a Census you’d like to try. If you want to run your own Census, you can request it here, or send an email to index@okfn.org to see how we could collaborate further.

Research call: Mapping the impacts of the Global Open Data Index

Open Knowledge International - September 6, 2017 in Global Open Data Index

The Global Open Data Index (GODI) is a worldwide assessment of open data publication in more than 90 countries. It provides evidence of how well governments perform in open data publication. This call invites interested researchers and organisations to systematically study the effects of the Global Open Data Index on open data publication and the open data ecosystem. The study will identify the different actors engaged around GODI, and how the information provided by GODI has helped advance open data policy and publication. It will do so by investigating a sample of three countries with different degrees of open data adoption. The work will be conducted in close collaboration with Open Knowledge International’s (OKI) research department, which will provide guidance, review and assistance throughout the project.

We invite interested parties to send their costed proposal to research@okfn.org. In order to be eligible, the proposal must include the applicant’s research background, a short description of why they are interested in the topic and how they want to research it (300 words maximum), a track record demonstrating knowledge of the topic, as well as a written research sample on open data or related fields. Finally, the proposal must also specify how much time will be committed to the work and at what cost (in GBP or USD). Due to the nature of the funding supporting this work, we unfortunately cannot accept proposals from US-based people or organisations. Please make sure the submission is made before the proposal deadline of Wed 13 Sept, 21:00 UTC.

Background

The Global Open Data Index (GODI) is a worldwide assessment of open data publication in more than 90 countries. It provides evidence of how well governments perform in open data publication. This includes mapping accessibility and access controls, findability of data, key data characteristics, as well as open licensing and machine-readability. At the same time, GODI provides a venue for open data advocates and civil servants to discuss the production of open data. Evidence shows that governance indicators drive change if they embrace dialogue and mutual ownership by those who are assessed and those who assess. This year we wanted to use the launch of GODI to spark dialogue and provide a venue for the ensuing discussions. Through this dialogue, governments learn about key datasets and data quality issues, while also receiving targeted feedback to help them improve. Furthermore, every year many interactions happen outside of the GODI process, without involving GODI staff or public discussions. Instead, results are discussed within public institutions, or among civic actors and public institutions. Some scarce evidence of GODI’s outcomes is available, yet a systematic understanding of the diverse types of effects is missing to date.

Scope of research

This research is intended to get a systematic understanding of the effects of the Global Open Data Index on open data publication and the open data ecosystem. It addresses three research questions:
  1. In what ways does the Global Open Data Index process mobilize support for open data in countries with different degrees of open data policy and publication? How does this support manifest itself?
  2. How does the Global Open Data Index influence open data publication in governments both in terms of quantity and quality of data?
  3. How do different elements of the Global Open Data Index help governments and civil society actors to drive progress around questions 1 and 2?
GODI’s effects can tentatively be grouped into high-level policy and strategy development as well as strategy implementation and ongoing publication. This research will assess how different actors such as civil servants, high-level government officials, open data advocates and communities engage with different elements of GODI and how this helps advance open data policy and publication. The research should also, whenever applicable, provide a critical account of GODI’s adverse effects. This can include ‘ceiling effects’, tunnel vision and reactivity, or other effects. The research will assess these effects in three countries. These may include Argentina, Colombia, Ukraine, South Africa, Thailand, or others. It is possible to propose alternative countries if the researcher has strong experience in those or if it would help gather data for the research. Proposals should specify which three countries would be assessed. If alternative countries are proposed, they should meet the following criteria:
  1. One country without a national open data policy, one country with a recent open data policy (in effect between 3 months and 2 years), as well as countries with established open data policies (older than 2 years)
  2. A mix of countries with different levels of endorsement for GODI, including countries that have actively announced an intention to improve their ranking (high importance) and countries where no public claims for open data improvement are documented
  3. Presence of country in past two GODI editions
  4. May include members of the Open Government Partnership and Open Data Charter adopters, as well as non-members.

Deliverables

The work will provide a written report of between 5000 and 7000 words in length addressing each of the research questions. The report must include a clearly written methodology section and country sampling approach. The desired format is a narrative report in English. A qualitative, critical assessment of GODI’s effects on open data policy and publication is expected. It needs to describe the actors using GODI, how they interacted with different aspects of GODI, and how this helped to drive change around the first two research questions outlined above. Furthermore, the following deliverables are expected:
  • Interviews with at least four interviewees per country
  • A semi-structured interview guide
  • Draft report by 15 October, structured around country portraits for three sample countries.
  • Weekly catch-ups with the Research team at OKI
  • Final report by 1 November

Methods and data sources

The researcher can draw from several sources to start this research, including OKI’s country contacts, Global Open Data Index scores, etc. Suggested methodology approaches include interviews with government officials and GODI contributors, as well as document analysis. Alternative research approaches and data sources shall be discussed with OKI’s research team. The research team will provide assistance in sampling interviewees in the initial phase of the research.

Activities

It is expected that this work is conducted in close contact with OKI’s research department. We will arrange a kick-off meeting to discuss your approach and have weekly calls to discuss activity and progress on the work. Early drafts will be shared with the OKI team to provide comments and discuss them with you. In addition we will have a final reflection call. Remote availability is expected (via email, Skype, Slack, or other channels). Overall research outline and goals will be discussed and agreed upon with the research lead of GODI who will help in sampling countries and will review project progress.

Decision criteria

We will base our selection of a research party on the following criteria:
  • Evidence of an understanding of open data assessments and indicators, and their influence on policy development and implementation.
  • Track record in the field of open data assessment and measurement.
  • Clarity and feasibility of the methodology you propose to follow.
Due to the nature of the funding supporting this work, we unfortunately cannot accept proposals from US-based people or organisations. Please make sure the submission is made before the proposal deadline of Wed 13 Sept, 21:00 UTC.

Using the Global Open Data Index to strengthen open data policies: Best practices from Mexico

Oscar Montiel - August 16, 2017 in Global Open Data Index, Open Data Index, Open Government Data, Open Knowledge

This is a blog post coauthored with Enrique Zapata, of the Mexican National Digital Strategy. As part of the last Global Open Data Index (GODI), Open Knowledge International (OKI) decided to have a dialogue phase, where we invited individuals, CSOs, and national governments to exchange points of view and knowledge about the data, and to understand data publication in a more useful way. In this process, we had a number of valuable exchanges that we tried to capture in our report about the state of open government data in 2017, as well as in the records in the forum. Additionally, we decided to highlight the dialogue process between the government and civil society in Mexico and its results towards improving data publication in the executive authority, as well as funding to expand this work to other authorities and improve the GODI process. Here is what we learned from the Mexican dialogue:

The submission process

During this stage, GODI tries to directly evaluate how easy it is to find government data and what its quality is in general. To achieve this, civil society and government actors discussed how best to submit and agreed to submit together, based on the actual data availability.

Besides creating an open space to discuss open data in Mexico and agreeing on a joint submission process, this exercise showed some room for improvement in the characteristics that GODI measured in 2016:
  • Open licenses: In Mexico and many other countries, the licenses are linked to datasets through open data platforms. This showed some discrepancies with the sources referenced by the reviewers since the data could be found in different sites where the license application was not clear.
  • Data findability: Most of the datasets assessed in GODI are the responsibility of the federal government and are available on datos.gob.mx. Nevertheless, the titles used to identify the datasets are based on technical regulation needs, which makes it difficult for data users to easily reach the data.
  • Differences between government levels and authorities: GODI assesses national governments, but some of these datasets – such as land rights or national laws – are in the hands of other authorities or local governments. This means that some datasets can’t be published by the federal government, since they are not in its jurisdiction and it can’t make publication of these data mandatory.
 

Open dialogue and the review process

During the review stage, the Open Data Office of the National Digital Strategy took the feedback into account and worked on some of the issues raised. They convened a new session with civil society, including representatives from the Open Data Charter and OKI, in order to:
  • Agree on the state of the data in Mexico according to GODI characteristics;
  • Show the updates and publication of data requested by GODI;
  • Discuss paths to publish data that is not responsibility of the federal government;
  • Converse about how they could continue to strengthen the Mexican Open Data Policy.

The results

As a result of this dialogue, we agreed on six actions that could be implemented internationally, beyond the Mexican context, both by governments with centralised open data repositories and by those which don’t centralise their data, as well as a way to improve the GODI methodology:
  1. Open dialogue during the GODI process: Mexico was the first country to develop a structured dialogue to agree with open data experts from civil society about submissions to GODI. The Mexican government will seek to replicate this process in future evaluations and include new groups to promote open data use in the country. OKI will take this experience into account to improve the GODI processes in the future.
  2. Open licenses by default: The Mexican government is reviewing and modifying their regulations to implement the terms of Libre Uso MX for every website, platform and online tool of the national government. This is an example of good practice which OKI have highlighted in our ongoing Open Licensing research.
  3. “GODI” data group in CKAN: Most data repositories allow users to create thematic groups. In the case of GODI, the Mexican government created the “Global Open Data Index” group in datos.gob.mx. This will allow users to access these datasets based on their specific needs.
  4. Create a link between government built visualization tools and datos.gob.mx: The visualisations and reference tools tend to be the first point of contact for citizens. For this reason, the Mexican government will have new regulations in their upcoming Open Data Policy so that any new development includes visible links to the open data they use.
  5. Multiple access points for data: In August 2018, the Mexican government will launch a new section on datos.gob.mx to provide non-technical users easy access to valuable data. These data, called ‘Infraestructura de Datos Abiertos MX’, will be divided into five categories that are easy to explore and understand.
  6. Common language for data sets: Government naming conventions aren’t the easiest to understand and can make it difficult to access data. The Mexican government has agreed to change the names to use more colloquial language, which can help data findability and promote the use of the data. Where this is not possible for some datasets, the government will go for an option similar to the one established in point 5.
We hope these changes will be useful for data users as well as other governments who are looking to improve their publication policies. Got any other ideas? Share them with us on Twitter by messaging @OKFN or send us an email to index@okfn.org  

The final Global Open Data Index is now live

Oscar Montiel - June 15, 2017 in Global Open Data Index

The updated Global Open Data Index has been published today, along with our report on the state of Open Data this year. The report includes a broad overview of the problems we found around data publication and how we can improve government open data. You can download the full report here. Also, after the Public Dialogue phase, we have updated the Index. You can see the updated edition here. We will also keep our forum open for discussions about open data quality and publication. You can see the conversation here.

Open data quality – the next shift in open data?

Χριστίνα Καρυπίδου - June 10, 2017 in Featured, Featured @en, Global Open Data Index, News, Open Data Handbook, ανοικτά δεδομένα, Νέα

From Open Knowledge International. This post is part of the Global Open Data Blog. It is a call to recalibrate our attention to the many different elements that contribute to the ‘good quality’ of open data, the trade-offs between them and how they support data usability (see here some vital work […]

What data do we need? The story of the Cadasta GODI fellowship

Mor Rubinstein - June 9, 2017 in Global Open Data Index

This blogpost was written by Lindsay Ferris and Mor Rubinstein.

There is a lot of data out there, but which data do users need to solve their issues? How can we, as an external body, know which data is vital so we can measure it? Moreover, what should we do when data is published at so many levels (local, regional and federal) that it is hard to find? Every year we think about these questions in order to improve the Global Open Data Index (GODI) and make it more relevant to civil society. Having the relevant data characteristics is crucial for data use, since without specific data it is hard to analyse and learn.

After the publication of GODI 2015, Cadasta Foundation approached us to discuss the results of GODI in the land ownership category. Throughout this initial, lively discussion, we noticed that a systematic understanding of land data in general, and land ownership data in particular, was missing. We decided to bridge these gaps to build a systematic understanding of land ownership data for the 2016 GODI.

And so the idea of the GODI fellowship came to life. It was simple: Cadasta would host a fellow for a period of 6 months to explore the publication of data that is relevant to land ownership issues. The fellowship would be funded by Cadasta and the fellow would be an integral part of the team. OKI would give in-kind support in the form of guidance and research. The fellowship goals were:
  • Global policy analysis of open data in the field of land and resource rights;
  • Better definition for the land ownership dataset in the Global Open Data Index for 2016;
  • Mapping stakeholders and partners for the Global Open Data Index (for submissions);
  • Recommendations for a thematic Index;
  • A working paper or a series of blog posts about open data in land and resource ownership.
Throughout the fellowship, Lindsay conducted interviews with land experts, NGOs and government officials as well as on-going desk research on the land data publication practices across different contexts. She established 4 key outputs:
  1. Outlining the challenges of opening land ownership data. Blog post here.
  2. Mapping the different types of land data and their availability. Overview here.
  3. Assessing the privacy and security risks of opening certain types of land data. See our work here: cadasta.org/open-data/assessing-the-risks-of-opening-property-rights-data/
  4. Identifying user needs and creating user personas for open land data. User personas here.

Throughout the GODI process, our aim is to advocate for datasets that different stakeholders actually need and that make sense within the context in which they are published. For example, one of the main challenges in land ownership is that data is not always recorded or gathered at the federal level, and is instead collected in cities and regions. Among the primary users of land ownership data are other government agencies. Having a grasp of this type of knowledge helped us better define the land ownership dataset for GODI. Ultimately, we developed a thoughtful definition based on these reflections and recommendations.

For us at OKI, having someone dedicated to this work in an organisation that is an expert in a data category was immensely helpful. It makes the Index categories more relevant for real-life use and helps us to measure the categories better. It helps us make sure that our assumptions and the foundation for the research are sound. For Cadasta, having a person dedicated to open data helped to create a knowledge base and resources that help them look at open data better. It was a win-win for both sides. In fact, the work Lindsay was doing was so valuable for Cadasta that her time there was extended, and she worked on a case study about open data and land in Sao Paulo, the Land Debate final report, and a paper on Open Data in Land Governance for the 2017 World Bank Land and Poverty Conference.

Going forward in the future of open data assessment, we believe that having this expert input in the design of the survey is crucial. Having only an open data lens can lead us to bias and wrong measurements. In our vision, we see the GODI tool as a community-owned assessment that can help all fields to promote, find and use the data that is relevant for them.

Interested in thinking about the future of your field through open data? Write to us on the forum – https://discuss.okfn.org/c/open-data-index/global-open-data-index-2016

The state of open licensing in 2017

Danny Lämmerhirt - June 8, 2017 in Global Open Data Index, Open Definition, Open Government Data, Open Knowledge

This blog post is part of our Global Open Data Index (GODI) blog series. Firstly, it discusses what open licensing is and why it is crucial for opening up data. Afterward, it outlines the most urgent issues around open licensing as identified in the latest edition of the Global Open Data Index and concludes with 10 recommendations on how open data advocates can unlock this data. The blog post was jointly written by Danny Lämmerhirt and Freyja van den Boom.

Open data must be reusable by anyone and users need the right to access and use data freely, for any purpose. But legal conditions often block the effective use of data. Whoever wants to use existing data needs to know whether they have the right to do so. Researchers cannot use others’ data if they are unsure whether they would be violating intellectual property rights. For example, a developer wanting to locate multinational companies in different countries and visualize their paid taxes can’t do so unless they can find out how this business information is licensed. Clear and open licenses attached to the data, which allow for use with the least restrictions possible, are necessary to make this happen.

Yet, open licenses still have a long way to go. The Global Open Data Index (GODI) 2016/17 shows that only a small portion of government data can be used without legal restrictions. This blog post discusses the status of ‘legal’ openness. We start by explaining what open licenses are and discussing GODI’s most recent findings around open licensing. We conclude by offering policymakers and decision-makers practical recommendations to improve open licensing.

What is an open license?

As the Open Definition states, data is legally open “if the legal conditions under which data is provided allow for free use”. For a license to be an open license, it must comply with the conditions set out under the Open Definition 2.1. These legal conditions include specific requirements on use, non-discrimination, redistribution, modification, and no charge.

Why do we need open licenses?

Data may fall under copyright protection. Copyright grants the author of an original work exclusive rights over that work. If you want to use a work under copyright protection, you need to have permission. There are exceptions and limitations to copyright under which permission is not needed, for example when the data is in the ‘public domain’ (that is, it is not, or is no longer, protected by copyright), or when your use is permitted under an exception.

Be aware that some countries also allow legal protection for databases, which limits what use can be made of the data and the database. It is important to check what the national requirements are, as they may differ.

Because some types of data (papers, images) can fall under the scope of copyright protection, we need data licensing. Data licensing helps solve practical problems, including not knowing whether the data is indeed copyright protected and how to get permission. Governments should therefore clearly state whether their data is in the public domain or, when the data falls under the scope of copyright protection, what the license is.
  • When data is public domain it is recommended to use the CC0 Public Domain license for clarity.
  • When the data falls under the scope of copyright it is recommended to use an existing Open license such as CC-BY to improve interoperability.
Using Creative Commons or Open Data Commons licenses is best practice. Many governments already apply one of the Creative Commons licenses (see this wiki). Some governments have chosen however to write their own licenses or formulate ‘terms of use’ which grant use rights similar to widely acknowledged open licenses. This is problematic from the perspective of the user because of interoperability. The proliferation of ever more open government licenses has been criticized for a long time. By creating their own versions, governments may add unnecessary information for users, cause incompatibility and significantly reduce reusability of data.  Creative Commons licenses are designed to reduce these problems by clearly communicating use rights and to make the sharing and reuse of works possible.  

The state of open licensing in 2017

Initial results from GODI 2016/17 show that only roughly 38 percent of the eligible datasets were openly licensed (this value may change slightly after the final publication on June 15). The other licenses include many use restrictions, including limitations to non-commercial purposes and restrictions on reuse and/or modification of the data.

Where data is openly licensed, best practices are hardly ever followed

In the majority of cases, our research team found that governments apply general terms of use instead of specific licenses for the data. Open government licenses and Creative Commons licenses were seldom used. As outlined above, this is problematic. Customized licenses or terms of use may:
  • Require specific attribution statements desired by the publisher
  • Add clauses that make it unclear how data can be reused and modified.
  • Adapt licenses to local legislation
Throughout our assessment, we encountered unnecessary or ambivalent clauses, which in turn may cause legal concerns, especially when people consider using data commercially. Sometimes we came across redundant clauses that cause more confusion than clarity. For example, clauses may forbid using data in an unlawful way (see also the discussion here).

Standard open licenses are intended to reduce legal ambiguity and enable everyone to understand use rights. Yet many licenses and terms contain unclear clauses or do not make obvious what data they refer to. This can, for instance, mean that governments restrict the use of substantial parts of a database (and only allow the use of insignificant parts of it). We recommend that governments give clear examples of which use cases are acceptable and which ones are not.

Licenses do not make clear enough to what data they apply

Data should include a link to the license, but this is not commonly done. For instance, in Mexico, we found out that procurement information available via Compranet, the procurement platform for the Federal Government, was openly licensed, but the website does not state this clearly. Mexico hosts the same procurement data on datos.gob.mx and applies an open license to this data. As a government official told us, the procurement data is therefore openly licensed, regardless of where it is hosted. But again, this is not clear to the user who may find this data on a different website. Therefore we recommend always accompanying the data with a link to the license. We also recommend having a license notice attached to or ‘in’ the data, and keeping the links updated to avoid ‘link rot’.

The absence of links between data and legal terms makes an assessment of open licenses impossible

Users may need to consult legal texts and see whether the rights granted comply with the Open Definition. Problems arise if there is no clear explanation or translation available of what specific licenses entail for the end user. One problem is that users need to translate the text, and when the text is not in a machine-readable format they cannot use translation services. Our experience shows that this was a significant source of error in our assessment. If open data experts struggle to assess public domain status, the problem is exacerbated even further for open data users. Assessing public domain status requires substantial knowledge of copyright – something the use of open licenses explicitly aims to avoid.

Copyright notices on websites can confuse users

In several cases, submitters and reviewers were unable to find any terms or conditions. In the absence of any other legal terms, submitters sometimes referred to copyright notices that they found in website footers. These copyright details, however, do not necessarily refer to the actual data. Often they are simply a standard copyright notice referring to the website.

Recommendations for data publishers

Based on our findings, we prepared 10 recommendations that policymakers and other government officials should take into account:
  1. Does the data and/or dataset fall under the scope of IP protection? Often government data does not fall under copyright protection and should not be presented as such. Governments should be aware and clear about the scope of intellectual property (IP) protection.
  2. Use standardized open licenses. Open licenses are easily understandable and should be the first choice. The Open Definition provides conformant licenses that are interoperable with one another.
  3. In some cases, governments might want to use a customized open government license. These should be as open as possible with the least restrictions necessary and compatible (see point 2). To guarantee a license is compatible, the best practice is to submit the license for approval under the Open Definition.
  4. Pinpoint within the license exactly which data it refers to, and provide a timestamp for when the data was provided.
  5. Clearly publish open licensing details next to the data. The license should be clearly attached to the data and be both human and machine-readable. It also helps to have a license notice ‘in’ the data.
  6. Maintain the links to licenses so that users can access license terms at all times.
  7. Highlight the license version and provide context on how the data can be used.
  8. Whenever possible, avoid restrictive clauses that are not included in standard licenses.
  9. Re-evaluate the web design and avoid confusing and contradictory copyright notices in website footers, as well as disclaimers and terms of use.
  10. When government data is in the public domain by default, make clear to end users what that means for them.
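
Recommendations 5 to 7 lend themselves to an automated check. The following is only a minimal sketch, assuming the data is published together with a Data Package-style descriptor whose `licenses` entries carry `name` and `path` fields. The descriptor contents and the helper name `license_problems` are hypothetical and chosen for illustration.

```python
import json

# Hypothetical descriptor published next to the data file (illustration only).
DESCRIPTOR = json.loads("""
{
  "name": "procurement-contracts",
  "licenses": [
    {"name": "CC-BY-4.0",
     "path": "https://creativecommons.org/licenses/by/4.0/",
     "title": "Creative Commons Attribution 4.0"}
  ]
}
""")


def license_problems(descriptor):
    """Return a list of human-readable problems with the license notice."""
    problems = []
    licenses = descriptor.get("licenses") or []
    if not licenses:
        problems.append("no license is attached to the data")
    for i, lic in enumerate(licenses):
        # The name should identify the license and its version.
        if not lic.get("name"):
            problems.append(f"license {i}: missing name")
        # The path should be a web link to the full license text.
        path = lic.get("path") or ""
        if not path.startswith(("http://", "https://")):
            problems.append(f"license {i}: missing or non-web link")
    return problems


print(license_problems(DESCRIPTOR))     # descriptor with a complete notice
print(license_problems({"name": "x"}))  # descriptor without any license
```

The sketch only inspects the descriptor. Detecting ‘link rot’ would additionally require requesting each link at regular intervals, which is left out here.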
 

Open data quality – the next shift in open data?

Open Knowledge International - May 31, 2017 in Data Quality, Global Open Data Index, GODI16, Open Data

This blog post is part of our Global Open Data Index blog series. It is a call to recalibrate our attention to the many different elements contributing to the ‘good quality’ of open data, the trade-offs between them and how they support data usability (see here some vital work by the World Wide Web Consortium). Focusing on these elements could help governments to publish data that can be easily used. The blog post was jointly written by Danny Lämmerhirt and Mor Rubinstein.

Some years ago, open data was heralded as a way to unlock information for the public that would otherwise remain closed. In the pre-digital age, information was locked away, and an array of mechanisms was necessary to bridge the knowledge gap between institutions and people. So when the open data movement demanded “Openness By Default”, many data publishers followed the call by releasing vast amounts of data in its existing form to bridge that gap.

To date, it seems that opening this data has not reduced but rather shifted and multiplied the barriers to the use of data, as Open Knowledge International’s research around the Global Open Data Index (GODI) 2016/17 shows. Together with data experts and a network of volunteers, our team searched, accessed, and verified more than 1400 government datasets around the world. We found that data is often stored in many different places on the web, sometimes split across documents, or hidden many pages deep on a website. Often data comes in various access modalities. It can be presented in various forms and file formats, sometimes using uncommon signs or codes that are in the worst case only understandable to their producer.

As the Open Data Handbook states, these emerging open data infrastructures resemble the myth of the ‘Tower of Babel’: more information is produced, but it is encoded in different languages and forms, preventing data publishers and their publics from communicating with one another. What makes data usable under these circumstances? How can we close the information chain loop? The short answer: by providing ‘good quality’ open data.

Understanding data quality – from quality to qualities

The open data community needs to shift focus from mass data publication towards an understanding of good data quality. Yet there is no shared definition of what constitutes ‘good’ data quality. Research shows that there are many different interpretations and ways of measuring data quality. They include data interpretability, data accuracy, timeliness of publication, reliability, trustworthiness, accessibility, discoverability, processability, and completeness. Since people use data for different purposes, certain data qualities matter more to one user group than to others. Some of these areas are covered by the Open Data Charter, but the Charter does not explicitly name them as ‘qualities’ which add up to high quality.

Current quality indicators are not complete – and miss the opportunity to highlight quality trade-offs

Existing indicators also assess data quality very differently, potentially framing our language and thinking of data quality in opposite ways. For example, some indicators focus on the content of data portals (number of published datasets) or access to data. A small fraction focus on datasets, their content, structure, understandability, or processability. Even GODI and the Open Data Barometer from the World Wide Web Foundation do not share a common definition of data quality. Arguably, the diversity of existing quality indicators prevents a targeted and strategic approach to improving data quality.

At the moment GODI sets out the following indicators for measuring data quality:
  • Completeness of dataset content
  • Accessibility (access-controlled or public access?)
  • Findability of data
  • Processability (machine-readability and amount of effort needed to use data)
  • Timely publication
This leaves out other qualities. We could ask if data is actually understandable by people. For example, is there a description of what each part of the data content means (metadata)?

Improving quality by improving the way data is produced

Many data quality metrics are (rightfully so) user-focussed. However, it is critical that governments, as data producers, better understand, monitor and improve the inherent quality of the data they produce. Measuring data quality can incentivise governments to design data for impact by raising awareness of the quality issues that would otherwise make data files practically impossible to use. At Open Knowledge International, we target data producers and the quality issues of data files mostly via the Frictionless Data project. Notable projects include the Data Quality Spec, which defines some essential quality aspects for tabular data files. GoodTables provides structural and schema validation of government data, and the Data Quality Dashboard enables open data stakeholders to see data quality metrics for entire data collections “at a glance”, including the number of errors in a data file. These tools help to develop a more systematic assessment of the technical processability and usability of data.
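
To make the idea of structural checks on tabular data more concrete, here is a minimal sketch in plain Python. It is not GoodTables and does not implement the Data Quality Spec. The function name `structural_errors` and the checks chosen (blank header, duplicate header, blank row, row length mismatch) are assumptions for illustration.

```python
import csv
import io


def structural_errors(csv_text):
    """Return (row_number, message) tuples for basic structural problems."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [(0, "file is empty")]
    errors = []
    header = rows[0]
    seen = set()
    # Header checks: every column needs a unique, non-blank name.
    for col, name in enumerate(header, start=1):
        if not name.strip():
            errors.append((1, f"column {col} has a blank header"))
        elif name in seen:
            errors.append((1, f"duplicate header '{name}'"))
        seen.add(name)
    # Row checks: no blank rows, and each row matches the header width.
    for number, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            errors.append((number, "blank row"))
        elif len(row) != len(header):
            errors.append(
                (number, f"expected {len(header)} cells, found {len(row)}")
            )
    return errors


# Hypothetical sample with a duplicate header and a short row.
SAMPLE = "id,amount,amount\n1,10,EUR\n2,20\n"
for row_number, message in structural_errors(SAMPLE):
    print(row_number, message)
```

Counting the returned errors per file gives a rough figure comparable to the error counts mentioned above, although real validators cover many more cases, such as encoding and schema violations.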

A call for joint work towards better data quality

We are aware that good data quality requires joint solutions. Therefore, we would love to hear your feedback. What are your experiences with open data quality? Which quality issues hinder you from using open data? How do you define these data qualities? What could the GODI team improve? Please let us know by joining the conversation about GODI on our forum.

Measuring the Openness of Government Data in the Balkans

Blina Meta - May 24, 2017 in Global Open Data Index

Open Data Kosovo is a civic-tech organization that uses technology to contribute towards social good. The organization has created an exciting network of partners both local and international while working on projects related to visualizing procurement data, mapping satellite imagery for human rights violations, data collection and entry of 112 emergency calls, countering violent extremism online, providing digital solutions to public institutions, index measurement of the degree of openness of public institutions, visualizing election data, growth of the female coders community, and more. This portfolio made us a trustworthy candidate for the next task from Open Knowledge International: measuring the state of openness of government data for the countries in South Eastern Europe: Bulgaria, Macedonia, Serbia, Kosovo, Croatia, Albania, Slovenia, Bosnia and Herzegovina, Romania, Montenegro.

We agreed to the task, and thereby the journey of measuring the openness of the South Eastern European countries began. We had a two-month submission period, which at first glance looked like enough time, but that’s always a tricky perspective. The first weeks were relatively calm: we dug up some old contacts in various countries and reached out to our partners and friends who would be interested in submitting to the index. We received positive replies from most of them and I felt calm and confident, but I also had an instinct that only comes from experience of crowdsourcing contributions, so obviously I had a plan B. We asked for help from Arianit Dobroshi, a longstanding friend of Open Data Kosovo who is excited about mapping, openness, and general digital goodness. His task was to help us with the submissions, fill in on whatever country-specific problems might arise, and make sure tasks were completed.

Time was passing and pressure was rising, and there were very few submissions on the index. It was the end of the year, so I started to receive staff emails about planned vacations. This triggered an emergency alert in me: I panicked, and did what a modern woman does when she panics: I took a break and procrastinated even further for an hour or two. Then I pulled myself together and started contacting our friends from the region.

First on the list was Zoran Luša, Senior IT Adviser, Ministry of Public Administration of the Republic of Croatia. Zoran was immediately up for the task and invited his colleague Anamarija Musa to join in the efforts. Croatia had never been measured before, so they needed to do it from scratch. Not an easy task, so we asked for some extra help just in case. We contacted Miroslav Schlossberg from CodeForCroatia, who promptly informed us that they were supposed to do a sprint to evaluate local cities, so they included contributions to GODI 2016 in there. The mix was perfect: these people are serious in their digital contributions and the kind of people you want to work with. Croatia was covered.

Parallel to the Global Open Data Index 2016, I was managing an EU-funded project that did thorough index research on the openness of public institutions in the Western Balkan countries. This project is implemented with a regional network of organizations called ACTIONSEE. So I reached out to our friends from this network one by one.
  • In Serbia, we contacted our great friends from the local organization CRTA. We work with them on many exciting projects and they are always thrilled to be part of initiatives that combine transparency and technology. Pavle Dimitrij was quick to jump on board and promised timely and accurate submissions for Serbia. Slobodan Marković reached out to us and was interested in participating, so we had two parties involved and a team at the office to make sure it went smoothly: Serbia was covered.
  • When you think internet and government in Macedonia, you think of the Metamorphosis foundation. They are the leaders in their field, so of course we reached out to them. Tamara Resavska and Goran Rizaov rose to the challenge: Macedonia was covered.
  • Next, we contacted our friends in Albania, the organisation MJAFT. We discussed a couple of common national problems in sweet Albanian and agreed that this index submission is important. Ms. Xheni Lame promised to submit, and so she did.
  • Lastly, the Montenegro submission was agreed upon with our friends from CDT, where Milena Gvozdenovic memorably said “I find this Index very interesting and valuable. Therefore, we’ll complete the survey within the deadline.”
The remaining countries were mostly filled out by the team at Open Data Kosovo: that’s how the index submission was completed, and how the community was wrangled.
The results are out today and I can’t help but feel sad about the low score of Kosovo, ranked #56 out of 94 countries with a score of 29%. Currently, we are living with very bad environmental pollution, and having open data related to the environment would surely be a good step towards advocating for improvement. Furthermore, Kosovo does have some budgetary information, but it is presented in low quality and not in an open data format, which further decreased our score. In fact, all the Balkan countries seem to line up together at the bottom of the list, sharing similar openness problems and challenges.

It’s been a great experience working with Open Knowledge International and acting as Community Wrangler. I learned a lot about the state of open data in the region, but I also established a network of like-minded individuals who care about having transparent countries, who are eager to see them rank higher, who thrill at seeing improvement and want to contribute towards it. I am looking forward to being part of it again next year!
