
New report: Governing by rankings – How the Global Open Data Index helps advance the open data agenda

Danny Lämmerhirt - November 29, 2017 in Featured, Global Open Data Index, godi, GODI16, open data survey, research

This blogpost was jointly written by Danny Lämmerhirt and Mária Žuffová (University of Strathclyde). We are pleased to announce our latest report Governing by rankings – How the Global Open Data Index helps advance the open data agenda. The Global Open Data Index (GODI) is one of the largest worldwide assessments of how well governments publish open data, coordinated by Open Knowledge International since 2013. Over the years we have observed how GODI is used to monitor open data publication. But to date, less was known about how GODI may translate into open data policies and publication. How does GODI mobilise support for open data? Which actors are mobilised? Which aspects of GODI are useful, and which are not? Our latest report provides insights into these questions.

Why does this research matter?

Global governance indices like GODI enjoy great popularity due to their capacity to count, calculate, and compare what is otherwise hardly comparable. A wealth of research – from science and technology studies to sociology of quantification and international policy – shows that the effects of governance indicators are complex (our report provides an extensive reading list). Different audiences can take up indices to different (unintended) ends. It is therefore paramount to trace the effects of governance indicators to inform their future design. The report argues that there are multiple ways of looking for ‘impacts’ depending on different audiences, and how they put GODI into practice. Does a comparative open data ranking like GODI help mobilise high-level policy commitments? Does it incentivise individual government agencies to adjust and improve the publication of open data? Does it open up spaces for discussion and deliberation between government and civil society? This thinking builds on an earlier report by Open Knowledge International arguing that indicators have different audiences, with different lived experiences, needs, and agendas. While any form of measurement needs to align with these needs to become actionable (which affects how the impact of indicators will take shape), it also needs to retain comparability.

Our findings

We used Argentina, the United Kingdom and Ukraine as case studies to represent different degrees of open data publication, economic development and political set-up. Our report, drawing on a series of twelve interviews and document analysis, suggests that GODI drives change primarily from within government. We assume this finding is partly due to our limited sample size. While key actors in the government are easy to identify, as open data publication is often one of their job responsibilities, further research is needed to identify more civil society actors and how they engage with GODI. Below we describe nine ways in which GODI influences open data policy and publication.
  1. Getting international visibility and achieving progress in country rankings, or a generally high ranking, may incentivise and maintain high-level political support for open data, despite the non-comparability of results across years.
  2. In the absence of open data legislation, GODI has been used by the Argentinian government as a soft policy tool to pressure other government agencies to publish data.
  3. Government agencies tasked with implementing open data used GODI to reward and point out progress made by other agencies, but also to flag blockages to high-level politicians.
  4. GODI sets standards for which datasets to publish and sets a baseline for improvement. Outcomes are debatable in categories where the central government does not have easy political levers to publish data.
  5. GODI may be confounded with broader commitments to open government and used as an argument to reduce investment in other aspects of the open government agenda. In the past, some high-level politicians presented a high ranking in GODI as evidence of government transparency and as a reason why other ways of providing government information had become obsolete.
  6. This effect may be exacerbated by superficial media coverage that reports on the ranking without engaging with broader information and transparency policies. An analysis of Google News results suggests that journalists tend to reproduce (mostly politicians’) misconceptions and confound a good ranking in GODI with a high degree of government transparency and openness.
  7. Our findings suggest that individuals and organisations working around transparency and anti-corruption make little use of GODI due to a lack of detail and a misalignment with their specialised work tasks. For instance, Transparency International Ukraine uses the Transparent Public Procurement Rating to evaluate the legal framework, aside from the publication of open data.
  8. On the other hand, academics show interest in using GODI to develop new governance indicators. They also often use country scores as a proxy for measuring open data availability.
  9. GODI has potential for use in data journalism. Data journalism trainers may use it as a source of government data during their trainings.

What we learned and the road ahead

Our research suggests that governments in all analysed countries pay attention to GODI. With a few exceptions, they use it mostly to support open data publication and pave the way for new open data policies. While this is a promising finding, it has important implications for GODI and its design. If GODI sets standards in open data publication, as some interviewees from the government suggest, it needs to make sure to represent different data demands in the assessment and to encourage the implementation of sound policies. The challenge is to support policy development, which is often a lengthy process, as opposed to short-lived rank-seeking. Some interviewees suggested valuable avenues for GODI’s design. For instance, assessing progress in open data publication continuously, rather than once a year over a limited timespan, would require a long-term commitment to open data publication and offer better opportunities for civic engagement, as it would prevent governments from updating datasets only once a year before GODI’s deadline. Another route forward is discussed in another recent research report by OKI, which highlights the potential to adjust an open data index to align it more closely with the specific needs of topical expert organisations. Beyond engaging via GODI, civil society and academia might also participate in the development of new data monitoring instruments, such as the Open Data Survey, that are relevant for their mission.

How do open data measurements help water advocates to advance their mission?

Danny Lämmerhirt - November 23, 2017 in Global Open Data Index, godi, GODI16, open data survey, WASH, water quality

This blogpost was jointly written by Danny Lämmerhirt and Nisha Thompson (DataMeet). Since its creation, the Global Open Data Index (GODI) has had the open data community at its heart. By teaming up with expert civil society organisations we define key datasets that should be opened by government to align with civil society’s priorities. We assumed that GODI also teaches our community to become more literate about government institutions, regulatory systems and management procedures that create data in the first place – making GODI an engagement tool with government.

Tracing the publics of water data

Over the past few months we have reevaluated these assumptions. How do different members of civil society perceive the data assessed by GODI? Is the data usable to advance their mission? How can GODI be improved to accommodate and reflect the needs of civil society? How would we go about developing user-centric open data measurements, and would it be worth running more local and contextual assessments? As part of this user research, OKI and DataMeet (a community of data science and open data enthusiasts in India) teamed up to investigate the needs of civic organisations in the water, sanitation and hygiene (WASH) sector. GODI assesses whether governments release information on water quality, that is pollution levels, per water source. In detail this means that we check whether water data is available potentially at a household level or at each untreated public water source such as a lake or river. The research was conducted by DataMeet and supervised by OKI, and included interviews and workshops with fifteen different organisations. In this blogpost we share insights on how law firms, NGOs, academic institutions, funding and research organisations perceive the usefulness of GODI for their work. Our research focussed on the South Asian countries India, Pakistan, Nepal, and Bangladesh. All four countries face similar issues with ensuring safe water for their populations because of an over-reliance on groundwater, geogenic pollutants like arsenic, and high levels of pollutants from industry, urbanisation, farming, and poor sanitation.

According to the latest GODI results, openness of water quality data remains low worldwide.

What kinds of water data matter to organisations in the water sector?

Whilst all interviewed organisations have a stake in access to clean water for citizens, they have very different motivations to use water quality data. Governmental water quality data is needed to:
  1. Monitor government activities and highlight general issues with water management (for advocacy groups).
  2. Create a baseline to compare against civil society data (for organisations implementing water management systems).
  3. Detect geographic target areas of under-provision as well as specific water management problems to guide investment choices (for funding agencies and decision-makers).
Each use case requires data of a different quality. Some advocacy interviewees told us that government data, despite its potentially poor reliability, is enough to make the case that water quality is severely affected across their country. In contrast, researchers need data that is provided continuously and at short updating cycles. Such data may not be provided by government. Researchers see government data as support for their background research, but not as a primary source of information. Funders and other decision-makers use water quality data largely for monitoring and evaluation – mostly to make sure their money is being used and is impactful. They will sometimes use their own water quality data to make the point that government data is not adequate. Funders push for data collection at a project level, not continuous monitoring, which can lead to gaps in understanding.

GODI’s definition of water quality data is output-oriented and of general usefulness. It makes it possible to answer whether the water that people can access is clean or not. Yet, organisations on the ground need other data – some of which is process-oriented – to understand how water management services are regulated and governed or which laboratory is tasked with collecting data.

A major issue for meaningful engagement with water-related data is the complexity of water management systems. In the context of South Asia, managing, tracking, and safeguarding water resources for use today and in the future is complex. Water management systems, from domestic to industrial to agricultural ones, are diverse and hard to examine and hold accountable. Where water is coming from, how much of it is being used and for what, and how waste is then being disposed of are all crucial questions for these systems. Yet there is very little data available to address all these questions.

How do organisations in the WASH sector perceive the GODI interface?

GODI has an obvious drawback for the interviewed organisations: transparency is not a goal for organisations working on the ground and does not in itself provoke an increase in access to safe water or environmental conservation. GODI measures the publication of water quality data, but is not seen to stimulate improved outcomes. It also does not interact with the corresponding government agency. One part of GODI’s theory of change is that civil society becomes literate about government institutions and can engage with government via the publication of government data. Our interviews suggest that our theory of change needs to be reconsidered or new survey tools need to be developed that can enhance engagement between civil society and government. Below we share some ideas for future scenarios.

Our learnings and the road ahead

Adding questions to GODI

Interviews show that GODI’s current definition of water quality data does not always align with the needs of organisations on the ground. If GODI wants to be useful to organisations in the WASH sector, new questions can be added to the survey and be used as a jumping-off point for outreach to groups. Some examples include:
  1. Add a question regarding metadata and methodology documentation to capture the quality and provenance of water data, as well as where we found the data and how we selected it.
  2. Add a question regarding who collected the data: government or a partner organisation. This allows community members to trace the data producers and engage with them.
  3. Assess transparency of water reports. Reports should be considered since they are an important source of information for civil society.

Customising the Open Data Survey for regional and local assessments

Many interviewees showed an interest in assessing water quality data at the regional and hyperlocal level. DataMeet is planning to customise the Open Data Survey and to team up with local WASH organisations to develop and maintain a prototype for a regional assessment of water quality. India will be our test case since local data is available for the whole country, to varying degrees across states. This may also include assessing the quality of data and access to metadata. The highest level of transparency would mean having water data from each individual lab where the samples are sent. Another use case of the Open Data Survey would be to measure the transparency of water laboratories. Bringing more transparency and accountability to labs would be most valuable for groups on the ground that send samples to labs across the country.

Map of high (> 30 mg/l) fluoride values from 2013–14. From: The Gonda Water Data story

Storytelling through data

Whilst some interviewees saw little use in governmental water quality data, its usefulness can be greatly enhanced when combined with other information. As discussed earlier, governmental water data gives snapshots and may provide baseline values that serve NGOs as a rough orientation for their work. Data visualisations could present river and water basin quality and tell stories about the ecological and health effects. Behaviour change is a big issue when adapting to sanitation and hygiene interventions. Water quality and health data can be combined to educate people. If you got sick, have you checked your water? Do you use a public toilet? Are you washing your hands? This type of narration does not require granular, accurate data.

Comparing water quality standards

Different countries and organisations have different standards for what counts as high water pollution levels. Another project could assess how the needs of South Asian countries are being served by comparing pollution levels against different standards. For instance, fluorosis is an issue in certain parts of India: not just because of high fluoride levels but also because of poor nutrition in those areas. Should fluoride-affected areas have lower permissible amounts in poorer countries? These questions could be used to make water quality data actionable for advocacy groups.
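To illustrate what such a comparison could look like in practice, here is a minimal sketch in Python. It is not part of the research described here: the standard names, the permissible limits and the sample readings are all hypothetical placeholders, chosen only to show how one measurement can pass one standard and fail another.

```python
from typing import Dict, List

# Hypothetical permissible fluoride limits in mg/l (illustrative placeholders,
# not the values of any real national or international standard).
STANDARDS_MG_PER_L: Dict[str, float] = {
    "standard_a": 1.0,
    "standard_b": 1.5,
}


def exceeded_standards(value_mg_per_l: float, standards: Dict[str, float]) -> List[str]:
    """Return the names of all standards whose limit the measured value exceeds."""
    return sorted(
        name for name, limit in standards.items() if value_mg_per_l > limit
    )


# Hypothetical sample readings in mg/l (not real measurements).
samples: Dict[str, float] = {
    "village_1": 0.8,
    "village_2": 1.2,
    "village_3": 2.4,
}

# For each site, list which standards would flag the reading as too high.
report: Dict[str, List[str]] = {
    site: exceeded_standards(value, STANDARDS_MG_PER_L)
    for site, value in samples.items()
}

for site, flagged in report.items():
    print(site, flagged if flagged else "within all listed limits")
```

In this sketch the second site would be flagged under the stricter placeholder limit but not under the looser one, which is exactly the kind of discrepancy an advocacy group might want to surface. Real limits would have to be taken from the published standards being compared.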

The future of the Global Open Data Index: assessing the possibilities

Open Knowledge International - November 1, 2017 in Global Open Data Index, godi, GODI16, Open Government Data, open-government

In the last couple of months we have received questions regarding the status of the new Global Open Data Index (GODI) from a few members of our Network. This blogpost is to update everyone on the status of GODI and what comes next. But first, some context: GODI is one of the biggest assessments of the state of open government data globally, alongside the Web Foundation’s Open Data Barometer. We notice persistent obstacles for open data year after year. High-income countries regularly secure top rankings, yet overall there is little to no development in many countries. As our latest State Of Open Government Data in 2017 report shows, data is often not made available publicly at all. Where it is, we see many issues around findability, quality, processability, and licensing. Individual countries are notable exceptions to the rule. The Open Data Barometer made similar observations in its latest report, mentioning a slow uptake of policy, as well as persistent data quality issues in countries that provide open data. So there is still a lot of work to be done. To resolve issues like engagement with our community, we started to explore alternative paths for GODI. This includes a shift in focus from a mere measurement tool to a stronger conversational device between our user groups throughout the process. We understand that we need to speak to new audiences and focus more on measurement as a tool in real-world applications. We want to understand the use cases of the Open Data Survey (the tool that powers GODI and the Open Data Census) in different contexts and with different goals. We have seen only a few of the possible uses of the tool in the open data sphere and we want to see even more. In order to learn more about how GODI is taken up by different user groups, we are also currently exploring GODI’s effects on open data policy and publication.
We wish to understand more systematically how individual elements of the GODI interface (such as country ranking, dataset results, discussion forum entries) help mobilise support for open data among different user groups. Our goal is to understand how to improve our survey design and workflow so that they more directly support action around open data policy and publication. In addition, we are developing a new vision for the Open Data Index to measure open data either at a regional and city level or by topical area. We will elaborate on this vision in a follow-up blogpost soon. Taking this all into account, we have decided to focus on working on the aforementioned use cases and a regional Index during 2018. In the meantime, we will still work with our community to define a vision that will make GODI a sustainable measurement tool: we understand that tracking the changes in government data publication is crucial for activists and for governments themselves. We know that progress around open data is slower than we would like it to be, which is why we need to ensure that discussions around open data do not end. Please do not hesitate to submit new discussions around country entries on our forum or reach out to us if you have any ideas on how to take GODI forward and improve it. If you’re running an Open Data Census, we’ll continue giving you support in the measurement you’re currently working on, whether it’s local or regional, or whether you have a new idea for a Census you’d like to try. If you want to run your own Census, you can request it here, or send an email to index@okfn.org to see how we could collaborate further.

Research call: Mapping the impacts of the Global Open Data Index

Open Knowledge International - September 6, 2017 in Global Open Data Index

The Global Open Data Index (GODI) is a worldwide assessment of open data publication in more than 90 countries. It provides evidence of how well governments perform in open data publication. This call invites interested researchers and organisations to systematically study the effects of the Global Open Data Index on open data publication and the open data ecosystem. The study will identify different actors engaged around GODI, and how the information provided by GODI helped advance open data policy and publication. It will do so by investigating a sample of three countries with different degrees of open data adoption. The work will be conducted in close collaboration with Open Knowledge International’s (OKI) research department, who will provide guidance, review and assistance throughout the project. We invite interested parties to send their costed proposal to research@okfn.org. In order to be eligible, the proposal must include the applicant’s research background, a short description of why they are interested in the topic and how they want to research it (300 words maximum), a track record demonstrating knowledge of the topic, as well as a written research sample around open data or related fields. Finally, the proposal must also specify how much time will be committed to the work and at what cost (in GBP or USD). Due to the nature of the funding supporting this work, we unfortunately cannot accept proposals from US-based people or organisations. Please make sure the submission is made before the proposal deadline of Wed 13 Sept, 21:00 UTC.

Background

The Global Open Data Index (GODI) is a worldwide assessment of open data publication in more than 90 countries. It provides evidence of how well governments perform in open data publication. This includes mapping accessibility and access controls, findability of data, key data characteristics, as well as open licensing and machine-readability. At the same time GODI provides a venue for open data advocates and civil servants to discuss the production of open data. Evidence shows that governance indicators drive change if they embrace dialogue and mutual ownership between those who are assessed and those who assess. This year we wanted to use the launch of GODI to spark dialogue and provide a venue for the ensuing discussions. Through this dialogue, governments learn about key datasets and data quality issues, while also receiving targeted feedback to help them improve. Furthermore, every year many interactions happen outside of the GODI process, without involving GODI staff or public discussions. Instead, results are discussed within public institutions, or among civic actors and public institutions. Some scarce evidence of GODI’s outcomes is available, yet a systematic understanding of the diverse types of effects is missing to date.

Scope of research

This research is intended to get a systematic understanding of the effects of the Global Open Data Index on open data publication and the open data ecosystem. It addresses three research questions:
  1. In what ways does the Global Open Data Index process mobilize support for open data in countries with different degrees of open data policy and publication? How does this support manifest itself?
  2. How does the Global Open Data Index influence open data publication in governments both in terms of quantity and quality of data?
  3. How do different elements of the Global Open Data Index help governments and civil society actors to drive progress around questions 1 and 2?
GODI’s effects can tentatively be grouped into high-level policy and strategy development as well as strategy implementation and ongoing publication. This research will assess how different actors such as civil servants, high-level government officials, open data advocates and communities engage with different elements of GODI and how this helps advance open data policy and publication. The research should also, whenever applicable, provide a critical account of GODI’s adverse effects. This can include ‘ceiling effects’, tunnel vision and reactivity, or other effects. The research will assess these effects in three countries. These may include Argentina, Colombia, Ukraine, South Africa, Thailand, or others. It is possible to propose alternative countries if the researcher has strong experience in those or if it would help gather data for the research. Proposals should specify which three countries would be assessed. If alternative countries are proposed, they should meet the following criteria:
  1. One country without a national open data policy, one country with a recent open data policy (in effect between 3 months and 2 years), as well as countries with established open data policies (older than 2 years)
  2. A mix of countries with different levels of endorsement for GODI, including countries that have publicly announced an intention to improve their ranking (high importance) and countries where no public claims for open data improvement are documented
  3. Presence of country in past two GODI editions
  4. May include members of the Open Government Partnership and Open Data Charter adopters, as well as non-members.

Deliverables

The work will provide a written report of between 5000 and 7000 words in length addressing each of the research questions. The report must include a clearly written methodology section and country sampling approach. The desired format is a narrative report in English. A qualitative, critical assessment of GODI’s effects on open data policy and publication is expected. It needs to describe the actors using GODI, how they interacted with different aspects of GODI, and how this helped to drive change around the first two research questions outlined above. Furthermore, the following deliverables are expected:
  • Interviews with at least four interviewees per country
  • A semi-structured interview guide
  • Draft report by 15 October, structured around country portraits for three sample countries.
  • Weekly catch-ups with the Research team at OKI
  • Final report by 1 November

Methods and data sources

The researcher can draw from several sources to start this research, including OKI’s country contacts, Global Open Data Index scores, etc. Suggested methodology approaches include interviews with government officials and GODI contributors, as well as document analysis. Alternative research approaches and data sources shall be discussed with OKI’s research team. The research team will provide assistance in sampling interviewees in the initial phase of the research.

Activities

It is expected that this work is conducted in close contact with OKI’s research department. We will arrange a kick-off meeting to discuss your approach and have weekly calls to discuss activity and progress on the work. Early drafts will be shared with the OKI team to provide comments and discuss them with you. In addition we will have a final reflection call. Remote availability is expected (via email, Skype, Slack, or other channels). Overall research outline and goals will be discussed and agreed upon with the research lead of GODI who will help in sampling countries and will review project progress.

Decision criteria

We will base our selection of a research party on the following criteria:
  • Evidence of an understanding of open data assessments and indicators, and their influence on policy development and implementation.
  • Track record in the field of open data assessment and measurement.
  • Clarity and feasibility of the methodology you propose to follow.
Due to the nature of the funding supporting this work, we unfortunately cannot accept proposals from US-based people or organisations. Please make sure the submission is made before the proposal deadline of Wed 13 Sept, 21:00 UTC.

Using the Global Open Data Index to strengthen open data policies: Best practices from Mexico

Oscar Montiel - August 16, 2017 in Global Open Data Index, Open Data Index, Open Government Data, Open Knowledge

This is a blog post coauthored with Enrique Zapata, of the Mexican National Digital Strategy. As part of the last Global Open Data Index (GODI), Open Knowledge International (OKI) decided to have a dialogue phase, where we invited individuals, CSOs, and national governments to exchange different points of view, knowledge about the data and understand data publication in a more useful way. In this process, we had a number of valuable exchanges that we tried to capture in our report about the state of open government data in 2017, as well as the records in the forum. Additionally, we decided to highlight the dialogue process between the government and civil society in Mexico and their results towards improving data publication in the executive authority, as well as funding to expand this work to other authorities and improve the GODI process. Here is what we learned from the Mexican dialogue:

The submission process

During this stage, GODI tries to directly evaluate how easy the data is to find and its quality in general. To achieve this, civil society and government actors discussed how best to submit and agreed to submit together, based on the actual data availability. Besides creating an open space to discuss open data in Mexico and agreeing on a joint submission process, this exercise showed some room for improvement in the characteristics that GODI measured in 2016:
  • Open licenses: In Mexico and many other countries, the licenses are linked to datasets through open data platforms. This showed some discrepancies with the sources referenced by the reviewers since the data could be found in different sites where the license application was not clear.
  • Data findability: Most of the requested datasets assessed in GODI are the responsibility of the federal government and are available in datos.gob.mx. Nevertheless, the titles used to identify the datasets are based on technical regulation needs, which makes it difficult for data users to easily reach the data.
  • Differences of government levels and authorities: GODI assesses national governments but some of these datasets – such as land rights or national laws – are in the hands of other authorities or local governments. This meant that some datasets can’t be published by the federal government since it’s not in their jurisdiction and they can’t make publication of these data mandatory.
 

Open dialogue and the review process

During the review stage, taking the feedback into account, the Open Data Office of the National Digital Strategy worked on some of these issues. They convened a new session with civil society, including representatives from the Open Data Charter and OKI, in order to:
  • Agree on the state of the data in Mexico according to GODI characteristics;
  • Show the updates and publication of data requested by GODI;
  • Discuss paths to publish data that is not the responsibility of the federal government;
  • Converse about how they could continue to strengthen the Mexican Open Data Policy.

The results

As a result of this dialogue, we agreed on six actions that could be implemented internationally, beyond just the Mexican context, both by governments with centralised open data repositories and by those which don’t centralise their data, as well as a way to improve the GODI methodology:
  1. Open dialogue during the GODI process: Mexico was the first country to develop a structured dialogue to agree with open data experts from civil society about submissions to GODI. The Mexican government will seek to replicate this process in future evaluations and include new groups to promote open data use in the country. OKI will take this experience into account to improve the GODI processes in the future.
  2. Open licenses by default: The Mexican government is reviewing and modifying their regulations to implement the terms of Libre Uso MX for every website, platform and online tool of the national government. This is an example of good practice which OKI have highlighted in our ongoing Open Licensing research.
  3. “GODI” data group in CKAN: Most data repositories allow users to create thematic groups. In the case of GODI, the Mexican government created the “Global Open Data Index” group in datos.gob.mx. This will allow users to access these datasets based on their specific needs.
  4. Create a link between government built visualization tools and datos.gob.mx: The visualisations and reference tools tend to be the first point of contact for citizens. For this reason, the Mexican government will have new regulations in their upcoming Open Data Policy so that any new development includes visible links to the open data they use.
  5. Multiple access points for data: In August 2018, the Mexican government will launch a new section on datos.gob.mx to provide non-technical users easy access to valuable data. These data, called ‘Infraestructura de Datos Abiertos MX’, will be divided into five categories that are easy to explore and understand.
  6. Common language for data sets: Government naming conventions aren’t the easiest to understand and can make it difficult to access data. The Mexican government has agreed to rename datasets using more colloquial language, which can improve data findability and promote data use. Where this is not possible for some datasets, the government will go for an option similar to the one established in point 5.
We hope these changes will be useful for data users as well as other governments who are looking to improve their publication policies. Got any other ideas? Share them with us on Twitter by messaging @OKFN or send us an email to index@okfn.org  

The final Global Open Data Index is now live

Oscar Montiel - June 15, 2017 in Global Open Data Index

The updated Global Open Data Index has been published today, along with our report on the state of Open Data this year. The report includes a broad overview of the problems we found around data publication and how we can improve government open data. You can download the full report here. Also, after the Public Dialogue phase, we have updated the Index. You can see the updated edition here. We will also keep our forum open for discussions about open data quality and publication. You can see the conversation here.

Open data quality – the next shift in open data?

Χριστίνα Καρυπίδου - June 10, 2017 in Featured, Featured @en, Global Open Data Index, News, Open Data Handbook, open data

From Open Knowledge International. This post is part of the Global Open Data Blog. It is a call to recalibrate our attention to the many different elements that contribute to the ‘good quality’ of open data, the trade-offs between them, and how they support data usability (see here some vital work […]

What data do we need? The story of the Cadasta GODI fellowship

Mor Rubinstein - June 9, 2017 in Global Open Data Index

This blogpost was written by Lindsay Ferris and Mor Rubinstein.

There is a lot of data out there, but which data do users need to solve their issues? How can we, as an external body, know which data is vital so that we can measure it? Moreover, what should we do when data is published at so many levels – local, regional and federal – that it is hard to find? Every year we think about these questions in order to improve the Global Open Data Index (GODI) and make it more relevant to civil society. Having the relevant data characteristics is crucial for data use, since without specific data it is hard to analyse and learn.

After the publication of GODI 2015, Cadasta Foundation approached us to discuss the results of GODI in the land ownership category. Throughout this initial, lively discussion, we noticed that a systematic understanding of land data in general, and land ownership data in particular, was missing. An idea emerged: to bridge these gaps and build a systematic understanding of land ownership data for the 2016 GODI. And so the idea of the GODI fellowship came to life.

It was simple: Cadasta would host a fellow for a period of 6 months to explore the publication of data that is relevant to land ownership issues. The fellowship would be funded by Cadasta and the fellow would be an integral part of the team. OKI would give in-kind support in the form of guidance and research. The fellowship goals were:
  • Global policy analysis of open data in the field of land and resource rights;
  • Better definition for the land ownership dataset in the Global Open Data Index for 2016;
  • Mapping stakeholders and partners for the Global Open Data Index (for submissions);
  • Recommendations for a thematic Index;
  • A working paper or a series of blog posts about open data in land and resource ownership.
Throughout the fellowship, Lindsay conducted interviews with land experts, NGOs and government officials as well as on-going desk research on the land data publication practices across different contexts. She established 4 key outputs:
  1. Outlining the challenges of opening land ownership data. Blog post here.
  2. Mapping the different types of land data and their availability. Overview here.
  3. Assessing the privacy and security risks of opening certain types of land data. See our work here: cadasta.org/open-data/assessing-the-risks-of-opening-property-rights-data/
  4. Identifying user needs and creating user personas for open land data. User personas here.

Throughout the GODI process, our aim is to advocate for datasets that different stakeholders actually need and that make sense within the context in which they are published. For example, one of the main challenges in land ownership is that data is not always recorded or gathered at the federal level, and is instead collected in cities and regions. Another insight is that other government agencies are among the primary users of land ownership data. Having a grasp of this type of knowledge helped us better define the land ownership dataset for GODI. Ultimately, we developed a thoughtful definition based on these reflections and recommendations.

For us at OKI, having someone dedicated to this work inside an organisation that is an expert in a data category was immensely helpful. It makes the index categories more relevant for real-life use and helps us to measure the categories better. It also helps us to make sure that our assumptions and the foundation for the research are sound. For Cadasta, having a person dedicated to open data helped to create a knowledge base and resources that help them look at open data better. It was a win-win for both sides. In fact, the work Lindsay was doing was so valuable for Cadasta that her time there was extended, and she worked on a case study about open data and land in Sao Paulo, the Land Debate final report, and a paper on Open Data in Land Governance for the 2017 World Bank Land and Poverty Conference.

Going forward in the future of open data assessment, we believe that having this expert input in the design of the survey is crucial. Having only an open data lens can lead us to bias and wrong measurements. In our vision, we see the GODI tool as a community-owned assessment that can help all fields to promote, find and use the data that is relevant for them. Interested in thinking about the future of your field through open data?
Write to us on the forum – https://discuss.okfn.org/c/open-data-index/global-open-data-index-2016

The state of open licensing in 2017

Danny Lämmerhirt - June 8, 2017 in Global Open Data Index, Open Definition, Open Government Data, Open Knowledge

This blog post is part of our Global Open Data Index (GODI) blog series. Firstly, it discusses what open licensing is and why it is crucial for opening up data. Afterward, it outlines the most urgent issues around open licensing as identified in the latest edition of the Global Open Data Index and concludes with 10 recommendations on how open data advocates can unlock this data. The blog post was jointly written by Danny Lämmerhirt and Freyja van den Boom.

Open data must be reusable by anyone, and users need the right to access and use data freely, for any purpose. But legal conditions often block the effective use of data. Whoever wants to use existing data needs to know whether they have the right to do so. Researchers cannot use others’ data if they are unsure whether they would be violating intellectual property rights. For example, a developer wanting to locate multinational companies in different countries and visualize their paid taxes can’t do so unless they can find out how this business information is licensed. Having clear and open licenses attached to the data, which allow for use with the least restrictions possible, is necessary to make this happen.

Yet, open licenses still have a long way to go. The Global Open Data Index (GODI) 2016/17 shows that only a small portion of government data can be used without legal restrictions. This blog post discusses the status of ‘legal’ openness. We start by explaining what open licenses are and discussing GODI’s most recent findings around open licensing. We conclude by offering policy- and decisionmakers practical recommendations to improve open licensing.

What is an open license?

As the Open Definition states, data is legally open “if the legal conditions under which data is provided allow for free use”. For a license to be an open license, it must comply with the conditions set out under the Open Definition 2.1. These legal conditions include specific requirements on use, non-discrimination, redistribution, modification, and no charge.

Why do we need open licenses?

Data may fall under copyright protection. Copyright grants the author of an original work exclusive rights over that work. If you want to use a work under copyright protection, you need to have permission. There are exceptions and limitations to copyright under which permission is not needed: for example, when the data is in the ‘public domain’ (it is not, or is no longer, protected by copyright), or when your use is permitted under an exception.

Be aware that some countries also allow legal protection for databases, which limits what use can be made of the data and the database. It is important to check what the national requirements are, as they may differ.

Because some types of data (papers, images) can fall under the scope of copyright protection, we need data licensing. Data licensing helps solve practical problems, including not knowing whether the data is indeed copyright protected and how to get permission. Governments should therefore clearly state whether their data is in the public domain or, when the data falls under the scope of copyright protection, what the license is.
  • When data is public domain it is recommended to use the CC0 Public Domain license for clarity.
  • When the data falls under the scope of copyright it is recommended to use an existing Open license such as CC-BY to improve interoperability.
Using Creative Commons or Open Data Commons licenses is best practice. Many governments already apply one of the Creative Commons licenses (see this wiki). Some governments have chosen however to write their own licenses or formulate ‘terms of use’ which grant use rights similar to widely acknowledged open licenses. This is problematic from the perspective of the user because of interoperability. The proliferation of ever more open government licenses has been criticized for a long time. By creating their own versions, governments may add unnecessary information for users, cause incompatibility and significantly reduce reusability of data.  Creative Commons licenses are designed to reduce these problems by clearly communicating use rights and to make the sharing and reuse of works possible.  

The state of open licensing in 2017

Initial results from GODI 2016/17 show that only roughly 38 percent of the eligible datasets were openly licensed (this value may change slightly after the final publication on June 15). The remaining licenses include many use restrictions, such as limiting use to non-commercial purposes or restricting reuse and/or modification of the data.

Where data is openly licensed, best practices are hardly ever followed

In the majority of cases, our research team found that governments apply general terms of use instead of specific licenses for the data. Open government licenses and Creative Commons licenses were seldom used. As outlined above, this is problematic. Using customized licenses or terms of use may impose additional requirements such as:
  • Require specific attribution statements desired by the publisher
  • Add clauses that make it unclear how data can be reused and modified.
  • Adapt licenses to local legislation
Throughout our assessment, we encountered unnecessary or ambivalent clauses, which in turn may cause legal concerns, especially when people consider using data commercially. Sometimes we came across redundant clauses that cause more confusion than clarity. For example, clauses may forbid using data in an unlawful way (see also the discussion here).

Standard open licenses are intended to reduce legal ambiguity and enable everyone to understand use rights. Yet many licenses and terms contain unclear clauses or do not make obvious which data they refer to. This can, for instance, mean that governments restrict the use of substantial parts of a database (and only allow the use of insignificant parts of it). We recommend that governments give clear examples of which use cases are acceptable and which ones are not.

Licenses do not make clear enough to what data they apply

Data should include a link to the license, but this is not commonly done. For instance, in Mexico, we found that procurement information available via Compranet, the procurement platform for the Federal Government, was openly licensed, but the website does not state this clearly. Mexico hosts the same procurement data on datos.gob.mx and applies an open license to this data. As a government official told us, the procurement data is therefore openly licensed, regardless of where it is hosted. But again, this is not clear to the user who may find this data on a different website. Therefore we recommend always accompanying the data with a link to the license. We also recommend attaching a license notice to the data, or including one ‘in’ the data, and keeping the links updated to avoid ‘link rot’.

The absence of links between data and legal terms makes an assessment of open licenses impossible

Users may need to consult legal texts and check whether the rights granted comply with the Open Definition. Problems arise if no clear explanation or translation is available of what specific licenses entail for the end user. One problem is that users need to translate the text, and when the text is not in a machine-readable format they cannot use translation services. Our experience shows that this was a significant source of error in our assessment. If open data experts struggle to assess public domain status, the problem is even worse for open data users. Assessing public domain status requires substantial knowledge of copyright, something the use of open licenses explicitly wants to avoid.

Copyright notices on websites can confuse users

In several cases, submitters and reviewers were unable to find any terms or conditions. In the absence of any other legal terms, submitters sometimes referred to copyright notices that they found in website footers. These copyright details, however, do not necessarily refer to the actual data. Often they are simply a standard copyright notice referring to the website.

Recommendations for data publishers

Based on our findings, we prepared 10 recommendations that policymakers and other government officials should take into account:
  1. Does the data and/or dataset fall under the scope of IP protection? Often government data does not fall under copyright protection and should not be presented as such. Governments should be aware and clear about the scope of intellectual property (IP) protection.
  2. Use standardized open licenses. Open licenses are easily understandable and should be the first choice. The Open Definition provides conformant licenses that are interoperable with one another.
  3. In some cases, governments might want to use a customized open government license. These should be as open as possible with the least restrictions necessary and compatible (see point 2). To guarantee a license is compatible, the best practice is to submit the license for approval under the Open Definition.
  4. Pinpoint exactly within the license what data it refers to, and provide a timestamp for when the data was provided.
  5. Clearly publish open licensing details next to the data. The license should be clearly attached to the data and be both human- and machine-readable. It also helps to have a license notice ‘in’ the data.
  6. Maintain the links to licenses so that users can access license terms at all times.
  7. Highlight the license version and provide context on how the data can be used.
  8. Whenever possible, avoid restrictive clauses that are not included in standard licenses.
  9. Re-evaluate the web design and avoid confusing and contradictory copyright notices in website footers, as well as disclaimers and terms of use.
  10. When government data is in the public domain by default, make clear to end users what that means for them.
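
Recommendations 5 and 6 ask for a license notice that is machine-readable and travels with the data. As a rough illustration only, the Python sketch below assumes a metadata file loosely modelled on the `licenses` property of the Frictionless Data “Data Package” format. The dataset name, the file name and the `license_problems` helper are hypothetical and are not part of GODI or of any OKI tool.

```python
import json

# Hypothetical metadata for a dataset, loosely modelled on a Data Package
# descriptor. The dataset and file names are made up for this example.
descriptor = {
    "name": "procurement-contracts",
    "licenses": [
        {
            "name": "CC-BY-4.0",
            "title": "Creative Commons Attribution 4.0",
            "path": "https://creativecommons.org/licenses/by/4.0/",
        }
    ],
    "resources": [{"name": "contracts", "path": "contracts.csv"}],
}


def license_problems(descriptor):
    """Return a list of human-readable problems with the license notice."""
    problems = []
    licenses = descriptor.get("licenses") or []
    if not licenses:
        problems.append("no license notice in the metadata")
    for i, lic in enumerate(licenses):
        # A notice is only useful if it names the license or links to it.
        if not (lic.get("name") or lic.get("path")):
            problems.append(f"license {i} has neither a name nor a link")
        path = lic.get("path", "")
        if path and not path.startswith(("http://", "https://")):
            problems.append(f"license {i} link is not a web URL: {path}")
    return problems


# The notice can be published next to the data as a JSON file.
print(json.dumps(descriptor, indent=2))
print(license_problems(descriptor))
```

A check like this only confirms that a notice is present and well-formed. Whether the link still resolves (the ‘link rot’ problem) and whether the license conforms to the Open Definition would need separate checks.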
 

Open data quality – the next shift in open data?

Open Knowledge International - May 31, 2017 in Data Quality, Global Open Data Index, GODI16, Open Data

This blog post is part of our Global Open Data Index blog series. It is a call to recalibrate our attention to the many different elements contributing to the ‘good quality’ of open data, the trade-offs between them and how they support data usability (see here some vital work by the World Wide Web Consortium). Focusing on these elements could help support governments to publish data that can be easily used. The blog post was jointly written by Danny Lämmerhirt and Mor Rubinstein.

Some years ago, open data was heralded as a way to unlock information for the public that would otherwise remain closed. In the pre-digital age, information was locked away, and an array of mechanisms was necessary to bridge the knowledge gap between institutions and people. So when the open data movement demanded “Openness By Default”, many data publishers followed the call by releasing vast amounts of data in its existing form to bridge that gap.

To date, it seems that opening this data has not reduced but rather shifted and multiplied the barriers to the use of data, as Open Knowledge International’s research around the Global Open Data Index (GODI) 2016/17 shows. Together with data experts and a network of volunteers, our team searched, accessed, and verified more than 1400 government datasets around the world. We found that data is often stored in many different places on the web, sometimes split across documents, or hidden many pages deep on a website. Often data comes in various access modalities. It can be presented in various forms and file formats, sometimes using uncommon signs or codes that are in the worst case only understandable to their producer.

As the Open Data Handbook states, these emerging open data infrastructures resemble the myth of the ‘Tower of Babel’: more information is produced, but it is encoded in different languages and forms, preventing data publishers and their publics from communicating with one another. What makes data usable under these circumstances? How can we close the information chain loop? The short answer: by providing ‘good quality’ open data.

Understanding data quality – from quality to qualities

The open data community needs to shift focus from mass data publication towards an understanding of good data quality. Yet there is no shared definition of what constitutes ‘good’ data quality. Research shows that there are many different interpretations and ways of measuring data quality. They include data interpretability, data accuracy, timeliness of publication, reliability, trustworthiness, accessibility, discoverability, processability, and completeness. Since people use data for different purposes, certain data qualities matter more to one user group than to others. Some of these areas are covered by the Open Data Charter, but the Charter does not explicitly name them as ‘qualities’ which add up to high quality.

Current quality indicators are not complete – and miss the opportunity to highlight quality trade-offs

Existing indicators also assess data quality very differently, potentially framing our language and thinking about data quality in opposite ways. For example, some indicators focus on the content of data portals (number of published datasets) or on access to data, while a small fraction focus on datasets and their content, structure, understandability, or processability. Even GODI and the Open Data Barometer from the World Wide Web Foundation do not share a common definition of data quality.

Arguably, the diversity of existing quality indicators prevents a targeted and strategic approach to improving data quality.

At the moment GODI sets out the following indicators for measuring data quality:
  • Completeness of dataset content
  • Accessibility (access-controlled or public access?)
  • Findability of data
  • Processability (machine-readability and amount of effort needed to use data)
  • Timely publication
This leaves out other qualities. We could ask if data is actually understandable by people. For example, is there a description of what each part of the data content means (metadata)?

Improving quality by improving the way data is produced

Many data quality metrics are (rightfully so) user-focussed. However, it is critical that governments, as data producers, better understand, monitor and improve the inherent quality of the data they produce. Measuring data quality can incentivise governments to design data for impact by raising awareness of the quality issues that would otherwise make data files practically impossible to use. At Open Knowledge International, we target data producers and the quality issues of data files mostly via the Frictionless Data project. Notable projects include the Data Quality Spec, which defines some essential quality aspects for tabular data files. GoodTables provides structural and schema validation of government data, and the Data Quality Dashboard enables open data stakeholders to see data quality metrics for entire data collections “at a glance”, including the number of errors in a data file. These tools help to develop a more systematic assessment of the technical processability and usability of data.
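
To give a flavour of what structural validation of a tabular file can mean in practice, here is a minimal Python sketch. It is not GoodTables and does not use its API. The `structural_report` function, the three checks it runs (blank or duplicate column headers, blank rows, rows with the wrong number of cells) and the sample data are all assumptions made for illustration.

```python
import csv
import io


def structural_report(csv_text):
    """Count data rows and list basic structural errors in CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return {"rows": 0, "errors": ["file is empty"]}

    header, body = rows[0], rows[1:]
    errors = []

    # Header checks: every column needs a unique, non-blank name.
    seen = set()
    for col, name in enumerate(header, start=1):
        key = name.strip().lower()
        if not key:
            errors.append(f"column {col}: blank header")
        elif key in seen:
            errors.append(f"column {col}: duplicate header '{name}'")
        seen.add(key)

    # Row checks: the header counts as row 1, so data starts at row 2.
    for number, row in enumerate(body, start=2):
        if not any(cell.strip() for cell in row):
            errors.append(f"row {number}: blank row")
        elif len(row) != len(header):
            errors.append(
                f"row {number}: expected {len(header)} cells, found {len(row)}"
            )

    return {"rows": len(body), "errors": errors}


# Made-up sample with a duplicate header, a blank row and a short row.
sample = "id,name,name\n1,Alice,a\n\n2,Bob\n"
report = structural_report(sample)
for error in report["errors"]:
    print(error)
```

Counting errors per file in this way is one simple route to the kind of “at a glance” metric a dashboard could aggregate across a whole data collection. Schema validation (for example, checking that a date column really contains dates) would be a further step that this sketch does not attempt.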

A call for joint work towards better data quality

We are aware that good data quality requires joint solutions. Therefore, we would love to hear your feedback. What are your experiences with open data quality? Which quality issues hinder you from using open data? How do you define these data qualities? What could the GODI team improve? Please let us know by joining the conversation about GODI on our forum.