
How open is government data in Africa?

- March 5, 2019 in africa, Global Open Data Index, Open Data Index, research

Findings from the Africa Open Data Index and Africa Data Revolution Report

Today, we are pleased to announce the results of Open Knowledge International’s Africa Open Data Index. This regional version of our Global Open Data Index collected baseline data on open data publication in 30 African countries to provide input for the second Africa Data Revolution Report. Based on an adaptation of the methodology for the Global Open Data Index, this project mapped out to what extent African public institutions make key datasets available as open data online. Beyond scrutinising the availability, degree of digitisation, and openness of national datasets, we considered the broader landscape of actors involved in the production of government data, such as private actors. Key datasets and methodology were developed in collaboration with the United Nations Development Programme (UNDP), the International Development Research Centre (IDRC), as well as the World Wide Web Foundation. We focused on national key datasets such as:
  1. Data describing processes of government bodies at the highest administrative level (e.g. federal government budgets);
  2. Data produced by sub-national actors but collected by a national agency (e.g. certain statistical information).
We also captured whether data was available at sub-national levels or from private companies, but did not assign scores to these datasets. You can find the detailed methodology here. Ultimately, the key datasets we considered are:
  • Administrative records: budgets, procurement information, company registers
  • Legislative data: national law
  • Statistical data: core economic statistics, health, gender, educational and environmental statistics
  • Infrastructural data
  • Agricultural data
  • Election results
  • Geographic information and land ownership

Figure 1: Screenshot of the Africa Open Data Index Interface

Understanding who produces government data

Many government agencies produce at least parts of the key datasets we assessed. Some key datasets, such as environmental data, are rarely produced. For instance, air pollution and water quality data are sometimes produced in individual administrative zones, but not at the national level. Some initiatives, such as REDD+ or the Congo Basin Forest Atlases, assist in producing data on deforestation, with the support of the World Resources Institute (WRI) and USAID. Multiple search strategies may be required to identify agencies producing and publishing official records. Some agencies develop public databases, search interfaces and other dedicated infrastructure to facilitate search and retrieval. Statistical yearbooks are another useful access point to several information groups, including economic and social statistics as well as figures on environmental degradation or market figures. In several cases it was necessary to consult third-party literature, such as the World Bank’s Land Governance Assessment Framework (LGAF) and reports issued by the Extractive Industries Transparency Initiative (EITI), to identify which public institutions hold the remit to collect data. Sometimes, private companies provide data infrastructure to aggregate and host data centrally. For instance, the company Trimble develops data portals for the extractives sector in 15 countries in Africa. These data portals are used to publish data on mining concessions, including geographic boundaries, the size of territory, concession types, licensees, or contract start and duration.

Procuring data infrastructure from private organisations

While Trimble’s portals are a useful central access point, their terms of use do not comply with open licensing requirements. This points to a larger concern regarding appropriate licensing schemes and how they can be integrated into the procurement process. We propose that multi-stakeholder initiatives such as the Extractive Industries Transparency Initiative (EITI) and national multi-stakeholder groups define appropriate terms of use when procuring services, recommending the use of standard open licences, in order to ensure an appropriate degree of openness, prevent lock-in and safeguard public access. An alternative information aggregator using open licence terms is the African Legal Information Institute (AfricanLII), which gathers national legal codes from several African countries. It is a programme of the Democratic Governance and Rights Unit at the Department of Public Law at the University of Cape Town.

Sometimes stark differences in what data gets published

To test what data gets published online, we defined crucial data points to be included in every key data category (see here). If at least one of these data points was found online, we considered the data category for assessment. This means that the completeness of the datasets we assessed can differ across countries. Figure 2 shows how often each data point is provided across our sample of 30 countries.

Figure 2: Percentages of data points found across key datasets. Percentages are relative to the total number of countries (100% = data point available in 30 countries). Source: Africa Data Revolution Report, pp. 19-20.
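The inclusion rule and the percentages behind Figure 2 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the code used for the Index: the data point names, the country labels and the helper functions `is_assessed` and `data_point_coverage` are all hypothetical.

```python
# Hypothetical data points for one key data category (names are illustrative only).
REQUIRED_POINTS = {"budget_per_department", "budget_per_programme", "planned_expenditure"}


def is_assessed(found_points, required_points=REQUIRED_POINTS):
    """A category is considered for assessment if at least one required data point was found online."""
    return bool(set(found_points) & set(required_points))


def data_point_coverage(findings, sample_size):
    """Percentage of countries providing each data point (100% = found in every country of the sample)."""
    counts = {}
    for points in findings.values():
        for point in set(points):
            counts[point] = counts.get(point, 0) + 1
    return {point: 100 * n / sample_size for point, n in counts.items()}


# Made-up findings for four placeholder countries.
findings = {
    "Country A": {"budget_per_department", "planned_expenditure"},
    "Country B": {"budget_per_department"},
    "Country C": set(),
    "Country D": {"planned_expenditure"},
}

for country, points in findings.items():
    print(country, "assessed" if is_assessed(points) else "not assessed")
print(data_point_coverage(findings, sample_size=len(findings)))
```

Because one data point is enough for a category to be assessed, two countries can both be "assessed" while publishing very different amounts of information, which is why completeness differs across countries.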

Budget and procurement data most often contain the relevant data points we assessed. Several key statistical indicators are provided fairly commonly, too. Agricultural data, environmental data and land ownership data are least commonly provided. For a more thorough analysis we recommend reading the Africa Data Revolution Report, pages 16-22.

One third of the data is provided in a timely manner

To assess timely publication, our research considered whether governments publish data at a particular update frequency. Figure 3 shows a clear difference in timely data provision across different data types. The y-axis indicates the percentage of countries publishing updated information. A score of 100 would indicate that all 30 countries in the sample publish a data category in a timely fashion.

Figure 3: Data provision across the various datasets

We found significant differences across individual data categories and countries. Roughly three out of four countries update their budget data (80% of all countries), national laws (73% of all countries) and procurement information (70% of all countries) in a timely manner. Approximately half of all countries publish updated election records (50% of all countries), or keep their company registers up to date (47% of all countries). All other data categories are published in a timely manner by only a fraction of the assessed countries. For instance, the majority of countries do not provide updated statistical information. We strongly advise interpreting these findings as trends rather than as a representative picture of timely data publication. There are several reasons for this. In some data categories, we included considerably more, and more diverse, data points. For instance, the agricultural data category includes not only statistics on crop yields but also short-term weather forecasts. If one of these data types was not provided in a timely manner, the data category was considered not to be updated. Furthermore, if a country did not provide timestamps and metadata, we did not consider the data to be updated, as we were unable to prove the opposite.
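The timeliness rule described above is strict: a category fails if any one of its data types is out of date or undated. A minimal Python sketch of that rule follows. The field names `has_timestamp` and `updated_on_schedule`, the function names and the sample values are assumptions made for illustration, and treating an empty category as not timely is likewise an assumption.

```python
def category_is_timely(data_types):
    """A category counts as timely only if every data type in it is dated and updated on schedule."""
    if not data_types:
        # Assumption: a category with no data found is treated as not timely.
        return False
    return all(d["has_timestamp"] and d["updated_on_schedule"] for d in data_types)


def timely_share(countries):
    """Percentage of countries that publish a category in a timely manner."""
    timely = sum(category_is_timely(data_types) for data_types in countries.values())
    return 100 * timely / len(countries)


# Made-up agricultural data for three placeholder countries.
agriculture = {
    "Country A": [  # both data types dated and current: timely
        {"name": "crop yields", "has_timestamp": True, "updated_on_schedule": True},
        {"name": "weather forecasts", "has_timestamp": True, "updated_on_schedule": True},
    ],
    "Country B": [  # forecasts out of date: whole category not timely
        {"name": "crop yields", "has_timestamp": True, "updated_on_schedule": True},
        {"name": "weather forecasts", "has_timestamp": True, "updated_on_schedule": False},
    ],
    "Country C": [  # no timestamp: cannot be shown to be current, so not timely
        {"name": "crop yields", "has_timestamp": False, "updated_on_schedule": True},
    ],
}

print(timely_share(agriculture))
```

The sketch makes the caveat concrete: categories that bundle many diverse data types have more ways to fail, so low scores there partly reflect the breadth of the category.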

Open licensing and machine-readability

Only 6% of all data (28 out of 420 datasets assessed) is openly licensed in compliance with the criteria laid out by the Open Definition. Open licence terms are used by statistical offices in Botswana, Senegal, Rwanda, and Somalia, as well as open data portals in Côte d’Ivoire, Eritrea, Kenya and Mauritius. Usually, websites provide copyright notes but do not apply licence terms dedicated to the website’s data. In rare cases we found a Creative Commons Attribution (CC-BY) licence being used. More common are bespoke terms that are compliant with the Open Definition. 14.5% of all data (61 out of 420 datasets assessed) is provided in at least one machine-readable format. Most data, however, is provided in printed reports, digitised as PDFs, or embedded on websites in HTML. Importantly, some types of data, such as land records, may still be in the process of digitisation. If we found that governments hold paper-based records, we tested whether our researchers could request the data. If this was not the case, we did not consider the data for our assessment.
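The shares quoted above follow from simple division of the dataset counts. The short Python check below uses only the counts given in the text; the helper name `share` is an assumption. Note that 28 out of 420 works out to roughly 6.7%, so the 6% figure appears to be rounded down.

```python
def share(part, total):
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


TOTAL_DATASETS = 420    # datasets assessed
OPENLY_LICENSED = 28    # compliant with the Open Definition
MACHINE_READABLE = 61   # available in at least one machine-readable format

print(share(OPENLY_LICENSED, TOTAL_DATASETS))   # 6.7
print(share(MACHINE_READABLE, TOTAL_DATASETS))  # 14.5
```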

Recommendations

The following recommendations are excerpts from the Africa Data Revolution Report 2018. A comprehensive list of recommendations can be found in the report itself. On the basis of our findings we recommend that public institutions:
  • Communicate clearly on their agency websites what data they are collecting about different government activities.
  • Clarify which data has authoritative status in case multiple versions exist: Metadata must be available clarifying provenance and authoritative status of data. This is important in cases where multiple entities collect data, or whenever governments gather data with the help of international organisations, bilateral donors, foreign governments, or others.
  • Make data permanently accessible and findable: Data should be made available at a permanent internet location and in a stable data format for as long as possible. Avoid broken links and provide links to the data whenever you publish data elsewhere (for example via a statistical agency). Add metadata to ensure that data can be understood by citizens and found via search engines.
  • When procuring data, define a set of terms of use to ensure the appropriate degree of openness: Private vendors may want to license data under proprietary terms, which may limit data accessibility. Research found that many data-intensive projects in development contexts use haphazard, proprietary licence terms which may prevent the public from accessing data, increase the complexity of use terms, and raise the costs of data access.
  • Provide data in machine-readable formats: Ensure that data is processable. Raw data must be published in machine-readable formats that are user friendly.
  • Use standard open licences: Use CC0 for public domain dedication or standardized open licences, preferably CC BY 4.0. They can be reused by anyone, which helps ensure compatibility with other datasets. Clarify if data falls under the scope of copyright, or similar rights. If information is in the public domain, apply legally non-binding notices to your data. If you opt for a custom open licence, ensure compatibility with the Open Definition. It is strongly recommended to submit the licence for approval under the Open Definition.
  • Avoid confusion around licence terms: Attach the licence clearly to the information to which it applies. Clearly separate a website’s terms and conditions from the terms of open licences. Maintain stable links to licences so that users can access licence terms at all times.
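As an illustration of the metadata and licensing recommendations above, the following Python sketch checks whether a dataset record carries the information a reuser would need. It is a hypothetical example: the field names, the `missing_metadata` helper, the publisher and the example.org URL are all assumptions and do not follow any particular metadata standard.

```python
import json

# Hypothetical minimum set of fields, loosely derived from the recommendations above.
REQUIRED_FIELDS = ("title", "publisher", "licence", "last_updated", "source_url", "format")


def missing_metadata(record):
    """Return the required fields that are absent or empty in a dataset record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "title": "National budget 2018",                    # made-up dataset
    "publisher": "Ministry of Finance",                 # authoritative source of the data
    "licence": "CC-BY-4.0",                             # standard open licence attached to the data itself
    "last_updated": "2018-06-30",                       # timestamp so timeliness can be verified
    "source_url": "https://example.org/budget-2018.csv",  # placeholder for a permanent location
    "format": "CSV",                                    # machine-readable format
}

print(json.dumps(record, indent=2))
print(missing_metadata(record))
```

Keeping the licence and timestamp in the record itself, rather than only in a website's general terms, is one way to attach them clearly to the data to which they apply.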

More information

We have gathered all raw data in a summary spreadsheet. Browse the results and use the links we provide to reach a dataset of interest directly. If you are interested in specific country assessments, you can find our research diaries here. The Open Data Survey tool, which powers this project as well as our Global Open Data Index, is open for reuse. If you are interested in setting up a regional or national version, get in touch with us at index@okfn.org.

Acknowledgements

We would like to thank the experts at Local Development Research Institute (LDRI), the Communauté Afrique Francophone pour les Données Ouvertes (CAFDO) and the Access to Knowledge for Development Center (A2K4D) at the American University, Cairo for advising on the methodology and their support throughout the research process. Furthermore, we would like to thank our 30 country researchers, as well as our expert reviewers Codrina Maria Ilie, Jennifer Walker, and Oscar Montiel. Finally, we would like to thank our partners at the United Nations Development Programme, the International Development Research Centre and the Web Foundation, without whose support this project would not have been possible.

New report: Governing by rankings – How the Global Open Data Index helps advance the open data agenda

- November 29, 2017 in Featured, Global Open Data Index, godi, GODI16, open data survey, research

This blogpost was jointly written by Danny Lämmerhirt and Mária Žuffová (University of Strathclyde). We are pleased to announce our latest report Governing by rankings – How the Global Open Data Index helps advance the open data agenda. The Global Open Data Index (GODI) is one of the largest worldwide assessments of how well governments publish open data, coordinated by Open Knowledge International since 2013. Over the years we observed how GODI is used to monitor open data publication. But to date, little was known about how GODI may translate into open data policies and publication. How does GODI mobilise support for open data? Which actors are mobilised? Which aspects of GODI are useful, and which are not? Our latest report provides insights into these questions.

Why does this research matter?

Global governance indices like GODI enjoy great popularity due to their capacity to count, calculate, and compare what is otherwise hardly comparable. A wealth of research – from science and technology studies to sociology of quantification and international policy – shows that the effects of governance indicators are complex (our report provides an extensive reading list). Different audiences can take up indices to different (unintended) ends. It is therefore paramount to trace the effects of governance indicators to inform their future design. The report argues that there are multiple ways of looking for ‘impacts’ depending on different audiences, and how they put GODI into practice. Does a comparative open data ranking like GODI help mobilise high-level policy commitments? Does it incentivise individual government agencies to adjust and improve the publication of open data? Does it open up spaces for discussion and deliberation between government and civil society? This thinking builds on an earlier report by Open Knowledge International arguing that indicators have different audiences, with different lived experiences, needs, and agendas. While any form of measurement needs to align with these needs to become actionable (which affects how the impact of indicators will take shape), it also needs to retain comparability.

Our findings

We used Argentina, the United Kingdom and Ukraine as case studies to represent different degrees of open data publication, economic development and political set-up. Our report, drawing on a series of twelve interviews and document analysis, suggests that GODI drives change primarily from within government. We assume this finding is partly due to our limited sample size. While key actors in the government are easy to identify, as open data publication is often one of their job responsibilities, further research is needed to identify more civil society actors and how they engage with GODI. Below we describe nine ways in which GODI influences open data policy and publication.
  1. Getting international visibility and achieving progress in country rankings, or a generally high ranking, may incentivise and maintain high-level political support for open data, despite non-comparability of results across years.
  2. In the absence of open data legislation, GODI has been used by the Argentinian government as a soft policy tool to pressure other government agencies to publish data.
  3. Government agencies tasked with implementing open data used GODI to reward and point out progress made by other agencies, but also to flag blockages to high-level politicians.
  4. GODI sets standards for which datasets to publish and sets a baseline for improvement. Outcomes are debatable in categories where the central government does not have easy political levers to publish data.
  5. GODI may be confounded with broader commitments to open government and used as an argument to reduce investment in other aspects of the open government agenda. In the past, some high-level politicians presented a high ranking in GODI as evidence of government transparency, and as making other ways of providing government information obsolete.
  6. This effect may be exacerbated by superficial media coverage that reports on the ranking without engaging with broader information and transparency policies. An analysis of Google News results suggests that journalists tend to reproduce (mostly politicians’) misconceptions and confound a good ranking in GODI with a high degree of government transparency and openness.
  7. Our findings suggest that individuals and organisations working around transparency and anti-corruption make little use of GODI due to a lack of detail and a misalignment with their specialised work tasks. For instance, Transparency International Ukraine uses the Transparent Public Procurement Rating to evaluate the legal framework, aside from the publication of open data.
  8. On the other hand, academics show interest in using GODI to develop new governance indicators. They also often use country scores as a proxy for measuring open data availability.
  9. GODI has potential for use in data journalism. Data journalism trainers may use it as a source of government data during their trainings.

What we learned and the road ahead

Our research suggests that governments in all analysed countries pay attention to GODI. With a few exceptions, they use it mostly to support open data publication and pave the way for new open data policies. While this is a promising finding, it has important implications for GODI and its design. If GODI sets standards in open data publication, as some interviewees from the government suggest, it needs to make sure to represent different data demands in the assessment and to encourage the implementation of sound policies. The challenge is to support policy development, which is often a lengthy process, as opposed to short-lived rank-seeking. Some interviewees suggested valuable avenues for GODI’s design. For instance, assessing progress in open data publication continuously rather than once a year over a limited timespan would require a long-term commitment to open data publication and offer better opportunities for civic engagement, as it would prevent governments from updating datasets only once a year, just before GODI’s deadline. Another route forward is discussed in another recent piece of research by OKI, highlighting the potential to adjust an open data index to align it more closely with the specific needs of topical expert organisations. Beyond engaging via GODI, civil society and academia might also participate in the development of new data monitoring instruments, such as the Open Data Survey, that are relevant to their mission.

How do open data measurements help water advocates to advance their mission?

- November 23, 2017 in Global Open Data Index, godi, GODI16, open data survey, WASH, water quality

This blogpost was jointly written by Danny Lämmerhirt and Nisha Thompson (DataMeet). Since its creation, the open data community has been at the heart of the Global Open Data Index (GODI). By teaming up with expert civil society organisations we define key datasets that should be opened by government to align with civil society’s priorities. We assumed that GODI also teaches our community to become more literate about government institutions, regulatory systems and management procedures that create data in the first place – making GODI an engagement tool with government.

Tracing the publics of water data

Over the past few months we have reevaluated these assumptions. How do different members of civil society perceive the data assessed by GODI? Is the data usable to advance their mission? How can GODI be improved to accommodate and reflect the needs of civil society? How would we go about developing user-centric open data measurements, and would it be worth running more local and contextual assessments? As part of this user research, OKI and DataMeet (a community of data science and open data enthusiasts in India) teamed up to investigate the needs of civic organisations in the water, sanitation and hygiene (WASH) sector. GODI assesses whether governments release information on water quality, that is pollution levels, per water source. In detail this means that we check whether water data is available at potentially a household level or at each untreated public water source such as a lake or river. The research was conducted by DataMeet and supervised by OKI, and included interviews and workshops with fifteen different organisations. In this blogpost we share insights on how law firms, NGOs, academic institutions, funding and research organisations perceive the usefulness of GODI for their work. Our research focussed on the South Asian countries India, Pakistan, Nepal, and Bangladesh. All four countries face similar issues with ensuring safe water for their populations because of an over-reliance on groundwater, geogenic pollutants like arsenic, and high levels of pollutants from industry, urbanisation, farming, and poor sanitation.

According to the latest GODI results, openness of water quality data remains low worldwide.

What kinds of water data matter to organisations in the water sector?

Whilst all interviewed organisations have a stake in access to clean water for citizens, they have very different motivations to use water quality data. Governmental water quality data is needed to:
  1. Monitor government activities and highlight general issues with water management (for advocacy groups).
  2. Create a baseline to compare against civil society data (for organisations implementing water management systems).
  3. Detect geographic target areas of under-provision as well as specific water management problems to guide investment choices (for funding agencies and decision-makers).
Each use case requires data of a different quality. Some advocacy interviewees told us that government data, despite its potentially poor reliability, is enough to make the case that water quality is severely affected across their country. In contrast, researchers need data that is provided continuously and at short updating cycles. Such data may not be provided by government. Government data is seen as support for their background research, but not as a primary source of information. Funders and other decision-makers use water quality data largely for monitoring and evaluation – mostly to make sure their money is being used and is impactful. They will sometimes use their own water quality data to make the point that government data is not adequate. Funders push for data collection at a project level, not continuous monitoring, which can lead to gaps in understanding. GODI’s definition of water quality data is output-oriented and of general usefulness. It helps answer whether the water that people can access is clean or not. Yet, organisations on the ground need other data – some of which is process-oriented – to understand how water management services are regulated and governed or what laboratory is tasked with collecting data. A major issue for meaningful engagement with water-related data is the complexity of water management systems. In the context of South Asia, managing, tracking, and safeguarding water resources for use today and in the future is complex. Water management systems, from domestic to industrial to agricultural ones, are diverse and hard to examine and hold accountable. Where water is coming from, how much of it is being used and for what, and how waste is then being disposed of are all crucial questions for these systems. Yet there is very little data available to address these questions.

How do organisations in the WASH sector perceive the GODI interface?

GODI has an obvious drawback for the interviewed organisations: transparency is not a goal for organisations working on the ground and does not in itself provoke an increase in access to safe water or environmental conservation. GODI measures the publication of water quality data, but is not seen to stimulate improved outcomes. It also does not interact with the corresponding government agency. One part of GODI’s theory of change is that civil society becomes literate about government institutions and can engage with government via the publication of government data. Our interviews suggest that our theory of change needs to be reconsidered or new survey tools need to be developed that can enhance engagement between civil society and government. Below we share some ideas for future scenarios.

Our learnings and the road ahead

Adding questions to GODI

Interviews show that GODI’s current definition of water quality data does not always align with the needs of organisations on the ground. If GODI wants to be useful to organisations in the WASH sector, new questions can be added to the survey and be used as a jumping-off point for outreach to groups. Some examples include:
  1. Add a question regarding metadata and methodology documentation to capture the quality and provenance of water data, but also where we found and selected data.
  2. Add a question regarding who collected the data: government or a partner organisation. This allows community members to trace the data producers and engage with them.
  3. Assess transparency of water reports. Reports should be considered since they are an important source of information for civil society.

Customising the Open Data Survey for regional and local assessments

Many interviewees showed an interest in assessing water quality data at the regional and hyperlocal level. DataMeet is planning to customise the Open Data Survey and to team up with local WASH organisations to develop and maintain a prototype for a regional assessment of water quality. India will be our test case since local data is available for the whole country, to varying degrees across states. This may also include assessing the quality of data and access to metadata. The highest level of transparency would mean having water data from each individual lab where the samples are sent. Another use case of the Open Data Survey would be to measure the transparency of water laboratories. Bringing more transparency and accountability to labs would be most valuable for groups on the ground sending samples to labs across the country.

Map of high (> 30 mg/l) fluoride values from 2013–14. From: The Gonda Water Data story

Storytelling through data

Whilst some interviewees saw little use in governmental water quality data, its usefulness can be greatly enhanced when combined with other information. As discussed earlier, governmental water data gives snapshots and may provide baseline values that serve NGOs as rough orientation for their work. Data visualisations could present river and water basin quality and tell stories about the ecological and health effects. Behavior change is a big issue when adapting to sanitation and hygiene interventions. Water quality and health data can be combined to educate people. If you got sick, have you checked your water? Do you use a public toilet? Are you washing your hands? This type of narration does not require granular accurate data.

Comparing water quality standards

Different countries and organisations have different standards for what counts as high water pollution levels. Another project could assess how the needs of South Asian countries are being served by comparing pollution levels with different standards. For instance, fluorosis is an issue in certain parts of India: not just because of high fluoride levels but also because of poor nutrition in those areas. Should fluoride-affected areas have lower permissible amounts in poorer countries? These questions could be used to make water quality data actionable for advocacy groups.

The future of the Global Open Data Index: assessing the possibilities

- November 1, 2017 in Global Open Data Index, godi, GODI16, Open Government Data, open-government

In the last couple of months we have received questions regarding the status of the new Global Open Data Index (GODI) from a few members of our Network. This blogpost is to update everyone on the status of GODI and what comes next. But first, some context: GODI is one of the biggest assessments of the state of open government data globally, alongside the Web Foundation’s Open Data Barometer. We notice persistent obstacles for open data year after year. High-income countries regularly secure top rankings, yet overall there is little to no development in many countries. As our latest State of Open Government Data in 2017 report shows, data is often not made available publicly at all. Where it is, we see many issues around findability, quality, processability, and licensing. Individual countries are notable exceptions to the rule. The Open Data Barometer made similar observations in its latest report, mentioning a slow uptake of policy, as well as persistent data quality issues in countries that provide open data. So there is still a lot of work to be done. To resolve issues like engagement with our community, we started to explore alternative paths for GODI. This includes a shift in focus from a mere measurement tool to a stronger conversational device between our user groups throughout the process. We understand that we need to speak to new audiences and focus more on measurement as a tool in real world applications. We want to understand the use cases of the Open Data Survey (the tool that powers GODI and the Open Data Census) in different contexts and with different goals. We have seen only a few of the possible uses of the tool in the open data sphere and we want to see even more. In order to learn more about how GODI is taken up by different user groups, we are also currently exploring GODI’s effects on open data policy and publication.
We wish to understand more systematically how individual elements of the GODI interface (such as country ranking, dataset results, discussion forum entries) help mobilise support for open data among different user groups. Our goal is to understand how to improve our survey design and workflow so that they more directly support action around open data policy and publication. In addition we are developing a new vision for the Open Data Index to measure open data either on a regional and city level or by topical areas. We will elaborate on this vision in a follow-up blogpost soon. Taking this all into account, we have decided to focus on working on the aforementioned use cases and a regional Index during 2018. In the meantime, we will still work with our community to define a vision that will make GODI a sustainable measurement tool: we understand that tracking the changes in government data publication is crucial for activists and governments themselves. We know that progress around open data is slower than we would like it to be, which is why we need to ensure that discussions around open data do not end. Please do not hesitate to submit new discussions around country entries on our forum or reach out to us if you have any ideas on how to take GODI forward and improve it. If you’re running an Open Data Census, we’ll continue giving you support in the measurement you’re currently working on, whether it’s local or regional, or if you have a new idea for a Census you’d like to try. If you want to run your own Census, you can request it here, or send an email to index@okfn.org to see how we could collaborate further.

Research call: Mapping the impacts of the Global Open Data Index

- September 6, 2017 in Global Open Data Index

The Global Open Data Index (GODI) is a worldwide assessment of open data publication in more than 90 countries. It provides evidence of how well governments perform in open data publication. This call invites interested researchers and organisations to systematically study the effects of the Global Open Data Index on open data publication and the open data ecosystem. The study will identify different actors engaged around GODI, and how the information provided by GODI helped advance open data policy and publication. It will do so by investigating a sample of three countries with different degrees of open data adoption. The work will be conducted in close collaboration with Open Knowledge International’s (OKI) research department, which will provide guidance, review and assistance throughout the project. We invite interested parties to send their costed proposal to research@okfn.org. In order to be eligible, the proposal must include the research background, a short description of why the applicants are interested in the topic and how they want to research it (300 words maximum), a track record demonstrating knowledge of the topic, as well as a written research sample around open data or related fields. Finally, the proposal must also specify how much time will be committed to the work and at what cost (in GBP or USD). Due to the nature of the funding supporting this work, we unfortunately cannot accept proposals from US-based people or organisations. Please make sure the submission is made before the proposal deadline of Wed 13 Sept, 21:00 UTC.

Background

The Global Open Data Index (GODI) is a worldwide assessment of open data publication in more than 90 countries. It provides evidence of how well governments perform in open data publication. This includes mapping accessibility and access controls, findability of data, key data characteristics, as well as open licensing and machine-readability. At the same time, GODI provides a venue for open data advocates and civil servants to discuss the production of open data. Evidence shows that governance indicators drive change if they embrace dialogue and mutual ownership between those who are assessed and those who assess. This year we wanted to use the launch of GODI to spark dialogue and provide a venue for the ensuing discussions. Through this dialogue, governments learn about key datasets and data quality issues, while also receiving targeted feedback to help them improve. Furthermore, every year many interactions happen outside of the GODI process, without involving GODI staff or public discussions. Instead, results are discussed within public institutions, or between civic actors and public institutions. Some scarce evidence of GODI’s outcomes is available, yet a systematic understanding of the diverse types of effects is missing to date.

Scope of research

This research is intended to get a systematic understanding of the effects of the Global Open Data Index on open data publication and the open data ecosystem. It addresses three research questions:
  1. In what ways does the Global Open Data Index process mobilize support for open data in countries with different degrees of open data policy and publication? How does this support manifest itself?
  2. How does the Global Open Data Index influence open data publication in governments both in terms of quantity and quality of data?
  3. How do different elements of the Global Open Data Index help governments and civil society actors to drive progress around questions 1 and 2?
GODI’s effects can tentatively be grouped into high-level policy and strategy development as well as strategy implementation and ongoing publication. This research will assess how different actors such as civil servants, high-level government officials, open data advocates and communities engage with different elements of GODI and how this helps advance open data policy and publication. The research should also, whenever applicable, provide a critical account of GODI’s adverse effects. This can include ‘ceiling effects’, tunnel vision and reactivity, or other effects. The research will assess these effects in three countries. These may include Argentina, Colombia, Ukraine, South Africa, Thailand, or others. It is possible to propose alternative countries if the researcher has strong experience in them or if it would help gather data for the research. Proposals should specify which three countries would be assessed. If alternative countries are proposed, they should meet the following criteria:
  1. One country without a national open data policy, one country with a recent open data policy (in effect between 3 months and 2 years), and countries with established open data policies (older than 2 years)
  2. A mix of countries with different levels of endorsement for GODI, including countries that have publicly announced an intention to improve their ranking (high importance) and countries where no public claims for open data improvement are documented
  3. Presence of the country in the past two GODI editions
  4. May include members of the Open Government Partnership and Open Data Charter adopters, as well as non-members.

Deliverables

The work will provide a written report of between 5,000 and 7,000 words addressing each of the research questions. The report must include a clearly written methodology section and country sampling approach. The desired format is a narrative report in English. A qualitative, critical assessment of GODI’s effects on open data policy and publication is expected. It needs to describe the actors using GODI, how they interacted with different aspects of GODI, and how this helped to drive change around the first two research questions outlined above. Furthermore, the following deliverables are expected:
  • Interviews with at least four interviewees per country
  • A semi-structured  interview guide
  • Draft report by 15 October, structured around country portraits for three sample countries.
  • Weekly catch-ups with the Research team at OKI
  • Final report by 1 November

Methods and data sources

The researcher can draw from several sources to start this research, including OKI’s country contacts, Global Open Data Index scores, etc. Suggested methodology approaches include interviews with government officials and GODI contributors, as well as document analysis. Alternative research approaches and data sources shall be discussed with OKI’s research team. The research team will provide assistance in sampling interviewees in the initial phase of the research.

Activities

It is expected that this work is conducted in close contact with OKI’s research department. We will arrange a kick-off meeting to discuss your approach and have weekly calls to discuss activity and progress on the work. Early drafts will be shared with the OKI team to provide comments and discuss them with you. In addition we will have a final reflection call. Remote availability is expected (via email, Skype, Slack, or other channels). Overall research outline and goals will be discussed and agreed upon with the research lead of GODI who will help in sampling countries and will review project progress.

Decision criteria

We will base our selection of a research party on the following criteria:
  • Evidence of an understanding of open data assessments and indicators, and their influence on policy development and implementation.
  • Track record in the field of open data assessment and measurement.
  • Clarity and feasibility of methodology you propose to follow.
Due to the nature of the funding supporting this work, we unfortunately cannot accept proposals from US-based people or organisations. Please make sure the submission is made before the proposal deadline of Wed 13 Sept, 21:00 UTC.

Using the Global Open Data Index to strengthen open data policies: Best practices from Mexico

- August 16, 2017 in Global Open Data Index, Open Data Index, Open Government Data, Open Knowledge

This is a blog post coauthored with Enrique Zapata, of the Mexican National Digital Strategy. As part of the last Global Open Data Index (GODI), Open Knowledge International (OKI) decided to have a dialogue phase, where we invited individuals, CSOs, and national governments to exchange different points of view, knowledge about the data and understand data publication in a more useful way. In this process, we had a number of valuable exchanges that we tried to capture in our report about the state of open government data in 2017, as well as the records in the forum. Additionally, we decided to highlight the dialogue process between the government and civil society in Mexico and their results towards improving data publication in the executive authority, as well as funding to expand this work to other authorities and improve the GODI process. Here is what we learned from the Mexican dialogue:

The submission process

During this stage, GODI tries to directly evaluate how easy it is to find datasets and what their data quality is in general. To achieve this, civil society and government actors discussed how best to submit and agreed to submit together, based on the actual data availability. Besides creating an open space to discuss open data in Mexico and agreeing on a joint submission process, this exercise showed some room for improvement in the characteristics that GODI measured in 2016:
  • Open licenses: In Mexico and many other countries, the licenses are linked to datasets through open data platforms. This showed some discrepancies with the sources referenced by the reviewers since the data could be found in different sites where the license application was not clear.
  • Data findability: Most of the requested datasets assessed in GODI are the responsibility of the federal government and are available in datos.gob.mx. Nevertheless, the titles used to identify the datasets are based on technical regulation needs, which makes it difficult for data users to easily reach the data.
  • Differences of government levels and authorities: GODI assesses national governments but some of these datasets – such as land rights or national laws – are in the hands of other authorities or local governments. This meant that some datasets can’t be published by the federal government since it’s not in their jurisdiction and they can’t make publication of these data mandatory.
 

Open dialogue and the review process

During the review stage, the Open Data Office of the National Digital Strategy took the feedback into account and worked on some of these issues. They convened a new session with civil society, including representatives from the Open Data Charter and OKI, in order to:
  • Agree on the state of the data in Mexico according to GODI characteristics;
  • Show the updates and publication of data requested by GODI;
  • Discuss paths to publish data that is not responsibility of the federal government;
  • Converse about how they could continue to strengthen the Mexican Open Data Policy.

The results

As a result of this dialogue, we agreed on six actions that could be implemented internationally, beyond the Mexican context, both by governments with centralised open data repositories and by those that don’t centralise their data, as well as ways to improve the GODI methodology:
  1. Open dialogue during the GODI process: Mexico was the first country to develop a structured dialogue to agree with open data experts from civil society about submissions to GODI. The Mexican government will seek to replicate this process in future evaluations and include new groups to promote open data use in the country. OKI will take this experience into account to improve the GODI processes in the future.
  2. Open licenses by default: The Mexican government is reviewing and modifying their regulations to implement the terms of Libre Uso MX for every website, platform and online tool of the national government. This is an example of good practice which OKI have highlighted in our ongoing Open Licensing research.
  3. “GODI” data group in CKAN: Most data repositories allow users to create thematic groups. In the case of GODI, the Mexican government created the “Global Open Data Index” group in datos.gob.mx. This will allow users to access these datasets based on their specific needs.
  4. Create a link between government built visualization tools and datos.gob.mx: The visualisations and reference tools tend to be the first point of contact for citizens. For this reason, the Mexican government will have new regulations in their upcoming Open Data Policy so that any new development includes visible links to the open data they use.
  5. Multiple access points for data: In August 2018, the Mexican government will launch a new section on datos.gob.mx to provide non-technical users easy access to valuable data. These data, called ‘Infraestructura de Datos Abiertos MX’, will be divided into five categories that are easy to explore and understand.
  6. Common language for data sets: Government naming conventions aren’t the easiest to understand and can make it difficult to access data. The Mexican government has agreed to change the names to use more colloquial language, which can help data findability and promote data use. Where this is not possible for some datasets, the government will go for an option similar to the one established in point 5.
We hope these changes will be useful for data users as well as other governments who are looking to improve their publication policies. Got any other ideas? Share them with us on Twitter by messaging @OKFN or send us an email to index@okfn.org  

The final Global Open Data Index is now live

- June 15, 2017 in Global Open Data Index

The updated Global Open Data Index has been published today, along with our report on the state of Open Data this year. The report includes a broad overview of the problems we found around data publication and how we can improve government open data. You can download the full report here. Also, after the Public Dialogue phase, we have updated the Index. You can see the updated edition here. We will also keep our forum open for discussions about open data quality and publication. You can see the conversation here.

Open data quality – the next shift in open data?

- June 10, 2017 in Featured, Featured @en, Global Open Data Index, News, Open Data Handbook, open data, News

From Open Knowledge International. This post is part of the Global Open Data Blog. It is a call to refocus our attention on the many different elements that contribute to the “good quality” of open data, the trade-offs between them, and the way in which they support the usability of data (see here some important works […]

What data do we need? The story of the Cadasta GODI fellowship

- June 9, 2017 in Global Open Data Index

This blogpost was written by Lindsay Ferris and Mor Rubinstein.

There is a lot of data out there, but which data do users need to solve their issues? How can we, as an external body, know which data is vital so we can measure it? Moreover, what can we do when data is published at so many levels (local, regional and federal) that it is hard to find? Every year we think about these questions in order to improve the Global Open Data Index (GODI) and make it more relevant to civil society. Having the relevant data characteristics is crucial for data use, since without specific data it is hard to analyse and learn. After the publication of GODI 2015, Cadasta Foundation approached us to discuss the results of GODI in the land ownership category. Throughout this initial, lively discussion, we noticed that a systematic understanding of land data in general, and land ownership data in particular, was missing. An idea emerged: we decided to bridge these gaps and build a systematic understanding of land ownership data for the 2016 GODI. And so the idea of the GODI fellowship came to life. It was simple: Cadasta would have a fellow for a period of 6 months to explore the publication of data that is relevant to land ownership issues. The fellowship would be funded by Cadasta and the fellow would be an integral part of the team. OKI would give in-kind support in the form of guidance and research. The fellowship goals were:
  • Global policy analysis of open data in the field of land and resource rights
  • Better definition for the land ownership dataset in the Global Open Data Index for 2016;
  • Mapping stakeholders and partners for the Global Open Data Index (for submissions);
  • Recommendations for a thematic Index;
  • A working paper or a series of blog posts about open data in land and resource ownership.
Throughout the fellowship, Lindsay conducted interviews with land experts, NGOs and government officials as well as on-going desk research on the land data publication practices across different contexts. She established 4 key outputs:
  1. Outlining the challenges of opening land ownership data. Blog post here.
  2. Mapping the different types of land data and their availability. Overview here.
  3. Assessing the privacy and security risks of opening certain types of land data. See our work here: cadasta.org/open-data/assessing-the-risks-of-opening-property-rights-data/
  4. Identifying user needs and creating user personas for open land data. User personas here.

Throughout the GODI process, our aim is to advocate for datasets that different stakeholders actually need and that make sense within the context in which they are published. For example, one of the main challenges in land ownership is that data is not always recorded or gathered at the federal level, and is instead collected in cities and regions. Other government agencies are among the primary users of land ownership data. Having a grasp of this type of knowledge helped us better define the land ownership dataset for GODI. Ultimately, we developed a thoughtful definition based on these reflections and recommendations.

For us at OKI, having someone dedicated to the topic in an organisation that is an expert in a data category was immensely helpful. It makes the Index categories more relevant for real-life use and helps us to measure the categories better. It helps us to make sure our assumptions and the foundation for the research are sound. For Cadasta, having a person dedicated to open data helped to create a knowledge base and resources that help them look at open data better. It was a win-win for both sides. In fact, the work Lindsay was doing was so valuable for Cadasta that her time there was extended, and she worked on a case study about open data and land in Sao Paulo, the Land Debate final report, and a paper on Open Data in Land Governance for the 2017 World Bank Land and Poverty Conference. Going forward, we believe that having this expert input in the design of the survey is crucial for the future of open data assessment. Having only an open data lens can lead us to bias and wrong measurements. In our vision, we see the GODI tool as a community-owned assessment that can help all fields to promote, find and use the data that is relevant for them. Interested in thinking about the future of your field through open data?
Write to us on the forum – https://discuss.okfn.org/c/open-data-index/global-open-data-index-2016

The state of open licensing in 2017

- June 8, 2017 in Global Open Data Index, Open Definition, Open Government Data, Open Knowledge

This blog post is part of our Global Open Data Index (GODI) blog series. It first discusses what open licensing is and why it is crucial for opening up data. It then outlines the most urgent issues around open licensing identified in the latest edition of the Global Open Data Index, and concludes with 10 recommendations on how open data advocates can unlock this data. The blog post was jointly written by Danny Lämmerhirt and Freyja van den Boom.

Open data must be reusable by anyone, and users need the right to access and use data freely, for any purpose. But legal conditions often block the effective use of data. Whoever wants to use existing data needs to know whether they have the right to do so. Researchers cannot use others’ data if they are unsure whether they would be violating intellectual property rights. For example, a developer wanting to locate multinational companies in different countries and visualize their paid taxes can’t do so unless they can find out how this business information is licensed. Clear and open licenses attached to the data, which allow for use with the fewest restrictions possible, are necessary to make this happen.

Yet, open licenses still have a long way to go. The Global Open Data Index (GODI) 2016/17 shows that only a small portion of government data can be used without legal restrictions. This blog post discusses the status of ‘legal’ openness. We start by explaining what open licenses are and discussing GODI’s most recent findings around open licensing. We conclude by offering policymakers and decision-makers practical recommendations to improve open licensing.

What is an open license?

As the Open Definition states, data is legally open “if the legal conditions under which data is provided allow for free use”. For a license to be an open license, it must comply with the conditions set out under the Open Definition 2.1. These legal conditions include specific requirements on use, non-discrimination, redistribution, modification, and no charge.

Why do we need open licenses?

Data may fall under copyright protection. Copyright grants the author of an original work exclusive rights over that work. If you want to use a work under copyright protection, you need permission. There are exceptions and limitations to copyright under which permission is not needed, for example when the data is in the ‘public domain’ (it is not, or is no longer, protected by copyright) or when your use is permitted under an exception.

Be aware that some countries also grant legal protection to databases, which limits what use can be made of the data and the database. It is important to check what the national requirements are, as they may differ.

Because some types of data (papers, images) can fall under the scope of copyright protection, we need data licensing. Data licensing helps solve practical problems, including not knowing whether the data is indeed copyright protected and how to get permission. Governments should therefore clearly state whether their data is in the public domain or, when the data falls under the scope of copyright protection, what the license is.
  • When data is public domain it is recommended to use the CC0 Public Domain license for clarity.
  • When the data falls under the scope of copyright it is recommended to use an existing Open license such as CC-BY to improve interoperability.
Using Creative Commons or Open Data Commons licenses is best practice. Many governments already apply one of the Creative Commons licenses (see this wiki). Some governments, however, have chosen to write their own licenses or formulate ‘terms of use’ which grant use rights similar to widely acknowledged open licenses. This is problematic from the perspective of the user because it hinders interoperability. The proliferation of ever more open government licenses has been criticized for a long time. By creating their own versions, governments may add unnecessary information for users, cause incompatibility and significantly reduce the reusability of data. Creative Commons licenses are designed to reduce these problems by clearly communicating use rights and making the sharing and reuse of works possible.

The state of open licensing in 2017

Initial results from GODI 2016/17 show that only roughly 38 percent of the eligible datasets were openly licensed (this value may change slightly after the final publication on June 15). The other licenses include many use restrictions, such as limiting use to non-commercial purposes and restricting reuse and/or modification of the data.

Where data is openly licensed, best practices are hardly ever followed

In the majority of cases, our research team found that governments apply general terms of use instead of specific licenses for the data. Open government licenses and Creative Commons licenses were seldom used. As outlined above, this is problematic. Customized licenses or terms of use may introduce additional complications. For example, they may:
  • Require specific attribution statements desired by the publisher
  • Add clauses that make it unclear how data can be reused and modified.
  • Adapt licenses to local legislation
Throughout our assessment, we encountered unnecessary or ambiguous clauses, which in turn may cause legal concerns, especially when people consider using data commercially. Sometimes we came across redundant clauses that cause more confusion than clarity. For example, clauses may forbid using data in an unlawful way (see also the discussion here).

Standard open licenses are intended to reduce legal ambiguity and enable everyone to understand use rights. Yet many licenses and terms contain unclear clauses or do not make obvious which data they refer to. This can, for instance, mean that governments restrict the use of substantial parts of a database (and only allow the use of insignificant parts of it). We recommend that governments give clear examples of which use cases are acceptable and which are not.

Licenses do not make clear enough to what data they apply

Data should include a link to the license, but this is not commonly done. For instance, in Mexico, we found that procurement information available via Compranet, the procurement platform for the Federal Government, was openly licensed, but the website does not state this clearly. Mexico hosts the same procurement data on datos.gob.mx and applies an open license to this data. As a government official told us, the procurement data is therefore openly licensed, regardless of where it is hosted. But again, this is not clear to the user who may find this data on a different website. Therefore we recommend always accompanying the data with a link to the license. We also recommend attaching a license notice to the data or including it ‘in’ the data, and keeping the links updated to avoid ‘link rot’.

The absence of links between data and legal terms makes an assessment of open licenses impossible

Users may need to consult legal texts and see if the rights granted comply with the Open Definition. Problems arise if there is no clear explanation or translation available of what specific licenses entail for the end user. One problem is that users need to translate the text, and when the text is not in a machine-readable format they cannot use translation services. In our experience, this was a significant source of error in our assessment. If open data experts struggle to assess public domain status, the problem is even worse for open data users. Assessing public domain status requires substantial knowledge of copyright, something the use of open licenses is explicitly meant to avoid.

Copyright notices on websites can confuse users

In several cases, submitters and reviewers were unable to find any terms or conditions. In the absence of any other legal terms, submitters sometimes referred to copyright notices that they found in website footers. These copyright details, however, do not necessarily refer to the actual data. Often they are simply a standard copyright notice referring to the website.

Recommendations for data publishers

Based on our findings, we prepared 10 recommendations that policymakers and other government officials should take into account:
  1. Does the data and/or dataset fall under the scope of IP protection? Often government data does not fall under copyright protection and should not be presented as such. Governments should be aware and clear about the scope of intellectual property (IP) protection.
  2. Use standardized open licenses. Open licenses are easily understandable and should be the first choice. The Open Definition provides conformant licenses that are interoperable with one another.
  3. In some cases, governments might want to use a customized open government license. These should be as open as possible with the least restrictions necessary and compatible (see point 2). To guarantee a license is compatible, the best practice is to submit the license for approval under the Open Definition.
  4. Exactly pinpoint within the license what data it refers to and provide a timestamp when the data has been provided.
  5. Clearly publish open licensing details next to the data. The license should be clearly attached to the data and be both human and machine-readable. It also helps to have a license notice ‘in’ the data.
  6. Maintain the links to licenses so that users can access license terms at all times.
  7. Highlight the license version and provide context on how the data can be used.
  8. Whenever possible, avoid restrictive clauses that are not included in standard licenses.
  9. Re-evaluate the web design and avoid confusing and contradictory copyright notices in website footers, as well as disclaimers and terms of use.
  10. When government data is in the public domain by default, make clear to end users what that means for them.
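
To make recommendations 4 to 7 more concrete, here is a minimal Python sketch of what a machine-readable license record attached to a dataset could look like, together with a simple check for missing details. It is an illustration only: the field names (`license`, `id`, `version`, `url`, `applies_to`, `published`) and the example dataset are hypothetical, and the small set of license identifiers is an illustrative, non-exhaustive selection written in an SPDX-like style, not an official list.

```python
# Illustrative, non-exhaustive set of license identifiers (assumption:
# SPDX-like spelling). Check the Open Definition for the conformant list.
OPEN_LICENSE_IDS = {"CC0-1.0", "CC-BY-4.0", "ODC-By-1.0", "ODbL-1.0", "PDDL-1.0"}


def check_license_metadata(dataset: dict) -> list:
    """Return a list of problems found in a dataset's license metadata."""
    problems = []
    lic = dataset.get("license") or {}

    # Recommendation 2: use a standardized open license.
    if not lic.get("id"):
        problems.append("no license identifier")
    elif lic["id"] not in OPEN_LICENSE_IDS:
        problems.append("license is not in the known open list")

    # Recommendations 5 and 6: keep a working link to the license terms.
    if not lic.get("url"):
        problems.append("no link to the license text")

    # Recommendation 7: state the license version.
    if not lic.get("version"):
        problems.append("no license version")

    # Recommendation 4: say which data the license covers, and when
    # the data was provided.
    if not lic.get("applies_to"):
        problems.append("license does not say which data it covers")
    if not dataset.get("published"):
        problems.append("no publication timestamp")

    return problems


# Hypothetical dataset record, for illustration only.
example = {
    "title": "Procurement contracts (hypothetical)",
    "published": "2017-06-08",
    "license": {
        "id": "CC-BY-4.0",
        "version": "4.0",
        "url": "https://creativecommons.org/licenses/by/4.0/",
        "applies_to": "all rows in contracts.csv",
    },
}

print(check_license_metadata(example))  # prints []
```

A record like this can sit next to the data file or be embedded in it, so that both people and software can find the license without reading website footers or general terms of use.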