
What data counts in Europe? Towards a public debate on Europe’s high value data and the PSI Directive

- January 16, 2019 in Open Government Data, Open Standards, Policy, research

This blogpost was co-authored by Danny Lämmerhirt and Pierre Chrzanowski (*author note at the bottom). January 22 will mark a crucial moment for the future of open data in Europe. That day, the final trilogue between the European Commission, Parliament, and Council is planned to decide on the ratification of the updated PSI Directive. Among other things, the European institutions will decide what counts as ‘high value’ data. What essential information should be made available to the public, and how those data infrastructures should be funded and managed, are critical questions for the future of the EU. As we discuss below, there are many ways one might envision the collective ‘value’ of those data. This is a democratic question, and we should not be satisfied with an ill-defined, overly broad proposal. We therefore propose to organise a public debate to collectively define what counts as high value data in Europe.

What does the PSI Directive say about high value datasets?

The European Commission provides several hints in the current revision of the PSI Directive on how it envisions high value datasets. They are determined by one of the following ‘value indicators’:
  • The potential to generate significant social, economic, or environmental benefits;
  • The potential to generate innovative services;
  • The number of users, in particular SMEs;
  • The revenues they may help generate;
  • The data’s potential for being combined with other datasets;
  • The expected impact on the competitive situation of public undertakings.
Given the strategic role of open data for Europe’s Digital Single Market, these indicators are not surprising. But as we will discuss below, there are several challenges in defining them, and there are different ways of understanding the importance of data. The annex of the PSI Directive also includes a preliminary list of high value data, drawing primarily from the key datasets defined by Open Knowledge International’s (OKI’s) Global Open Data Index, as well as the G8 Open Data Charter Technical Annex. The proposed list of categories and high-value datasets:
Category Description
1. Geospatial Data Postcodes, national and local maps (cadastral, topographic, marine, administrative boundaries).
2. Earth observation and environment Space and in situ data (monitoring of the weather and of the quality of land and water, seismicity, energy consumption, the energy performance of buildings and emission levels).
3. Meteorological data Weather forecasts, rain, wind and atmospheric pressure.
4. Statistics National, regional and local statistical data with main demographic and economic indicators (gross domestic product, age, unemployment, income, education).
5. Companies Company and business registers (list of registered companies, ownership and management data, registration identifiers).
6. Transport data Public transport timetables of all modes of transport, information on public works and the state of the transport network including traffic information.
  According to the proposal, regardless of who provides them, these datasets shall be available for free, machine-readable, and accessible for download and, where appropriate, via APIs. The conditions for re-use shall be compatible with open standard licences.
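To make these conditions concrete, here is a minimal sketch of how one might automatically flag whether a published dataset record meets them. The field names, format list, and licence list are illustrative assumptions, not part of the directive's text.

```python
# Hypothetical check of a dataset record against the PSI proposal's conditions:
# machine-readable, openly licensed, free of charge, downloadable.
# Field names and the accepted format/licence sets are illustrative assumptions.

MACHINE_READABLE = {"csv", "json", "xml", "geojson", "rdf"}
OPEN_LICENCES = {"cc0", "cc-by", "odc-by", "odbl"}  # Open Definition-conformant examples

def unmet_psi_conditions(record: dict) -> list:
    """Return a list of conditions the dataset record fails to meet."""
    problems = []
    if record.get("format", "").lower() not in MACHINE_READABLE:
        problems.append("not machine-readable")
    if record.get("licence", "").lower() not in OPEN_LICENCES:
        problems.append("licence not an open standard licence")
    if record.get("fee", 0) > 0:
        problems.append("not available free of charge")
    if not record.get("download_url"):
        problems.append("no bulk download")
    return problems

example = {"format": "CSV", "licence": "CC-BY", "fee": 0,
           "download_url": "https://example.org/companies.csv"}
print(unmet_psi_conditions(example))  # → []
```

A record published only as PDF under a bespoke licence would fail several of these checks at once, which is exactly the kind of gap a transposition review would need to surface.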

Towards a public debate on high value datasets at EU level

There have been attempts by EU Member States to define what constitutes high-value data at national level, with different results. In Denmark, basic data has been defined as the five core sets of information that public authorities use in their day-to-day case processing and should release. In France, the law for a Digital Republic aims to make available reference datasets with the greatest economic and social impact. In Estonia, the country relies on the X-Road infrastructure to connect core public information systems, but most of the data remains restricted. Now is the time for a shared and common definition of what constitutes high-value datasets at EU level, and this implies an agreement on how we should define them.

However, as it stands, there are several issues with the value indicators that the European Commission proposes. For example, how does one define the data’s potential for innovative services? How can one confidently attribute revenue gains to the use of open data? How does one assess and compare the social, economic, and environmental benefits of opening up data? Anyone designing these indicators must be very cautious, as metrics to compare social, economic, and environmental benefits may come with methodological biases. Research has found, for example, that comparing economic and environmental benefits can unfairly favour data of economic value at the expense of fuzzier social benefits, as economic benefits are often more easily quantifiable and definable by default.

One way of debating high value datasets could be to discuss what data currently gets published by governments and why. For instance, with their Global Open Data Index, Open Knowledge International has long advocated for the publication of disaggregated, transactional spending figures.
Another example is OKI’s Open Data For Tax Justice initiative, which sought to influence the requirements for multinational companies to report their activities in each country (so-called ‘Country-By-Country Reporting’) and to shape a standard for publicly accessible key data.

A public debate on high value data should also critically examine the European Commission’s considerations regarding the distortion of competition. What market dynamics are engendered by opening up data? To what extent do existing markets rely on scarce and closed information? Does closed data bring about market failure, as some argue (Zinnbauer 2018)? Could it otherwise hamper fair price mechanisms (for a discussion of these dynamics in open access publishing, see Lawson and Gray 2016)? How would open data change existing market dynamics? Which actors claim that opening data could distort markets, and whose interests do they represent?

Lastly, the European Commission does not yet consider cases of government agencies generating revenue from selling particularly valuable data. The Dutch national company register has long been such a case, as has the German Weather Service. Beyond considering competition, a public debate around high value data should take into account how marginal cost recovery regimes currently work.

What we want to achieve

For these reasons, we want to organise a public discussion to collectively define
  1. What should count as a high value dataset, and based on what criteria,
  2. What information high value datasets should include,
  3. What the conditions for access and re-use should be.
The PSI Directive will set the baseline for open data policies across the EU. We are therefore at a critical moment to define what European societies value as key public information. What is at stake is not only a question of economic impact, but the question of how to democratise European institutions, and the role the public can play in determining what data should be opened.

How you can participate

  1. We will use the Open Knowledge forum as main channel for coordination, exchange of information and debate. To join the debate, please add your thoughts to this thread or feel free to start a new discussion for specific topics.
  2. We gather proposals for high value datasets in this spreadsheet. Please feel free to use it as a discussion document, where we can crowdsource alternative ways of valuing data.
  3. We use the PSI Directive Data Census to assess the openness of high value datasets.
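A Data Census-style openness assessment, as mentioned above, can be sketched as a simple scoring exercise over a handful of binary criteria per dataset. The criteria and the aggregation below are illustrative assumptions, not the official Census methodology.

```python
# Illustrative sketch of a Data Census-style openness assessment.
# Each high-value dataset is scored on a few binary openness criteria;
# the criteria names and equal weighting are assumptions for this example.

CRITERIA = ["exists", "machine_readable", "downloadable", "openly_licensed", "free"]

def openness_score(assessment: dict) -> float:
    """Fraction of openness criteria met by one dataset assessment."""
    return sum(bool(assessment.get(c)) for c in CRITERIA) / len(CRITERIA)

def country_summary(assessments: dict) -> dict:
    """Average openness per country across its assessed high-value datasets."""
    return {country: round(sum(openness_score(a) for a in data) / len(data), 2)
            for country, data in assessments.items()}

sample = {
    "FR": [{"exists": True, "machine_readable": True, "downloadable": True,
            "openly_licensed": True, "free": True},
           {"exists": True, "machine_readable": False}],
}
print(country_summary(sample))  # → {'FR': 0.6}
```

Even a rough aggregate like this makes it easy to compare countries and spot which high-value datasets drag the average down.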
We also welcome any references to scientific papers, blogposts, etc. discussing the issue of high-value datasets. Once we have gathered suggestions for high value datasets, we would like to assess how open they are. This will help provide European countries with a diagnosis of the openness of key data.

Author note: Danny Lämmerhirt is a senior researcher on open data, data governance, data commons, and metrics to improve open governance. He formerly worked with Open Knowledge International, where he led its research activities, including the methodology development of the Global Open Data Index 2016/17. His work focuses, among others, on the role of metrics for open government, and the effects metrics have on the way institutions work and make decisions. He has supervised and edited several pieces on this topic, including the Open Data Charter’s Measurement Guide. Pierre Chrzanowski is a Data Specialist with the World Bank Group and a co-founder of the Open Knowledge France local group. As part of his work, he developed the Open Data for Resilience Initiative (OpenDRI) Index, a tool to assess the openness of key datasets for disaster risk management projects. He also participated in the impact assessment prior to the new PSI Directive proposal and has contributed to the Global Open Data Index as well as the Web Foundation’s Open Data Barometer.

Advancing Sustainability Together: Launching new report on citizen-generated data and its relevance for the SDGs

- December 17, 2018 in citizen generated data, research, SDG

We are pleased to announce the launch of our latest report, Advancing Sustainability Together? Citizen-Generated Data and the Sustainable Development Goals. The research is the result of a collaboration with King’s College London, the Public Data Lab, and the Global Partnership for Sustainable Development Data, and was funded by the United Nations Foundation.

Citizen-generated data (CGD) expands what gets measured, how, and for what purpose. As the collection of and engagement with CGD increases in relevance and visibility, public institutions can learn from existing initiatives about what CGD initiatives do, how they enable different forms of sense-making, and how this may further progress around the Sustainable Development Goals. Our report, as well as a guide for governments (find the laid-out version here, as well as a living document here), shall help start conversations around the different approaches to doing and organising CGD. Whether CGD is ‘good enough’ depends on the purpose it is used for, but also on how CGD is situated in relation to other data.

As our work aims to be illustrative rather than comprehensive, we started with a list of over 230 projects associated with the term “citizen-generated data” on Google Search, using an approach known as “search as research” (Rogers, 2013). Starting from this list, we developed case studies on a range of prominent CGD examples. The report identifies several benefits CGD can bring for implementing and monitoring the SDGs, underlining the importance for public institutions of further supporting these initiatives. Key findings:
  • Dealing with data is usually much more than ‘just producing’ data. CGD initiatives open up new types of relationships between individuals, civil society and public institutions. This includes local development and educational programmes, community outreach, and collaborative strategies for monitoring, auditing, planning and decision-making.
  • Generating data takes many shapes, from collecting new data in the field, to compiling, annotating, and structuring existing data to enable new ways of seeing things through data. Accessing and working with existing (government) data is often an important enabling condition for CGD initiatives to start in the first place.
  • CGD initiatives can help gather data in regions that are otherwise not reachable. Some CGD approaches may provide updated and detailed data at lower costs and faster than official data collections.
  • Beyond filling data gaps, official measurements can be expanded, complemented, or cross-verified. This includes pattern and trend identification and the creation of baseline indicators for further research. CGD can help governments detect anomalies, test the accuracy of existing monitoring processes, understand the context around phenomena, and initiate their own follow-up data collections.
  • CGD can inform several actions to achieve the SDGs. Beyond education, community engagement and community-based problem solving, this includes baseline research, planning and strategy development, allocation and coordination of public and private programs, as well as improvement to public services.
  • CGD must be ‘good enough’ for different (and varying) purposes. Governments already develop pragmatic ways to negotiate and assess the usefulness of data for a specific task. CGD may be particularly useful when agencies have a clear remit or responsibility to manage a problem.  
  • Data quality can be comparable to official data collections, provided tasks are sufficiently easy to conduct, tool quality is high enough, and sufficient training, resources and quality assurance are provided.
You can find the full report as well as a summary report here. If you are interested in learning more about citizen-generated data and how to engage with it, we have prepared a guide for everyone interested in engaging with CGD. In addition to our report, we have gathered a list of more than 200 organisations, programs, and projects working on CGD. This list is open for everyone to contribute further examples of CGD. We have also made our raw dataset of “citizen generated data” according to Google searches accessible on figshare. If you are interested in reading more about the academic discourse around CGD and related fields, or would like to share your own work, we have prepared a Zotero group with relevant literature.

New research to map the diversity of citizen-generated data for sustainable development

- August 13, 2018 in citizen data, citizen generated data, research

We are excited to announce a new research project around citizen-generated data and the UN data revolution. This research will be led by Open Knowledge International in partnership with King’s College London and the Public Data Lab to develop a vocabulary for governments to navigate the landscape of citizen-generated data. This research elaborates on past work which explored how to democratise the data revolution, how citizen and civil society data can be used to advocate for changes in official data collection, and how citizen-generated data can be organised to monitor and advance sustainability. It is funded by the United Nations Foundation and commissioned by the Task Team on Citizen Generated Data which is hosted by the Global Partnership for Sustainable Development Data (GPSDD). Our research seeks to develop a working vocabulary of different citizen-generated data methodologies. This vocabulary shall highlight clear distinction criteria between different methods, but also point out different ways of thinking about citizen-generated data. We hope that such a vocabulary can help governments and international organisations attend to the benefits and pitfalls of citizen-generated data in a more nuanced way and will help them engage with citizen-generated data more strategically.

Why this research matters

The past decades have seen the rise of many citizen-generated data projects. A plethora of concepts and initiatives use citizen-generated data for many goals, ranging from citizen science, citizen sensing and environmental monitoring to participatory mapping, community-based monitoring and community policing. In these initiatives citizens may play very different roles, from being assigned the role of mere sensors to shaping what data gets collected. Initiatives may differ in the media and technologies used to collect data, in the ways stakeholders engage with partners from government or business, or in how activities are governed to align interests between these parties.

Air pollution monitoring devices used as part of Citizen Sense pilot study in New Cross, London (image from Changing What Counts report)

Likewise, different actors articulate the concerns and benefits of CGD in different ways. Scientific and statistical communities may be concerned about the data quality and interoperability of citizen-generated data, whereas a community centered around the monitoring of the Sustainable Development Goals (SDGs) may be more concerned with issues of scalability and the potential of CGD to fill gaps in official data sets. Legal communities may consider liability issues for government administrations when using unofficial data, whilst CSOs and international development organisations may want to know what resources and capacities are needed to support citizen-generated data and how to organise and plan projects.

In our work we will address a range of questions, including: What citizen-generated data methodologies work well, and for what purposes? What is the role of citizens in generating data, and what can data “generation” look like? How are participation and use of citizen data organised? What collaborative models between official data producers/users and citizen-generated data projects exist? Can citizen-generated data be used alongside or incorporated into statistical monitoring, and if so, under what circumstances? And in what ways could citizen-generated data contribute to regulatory decision-making or other administrative tasks of government? In our research we will
  • Map existing literature, online content and examples of projects, practices and methods associated with the term “citizen generated data”;
  • Use this mapping to solicit input and ideas on other kinds of citizen-generated data initiatives, as well as other relevant literature and practices, from researchers, practitioners and others;
  • Gather suggestions from literature, researchers and practitioners about which aspects of citizen-generated data to attend to, and why;
  • Undertake fresh empirical research around a selection of citizen-generated data projects in order to explore these different perspectives.

Visual representation of the Bushwick Neighbourhood, geo-locating qualitative stories in the map (left image), and patterns of land usage (right image) (Source: North West Bushwick Community project)

Next steps

In the spirit of participatory and open research, we invite governments, civil society organisations and academia to share examples of citizen-generated data methodologies, the benefits of using citizen-generated data and issues we may want to look into as part of our research. If you’re interested in following or contributing to the project, you can find out more on our forum.

Europe’s proposed PSI Directive: A good baseline for future open data policies?

- June 21, 2018 in eu, licence, Open Data, Open Government Data, Open Standards, Policy, PSI, research

Some weeks ago, the European Commission proposed an update of the PSI Directive**. The PSI Directive regulates the re-use of public sector information (including administrative government data), and has important consequences for the development of Europe’s open data policies. Like every legislative proposal, the PSI Directive proposal is open for public feedback until July 13. In this blog post, Open Knowledge International presents what we think are necessary improvements to make the PSI Directive fit for Europe’s Digital Single Market.

In a guest blogpost, Ton Zijlstra outlined the changes to the PSI Directive. Another blog post by Ton Zijlstra and Katleen Janssen helps to understand the historical background and puts the changes into context. While some improvements have been made, we think the current proposal is a missed opportunity, does not support the creation of a Digital Single Market, and can pose risks for open data. In what follows, we recommend changes to the European Parliament and the European Council. We also discuss actions civil society may take to engage with the directive in the future, and explain the reasoning behind our recommendations.

Recommendations to improve the PSI Directive

Based on our assessment, we urge the European Parliament and the Council to amend the proposed PSI Directive to ensure the following:
  • When defining high-value datasets, the PSI Directive should not rule out data generated under market conditions. A stronger requirement must be added to Article 13 to make assessments of economic costs transparent, and weigh them against broader societal benefits.
  • The public must have access to the methods, meeting notes, and consultations to define high value data. Article 13 must ensure that the public will be able to participate in this definition process to gather multiple viewpoints and limit the risks of biased value assessments.
  • Beyond tracking proposals for high-value datasets in the EU’s Interinstitutional Register of Delegated Acts, the public should be able to suggest new delegated acts for high-value datasets.  
  • The PSI Directive must make clear what “standard open licences” are, by referencing the Open Definition, and explicitly recommending the adoption of Open Definition compliant licences (from Creative Commons and Open Data Commons) when developing new open data policies. The directive should give preference to public domain dedication and attribution licences in accordance with the LAPSI 2.0 licensing guidelines.
  • Governments of EU member states that already have policies on specific licences should be required to add legal compatibility tests with other open licences to these policies. We suggest following the recommendations outlined in the LAPSI 2.0 resources to run such compatibility tests.
  • High-value datasets must be reusable with the least restrictions possible, subject at most to requirements that preserve provenance and openness. Currently, the European Commission risks creating use silos if governments are allowed to add “any restrictions on re-use” to the use terms of high-value datasets.
  • Publicly funded undertakings should only be able to charge marginal costs.
  • Public undertakings, publicly funded research facilities and non-executive government branches should be required to publish data referenced in the PSI Directive.

Conformant licences according to the Open Definition, opendefinition.org/licenses

Our recommendations do not pose unworkable requirements or disproportionately high administrative burdens, but are essential to realise the goals of the PSI Directive with regard to:
  1. Increasing the amount of public sector data available to the public for re-use,
  2. Harmonising the conditions for non-discrimination, and re-use in the European market,
  3. Ensuring fair competition and easy access to markets based on public sector information,
  4. Enhancing cross-border innovation, and an internal market where Union-wide services can be created to support the European data economy.

Our recommendations, explained: What would the proposed PSI Directive mean for the future of open data?

Publication of high-value data

The European Commission proposes to define a list of ‘high value datasets’ that shall be published under the terms of the PSI Directive. This includes publishing datasets in machine-readable formats, under standard open licences, and in many cases free of charge, except when high-value datasets are collected by public undertakings in environments where free access to data would distort competition. “High value datasets” are defined as documents that bring socio-economic benefits, “notably because of their suitability for the creation of value-added services and applications, and the number of potential beneficiaries of the value-added services and applications based on these datasets”. The EC also makes reference to existing high value datasets, such as the list of key data defined by the G8 Open Data Charter. Identifying high-value data poses at least three problems:
  1. High-value datasets may be unusable in a Digital Single Market: The EC may “define other applicable modalities”, such as “any conditions for re-use”. There is a risk that a list of EU-wide high value datasets includes use restrictions violating the Open Definition. Given that a list of high value datasets will be transposed by all member states, adding “any conditions” may significantly hinder the reusability of datasets and the ability to combine them.
  2. Defining the value of data is not straightforward. Recent papers, from Oxford University to Open Data Watch and the Global Partnership for Sustainable Development Data, demonstrate disagreement about what data’s “value” is. What counts as high value data should not only be based on quantitative indicators such as growth indicators, numbers of apps or numbers of beneficiaries, but should also use qualitative assessments and expert judgement from multiple disciplines.
  3. Public deliberation and participation are key to defining high value data and avoiding biased value assessments. Impact assessments and cost-benefit calculations come with their own methodological biases, and can unfairly favour data with economic value at the expense of fuzzier social benefits. Currently, the PSI Directive does not allow data created under market conditions to be considered high value data if this would distort market conditions. We recommend that the PSI Directive add a stronger requirement to weigh economic costs against societal benefits, drawing on multiple assessment methods (see point 2). The criteria, methods, and processes used to determine high value must be transparent and accessible to the broader public, to enable the public to negotiate benefits and to reflect the viewpoints of many stakeholders.

Expansion of scope

The new PSI Directive takes into account data from “public undertakings”. This includes services in the general interest entrusted to entities outside the public sector, over which government maintains a high degree of control. The PSI Directive also includes data from non-executive government branches (i.e. from the legislative and judiciary branches of government), as well as data from publicly funded research. Opportunities and challenges include:
  • None of the data holders newly included in the PSI Directive are obliged to publish data; publication is at their discretion. Only if they choose to publish data must they follow the guidelines of the proposed PSI Directive.
  • The PSI Directive wants to keep administrative costs low: all of the above-mentioned data sectors are exempt from data access requests.
  • In summary, the proposed PSI Directive leaves too much space for individual choice about publishing data and has no “teeth”. To accelerate the publication of general interest data, the PSI Directive should oblige data holders to publish data. Waiting several years to make the publication of this data mandatory, as happened with the first version of the PSI Directive, risks significantly hampering the availability of key data that is important for accelerating growth in Europe’s data economy.
  • For research data in particular, only data that is already published should fall under the new directive. Even though the PSI Directive will require member states to develop open access policies, their implementation should build upon the EU’s recommendations for open access.

Legal incompatibilities may jeopardise the Digital Single Market

Most notably, the proposed PSI Directive does not address problems around licensing, which are a major impediment to Europe’s Digital Single Market. Europe’s data economy can only benefit from open data if licence terms are standardised. This allows data from different member states to be combined without legal issues, and enables combining datasets, creating cross-country applications, and sparking innovation. Europe’s licensing ecosystem is a patchwork of many (possibly conflicting) terms, creating use silos and legal uncertainty. The current proposal not only speaks vaguely about standard open licences and makes national policies responsible for adding “less restrictive terms than those outlined in the PSI Directive”. It also contradicts its aim of smoothing the Digital Single Market by encouraging the creation of bespoke licences, suggesting that governments may add new licence terms with regard to real-time data publication. Currently, the PSI Directive would allow the European Commission to add “any conditions for re-use” to high-value datasets, thereby encouraging legal incompatibilities (see Article 13 (4.a)). We strongly recommend that the PSI Directive draw on the EU co-funded LAPSI 2.0 recommendations to understand licence incompatibilities and ensure a compatible open licence ecosystem.

I’d like to thank Pierre Chrzanowski, Mika Honkanen, Susanna Ånäs, and Sander van der Waal for their thoughtful comments while writing this blogpost.

Image adapted from Max Pixel

** Its official name is Directive 2003/98/EC on the re-use of public sector information.

Europe’s proposed PSI Directive: A good baseline for future open data policies?

- June 21, 2018 in eu, licence, Open Data, Open Government Data, Open Standards, Policy, PSI, research

Some weeks ago, the European Commission proposed an update of the PSI Directive**. The PSI Directive regulates the reuse of public sector information (including administrative government data), and has important consequences for the development of Europe’s open data policies. Like every legislative proposal, the PSI Directive proposal is open for public feedback until July 13. In this blog post Open Knowledge International presents what we think are necessary improvements to make the PSI Directive fit for Europe’s Digital Single Market.    In a guest blogpost Ton Zijlstra outlined the changes to the PSI Directive. Another blog post by Ton Zijlstra and Katleen Janssen helps to understand the historical background and puts the changes into context. Whilst improvements are made, we think the current proposal is a missed opportunity, does not support the creation of a Digital Single Market and can pose risks for open data. In what follows, we recommend changes to the European Parliament and the European Council. We also discuss actions civil society may take to engage with the directive in the future, and explain the reasoning behind our recommendations.

Recommendations to improve the PSI Directive

Based on our assessment, we urge the European Parliament and the Council to amend the proposed PSI Directive to ensure the following:
  • When defining high-value datasets, the PSI Directive should not rule out data generated under market conditions. A stronger requirement must be added to Article 13 to make assessments of economic costs transparent, and weigh them against broader societal benefits.
  • The public must have access to the methods, meeting notes, and consultations to define high value data. Article 13 must ensure that the public will be able to participate in this definition process to gather multiple viewpoints and limit the risks of biased value assessments.
  • Beyond tracking proposals for high-value datasets in the EU’s Interinstitutional Register of Delegated Acts, the public should be able to suggest new delegated acts for high-value datasets.  
  • The PSI Directive must make clear what “standard open licences” are, by referencing the Open Definition, and explicitly recommending the adoption of Open Definition compliant licences (from Creative Commons and Open Data Commons) when developing new open data policies. The directive should give preference to public domain dedication and attribution licences in accordance with the LAPSI 2.0 licensing guidelines.
  • Government of EU member states that already have policies on specific licences in use should be required to add legal compatibility tests with other open licences to these policies. We suggest to follow the recommendations outlined in the LAPSI 2.0 resources to run such compatibility tests.
  • High-value datasets must be reusable with the least restrictions possible, subject at most to requirements that preserve provenance and openness. Currently, the European Commission risks creating use silos if governments are allowed to add “any restrictions on re-use” to the use terms of high-value datasets.
  • Publicly funded undertakings should only be able to charge marginal costs.
  • Public undertakings, publicly funded research facilities and non-executive government branches should be required to publish data referenced in the PSI Directive.

Conformant licences according to the Open Definition, opendefinition.org/licenses

Our recommendations do not pose unworkable requirements or a disproportionately high administrative burden, but are essential to realise the goals of the PSI Directive with regard to:
  1. Increasing the amount of public sector data available to the public for re-use,
  2. Harmonising the conditions for non-discrimination, and re-use in the European market,
  3. Ensuring fair competition and easy access to markets based on public sector information,
  4. Enhancing cross-border innovation, and an internal market where Union-wide services can be created to support the European data economy.

Our recommendations, explained: What would the proposed PSI Directive mean for the future of open data?

Publication of high-value data

The European Commission proposes to define a list of ‘high value datasets’ that shall be published under the terms of the PSI Directive. This includes publishing datasets in machine-readable formats, under standard open licences, and in many cases free of charge, except when high-value datasets are collected by public undertakings in environments where free access to data would distort competition. “High value datasets” are defined as documents that bring socio-economic benefits, “notably because of their suitability for the creation of value-added services and applications, and the number of potential beneficiaries of the value-added services and applications based on these datasets”. The EC also makes reference to existing lists of high-value datasets, such as the key data defined by the G8 Open Data Charter. Identifying high-value datasets poses at least three problems:
  1. High-value datasets may be unusable in a Digital Single Market: the EC may “define other applicable modalities”, such as “any conditions for re-use”. There is a risk that a list of EU-wide high-value datasets includes use restrictions violating the Open Definition. Given that the list of high-value datasets will be transposed by all member states, adding “any conditions” may significantly hinder the reusability of datasets and the ability to combine them.
  2. Defining the value of data is not straightforward. Recent papers, from Oxford University to Open Data Watch and the Global Partnership for Sustainable Development Data, demonstrate disagreement about what data’s “value” is. What counts as high-value data should not only be based on quantitative indicators such as growth figures, numbers of apps, or numbers of beneficiaries, but should also draw on qualitative assessments and expert judgement from multiple disciplines.
  3. Public deliberation and participation are key to defining high-value data and avoiding biased value assessments. Impact assessments and cost-benefit calculations come with their own methodological biases, and can unfairly favour data with economic value at the expense of fuzzier social benefits. Currently, the PSI Directive excludes data created under market conditions from being high-value data if publication would distort market conditions. We recommend that the PSI Directive add a stronger requirement to weigh economic costs against societal benefits, drawing from multiple assessment methods (see point 2). The criteria, methods, and processes used to determine high value must be transparent and accessible to the broader public, so that the public can negotiate benefits and the viewpoints of many stakeholders are reflected.

Expansion of scope

The new PSI Directive takes into account data from “public undertakings”. This includes services of general interest entrusted to entities outside the public sector, over which government maintains a high degree of control. The PSI Directive also includes data from non-executive government branches (i.e. from the legislative and judiciary branches of government), as well as data from publicly funded research. Opportunities and challenges include:
  • None of the data holders that the PSI Directive plans to include are obliged to publish data; publication is at their discretion. Only if they choose to publish data must they follow the guidelines of the proposed PSI Directive.
  • The PSI Directive aims to keep administrative costs low: all of the above-mentioned data sectors are exempt from data access requests.
  • In summary, the proposed PSI Directive leaves too much room for individual choice over whether to publish data and has no “teeth”. To accelerate the publication of general-interest data, the PSI Directive should oblige data holders to publish data. Waiting several years to make the publication of this data mandatory, as happened with the first version of the PSI Directive, risks significantly hampering the availability of key data that is important for accelerating growth in Europe’s data economy.
  • For research data in particular, only data that is already published should fall under the new directive. Even though the PSI Directive will require member states to develop open access policies, the implementation thereof should be built upon the EU’s recommendations for open access.

Legal incompatibilities may jeopardise the Digital Single Market

Most notably, the proposed PSI Directive does not address problems around licensing, which are a major impediment for Europe’s Digital Single Market. Europe’s data economy can only benefit from open data if licence terms are standardised. This allows data from different member states to be combined without legal issues, makes it possible to combine datasets and create cross-country applications, and sparks innovation. Europe’s licensing ecosystem is a patchwork of many (possibly conflicting) terms, creating use silos and legal uncertainty. But the current proposal not only speaks vaguely about standard open licences and makes national policies responsible for adding “less restrictive terms than those outlined in the PSI Directive”. It also contradicts its aim of smoothing the Digital Single Market by encouraging the creation of bespoke licences, suggesting that governments may add new licence terms with regard to real-time data publication. Currently, the PSI Directive would allow the European Commission to add “any conditions for re-use” to high-value datasets, thereby encouraging the creation of legal incompatibilities (see Article 13 (4.a)). We strongly recommend that the PSI Directive draw on the EU co-funded LAPSI 2.0 recommendations to understand licence incompatibilities and ensure a compatible open licence ecosystem.

I’d like to thank Pierre Chrzanowski, Mika Honkanen, Susanna Ånäs, and Sander van der Waal for their thoughtful comments while writing this blogpost.

Image adapted from Max Pixel

** Its official name is the Directive 2003/98/EC on the reuse of public sector information.

The Open Data Charter Measurement Guide is out now!

- May 21, 2018 in Open Data measurements, research

This post was jointly written by Ana Brandusescu (Web Foundation) and Danny Lämmerhirt (Open Knowledge International), co-chairs of the Measurement and Accountability Working Group of the Open Data Charter. It was originally published via the Open Data Charter’s Medium account.

We are pleased to announce the launch of our Open Data Charter Measurement Guide. The guide is a collaborative effort of the Charter’s Measurement and Accountability Working Group (MAWG). It analyses the Open Data Charter principles and how they are assessed by current open government data measurement tools. Governments, civil society, journalists, and researchers may use it to better understand how they can measure open data activities according to the Charter principles.

What can I find in the Measurement Guide?

  • An executive summary for people who want to quickly understand what measurement tools exist and for what principles.
  • An analysis of how each Charter principle is measured, including a comparison of indicators that are currently used to measure each Charter principle and its commitments. This analysis is based on the open data indicators used by the five largest measurement tools – the Web Foundation’s Open Data Barometer, Open Knowledge International’s Global Open Data Index, Open Data Watch’s Open Data Inventory, OECD’s OURdata Index, and the European Open Data Maturity Assessment. For each principle, we also highlight case studies of how Charter adopters have practically implemented the commitments of that principle.
  • Comprehensive indicator tables show how each Charter principle commitment can be measured. This table is especially helpful when used to compare how different indices approach the same commitment, and where gaps exist. Here, you can see an example of the indicator tables for Principle 1.
  • A methodology section that details how the Working Group conducted the analysis of mapping existing measurement indices against Charter commitments.
  • A recommended list of resources for anyone who wants to read more about measurement and policy.
The Measurement Guide is available online in the form of a Gitbook and in a printable PDF version. If you are interested in using the indicators to measure open data, visit our indicator tables for each principle, or find the guide’s raw data here. Do you have comments or questions? Share your feedback with the community using the hashtag #OpenDataMetrics or get in touch with our working group at progressmeasurement-wg@opendatacharter.net.


The Open Data Charter’s Measurement Guide is now open for consultation!

- March 13, 2018 in Open Data, Open Data Charter, Open Data measurements, research

This blogpost is co-authored by  Ana Brandusescu  and Danny Lämmerhirt, co-chairs of the Measurement and Accountability Working Group of the Open Data Charter.

The Measurement and Accountability Working Group (MAWG) is launching the public consultation phase for the draft Open Data Charter Measurement* Guide!

Image: Imgflig.com

Measurement tools are often described in technical language. The Guide explains how the Open Data Charter principles can be measured. It provides a comprehensive overview of existing open data measurement tools and their indicators, which assess the state of open government data at the national level. Many of the indicators analysed are relevant for local and regional governments, too. This post explains what the Measurement Guide covers, the purpose of the public consultation, and how you can participate!

What can I find in the Measurement Guide?

  • An executive summary for people who want to quickly understand what measurement tools exist and for what principles.
  • An analysis of measuring the Charter principles, which includes a comparison of the indicators that are currently used to measure each Charter principle and its accompanying commitments. It reveals how the measurement tools — Open Data Barometer, Global Open Data Index, Open Data Inventory, OECD’s OURdata Index, European Open Data Maturity Assessment — address the Charter commitments. For each principle, case studies of how Charter adopters have put commitments into practice are also highlighted.
  • Comprehensive indicator tables show available indicators against each Charter commitment. This table is especially helpful when used to compare how different indices approach the same commitment, and where gaps exist.
  • A methodology section that details how the Working Group conducted the analysis of mapping existing measurement indices against Charter commitments.
  • A recommended list of resources for anyone who wants to read more about measurement and policy.

We want you — to give us your feedback!

The public consultation is a dialogue between measurement researchers and everyone who is working with measurements — including government, civil society, and researchers. If you consider yourself part of one (or more) of these groups, we would appreciate your feedback on the guide. Please bear the questions below in mind as you review the Guide:

  • Is the Measurement Guide clear and understandable?
  • Government: Which indicators are most useful to assess your work on open data and why?
  • Civil society: In what ways do you find existing indicators useful to hold your government to account?
  • Researchers: Do you know measurements and assessments that are well-suited to understand the Charter commitments?

How does the public consultation process work?

The public consultation phase will be open for two weeks — from 12 to 26 March — and includes:

  1. Public feedback, where we gather comments in the Measurement Guide and the indicator tables document.
  2. Public (and private) responses from MAWG members throughout the consultation phase.

How can I give feedback to the public consultation?

  1. You can leave comments directly in the Measurement Guide, as well as the indicator tables.
  2. If you want to send a private message to the group chairs, drop Ana and Danny an email at ana.brandusescu@webfoundation.org and danny.lammerhirt@okfn.org. Or send us a tweet at @anabmap and @danlammerhirt.
  3. Share your feedback with the community using the hashtag #OpenDataMetrics.

We will incorporate your feedback into the Measurement Guide during the public consultation period. We plan to publish a final version of the Measurement Guide by the end of April 2018.

Note that we will not include new indicators or comments specifically on the Charter principles. If you have comments about improving the Charter principles, we encourage you to participate in the updating process of the Charter principles.

*Since the last time we wrote a blog post, we have changed the name to more accurately represent the document, from Assessment Guide to Measurement Guide.


New Report: Avoiding data use silos – How governments can simplify the open licensing landscape

- December 14, 2017 in licence, Open Data, Policy, research

Licence proliferation continues to be a major challenge for open data. When licensors decide to create custom licences instead of using standard open licences, it creates a number of problems. Users of open data may find it difficult and cumbersome to understand all the legal arrangements. More importantly, legal uncertainties and compatibility issues between many different licences can have chilling effects on the reuse of data. This can create ‘data use silos’: a situation where users are legally allowed to combine only some data with one another, as most data would be legally impossible to use under the same terms. This counteracts efforts such as the European Digital Single Market strategy, prevents the free flow of (public sector) information, and impedes the growth of data economies. Standardised licences can ease reuse by clearly stating usage rights.

Our latest report, ‘Avoiding data use silos – How governments can simplify the open licensing landscape’, explains why reusable standard licences or putting data in the public domain are the best options for governments. While the report focuses on government, many of the recommendations also apply to public sector bodies as well as publishers of works more broadly.

The lack of centralised coordination within governments is a key driver of licence proliferation. Different phases along the licensing process influence governments’ choices of which open licences to apply – including clearance of copyright, policy development, and the development and application of individual licences. Our report also outlines how governments can harmonise decision-making around open licences and ensure their compatibility. We aim to provide the ground for a renewed discussion around what good open licensing means – and inspire follow-up research on specific blockages of open licensing.
We propose the following best practices and recommendations for governments who wish to make their public sector information as reusable as possible:
  1. Publish clear notices that concisely inform users about their rights to reuse, combine, and distribute information in cases where data is exempt from copyright or similar rights.
  2. Align licence policies via inter-ministerial committees and collaborations with representative bodies for lower administrative levels. Consider appointing an agency overseeing and reviewing licensing decisions.
  3. Precisely define reusable standard licences in your policy tools. Clearly define a small number of highly compatible legal solutions. We recommend putting data into the public domain using Creative Commons Zero, or applying a standard open licence like Creative Commons BY 4.0.
  4. If you still opt to use custom licences, carefully verify whether their provisions cause incompatibilities with other licences. Add compatibility statements explicitly naming the licences and licence versions compatible with the custom licence, and keep the licence text short, simple, and reader-friendly.
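To illustrate how recommendations 3 and 4 might be applied in practice, the sketch below checks the licence declared in dataset metadata against a short allowlist of standard open licences. This is a hypothetical example, not a tool from the report: the licence URIs, the `licence` metadata field, and the helper function are all assumptions for illustration.

```python
# Hypothetical sketch: flag datasets whose declared licence is not on a
# short allowlist of standard open licences. The URIs, dataset records,
# and the "licence" field name are illustrative assumptions.

CONFORMANT_LICENCES = {
    "https://creativecommons.org/publicdomain/zero/1.0/",  # CC0 1.0
    "https://creativecommons.org/licenses/by/4.0/",        # CC BY 4.0
    "https://opendatacommons.org/licenses/by/1-0/",        # ODC-BY 1.0
}

def licence_is_conformant(metadata: dict) -> bool:
    """Return True if the dataset metadata declares an allowlisted licence."""
    return metadata.get("licence") in CONFORMANT_LICENCES

# Illustrative dataset records as a publisher's catalogue might hold them.
datasets = [
    {"title": "Company register",
     "licence": "https://creativecommons.org/licenses/by/4.0/"},
    {"title": "Weather stations",
     "licence": "https://example.gov/custom-terms"},  # bespoke licence
]

for d in datasets:
    flag = "conformant" if licence_is_conformant(d) else "needs compatibility review"
    print(f"{d['title']}: {flag}")
```

Keeping the allowlist deliberately small mirrors the report's advice to define a small number of highly compatible legal solutions; anything outside it is surfaced for the kind of compatibility review recommendation 4 describes.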

Custom licences used across a sample of 20 governments