EU Council backs controversial copyright crackdown

- April 15, 2019 in copyright, eu, Featured, Internet, News, Policy

The Council of the European Union today backed a controversial copyright crackdown in a ‘deeply disappointing’ vote that could affect all internet users. Six countries – Italy, Luxembourg, the Netherlands, Poland, Finland and Sweden – voted against the proposal, which has been opposed by 5 million people through a Europe-wide petition.
Three more nations abstained, but the UK voted for the crackdown and there were not enough votes for a blocking minority. The proposal is expected to lead to the introduction of ‘filters’ on sites such as YouTube, which will automatically remove content that could be copyrighted. While entertainment footage is most likely to be affected, academics fear it could also restrict the sharing of knowledge, and critics argue it will have a negative impact on freedom of speech and expression online. EU member states will have two years to implement the law, and the regulations are still expected to affect the UK despite Brexit. The Open Knowledge Foundation said the battle is not over, with the European elections providing an opportunity to elect ‘open champions’. Catherine Stihler, chief executive of the Open Knowledge Foundation, said:
“This is a deeply disappointing result which will have a far-reaching and negative impact on freedom of speech and expression online. The controversial crackdown was not universally supported, and I applaud those national governments which took a stand and voted against it. We now risk the creation of a more closed society at the very time we should be using digital advances to build a more open world where knowledge creates power for the many, not the few.

But the battle is not over. Next month’s European elections are an opportunity to elect a strong cohort of open champions at the European Parliament who will work to build a more open world.”

EU copyright vote a ‘massive blow’ for internet users

- March 26, 2019 in copyright, eu, Featured, Internet, News, Policy

MEPs have today voted to press ahead with a controversial copyright crackdown in a ‘massive blow’ for all internet users. Despite a petition with over 5 million signatures and scores of protests across Europe attended by tens of thousands of people, MEPs voted by 348 to 274 in favour of the changes. The law is expected to lead to the introduction of ‘filters’ on sites such as YouTube, which will automatically remove content that could be copyrighted. While entertainment footage is most likely to be affected, academics fear it could also restrict the sharing of knowledge, and critics argue it will have a negative impact on freedom of speech and expression online. EU member states will have two years to implement the law, and the regulations are still expected to affect the UK despite Brexit. Catherine Stihler, chief executive of the Open Knowledge Foundation, said:
“This vote is a massive blow for every internet user in Europe. MEPs have rejected pleas from millions of EU citizens to save the internet, and chose instead to restrict freedom of speech and expression online. We now risk the creation of a more closed society at the very time we should be using digital advances to build a more open world where knowledge creates power for the many, not the few.

But while this result is deeply disappointing, the forthcoming European elections provide an opportunity for candidates to stand on a platform to seek a fresh mandate to reject this censorship.”

Open data governance and open governance: interplay or disconnect?  

- February 20, 2019 in Open Data, open data governance, Policy, research

Authors: Ana Brandusescu, Carlos Iglesias, Danny Lämmerhirt, Stefaan Verhulst (in alphabetical order)

The presence of open data is often listed as an essential requirement for “open governance”. For instance, an open data strategy is regarded as a key component of many action plans submitted to the Open Government Partnership. Yet little time is spent on assessing how open data itself is governed, or how it embraces open governance. For example, not much is known about whether the principles and practices that guide the opening up of government – such as transparency, accountability, user-centrism and ‘demand-driven’ design thinking – also guide decision-making on how to release open data.

At the same time, data governance has become more complex, and open data decision-makers face heightened concerns with regard to privacy and data protection. The recent implementation of the EU’s General Data Protection Regulation (GDPR) has generated increased awareness worldwide of the need to prevent and mitigate the risks of personal data disclosures, and that has also affected the open data community. Before opening up data, concerns about data breaches, the abuse of personal information, and the potential for malicious inference from publicly available data may have to be taken into account. In turn, questions of how to sustain existing open data programmes, user-centrism, and publishing with purpose gain prominence.

To better understand the practices and challenges of open data governance, we outlined a research agenda in an earlier blog post. Since then, and perhaps as a result, governance has emerged as an important topic for the open data community. The audience attending the 5th International Open Data Conference (IODC) in Buenos Aires deemed governance of open data to be the most important discussion topic. For instance, discussions around the Open Data Charter principles during and prior to the IODC acknowledged the role of an integrated governance approach to data handling, sharing, and publication. Some conclude that the open data movement has brought about better governance, skills and technologies for public information management, which represent enormous long-term value for government. But what does open data governance look like?

Understanding open data governance

To expand our earlier exploration and broaden the community that considers open data governance, we convened a workshop at the Open Data Research Symposium 2018. Bringing together open data professionals, civil servants, and researchers, we focused on:
  • What is open data governance?
  • When can we speak of “good” open data governance?
  • How can the research community help open data decision-makers toward “good” open data governance?
In this session, open data governance was defined as the interplay of rules, standards, tools, principles, processes and decisions that influence what government data is opened up, how and by whom. We then explored multiple layers that can influence open data governance. In the following, we illustrate possible questions to start mapping the layers of open data governance. As they reflect the experiences of session participants, we see them as starting points for fresh ethnographic and descriptive research on the daily practices of open data governance in governments.

Figure: Schema of an open data governance model

The Management layer

Governments may decide about the release of data at various levels. Studying the management side of data governance could look at decision-making methods and devices. For instance, one might analyze how governments gauge public interest in their datasets: through data request mechanisms, user research, or participatory workshops? What routine procedures do governments put in place to interact with other governments and the public? For instance, how do governments design routine processes to handle open data requests? How are disputes over open data release settled? How do governments enable the public to address non-publication? One might also study cost-benefit calculations and similar methodologies used to evaluate data, and how they inform governments about what data counts as crucial and is expected to bring returns and societal benefits.

Understanding open data governance would also require studying how open data creation, cleaning, and publication are themselves managed. Governments may choose to organise open data publication and maintenance in house, or seek collaborative approaches such as those known from data communities like OpenStreetMap.

Another key component is funding and sustainability. Funding might influence management on multiple layers – from funding capacity building, to investing in staff innovations and alternative business models for government agencies that generate revenue from high value datasets. What do these budget and sustainability models look like? How are open data initiatives currently funded, under what terms, for how long, by whom and for what? And how do governments reconcile the publication of high value datasets with the need to provide income for public government bodies? These questions gain importance as governments move towards assessing and publishing high value datasets.

Open governance and management: To what extent is management guided by open governance? For instance, how participatory, transparent, and accountable are decision-making processes and devices? How do governments currently make space for more open governance in their management processes? Do governments practice more collaborative data management with communities, for example to maintain, update and verify government data?

The Legal and Policy layer

The interplay between legal and policy frameworks: Open data policies operate among other legal and policy frameworks, which can complement, enable, or limit the scope of open data. New frameworks such as the GDPR, but also existing right to information and freedom of expression frameworks, prompt the question of how the legal environment influences behaviour and daily decision-making around open data. To address such questions, one could study the discourse and interplay between open data policies as well as tangential policies like smart city or digitalisation policies.

Implementation of law and policies: Furthermore, how are open data frameworks designed to guide the implementation of open data? How do they address governmental devolution? Open data governance needs to stretch across all levels of government in order to unlock data at each of them. What approaches are being experimented with to coordinate the implementation of policies across jurisdictions and government branches? To which agencies do open data policies apply, and how do they enable or constrain choices around open data? Which agencies define and advance open data, and how does this influence the adoption and sustainability of open data initiatives?

Open governance of law and policy: Besides studying the interaction of privacy protection, right to information, and open data policies, how could open data benefit from policies enabling open governance and civic participation? Do governments develop more integrated strategies for open governance and open data, and if so, what policies and legal mechanisms are in place, and how do these laws and policies enable other aspects of open data governance, including more participatory management and more substantive, legally supported citizen participation?

The Technical and Standards layer

Governments may have different technical standards in place for data processing and publication, from producing data to quality assurance processes. Some research has looked into the ways data standards for open data alter the way governments process information. Others have argued that the development of data standards reflects how governments envisage citizens, primarily catering to tech-literate audiences. (Data) standards do not only represent, but also intervene in, the way governments work. Therefore, they could substantially alter the ways government publishes information. Understood this way, how do standards enable resilience against change, particularly when facing shifting political leadership?

On the other hand, most government data systems are not designed for open data. Too often, governments are struggling to transform huge volumes of government data into open data using manual methods. Legacy IT systems that have not been built to support open data create additional challenges to developing technical infrastructure, and there is no single global solution to data infrastructure. How, then, could governments transform their technical infrastructure to allow them to publish open data efficiently?

Open governance and the technical/standards layer: If standards can be understood as bridge-building devices, or tools for cooperation, how could open governance inform the creation of technical standards? Do governments experiment with open standards, and if so, what standards are developed, to what end, and using what governance approach?

The Capacity layer

Staff innovations may play an important role in open data governance. What is the role of chief data officers in improving open data governance? Could the usual informal networks of open data curators within government, plus a few open data champions, deliver open data success on their own? What role do these innovations play in making decisions about open data and personal data protection? Could governments rely solely on senior government officials to execute open data strategies? Who else is involved in the decision-making around open data release? What are the incentives and disincentives for officials to increase data sharing? As one session participant put it: “I have never experienced that a civil servant got promoted for sharing data”. This raises the question of whether and how governments currently use performance metrics that reward opening up data. What other models could help reward data sharing and publication? In an environment of decreased public funding, are there opportunities for governments to integrate open data publication into existing engagement channels with the public?

Open governance and capacity: Open governance may require capacities in government, but could also contribute new capacities. This can apply to staff, but also to resources such as time or infrastructure. How do governments provide and draw capacity from open governance approaches, and what could be learnt for other open data governance approaches?

Next steps

With this map of data governance aspects as a starting point, we would like to conduct empirical research to explore how open data governance is practised. A growing body of ethnographic research suggests that tech innovations such as algorithmic decision-making, open data, or smart city initiatives are ‘multiples’ — meaning that they can be practised in many ways by different people, arising in various contexts. With such an understanding, we aim to develop empirical case studies of open data governance in practice. Our proposed research approach includes the following steps:
  • Universe mapping: Identifying public sector officials and civil servants involved in deciding how data gets managed, shared and published openly (this helps to get closer to the actual decision-makers, and to learn from them).
  • Describing how and on what basis (legal, organisational & bureaucratic, technological, financial, etc.) people make decisions on what gets published and why.
  • Observing and describing different approaches to open data governance, looking at enabling and limiting factors of opening up data.
  • Describing gaps and areas for improvement with regard to open data governance, as well as best practices.
This may surface how open data governance becomes salient for governments, under what circumstances and why. If you are a government official, or civil servant working with (open) data, and would like to share your experiences, we would like to hear from you!  

Celebrating the public domain in 2019

- January 29, 2019 in open culture, Open GLAM, OpenGLAM, Policy, Public Domain

2019 is a special year for the public domain, the out-of-copyright material that everyone is free to enjoy, share, and build upon without restriction. Normally, each year on the 1st of January a selection of works (books, films, artworks, musical scores and more) enters the public domain because their copyright expires – most commonly 70 years after the creator’s death, depending on where in the world you are.

This year, for the first time in more than twenty years, new material entered the public domain in the US, namely all works that were published in the year 1923. Because of a 20-year copyright term extension enacted in 1998, the last new release of public domain material in the US had been in 1998, covering all works dating from 1922. But from now on, each year we can expect to see a new batch of material freed of copyright restrictions (so content from the year 1924 will become available from 2020 onwards, content from 1925 in 2021, and so on). This is good news for everyone, since the availability of such open cultural data enables citizens from across the world to enjoy this material, understand their cultural heritage and re-use it to produce new works of art.

The Public Domain Review, an online journal and not-for-profit project dedicated to promoting and celebrating the public domain, curated their Class of 2019: a top pick of artists and writers whose works entered the public domain this year. A full overview of the 2019 release is available here. A great way to celebrate this public domain content in 2019 could be to organise events, workshops or hackathons using this material on Open Data Day, the annual celebration of open data on Saturday 2 March 2019. If you are planning an event, you can add it to the global map via the Open Data Day registration form.

Coinciding with this mass release of public domain works, the Public Domain Manifesto that was produced within the context of COMMUNIA, the European Thematic Network on the digital public domain, has now been made available via a renewed website at publicdomainmanifesto.org. Describing public domain material as “raw material from which new knowledge is derived and new cultural works are created”, the manifesto aims to stress the importance of the wealth of the public domain to both citizens and policy-makers, to make sure its legal basis remains strong and everyone will be able to access and reuse the material in the future. The manifesto describes the key principles that are needed to actively maintain the public domain and the voluntary commons in our society, for example keeping public domain works in the public domain by not claiming exclusive rights to technical reproductions of works. It also formulates a number of recommendations to protect the public domain from legal obstacles and ensure it can function to the benefit of education, cultural heritage and scientific research in a meaningful way. There are currently over 3,000 signatures to the manifesto, but additional support is important to strengthen the movement: you can show your support by signing the Public Domain Manifesto here.
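The term arithmetic behind these dates is easy to sketch. Below is a minimal Python illustration of the two rules mentioned above – the ‘life plus 70 years’ rule and the 95-years-from-publication rule that applies to US works of this era. It is a deliberate simplification: real public domain determinations depend on jurisdiction, work type and many exceptions.

```python
def us_pd_year(publication_year: int) -> int:
    """US works published 1923-1977: copyright runs 95 years from
    publication, so the work enters the public domain on 1 January
    of the following year. (Simplified; renewal and notice rules ignored.)"""
    return publication_year + 95 + 1

def life_plus_70_pd_year(death_year: int) -> int:
    """'Life plus 70' rule used in much of the world: works enter the
    public domain on 1 January following 70 years after the creator's death."""
    return death_year + 70 + 1

# Works published in 1923 entered the US public domain on 1 January 2019;
# 1924 follows in 2020, 1925 in 2021, and so on.
assert us_pd_year(1923) == 2019
assert us_pd_year(1925) == 2021
```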

What data counts in Europe? Towards a public debate on Europe’s high value data and the PSI Directive

- January 16, 2019 in Open Government Data, Open Standards, Policy, research

This blogpost was co-authored by Danny Lämmerhirt and Pierre Chrzanowski (*author note at the bottom). January 22 will mark a crucial moment for the future of open data in Europe. That day, the final trilogue between the European Commission, Parliament, and Council is planned to decide on the ratification of the updated PSI Directive. Among other things, the European institutions will decide what counts as ‘high value’ data. What essential information should be made available to the public, and how those data infrastructures should be funded and managed, are critical questions for the future of the EU. As we discuss below, there are many ways one might envision the collective ‘value’ of those data. This is a democratic question, and we should not be satisfied with a vague and broadly defined proposal. We therefore propose to organise a public debate to collectively define what counts as high value data in Europe.

What does PSI Directive say about high value datasets?  

The European Commission provides several hints in the current revision of the PSI Directive on how it envisions high value datasets. They are determined by one of the following ‘value indicators’:
  • The potential to generate significant social, economic, or environmental benefits,
  • The potential to generate innovative services,
  • The number of users, in particular SMEs,
  • The revenues they may help generate,
  • The data’s potential for being combined with other datasets,
  • The expected impact on the competitive situation of public undertakings.
Given the strategic role of open data for Europe’s Digital Single Market, these indicators are not surprising. But as we discuss below, there are several challenges in defining them, and there are different ways of understanding the importance of data. The annex of the PSI Directive also includes a list of preliminary high value data, drawing primarily from the key datasets defined by Open Knowledge International’s (OKI’s) Global Open Data Index, as well as the G8 Open Data Charter Technical Annex. See the proposed list of categories and high-value datasets in the table below.
Category – Description
1. Geospatial data – Postcodes, national and local maps (cadastral, topographic, marine, administrative boundaries).
2. Earth observation and environment – Space and in-situ data (monitoring of the weather and of the quality of land and water, seismicity, energy consumption, the energy performance of buildings and emission levels).
3. Meteorological data – Weather forecasts, rain, wind and atmospheric pressure.
4. Statistics – National, regional and local statistical data with main demographic and economic indicators (gross domestic product, age, unemployment, income, education).
5. Companies – Company and business registers (lists of registered companies, ownership and management data, registration identifiers).
6. Transport data – Public transport timetables for all modes of transport, information on public works and the state of the transport network, including traffic information.
According to the proposal, regardless of who provides them, these datasets shall be available free of charge, machine-readable and accessible for download, and, where appropriate, via APIs. The conditions for re-use shall be compatible with open standard licences.
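To make those requirements concrete: in practice, ‘machine-readable and openly licensed’ is often expressed through a dataset descriptor published alongside the data. Below is a minimal sketch in the style of the Frictionless Data datapackage.json convention – one option among several (DCAT is another); the dataset name, file path and fields are invented for illustration.

```python
import json

# A minimal, hypothetical dataset descriptor in the Frictionless Data
# "Data Package" style: machine-readable, downloadable, openly licensed.
descriptor = {
    "name": "public-transport-timetables",  # invented example dataset
    "licenses": [
        {"name": "CC0-1.0", "title": "Creative Commons Zero v1.0"}
    ],
    "resources": [{
        "name": "timetables",
        "path": "timetables.csv",           # bulk download location
        "format": "csv",
        "schema": {"fields": [
            {"name": "route_id", "type": "string"},
            {"name": "stop_id", "type": "string"},
            {"name": "departure_time", "type": "time"},
        ]},
    }],
}

# Write the descriptor next to the data file so harvesters and APIs
# can discover the licence and schema automatically.
with open("datapackage.json", "w") as f:
    json.dump(descriptor, f, indent=2)
```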

Towards a public debate on high value datasets at EU level

There have been attempts by EU member states to define what constitutes high-value data at the national level, with different results. In Denmark, ‘basic data’ has been defined as the five core types of information that public authorities use in their day-to-day case processing and should release. In France, the law for a Digital Republic aims to make available reference datasets that have the greatest economic and social impact. In Estonia, the country relies on the X-Road infrastructure to connect core public information systems, but most of the data remains restricted.

Now is the time for a shared and common definition of what constitutes high-value datasets at the EU level. And this implies an agreement on how we should define them. However, as it stands, there are several issues with the value indicators that the European Commission proposes. For example, how does one define the data’s potential for innovative services? How can revenue gains be confidently attributed to the use of open data? How does one assess and compare the social, economic, and environmental benefits of opening up data? Anyone designing these indicators must be very cautious, as metrics to compare social, economic, and environmental benefits may come with methodological biases. Research has found, for example, that comparing economic and environmental benefits can unfairly favour data of economic value at the expense of fuzzier social benefits, as economic benefits are often more easily quantifiable and definable by default.

One form of debating high value datasets could be to discuss what data is currently published by governments and why. For instance, with their Global Open Data Index, Open Knowledge International has long advocated for the publication of disaggregated, transactional spending figures. Another example is OKI’s Open Data for Tax Justice initiative, which sought to shape the requirements for multinational companies to report their activities in each country (so-called ‘country-by-country reporting’) and to influence a standard for publicly accessible key data.

A public debate on high value data should also critically examine the European Commission’s considerations regarding the distortion of competition. What market dynamics are engendered by opening up data? To what extent do existing markets rely on scarce and closed information? Does closed data bring about market failure, as some argue (Zinnbauer 2018)? Could it otherwise hamper fair price mechanisms (for a discussion of these dynamics in open access publishing, see Lawson and Gray 2016)? How would open data change existing market dynamics? Which actors claim that opening data could distort markets, and whose interests do they represent?

Lastly, the European Commission does not yet consider cases of government agencies generating revenue from selling particularly valuable data. The Dutch national company register has long been such a case, as has the German Weather Service. Beyond considering competition, a public debate around high value data should take into account how marginal cost recovery regimes currently work.

What we want to achieve

For these reasons, we want to organise a public discussion to collectively define
  i) What should count as a high value dataset, and based on what criteria,
  ii) What information high value datasets should include, and
  iii) What the conditions for access and re-use should be.
The PSI Directive will set the baseline for open data policies across the EU. We are therefore at a critical moment to define what European societies value as key public information. What is at stake is not only a question of economic impact, but the question of how to democratise European institutions, and the role the public can play in determining what data should be opened.

How you can participate

  1. We will use the Open Knowledge forum as main channel for coordination, exchange of information and debate. To join the debate, please add your thoughts to this thread or feel free to start a new discussion for specific topics.
  2. We gather proposals for high value datasets in this spreadsheet. Please feel free to use it as a discussion document, where we can crowdsource alternative ways of valuing data.
  3. We use the PSI Directive Data Census to assess the openness of high value datasets.
We also welcome any references to scientific papers, blog posts, etc. discussing the issue of high-value datasets. Once we have gathered suggestions for high value datasets, we would like to assess how open the proposed datasets are. This will help to provide European countries with a diagnosis of the openness of key data.

Author note: Danny Lämmerhirt is a senior researcher on open data, data governance, data commons, and metrics to improve open governance. He formerly worked with Open Knowledge International, where he led its research activities, including the methodology development of the Global Open Data Index 2016/17. His work focuses, among other things, on the role of metrics for open government, and the effects metrics have on the way institutions work and make decisions. He has supervised and edited several pieces on this topic, including the Open Data Charter’s Measurement Guide. Pierre Chrzanowski is a Data Specialist with the World Bank Group and a co-founder of the Open Knowledge France local group. As part of his work, he developed the Open Data for Resilience Initiative (OpenDRI) Index, a tool to assess the openness of key datasets for disaster risk management projects. He also participated in the impact assessment prior to the new PSI Directive proposal and has contributed to the Global Open Data Index as well as the Web Foundation’s Open Data Barometer.

Europe’s proposed PSI Directive: A good baseline for future open data policies?

- June 21, 2018 in eu, licence, Open Data, Open Government Data, Open Standards, Policy, PSI, research

Some weeks ago, the European Commission proposed an update of the PSI Directive**. The PSI Directive regulates the reuse of public sector information (including administrative government data), and has important consequences for the development of Europe’s open data policies. Like every legislative proposal, the PSI Directive proposal is open for public feedback until July 13. In this blog post Open Knowledge International presents what we think are necessary improvements to make the PSI Directive fit for Europe’s Digital Single Market.

In a guest blogpost, Ton Zijlstra outlined the changes to the PSI Directive. Another blog post, by Ton Zijlstra and Katleen Janssen, helps to understand the historical background and puts the changes into context. While it makes some improvements, we think the current proposal is a missed opportunity that does not support the creation of a Digital Single Market and can pose risks for open data. In what follows, we recommend changes to the European Parliament and the European Council, discuss actions civil society may take to engage with the directive in the future, and explain the reasoning behind our recommendations.

Recommendations to improve the PSI Directive

Based on our assessment, we urge the European Parliament and the Council to amend the proposed PSI Directive to ensure the following:
  • When defining high-value datasets, the PSI Directive should not rule out data generated under market conditions. A stronger requirement must be added to Article 13 to make assessments of economic costs transparent, and weigh them against broader societal benefits.
  • The public must have access to the methods, meeting notes, and consultations to define high value data. Article 13 must ensure that the public will be able to participate in this definition process to gather multiple viewpoints and limit the risks of biased value assessments.
  • Beyond tracking proposals for high-value datasets in the EU’s Interinstitutional Register of Delegated Acts, the public should be able to suggest new delegated acts for high-value datasets.  
  • The PSI Directive must make clear what “standard open licences” are, by referencing the Open Definition, and explicitly recommending the adoption of Open Definition compliant licences (from Creative Commons and Open Data Commons) when developing new open data policies. The directive should give preference to public domain dedication and attribution licences in accordance with the LAPSI 2.0 licensing guidelines.
  • Governments of EU member states that already have policies on specific licences in use should be required to add legal compatibility tests with other open licences to these policies. We suggest following the recommendations outlined in the LAPSI 2.0 resources to run such compatibility tests.
  • High-value datasets must be reusable with the fewest restrictions possible, subject at most to requirements that preserve provenance and openness. Currently the European Commission risks creating use silos if governments are allowed to add “any restrictions on re-use” to the use terms of high-value datasets.
  • Publicly funded undertakings should only be able to charge marginal costs.
  • Public undertakings, publicly funded research facilities and non-executive government branches should be required to publish data referenced in the PSI Directive.

Conformant licences according to the Open Definition, opendefinition.org/licenses

Our recommendations do not pose unworkable requirements or a disproportionately high administrative burden, but are essential to realise the goals of the PSI Directive with regard to:
  1. Increasing the amount of public sector data available to the public for re-use,
  2. Harmonising the conditions for non-discrimination, and re-use in the European market,
  3. Ensuring fair competition and easy access to markets based on public sector information,
  4. Enhancing cross-border innovation, and an internal market where Union-wide services can be created to support the European data economy.

Our recommendations, explained: What would the proposed PSI Directive mean for the future of open data?

Publication of high-value data

The European Commission proposes to define a list of ‘high value datasets’ that shall be published under the terms of the PSI Directive. This includes publishing datasets in machine-readable formats, under standard open licences, and in many cases free of charge, except when high-value datasets are collected by public undertakings in environments where free access to data would distort competition. “High value datasets” are defined as documents that bring socio-economic benefits, “notably because of their suitability for the creation of value-added services and applications, and the number of potential beneficiaries of the value-added services and applications based on these datasets”. The EC also makes reference to existing high value datasets, such as the list of key data defined by the G8 Open Data Charter. Identifying high-value data poses at least three problems:
  1. High-value datasets may be unusable in a digital Single Market: The EC may “define other applicable modalities”, such as “any conditions for re-use”. There is a risk that a list of EU-wide high value datasets also includes use restrictions violating the Open Definition. Given that a list of high value datasets will be transposed by all member states, adding “any conditions” may significantly hinder the reusability and ability to combine datasets.
  2. Defining the value of data is not straightforward. Recent papers from Oxford University, Open Data Watch and the Global Partnership for Sustainable Development Data demonstrate disagreement about what data’s “value” is. What counts as high value data should not only be based on quantitative indicators such as growth indicators, numbers of apps or numbers of beneficiaries, but also use qualitative assessments and expert judgement from multiple disciplines.
  3. Public deliberation and participation is key to defining high value data and avoiding biased value assessments. Impact assessments and cost-benefit calculations come with their own methodological biases, and can unfairly favour data with economic value at the expense of fuzzier social benefits. Currently, the PSI Directive rules out treating data created under market conditions as high value data where this would distort market conditions. We recommend that the PSI Directive add a stronger requirement to weigh economic costs against societal benefits, drawing from multiple assessment methods (see point 2). The criteria, methods, and processes used to determine high value must be transparent and accessible to the broader public, to enable the public to negotiate benefits and to reflect the viewpoints of many stakeholders.

Expansion of scope

The new PSI Directive takes into account data from “public undertakings”. This includes services in the general interest entrusted to entities outside of the public sector, over which government maintains a high degree of control. The PSI Directive also includes data from non-executive government branches (i.e. from the legislative and judiciary branches of government), as well as data from publicly funded research. Opportunities and challenges include:
  • None of the data holders newly included in the PSI Directive is obliged to publish data; publication is at their discretion. Only if they choose to publish data must they follow the guidelines of the proposed PSI Directive.
  • The PSI Directive aims to keep administrative costs low: all of the above-mentioned data sectors are exempt from data access requests.
  • In summary, the proposed PSI Directive leaves too much space for individual choices about publishing data and has no “teeth”. To accelerate the publication of general interest data, the PSI Directive should oblige data holders to publish data. Waiting several years to make the publication of this data mandatory, as happened with the first version of the PSI Directive, risks significantly hampering the availability of key data that is important for accelerating growth in Europe’s data economy.
  • For research data in particular, only data that is already published should fall under the new directive. Even though the PSI Directive will require member states to develop open access policies, the implementation thereof should be built upon the EU’s recommendations for open access.

Legal incompatibilities may jeopardise the Digital Single Market

Most notably, the proposed PSI Directive does not address the problems around licensing which are a major impediment to Europe’s Digital Single Market. Europe’s data economy can only benefit from open data if licence terms are standardised. Standardisation allows data from different member states to be combined without legal issues, and makes it possible to combine datasets, create cross-country applications, and spark innovation. Europe’s licensing ecosystem is a patchwork of many (possibly conflicting) terms, creating use silos and legal uncertainty. But the current proposal not only speaks vaguely about standard open licences and makes national policies responsible for adding “less restrictive terms than those outlined in the PSI Directive”; it also contradicts its aim of smoothing the digital Single Market by encouraging the creation of bespoke licences, suggesting that governments may add new licence terms with regard to real-time data publication. Currently the PSI Directive would allow the European Commission to add “any conditions for re-use” to high-value datasets, thereby inviting legal incompatibilities (see Article 13 (4.a)). We strongly recommend that the PSI Directive draw on the EU co-funded LAPSI 2.0 recommendations to understand licence incompatibilities and ensure a compatible open licence ecosystem.

I’d like to thank Pierre Chrzanowski, Mika Honkanen, Susanna Ånäs, and Sander van der Waal for their thoughtful comments while writing this blogpost.

Image adapted from Max Pixel

** Its official name is Directive 2003/98/EC on the reuse of public sector information.

New Report: Avoiding data use silos – How governments can simplify the open licensing landscape

- December 14, 2017 in licence, Open Data, Policy, research

Licence proliferation continues to be a major challenge for open data. When licensors decide to create custom licences instead of using standard open licences, this creates a number of problems. Users of open data may find it difficult and cumbersome to understand all the legal arrangements. More importantly, legal uncertainties and compatibility issues between many different licences can have chilling effects on the reuse of data. This can create ‘data use silos’: a situation where users are legally allowed to combine only some data with one another, as most data would be legally impossible to use under the same terms. This counteracts efforts such as the European Digital Single Market strategy, prevents the free flow of (public sector) information and impedes the growth of data economies. Standardised licences can smooth reuse by clearly stating usage rights.

Our latest report, ‘Avoiding data use silos – How governments can simplify the open licensing landscape’, explains why reusable standard licences, or putting the data in the public domain, are the best options for governments. While the report focuses on government, many of the recommendations also apply to public sector bodies as well as publishers of works more broadly.

The lack of centralised coordination within governments is a key driver of licence proliferation. Different phases along the licensing process influence governments’ choices about which open licences to apply – including clearance of copyright, policy development, and the development and application of individual licences. Our report also outlines how governments can harmonise decision-making around open licences and ensure their compatibility. We aim to provide the ground for a renewed discussion about what good open licensing means – and to inspire follow-up research on specific blockages to open licensing. We propose the following best practices and recommendations for governments who wish to make their public sector information as reusable as possible:
  1. Where data is exempt from copyright or similar rights, publish clear notices that concisely inform users about their rights to reuse, combine and distribute the information.
  2. Align licence policies via inter-ministerial committees and collaborations with representative bodies for lower administrative levels. Consider appointing an agency to oversee and review licensing decisions.
  3. Precisely define reusable standard licences in your policy tools, and clearly define a small number of highly compatible legal solutions. We recommend putting data into the public domain using Creative Commons Zero, or applying a standard open licence like Creative Commons BY 4.0.
  4. If you still opt to use custom licences, carefully verify whether provisions cause incompatibilities with other licences. Add compatibility statements explicitly naming the licences and licence versions compatible with a custom licence, and keep the licence text short, simple, and reader-friendly (a sketch of such a statement follows below).
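A compatibility statement along the lines of recommendation 4 can also be made machine-readable, so that reusers can check licence combinations programmatically. The sketch below assumes an invented custom licence and a hand-maintained compatibility table using SPDX-style identifiers; it is an illustration, not a legal determination.

```python
# Hypothetical machine-readable compatibility statement for a custom licence.
# The licence name and the table entries are invented for illustration.
CUSTOM_LICENCE = "Examplestan-Open-Licence-1.0"

COMPATIBILITY = {
    CUSTOM_LICENCE: {
        "compatible_with": ["CC-BY-4.0", "ODC-By-1.0", "CC0-1.0"],
        "incompatible_with": ["CC-BY-NC-4.0"],  # non-commercial terms conflict
    }
}

def can_combine(licence_a: str, licence_b: str) -> bool:
    """Return True if datasets under these two licences may be combined,
    according to the (illustrative) compatibility table."""
    entry = COMPATIBILITY.get(licence_a, {})
    return licence_b in entry.get("compatible_with", [])

print(can_combine(CUSTOM_LICENCE, "CC-BY-4.0"))     # True
print(can_combine(CUSTOM_LICENCE, "CC-BY-NC-4.0"))  # False
```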

Custom licences used across a sample of 20 governments

Who Will Shape the Future of the Data Society?

- October 5, 2016 in data infrastructures, Events, Featured, Featured Project, iodc16, Open Data, Open Government Data, Policy, research

This piece was originally posted on the blog of the International Open Data Conference 2016, which takes place in Madrid, 6-7th October 2016.

The contemporary world is held together by a vast and overlapping fabric of information systems. These information systems do not only tell us things about the world around us. They also play a central role in organising many different aspects of our lives. They are not only instruments of knowledge, but also engines of change. But what kind of change will they bring?

Contemporary data infrastructures are the result of hundreds of years of work and thought. In charting the development of these infrastructures we can learn about the rise and fall not only of the different methods, technologies and standards implicated in the making of data, but also about the articulation of different kinds of social, political, economic and cultural worlds: different kinds of “data worlds”. Beyond the rows and columns of data tables, the development of data infrastructures tells tales of the emergence of the world economy and global institutions; different ways of classifying populations; different ways of managing finances and evaluating performance; different programmes to reform and restructure public institutions; and how all kinds of issues and concerns are rendered into quantitative portraits in relation to which progress can be charted – from gender equality to child mortality, biodiversity to broadband access, unemployment to urban ecology.

The transnational network assembled in Madrid for the International Open Data Conference has the opportunity to play a significant role in shaping the future of these data worlds. Many of those present have made huge contributions towards an agenda of opening up datasets and developing capacities to use them. Thanks to these efforts there is now global momentum around open data amongst international organisations, national governments, local administrations and civil society groups – which will have an enduring impact on how data is made public.

Perhaps, around a decade after the first stirrings of interest in what we now know as “open data”, it is time to have a broader conversation around not only the opening up and use of datasets, but also the making of data infrastructures: what issues are rendered into data and how, and the kinds of dynamics of collective life that these infrastructures give rise to. How might we increase public deliberation around the calibration and direction of these engines of change? Anyone involved with the creation of official data will be well aware that this is not a trivial proposition. Not least because of the huge amount of effort and expense that can be incurred in everything from developing standards and commissioning IT systems to organising consultation processes and running the social, technical and administrative systems required to create and maintain even the smallest and simplest of datasets. Reshaping data worlds can be slow and painstaking work. But unless we instate processes to ensure alignment between data infrastructures and the concerns of their various publics, we risk sustaining systems which are at best disconnected from, and at worst damaging towards, those whom they are intended to benefit. What might such social shaping of data infrastructures look like?
Luckily there is no shortage of recent examples – from civil society groups campaigning for changes in existing information systems (such as advocacy around the UK’s company register), to cases of citizen and civil society data leading to changes in official data collection practices, to the emergence of new tools and methods to work with, challenge and articulate alternatives to official data. Official data can also be augmented by “born digital” data derived from a variety of different platforms, sources and devices, which can be creatively repurposed in the service of studying and securing progress around different issues.

While there is a great deal of experimentation with data infrastructures “in the wild”, how might institutions learn from these initiatives in order to make public data infrastructures more responsive to their publics? How can we open up new spaces for participation and deliberation around official information systems at the same time as building on the processes and standards which have developed over decades to ensure the quality, integrity and comparability of official data? How might participatory design methods be applied to involve different publics in the making of public data? How might official data be layered with other “born digital” data sources to develop a richer picture around issues that matter? How do we develop the social, technical and methodological capacities required to enable more people to take part not just in using datasets, but also in reshaping data worlds?

Addressing these questions will be crucial to the development of a new phase of the open data movement – from the opening up of datasets to the opening up of data infrastructures. Public institutions may find they have not only new users, but new potential contributors and collaborators, as the sites where public data is made begin to multiply and extend outside of the public sector – raising new issues and challenges related to the design, governance and political economics of public information systems. The development of new institutional processes, policies and practices to increase democratic engagement around data infrastructures may be more time-consuming than some of the comparatively simpler steps that institutions can take to open up their datasets. But further work in this area is vital to secure progress on a wide range of issues – from tackling tax base erosion to tracking progress towards commitments made at the recent Paris climate negotiations.

As a modest contribution to advancing research and practice around these issues, a new initiative called the Public Data Lab is forming to convene researchers, institutions and civil society groups with an interest in the making of data infrastructures, as well as the development of capacities that are required for more people to not only take part in the data society, but also to more meaningfully participate in shaping its future.

Open Access: Why do scholarly communication platforms matter and what is the true cost of gold OA?

- July 15, 2016 in Featured, Open Access, Open Research, Open Science, openmaccess, Our Work, PASTEUR4OA, Policy

For the past 2.5 years, Open Knowledge has been a partner in PASTEUR4OA, a project focused on aligning open access policies for European Union research. As part of the work, a series of advocacy resources was produced that stakeholders can use to promote the development and reinforcement of such open access policies. The final two briefing papers, written by Open Knowledge, were published this week and deal with two pressing issues around open access today: the financial opacity of open access publishing and its potentially harmful effects on the research community, and the expansion of open and free scholarly communication platforms in the academic world – explaining the new dependencies that may arise from those platforms and why this matters for the open access movement.

Revealing the true cost of gold OA

“Reducing the costs of readership while increasing access to research outputs” has been a rallying cry for open access publishing, or Gold OA. Yet the Gold OA market is largely opaque, making it hard to evaluate how the costs of readership actually develop. Data on both the costs of subscriptions (for hybrid OA journals) and of article processing charges (APCs) are hard to gather, and where they can be obtained, they offer only partial and very different insights into the market. This is a problem for efficient open access publishing. Funders, institutions, and individual researchers are therefore increasingly concerned that a transition to Gold OA could leave the research community open to exploitative financial practices and prevent effective market coordination.

Which factors contribute to the current opacity in the market? Which approaches are being taken to foster the financial transparency of Gold OA? And what recommendations can be made to funders, institutions, researchers and publishers to increase transparency? The paper Revealing the true costs of Gold OA – Towards a public data infrastructure of scholarly publishing costs, written by researchers at Open Knowledge International, King’s College London and the University of London, presents the current state of financial opacity in scholarly journal publishing. It describes what information is needed in order to obtain a bigger, more systemic picture of financial flows, to understand how much money is going into the system, where this money comes from, and how these financial flows might be adjusted to support alternative kinds of publishing models.
 

 Why do scholarly communication platforms matter for open access?

Over the past two decades, open access advocates have made significant gains in securing public access to the formal outputs of scholarly communication (e.g. peer-reviewed journal articles). The same period has seen the rise of platforms from commercial publishers and technology companies that enable users to interact and share their work, as well as providing analytics and services around scholarly communication.
How should researchers and policymakers respond to the rise of these platforms? Do commercial platforms necessarily work in the interests of the scholarly community? How, and to what extent, do these proprietary platforms pose a threat to open scholarly communication? What might public alternatives look like?
The paper Infrastructures for Open Scholarly Communication provides a brief overview of the rise of scholarly platforms – describing some of their main characteristics as well as the debates and controversies surrounding them. It argues that in order to prevent new forms of enclosure, it is essential that public policymakers concern themselves with the provision of public infrastructures for scholarly communication as well as public access to the outputs of research. It concludes with a review of some of the core elements of such infrastructures, as well as recommendations for further work in this area.