
Transforming the UK’s data ecosystem: Open Knowledge Foundation’s thoughts on the National Data Strategy

- July 17, 2019 in National Data Strategy, Open Data, Open Government Data, Open Knowledge, Policy

Following an open call for evidence issued by the UK’s Department for Digital, Culture, Media and Sport, Open Knowledge Foundation submitted our thoughts about what the UK can do in its forthcoming National Data Strategy to “unlock the power of data across government and the wider economy, while building citizen trust in its use”. We also signed a joint letter alongside other UK think tanks, civil and learned societies calling for urgent action from government to overhaul its use of data. Below, our CEO Catherine Stihler explains why the National Data Strategy needs to be transformative to ensure that British businesses, citizens and public bodies can play a full role in the interconnected global knowledge economy of today and tomorrow:

Today’s digital revolution is driven by data. It has opened up extraordinary access to information for everyone about how we live, what we consume, and who we are. But large, unaccountable technology companies have also monopolised the digital age, and an unsustainable concentration of wealth and power has led to stunted growth and lost opportunities.

Governments across the world must now work harder to give everyone access to key information and the ability to use it to understand and shape their lives, as well as making powerful institutions more accountable and ensuring that vital research information which can help us tackle challenges such as poverty and climate change is available to all. In short, we need a future that is fair, free and open.

The UK has a golden opportunity to lead by example, and the Westminster government is currently developing a long-anticipated National Data Strategy. Its aim is to ensure all citizens and organisations trust the data ecosystem, are sufficiently skilled to operate effectively within it, and can get access to high-quality data when they need it. These are laudable aims, but they must come with a clear commitment to invest in better data and skills.
The Open Knowledge Foundation I am privileged to lead was launched 15 years ago to pioneer the way that we use data, working to build open knowledge in government, business and civil society, and creating the technology to make open material useful. This week, we have joined with a group of think tanks, civil and learned societies to make a united call for sweeping reforms to the UK’s data landscape.

For the strategy to succeed, there needs to be transformative, not incremental, change, and there must be leadership from the very top, with buy-in from the next Prime Minister, Culture Secretary and head of the civil service. All too often, piecemeal incentives across Whitehall prevent better use of data for the public benefit. A letter signed by the Open Knowledge Foundation, the Institute for Government, Full Fact, Nesta, the Open Data Institute, mySociety, the Royal Statistical Society, the Open Contracting Partnership, 360Giving, OpenOwnership, and the Policy Institute at King’s College London makes this clear.

We have called for investment in skills to convert data into real information that can be acted upon; challenged the government to earn the public’s trust, recognising that the debate about how to use citizens’ data must be had in public, with the public; proposed a mechanism for long-term engagement between decision-makers, data users and the public on the strategy and its goals; and called for increased efforts to fix the government’s data infrastructure so organisations outside government can benefit from it.

Separately, we have also submitted our own views to the UK Government, calling for a focus on teaching data skills to the British public. Learning such skills can prove hugely beneficial to individuals seeking employment in a wide range of fields including the public sector, government, media and the voluntary sector.
But at present there is often a huge amount of work required to clean up data to make it usable before insights or stories can be gleaned from it. We believe that the UK government could help empower the wider workforce by instigating or backing a fundamental data literacy training programme, open to local communities working in a range of fields, to strengthen data demand, use and understanding. Without such training and knowledge, large numbers of UK workers will be ill-equipped to take on many jobs of the future, where products and services are devised, built and launched to address issues highlighted by data. Empowering people to make better decisions and choices informed by data will boost productivity, but not without the necessary investment in skills.

We have also told the government that one of the most important things it can do to help businesses and non-profit organisations best share the data they hold is to promote open licensing. Open licences are legal arrangements that grant the general public rights to reuse, distribute, combine or modify works that would otherwise be restricted under intellectual property laws.

We would also like to see the public sector pioneering new ways of producing and harnessing citizen-generated data by organising citizen science projects through schools, libraries, churches and community groups. These local communities could help the government to collect high-quality data relating to issues such as air quality or recycling, while also leading the charge when it comes to increasing the use of central government data.

We live in a knowledge society where we face two different futures: one which is open and one which is closed. A closed future is one where knowledge is exclusively owned and controlled, leading to greater inequality and a closed society.
But an open future means knowledge is shared by all – freely available to everyone, a world where people are able to fulfil their potential and live happy and healthy lives. The UK National Data Strategy must emphasise the importance and value of sharing more, better-quality information and data openly in order to make the most of the world-class knowledge created by our institutions and citizens. Without this commitment at all levels of society, British businesses, citizens and public bodies will fail to play a full role in the interconnected global knowledge economy of today and tomorrow.

Missed opportunities in the EU’s revised open data and re-use of public sector information directive

- July 9, 2019 in European Union, Open Data, Open Government Data, Open Research

Published by the European Union on June 26th, the revised directive on open data and the re-use of public sector information – or PSI Directive – sets out an updated set of rules relating to public sector documents, publicly funded research data and “high-value” datasets which should be made available for free via application programming interfaces (APIs). EU member states have until July 2021 to incorporate the directive into law. While Open Knowledge Foundation is encouraged to see some of the new provisions, we have concerns – many of which we laid out in a 2018 blogpost – about missed opportunities for further progress towards a fair, free and open future across the EU.

[Image: Open data stickers]

Lack of public input

Firstly, the revised directive hands responsibility for choosing which high-value datasets to publish over to member states, but there are no established mechanisms for the public to provide input into these decisions. Broad thematic categories – geospatial; earth observation and environment; meteorological; statistics; companies and company ownership; and mobility – are set out for these datasets, but the specifics will be determined over the next two years via a series of further implementing acts. Datasets eventually deemed to be high-value shall be made “available free of charge … machine readable, provided via APIs and provided as a bulk download, where relevant”.

Despite drawing on our Global Open Data Index to generate a preliminary list of high-value datasets, this decision flies in the face of years of findings from the Index showing how important it is for governments to engage with the public as much and as early as possible to generate awareness and increase levels of reuse of open data.
We fear that this could lead to a further loss of public trust by opening the door for special interests, lobbyists and companies to make private arguments against the release of valuable datasets like spending records or beneficial ownership data, which is often highly disaggregated and allows monetary transactions to be linked to individuals.

Partial definition of high-value data

Secondly, defining the value of data is not straightforward. Papers from Oxford University to Open Data Watch and the Global Partnership for Sustainable Development Data demonstrate disagreement about what data’s “value” is. What counts as high-value data should not only be based on quantitative indicators such as potential income generation, breadth of business applications or numbers of beneficiaries – as the revised directive sets out – but should also draw on qualitative assessments and expert judgment from multiple disciplines.

Currently, less than a quarter of the data with the biggest potential for social impact is available as truly open data, even from countries seen as open data leaders, according to the latest Open Data Barometer report from our colleagues at the World Wide Web Foundation. Why? Because “governments are not engaging enough with groups beyond the open data and open government communities”.

Lack of clarity on recommended licences

Thirdly, in line with the directive’s stated principle of being “open by design and by default”, we hope to see countries avoiding future interoperability problems by abiding by the requirement to use open standard licences when publishing these high-value datasets. It’s good to see that the European Commission itself has recently adopted Creative Commons licences when publishing its own documents and data.
But we feel – in line with our friends at Communia – that the Commission should have made clear exactly which open licences it endorses under the updated directive, by explicitly recommending that member states adopt Open Definition-compliant licences from Creative Commons or Open Data Commons. The directive also missed the opportunity to give preference to public domain dedications and attribution licences in accordance with the EU’s own LAPSI 2.0 licensing guidelines, as we recommended.

The European Data Portal indicates that there could be up to 90 different licences currently used by national, regional or municipal governments. Its quality assurance report also shows that it cannot automatically detect the licences used to publish the vast majority of datasets published by open data portals from EU countries. If the portal can’t work this out, the public certainly won’t be able to: meaning that any and all efforts to use newly released data will be restrained by unnecessarily onerous reuse conditions. The more complicated or bespoke the licensing, the more likely data will end up unused in silos, our research has shown.

27 of the 28 EU member states may now have national open data policies and portals but, once datasets are discovered, it is currently likely that – in addition to confusing licensing – they lack interoperability. For while the EU has substantial programmes of work on interoperability under the European Interoperability Framework, these are not yet having a major impact on the interoperability of open datasets.
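To illustrate the kind of automated licence detection that quality report describes, here is a minimal sketch of matching free-text licence metadata against a list of Open Definition-conformant licences. The identifiers, alias table and sample records below are our own illustrative assumptions, not the European Data Portal’s actual data or method:

```python
# Sketch: flag dataset licences that cannot be matched to a known
# Open Definition-conformant licence. The conformant list and the
# alias table are illustrative assumptions, not an official registry.
OPEN_CONFORMANT = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0", "ODC-ODbL-1.0"}

ALIASES = {
    "creative commons attribution 4.0": "CC-BY-4.0",
    "cc0": "CC0-1.0",
    "odbl": "ODC-ODbL-1.0",
}

def classify_licence(raw):
    """Return (normalised_id, status) for a raw licence string."""
    if not raw or not raw.strip():
        return None, "missing"
    key = raw.strip().lower()
    norm = ALIASES.get(key, raw.strip())
    if norm in OPEN_CONFORMANT:
        return norm, "open"
    return norm, "unknown"  # bespoke or undetectable licence text

datasets = [
    {"title": "Spending 2019", "licence": "cc0"},
    {"title": "Company register", "licence": "Conditions d'utilisation"},
    {"title": "Air quality", "licence": ""},
]

for d in datasets:
    norm, status = classify_licence(d["licence"])
    print(f"{d['title']}: {status}")
```

The sketch makes the underlying point concrete: every bespoke licence string falls into the “unknown” bucket, so the more portals deviate from a small set of standard licences, the less any automated reuse pipeline can do with their data.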
Open Knowledge Foundation research report: Avoiding data use silos

More FAIR data

Finally, we welcome the provisions in the directive obliging member states to “[make] publicly funded research data openly available following the principle of open by default and compatible with FAIR principles.” We know there is much work to be done, but we hope to see wide adoption of these rules and that the provisions for not releasing publicly funded data due to “confidentiality” or “legitimate commercial interests” will not be abused.

The next two years will be a crucial period to engage with these debates across Europe and to make sure that EU countries embrace the directive’s principle of openness by default to release more, better information and datasets to help citizens strive towards a fair, free and open future.


What data counts in Europe? Towards a public debate on Europe’s high value data and the PSI Directive

- January 16, 2019 in Open Government Data, Open Standards, Policy, research

This blogpost was co-authored by Danny Lämmerhirt and Pierre Chrzanowski (*author note at the bottom).

January 22 will mark a crucial moment for the future of open data in Europe. That day, the final trilogue between the European Commission, Parliament and Council is planned to decide on the ratification of the updated PSI Directive. Among other things, the European institutions will decide what counts as ‘high value’ data. What essential information should be made available to the public, and how those data infrastructures should be funded and managed, are critical questions for the future of the EU. As we will discuss below, there are many ways one might envision the collective ‘value’ of those data. This is a democratic question, and we should not be satisfied by an ill-defined and overly broad proposal. We therefore propose to organise a public debate to collectively define what counts as high value data in Europe.

What does the PSI Directive say about high value datasets?

The European Commission provides several hints in the current revision of the PSI Directive on how it envisions high value datasets. They are determined by one of the following ‘value indicators’:
  • The potential to generate significant social, economic, or environmental benefits,
  • The potential to generate innovative services,
  • The number of users, in particular SMEs,
  • The revenues they may help generate,
  • The data’s potential for being combined with other datasets,
  • The expected impact on the competitive situation of public undertakings.
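To see why these indicators are hard to apply, consider a naive scoring sketch. The weights, indicator names and scores below are entirely hypothetical (nothing like this appears in the directive); the point is that any attempt to fold the six indicators into one ranking forces implicit trade-offs between easily quantified measures, such as revenue, and fuzzier social benefits:

```python
# Naive, hypothetical sketch of combining the PSI value indicators
# into a single score. Weights and scores are invented for
# illustration; they are not from the directive.
WEIGHTS = {
    "social_env_benefit": 1.0,
    "innovative_services": 1.0,
    "num_users": 1.0,
    "revenue_potential": 1.0,
    "combinability": 1.0,
    "competition_impact": 1.0,
}

def value_score(indicators, weights=WEIGHTS):
    # indicators: mapping of indicator name -> score in [0, 1];
    # missing indicators count as zero.
    return sum(weights[k] * indicators.get(k, 0.0) for k in weights)

spending = {"social_env_benefit": 0.9, "revenue_potential": 0.2,
            "num_users": 0.3, "combinability": 0.8}
weather = {"social_env_benefit": 0.5, "revenue_potential": 0.9,
           "num_users": 0.9, "combinability": 0.6}

# With equal weights, the commercially attractive dataset ranks
# higher even when the other's social benefit is judged greater.
print(value_score(spending), value_score(weather))
```

Changing the weights changes the ranking, which is exactly why the choice of indicators and their relative importance deserves public debate rather than being settled by implementing acts alone.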
Given the strategic role of open data for Europe’s Digital Single Market, these indicators are not surprising. But as we will discuss below, there are several challenges in defining them. Also, there are different ways of understanding the importance of data.

The annex of the PSI Directive also includes a list of preliminary high value data, drawing primarily from the key datasets defined by Open Knowledge International’s (OKI’s) Global Open Data Index, as well as the G8 Open Data Charter Technical Annex. See the proposed list in the table below.

List of categories and high-value datasets:
Category Description
1. Geospatial Data Postcodes, national and local maps (cadastral, topographic, marine, administrative boundaries).
2. Earth observation and environment Space and in situ data (monitoring of the weather and of the quality of land and water, seismicity, energy consumption, the energy performance of buildings and emission levels).
3. Meteorological data Weather forecasts, rain, wind and atmospheric pressure.
4. Statistics National, regional and local statistical data with main demographic and economic indicators (gross domestic product, age, unemployment, income, education).
5. Companies Company and business registers (list of registered companies, ownership and management data, registration identifiers).
6. Transport data Public transport timetables of all modes of transport, information on public works and the state of the transport network including traffic information.
According to the proposal, regardless of who provides them, these datasets shall be available for free, machine-readable, accessible for download and, where appropriate, via APIs. The conditions for re-use shall be compatible with open standard licences.
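Those conditions can be read as a simple checklist. A minimal sketch, assuming a hypothetical metadata record shape (the field names are ours, not from the directive):

```python
# Sketch: check a dataset record against the proposal's conditions
# for high-value datasets. Field names are hypothetical assumptions
# about how a portal might describe its datasets.
MACHINE_READABLE = {"csv", "json", "xml", "geojson", "parquet"}

def psi_compliant(record):
    """Return a list of unmet conditions for a high-value dataset."""
    problems = []
    if record.get("price", 0) != 0:
        problems.append("not free of charge")
    if record.get("format", "").lower() not in MACHINE_READABLE:
        problems.append("not machine readable")
    if not record.get("api_url"):
        problems.append("no API access")
    if not record.get("bulk_download_url"):
        problems.append("no bulk download")
    if not record.get("open_licence", False):
        problems.append("licence not open standard")
    return problems

record = {"format": "PDF", "price": 0, "api_url": None,
          "bulk_download_url": "https://example.org/dump.zip",
          "open_licence": True}
print(psi_compliant(record))
```

Even this toy checklist shows how much hinges on interpretation: the directive’s “where appropriate” qualifier for APIs, for instance, cannot be encoded without someone first deciding what it means.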

Towards a public debate on high value datasets at EU level

There have been attempts by EU Member States to define what constitutes high-value data at national level, with different results. In Denmark, basic data has been defined as the five core types of information public authorities use in their day-to-day case processing and should release. In France, the law for a Digital Republic aims to make available reference datasets that have the greatest economic and social impact. In Estonia, the country relies on the X-Road infrastructure to connect core public information systems, but most of the data remains restricted.

Now is the time for a shared and common definition of what constitutes high-value datasets at EU level. And this implies an agreement on how we should define them. However, as it stands, there are several issues with the value indicators that the European Commission proposes. For example, how does one define the data’s potential for innovative services? How can revenue gains be confidently attributed to the use of open data? How does one assess and compare the social, economic and environmental benefits of opening up data?

Anyone designing these indicators must be very cautious, as metrics to compare social, economic and environmental benefits may come with methodological biases. Research has found, for example, that comparing economic and environmental benefits can unfairly favour data of economic value at the expense of fuzzier social benefits, as economic benefits are often more easily quantifiable and definable by default.

One form of debating high value datasets could be to discuss what data gets currently published by governments and why. For instance, with their Global Open Data Index, Open Knowledge International has long advocated for the publication of disaggregated, transactional spending figures.
Another example is OKI’s Open Data For Tax Justice initiative, which sought to influence the requirements for multinational companies to report their activities in each country (so-called ‘Country-by-Country Reporting’), and to influence a standard for publicly accessible key data.

A public debate on high value data should critically examine the European Commission’s considerations regarding the distortion of competition. What market dynamics are engendered by opening up data? To what extent do existing markets rely on scarce and closed information? Does closed data bring about market failure, as some argue (Zinnbauer 2018)? Could it otherwise hamper fair price mechanisms (for a discussion of these dynamics in open access publishing, see Lawson and Gray 2016)? How would open data change existing market dynamics? Which actors proclaim that opening data could cause market distortion, and whose interests do they represent?

Lastly, the European Commission does not yet consider cases of government agencies generating revenue from selling particularly valuable data. The Dutch national company register has for a long time been such a case, as has the German Weather Service. Beyond considering competition, a public debate around high value data should take into account how marginal cost recovery regimes currently work.

What we want to achieve

For these reasons, we want to organise a public discussion to collectively define:
  1. What should count as high value datasets, and based on what criteria,
  2. What information high value datasets should include,
  3. What the conditions for access and re-use should be.
The PSI Directive will set the baseline for open data policies across the EU. We are therefore at a critical moment to define what European societies value as key public information. What is at stake is not only a question of economic impact, but the question of how to democratise European institutions, and the role the public can play in determining what data should be opened.

How you can participate

  1. We will use the Open Knowledge forum as the main channel for coordination, exchange of information and debate. To join the debate, please add your thoughts to this thread or feel free to start a new discussion for specific topics.
  2. We are gathering proposals for high value datasets in this spreadsheet. Please feel free to use it as a discussion document, where we can crowdsource alternative ways of valuing data.
  3. We will use the PSI Directive Data Census to assess the openness of high value datasets.
We also welcome any references to scientific papers, blogposts, etc. discussing the issue of high-value datasets. Once we have gathered suggestions for high value datasets, we would like to assess how open the proposed datasets are. This will help to provide European countries with a diagnosis of the openness of key data.

Author note: Danny Lämmerhirt is a senior researcher on open data, data governance, data commons, and metrics to improve open governance. He formerly worked with Open Knowledge International, where he led its research activities, including the methodology development of the Global Open Data Index 2016/17. His work focuses, among other things, on the role of metrics for open government, and the effects metrics have on the way institutions work and make decisions. He has supervised and edited several pieces on this topic, including the Open Data Charter’s Measurement Guide.

Pierre Chrzanowski is a Data Specialist with the World Bank Group and a co-founder of the Open Knowledge France local group. As part of his work, he developed the Open Data for Resilience Initiative (OpenDRI) Index, a tool to assess the openness of key datasets for disaster risk management projects. He has also participated in the impact assessment prior to the new PSI Directive proposal and has contributed to the Global Open Data Index as well as the Web Foundation’s Open Data Barometer.

Paris Peace Forum Hackathon: A new chance to talk about open data

- November 27, 2018 in Events, Open Data, Open Government Data, open-government, paris peace forum

A few weeks ago we had the chance to attend the first edition of the Paris Peace Forum. The goal of this new initiative is to exchange and discuss concrete global governance solutions. More than 10,000 people attended, 65 Heads of State and Government were present, and 10 leaders of international organizations convened over those three days at La Grande Halle de La Villette.

In parallel, the Paris Peace Forum hosted a hackathon to find new approaches to different challenges proposed by four different organizations. Hosted by the awesome Datactivist team, during these three days we worked on: transparency of international organizations’ budgets, transparency of major international event budgets, transparency of public procurement procedures, and communication of financial data to the public. We had about 80 participants: experts in different topics, students from France and people interested in collaborating on building solutions. The approach was simple: let’s look at the problems and see what kind of data will be useful.

Day one

On the first day of the hackathon we got to hear the challenges that each organization had for us. Then we formed teams based on the interests of the participants. This left us with smaller teams that would get to work on their projects along with the mentors. On that first day we also had two Heads of State talk about innovation and technology. The first day concluded with a few ideas of what we wanted to do as well as a better understanding of the data that we could use.

Day two

Day two was the most intense. The teams got to decide what their solution would be and build it, or at least get to a minimum viable product. This was no simple task. Some teams had a hard time deciding what kind of solution they wanted to build.
Some teams made user personas and user stories, some looked at the data and built their solutions from there, and some others started from a very specific set of problems related to their challenge. By the end of this day the teams had to present their projects to the other teams as well as to the mentors, with at least some progress on their final projects.

Day three

Day three was a day full of excitement, not least for the mentors, since we had to take one final project per challenge to present on the main stage of the Paris Peace Forum. During the morning the teams tweaked and fixed their projects and prepared their pitches, then presented to the mentors. Selecting only one final project for each of the challenges was a challenge in itself. But in the end we ended up with four really great projects:
  • Contract Fit – selected by the Open State Foundation
  • Tackling Climate Change – selected by the World Bank
  • LA PORTE – selected by the Open Contracting Partnership
  • Know your chances – selected by ETALAB
Each of these teams presented their projects on the main stage of the Paris Peace Forum. You can see the video here. This was a really interesting first edition of a hackathon at such a high-level event covering such important topics. I was really happy to see so much engagement from both participants and mentors. It was also great to see the amazing job that our hosts did in putting all this together. We expect to see this exercise in innovation become a crucial part of future editions of the Peace Forum.

The next target user group for the open data movement is governments

- September 18, 2018 in Open Data, Open Government Data

Here’s an open data story that might sound a bit counterintuitive. Last month a multinational company was negotiating with an African government to buy an asset. The company, which already owned some of the asset but wanted to increase its stake, said the extra part was worth $6 million. The government’s advisers said it was worth at least three times that. The company disputed that. The two sides met for two days and traded arguments, sometimes with raised voices, but the meeting broke up inconclusively.

A week later, halfway round the world, the company’s headquarters issued a new investor presentation. Like all publicly listed companies, its release was filed with the appropriate stock market regulator and sent out by email newsletter. Which is where a government adviser picked it up. In it, the company advertised to investors how it had increased the value of its asset – the asset discussed in Africa – by four times in 18 months, and gave a current valuation of the asset. Their own valuation, it turned out, was indeed roughly three times the $6 million the company had told the government it was worth.

Probably, the negotiators were not in touch with the investor relations people. But the end result was that the company had blown its negotiating position because, in effect, as a whole institution, it didn’t think a small African government could understand disclosure practice on an international stock market, subscribe to newsletters, or read its website.

The moral of the story is: we need to expand the way we think about governments and open data. In the existing paradigm, governments are seen as the targets of advocacy campaigns, to release data they hold for the public good, and to enact legislation which binds themselves, and others, to release.
Civil society hunts for internal champions within government, international initiatives (EITI, OGP etc) seek to bind governments in to emergent best practice, and investigative journalists and whistleblowers highlight the need for better information through dramatic cases of all the stuff that goes wrong and is covered up. And all of that is as it should be. But what we see regularly in our work at OpenOil is that there is also huge potential to engage government – at all levels – as users of open data. Officials in senior positions are sitting day after day, month after month, trying to make difficult decisions, under the false impression that they have little or no data. Often they lack a clear understanding of, and access to, data produced by other parts of their own government, and they are unaware of the host of broader datasets and systems. Initiatives like EITI, which were founded to serve the public interest in data around natural resources, have found a new and receptive audience in various government departments seeking a joined-up view of their own data. And imagine how it might affect governments’ interaction with advocacy campaigns if they were regular and systematic users of open data and knowledge systems. Suddenly, this would not be a one-way street – governments would be getting something out of open data, not just responding to what, from their perspective, often seems like the incessant demands of activists. It could become more of a mutual backscratching dynamic. There is a paradox at the heart of much government thinking about information. In institutions with secretive cultures, there can be a weird ellipsis of the mind in which information which is secret must be important, and information which is open must be, by definition, worthless.
Working on commercial analysis of assets managed by governments, we often find senior officials who believe they can’t make any progress because their commercial partners, the multinationals, hold all the data and don’t release it. While it is true that there is a stark asymmetry of information, we have half a dozen cases where the questions the government needed to answer could be addressed by data downloadable from the Internet. You have to know where to look, of course. But it’s not rocket science. In one case, a finance ministry official had all the government’s “secret” data sitting on his laptop, but we decided to go ahead and model a major mining project using public statements by the company anyway, because the permissions needed from multiple departments to show the data to anyone else, let alone incorporate it in a model which might be published, would take months or years. Of course, reliance on open data is likely to leave gaps and involves careful questions of interpretation. But our experience is that these have never been deal breakers – we have never had to abandon an analytical project because we couldn’t achieve good enough results with public data. The test of any analytical project is not “is it perfect?” but “does it take us on from where we are now, and can we comfortably state what we think the margins of error are?”. The potential is not confined to the Global South. Government at all levels and in all parts of the world could benefit greatly from more strategic use of open data. And it is in the interest of the open data movement to help them.

A short story about Open Washing

- August 20, 2018 in IODC, iodc18, Open Data, Open Government Data, openwashing

Great news! The International Open Data Conference (IODC) accepted my proposal about Open Washing. The moment I heard this I wanted to write something to invite everyone to our session. It will be a follow-up to the exchange we had during IODC in 2015. First, a couple of disclaimers: This text is not exactly about data. Open Washing is not an easy conversation to have. It’s not a comfortable topic for anyone, whether you work in government or civil society. Sometimes we decide to avoid it (I’m looking at you, OGP Summit!). To prepare this new session I went through the history of our initial conversation. I noticed that my awesome co-host, Ana Brandusescu, summarised everything here. I invite you to read that blogpost and then come back. Or keep reading and then read the other post. Either way, don’t miss Ana’s post. What comes next is a story. I hope this story will illustrate why these uncomfortable conversations are important. Second disclaimer: everything in this story is true. It is a fact that these things happened. Some of them are still happening. It is not a happy story, and I’m sorry if some people might feel offended by me telling it. There was once a country that had a pretty young democracy. That country was ruled by one political party for 70 years and then, 18 years ago, decided it had had enough. Six years ago, that political party came back. They won the presidential election. How this happened is questionable, but that goes beyond the scope of this story right now. When this political party regained power, the technocrats thought it was good news. Some international media outlets thought the new president would even “save” the country. The word “save” may sound like too much, but a big wave of violence had built up over the previous years. Economic development was slow and social issues were boiling. Much of this was linked to corruption at many levels of government. In this context, there seemed to be a light at the end of the tunnel.
The president’s office decided to make open government a priority. Open data would be a tool to promote proactive transparency and economic development. They signed all the international commitments they could. They chaired international spaces for everything transparency related. They set up a team of young and highly prepared professionals to turn all this into reality. But then, the tunnel seemed to extend and the light seemed dimmer. In spite of these commitments, some things that weren’t supposed to happen, happened. Several journalistic investigations uncovered what appeared to be acts of corruption. A government contractor gave the president a $7 million house during the campaign. The government awarded about 450 million USD in irregular contracts. Most of these contracts didn’t even result in actual execution of works or delivery of goods. They spied on people from the civil society groups that collaborated with them. 45 journalists, who play a big role in this story, were murdered in the last 6 years. For doing their job. For asking questions that may be uncomfortable for some people. There is a lot more to the story but I will leave it here. That doesn’t mean it ends here. It’s still happening. It seems like this political party doesn’t care about open washing anymore. They don’t care anymore because they’re leaving. But we should care because we stay. We need to talk and discuss this in the open. The story of this country, my country, is very particular and surreal, but it holds a lot of lessons. This is probably the worst invitation you’ve ever received. But I know there are a lot of lessons and knowledge out there. So if you are around, come to our session during IODC. If you’re not, talk about this issue where you live. Or reach out to others who might be interested. It probably won’t be comfortable, but you will for sure bring a new perspective to your work. This is also an invitation to try it.

Europe’s proposed PSI Directive: A good baseline for future open data policies?

- June 21, 2018 in eu, licence, Open Data, Open Government Data, Open Standards, Policy, PSI, research

Some weeks ago, the European Commission proposed an update of the PSI Directive**. The PSI Directive regulates the reuse of public sector information (including administrative government data), and has important consequences for the development of Europe’s open data policies. Like every legislative proposal, the PSI Directive proposal is open for public feedback until July 13. In this blog post Open Knowledge International presents what we think are necessary improvements to make the PSI Directive fit for Europe’s Digital Single Market. In a guest blogpost Ton Zijlstra outlined the changes to the PSI Directive. Another blog post by Ton Zijlstra and Katleen Janssen helps to understand the historical background and puts the changes into context. While improvements have been made, we think the current proposal is a missed opportunity, does not support the creation of a Digital Single Market, and can pose risks for open data. In what follows, we recommend changes to the European Parliament and the European Council. We also discuss actions civil society may take to engage with the directive in the future, and explain the reasoning behind our recommendations.

Recommendations to improve the PSI Directive

Based on our assessment, we urge the European Parliament and the Council to amend the proposed PSI Directive to ensure the following:
  • When defining high-value datasets, the PSI Directive should not rule out data generated under market conditions. A stronger requirement must be added to Article 13 to make assessments of economic costs transparent, and weigh them against broader societal benefits.
  • The public must have access to the methods, meeting notes, and consultations used to define high-value data. Article 13 must ensure that the public will be able to participate in this definition process to gather multiple viewpoints and limit the risks of biased value assessments.
  • Beyond tracking proposals for high-value datasets in the EU’s Interinstitutional Register of Delegated Acts, the public should be able to suggest new delegated acts for high-value datasets.  
  • The PSI Directive must make clear what “standard open licences” are, by referencing the Open Definition, and explicitly recommending the adoption of Open Definition compliant licences (from Creative Commons and Open Data Commons) when developing new open data policies. The directive should give preference to public domain dedication and attribution licences in accordance with the LAPSI 2.0 licensing guidelines.
  • Governments of EU member states that already have policies specifying particular licences should be required to add legal compatibility tests with other open licences to these policies. We suggest following the recommendations outlined in the LAPSI 2.0 resources to run such compatibility tests.
  • High-value datasets must be reusable with the least restrictions possible, subject at most to requirements that preserve provenance and openness. Currently the European Commission risks creating use silos if governments are allowed to add “any restrictions on re-use” to the use terms of high-value datasets.
  • Publicly funded undertakings should only be able to charge marginal costs.
  • Public undertakings, publicly funded research facilities and non-executive government branches should be required to publish data referenced in the PSI Directive.

Conformant licences according to the Open Definition, opendefinition.org/licenses

Our recommendations do not pose unworkable requirements or disproportionately high administrative burden, but are essential to realise the goals of the PSI directive with regards to:
  1. Increasing the amount of public sector data available to the public for re-use,
  2. Harmonising the conditions for non-discrimination, and re-use in the European market,
  3. Ensuring fair competition and easy access to markets based on public sector information,
  4. Enhancing cross-border innovation, and an internal market where Union-wide services can be created to support the European data economy.

Our recommendations, explained: What would the proposed PSI Directive mean for the future of open data?

Publication of high-value data

The European Commission proposes to define a list of ‘high value datasets’ that shall be published under the terms of the PSI Directive. This includes publishing datasets in machine-readable formats, under standard open licences, in many cases free of charge, except when high-value datasets are collected by public undertakings in environments where free access to data would distort competition. “High value datasets” are defined as documents that bring socio-economic benefits, “notably because of their suitability for the creation of value-added services and applications, and the number of potential beneficiaries of the value-added services and applications based on these datasets”. The EC also makes reference to existing high value datasets, such as the list of key data defined by the G8 Open Data Charter. Identifying high-value data poses at least three problems:
  1. High-value datasets may be unusable in the Digital Single Market: The EC may “define other applicable modalities”, such as “any conditions for re-use”. There is a risk that a list of EU-wide high value datasets also includes use restrictions violating the Open Definition. Given that a list of high value datasets will be transposed by all member states, adding “any conditions” may significantly hinder reusability and the ability to combine datasets.
  2. Defining the value of data is not straightforward. Recent papers, from Oxford University to Open Data Watch and the Global Partnership for Sustainable Development Data, demonstrate disagreement about what data’s “value” is. What counts as high value data should not only be based on quantitative indicators such as growth indicators, numbers of apps or numbers of beneficiaries, but should also draw on qualitative assessments and expert judgement from multiple disciplines.
  3. Public deliberation and participation is key to defining high value data and avoiding biased value assessments. Impact assessments and cost-benefit calculations come with their own methodical biases, and can unfairly favour data with economic value at the expense of fuzzier social benefits. Currently, the PSI Directive excludes data created under market conditions from being defined as high value data if publication would distort market conditions. We recommend that the PSI Directive add a stronger requirement to weigh economic costs against societal benefits, drawing from multiple assessment methods (see point 2). The criteria, methods, and processes used to determine high value must be transparent and accessible to the broader public, to enable the public to negotiate benefits and to reflect the viewpoints of many stakeholders.

Expansion of scope

The new PSI Directive takes into account data from “public undertakings”. This includes services in the general interest entrusted with entities outside of the public sector, over which government maintains a high degree of control. The PSI Directive also includes data from non-executive government branches (i.e. from legislative and judiciary branches of governments), as well as data from publicly funded research. Opportunities and challenges include:
  • None of the data holders newly included in the PSI Directive are obliged to publish data; publication remains at their discretion. Only if they choose to publish data must they follow the guidelines of the proposed PSI Directive.
  • The PSI Directive aims to keep administrative costs low: all of the above-mentioned data sectors are exempt from data access requests.
  • In summary, the proposed PSI Directive leaves too much space for individual choice to publish data and has no “teeth”. To accelerate the publication of general interest data, the PSI Directive should oblige data holders to publish data. Waiting several years to make the publication of this data mandatory, as happened with the first version of the PSI Directive, risks significantly hampering the availability of key data that is important for accelerating growth in Europe’s data economy.
  • For research data in particular, only data that is already published should fall under the new directive. Even though the PSI Directive will require member states to develop open access policies, the implementation thereof should be built upon the EU’s recommendations for open access.

Legal incompatibilities may jeopardise the Digital Single Market

Most notably, the proposed PSI Directive does not address problems around licensing which are a major impediment for Europe’s Digital Single Market. Europe’s data economy can only benefit from open data if licence terms are standardised. This would allow data from different member states to be combined without legal issues, enabling cross-country applications and sparking innovation. Europe’s licensing ecosystem is a patchwork of many (possibly conflicting) terms, creating use silos and legal uncertainty. But the current proposal not only speaks vaguely about standard open licences, leaving it to national policies to add “less restrictive terms than those outlined in the PSI Directive”. It also contradicts its aim of smoothing the Digital Single Market by encouraging the creation of bespoke licences, suggesting that governments may add new licence terms for real-time data publication. Currently the PSI Directive would allow the European Commission to add “any conditions for re-use” to high-value datasets, thereby encouraging legal incompatibilities (see Article 13 (4.a)). We strongly recommend that the PSI Directive draw on the EU co-funded LAPSI 2.0 recommendations to understand licence incompatibilities and ensure a compatible open licence ecosystem. I’d like to thank Pierre Chrzanowksi, Mika Honkanen, Susanna Ånäs, and Sander van der Waal for their thoughtful comments while writing this blogpost. Image adapted from Max Pixel. ** Its official name is Directive 2003/98/EC on the re-use of public sector information.


Open Council Data of more than 100 Dutch municipalities reused in app WhereGovernment

- March 7, 2018 in netherlands, open council data, Open Data, Open Geodata, Open Government Data

This blog has been reposted from the Open State Foundation blog. More than a hundred Dutch municipalities release Open Council Data, making all documents of the municipal council – decisions, agendas, motions, amendments and policy documents – easily and collectively accessible. The data is now available for reuse in applications. Recently, the first app that reuses the data, WaarOverheid (‘WhereGovernment’), was launched.

Strengthen local democracy

Citizens, entrepreneurs, journalists, civil servants, scientists and all other interested parties can use Open Council Data to easily check what is going on in municipalities around a specific theme: nationally, regionally, by municipality or even by neighbourhood. In 2015 Open State Foundation, together with the Ministry of the Interior and five municipalities (Heerde, Oude IJsselstreek, Den Helder, Utrecht and Amstelveen), started a pilot to provide access to council information as open data. In cooperation with VNG Realisatie and Argu, work was done on standardisation and upscaling. The goal is to strengthen local democracy.

Reusable local government data

The council information was already public, but only available per municipality and often not easy to find or reuse. Of 102 municipalities – including Amsterdam and Utrecht, but also smaller municipalities such as Binnenmaas and Dongen – all council documents can now be found on the Open Council Information website. These documents are available as open data: standardised and reusable. For example, app builders, websites, media and other parties can use and publish the information quickly and easily.
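The benefit of standardisation is easy to illustrate: once every municipality’s council documents share one machine-readable format, a few lines of code can slice them by theme or by municipality across all 102 municipalities at once. The sketch below is purely illustrative; the field names (`municipality`, `theme`, `title`) are hypothetical placeholders, not the actual Open Council Data schema.

```python
# Illustrative sketch: querying standardised council documents.
# The record fields below are hypothetical, not the real schema.
documents = [
    {"municipality": "Utrecht", "theme": "housing",  "title": "Motion on social housing"},
    {"municipality": "Dongen",  "theme": "mobility", "title": "Decision on cycle paths"},
    {"municipality": "Utrecht", "theme": "mobility", "title": "Agenda: parking policy"},
]

def by_theme(docs, theme):
    """All council documents tagged with a given theme, across municipalities."""
    return [d for d in docs if d["theme"] == theme]

def by_municipality(docs, name):
    """All council documents published by a single municipality."""
    return [d for d in docs if d["municipality"] == name]

# Because every municipality publishes in the same format, one query
# spans all of them at once.
print(len(by_theme(documents, "mobility")))  # prints 2
```

Without a shared schema, each of these queries would need a separate scraper per municipal website; with one, the same two functions work everywhere.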

WhereGovernment app

To explore the possibilities of the Open Council Data, VNG Realisatie organised a competition in 2017 to develop the best app: the App Challenge Open Council Information. The first prize went to the webapp WaarOverheid by developer Qollap, which places council information on a map using smart algorithms. This allows residents to see what is going on in their neighbourhood – or in a completely different neighbourhood. The app has been further developed with the prize money. From today – in the run-up to the municipal elections of 21 March 2018 – WaarOverheid can be used by everyone. Everything about the app can be found on waaroverheid.nl.

Gold mine

Robert van Dijk, council clerk of the municipality of Teylingen and chairman of the advisory group Open Council Information, is enthusiastic about the results: ‘We can continue to talk about the theme of open government, but in order to achieve it we have to take action. The information society is a fact. Citizens can access unimaginable information via digital channels, but the government lags behind. And that while we are sitting on a huge amount of data. Society demands transparency from us; we have to get away from the back rooms. This is the instrument for that. In this way we can very effectively strengthen our democracy and make open government and open accountability possible. I see Open Council Information as a gold mine. This standardisation is the starting point for upcoming projects and apps. If all municipalities join in later, nobody will have to piece together information from 380 islands to know which trends are going on. In short: a wonderful project.’ Open Council Information is part of the Digital Agenda 2020 and the Open Government Action Plan of the Netherlands (action point 6), carried out with the Association of Netherlands Municipalities (VNG) in association with Open State Foundation, the driver of the Open Council Information project, various local authorities and the Ministry of the Interior and Kingdom Relations.