
Open Washing: digging deeper into the tough questions

- October 25, 2018 in IODC, iodc18, Open Data, openwashing

This blog was written by James McKinney, Oscar Montiel and Ana Brandusescu. For the second time in history, the International Open Data Conference (IODC) opened a space for us to talk about #openwashing. The insights from IODC16 were brilliantly summarised by Ana Brandusescu, also a host of this year’s session. On this occasion, we dug deeper into some of the issues and causes of open washing. We hope this becomes a discussion we can have more than once every couple of years at a conference, so we invite you all to contact the authors and let us know your thoughts! To discuss open washing in the very limited time available, we framed the discussion around Heimstädt’s paper from 2017. To go beyond data publication, we asked participants to think about four key questions:
  1. How does a particular context encourage or discourage open washing?
  2. How does openness serve, or not serve, non-technical communities?
  3. How is a lack of openness tied to culture?
  4. What is our role as civil society organization/infomediary or government in tackling open washing?
This last question was key to framing open washing as something beyond blaming one group or another as the sole culprit of this practice. To accommodate the large number of Spanish and English speakers, we split into two language groups. Here, we summarize the key points of each discussion.

English group

Lack of power

Participants described scenarios in which publishers lacked the power to publish (whether by design or not). For example, an international non-governmental organization (INGO) receives donor funding to hire a local researcher. The INGO has an open data policy, but when you request the data collected by the researcher, the INGO refers you to the donor (citing intellectual property clauses of the funding agreement), who then refers you to the researcher (wishing to respect the embargo on an upcoming article). In short, the INGO has an open data policy, but it lacks the power to publish this data and others like it. In this and many other cases, the open data program limited itself to data the organization owns, without looking more comprehensively at how the organization manages intellectual property rights to data it finances, purchases, licenses, etc.

Such scenarios become open washing when, whether deliberately or through negligence, a government fails to secure the necessary intellectual property rights to publish data of high value or of high interest. This risk is acute for state-owned enterprises, public-private partnerships, procured services and privatized services. Common examples relate to address data. For example, Canada Post’s postal code data is the country’s most requested dataset, but Canada’s Directive on Open Government doesn’t apply to Canada Post, as it’s a state-owned enterprise. Similarly, when the United Kingdom privatized the Royal Mail, it didn’t retain the postcode data as a public dataset.

Besides limits to the scope of open data policies, organizations also lack power with respect to enforcement. To be effective, policies must have consequences for noncompliance. (See, for example, Canada’s Directive on Open Government.)

One more way in which power is limited is less legal and more social. Few organizations take responsibility for failing to respect their open data principles, but acknowledging failure is a first step toward improvement. Similarly, few actors call out their own and/or others’ failures, which leaves failures silent and unaddressed. Opportunities:
  • Open data programs should consider the intellectual property management of not only the data an organization owns, but also the data it finances, purchases, licenses, etc.
  • Open data programs should extend to all of government, including state-owned enterprises, public-private partnerships, procured services and privatized services.
  • To be enforceable, open data policies must have consequences for noncompliance.

Lack of knowledge or capacity

Participants also described scenarios in which publishers lacked the knowledge or capacity to publish effectively. Data is frequently made open but not made useful, with little care for who might use it. For example, open-by-default policies can incentivize ‘dumping’ as much data as possible into a catalog, but opening data shouldn’t be ‘like taking trash out.’ In addition, few publishers measure quality or prioritize datasets for release with stakeholder input, in order to improve the utility of datasets.

In many cases, public servants have good intentions and are working with limited resources to overcome these challenges, in which case they aren’t open washing. However, their efforts may be ‘washed’ by others. For example, a minister might over-sell the work, out of a desire to claim success after putting in substantial effort. Or, a ranking or an initiative like the Open Government Partnership might celebrate the work, despite its shortcomings – giving a ‘star’ for openness without a real change in openness.

Opportunities: Make rankings more resistant to open washing. For example, governments can read the assessment methodology of the Open Data Barometer and ‘game’ a high score. Is there a way to identify, measure and/or account for open washing within such methodologies? Are there any inspiring methods from, for example, fighting bid rigging?

Other opportunities

While the discussion focused on the areas above, participants shared other ideas to address open washing, including to:
  • Make it a common practice to disclose the reason a dataset is not released, so that it is harder for governments to quietly withhold a dataset from publication.
  • Balance advocacy with collaboration. For example, if a department is open washing, make it uncomfortable in public, while nurturing a working relationship with supportive staff in private, in order to push for true openness. That said, advocacy has risks, which may not be worth the reward in all cases of open washing.
 

Spanish group

Political discourse

Participants described how, in their countries, the discourse around openness came from the top down and was led by political parties. In many cases, a political party formed government and branded its work and ways of working as ‘open’. This caused their efforts to be perceived as partisan, and therefore at greater risk of being reversed when an opposing party formed government. It also meant that public servants, especially in middle and lower-level positions, didn’t see openness as an important part of their regular work, but as extra, politically motivated work added to their already busy schedules. Opportunities: Make openness a non-partisan issue. Encourage a bottom-up discourse.

Implementation challenges

Participants described many challenges in implementing openness:
  • A lack of technical skills and resources.
  • A focus on quantity over quality.
  • Governments seeing openness as an effort that one or two agencies can deliver, instead of as an effort that requires all agencies to change how they work.
  • Governments opening data only in ways and formats with which they are already familiar, and working only with people they already know and trust.
  • A fear of being judged.
Opportunities: Co-design data formats.  Author standardized manuals for collecting and publishing data.

The value of data

A final point was the lack of a broad appreciation that data is useful and important. As long as people inside and outside government don’t see its value, there will be little motivation to open data and to properly govern and manage it. Opportunities: Research government processes and protocols for data governance and management.  

Wrapping up

At IODC18, we created a space to discuss open washing. We advanced the conversation on some factors contributing to it, and identified some opportunities to address it. However, we could only touch lightly on a few of the many facets of open washing. We look forward to hearing your thoughts on these discussions and on open washing in general! You can contact us via Twitter or email:

Youth Data Champions: Empowerment, Leadership and Data

- October 2, 2018 in nepal, OK Nepal, Open Data, training

This post was jointly written by Shubham Ghimire, Chief Operating Officer, and Nikesh Balami, Chief Executive Officer of Open Knowledge Nepal, as part of the Youth Empowerment, Youth Leadership and Data Workshop. It has been reposted from the Open Knowledge Nepal blog. This summer, the PAHICHAN – Youth Empowerment, Youth Leadership and Data workshop was conducted in 6 districts of 3 different provinces of Nepal, where more than 126 energetic youths were trained and sensitized on the concept of open data. The aim was to create a network of young data leaders who will lead and support the development of their communities through the use of open data as evidence for youth-led and data-driven development. The three-day residential workshops were conducted in Itahari, Bhojpur, Butwal, Nepalgunj, Dhangadhi and Dadeldhura from 12th July to 14th August 2018. During the workshops, the participants were introduced to concepts of youth rights and leadership skills, and oriented on the use of open data, visualization and mapping as evidence to tackle issues in their community. The session on youth empowerment and leadership was facilitated by the YUWA team, and the hands-on workshop on data, visualization and mapping was facilitated by Open Knowledge Nepal, Nepal in Data and NAXA. The team was accompanied by representatives of the Data for Development Program in Nepal and local partners. The following local partners helped coordinate and organize the residential workshops:
Districts and local partners:
  • Itahari: Youth Development Centre Itahari
  • Bhojpur: HEEHURLDE-Nepal
  • Butwal: Rotaract Club of Butwal
  • Nepalgunj: Cheers Creative Nepal – CCN and District Youth Club Network
  • Dhangadi: Far West Multipurpose Center
  • Dadeldhura: Social Unity Club

Open Knowledge Nepal’s Session at PAHICHAN – Youth Empowerment, Youth Leadership and Data Workshop

On the first day of the workshop, Open Knowledge Nepal delivered a session on ‘Open Data in Nepal’, covering the history, current situation, definition, importance, working methodology and different open data initiatives from government, CSOs and the private sector. On the second day, after an orientation on data-driven brainstorming, participants were divided into groups and each group was asked to come up with a problem in their community. The groups then worked on their identified issues, exploring existing data on the problems, hidden opportunities and probable solutions using data, the challenges, the impact, and the stakeholders needed to solve the issue. On the final day, the participants worked in two groups to plan evidence-based campaigns on the issues they had worked on during the second day. Most of the groups planned to do awareness campaigns by making use of data, infographics, and maps. Each group was provided with seed money of NPR 7,500 to implement their action plan within one month. We realized the data-driven brainstorming session was very fruitful for the young participants and definitely helped them understand local community issues through the use of open data. Now, these participants can plan and conduct small, impactful projects, evidence-based action plans, and campaigns with limited resources. The issues selected for the brainstorming session were:
Districts and community issues:
  • Itahari: Illiteracy, Unemployment, Substance Abuse, Caste Discrimination, Pollution
  • Bhojpur: Quality Education, Migration, Physical Infrastructure, Gender Discrimination, Unemployment
  • Butwal: Substance Abuse, Quality Education
  • Nepalgunj: Substance Abuse, Cleanliness
  • Dhangadi: Youth Unemployment, Substance Abuse
  • Dadeldhura: Good Governance, Substance Abuse (Alcohol Consumption)

Project Impact

  • Human Resources: An increase in data-demanding human resources, who can now understand and use the available data to tackle local issues in their community.
  • Data Champions: All 126 youth data champions are now capable of effectively planning and running evidence-based action plans and campaigns in their community.
  • Community Projects: The campaigns and projects led by each team on community issues are making a difference through awareness and advocacy using infographics, mapping, and open data.
  • Future: We can mobilize these youth data champions for awareness and advocacy campaigns at the local level.

Major Takeaways

  • Digital Divide: In urban areas like Itahari, Butwal, and Nepalgunj, most of the participants had a basic understanding of the overall topics, but participants from peri-urban regions like Bhojpur, Dhangadhi, and Dadeldhura were not familiar with the topics and most of them found the subject difficult to follow.
  • Female Participation: One positive factor is that the female participation rate was higher than the male rate. Participants were energetic, enthusiastic and curious throughout the workshop.
  • Access to Internet: Due to the lack of internet facilities in peri-urban areas, a lot of things were left unexplored.
  • Continuity: Many participants requested similar kinds of events and hands-on workshops more frequently. The workshop has definitely helped strengthen the demand side of data.
  • Practical Implementation: Participants learned the importance of evidence-based action plans and data-driven campaigns and development, but more workshops of this kind are needed to teach practical implementation.

Lessons Learned

  • Educational diversity of participants: We realized that most of the participants were from the same background. It would be better if there were participants from different backgrounds.
  • Onsite improvisation: We had to adjust and improvise our presentations and sessions according to the understanding level of the participants.
  • Digital literacy: You need basic knowledge of technology to understand the use and value of data, visualizations, and mapping. We felt that most of the participants in peri-urban areas lacked this basic understanding, so the sessions may have been less fruitful for them.
  The workshop was organized by YUWA and Data for Development in Nepal in coordination with Nepal in Data, Open Knowledge Nepal and NAXA, funded by the UK Department for International Development and implemented by The Asia Foundation and Development Initiatives.

The next target user group for the open data movement is governments

- September 18, 2018 in Open Data, Open Government Data

Here’s an open data story that might sound a bit counterintuitive. Last month a multinational company was negotiating with an African government to buy an asset. The company, which already owned some of the asset but wanted to increase its stake, said the extra part was worth $6 million. The government’s advisers said it was worth at least three times that. The company disputed that. The two sides met for two days and traded arguments, sometimes with raised voices, but the meeting broke up inconclusively.

A week later, half way round the world, the company’s headquarters issued a new investor presentation. Like all publicly listed companies, its release was filed with the appropriate stock market regulator, and sent out by email newsletter – where a government adviser picked it up. In it, the company advertised to investors how it had increased the value of its asset – the asset discussed in Africa – by four times in 18 months, and gave a valuation of the asset now. Their own valuation, it turned out, was indeed roughly three times the $6 million the company had told the government it was worth. Probably, the negotiators were not in touch with the investor relations people. But the end result was that the company had blown its negotiating position because, in effect, as a whole institution, it didn’t think a small African government could understand disclosure practice on an international stock market, subscribe to newsletters, or read its website.

The moral of the story is: we need to expand the way we think about governments and open data. In the existing paradigm, governments are seen as the targets of advocacy campaigns, to release data they hold for the public good, and to enact legislation which binds themselves, and others, to release. Civil society hunts for internal champions within government, international initiatives (EITI, OGP etc.) seek to bind governments into emerging best practice, and investigative journalists and whistleblowers highlight the need for better information with dramatic cases of what goes wrong and is covered up. And all of that is as it should be.

But what we see regularly in our work at OpenOil is that there is also huge potential to engage government – at all levels – as a user of open data. Officials in senior positions are sitting day after day, month after month, trying to make difficult decisions, under the false impression that they have little or no data. Often they don’t have a clear understanding of, or access to, data produced by other parts of their own government, and they are unaware of a host of broader datasets and systems. Initiatives like EITI, which were founded to serve the public interest in data around natural resources, have found a new and receptive audience in government departments seeking a joined-up view of their own data.

And imagine how it might affect their interaction with advocacy campaigns if governments were regular and systematic users of open data and knowledge systems. Suddenly, this would not be a one-way street – governments would be getting something out of open data, not just responding to what, from their perspective, often seems like the incessant demands of activists. It could become more of a mutual backscratching dynamic. There is a paradox at the heart of much government thinking about information.
In institutions with secretive cultures, there can be a weird ellipsis of the mind in which information that is secret must be important, and information that is open must be, by definition, worthless. Working on commercial analysis of assets managed by governments, we often find senior officials who believe they can’t make any progress because their commercial partners, the multinationals, hold all the data and don’t release it. While it is true that there is a stark asymmetry of information, we have half a dozen cases where the questions the government needed to answer could be addressed by data downloadable from the Internet. You have to know where to look, of course. But it’s not rocket science. In one case, a finance ministry official had all the government’s “secret” data sitting on his laptop, but we decided to model a major mining project using public statements by the company anyway, because the permissions needed from multiple departments to show the data to anyone else, let alone incorporate it in a model which might be published, would take months or years.

Of course, reliance on open data is likely to leave gaps and involves careful questions of interpretation. But our experience is that these have never been “deal breakers” – we have never had to abandon an analytical project because we couldn’t achieve good enough results with public data. The test of any analytical project is not “is it perfect?” but “does it take us on from where we are now, and can we comfortably state what we think the margins of error are?”.

The potential is not confined to the Global South. Governments at all levels and in all parts of the world could benefit greatly from more strategic use of open data. And it is in the interest of the open data movement to help them.

A short story about Open Washing

- August 20, 2018 in IODC, iodc18, Open Data, Open Government Data, openwashing

Great news! The International Open Data Conference (IODC) accepted my proposal about Open Washing. The moment I heard this I wanted to write something to invite everyone to our session. It will be a follow-up to the exchange we had during IODC in 2015.

First, a couple of disclaimers: This text is not exactly about data. Open Washing is not an easy conversation to have. It’s not a comfortable topic for anyone, whether you work in government or civil society. Sometimes we decide to avoid it (I’m looking at you, OGP Summit!). To prepare this new session I went through the history of our initial conversation. I noticed that my awesome co-host, Ana Brandusescu, summarised everything here. I invite you to read that blogpost and then come back. Or keep reading and then read the other post. Either way, don’t miss Ana’s post. What comes next is a story. I hope this story will illustrate why these uncomfortable conversations are important. Second disclaimer: everything in this story is true. It is a fact that these things happened. Some of them are still happening. It is not a happy story, and I’m sorry if some people might feel offended by me telling it.

There was once a country that had a pretty young democracy. That country was ruled by one political party for 70 years and then, 18 years ago, decided it had had enough. Six years ago, that political party came back. They won the presidential election. How this happened is questionable, but it goes beyond the scope of this story right now. When this political party regained power, the technocrats thought this was good news. Some international media outlets thought the new president would even “save” the country. The word “save” may sound like too much, but there was a big wave of violence that had built up over the previous years. Economic development was slow and social issues were boiling. Much of this was related to corruption at many levels of government. In this context, there seemed to be a light at the end of the tunnel.

The president’s office decided to make open government a priority. Open data would be a tool to promote proactive transparency and economic development. They signed all the international commitments they could. They chaired international spaces for everything transparency related. They set up a team of young and highly prepared professionals to turn all this into reality. But then, the tunnel seemed to extend and the light seemed dimmer. In spite of these commitments, some things that weren’t supposed to happen, happened. Different journalistic investigations uncovered what seemed like acts of corruption. A government contractor gave the president a 7-million-dollar house during the campaign. The government awarded about 450 million USD in irregular contracts. Most of these contracts didn’t even result in actual execution of works or delivery of goods. They spied on people from the civil society groups that collaborated with them. 45 journalists, who play a big role in this story, were murdered in the last 6 years. For doing their job. For asking questions that may be uncomfortable for some people.

There is a lot more to the story but I will leave it here. That doesn’t mean it ends here. It’s still happening. It seems like this political party doesn’t care about open washing anymore. They don’t care anymore because they’re leaving. But we should care because we stay. We need to talk and discuss this in the open. The story of this country, my country, is very particular and surreal but holds a lot of lessons.
This is probably the worst invitation you’ve ever received. But I know there are a lot of lessons and knowledge out there. So if you are around, come to our session during IODC. If you’re not, talk about this issue where you live. Or reach out to others who might be interested. It probably won’t be comfortable but you will for sure bring a new perspective to your work. This is also an invitation to try it.

SaveOurAir: An experiment in data-activation

- July 17, 2018 in Open Data

Contemporary cities seem to be in a race to become increasingly ‘smart’ and data-driven. At smart city expos around the world, visitors are presented with new visual modes of knowing and governing. Dashboards providing bird’s-eye views of the real-time movement of objects in the city are perhaps the most iconic of these visualizations. Often presented in a sort of control-room environment, they illustrate the promise of instantaneous overviews of things like cars, people and trash bins. These are all relevant objects for organizing dynamic cityscapes and creating an efficient city. In short – they seem like the bureaucratic engineer’s dream of an ordering tool. While such visualizations may be good for a form of cybernetic control of the city, there is a need for different ways of depicting the contemporary city as well.

SaveOurAir – a project funded by the EU’s Organicities program – was born out of this idea. What if we could activate data about air pollution in other ways than just providing generic overviews of pollution on a map? Would it be possible to tell more ‘local’ data stories about the issue of air pollution? Can the ‘local’ even be operationalized visually in other ways than points of latitude and longitude? If yes, what would the relevant visual metaphors be, and which types of data would they draw upon? From October 2016 to May 2017 we pursued these questions together with a selection of people with a stake in the discussion about air pollution. In this context, the ‘we’ included researchers from the Public Data Lab (see publicdatalab.org) and the ‘people’ included activists, teachers, politicians and public officials from England and Denmark. Through two week-long co-design workshops we co-produced three functional prototypes, each with its own distinct way of telling a local data story about air pollution. Full details of all the work can be found on the SaveOurAir website, but we will quickly go through each in turn.

First, in the MyAir project we produced a teaching kit that enables pupils in upper secondary school to explore issues of air pollution with reference to their own daily whereabouts. Armed with a mobile pollution monitor and their mobile phone, pupils in the Danish town of Gentofte constructed personalized, interactive maps that contained data about the routes they had traveled and the amount of pollution they had been exposed to at various points on these routes. The aim of the teaching kit is to stimulate pupils to inquire into the sources of air pollution that influence their personal lives. You can explore the project here.

Pictures of the air sensors and the interface for data-exploration made available by the MyAir software
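To make the idea concrete, here is a minimal, hypothetical sketch of the kind of processing behind such a personalized map: it pairs GPS points from a pupil’s route with readings from a mobile pollution monitor and plots the route coloured by exposure. The file and column names are illustrative, not those of the actual MyAir kit.

```python
# Hypothetical sketch: join a GPS track with mobile PM2.5 readings and plot
# the route coloured by exposure. File and column names are illustrative;
# the real MyAir kit may structure its data differently.
import pandas as pd
import matplotlib.pyplot as plt

route = pd.read_csv("route.csv", parse_dates=["timestamp"])    # lat, lon, timestamp
readings = pd.read_csv("pm25.csv", parse_dates=["timestamp"])  # pm25, timestamp

# Match each GPS point with the nearest sensor reading in time.
track = pd.merge_asof(route.sort_values("timestamp"),
                      readings.sort_values("timestamp"),
                      on="timestamp", direction="nearest")

plt.scatter(track["lon"], track["lat"], c=track["pm25"], cmap="viridis", s=10)
plt.colorbar(label="PM2.5 (µg/m³)")
plt.title("Pollution exposure along my route")
plt.show()
```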

Second, in the Mobilizing Our Air project we produced an online platform which portrays activist groups concerned with air pollution in the Borough of Camden, London. The activist groups’ interests are categorized into interest tags, of which the platform’s user can select as many as they wish. Based on the user’s selection, the platform shows a geographical map outlining the locations of the matching activists and activist groups. The platform has three main goals: 1. it supports local activist groups in connecting with each other through visibility on the platform; 2. it informs individuals interested in activism about activist groups in their vicinity and invites them to join the movements; 3. it brings the topic of air pollution to the attention of other activist groups which already support the cause of better air quality. You can explore the project here.

Four interfaces of the platform prototype for Mobilizing Our Air: groups, groups profile, campaigns, and stories.
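The tag-selection mechanic described above can be sketched in a few lines; the groups, tags and coordinates below are invented purely for illustration and are not the platform’s real data or code.

```python
# Hypothetical sketch of tag-based filtering: given activist groups with
# interest tags and coordinates, return the ones matching a user's selection.
# The groups, tags and coordinates are invented for illustration only.
groups = [
    {"name": "Camden Clean Air", "tags": {"air quality", "schools"}, "lat": 51.54, "lon": -0.14},
    {"name": "Cycle Camden", "tags": {"cycling", "air quality"}, "lat": 51.55, "lon": -0.16},
    {"name": "Green Spaces Now", "tags": {"parks"}, "lat": 51.53, "lon": -0.13},
]

def matching_groups(selected_tags):
    """Return groups that share at least one tag with the user's selection."""
    return [g for g in groups if g["tags"] & set(selected_tags)]

for g in matching_groups({"air quality"}):
    # In the prototype, these coordinates would feed the map view.
    print(g["name"], g["lat"], g["lon"])
```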

Third, in the project entitled The Hot Potato Machine we wanted to understand and visually explore different ways of responding to and apportioning responsibility for complex issues such as air pollution. As actors involved in air pollution will often pass the blame, ‘the hot potato’, to someone else we wanted to create a visual interface for exploring what different actors say about each other in relation to how to tackle air pollution. Rather than focusing on measurements of pollutants, we were interested in how digital data might tell us about different ways of seeing air pollution as an issue, different imagined solutions, the fabric of relationships around it, and where there might be tensions, differences, knots and spaces for movement. Our prototype focuses on a specific “issue story” revealing different views on who is responsible for reducing air pollution from diesel taxis. You can explore the project here.  

Interfaces of the platform prototype for the Hot Potato Machine

These three projects represent the experimental outcome of the SaveOurAir project and they are good illustrations of the way we approach data in the Public Data Lab. We strive to organize social research as an open process – a process in which the research methods are developed at the same time as their results (the prototypes). But neither the methods nor their results are the specific object of our research. Instead, what we hope to hatch through our interventions are new “data publics”: publics that are not just the passive object of commercial and institutional monitoring, but who produce their own data actively and “by design”. For more info, contact Anders Koed Madsen.

Closing feedback loops for more & better open data in Switzerland

- July 10, 2018 in Events, OK Switzerland, Open Data, Switzerland

Last week, the annual open data conference in Switzerland took place in St. Gallen. In this post, Oleg Lavrovsky, activist for Open Knowledge and board member of the Swiss Chapter, shares a look back at the event showcasing the latest developments in the country, along with the results of the first Open Data Student Awards. For more coverage, photos and links visit Opendata.ch/2018. The #opendatach conference is, for the dedicated, a 24-hour event – starting this year around 6pm on Monday, when Rufus Pollock joined us in Zürich, and lasting until 6pm on Tuesday July 3, as a light apéro and quick clean-up closed the doors on the eighth annual gathering of the Swiss Open Knowledge community. A group of organizers and core contributors spent a balmy afternoon perched in the loft at the Impact Hub, debating the state of the nation – which a recent ch.okfn.org blog post recounts – reviewing the recommendations of our Task Force, and distributing and discussing the new book. A short night later we were on an early train with Hannes Gassert, checking waypoints over cups of green tea. Finally we arrived on site in St. Gallen, the economic and political center of eastern Switzerland, and host to a modern, internationally renowned university – whose main building was rapidly transformed into our favourite habitat: a burgeoning centre for activism, critical thought and debate.
After quickly saying hello we set to work on setting up the rooms, dodging streams of students rushing to class. In one hacky corner of the event, an unconference showcase sponsored by the local IT community featured 9 projects submitted through an online platform (hack.opendata.ch), whose team members were attending the conference. A colorful showcase wall, next to the entrance to the main room where keynotes took place, engendered imaginative discussion, giving participants a chance to find and meet the makers of innovative projects made with open data.

Photo credit: Ernie Deane, CC BY-SA 3.0

You’ll find excellent coverage of the morning’s plenary sessions in the Netzwoche article, highlighting the readiness our host city St. Gallen demonstrated in supporting open government data (OGD), and sharing a preview of its new open data platform. We heard insights from the cross-border collaboration that has taken place over the past years between the OGD administrations of the cities of St. Gallen and Vienna. Balancing out the mood in the room, we got to hear compelling remarks from a project leader who has so far been frustrated in his attempts to gain funding and political support for his open political data initiative:
“The biggest problem, however, is not the lack of access to data or lack of know-how among those involved. The parliamentary services now provide a good API, so that linking and interpreting various data is feasible. What is lacking above all is sustainability, and in particular sustainable financing.” –Daniel Black, smartmonitor

Keynotes

In his keynote, André Golliez announced his upcoming departure from the role of president of Opendata.ch and shared his vision for the recently founded Swiss Data Alliance, through which he strives to make open data a key component of data policy and data infrastructure development in Swiss government and industry. Looking back on how open data has fared in politics since Barack Obama, he expressed worries about the pendulum swinging in another direction, and encouraged us not to take things for granted. Hitting closer to home, André spoke about the right to data portability, specifically mentioning revisions to the Swiss Data Protection Act which follow the EU’s GDPR – encouraging our community to get involved in the debate and political process.

In our final – much anticipated – morning keynote, Rufus Pollock came on stage to share his renewed vision for openness activism, introducing the main ideas from his new book, The Open Revolution, which he was selling and signing in the conference hall. In Switzerland, we have been keeping close track of developments in the open knowledge movement, which influence our own ongoing organizational transformation as a new generation of activists, policymakers and data wranglers push the project forward. The ideas within the book have been a cause of ceaseless debate in the weeks before the conference, and will surely continue to be through the summer. Some people have questioned their relevance, and we have been enjoying the ensuing debate. Even if Rufus did not manage to convince everyone in the room – if the language barrier, stories from foreign shores, or his radical-common-sense philosophy fail to attract immediate policy or media attention (NB: we eagerly await publication of an interview in the next issue of Das Magazin – follow @tagi_magi) – the ideas are certainly leaving a deep impression on our community. 105 copies of the new book, distributed at name-your-price along with free digital downloads, have put a progressive, challenging text into able hands, and the bold ideas within are helping to reignite and refresh our personal and collective commitment to activism for a fair and sustainable information society.

The workshops

After lunch, we hosted six afternoon workshop tracks (Open Data Startups, Open Smart Cities, Open Data in Science, Linked Open Data, Open Mobility Data, and Blockchain for Open Data), which you can read about, and download presentations from (as well as those of the keynotes), on the conference website. I made a short presentation on Frictionless Data (slides here) in the Science track, which showcased four projects working with, or fostering the development and use of, open data for scientific purposes – and I will elaborate a little on this workshop here. Marcel Salathé, our workshop lead and a founder of the open foodrepo.org initiative, demonstrated crowdAI, the open data science challenge platform developed at EPFL, which connects data science experts and enthusiasts with open data to solve specific problems through challenges. My talk was about containerization formats for open data, introducing Frictionless Data – which addresses this issue through simple specifications and software – and my work on supporting these standards in the Julia language. Donat Agosti spoke about Plazi, addressing the need to transform scientific data from publications, books, and other unstructured formats into a persistent and openly accessible digital taxonomic literature. Finally, Rok Roškar introduced the Swiss Data Science Center and its Renku platform, a highly scalable and secure open software platform designed to foster multidisciplinary data (science) collaborations. It was a privilege to take part, and I appreciated the learnings shared and the eager discussions. Questions came up about how many standardization initiatives it really takes, whether and how improvements to platforms for data sharing really address the fundamental issues in science, and how the open data community can help improve access to high-quality experimental data, reproducibility, and collaboration. We are following up on some of these questions already.
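For readers unfamiliar with Frictionless Data, the core idea of ‘containerizing’ a dataset is a small descriptor file that travels with the data. Here is a minimal, hypothetical sketch of such a datapackage.json, written with plain Python; the dataset and field names are invented for illustration, and the full specification lives at frictionlessdata.io.

```python
# Minimal, hypothetical sketch of a Frictionless Data "Data Package"
# descriptor, written with plain Python/JSON. The dataset and field names
# describe an imaginary air-quality CSV; see the Frictionless specs for
# the full format.
import json

descriptor = {
    "name": "air-quality-example",
    "resources": [{
        "name": "measurements",
        "path": "measurements.csv",
        "schema": {
            "fields": [
                {"name": "station", "type": "string"},
                {"name": "timestamp", "type": "datetime"},
                {"name": "pm25", "type": "number"},
            ]
        }
    }]
}

# The descriptor travels alongside the CSV, so any tool that understands
# the spec can read, validate and load the data without guesswork.
with open("datapackage.json", "w") as f:
    json.dump(descriptor, f, indent=2)
```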

Open Data Student Award

And then it was, finally, time to hand over the Open Data Student Award, a project that took months of preparation, three days of 3D printing, hours of nail-bitingly intense jury duty, and only 15 minutes allowed to sum it all up. The jury team – consisting of Prof. Stefan Keller (CH Open), Andreas Amsler (OGD Canton of Zürich) and myself (Opendata.ch) – were impressed with the projects, each truly exemplary.
Every student and supervisor participating this year deserves recognition for making an effort to use, re-publish and to promote open data. In addition to being put on the big screen at the annual conference in St. Gallen and discussed by all the people gathered there, the projects are being given extra attention through community channels.

Congratulations to Jonas Oesch from FHNW Windisch, whose winning project The Hitchhiker’s Guide to Swiss Open Government Data educates readers in an exemplary way about open data, applying open source technical ingenuity and skillful design to a problem that is critical to the open data community.
The open data community is looking for answers to the question of how to better represent the diversity of datasets – putting them into new clothes, so to speak. The Hitchhiker’s Guide to Swiss Open Government Data is a project that points the way in such a direction.
Details about all the projects can be found in the official announcement. Additionally, we have shared some background and the sources of the award as open source for you to peruse. We are happy to get feedback and to hear your ideas for where to take the un/conference and award next year! Just drop us a line in the Open Knowledge Switzerland forum.

Wrapping up

As the football match got going that would eventually see our country rather unconvincingly exit the World Cup, we gave the floor to the people doing much of the day-to-day legwork to convince and support data providers in opening up their troves to the Swiss public. Jean-Luc Cochard and Andreas Kellerhalls from the Swiss Federal Archives took turns to recap the situation in Switzerland. The OGD strategy for 2019-2023 is being prepared in the Federal Department of Home Affairs, to be ratified by stakeholder departments over the summer. Our association will make a position statement with and on behalf of the user community in the coming months. The presentations demonstrated a continued commitment to public service, as well as an admission of where we are falling short, and an analysis of some of the many roadblocks and challenges – technical, political and cultural – that are part of the strategy review. The next four years promise renewal, responsibility, and many lessons to apply across the board.

Photo credit: Ernie Deane, CC BY-SA 3.0

We know that not all the actors on the OGD stage are doing a great job yet – and that to improve the status quo, we need to continue improving awareness and knowledge of the issues. Our role in facilitating cooperation across the digital divide and improving data literacy in Switzerland will be an important stepping stone to future success. Pointing the way to such opportunities was the final keynote of the day, from Walter Palmetshofer (@vavoida), who joined us for the whole 24-hour marathon and helped to end our conference with a bright acknowledgement of the public interest: in good sportsmanship, international cooperation, and sustainable projects to build THINGS THAT MATTER. Walter shared with us the most interesting results, learnings and statistics from the first highly successful years of the Open Data Incubator Europe (ODINE), and let us take home tantalizing glimpses into 57 inspiring startups – each of which could well be at home in Switzerland, to each of which we should be keen to open data and doors, and from each of which we should learn.

Changing Minds by Using Open Data

- July 2, 2018 in Open Data, open-education, WG Open Education

This blog has been rewritten from the original post on our Open Education Working Group blog and is co-authored by Javiera Atenas, Erdinç Saçan & Robert Schuwer.   The Greek philosopher Pythagoras once said:
“if you want to multiply joy, then you have to share.”

This also applies to data. Those who share data get a multitude of joy – value – in return. This post is based on the practical application, at Fontys University of Applied Sciences, School of ICT in Eindhoven, the Netherlands, by Erdinç Saçan & Robert Schuwer, of the pedagogical approach developed by Javiera Atenas and Leo Havemann from the Open Education Working Group, which focuses on the use of Open Data as Open Educational Resources. They argue that while Open Data is not always OER, it certainly becomes OER when used within pedagogical contexts.

Open data has been highlighted as a key to information transparency and scientific advancement. Students who are exposed to the use of open data have access to the same raw materials that scientists and policy-makers use. This enables them to engage with real problems at both local and global levels. Educators who make use of open data in teaching and learning encourage students to think as researchers, as journalists, as scientists, and as policy makers and activists. They also provide a meaningful context for gaining experience in research workflows and processes, as well as learning good practices in data management, analysis and reporting. The pedagogic deployment of open data as OER thus supports the development of critical, analytical, collaborative and citizenship skills, and has enormous potential to generate new knowledge.

ICT is not just about technology – it’s about coming up with solutions to solve problems or to help people, businesses, communities and governments. Developing ICT solutions means working with people to find a solution. Students in Information & Communication Technology learn how to work with databases, analysing data and making dashboards that help users make the right decisions. Data collections are required for these learning experiences. You can create these data collections (artificially) yourself or use ‘real’ data collections that are openly available (like those from Statistics Netherlands (CBS)). In education, data is becoming increasingly important, in policy, in management and in the education process itself. The scientific research that supports education is becoming increasingly dependent on data. Data leads to insights that help improve the quality of education (Atenas & Havemann, 2015). But in the current era, where a neo-liberal approach to education seems to dominate, the “Bildung” component of education is considered more important than ever. The term Bildung is attributed to Wilhelm von Humboldt (1767-1835). It refers to the general development of all human qualities: not only acquiring knowledge, but also developing skills for moral judgment and critical thinking.

Study

In (Atenas & Havemann, 2015), several case studies are described where the use of open data contributes to developing the Bildung component of education. To contribute to these cases and eventually extend experiences, a practical study has been conducted. The study had the following research question:
“How can using open data in data analysis learning tasks contribute to the Bildung component of the ICT Bachelor Program of Fontys School of ICT in the Netherlands?”
In the study, an in-depth case study was executed using an A/B test method. One group of students had a data set with artificial data available, while the other group worked with a set of open data from the municipality of Utrecht. A pre-test and a post-test should reveal whether a difference in the development of the Bildung component can be measured. Both tests were conducted via a survey. Additionally, some interviews were conducted afterwards to collect more in-depth information and explanations for the survey results. For our A/B test, we used three data files from the municipality of Utrecht (a town in the center of the Netherlands, with ~350,000 inhabitants). These were data for all quarters of Utrecht:
  • Crime figures
  • Income
  • Level of Education
(Source: https://utrecht.dataplatform.nl/data) We assumed all students had opinions on correlations between these three types of data, e.g. “There is a proportional relation between crime figures and level of education” or “There is an inversely proportional relation between income and level of education”. We wanted to see which opinions students had before they started working with the data, and whether these opinions were influenced after they had analyzed the data. A group of 40 students went to work with the data: 20 students worked with real data and 20 worked with ‘fake’ data. Students were emailed the three data files and the following assignment: “check CSV (Excel) file in the attachment. Please try this to do an analysis. Try to draw a minimum of 1, a maximum of 2 conclusions from it… this can be anything. As long as it leads to a certain conclusion based on the figures.” In addition, there was also a survey in which we tried to find out how students currently think about correlations between crime, income and educational level. Additionally, some students were interviewed to get insights into the figures collected by the survey.
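As an indication of the kind of analysis the assignment called for, a few lines of pandas are enough to join district-level files and inspect pairwise correlations. The file and column names below are placeholders, not the actual structure of the Utrecht open datasets.

```python
# Hypothetical sketch of the students' task: merge three district-level
# files and inspect pairwise correlations. File and column names are
# placeholders, not the actual structure of the Utrecht open datasets.
import pandas as pd

crime = pd.read_csv("crime.csv")          # columns: district, crime_rate
income = pd.read_csv("income.csv")        # columns: district, avg_income
education = pd.read_csv("education.csv")  # columns: district, share_higher_ed

df = crime.merge(income, on="district").merge(education, on="district")
print(df[["crime_rate", "avg_income", "share_higher_ed"]].corr())
```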

Results

For the survey, 40 students were approached; 25 responded. All students indicated that working with real data is more fun, challenging and concrete. It motivates them. Students who worked with fake data did not like it as much. In interviews they indicated that they prefer, for example, to work with cases from companies rather than cases invented by teachers. In the interviews, the majority of students indicated that by working with real data they have come to a different understanding of crime and the reasons for it. They became aware of the social impact of data and were triggered to think about social problems. To illustrate, here are some responses students gave in interviews:

“Before I started working with the data, I had always thought that there was more crime in districts with a low income and less crime in districts with a high income. After I analyzed the data, I saw that this is not immediately the case. So my thought about this has indeed changed. It is possible, but it does not necessarily have to be that way.” (M.K.)

“At first, I also thought that there would be more crime in communities with more people with a lower level of education than in communities with more people with a higher level of education. In my opinion, this image has changed in part. I do not think that a high or low level of education is necessarily linked to this, but rather the situation in which people find themselves. So if you are highly educated, but things are really not going well (no job, poor conditions at home), then the chance of criminality is greater than if someone with a low level of education has a job.” (A.K.)

“I think it has a lot of influence. You have an image and an opinion beforehand. But the real data either shows the opposite or not. And then you think, ‘Oh yes, this is it.’ And working with fake data is not my thing. It has to provide real insights.” (M.D.)

Conclusion

Our experiment provided positive indications that it is possible to contribute to the Bildung component of education by using open data in data analysis exercises. The next steps are to extend these experiences both to larger groups of students and to more topics in the curriculum.

References

 

About the authors

Javiera Atenas: PhD in Education and co-coordinator of the Open Knowledge Open Education Working Group, responsible for the promotion of Open Data, Open Policies and Capacity Building in Open Education. She is also a Senior Fellow of the Higher Education Academy and the Education Lead at the Latin American Initiative for Open Data [ILDA], as well as an academic and researcher with an interest in the use of Open Data as Open Educational Resources and in critical pedagogy.

Erdinç Saçan is a Senior Teacher of ICT & Business and the Coordinator of the Minor Digital Marketing at Fontys University of Applied Sciences, School of ICT in Eindhoven, the Netherlands. He previously worked at Corendon, TradeDoubler and Prijsvrij.nl. @erdincsacan

Robert Schuwer is Professor Open Educational Resources at Fontys University of Applied Sciences, School of ICT in Eindhoven, the Netherlands and holds the UNESCO Chair on Open Educational Resources and Their Adoption by Teachers, Learners and Institutions. @OpenRobert55

Europe’s proposed PSI Directive: A good baseline for future open data policies?

- June 21, 2018 in eu, licence, Open Data, Open Government Data, Open Standards, Policy, PSI, research

Some weeks ago, the European Commission proposed an update of the PSI Directive. The PSI Directive regulates the reuse of public sector information (including administrative government data), and has important consequences for the development of Europe’s open data policies. Like every legislative proposal, the PSI Directive proposal is open for public feedback until July 13. In this blog post Open Knowledge International presents what we think are necessary improvements to make the PSI Directive fit for Europe’s Digital Single Market. In a guest blogpost Ton Zijlstra outlined the changes to the PSI Directive. Another blog post by Ton Zijlstra and Katleen Janssen helps to understand the historical background and puts the changes into context. Whilst improvements have been made, we think the current proposal is a missed opportunity, does not support the creation of a Digital Single Market, and can pose risks for open data. In what follows, we recommend changes to the European Parliament and the European Council. We also discuss actions civil society may take to engage with the directive in the future, and explain the reasoning behind our recommendations.

Recommendations to improve the PSI Directive

Based on our assessment, we urge the European Parliament and the Council to amend the proposed PSI Directive to ensure the following:
  • When defining high-value datasets, the PSI Directive should not rule out data generated under market conditions. A stronger requirement must be added to Article 13 to make assessments of economic costs transparent and to weigh them against broader societal benefits.
  • The public must have access to the methods, meeting notes and consultations used to define high-value data. Article 13 must ensure that the public will be able to participate in this definition process, to gather multiple viewpoints and limit the risk of biased value assessments.
  • Beyond tracking proposals for high-value datasets in the EU’s Interinstitutional Register of Delegated Acts, the public should be able to suggest new delegated acts for high-value datasets.
  • The PSI Directive must make clear what “standard open licences” are, by referencing the Open Definition and explicitly recommending the adoption of Open Definition-compliant licences (from Creative Commons and Open Data Commons) when developing new open data policies. The directive should give preference to public domain dedications and attribution licences, in accordance with the LAPSI 2.0 licensing guidelines (a small conformance-check sketch follows the licence list below).
  • Governments of EU member states that already have policies prescribing specific licences should be required to add legal compatibility tests with other open licences to these policies. We suggest following the recommendations outlined in the LAPSI 2.0 resources to run such compatibility tests.
  • High-value datasets must be reusable with the least restrictions possible, subject at most to requirements that preserve provenance and openness. Currently, the European Commission risks creating use silos if governments are allowed to add “any restrictions on re-use” to the terms of high-value datasets.
  • Publicly funded undertakings should only be able to charge marginal costs.
  • Public undertakings, publicly funded research facilities and non-executive government branches should be required to publish data referenced in the PSI Directive.

Conformant licences according to the Open Definition, opendefinition.org/licenses
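To make the “standard open licences” recommendation concrete, here is a minimal sketch of the kind of automated check a data portal could run against its catalogue. The dataset records are hypothetical, and the whitelist below hard-codes only a few well-known Open Definition-conformant licences; a real check should use the full registry at opendefinition.org/licenses and, for compatibility questions, the LAPSI 2.0 guidance.

```python
# Minimal sketch: flag catalogue entries whose licence is not on a whitelist of
# Open Definition-conformant licences. Records and the whitelist are illustrative;
# a real check should load the full registry from opendefinition.org/licenses.
OPEN_DEFINITION_CONFORMANT = {
    "CC0-1.0",        # Creative Commons public domain dedication
    "CC-BY-4.0",      # Creative Commons Attribution
    "CC-BY-SA-4.0",   # Creative Commons Attribution-ShareAlike
    "ODC-PDDL-1.0",   # Open Data Commons Public Domain Dedication and Licence
    "ODC-BY-1.0",     # Open Data Commons Attribution
    "ODbL-1.0",       # Open Data Commons Open Database Licence
}

catalogue = [  # hypothetical catalogue records
    {"title": "Address register", "licence": "CC-BY-4.0"},
    {"title": "Company register", "licence": "proprietary-terms"},
]

for record in catalogue:
    status = "open" if record["licence"] in OPEN_DEFINITION_CONFORMANT else "NOT open"
    print(f'{record["title"]}: {record["licence"]} ({status})')
```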

Our recommendations do not pose unworkable requirements or a disproportionately high administrative burden, but are essential to realise the goals of the PSI Directive with regard to:
  1. Increasing the amount of public sector data available to the public for re-use,
  2. Harmonising the conditions for non-discrimination and re-use in the European market,
  3. Ensuring fair competition and easy access to markets based on public sector information,
  4. Enhancing cross-border innovation, and an internal market where Union-wide services can be created to support the European data economy.

Our recommendations, explained: What would the proposed PSI Directive mean for the future of open data?

Publication of high-value data

The European Commission proposes to define a list of ‘high value datasets’ that shall be published under the terms of the PSI Directive. This includes publishing datasets in machine-readable formats, under standard open licences and, in many cases, free of charge, except when high-value datasets are collected by public undertakings in environments where free access to data would distort competition. “High value datasets” are defined as documents that bring socio-economic benefits, “notably because of their suitability for the creation of value-added services and applications, and the number of potential beneficiaries of the value-added services and applications based on these datasets”. The EC also makes reference to existing lists of high-value data, such as the list of key data defined by the G8 Open Data Charter. Identifying high-value data poses at least three problems:
  1. High-value datasets may be unusable in the Digital Single Market: the EC may “define other applicable modalities”, such as “any conditions for re-use”. There is a risk that a list of EU-wide high-value datasets also includes use restrictions violating the Open Definition. Given that the list of high-value datasets will be transposed by all member states, adding “any conditions” may significantly hinder the reusability of datasets and the ability to combine them.
  2. Defining the value of data is not straightforward. Recent papers, from Oxford University to Open Data Watch and the Global Partnership for Sustainable Development Data, demonstrate disagreement about what data’s “value” is. What counts as high-value data should not only be based on quantitative indicators such as growth figures, the number of apps or the number of beneficiaries, but should also draw on qualitative assessments and expert judgement from multiple disciplines.
  3. Public deliberation and participation are key to defining high-value data and avoiding biased value assessments. Impact assessments and cost-benefit calculations come with their own methodological biases, and can unfairly favour data with economic value at the expense of fuzzier social benefits. Currently, the PSI Directive does not allow data created under market conditions to be considered high-value data if free access would distort competition. We recommend that the PSI Directive add a stronger requirement to weigh economic costs against societal benefits, drawing on multiple assessment methods (see point 2; a toy scoring sketch follows this list). The criteria, methods and processes used to determine high value must be transparent and accessible to the broader public, to enable the public to negotiate benefits and to reflect the viewpoints of many stakeholders.
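The sketch below illustrates one possible way to blend quantitative indicators with qualitative expert judgement into a single value score. The indicator names, weights and scores are entirely illustrative assumptions, not a proposed official methodology; the point is only that both kinds of evidence can be made explicit and weighed transparently.

```python
# Toy sketch of combining quantitative indicators with qualitative expert judgement
# when assessing the "value" of a candidate dataset. Indicator names, weights and
# scores are illustrative, not an official assessment method.
def composite_value_score(quantitative, qualitative, weight_quant=0.5):
    """Average each group of 0-1 scores, then blend the two group averages."""
    q = sum(quantitative.values()) / len(quantitative)
    e = sum(qualitative.values()) / len(qualitative)
    return weight_quant * q + (1 - weight_quant) * e

candidate = {
    "quantitative": {"reuse_requests": 0.8, "estimated_market_size": 0.6},
    "qualitative": {"transparency_benefit": 0.9, "social_inclusion": 0.7},
}

print(composite_value_score(candidate["quantitative"], candidate["qualitative"]))
```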

Expansion of scope

The new PSI Directive takes into account data from “public undertakings”. This covers services in the general interest entrusted to entities outside of the public sector, over which government maintains a high degree of control. The PSI Directive also includes data from non-executive government branches (i.e. from the legislative and judiciary branches of government), as well as data from publicly funded research. Opportunities and challenges include:
  • None of the data holders newly included in the PSI Directive are obliged to publish data; publication remains at their discretion. Only if they choose to publish data must they follow the guidelines of the proposed PSI Directive.
  • The PSI Directive aims to keep administrative costs low: all of the above-mentioned data holders are exempt from data access requests.
  • In summary, the proposed PSI Directive leaves too much room for individual choice about publishing data and has no “teeth”. To accelerate the publication of general-interest data, the PSI Directive should oblige data holders to publish data. Waiting several years to make the publication of this data mandatory, as happened with the first version of the PSI Directive, risks significantly hampering the availability of key data that is important for accelerating growth in Europe’s data economy.
  • For research data in particular, only data that is already published should fall under the new directive. Even though the PSI Directive will require member states to develop open access policies, their implementation should build upon the EU’s recommendations for open access.

Legal incompatibilities may jeopardise the Digital Single Market

Most notably, the proposed PSI Directive does not address problems around licensing, which are a major impediment for Europe’s Digital Single Market. Europe’s data economy can only benefit from open data if licence terms are standardised: this allows data from different member states to be combined without legal issues, makes it possible to combine datasets and create cross-country applications, and sparks innovation. Europe’s licensing ecosystem is a patchwork of many (possibly conflicting) terms, creating use silos and legal uncertainty. Yet the current proposal not only speaks vaguely about standard open licences, leaving it to national policies to add “less restrictive terms than those outlined in the PSI Directive”; it also contradicts its aim of smoothing the Digital Single Market by encouraging the creation of bespoke licences, suggesting that governments may add new licence terms with regard to real-time data publication. Currently, the PSI Directive would allow the European Commission to add “any conditions for re-use” to high-value datasets, thereby encouraging legal incompatibilities (see Article 13 (4.a)). We strongly recommend that the PSI Directive draw on the EU co-funded LAPSI 2.0 recommendations to understand licence incompatibilities and ensure a compatible open licence ecosystem.

I’d like to thank Pierre Chrzanowski, Mika Honkanen, Susanna Ånäs, and Sander van der Waal for their thoughtful comments while writing this blog post.

Image adapted from Max Pixel.

** Its official name is Directive 2003/98/EC on the re-use of public sector information.
