The Open Human Genome, twenty years on

- June 26, 2020 in Open Data, Open Knowledge Foundation, Open Science

On 26th June 2000, the “working draft” of the human genome sequence was announced to great fanfare. Its availability has gone on to revolutionise biomedical research. But this iconic event, twenty years ago today, is also a reference point for the value and power of openness and its evolution.

Biology’s first mega project

Back in 1953, it was discovered that DNA was the genetic material of life. Every cell of every organism contains a copy of its genome, a long sequence of DNA letters, containing a complete set of instructions for that organism. The first genome of a free-living organism – a bacterium – was only determined in 1995 and contained just over half a million letters. At the time, sequencing machines determined 500-letter fragments, 100 at a time, with each run taking hours. Since the human genome contains about three billion letters, sequencing it was an altogether different proposition, going on to cost of the order of three billion dollars.

A collective international endeavour, and a fight for openness

It was sequenced through a huge collective effort by thousands of scientists across the world in many stages, over many years. The announcement on 26th June 2000 was only of a draft – but still sufficiently complete to be analysed as a whole. Academic articles describing it wouldn’t be published for another year, but the raw data was completely open, freely available to all.

It might not have been so, as some commercial forces, seeing the value of the genome, tried to shut down government funding in the US and privatise access. However, openness won out, thanks largely to the independence and financial muscle of Wellcome (which paid for a third of the sequencing at the Wellcome Sanger Institute) and the commitment of the US National Institutes of Health. Data for each fragment of DNA was released onto the internet just 24 hours after it had been sequenced, with the whole genome accessible through websites such as Ensembl.

Openness for data, openness for publications

Scientists publish. Other scientists try to build on their work. However, as science has become increasingly data rich, access to the data has become as important as publication. In biology, long before genomes, there were efforts by scientists, funders and publishers to link publication with data deposition in public databases hosted by organisations such as EBI and NCBI. However, publication can take years and if a funder has made a large grant for data generation, should the research community have to wait until then?

The Human Genome Sequence, with its 24-hour data release model, was at the vanguard of “pre-publication” data release in biology. Initially the human genome was seen as a special case – scientists worried about raw unchecked data being released to all, or that others might beat them to publication if such data release became general – but gradually the idea took hold. Dataset generators have found that transparency has generally been beneficial to them and that community review of raw data has allowed errors to be spotted and corrected earlier. Pre-publication data release is now well established where funders are paying for data generation that has value as a community resource, including most genome-related projects. And once you have open access data, you can’t help thinking about open access publication too. The movement to change the academic publishing business model to open access dates back to the 1990s, but long before open access became mandated by funders and governments it became the norm for genome-related papers.

Big data comes to biology, forcing it to grow up fast

Few expected the human genome to be sequenced so quickly. Even fewer expected the price to sequence one to have dropped to less than $1000 today, or to only take 24 hours on a single machine. “Next Generation” sequencing technology has led to million-fold reductions in price and similar gains in output per machine in less than 20 years. This is the most rapid improvement in any technology, far exceeding the improvements in computing in the same period. The genomes of tens of thousands of different organisms have been sequenced as a result.  Furthermore, the change in output and price has made sequencing a workhorse technology throughout biological and biomedical research – every cell of an organism has an identical copy of its genome, but each cell (37 trillion in each human) is potentially doing something different, which can also be captured by sequencing. Public databases have therefore been filling up with sequence data, doubling in size as much as every six months, as scientists probe how organisms function. Sequence is not the only biological data type being collected on a large scale, but it has been the driver to making biology a big data science.
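To put the comparison with computing in rough numbers, here is a back-of-the-envelope sketch. It takes the million-fold price reduction and the 20-year span from the paragraph above; the two-year doubling time for computing is an assumed rule of thumb, not a figure from this article, and the function names are purely illustrative.

```python
import math


def halving_time_years(fold_reduction: float, years: float) -> float:
    """Years per halving implied by a given fold reduction over a period."""
    return years / math.log2(fold_reduction)


def fold_change(years: float, doubling_time_years: float) -> float:
    """Total fold improvement after `years` at a fixed doubling time."""
    return 2 ** (years / doubling_time_years)


# Figures from the text: a million-fold price drop in (at most) 20 years.
sequencing_halving = halving_time_years(1_000_000, 20)

# ASSUMPTION: computing improves by doubling every two years (rule of thumb).
computing_gain = fold_change(20, 2.0)

print(f"Sequencing price halved roughly every {sequencing_halving:.2f} years")
print(f"Computing at a two-year doubling time: about {computing_gain:.0f}-fold in 20 years")
```

Under these assumptions, the price of sequencing halved about once a year, while a two-year doubling time gives only about a thousand-fold gain over the same two decades.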

Genomics and medicine, openness and privacy

Every individual’s genome is slightly different and some of those differences may cause disease. Clinical geneticists have been testing individual genes of patients to find the cause of rare diseases for more than twenty years, but sequencing the whole genome to simplify the hunt is now affordable and practical. Right now our understanding of the genome is only sufficient to inform clinical care for a small number of conditions, but it’s already enough for the UK NHS to roll out whole genome sequencing as part of the new Genomic Medicine Service, after testing this in the 100,000 Genomes Project. It is the first national healthcare system in the world to do this.

How much could your healthcare be personalised and improved through analysis of your genome? Right now, an urgent focus is on whether genome differences affect the severity of COVID-19 infections. Ultimately, understanding how the human genome works and how DNA differences affect health will depend on research on the genomes of large numbers of individuals alongside their medical records. Unlike the original reference human genome, this is not open data but highly sensitive, private, personal data.

The challenge has become to build systems that can allow research but are trusted by individuals sufficiently for them to consent to their data being used. What was developed for the 100,000 Genomes Project, in consultation with participants, was a research environment that functions as a reading library – researchers can run complex analyses on de-identified data within a secure environment but cannot take individual data out. They are restricted to just the statistical summaries of their research results. This Trusted Research Environment model is now being looked at for other sources of sensitive health data.
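The “reading library” idea can be illustrated with a toy sketch. This is not the system built for the 100,000 Genomes Project. The class name, the record fields and the minimum group size of five are all hypothetical choices made for illustration, and a real environment relies on infrastructure, governance and output checking rather than a Python class. The point is only the shape of the interface: analyses go in, summaries come out, and individual records never leave.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class Record:
    """One de-identified participant record (fields are hypothetical)."""
    carries_variant: bool
    severity_score: float


class SummaryOnlyEnvironment:
    """Toy model of a research environment that releases only aggregates."""

    # ASSUMPTION: groups smaller than this are suppressed so that a summary
    # cannot single out an individual. The value 5 is illustrative only.
    MIN_GROUP_SIZE = 5

    def __init__(self, records: Sequence[Record]) -> None:
        self._records = tuple(records)  # kept inside; no accessor is exposed

    def summarise(
        self,
        select: Callable[[Record], bool],
        measure: Callable[[Record], float],
    ) -> dict:
        """Return the count and mean of `measure` over the selected records."""
        values = [measure(r) for r in self._records if select(r)]
        if len(values) < self.MIN_GROUP_SIZE:
            return {"count": None, "mean": None, "suppressed": True}
        return {"count": len(values), "mean": mean(values), "suppressed": False}


# Made-up example data: six carriers and three non-carriers.
env = SummaryOnlyEnvironment(
    [Record(True, 3.0), Record(True, 4.0), Record(False, 1.0)] * 3
)
carriers = env.summarise(lambda r: r.carries_variant, lambda r: r.severity_score)
non_carriers = env.summarise(lambda r: not r.carries_variant, lambda r: r.severity_score)
print(carriers)
print(non_carriers)
```

In this sketch the carrier group is large enough for its summary to be released, while the non-carrier group falls below the assumed threshold and is suppressed.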

The open data movement has come a long way in twenty years, showing the benefits to society of organisational transparency that results from data sharing and the opportunities that come from data reuse. The Reference Human Genome Sequence as a public good has been part of that journey. However, not all data can be open, even if the ability to analyse it has great value to society. If we want to benefit from the analysis of private data, we have to find a middle ground which preserves some of the strengths of openness, such as sharing analytical tools and summary results, while adapting to constrained analysis environments designed to protect privacy sufficiently to satisfy the individuals whose data it is.

Professor Tim Hubbard is a board member of the Open Knowledge Foundation and was one of the organisers of the sequencing of the human genome.

Meet the organisations who have been awarded Open Data Day 2020 mini-grants

- February 20, 2020 in Open Data, Open Data Day, Open Data Day 2020

The Open Knowledge Foundation is happy to announce that dozens of organisations from all over the world have been awarded mini-grants to support the running of events that celebrate Open Data Day on Saturday 7th March 2020.

Thanks to the generous support of this year’s mini-grant funders – Datopian, the Foreign & Commonwealth Office, Hivos, the Latin American Open Data Initiative (ILDA), Mapbox, Open Contracting Partnership and Resource Watch – the Open Knowledge Foundation will be giving out a total of 67 mini-grants to the organisations listed below in order to help them run great events on or around Open Data Day. We received 246 mini-grant applications this year and were greatly impressed by the quality of the events being organised all over the world.

Learn more about Open Data Day, discover events taking place and find out how to get technical assistance or connect with the global open data community by checking out the information at the bottom of this blogpost.

Here are the organisations whose Open Data Day events will be supported by mini-grants, divided up by the tracks their events are devoted to:

Environmental data

  • Escuela de Fiscales’ event in Argentina will promote the use of open data for the training, dissemination and development of civic activism in the preservation of the environment in the community
  • Nigeria’s Adamawa Agricultural Development Programme will sensitise fishery stakeholders – especially fishermen –  on the importance of stock taking to prevent overfishing in our water bodies and how to update the fisheries database using open data
  • Afonte Jornalismo de Dados (Afonte Data Journalism) in Brazil will provide awareness about environmental politics and empower the community to use public and open data
  • The Department of Agriculture at the Asuogyaman District Assembly in Ghana will host local farming organisations to create awareness on the need for data to be open and to show the effect of climate change on agriculture and related livelihoods using rainfall data 
  • An Open Data Day event planned at Tangaza University College in Kenya will discuss how to tackle climate change challenges with data
  • The University of Dodoma in Tanzania will invite girls from a local school to a geospatial open data networking event to instill environmental thinking among young girls
  • The Open Internet for Democracy team and Creative Commons Venezuela chapter will join forces to train a group of environmental journalists about open and reliable data sources they can use to develop stories
  • iWatch Africa will host a forum in Ghana on leveraging the power of public domain satellite and drone imagery to track deforestation and water pollution in West Africa
  • Ghana’s Africa Open Data and Internet Research Foundation will run a hackathon on how local communities can use open data for sustainable development especially to improve sanitation issues
  • Sustenta in Mexico will share knowledge about sustainable development, climate change and sustainability
  • Grafoscopio / HackBo (Colombia) will bring together two citizen science communities working on air quality issues and reproducible research, data activism, visualisation and storytelling
  • Youth for Environmental Development (Malawi) will inspire university students to take action and contribute to environmental protection through mapping
  • WikiRate from Germany will engage the public in the research and collection of open data about how companies are impacting climate change
  • Liga de Defensa del Medio Ambiente (LIDEMA) from Bolivia plan to identify open data sources that can help address socio-environmental conflicts
  • The Open Cities Lab team in South Africa will create an open and accessible space for community scientists to meet, network and collaborate on an air quality project
  • Costa Rica’s ACCESA will help attendees identify and visualise new and unexpected relationships and connections between land-use and territorial planning, on the one hand, and climate change and decarbonisation, on the other
  • Young Volunteers for the Environment from Togo will promote the use of open data in environmental protection
  • Técnicas Rudas will collectively explore Mexico’s mandated public data on construction projects and their environmental impacts

Tracking public money flows

  • Spotlight for Transparency and Accountability Initiative in Nigeria will host an event to increase understanding of and access to local budget data
  • EldoHub will hold a hackathon to develop tools and systems which can facilitate county governments’ involvement in Kenya’s transparency, accountability and public participation
  • FollowTheMoney Kaduna will use contracting data including responses to FOI letters and on the spot assessment of projects and infrastructures across communities in Kaduna state, Nigeria
  • The Alliance of Independent Journalists in Bandung, Indonesia will use open contracting data to encourage collaboration among civil society groups to access and monitor public budgets
  • Afroleadership in Cameroon will organise a training on the analysis of budget data by civil society using open data
  • The event run by Construction Sector Transparency Initiative (CoST Malawi) will call for greater transparency and accountability in public budget management through the Open Contracting for Infrastructure Data Standard
  • Somalia’s Bareedo Platform will encourage uptake of local public contracting data
  • The Perkumpulan Inisiatif in Indonesia plan to host a youth open budget hack clinic building on the principles of public participation in fiscal policy from the Global Initiative for Fiscal Transparency
  • Dataphyte in Nigeria will support change agents to track and use budget, procurement and revenue data to demand accountability
  • The Collective of Journalists for Peace and Freedom of Expression from Mexico will design a workshop to explain all the contracts of the City Council of Mazatlan, Sinaloa
  • Diálogos will visualise the volume of public procurement of the main ministries of the Government of Guatemala 
  • The 1991 Open Data Incubator will facilitate a workshop and discussions to share the experiences of many parties working with or producing open data in Ukraine
  • LEAD University in Costa Rica will organise an event for data science students to meet public officials behind the National Public Procurement Portal
  • Russia’s Infoculture will hold a conference on open data and information transparency
  • The Kikandwa Rural Communities Development Organisation will showcase the Uganda Budget Information website and how to use it to report, track and monitor public funds
  • The Centre for Information, Peace and Security in Africa will work with journalists in Tanzania to promote openness in public contracting, with a focus on transparency and integrity in public expenditure and value for public money
  • Bolivia’s CONSTRUIR Foundation will organise a data camp to advocate for more and better public contracting data
  • Ojoconmipisto in Guatemala will teach students and journalists how to investigate and tell stories from public budget and contracting data
  • Sluggish Hackers will use their event to investigate how to track public money flows from the National Assembly or local assemblies in South Korea

Open mapping

  • Exegetic Analytics in South Africa will expose the South African R community to a range of resources for working with open spatial data
  • OpenMap Development Tanzania will spread awareness on the usefulness of open data for development among participants through workshops, trainings, break-out sessions and a mapathon 
  • Spain’s TuTela Learning Network will map the housing situation of migrant women in Granada
  • ODI Leeds in the UK will host a data surgery to assist attendees with their data, converting the data into GeoJSON files and mapping it
  • Girolabs from Paraguay will show initiatives using and producing open data
  • Open Knowledge Belgium will use open data to build a map visualising the street names of Hasselt by gender
  • BloGoma (Blogosphère Gomatracienne) in the Democratic Republic of the Congo will use open mapping solutions to increase young people’s knowledge of free local HIV-related services
  • OpenStreetMap Kenya and Map Kibera will empower young people in Kibera Slum (Africa’s largest urban slum) with skills in open mapping
  • The University of Pretoria in South Africa will develop a complete map of minibus taxi routes in Mamelodi East with the local knowledge of school learners
  • Brazil’s Federal University of Bahia wants to popularise open data mapping systems, especially OpenStreetMap, among undergraduate students and young people from vulnerable areas of Salvador
  • Youth Innovation Lab in Nepal will showcase crowdsourced streetlights data for Kathmandu collected by digital volunteers to influence policy for the maintenance of streetlights
  • Transparência Hackday Portugal / Open Knowledge Portugal will host a morning of hacking and learning, followed by an afternoon of quick talks and networking

Data for equal development

  • Footprints Bridge International will focus on how open data can help create jobs for rural youth and women in Ghana
  • The Bangladesh chapter of Creative Commons will host a mini conference to discuss the benefits of open source projects and open government data in the country
  • Nigeria’s National Institute for Freshwater Fisheries Research is developing an event to sensitise agricultural stakeholders on the need and benefits of data for equal development
  • MapBeks will organise a mapping party in the Philippines to highlight HIV facilities and LGBT-friendly spaces on OpenStreetMap
  • Khumbo Bangala Chirembo is a librarian at the Lilongwe University of Agriculture and Natural Resources in Malawi. He will host a workshop for other librarians to raise awareness of open data and its benefits
  • Mexico’s Future Lab aims to give visibility to women and the LGBT community in local decision making within government, business and civil society using open data
  • Zimbabwe Library Association’s Open Data Day event will highlight the importance of open data in promoting and supporting the girl child, as well as raising awareness of the negative effects of gender-based violence against women and the role that libraries can play in providing current awareness to communities
  • Young Professionals for Agricultural Development (YPARD) in the Democratic Republic of the Congo will help young and female agricultural entrepreneurs explore how they can use open data to create new businesses
  • Datasketch in Colombia will organise a series of lightning talks from social entrepreneurs and journalists to share their work using open data
  • Youths in Technology and Development Uganda plan to share innovative data tools and a FAIR open data road map to measure progress against the SDGs in the country
  • Mexico’s Ligalab will open a space for local speakers to present their open data projects and for the community to gather and engage with local issues towards equal development
  • The Association SUUDU ANDAL in Burkina Faso plan to emphasise the importance of open data for development and accountability during their event
  • Argentina’s Fundación Conocimiento Abierto will run a Data Camp on gender and diversity before spending the next few months developing local apps using open data
  • NaimLab Peru are organising an event for undergraduate students to share the open data work being done by local and national organisations
  • YWCA Honduras’ event will host a focus group for local women from middle and low income backgrounds to discuss and generate data on female migration in Honduras
  • We Are Capable Data for Good Namibia (WACDGN) will train young Namibians in using data science skills for sustainable development projects
  • Tutator from Bolivia will use their event to understand the impact of open data in the livelihood of the beneficiaries of social services

About Open Data Day

Open Data Day is the annual event where we gather to reach out to new people and build new solutions to issues in our communities using open data. The tenth Open Data Day will take place on Saturday 7th March 2020. If you have started planning your Open Data Day event already, please add it to the global map on the Open Data Day website using this form. You can also connect with others and spread the word about Open Data Day using the #OpenDataDay or #ODD2020 hashtags. Alternatively you can join the Google Group to ask for advice or share tips. To get inspired with ideas for events, you can read about some of the great events which took place on Open Data Day 2019 in our wrap-up blog post.

Technical support

As well as sponsoring the mini-grant scheme, Datopian will be providing technical support on Open Data Day 2020. Discover key resources on how to publish any data you’re working with via datahub.io and how to reach out to the Datopian team for assistance via Gitter by reading their Open Data Day blogpost.

Need more information?

If you have any questions, you can reach out to the Open Knowledge Foundation team by emailing network@okfn.org or on Twitter via @OKFN. There’s also the Open Data Day Google Group where you can connect with others interested in taking part.

Announcing the launch of the Open Data Day 2020 mini-grant scheme

- January 16, 2020 in Featured, Open Data, Open Data Day, Open Data Day 2020, Open Knowledge Foundation

We are happy to announce the launch of the mini-grant scheme for Open Data Day 2020. This scheme will provide small funds to support the organisation of open data-related events across the world on Saturday 7th March 2020.

Thanks to the generous support of this year’s mini-grant funders – Datopian, the Foreign & Commonwealth Office, Hivos, the Latin American Open Data Initiative (ILDA), Mapbox, Open Contracting Partnership and the World Resources Institute – the Open Knowledge Foundation will be able to give out 60 mini-grants this year.

Applications for the mini-grant scheme must be submitted before midnight GMT on Sunday 9th February 2020 via filling in this form. To be awarded a mini-grant, your event must fit into one of the four tracks laid out below. Event organisers can only apply once and for just one track.

Mini-grant tracks for Open Data Day 2020

Each year, the Open Data Day mini-grant scheme looks to highlight and support particular types of open data events by focusing applicants on a number of thematic tracks. This year’s tracks are:
  • Environmental data: Use open data to illustrate the urgency of the climate emergency and spur people into action to take a stand or make changes in their lives to help the world become more environmentally sustainable.
  • Tracking public money flows: Expand budget transparency, dive into public procurement, examine tax data or raise issues around public finance management by submitting Freedom of Information requests.
  • Open mapping: Learn about the power of maps to develop better communities.
  • Data for equal development: How can open data be used by communities to highlight pressing issues on a local, national or global level? Can open data be used to track progress towards the Sustainable Development Goals or SDGs?

What is a mini-grant?

A mini-grant is a small fund of between $200 and $300 USD to help support groups organising Open Data Day events. Event organisers can only apply once and for just one track. The mini-grants cannot be used to fund government events, whether national or local. We can only support civil society actions. We encourage governments to find local groups and engage with them if they want to organise events and apply for a mini-grant.

The funds will only be delivered to the successful grantees after their event takes place and once the Open Knowledge Foundation team receives a draft blogpost about the event for us to publish on blog.okfn.org. In case the funds are needed before 7th March 2020, we will assess whether or not we can help on a case-by-case basis.

About Open Data Day

Open Data Day is the annual event where we gather to reach out to new people and build new solutions to issues in our communities using open data. The tenth Open Data Day will take place on Saturday 7th March 2020. If you have started planning your Open Data Day event already, please add it to the global map on the Open Data Day website using this form. You can also connect with others and spread the word about Open Data Day using the #OpenDataDay or #ODD2020 hashtags. Alternatively you can join the Google Group to ask for advice or share tips. To get inspired with ideas for events, you can read about some of the great events which took place on Open Data Day 2019 in our wrap-up blog post.

Technical support

As well as sponsoring the mini-grant scheme, Datopian will be providing technical support on Open Data Day 2020. Discover key resources on how to publish any data you’re working with via datahub.io and how to reach out to the Datopian team for assistance via Gitter by reading their Open Data Day blogpost.

Need more information?

If you have any questions, you can reach out to the Open Knowledge Foundation team by emailing network@okfn.org or on Twitter via @OKFN. There’s also the Open Data Day Google Group where you can connect with others interested in taking part.

Open Educational Resources (OER) receive international support through UNESCO vote

- December 11, 2019 in News, Open Data, open data, open access, civil society

    The international organisation UNESCO has approved a measure to encourage the development and further evolution of open educational resources (OER) – the free, shareable educational content that has gained a strong foothold in many school systems in the United States. This policy recommendation, recently adopted by UNESCO, promotes collaboration […]

csv,conf returns for version 5 in May

- October 15, 2019 in #CSVconf, Events, Frictionless Data, News, Open Data, Open Government Data, Open Research, Open Science, Open Software

Save the data for csv,conf,v5! The fifth version of csv,conf will be held at the University of California, Washington Center in Washington DC, USA, on May 13 and 14, 2020.

If you are passionate about data and its application to society, this is the conference for you. Submissions for session proposals for 25-minute talk slots are open until February 7, 2020, and we encourage talks about how you are using data in an interesting way (like to uncover a crossword puzzle scandal). We will be opening ticket sales soon, and you can stay updated by following our Twitter account @CSVconference.

csv,conf is a community conference that is about more than just comma-separated-values – it brings together a diverse group to discuss data topics including data sharing, data ethics, and data analysis from the worlds of science, journalism, government, and open source. Over two days, attendees will have the opportunity to hear about ongoing work, share skills, exchange ideas (and stickers!) and kickstart collaborations.

Attendees of csv,conf,v4

First launched in July 2014, csv,conf has expanded to bring together over 700 participants from 30 countries with backgrounds from varied disciplines. If you’ve missed the earlier years’ conferences, you can watch previous talks on topics like data ethics, open source technology, data journalism, open internet, and open science on our YouTube channel. We hope you will join us in Washington D.C. in May to share your own data stories and join the csv,conf community!

csv,conf,v5 is supported by the Sloan Foundation through OKF’s Frictionless Data for Reproducible Research grant as well as by the Gordon and Betty Moore Foundation, and the Frictionless Data team is part of the conference committee. We are happy to answer all questions you may have or offer any clarifications if needed. Feel free to reach out to us on csv-conf-coord@googlegroups.com, on Twitter @CSVconference or our dedicated community Slack channel.

We are committed to diversity and inclusion, and strive to be a supportive and welcoming environment to all attendees. To this end, we encourage you to read the Conference Code of Conduct.

While we won’t be flying Rojo the Comma Llama to DC for csv,conf,v5, we will have other mascot surprises in store.

Transforming the UK’s data ecosystem: Open Knowledge Foundation’s thoughts on the National Data Strategy

- July 17, 2019 in National Data Strategy, Open Data, Open Government Data, Open Knowledge, Policy

Following an open call for evidence issued by the UK’s Department for Digital, Culture, Media and Sport, Open Knowledge Foundation submitted our thoughts about what the UK can do in its forthcoming National Data Strategy to “unlock the power of data across government and the wider economy, while building citizen trust in its use”. We also signed a joint letter alongside other UK think tanks, civil and learned societies calling for urgent action from government to overhaul its use of data. Below our CEO Catherine Stihler explains why the National Data Strategy needs to be transformative to ensure that British businesses, citizens and public bodies can play a full role in the interconnected global knowledge economy of today and tomorrow: Today’s digital revolution is driven by data. It has opened up extraordinary access to information for everyone about how we live, what we consume, and who we are. But large unaccountable technology companies have also monopolised the digital age, and an unsustainable concentration of wealth and power has led to stunted growth and lost opportunities. Governments across the world must now work harder to give everyone access to key information and the ability to use it to understand and shape their lives; as well as making powerful institutions more accountable; and ensuring vital research information that can help us tackle challenges such as poverty and climate change is available to all. In short, we need a future that is fair, free and open. The UK has a golden opportunity to lead by example, and the Westminster government is currently developing a long-anticipated National Data Strategy. Its aim is to ensure all citizens and organisations trust the data ecosystem, are sufficiently skilled to operate effectively within it, and can get access to high-quality data when they need it. Laudable aims, but they must come with a clear commitment to invest in better data and skills. 
The Open Knowledge Foundation I am privileged to lead was launched 15 years ago to pioneer the way that we use data, working to build open knowledge in government, business and civil society – and creating the technology to make open material useful. This week, we have joined with a group of think tanks, civil and learned societies to make a united call for sweeping reforms to the UK’s data landscape. In order for the strategy to succeed, there needs to be transformative, not incremental, change and there must be leadership from the very top, with buy-in from the next Prime Minister, Culture Secretary and head of the civil service. All too often, piecemeal incentives across Whitehall prevent better use of data for the public benefit. A letter signed by the Open Knowledge Foundation, the Institute for Government, Full Fact, Nesta, the Open Data Institute, mySociety, the Royal Statistical Society, the Open Contracting Partnership, 360Giving, OpenOwnership, and the Policy Institute at King’s College London makes this clear. We have called for investment in skills to convert data into real information that can be acted upon; challenged the government to earn the public’s trust, recognising that the debate about how to use citizens’ data must be had in public, with the public; proposed a mechanism for long-term engagement between decision-makers, data users and the public on the strategy and its goals; and called for increased efforts to fix the government’s data infrastructure so organisations outside the government can benefit from it. Separately, we have also submitted our own views to the UK Government, calling for a focus on teaching data skills to the British public. Learning such skills can prove hugely beneficial to individuals seeking employment in a wide range of fields including the public sector, government, media and voluntary sector.  
But at present there is often a huge amount of work required to clean up data in order to make it usable before insights or stories can be gleaned from it. We believe that the UK government could help empower the wider workforce by instigating or backing a fundamental data literacy training programme open to local communities working in a range of fields to strengthen data demand, use and understanding. Without such training and knowledge, large numbers of UK workers will be ill-equipped to take on many jobs of the future where products and services are devised, built and launched to address issues highlighted by data. Empowering people to make better decisions and choices informed by data will boost productivity, but not without the necessary investment in skills. We have also told the government that one of the most important things it can do to help businesses and non-profit organisations best share the data they hold is to promote open licensing. Open licences are legal arrangements that grant the general public rights to reuse, distribute, combine or modify works that would otherwise be restricted under intellectual property laws. We would also like to see the public sector pioneering new ways of producing and harnessing citizen-generated data efforts by organising citizen science projects through schools, libraries, churches and community groups. These local communities could help the government to collect high-quality data relating to issues such as air quality or recycling, while also leading the charge when it comes to increasing the use of central government data. We live in a knowledge society where we face two different futures: one which is open and one which is closed. A closed future is one where knowledge is exclusively owned and controlled, leading to greater inequality and a closed society.
But an open future means knowledge is shared by all – freely available to everyone, a world where people are able to fulfil their potential and live happy and healthy lives. The UK National Data Strategy must emphasise the importance and value of sharing more, better quality information and data openly in order to make the most of the world-class knowledge created by our institutions and citizens.  Without this commitment at all levels of society, British businesses, citizens and public bodies will fail to play a full role in the interconnected global knowledge economy of today and tomorrow.

Missed opportunities in the EU’s revised open data and re-use of public sector information directive

- July 9, 2019 in European Union, Open Data, Open Government Data, Open Research

Published by the European Union on June 26th, the revised directive on open data and the re-use of public sector information – or PSI Directive – sets out an updated set of rules relating to public sector documents, publicly funded research data and “high-value” datasets which should be made available for free via application programming interfaces or APIs. EU member states have until July 2021 to incorporate the directive into law.

While Open Knowledge Foundation is encouraged to see some of the new provisions, we have concerns – many of which we laid out in a 2018 blogpost – about missed opportunities for further progress towards a fair, free and open future across the EU.

Lack of public input

Firstly, the revised directive gives responsibility for choosing which high-value datasets to publish over to member states but there are no established mechanisms for the public to provide input into the decisions.

Broad thematic categories – geospatial; earth observation and environment; meteorological; statistics; companies and company ownership; and mobility – are set out for these datasets but the specifics will be determined over the next two years via a series of further implementing acts. Datasets eventually deemed to be high-value shall be made “available free of charge … machine readable, provided via APIs and provided as a bulk download, where relevant”.

Despite drawing on our Global Open Data Index to generate a preliminary list of high-value datasets, this decision flies in the face of years of findings from the Index showing how important it is for governments to engage with the public as much and as early as possible to generate awareness and increase levels of reuse of open data. We fear that this could lead to a further loss of public trust by opening the door for special interests, lobbyists and companies to make private arguments against the release of valuable datasets like spending records or beneficial ownership data which is often highly disaggregated and allows monetary transactions to be linked to individuals.

Partial definition of high-value data

Secondly, defining the value of data is also not straightforward. Papers from Oxford University, Open Data Watch and the Global Partnership for Sustainable Development Data demonstrate disagreement about what data’s “value” is. What counts as high-value data should not only be based on quantitative indicators such as potential income generation, breadth of business applications or numbers of beneficiaries – as the revised directive sets out – but also use qualitative assessments and expert judgment from multiple disciplines.

Currently less than a quarter of the data with the biggest potential for social impact is available as truly open data even from countries seen as open data leaders, according to the latest Open Data Barometer report from our colleagues at the World Wide Web Foundation. Why? Because “governments are not engaging enough with groups beyond the open data and open government communities”.

Lack of clarity on recommended licenses

Thirdly, in line with the directive’s stated principle of being “open by design and by default”, we hope to see countries avoiding future interoperability problems by abiding by the requirement to use open standard licences when publishing these high-value datasets.

It’s good to see that the EU Commission itself has recently adopted Creative Commons licences when publishing its own documents and data. But we feel – in line with our friends at Communia – that the Commission should have made clear exactly which open licences they endorsed under the updated directive, by explicitly recommending the adoption of Open Definition compliant licences from Creative Commons or Open Data Commons to member states.

The directive also missed the opportunity to give preference to public domain dedication and attribution licences in accordance with the EU’s own LAPSI 2.0 licensing guidelines, as we recommended. The European Data Portal indicates that there could be up to 90 different licences currently used by national, regional, or municipal governments. Their quality assurance report also shows that they can’t automatically detect the licences used to publish the vast majority of datasets published by open data portals from EU countries.

If they can’t work this out, the public definitely won’t be able to, meaning that any and all efforts to use newly-released data will be restrained by unnecessarily onerous reuse conditions. The more complicated or bespoke the licensing, the more likely data will end up unused in silos, our research has shown.

27 of the 28 EU member states may now have national open data policies and portals but national datasets, once discovered, are currently likely to lack interoperability as well as carrying confusing licensing. For while the EU has substantial programmes of work on interoperability under the European Interoperability Framework, they are not yet having a major impact on the interoperability of open datasets.

Open Knowledge Foundation research report: Avoiding data use silos

More FAIR data

Finally, we welcome the provisions in the directive obliging member states to “[make] publicly funded research data openly available following the principle of open by default and compatible with FAIR principles.” We know there is much work to be done but hope to see wide adoption of these rules and that the provisions for not releasing publicly-funded data due to “confidentiality” or “legitimate commercial interests” will not be abused.

The next two years will be a crucial period to engage with these debates across Europe and to make sure that EU countries embrace the directive’s principle of openness by default to release more, better information and datasets to help citizens strive towards a fair, free and open future.

Statement from the Open Knowledge Foundation Board on the future of the CKAN Association

- June 6, 2019 in ckan, Open Data, Open Knowledge, Open Knowledge Foundation

The Open Knowledge Foundation (OKF) Board met on Monday evening to discuss the future of the CKAN Association.

The Board supported the CKAN Stewardship proposal jointly put forward by Link Digital and Datopian. As two of the longest serving members of the CKAN Community, it was felt their proposal would now move CKAN forward, strengthening both the platform and community.

In appointing joint stewardship to Link Digital and Datopian, the Board felt there was a clear practical path with strong leadership and committed funding to see CKAN grow and prosper in the years to come.

OKF will remain the ‘purpose trustee’ to ensure the Stewards remain true to the purpose and ethos of the CKAN project. The Board would like to thank everyone who contributed to the deliberations and we are confident CKAN has a very bright future ahead of it.

If you have any questions, please get in touch with Steven de Costa, managing director of Link Digital, or Paul Walsh, CEO of Datopian, by emailing stewards@ckan.org.

Fighting for a more open world: our CEO’s keynote speech at Open Belgium 2019

- March 4, 2019 in Open Data, Open Knowledge, Open Knowledge International, Talks

On Monday 4th March 2019, Catherine Stihler, the new chief executive of Open Knowledge International, will deliver a keynote speech – Fighting for a more open world – at the Open Belgium 2019 conference in Brussels. Read the speech below and visit the Open Belgium website or follow the hashtag to learn more about the event.

Thanks to Open Knowledge Belgium for inviting me to speak today. It is great to be with you all in what is my fourth week in my new role as Chief Executive of Open Knowledge International. This is the first time I have been in Brussels since serving for 20 years as an MEP for Scotland. During that time, I worked on copyright reform and around openness with a key focus on intellectual property rights and freedom of expression. Digital skills and data use have always been a personal passion, and I’m excited to meet so many talented people using those skills to fight for a more open world. It is a privilege to be part of an organisation and movement that have set the global standard for genuinely free and open sharing of information. There have been many gains in recent years that have made our society more open, with experts – be they scientists, entrepreneurs or campaigners – using data for the common good. But I join OKI at a time when openness is at risk. The acceptance of basic facts is under threat, with many expert views dismissed and a culture of ‘anti-intellectualism’ from those on the extremes of politics. Facts are simply branded as ‘fake news’. The rise of the far right and the far left brings with it an authoritarian approach that could return us to a closed society. The way forward is to resuscitate the three foundations of tolerance, facts and ideas, to prevent the drift to the extremes. I want to see a fairer and more open society where we help harness the power of open data and unleash its potential for the public good.
We at Open Knowledge International want to see enlightened societies around the world, where everyone has access to key information and the ability to use it to understand and shape their lives; where powerful institutions are held accountable; and where vital research that can help us tackle challenges – such as inequality, poverty and climate change – is available to all. To reach these goals, we need to work to raise the profile of open knowledge and instil it as an important value in the organisations and sectors we work in. In order to achieve this, we will need to change cultures, policies and business models of organisations large and small to make opening up and using information possible and desirable. This means building the capacity to understand, share, find and use data, across civil society and government. We need to create and encourage collaborations across government, business and civil society to use data to rebalance power and tackle major challenges. We need tools – technical, legal and educational – to make working with data easier and more effective. Yet, in many countries, societies are shifting in the other direction making it harder and harder to foster collaboration, discover compromises and make breakthroughs. Freedom House has recorded global declines in political rights and civil liberties for an alarming 13 consecutive years, from 2005 to 2018. Last year, CIVICUS found that nearly six in ten countries are seriously restricting people’s fundamental freedoms of association, peaceful assembly and expression. And, despite some governments releasing more data than before,  our most recent Global Open Data Index found that only 11% of the data published in 2017 was truly open, down from 16% of the data surveyed in 2013. 
Our fear is that these trends towards closed societies will exacerbate inequality in many countries as declining civic rights, the digital divide, ‘dirty data’ and restrictions on the free and open exchange of information combine in new and troubling ways. Opaque technological approaches – informed by both public and, more often, private data – are increasingly being suggested as solutions to some of the world’s toughest issues, from crime prevention to healthcare provision and from managing welfare or food aid projects to policing border security, most recently evidenced in the debate around the Northern Irish border and Brexit. Yet if citizens cannot understand, trust or challenge data-driven decisions taken by governments and private organisations due to a lack of transparency or the challenge of a right of redress to the data held on individuals or businesses, then racist, sexist and xenophobic biases risk being baked into public systems – and the right to privacy will be eroded. We need to act now and ensure that legislation emphasising open values keeps pace with technological advances so that they can be harnessed in ways which protect – rather than erode – citizens’ rights. And we need people in future to be able to have an open and honest exchange of information with details, context and metadata helping to make any potential biases more transparent and rectifiable. As Wafa Ben Hassine, policy counsel for Access Now, said recently, “we need to make sure humans are kept in the loop … [to make sure] that there is oversight and accountability” of any systems using data to make decisions for public bodies. Moving on to another pressing issue, I am very concerned about the EU’s deal on copyright reform – which is due to go before the European Parliament for a vote this month – and the effects that this will have on society.
The agreement will require platforms such as YouTube, Twitter or Google to take down user-generated content that could breach intellectual property and install filters to prevent people from uploading copyrighted material. That means memes, GIFs and music remixes may be taken down because the copyright does not belong to the uploader. It could also restrict the sharing of vital research and facts, allowing ‘fake news’ to spread. This is an attack on openness and will lead to a chilling effect on freedom of speech across the EU. It does not enhance citizens’ rights and could lead to Europe becoming a more closed society – restricting how we share research that could lead to medical breakthroughs or how we share facts. I know that there is a detailed session focused on copyright reform at 12:30pm in this room so please join that if you want to learn more. So what can we do about these issues? First, we are calling on all candidates in May’s European Parliament elections to go to pledge2019.eu to make a public pledge that they will oppose Article 13 of the EU’s chilling copyright reforms. This is an issue that is not going to go away, regardless of the plenary vote this spring. When the new Parliament sits, in July, the MEPs representing voters for the next five years will have an opportunity to take action. Second, in coordination with our colleagues at Mozilla and other organisations, we want tech companies like Facebook to introduce a number of improved transparency measures to safeguard against interference in the coming European elections, and I have written to Facebook’s vice-president of global affairs and my former MEP colleague Sir Nick Clegg to request more openness from the social media platform. Facebook have responded but you can add your voice to Mozilla’s ongoing campaign to keep up the pressure and make sure change happens. 
Third, we encourage you to visit responsibledata.io to join the Responsible Data community which works to respond to the ethical, legal, social and privacy-related challenges that come from using data in new and different ways. This community was first convened by our friends at the Engine Room – who have done great work on this issue – alongside our School of Data who were one of the founding partners. Fourth, get everyone to use established, recognised open licences when releasing data or content. This should be a simple ask for governments and organisations across the world but our research has found that legally cumbersome custom licences strangle innovation and the reuse of data. Fifth, when you are choosing MEP candidates to vote for in May, ask yourself: what have they done to push for openness in our country? Have they signed up to key transparency legislation? Voiced support for access to information and freedom of expression? If you’re not sure, email and ask them. We need a strong cohort of open advocates at the European Parliament to address the coming issues around privacy, transparency and data protection. At Open Knowledge International, we will help fight the good fight by continuing our work to bring together communities around the world to celebrate and prove the value of being open in the face of prevailing winds. Two days ago, with support from OKI, Open Data Day took place, with hundreds of events held all over the world. From open mapping in South America to open science and research in Francophone Africa, grassroots organisations came out in growing numbers to share their belief in the value of open data. Our next big event is the fourth iteration of csv,conf, a community conference for data makers featuring stories about data sharing and data analysis from science, journalism, government, and open source. By popular demand, this year will see the return of the infamous comma llama.
We are also very proud of the fantastic work by the Open Knowledge network teams around the globe to nurture open communities from Open Knowledge Finland’s creation of the MyData conference and movement to the investigations by journalists and developers enabled by Open Knowledge Germany and OpenCorporates’ recent release of data on 5.1 million German companies. And here in Belgium, it’s fantastic to hear about the hundreds of students who participated in Open Knowledge Belgium’s Open Summer of Code last year to create innovative open source projects as well as to be inspired by the team’s work on HackYourFuture Belgium, a coding school for refugees. To finish my speech, I want to echo Claire Melamed of the Global Partnership for Sustainable Development Data: “People’s voices turned into numbers have power … and data has a power to reveal the truth about people’s lives even when words and pictures have failed.” So whether you’re interested in open government, open education or any of the other fascinating topics being explored today, I hope that you connect with people who will help you fight for openness, fight for the truth and fight for the rights of people in this country and beyond.