
Working with UNHCR to better collect, archive and re-use data about some of the world’s most vulnerable people

- January 7, 2022 in ckan, Interviews, News, OKI Projects, Open Knowledge, Open Knowledge Foundation

Since 2018, the team at Open Knowledge Foundation has been working with the Raw Internal Data Library (RIDL) project team at UNHCR to build an internal library of data to support evidence-based decision making by UNHCR and its partners.

What’s this about? 

The United Nations High Commissioner for Refugees (UNHCR) is a global organisation ‘dedicated to saving lives, protecting rights and building a better future for refugees, forcibly displaced communities and stateless people’.

Around the world, at least 82 million people have been forced to flee their homes. Many of these people are refugees and asylum seekers. Over half are internally displaced within the borders of their own country. The vast majority of these people are hosted in developing countries. Learn more here.

UNHCR has a presence in 125 countries, with over 90% of staff based in the field. An important dimension of their work involves collecting and using data – to understand what is happening, to whom and where, and what should be done about it.

In the past, managing this data has been a huge challenge. Data was collected, stored, archived and processed in a decentralised manner, so much of its value was lost: insights went undiscovered and opportunities were missed.

In 2019, the UNHCR released its Data Transformation Strategy 2020 – 2025 – with the vision of UNHCR becoming ‘a trusted leader on data and information related to refugees and other affected populations, thereby enabling actions that protect, include and empower’.

The Raw Internal Data Library (RIDL) supports this strategy by creating a safe, organised place for UNHCR to store its data, with metadata that helps staff find the data they need and enables them to re-use it in multiple types of analysis.

Since 2018, the team at Open Knowledge Foundation has been working with the RIDL team to build this library using CKAN – the open source data management system.

OKF spoke with Mariann Urban at UNHCR Global Data Service about the project to learn more. 

Here is an extract of that interview, which has been edited for length and clarity.

Hi Mariann. Can you start by telling us why data is important for UNHCR?

MU/UNHCR: That’s a great question. Pretty much everyone at UNHCR now recognises that good data is the key to achieving meaningful solutions for displaced people. It’s important for enabling evidence-based decision making and delivering our mandate. It also helps us raise awareness and demonstrate the impact of our work. Data is at the foundation of what UNHCR does. It’s also important for building strong partnerships with governments and other organisations: when we share this data, anonymised where necessary, it allows our partners to design their programmes better. Data is critical to generating better knowledge and insights – secondary uses include indicator baseline analysis, trend analysis, forecasting, modelling and more. Data is really valuable!

What kinds of datasets does UNHCR collect and use?

MU/UNHCR: We have people working in countries all over the world, most of them in the field. Every year UNHCR spends a huge amount of money collecting data. It’s a huge investment. Much of this data collection happens at the field level, organised by our partners in operations. They collect a multitude of operational data each year.

You must have lots of interesting data. Can you give us an example of one important dataset?

MU/UNHCR: One of the most valuable datasets is our registration data. Registering refugees and asylum seekers is the primary responsibility of governments. But if they require help, UNHCR provides support in that area.

How was data collected, archived and used at UNHCR in the past?

MU/UNHCR: Let me give you an example of how it used to be. In the past, let’s imagine, there was a data collection exercise in Cameroon. Our colleagues finished the exercise, and the data stayed with the partner organisation, or sometimes with the actual person collecting the data. It was stored on hard drives, shared drives, email accounts and so on. The next person who wanted to work with the data, or a similar dataset, probably had no access to it to use as a baseline or for trend analysis.

That sounds like a problem.

MU/UNHCR: Yes! This was the problem statement that led to the idea of the Raw Internal Data Library (RIDL). Of course, we already have corporate data archiving solutions. But we realised we needed something more.

Tell us more about RIDL

MU/UNHCR: The main goal of RIDL is to stop data loss. We know that the organisation cannot capitalise on data if it is lost or forgotten, stored in a format that is not interoperable or machine-readable, or missing the minimum set of metadata needed to ensure appropriate further use.

RIDL is built on CKAN. Why is that?

MU/UNHCR: Our team had some experience with CKAN, which is already used in the humanitarian data community. UNHCR has been an active user of OCHA’s Humanitarian Data Exchange (HDX) platform for sharing aggregate data externally, and we collaborate closely with its technical team. After doing market research, we realised that CKAN was also a good solution for an internal library – the data is internal, but it needs to be visible to a lot of people inside the organisation.
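To make this concrete, here is a minimal sketch of how a dataset might be registered in a CKAN-based library using the ckanapi Python client. The URL, organisation and dataset names are hypothetical, and RIDL’s actual metadata schema is richer than stock CKAN’s – this is illustrative, not RIDL’s real workflow.

```python
# Illustrative sketch only: registering a dataset in a CKAN instance
# via the ckanapi client (pip install ckanapi). All names are hypothetical.
from ckanapi import RemoteCKAN

ckan = RemoteCKAN("https://ridl.example.org", apikey="YOUR-API-KEY")

# Create the dataset record with its descriptive metadata.
dataset = ckan.action.package_create(
    name="cameroon-household-survey-2021",   # hypothetical dataset name
    title="Cameroon Household Survey 2021",
    notes="Raw survey data; collection methodology described in metadata.",
    owner_org="unhcr-cameroon",              # hypothetical organisation
    private=True,  # stock CKAN restricts the whole record; RIDL layers its
                   # own rules so metadata stays visible while data is gated
)

# Attach the actual data file as a resource.
ckan.action.resource_create(
    package_id=dataset["id"],
    name="survey-responses.csv",
    format="CSV",
    upload=open("survey-responses.csv", "rb"),
)
```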

What about external partners and the media? Can they access RIDL datasets?

MU/UNHCR: There are some complicated issues around privacy and security. Some of the data we collect is extremely sensitive, and we have to be strong custodians of it to ensure it is used appropriately. Once we analyse the data, we can take the next step and share it externally, of course. Sometimes our data includes personal identifiers, so it must be cleaned and anonymised to ensure that data subjects are not identifiable. Once we have an anonymised dataset, we use our Microdata Library to publish it externally. RIDL is thus the first step in a long chain of sharing our data with partners, governments, researchers and the media.
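As a rough illustration of the cleaning step described above, the sketch below drops direct identifiers and pseudonymises a respondent ID with a salted hash. The column names are hypothetical and this is not UNHCR’s actual pipeline; real anonymisation also has to address indirect identifiers (for example via k-anonymity checks), not just remove obvious columns.

```python
# Illustrative sketch, not UNHCR's actual anonymisation pipeline.
import hashlib

import pandas as pd

# Hypothetical column names for direct identifiers.
DIRECT_IDENTIFIERS = ["name", "phone_number", "gps_coordinates"]

def pseudonymise(value: str, salt: str) -> str:
    """Replace an identifier with a salted, truncated one-way hash."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]

def anonymise(df: pd.DataFrame, salt: str) -> pd.DataFrame:
    # Drop direct identifiers, then pseudonymise the respondent ID so
    # records can still be linked within the published dataset.
    out = df.drop(columns=[c for c in DIRECT_IDENTIFIERS if c in df.columns])
    out["respondent_id"] = out["respondent_id"].astype(str).map(
        lambda v: pseudonymise(v, salt)
    )
    return out
```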

RIDL is a technological solution. But I imagine there is some cultural change required for UNHCR to reach its vision of becoming a data-enabled organisation.

MU/UNHCR: Yes, of course, achieving these aspirations is not just about getting the technology right. We also have to make cultural, procedural and governance changes to become a data-enabled organisation. It’s a huge project, and it needs a culture shift in UNHCR – because even though RIDL is internal, it takes some work to convince people to upload their data. The metadata is always visible to everyone internally, but the actual data itself can be restricted, visible only following a request and evaluation. We want to be a trusted leader, but we also want to use that data to arrive at better solutions for refugees, to enrich our partnerships, and to enable evidence-based decision making – which is what we always aim to do.

Thanks for sharing your insights with us today Mariann. 

MU/UNHCR: No problem. It’s been a pleasure. 

Find out more

Open Knowledge Foundation is working with UNHCR to deliver the Raw Internal Data Library (RIDL). If you work outside of UNHCR, you can access UNHCR’s Microdata Library here. Learn more about CKAN here. 

If your organisation needs a Data Library solution and you want to learn more about our work, email us – we’d love to talk to you!

Applications to be the new CEO of Open Knowledge Foundation now open!

- October 27, 2020 in Featured, Interviews, Jobs, News, Press

We want openness of all forms of knowledge to ensure a fair, free future full of possibility for all, where shared knowledge contributes to happier and healthier lives. To this end, we are looking for a CEO who will help us step up as global leaders.

We are looking for a leader who will spearhead the team to spread the global message of openness and establish new rules to counter the unaccountable tech companies monopolising the digital age. This is a time to be hopeful about the future, and to inspire those who want to build a better society.

We will pursue our mission in the following ways:

  • People – support people and organisations to create a free, fair and open future
  • Places – extend our global reach into new geographies and industries, in particular, health, education and work
  • Policies – have policies and procedures that support our vision and make us fit for purpose
  • Partnerships – work in partnership with others who can help us achieve our vision, and secure funding and income that enable us to be sustainable

We will achieve all of this as the world struggles to recover from the coronavirus pandemic and faces a new global recession and an ongoing climate emergency. There is a crossroads ahead, with a choice between two paths – open or closed. We can be the inspiration for others to follow and ensure society takes the most equitable route. It is an exciting time for our organisation.

Could it be an exciting time for you to join us and take your career to the next level?

For more information and to apply click here.

We will not accept speculative CVs sent to staff or any Board member via email. We can only accept applications from applicants directly via the portal. We expressly do not accept any agency terms and conditions unless contained in a separate agreement signed between us and the agency before the date this application went live on our website.

Podcast: Pavel Richter on the value of open data

- August 25, 2017 in Interviews, Open Knowledge, podcasts

This month Pavel Richter, CEO of Open Knowledge International, was interviewed by Stephen Ladek of Aidpreneur for the 161st episode of his Terms of Reference podcast. Aidpreneur is an online community focused on social enterprise, humanitarian aid and international development that runs this podcast to cover important topics in the social impact sector. Under the title ‘Supporting The Open Data Movement’, Stephen Ladek and Pavel Richter discuss a range of topics surrounding open data, such as what open data means, how open data can improve people’s lives (including the role it can play in aid and development work) and the current state of openness in the world. As Pavel phrases it: “There are limitless ways where open data is part of your life already, or at least should be”.

Pavel Richter joined Open Knowledge International as CEO in April 2015, following five years as Executive Director of Wikimedia Deutschland. He explains how Open Knowledge International has set its focus on bridging the gap between the people who could make the best use of open data (civil society organisations and activists in areas such as human rights, health or the fight against corruption) and the people who have the technical knowledge of how to work with data. OKI can make an impact by bridging this gap, empowering these organisations to use open data to improve people’s lives.

The podcast goes into several examples that demonstrate the value of open data in our everyday life: how OpenStreetMap was used by volunteers following the Nepal earthquake to map which roads were destroyed or still accessible; governments opening up financial data on tax returns or on how foreign aid money is spent; and projects such as OpenTrials opening up clinical trial data, so that people can get information on what kinds of drugs are being tested for effectiveness against viruses such as Ebola or Zika.

In addition, Stephen Ladek and Pavel Richter discuss questions surrounding potential misuse of open data, the role of cultural context in open data, and the current state of open data around the world, as measured in recent initiatives such as the Open Data Barometer and the Global Open Data Index. Listen to the full podcast below, or visit the Aidpreneur website for more information.

How participatory budgeting can transform community engagement – An interview with Amir Campos

- June 2, 2017 in Interviews, OpenBudgets, OpenSpending

For most municipalities, participatory budgeting is a relatively new approach to including their citizens directly in decision making about new investments and developments in their community. Fundación Civio is a civic tech organisation based in Madrid, Spain that develops tools for citizens that both reveal the civic value of data and promote transparency. The organisation has developed an online platform for participatory budgeting processes, covering both voting and the monitoring of incoming proposals, that is currently being tested in three Spanish municipalities. Diana Krebs (Project Manager for Fiscal Projects at OKI) talked with Amir Campos, project officer at Fundación Civio, about how tech solutions can help make participatory budgeting a sustainable process in communities, and what is needed beyond the technology.

Amir Campos, Project officer at Fundación Civio

Participatory budgeting (PB) is a relatively new way for municipalities to engage with their citizens. You developed an online platform to make the participatory process easier. How can this help turn PB into an integral part of community life?

Participatory budgets are born from the desire to democratise power at a local level, to “municipalise the State”, with a clear objective: that these actions at the local level serve as an example at regional and national levels and foster change in State participation and investment policies. This drive to democratise power also represents a struggle for a better distribution of wealth, giving voice to the citizens, taking them out of political anonymity every year, and making local investment needs visible much faster than any traditional electoral process. Participatory budgeting is a tough marking of local representatives by their citizens. The tool we have designed is powerful but easy to use, because we have avoided building a tool that only technical people would use. Users are able to upload their own data (submitting or voting on proposals, comments, feedback, etc.) in order to generate discussions, voting processes, announcements, visualisations and more. It has a more visual approach, which clearly differentiates our solution from existing ones and gives it further value. Our tool is targeted at administrators, users and policy makers without advanced technical skills, and it is online, offered as Software as a Service (SaaS), avoiding the need for users to download or install any special software. All in all, our tool will bring the experience of taking part in a participatory budgeting process closer to all citizens. Once registered, its user-friendliness and visual features will keep users connected, not only to vote on proposals but also to monitor and share them, while exercising effective decision-making and redistributing available resources in their municipality. Along with offline participatory processes, this platform gives citizens a voice and a vote, and also gives them the possibility of holding their public representatives accountable through its monitoring capabilities. The final aim is to enable real participatory experiences, providing solutions that are easy for all stakeholders involved to implement, thus strengthening the democratic process.

Do you think that participatory budgeting is a concept that will be more successful in small communities, where daily business is ruled less by political parties’ interests and more by consensus about what the community needs (like new playgrounds or sports parks)? Or can it work in bigger communities such as Madrid as well?

Of course! The smaller the community, the better the decision-making process, not only at the PB level but at all levels. Wherever there is a “feeling” of community it is much easier to reach agreements oriented towards the common good. That is why in large cities there is always more than one PB process running at the same time: one at the neighbourhood level and another at the municipal level (the whole city), to engage people in their neighbourhoods and push them to vote at the city level. Examples such as Paris or Madrid, which use online and offline platforms, follow that division; small town halls such as Torrelodones, by contrast, open just a single process for the whole municipality. All processes need committed municipal representatives and engaged citizens, connected to a culture of participation, to harvest successful outcomes.

Do you see a chance that PB might increase fiscal data literacy if communities are more involved in deciding what the community should spend tax money on?

Well, I am not sure about an improvement in fiscal data literacy, but I am absolutely convinced that citizens will better understand the budget cycle, its concepts and the overall approval process. Currently, in most cases, budget preparation and approval is a closed-door process within administrations. Municipal PB implementations will act as enabling processes for citizens to influence budget decisions, becoming actual stakeholders in decision-making, auditing committed versus actual spending, and giving feedback to the administrations. Furthermore, projects implemented thanks to PB will last longer, since citizens take on a commitment to the implemented project, to their representatives and to the peers with whom they reached agreement, and will readily renew that agreement. The educational resources available to citizens on the platform will also help improve literacy: they provide online materials to better understand the budget period, the terms used, and how to influence and monitor the budget.

What non-tech measures and commitments does a municipal council or parliament need to make so that participatory budgeting becomes a long-term, integral part of citizens’ engagement?

They will have to agree as a government. One of the key steps to maintaining a participatory budgeting initiative over time is to legislate for it so that, regardless of the party that governs the municipality, the participatory budgeting processes keep running and long-lasting prevalence is achieved. Porto Alegre (Brazil) is a very good example of this; they have been redistributing their resources at the municipal level for the last 25 years.

Fundación Civio is part of an EU H2020 project, in which it collaborates with eight other partners on topics of fiscal transparency.

Storytelling with Infogram

- October 21, 2014 in Community Session, Events, infogram, Interviews, skillshare

As we well know, data is only data until you use it for storytelling and insights. Some people are super talented and can use D3 or other amazing visual tools – just see this great list of resources on Visualising Advocacy. In this one-hour Community Session, Nika Aleksejeva of Infogram shares some easy ways to get started with simple data visualisations. Her talk also includes tips for telling a great story and some thoughtful comments on when to use various data viz techniques.
We’d love you to join us and do a skillshare on tools and techniques. Really, we are tool agnostic and simply want to share with the community. Please do get in touch to learn more about Community Sessions.

Nicolas Kayser-Bril – on doing journalism in the digital era

- September 2, 2014 in Data Journalism, Interviews, tools

Journalism++ is an agency for data-driven storytelling. Started by three people, it is now a network of independent for-profit companies working from Berlin, Paris, Stockholm, Cologne, Amsterdam and Porto. They define journalism as ‘making interesting what is important’, not ‘making important what is interesting’. Winner of the Data Journalism Awards 2014, the agency is famous for building data-driven web apps, creating visualisations and, last but not least, investigative journalism projects. How cool is that? We discussed it with the CEO and co-founder of J++, Nicolas Kayser-Bril. A self-taught programmer and journalist, Nicolas holds a degree in Media Economics. He also invests in data journalism as an instructor at massive open online courses taught in English and French.

Nicolas, how did you decide that it was time to open a data journalism agency?

Before, together with Pierre (Pierre Romera, Chief Technology Officer and Developer at J++), we worked at a news startup called OWNI in Paris. (OWNI was a legendary French newsroom launched in April 2009, where Nicolas was head of the data journalism team. It focused on technology, politics and culture and ran on a non-profit economic model. Twice a winner of the Online Journalism Awards, OWNI is probably most famous for its cooperation with Wikileaks. Due to financial problems, OWNI was closed down on 21 December 2012.) As things at OWNI got worse, we left in 2011 but wanted to keep working together. He is a developer, I am a journalist, so we approached different newsrooms in Paris and London that might be willing to hire us as a team. Some newsrooms told us: OK, you can come and work for us, but the developer is going to work with developers, and the journalist is going to work with journalists – which we refused. So that’s why we created the company. It was more a plan B; we just wanted to keep doing data journalism together.

Why did you open up in Berlin and how did the name come up?

The name was a nerdy joke: when you code and you add ‘++’, it means that the variable is now equal to its previous value plus one. So ‘Journalism++’ basically means ‘journalism equals journalism plus one’ – something more than just journalism. As for the city, again, it was not planned, it just happened. At that time I was living in Berlin, and Pierre was living in London. Since both Pierre and I are French, we first created a company in Paris for legal reasons, and we actually planned on going back to Paris. But at some point our Head Project Manager Anne-Lise Bouyer decided to move to Berlin, and we created this way of working between two cities. Now it’s seven people here.

But aside from the seven people in the core team, you do have branches. How did you develop this, and what’s the rule to be accepted into the club?

This franchise programme developed, again, kind of randomly. We knew Jens Finnas and Peter Grensund from Sweden; they were very good and we heard that they were going to open an agency. So we just told them that it would be cool to have the same name, and to create this franchise concept. Since it worked really well with J++ Stockholm, we expanded to the other cities. The idea is to bring together the best data-driven journalists in every market, so we are looking mostly for really good developers, because we believe that DDJ is really a technology thing. And then you need to come up with a concept, to show that you are not just doing things but that you have a plan to create a company.
We do not want to make our brand kind of a label that we give or don’t give to people. This is more about creating companies, because that makes you much stronger in your journalism investigations.

What’s your favourite of your own projects so far?

Right now we are pivoting towards an open-source tool developed by Journalism++. It lets users upload, store and mine the data they have on a precise topic. It is therefore a useful platform to host an investigation, whether done by journalists, lawyers or business intelligence. To start working, you need to download the source code and install it on your own server. Alternatively, you can do everything online, which is much simpler and exactly as safe. We run some of our investigative projects to advertise this tool, such as The Migrant Files or the Belarus Networks (a database of connections within the Belarusian elite, to be published soon). Pushing into new markets, we invest in investigative journalism as a field. We are also going to provide customisation services around this. Before that, there was something really cool that we did for Arte at the beginning of this year. The special thing was that Arte came to us and said: we want to do something about employment and the work situation of young people in Europe. So we did the project from concept to development. It was called World of Work.

Which is just more evidence of how many shapes journalism can take nowadays, since this project was a questionnaire?

Exactly. In this case we wanted the young people who took the questionnaire to ask themselves questions about work that they might not have asked otherwise. It was 60 questions on various topics, and we pretended that it was a survey, but actually the idea was to make users think in new ways about their work situation. For instance, we never talked about unemployment or employment, because we believe these categories don’t work for the younger generation. The reason I am very happy about this is that people who took the questionnaire told us there were questions they would never have asked themselves. And this really made them think, which was precisely the goal, even if they did not realise it was the goal that we had.

Did you get a psychologist on the team for this project?

We hired a consultant from advertising to create the atmosphere. Generally, it was a huge amount of research and a multi-skilled team including a developer and a designer. But the ‘other’ skill we had to bring to the project was from advertising, because it’s surprisingly hard to write for the 22-30 age group and to find questions that were interesting for the user and relevant for the project.

What are currently your favourite tools?

Apart from that, when it comes to simple visualisations I use Datawrapper (another project of J++, run by J++ Cologne) or Chartbuilder (a project by Quartz). There are so many tools, and it really depends from project to project. For example, right now we are doing social network visualisation, so we use Gephi a lot – open-source social network analysis software.

What would you advise people who want to promote DDJ in Russia?
You have lots of agencies doing cool stuff there, and I think it’s important to have good developers who understand something about journalism. That’s how you get good ideas, because otherwise you would just mimic what’s happening in other countries, and that’s not the point. The point is to leverage technology, to bring something new to the field. And in this case either you learn how to code or you find a developer.

That said, data-driven journalism sometimes looks like an inner thing, a nerdy direction of journalism. Do you think it can become mainstream?

What you say is especially true with what was created in the US this year, like The Upshot, Vox and 538, and I agree, it might shift the definition of DDJ to the nerdy field. But if you consider data journalism as journalism that could not be done without a computer – which is the definition I favour – then you can do anything. Like in the ‘World of Work’ project I was referring to: what’s special about it is that it was done by journalists working with developers. But as a user, you don’t realise it. And that’s what we should aim at.

Is data journalism something we have known before, under different names like computer-assisted reporting, or is there something particularly new about data journalism as we know it today?

It’s true that using computers for journalism is nothing new, and the same goes for visualisation of data. But what’s new, especially in Europe, is that people in newsrooms have realised that they need some maths to do their work. Before, if you wanted to do a piece of computer-assisted reporting, you needed a statistician, you had to go out and rent a computer, and computer time was extremely expensive. Now you can do the same kind of analysis in a few hours for zero euros. And then anyone can do it and publish. There is also another aspect: the term ‘online journalism’ has been hijacked by people doing copy-paste journalism. That’s also one of the main reasons why data journalism is fashionable now: it means doing journalism online in a different way than it was done for the past ten years.

The profession of journalism as a whole is shifting, with so much of journalism being done by citizen journalists or bloggers, while at the same time the ‘real’ journalists have to acquire skills they were never asked for before. As a self-taught journalist, what is your take on this?

That’s a very interesting topic and we could talk about it for a few hours. Before the digitalisation of content, the definition of journalism – of who a journalist was – was really a definition by the means. The journalist was the person who had access to the means of publication or broadcast. That’s why the anchor on Russian TV is a journalist even if she’s just repeating what the government wants her to say, and the investigative journalist at the New York Times is also a journalist: they both have access to the means of communication to the public. That concept doesn’t exist anymore. Now everybody can publish. Look at the Bin Laden assassination: there was this guy in Pakistan with no contact to any media outlet whatsoever who just published a couple of tweets, and for a few minutes he was the leading news source on the topic.
Take another example, the Aurora shooting in Colorado, when a guy dressed as Superman came into a movie theatre and shot people. For the first 10-12 hours the main source of information was a teenager in his room close to the shooting. It was at night, so it took about ten hours for the TV stations to get on location, and during this time he was the one checking the information and simply doing journalism. When he was asked why he did it, he just said: I thought it was needed. Then there was a visualisation of all the drone strikes. It was just a nice visualisation – there was no breaking news; it was more of a cold journalism thing. But it became extremely successful, and again, the reason behind it, as the author said, was: this story has not been told, and I thought it needed to be told.

What all these things have in common is adding value to information in the public interest. And I think this is what constitutes journalism today. If you are working full time in a newsroom writing articles or presenting the news on TV, I would call that being an information professional, not necessarily a journalist. That’s why I define journalism not by the occupation of the people who do it, but by the goal: telling information in the public interest. And anybody can do the act of journalism.

Team photo: © Marion Kotlarski/Journalism++


- July 9, 2014 in Events, Ideas and musings, Interviews, network, OKFest, OKFestival, Open Knowledge Foundation

Everyone is a storyteller! We are just one week away from the big Open Brain Party that is OKFestival, and we need all the storytelling help you can muster. Trust us: from photos to videos to art to blogs to tweets – share away. The Storytelling team is a community-driven project. We will work with all participants to decide which tasks are possible and which stories they want to cover. We remix together. We’ve written up this summary of how to storytell, some story ideas and suggested formats. There are a few ways to join:
  • At the event: We will host an in-person planning meetup at the Science Fair on Tuesday, July 15th. Watch #okstory for details. Look for the folks with blue ribbons.
  • Digital Participants: Join in and add all your content with the #okfest14 @heatherleson #OKStory tags.
  • Share: Use the #okstory hashtag. Drop a line to heather.leson AT okfn dot org to get connected.
We highlighted some ways to storytell in this brief 20-minute chat:

Community Session: Open Data Hong Kong

- May 7, 2014 in Community Sessions, Events, Interviews, OKF Hong Kong, Open Knowledge Foundation Local Groups

Open Data Hong Kong is an open, participative, and volunteer-run group of Hong Kong citizens who support Open Data. Join Mart van de Ven, Open Knowledge Ambassador for Hong Kong, and Bastien Douglas of ODHK for a discussion about their work.

How to Participate

This Community Session will be hosted via G+. We will record it.
  • Date: Wednesday, May 14, 2014
  • Time: Wednesday 21:00 – 22:00 EDT / Thursday 09:00 – 10:00 HKT / Thursday 01:00 – 02:00 UTC
  • Use a time zone converter to check your local time.
  • Duration: 1 hour
  • Register for the event here

About our Community Session Guests

Mart van de Ven co-founded Open Data Hong Kong to inspire and nurture a techno-social revolution in Hong Kong. He believes Open Data is a chance for citizens to be better served by government – not only because it enables greater transparency and accountability, but because when governments open up their data it allows them to concentrate on their irreducible core: enabling us as citizens. He is also Open Knowledge’s ambassador to Hong Kong, a data-driven developer and a technology educator for General Assembly.
Bastien Douglas
Bastien’s role with ODHK is to create a structure for the community to develop sustainability, form partnerships with other organisations and operationalise projects to achieve the goals of the organisation. Bastien’s background combines public sector experience, research analysis and citizen engagement. For over four years as a public servant in the federal government of Canada in Ottawa, he analysed policy at the front lines of policy development and researched public management issues at the centre of the bureaucracy. In 2009, Bastien formed a community of innovative public servants who worked across silos, using collaborative tools and social media to push Open Data projects forward, raising the capacity to share knowledge and better support the public. Bastien then worked in the NGO sector building knowledge capacity for the immigrant-serving sector, while supporting advocacy for improved services, information-sharing, access to resources and the sharing of service delivery practices. Bastien Douglas on Twitter
More Details
See the full Community Session Schedule

Vice Italy interview with the editor of the Public Domain Review

- January 28, 2013 in Interviews, Public Domain, public domain review

The editor of The Public Domain Review, Adam Green, recently gave a feature-length interview to Vice magazine Italy. You can find the original in Italian here, and an English version below!

While there is a wealth of copyright-free material available online, The Public Domain Review is carving out a niche as a strongly curated website with a strong editorial line. How did the PDR begin?

Myself and The Public Domain Review’s other co-founder, Jonathan Gray, have long been into digging around in these huge online archives of digitised material – places like the Internet Archive and Wikimedia Commons – mostly to find things with which to make collages. We started a little blog called Pingere to share some of the more unusual and compelling things that we stumbled across. Jonathan suggested that we turn this into a bigger project aiming to celebrate and showcase the wonderfulness of the public domain material that was out there. We took the idea to the Open Knowledge Foundation, a non-profit which promotes open access to knowledge in a variety of fields, and they helped us to secure some initial seed funding for the project. And so the Public Domain Review was born.

What was the first article you posted?

We initially focused on things which were just coming into the public domain that year. In many countries works enter the public domain 70 years after the death of the author or artist – although there are lots of weird rules and exceptions (often unnecessarily complicated!). Anyway, 2011 saw the works of Nathanael West enter the public domain, including his most famous book The Day of the Locust. The first article was about that, and West’s relationship with Hollywood, written by Marion Meade, who’d recently published a book on the subject.

What criteria do you use to choose stuff for the Review?

As the name suggests, all our content is in the ‘public domain’, so that is the first criterion. We try to focus on works that are in the public domain in most countries, which isn’t as easy as it sounds, as every country has different rules. Generally it means stuff created by people who passed away before the early 1940s. The second criterion is that there are no restrictions on the reuse of the digital copies of the public domain material.

What kind of restrictions?

Well, some countries say that in order to qualify for copyright, digital reproductions have to demonstrate some minimal degree of originality, and others say that there just needs to be demonstrable investment in the digitisation (the so-called “sweat of the brow” doctrine). Many big players in the world of digitisation – like Google, Microsoft, the Bridgeman Art Library, and national institutions – argue that they own rights in their digital reproductions of works that have entered the public domain, perhaps so they can sell or restrict access to them later down the line. We showcase material from institutions who have already decided to openly license their digitisations. We are also working behind the scenes to encourage more institutions to do the same and to see free and open access to their holdings as part of their public mission.

But you have a strong aesthetic line as well, don’t you?

Yes, of course, the material has to be interesting! We tend to go for stuff which is less well known, so rather than put up all the works of Charles Dickens (as great as they are) we’ll go instead for something toward the more unorthodox end of the cultural spectrum, e.g.
a personal oracle book belonging to Napoleon, or a 19th-century attempt to mathematically model human consciousness through geometric forms. I guess it’s a sort of alternative history to the mainstream narrative, an attempt to showcase just some of the excellence and strangeness of human ideas and activity which exists ‘in between’ the bigger events and works around which the narrative of history is normally woven.

Is there anything you wouldn’t publish?

I guess there is some material which is perhaps a little too controversial for the virtuous pages of the PDR – such as the racier work of Thomas Rowlandson or some of the less family-friendly works of the 16th-century Italian printmaker Agostino Carracci. Our most risqué thing to date is probably a collection from Eadweard Muybridge’s ‘Animal Locomotion’ portfolio, which included a spot of naked tennis.

It seems that authors are becoming less and less important, publishers are facing extinction, and yet the potential for users of content is ever-expanding. What do you think about the future of publishing?

It is certainly true that things are radically changing in the publishing world. Before the advent of digital technologies, publishers were essentially gatekeepers of what words were seen in the public sphere. You saw words in books and newspapers and – for many people – that was pretty much it. What you saw was the result of decisions made by a handful of people. But now this has changed. People don’t need publishing contracts to get their words seen. Words, pictures and audiovisual material can be shared and spread at virtually no cost with just a few clicks. But people still do want to read words in books. And they turn to publishers – through bookshops, the media, etc. – to find new things to read. While there is DIY print-on-demand publishing, it is hard to compete with the PR and promotion of professional publishers. I don’t think publishers will become extinct. No doubt they will adapt to new markets in search of profits.

Is the internet causing works to become more detached from their authors? Is there a way in which this could be a good thing?

With the rise of digital technologies it is, no doubt, much easier for this detachment to happen. Words leave the confines of books and articles, get copied and pasted into blogs, websites and social media, are shared through illegal downloads, etc., perhaps losing proper attribution along the way. But in a way none of this is new. It is just a more accelerated version of what has happened for hundreds of years. If anything, it is probably better for authors now than it was in the past, as the internet also enables people to check where things come from, their pedigree and provenance. In the 17th century, before there was a proper copyright law, it was common for whole books to be “stolen”, given a new title and cover, and sold under a new author’s name. Could this be a good thing? Well, one could argue that reuse and reworking are an essential part of the creative process. We can find brilliant examples of literary pastiche and collaging techniques in the works of writers like W.G. Sebald, where you are not sure whether he’s speaking with his own words or those of another writer (whose work he is discussing). In Sebald’s case it gives the whole piece a fluency and unity, a sense that it’s one voice, of humanity or history, speaking. But of course Sebald’s work is protected by copyright held by his publishers or his literary estate.
One wonders whether one could use his works in the same way and get away with it.

So is copyright a big negative?

No, not at all – from the perspective of artists and writers copyrighting their work, in general it makes complete sense to me. This is not just about money but also about artistic control over how a work is delivered. Looking back to the past before copyright, it wasn’t just about royalties but also about reputation, about preventing or discouraging mischievous or sloppy reuse. While copyright is far from perfect – and often pretty flawed – it still offers creators a basic level of protection for the things that they have created. As an author or artist, if you want something more flexible than your standard copyright licence then you can combine it with things like Creative Commons licences to say how you want others to be able to use your works. The question of how long (or whether!) works should be copyrighted after the death of their creators is an entirely different question. I think copyright laws and international agreements are currently massively skewed in favour of big publishers and record companies (often supported by well-heeled lobbyist groups purporting to serve the neglected interests of famous authors and ageing rock stars), and do not take sufficient account of the public domain as a positive social good: a cultural commons, free for everyone.

Have you ever had problems with a copyright claim from an author?

Well, almost all of the public domain material we feature is by people who are long dead, so we haven’t (thank god!) had any direct complaints from them. We did get one takedown notice, on Gurdjieff’s harmonium recordings. The law can get very complex, particularly around films and sound recordings. I am not sure they were right, but we took it down all the same.

What are your plans for the future?

As well as expanding the site with exciting new features, we are also planning to break out from the internet into the real world of objects! We’re planning to produce some beautiful printed volumes with collections of images and texts curated around certain themes. We’ve wanted to do this for a while, and hopefully we’ll have time (and funds!) to finally do it next year.

You can sign up to The Public Domain Review’s wonderful newsletter here

“Carbon dioxide data is not on the world’s dashboard” says Hans Rosling

- January 21, 2013 in Featured, Interviews, OKFest, Open Data, Open Government Data, Open/Closed, WG Sustainability, Working Groups

Professor Hans Rosling, co-founder and chairman of the Gapminder Foundation and Advisory Board Member at the Open Knowledge Foundation, received a standing ovation for his keynote at OKFestival in Helsinki in September in which he urged open data advocates to demand CO2 data from governments around the world. Following on from this, the Open Knowledge Foundation’s Jonathan Gray interviewed Professor Rosling about CO2 data and his ideas about how better data-driven advocacy and reportage might help to mobilise citizens and pressure governments to act to avert catastrophic changes in the world’s climate.
Hello Professor Rosling!

Hi.

Thank you for taking the time to talk to us. Is it okay if we jump straight into it?

Yes! I’m just going to get myself a banana and some ginger cake.

Good idea.

Just so you know: if I sound strange, it’s because I’ve got this ginger cake.

A very sensible idea. So in your talk in Helsinki you said you’d like to see more CO2 data opened up. Can you say a bit more about this?

In order to get access to public statistics, first the microdata must be collected, then it must be compiled into useful indicators, and then these indicators must be published. The amount of coal one factory burnt during one year is microdata. The emission of carbon dioxide per year per person in one country is an indicator. Microdata and indicators are very, very different numbers. CO2 emissions data is often compiled with great delays. The collection is based on already-existing microdata from several sources, which civil servants compile and convert into carbon dioxide emissions. Let’s compare this with calculating GDP per capita, which also requires an amazing amount of microdata collection, which has to be compiled and converted and so on. That is done every quarter for each country. And it is swiftly published. It guides economic policy. It is like a speedometer: when you drive your car you have to check your speed all the time, and the speed is shown on the dashboard. Carbon dioxide is not on the dashboard at all. It’s something you get with several years’ delay, when you are back from the trip. It seems that governments don’t want to get it swiftly. And when they finally publish it, they publish it as total emissions per country. They don’t want to show emissions per person, because then the rich countries stand out as worse polluters than China and India. So it is not just an issue of open data. We must push for change in the whole way in which emissions data is handled and compiled.

You also said that you’d like to see more data-driven advocacy and reportage. Can you tell us what kind of thing you are thinking of?

Basically everyone admits that the basic vision of the green movement is correct. Everyone agrees on that. By continuing to exploit natural resources for short-term benefits you will cause a lot of harm. You have to understand the long-term impact. Businesses have to be regulated. Everyone agrees. Now, how much should we regulate? Which risks are worse, climate or nuclear? How should we judge the bad effects of having nuclear electricity? The bad effects of coal production? These are difficult political judgments. I don’t want to interfere with these political judgments, but people should know the orders of magnitude involved, the changes, and what is needed to avoid certain consequences. But that data is not even compiled fast enough, and the activists do not protest – because it seems they do not need data? Let’s take one example. In Sweden we have data from the energy authority. They say: “energy produced from nuclear”. Then they include two outputs. One is the electricity that goes out into the lines and lights the house that I’m sitting in. The other is the warm waste water that goes back into the sea. That is also energy, they say. It is actually like a fraud to pretend that that is energy production. Nobody gets any benefit from it. On the contrary, it changes the ecology of the sea. But they get away with it because the designation is energy produced.
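To make the microdata-to-indicator compile step that Rosling describes above concrete, here is a toy sketch. Every number in it, including the generic coal-to-CO2 conversion factor, is an illustrative assumption; real inventories draw on many more sources and use fuel-specific emission factors.

```python
# A toy illustration of the compile step: factory-level microdata
# (tonnes of coal burnt) is converted and aggregated into a national
# indicator (tonnes of CO2 per person per year). All numbers are assumptions.

COAL_TO_CO2 = 2.4  # tonnes CO2 per tonne of coal: a rough generic factor

# Microdata: coal burnt by individual plants in one year (hypothetical).
coal_burnt_tonnes = {"plant_a": 120_000, "plant_b": 75_000, "plant_c": 310_000}

population = 9_500_000  # hypothetical national population

# Compile: convert and aggregate microdata, then normalise per person.
national_co2_tonnes = sum(coal_burnt_tonnes.values()) * COAL_TO_CO2
indicator = national_co2_tonnes / population
print(f"{indicator:.2f} t CO2 per person per year (coal from 3 plants only)")
```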
We need to be able to see the energy supply for human activity from each source and how it changes over time. The people who are now involved in producing solar and wind produce very nice reports on how production increases each year. Many get the impression that we have 10, 20, 30% of our energy from solar and wind. But even with fast growth from almost zero, solar and wind are almost nothing yet. The news reports mostly neglect to explain the difference between the percentage growth of solar and wind energy and their percentage of total energy supply. People who are too much into data and into handling data may not understand how the main misconceptions come about. Most people are very surprised when I show them total energy production in the world on one graph. They can’t yet see solar because it hasn’t reached one pixel yet.

So this isn’t of course just about having more data, but about having more data-literate discussion and debate – ultimately about improving public understanding?

It’s like that basic rule in nutrition: food that is not eaten has no nutritional value. Data which is not understood has no value. It is interesting that you use the term data literacy. Actually I think it is presentation skills we are talking about. Because if you don’t adapt your way of presenting to the way that people understand it, then you won’t get it through. You must prepare the food in a way that makes people want to eat it. The dream that you will train the entire population to about one semester of statistics in university: that’s wrong. Statisticians often think that they will teach the public to understand data the way they do, but instead they should turn data into Donald Duck animations and make the story interesting. Otherwise you will never ever make it. Remember, you are fighting with Britney Spears and tabloid newspapers. My biggest success in life was December 2010 in the YouTube entertainment category in the United Kingdom. I had the most views that month. And I beat Lady Gaga with statistics. Amazing. Just the fact that the guy at the BBC in charge of uploading the trailer put me under ‘entertainment’ was a success. No one thought of putting a trailer for a statistics documentary under entertainment. That’s what we do at Gapminder. We try to present data in a way that makes people want to consume it. It’s a bit like being a chef in a restaurant. I don’t grow the crop. The statisticians are like the farmers that produce the food. Open data provides free access to potatoes, tomatoes and eggs and whatever it is. We are preparing it and making delicious food. If you really want people to read it, you have to make data as easy to consume as fish and chips. Do not expect people to become statistically literate! Turn data into understandable animations. My impression is that some of the best applications of open data come when we get access to data in a specific area which is highly organised. One of my favourite applications in Sweden is a train timetable app. I can check all the commuter train departures from Stockholm to Uppsala, including the last change of platform and whether there is a delay. I can choose how to transfer quickly from the underground to the train to get home fastest. The government owns the rails and every train reports its arrival and departure continuously. This data is publicly available as open data. Then a designer made an app and made the data very easy for me to understand and use.
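Rosling’s point above about percentage growth versus share of total supply is easy to show with toy numbers: fast growth from a tiny base is still a tiny share. The figures below are hypothetical, not real energy statistics.

```python
# Illustrative figures only: growth rate vs share of total.
solar_and_wind = 1.0   # energy supplied this year, in arbitrary units
total_energy = 600.0   # total world supply, same units (hypothetical)

growth_rate = 0.30     # 30% year-on-year growth sounds enormous...
next_year = solar_and_wind * (1 + growth_rate)

share_now = solar_and_wind / total_energy   # ~0.17% of total supply
share_next = next_year / total_energy       # ~0.22%: still "not one pixel"
print(f"{share_now:.2%} -> {share_next:.2%}")
```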
But to create an app which shows the determinants of unemployment in the different counties of Sweden? No one can do that, because that is a great analytical research task. You have to take data from very many different sources and make predictions. I saw a presentation about this yesterday at the Institute for Future Studies. The PowerPoint graphics were ugly, but the analysis was beautiful. In this case the researchers need a designer to make their findings understandable to the broad public, and together they could build an app that would predict unemployment month by month.

The CDIAC publish CO2 data for the atmosphere and the ocean, and they publish national and global emissions data. The UNFCCC publish national greenhouse gas inventories. What are the key datasets that you’d like to get hold of that are currently hard to get, and who currently holds these?

I have no coherent CO2 dataset for the world beyond 2008 at present. I want to have this data up to last year, at least. I would also welcome half-year data, but I understand this can be difficult because carbon dioxide emissions vary with transport and the heating or cooling of houses over the seasons of the year. So just give me the past year’s data in March. And in April/May for all countries in the world. Then we can hold governments accountable for what happens year by year. Let me tell you a bit about what happens in Sweden. The National Natural Protection Agency gets the data from the Energy Department and from other public sources. Then they give these datasets to consultants at the University of Agriculture and the Meteorological Authority. The consultants work on these datasets for half a year. They compile them, the administrators look through them, and they publish them in mid-December, when Swedes start to get obsessed about Christmas. So that means there was a delay of eleven and a half months. So I started to criticise that. My cutting line came when I was with the Minister of Environment and she was going to Durban. I said: “But you are going to Durban with eleven and a half months’ constipation. What if all of this shit comes out on stage? That would be embarrassing, wouldn’t it?” Because I knew that in 2010 she had an increase in carbon dioxide emissions, and it increased by 10%. But she only published that after coming back from Durban. So that became a political issue on TV. And then the government promised to publish earlier. So in 2012 we got CO2 data by mid-October, and in 2013 we’re going to get it in April. Fantastic. But actually ridiculing is the only way that worked. That’s how we liberated the World Bank’s data. I ridiculed the President of the World Bank at an international meeting. People were laughing. That became too much.

The governments in the rich countries don’t want the world to see emissions per capita. They want to publish emissions per country. This is very convenient for Germany and the UK, not to mention Denmark and Norway. Then they can say the big emission countries are China and India. It is so stupid to look at total emissions per country. This allows small countries to emit as much as they want, because they are just not big enough to matter. Norway hasn’t reduced its emissions for the last forty years. Instead they spend their aid money to help Brazil replant rainforest. At the same time Brazil lends 200 times more money to the United States of America to help them consume more and emit more carbon dioxide into the atmosphere. Just putting these numbers up makes a very strong case.
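A quick sketch of why totals mislead, with invented numbers: a populous country can top the total emissions table while emitting far less per person than a small rich country.

```python
# Hypothetical figures, for illustration only: total vs per-capita emissions.
populations = {"Country A": 1_400_000_000, "Country B": 5_000_000}
total_emissions_mt = {"Country A": 10_000.0, "Country B": 50.0}  # Mt CO2

for country, total in total_emissions_mt.items():
    # Convert megatonnes to tonnes, then divide by population.
    per_capita_t = total * 1_000_000 / populations[country]
    print(f"{country}: {total:,.0f} Mt total, {per_capita_t:.1f} t per person")

# Country A tops the total (10,000 Mt vs 50 Mt), yet Country B emits more
# per person (10.0 t vs 7.1 t): the per-country table hides this entirely.
```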
But I need to have timely carbon dioxide emission data. Not even climate activists ask for this. Perhaps it is because they are not really governing countries. The right-wing politicians need data on economic growth, the left wing need data on unemployment, but the greens don’t yet seem to need data in the same way.

As well as issues getting hold of data at a national level, are there international agencies that hold data that you can’t get hold of?

It is like a reflection. If you can’t get data from the countries for eleven and a half months, why the heck should the UN or the World Bank compile it faster? Think of your household. There are things you do daily, that you need swiftly – breakfast for your kids. Then there are things like repainting the house: I didn’t do it last year, so why should I do it this year? The whole system just becomes slow. If politicians are not in a hurry to get data for their own country, they are not in a hurry to compare their data to other countries. They just do not want this data to be seen during their election period.

So really what you’re recommending is stronger political pressure through ridicule on different national agencies?

Yes. Or sit outside and protest. Do a Greenpeace action on them.

Can you think of datasets about carbon dioxide emissions which aren’t currently being collected, but which you think should be collected?

Yes. In a very cunning way China, South Africa and Russia like to be placed in the developing world, and they don’t publish CO2 data very rapidly, because they know it will be turned against them in international negotiations. They are not in a hurry. The Kyoto Protocol at least made it compulsory for the richest countries to report their data, because they had committed to decrease emissions. But every country should do this. Everyone should be able to know how much coal each country consumed, how much oil it consumed, and so on, and from that data have a calculation made of how much CO2 each country emitted last year. It is strange that the best country at doing this – and it is painful for a Swede to accept – is the United States: CDIAC. Federal agencies in the US are very good on data, and they take on the whole world. CDIAC make estimates for the rest of the world. Another US agency I really like is the National Snow and Ice Data Centre in Denver, Colorado. They give us 24-hour updates on the polar sea ice area. That’s really useful. They are also highly professional. In the US the data producers are far away from political manipulation. When you look at the use of fossil fuels in the world there is only one distinct dip. That dip could be attributed to the best environmental politician ever. The dip in CO2 emissions took place in 2008. George W. Bush, Greenspan and Lehman Brothers decreased CO2 emissions by inducing a financial crisis. It was the most significant reduction in the use of fossil fuels in modern history. I say this to put things into proportion. So far only financial downturns have had such an effect on the emission of greenhouse gases. The whole of environmental policy hasn’t yet had any such dramatic effect. I checked this with Al Gore personally. I asked him: “Can I make this joke? That Bush was better for the climate than you were?” “Do that!”, he said, “You’re correct.” Once we show this data, people can see that the economic downturn has so far had the most forceful effect on CO2 emissions.

If you could have all of the CO2 and climate data in the world, what would you do with it?
We’re going to make teaching materials for high schools and colleges. We will cover the main aspects of global change so that we produce a coherent data-driven worldview, which starts with population and then covers money, energy, living standards, food, education, health, security and a few other major aspects of human life. And for each dimension we will pick a few indicators. Instead of doing Gapminder World with the bubbles that can display hundreds of indicators, we plan a few small apps where you get a selected few indicators but can drill down: start with the world, then world regions, countries, the subnational level; sometimes you split male and female, sometimes counties, sometimes income groups. And we’re trying to make this in a coherent graphic and colour scheme, so that we really can convey an upgraded world view. Very, very simple and beautiful, but with very few jokes. Just straightforward understanding. And for climate impact we will relate to the economy: to the number of people at different economic levels, how much energy they use, and then drill down into the type of energy they use and how that energy source mix affects carbon dioxide emissions. And we will project trends forward. We will rely on the official and most credible trend forecasts – for population, and one, two or more for energy and economic trends, etc. But we will not go into what needs to be done, or how it should be achieved. We will stay away from politics. We will stay away from all data which is under debate, and just use data with good consensus, so that we create a basic worldview. Users can then benefit from an upgraded world view when thinking and debating about the future. That’s our idea. If we provide the very basic worldview, others will create more precise data in each area and break it down into details.

A group of people inspired by your talk in Helsinki are currently starting a working group dedicated to opening up and reusing CO2 data. What advice would you give them, and what would you suggest they focus on?

Put me in contact with them! We can just go for one indicator: carbon dioxide emissions per person per year. Swift reporting. Just that.

Thank you very much Professor Rosling.

Thank you.
If you want help to liberate, analyse or communicate carbon emissions data in your country, you can join the OKFN’s Open Sustainability Working Group.