Open Knowledge Festival Spotlight: The Knowledge Stream

- June 11, 2014 in Data Journalism, disability, inclusivity, Lobbying, low-tech, media, Open Access, Open Science, open-education, Programme, Public Domain, Transparency

To create societies where everyone has both access to key information and the ability to use it to understand and shape their lives, we must build knowledge into the heart of all of our activities. This is a big task which requires not just a global shift in mindset, but also that we build the […]

Data Roundup, 16 April

- April 16, 2014 in air, Cities, Data Roundup, deaths, England, Google, InfoAmazonia, International journalism festival, Landline, Lobbying, pollution, ProPublica, resilient, Stateline, tech, top tweets, world

Ana_Cotta – saudades da Amazônia

Tools, Events, Courses

On Wednesday the 30th, the eighth edition of the International Journalism Festival will take place in Perugia. The event has become one of the most important of its kind in Europe, and it will host hundreds of journalists from all over the world. The IJF will also be the location of the 2014 School of Data Journalism, now in its third edition and jointly organized by the European Journalism Centre and the Open Knowledge Foundation. The School will start on May 1st and will feature 25 instructors from world-leading newspapers, universities, and think tanks.

ProPublica just announced the release of two JavaScript libraries. The first one, Landline, helps developers turn GeoJSON data into browser-side SVG maps. The second, Stateline, is built on top of Landline and simplifies the creation of US choropleth maps.

Data Stories

Chris Michael from the Guardian Data Blog recently published a short article listing the world’s most resilient cities. Michael extracted data from a study by Grosvenor, a London-based company which measured resilience by assigning a value to cities’ vulnerability to environmental changes and their capacity to face political or economic threats.

British citizens might be interested in the quality of the air they breathe every day. Those who are worried about air pollution should take a look at George Arnett’s interactive choropleth map showing the percentage of deaths caused by particulate air pollution in England.

What is the role of the world’s tech giants in politics? Tom Hamburger and Matea Gold tried to explain it in this article in the Washington Post by observing the evolution of Google’s lobbying activities at the White House. Google’s political influence has increased enormously since 2002, making the company the second largest spender on lobbying in the US.

Are all conservatives conservative in the same way, or is there a certain degree of moderation among them on different issues? On his newly launched FiveThirtyEight, Nate Silver tackles the question by displaying data on the “partisan split” between the two US parties on several main topics.

If you are Catholic, or maybe just curious, you should be very interested in seeing The Visual Agency’s latest infographic, which represents through a series of vertical patterns the number, geographical area, and social level of the professions of all Catholic saints.

Gustavo Faleiros, an ICFJ Knight International Journalism Fellow, is about to present InfoAmazonia to the public. It is a new data journalism site which will monitor environmental changes in the southern part of South America using both satellite and on-the-ground data.

In addition, as environmental changes increase, so does the number of deaths of environmental and land defenders. The Global Witness team has just published its latest project, Deadly Environment, a 28-page report containing data and important insights on the rise of this phenomenon, which is growing rapidly year by year, especially in South America.

Data Sources

Michael Corey is a news app developer who was involved in the development of the National Public Radio mini-site named Borderland. In this post, he analyses the main features of the geographical digital tools that he used to collect and display data on the US-Mexico border, which helped him accurately locate the fences built by the US government all along the line that separates the two countries.

The data-driven journalism community is expanding rapidly, especially on Twitter. If you need a useful recap of what has been tweeted and retweeted by data lovers, then the Global Investigative Journalism Network #ddj top ten is what you need.

How to study lobbying with crowdsourced open data

- May 10, 2011 in Crowd Sourcing, External, Featured Project, France, Government, Guest post, Lobbying, Open Data, Open Government Data, Regards Citoyens, Transparency International, WG EU Open Data, WG Open Government Data, Working Groups

The following guest post is from Regards Citoyens, a French organisation that promotes open data.

For about a year, Regards Citoyens has been working together with the French chapter of Transparency International to bring more transparency to the processes of influence and lobbying within the French parliament. Lobbying is a very controversial subject in France: we discuss it a lot, but we do not know much about it. So we decided to try and study the visible part of this mysterious iceberg by bringing some new data to the public debate.

On a regular basis, MPs publish official reports on the preparation of their legislative and government evaluation work. It makes sense that they would listen to anyone concerned with the topic at hand during this process. But is this done in a fair, plural and transparent way? Are corporations and unions listened to on an equal footing? What about NGOs and other civil society actors?

Much like the European Parliament did, the French Assembly recently created an official register of lobbyists who are granted access to the hallways. But it turns out that this register does not contain more than a hundred names.
A few official reports from MPs
We decided to take a closer look and try to get a more complete list by browsing through all 1,174 reports published between July 2007 and July 2010. Some of them include an appendix listing all the hearings organised during the preparation of the report. Unfortunately, we quickly discovered that most reports do not feature such a list: using text analysis tools, we found one in only 38% of the reports. Even this small visible part of influence seriously lacks transparency. But that already provided us with an important dataset of 16,000 names, much more than the few officially registered lobbyists.

Our main concern then was to identify the organisation behind each of these names. Doing so was sometimes easy (the organisation was mentioned next to the name in the appendix) and sometimes a bit harder (requiring us to read parts of the report, for instance). So we decided to develop a crowdsourcing tool allowing anyone to participate. We built an application, available under a free licence (the AGPL), to process the names one by one, each of them by at least three different users in order to validate the data. The idea was to let anyone contribute easily for just a few minutes, without having to register. Registration was only needed to appear on the top 50 contributors ladder.

The simple and dynamic Ajax-based interface (fields pre-filled, reports pre-loaded and scrolled to the right place), the fun of discovering lobbyists while “digitizing them” and the competitive aspect provided by the ladder certainly helped a lot. Within a couple of days a good buzz started, and while we expected the crowdsourcing to take a couple of months, everything was achieved in only 10 days thanks to more than 3,000 citizens! This process brought us a database of 16,000 hearings with the name, sex, function and organisation of each of the lobbyists.
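The post only states that each name was processed by at least three different users; it does not describe the exact agreement rule. As a minimal sketch, assuming a name counts as validated once three distinct users have submitted the same organisation (the function name, the normalisation step and the sample data below are all hypothetical, not taken from the Regards Citoyens application), the validation step could look like this:

```python
from collections import Counter

# Assumption: three matching answers from distinct users validate a name.
MIN_VOTES = 3


def validate(submissions, min_votes=MIN_VOTES):
    """Return the agreed organisation for one name, or None if not validated.

    submissions: list of (user_id, organisation) pairs for a single name.
    """
    # Keep a single answer per user so that one person cannot vote several
    # times (the latest answer wins); normalise case and whitespace.
    answer_by_user = {}
    for user_id, organisation in submissions:
        answer_by_user[user_id] = organisation.strip().lower()

    if len(answer_by_user) < min_votes:
        return None

    # The most frequent answer is accepted only if enough users gave it.
    organisation, votes = Counter(answer_by_user.values()).most_common(1)[0]
    return organisation if votes >= min_votes else None


if __name__ == "__main__":
    print(validate([("u1", "ACME"), ("u2", "acme "), ("u3", "Acme")]))
    print(validate([("u1", "ACME"), ("u2", "Other"), ("u3", "ACME")]))
```

Under this rule, a name with conflicting answers simply stays in the queue until a third matching answer arrives, which is one plausible way to keep anonymous contributions reliable without requiring registration.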
After some brief discussions with the National Assembly and the CNIL (the French commission for privacy rights), we decided to release only the names of the organisations and not those of the people. Even though the names are already public, since they come from official reports, these institutions were unable to agree on whether the names of lobbyists were public or private information. In the end, we decided to anonymise the data and make sure no illegal database of religious or union affiliation could be published out of it.

Using Freebase GridWorks, we finally refined the data and consolidated it into 9,300 grouped hearings of organisations, which were associated with the theme of each report. But to be able to draw trends, we needed to categorise these organisations by interest: unions, corporations, individuals, religious organisations, think tanks, NGOs and associations. We first used the EU registry, but the large number of organisations we needed to classify quickly revealed the limits of the Commission’s categories, especially regarding public sector organisations. So we decided to improve on it and progressively build our own categorisation of interest representatives (fr) while gradually categorising the data.

With all of this enriched data in hand, Transparency International (TI) started browsing it and drafted an insightful study based on the results (fr). At the same time, we worked on developing a visualisation in order to present the data in ways people could easily understand and browse. Inspired by WhereDoesMyMoneyGo’s first design, we used the powerful Raphael JavaScript library to put out, in a couple of weeks, a fully accessible application for browsing all of this information by theme and in finer detail.

But what did we learn? First, that on all subjects there were considerably fewer hearings with women (24%), with the only exception of the reports regarding… gender issues, of course!
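Figures of this kind are simple shares of hearings, computed overall or within each theme. The sketch below illustrates the arithmetic only: the field names (`theme`, `category`) and the four sample rows are hypothetical and are not taken from the published dataset.

```python
from collections import Counter, defaultdict


def shares(hearings, key):
    """Return {value: percentage of hearings} for the given attribute."""
    counts = Counter(hearing[key] for hearing in hearings)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


def shares_by_theme(hearings, key):
    """Compute the same shares separately for each report theme."""
    by_theme = defaultdict(list)
    for hearing in hearings:
        by_theme[hearing["theme"]].append(hearing)
    return {theme: shares(rows, key) for theme, rows in by_theme.items()}


# Hypothetical sample rows, for illustration only.
hearings = [
    {"theme": "energy", "category": "company"},
    {"theme": "energy", "category": "company"},
    {"theme": "energy", "category": "public sector"},
    {"theme": "development aid", "category": "ngo"},
]

if __name__ == "__main__":
    print(shares(hearings, "category"))
    print(shares_by_theme(hearings, "category"))
```

Comparing the per-theme shares against the overall shares is what makes it possible to spot themes where one category is over- or under-represented.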
The study also reveals that during their hearings MPs mainly listen to administrations and organisations from the public sector (48%). Trade unions and other professional organisations come next, followed in third place by private companies. NGOs and civil society organisations account for only 7% of cases.

But the most interesting conclusion probably comes from comparing the categories for each specific theme. We can observe that companies are heard more often on topics like the economy, energy and the environment, and, probably more surprisingly, on transport, culture or digital issues. On the other hand, civil society organisations are more present on topics like development aid or veterans.

All of these results of course concern only the visible part of lobbying, but taking a close look at the holes (like the surprisingly low number of hearings for private companies on health issues) provides interesting insights and validates our conclusion: transparency in France is definitely lacking in this area!

Of course, all of the anonymised data that was generated for this study is republished as open data under the ODbL licence and is freely reusable. We completed the data with extra information such as the authors and political groups of the reports. This means there are certainly plenty of other possible uses for this data! We’re convinced that making it open data can only bring more great projects!

Read the study and browse the visualisation online.
