You are browsing the archive for Open Spending.

OpenSpending platform update

Paul Walsh - August 16, 2017 in Open Knowledge, Open Spending

Introduction

OpenSpending is a free, open and global platform to search, visualise, and analyse fiscal data in the public sphere. This week, we soft launched an updated technical platform, with a newly designed landing page. Until now dubbed “OpenSpending Next”, this is a completely new iteration on the previous version of OpenSpending, which has been in use since 2011. At the core of the updated platform is Fiscal Data Package, an open specification for describing and modelling fiscal data, developed in collaboration with GIFT. Fiscal Data Package affords a flexible approach to standardising fiscal data, minimising constraints on publishers and source data via a modelling concept, and enabling progressive enhancement of data description over time. We’ll discuss this in more detail below. From today:
  • Publishers can get started publishing fiscal data with the interactive Packager, and explore the possibilities of the platform’s rich API, advanced visualisations, and options for integration.
  • Hackers can work on a modern stack designed to liberate fiscal data for good! Start with the docs, chat with us, or just start hacking.
  • Civil society can access a powerful suite of visualisation and analysis tools, running on top of a huge database of open fiscal data. Discover facts, generate insights, and develop stories. Talk with us to get started.
All the work that went into this new version of OpenSpending was only made possible by our funders along the way. We want to thank Hewlett, Adessium, GIFT, and the OpenBudgets.eu consortium for helping fund this work. As this is now completely public, replacing the old OpenSpending platform, we do expect some bugs and issues. If you see anything, please help us by opening a ticket on our issue tracker.

Features

The updated platform has been designed primarily around the concept of centralised data, decentralised views: we aim to create a large, and comprehensive, database of fiscal data, and provide various ways to access that data for others to build localised, context-specific applications on top. The major features of relevance to this approach are described below.

Fiscal Data Package

As mentioned above, Fiscal Data Package affords a flexible approach to standardising fiscal data. Fiscal Data Package is not a prescriptive standard, and imposes no strict requirements on source data files. Instead, users “map” source data columns to “fiscal concepts”, such as amount, date, functional classification, and so on, so that systems that implement Fiscal Data Package can process a wide variety of sources without requiring change to the source data formats directly. A minimal Fiscal Data Package only requires mapping an amount and a date concept. There are a range of additional concepts that make fiscal data usable and useful, and we encourage the mapping of these, but do not require them for a valid package. Based on this general approach to specifying fiscal data with Fiscal Data Package, the updated OpenSpending likewise imposes no strict requirements on naming of columns, or the presence of columns, in the source data. Instead, users (of the graphical user interface, and also of the application programming interfaces) can provide any source data, and iteratively create a model on top of that data that declares the fiscal measures and dimensions.
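The mapping idea above can be sketched in a few lines of code. Note that the field and concept names below are illustrative stand-ins, not the exact Fiscal Data Package schema: the point is that source columns stay untouched and a model declares which column plays which fiscal role.

```python
# A minimal sketch of the "mapping" approach: source columns are declared as
# fiscal concepts in a model rather than renamed in the data itself.
# Names here ("source", "Expenditure", etc.) are illustrative, not the spec.

minimal_model = {
    "measures": {
        # maps the source column "Expenditure" to the required amount concept
        "amount": {"source": "Expenditure", "currency": "EUR"},
    },
    "dimensions": {
        # maps the source column "FiscalYear" to the required date concept
        "date": {"source": "FiscalYear"},
        # optional concepts (functional classification, etc.) can be mapped
        # later, in the spirit of progressive enhancement
    },
}

def is_minimal(model):
    """A package is minimally modelled once amount and date are mapped."""
    return ("amount" in model.get("measures", {})
            and "date" in model.get("dimensions", {}))

print(is_minimal(minimal_model))  # True
```

Because nothing here constrains the source column names, the same model structure can wrap very differently shaped source files.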

GUIs

Packager

The Packager is the user-facing app that is used to model source data into Fiscal Data Packages. Using the Packager, users first get structural and schematic validation of the source files, ensuring that data entering the platform is validly formed, and then they can model the fiscal concepts in the file in order to publish the data. After initial modelling, users can also remodel their data sources, taking a progressive enhancement approach to improving data added to the platform.

Explorer

The Explorer is the user-facing app for exploration and discovery of data available on the platform.

Viewer

The Viewer is the user-facing app for building visualisations around a dataset, with a range of options for presentation and for embedding views into third-party websites.

DataMine

The DataMine is a custom query interface powered by Re:dash for deep investigative work over the database. We’ve included the DataMine as part of the suite of applications as it has proved incredibly useful when working in conjunction with data journalists and domain experts, and also for doing quick prototype views on the data, without the limits of API access, as one can use SQL directly.

APIs

Datastore

The Datastore is flat-file storage holding source data as Fiscal Data Packages, providing direct access to the raw data. All other databases are built from this raw data storage, giving us a clear mechanism for progressively enhancing the database as a whole, as well as for building on this to provide such features directly to users.
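A self-contained sketch of this flat-file idea, using only the standard library: the raw data and its descriptor sit side by side on disk, and any derived database is rebuilt from them. The file names and descriptor fields here are illustrative, not the actual Datastore layout.

```python
import csv
import json
import os
import tempfile

# Simulate a tiny flat-file store: one resource CSV plus its descriptor.
store = tempfile.mkdtemp()

with open(os.path.join(store, "budget.csv"), "w", newline="") as f:
    csv.writer(f).writerows([
        ["Expenditure", "FiscalYear"],
        ["1000.00", "2016"],
        ["2500.00", "2017"],
    ])

descriptor = {"name": "example-budget", "resources": [{"path": "budget.csv"}]}
with open(os.path.join(store, "datapackage.json"), "w") as f:
    json.dump(descriptor, f)

def load_rows(store_dir):
    """Rebuild a derived view (here, a list of dicts) from the raw files."""
    with open(os.path.join(store_dir, "datapackage.json")) as f:
        desc = json.load(f)
    path = os.path.join(store_dir, desc["resources"][0]["path"])
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

rows = load_rows(store)
print(len(rows))  # 2
```

Because every derived database can be regenerated from the raw packages, enhancements to the processing logic can be applied retroactively to the whole corpus.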

Analytics and Search

The Analytics API provides a rich query interface for datasets, and the Search API provides exploration and discovery capabilities across the entire database. At present, search covers only metadata, but we plan to iterate towards full search over all fiscal data lines.
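The shape of an analytics query is worth illustrating: pick a dataset, aggregate a measure, drill down by a dimension, and filter by another. The endpoint path and parameter names below are hypothetical stand-ins, not the documented OpenSpending API, so treat this as a sketch of the query pattern only.

```python
from urllib.parse import urlencode

def build_aggregate_url(base, dataset, measure, drilldown, cuts):
    """Build an illustrative aggregate-query URL.

    All names here (the path segments, 'aggregates', 'drilldown', 'cut')
    are hypothetical, chosen to show the typical OLAP-style query shape.
    """
    params = {
        "aggregates": measure,                 # measure to sum, e.g. amount
        "drilldown": drilldown,                # dimension to group by
        "cut": "|".join(f"{k}:{v}" for k, v in cuts.items()),  # filters
    }
    return f"{base}/{dataset}/aggregate?{urlencode(params)}"

url = build_aggregate_url(
    "https://example.org/api/3/cubes",         # hypothetical base URL
    "example-budget",
    "amount.sum",
    "functional_classification",
    {"date.year": "2016"},
)
print(url)
```

A client would then GET this URL and receive aggregated cells, one per distinct value of the drilldown dimension.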

Data Importers

Data Importers are based on a generic data pipelining framework developed at Open Knowledge International called Data Package Pipelines. Data Importers enable us to do automated ETL to get new data into OpenSpending, including the ability to update data from the source at specified intervals. We see Data Importers as key functionality of the updated platform, allowing OpenSpending to grow well beyond the one thousand plus datasets that have been uploaded manually over the last five or so years, towards tens of thousands of datasets. A great example of how we’ve put Data Importers to use is in the EU Structural Funds data that is part of the Subsidy Stories project.
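Data Package Pipelines describes each pipeline declaratively as a list of processing steps run over a stream of rows. The following is a pure-Python sketch of that row-streaming idea, not the framework’s actual API: each step is a generator that transforms or filters rows, and the pipeline chains them lazily.

```python
# Illustrative row-streaming pipeline in the spirit of Data Package Pipelines.
# Step and field names are made up for the example.

def parse_amounts(rows):
    """Step 1: cast the amount field from string to float."""
    for row in rows:
        row["amount"] = float(row["amount"])
        yield row

def drop_negatives(rows):
    """Step 2: filter out rows with negative amounts."""
    for row in rows:
        if row["amount"] >= 0:
            yield row

def run_pipeline(rows, steps):
    """Chain the generator steps, then materialise the result."""
    for step in steps:
        rows = step(rows)
    return list(rows)

source = [{"amount": "100.5"}, {"amount": "-3"}, {"amount": "7"}]
result = run_pipeline(iter(source), [parse_amounts, drop_negatives])
print(result)  # [{'amount': 100.5}, {'amount': 7.0}]
```

Because each step only sees one row at a time, the same structure scales from toy examples to the automated ETL of thousands of datasets.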

Iterations

It is slightly misleading to announce the launch today, when we’ve in fact been using and iterating on OpenSpending Next for almost two years. Some highlights from that process that led to the platform we have today are as follows.

SubsidyStories.eu with Adessium

Adessium provided Open Knowledge International with funding towards fiscal transparency in Europe, which enabled us to build out significant parts of the technical platform, commission work with J++ on Agricultural Subsidies, and engage in a productive collaboration with Open Knowledge Germany on what became SubsidyStories.eu, which even led to another initiative from Open Knowledge Germany called The Story Hunt. This work directly contributed to the technical platform by providing an excellent use case for the processing of a large, messy amount of source data into a normalised database for analysis, and doing so while maintaining data provenance and the reproducibility of the process. There is much to do in streamlining this workflow, but the benefits, in terms of new use cases for the data, are extensive. We are particularly excited by this work, and the potential to continue in this direction, by building out a deep, open database as a potential tool for investigation and telling stories with data.

OpenBudgets.eu via Horizon 2020

As part of the OpenBudgets.eu consortium, we were able to both build out parts of the technical platform, and have a live use case for the modularity of the general architecture we followed. A number of components from the core OpenSpending platform have been deployed into the OpenBudgets.eu platform with little to no modification, and the analytical API from OpenSpending was directly ported to run on top of a triple store implementation of the OpenBudgets.eu data model. An excellent outcome of this project has been the close and fruitful work with both Open Knowledge Germany and Open Knowledge Greece on technical, community, and journalistic opportunities around OpenSpending, and we plan for continuing such collaborations in the future.

Work on Fiscal Data Package with GIFT

Over three phases of work since 2015 (the third phase is currently running), we’ve been developing Fiscal Data Package as a specification to publish fiscal data against. Over this time, we’ve done extensive testing of the specification against a wide variety of data in the wild, and we are iterating towards a v1 release of the specification later this year. We’ve also been piloting the specification, and OpenSpending, with national governments. This has enabled extensive testing of both the manual modeling of data to the specification using the OpenSpending Packager, and automated ETL of data into the platform using the Data Package Pipelines framework. This work has provided the opportunity for direct use by governments of a platform we initially designed with civil society and civic tech actors in mind. We’ve identified difficulties and opportunities in this arena at both the implementation and the specification level, and we look forward to continuing this work and solving use cases for users inside government.

Credits

Many people have been involved in building the updated technical platform. Work started back in 2014 with an initial architectural vision articulated by our peers Tryggvi Björgvinsson and Rufus Pollock. The initial vision was adapted and iterated on by Adam Kariv (Technical Lead) and Sam Smith (UI/X), with Levko Kravets, Vitor Baptista, and Paul Walsh. We reused and enhanced code from Friedrich Lindenberg. Lazaros Ioannidis and Steve Bennett made important contributions to the code and the specification respectively. Diana Krebs, Cecile Le Guen, Vitoria Vlad and Anna Alberts have all contributed with project management, and feature and design input.

What’s next?

There is always more work to do. In terms of technical work, we have a long list of enhancements.
However, while the work we’ve done over the last few years has been very collaborative with our specific partners, and always towards identified use cases and user stories in the partnerships we’ve been engaged in, it has not, in general, been community-facing. In fact, a noted lack of community engagement goes back to before we started on the new platform we are launching today. This has to change, and it will be an important focus moving forward. Please drop by our forum for any feedback, questions, and comments.

Government procurement data as open data – a great step forward on the way?!

Open Knowledge Finland - August 9, 2017 in avoin data, avoin hallinto, Featured, godi, godi 2016, hansel, julkiset hankinnat, Open Government Data, Open Government Partnership, Open Spending

The need to increase the openness of information was noted in Helsingin Sanomat’s editorial of 7 August 2017. We agree! In recent years, several Finnish municipalities have published information about their own procurement, even at the receipt level. This practice is set to expand with the new so-called Hansel Act, under which the procurements of ministries, institutions, agencies and possibly the regions would be published centrally by the state procurement company Hansel Oy. The government is preparing the new Hansel Act (Government proposal HE 63/2017 vp, a proposal to amend the act on the limited company Hansel Oy). The bill is being considered by the Commerce Committee, where its details are being finalised. From the perspective of openness and open data, the most interesting parts of the bill are those proposing a new provision on the right to obtain and process procurement information (sections 2 and 5). In our view, if enacted, the Hansel Act will increase the openness of government in an excellent way. At the same time as Finland fulfils its international commitments, tax funds will be used more efficiently, competition for public procurement will be fairer, and public spending will generally become more open.

The Hansel Act and open data on procurement

Hansel Oy has acted as the central purchasing body of the Finnish state, tendering on behalf of its customers the kinds of goods and services that are widely used across central government. Hansel’s tasks have also included tendering customers’ own procurements and various expert tasks related to procurement. In recent years, the company’s tasks have evolved, among other things through the state procurement digitalisation programme, which is why some clarifications to the company’s tasks are proposed in the act. In other words, the act brings the company’s tasks up to date. The Commerce Committee has reportedly refined the wording of the bill further, but the latest public version (HE 63/2017) describes Hansel’s changed tasks as follows (emphasis added by the author): Section 2, Tasks of the company, subsection 2: The company’s task is to provide its customers with central purchasing activities and procurement support activities. The company maintains procurement contracts and provides its customers with expert services related to procurement contracts. In addition, the company’s task is to provide its customers with expert and development services related to procurement, as well as services for the processing and analysis of procurement information and related technical solutions. Section 5, Right of access to information and production of information, subsection 4: The company may produce, hand over and publish data materials containing procurement information, provided that releasing the material is not, due to the search criteria used in compiling it, the quantity, quality or content of the data, or the intended use of the material, contrary to the provisions on secrecy and the protection of personal data. Below are some of Open Knowledge Finland ry’s views on the act.

The Hansel Act increases healthy competition and efficiency in public procurement

Openness of procurement data promotes fair competition between suppliers. Comparative procurement data brings cost-efficiency to procurement and thus to the use of tax funds. When procurements are described in a comparable way, it becomes easy to monitor, for example, whether one unit pays a markedly different price than another, or whether there is anything else unusual or noteworthy in the procurements that could perhaps be improved (such as purchases piling up at the end of the year). In the longer term, comparable data could be enriched or combined with information on origin, certifications, ethical information or other reference data. Detailed procurement data can also help prevent corruption and the grey economy.

The Openness Act, transparency and the right to information also apply to procurement data

In any case, under the Act on the Openness of Government Activities, citizens, organisations and the media already have the right to information – including procurement information – except in special cases such as security matters or certain types of trade secrets. Regardless of the Hansel Act, this right to information exists, and no changes to it are in sight. The interesting change is that the Hansel Act will bring clarity and consistency to the practices of publishing this information, and thereby implements and refines the spirit of the Openness Act.

A single actor publishing procurement data is an efficient way to increase transparency without a large administrative burden on public administration

One of the challenges of opening data in public administration has been the number of differing interpretations of the Openness Act – this came up, among other places, in the report “Avoimen datan hyödyntäminen ja vaikuttavuus” (“Use and impact of open data”), produced for the Government’s analysis, assessment and research activities by ETLA and Open Knowledge Finland. Similarly, cities opening their purchase invoice data have applied differing practices and data formats; the 6Aika programme and the Association of Finnish Municipalities have sought to harmonise these practices through guidance. The planned arrangement harmonises practices across ministries, agencies and research institutions. It also lightens the administrative burden, as tasks such as data cleaning, formatting, prioritisation, problem-solving, documentation, publication practices and other support functions are handled in one place: Hansel thus acts as a kind of clearinghouse, quality assurer and interface for data users. Consistent practices, in turn, not only make public administration more efficient in publishing data but also make the data easier to find and use. Incidentally, the information services envisaged in the new bill complement other ongoing initiatives, such as the YTI project and the Kuntatieto programme. Hansel is steered by the Ministry of Finance, and openness of procurement data is one element of the Digitalisation of Government Procurement implementation programme, so the will to modernise seems to exist more broadly.

International leadership and delivering on commitments made

In general, Finland ranks fairly well in international comparisons of open data and open knowledge. For example, in Open Knowledge International’s latest Global Open Data Index 2016 we are in 5th place. On the other hand, specifically in the openness of financial data we do rather poorly: procurement (“procurement” – tender notices and contracts) is rated only “45% open”, and purchases (“Government spending” – actual expenditure) sit at an alarming 0%! With the Hansel Act, we keep up with international developments by strengthening these acknowledged weaknesses. Finland also participates in the Open Government Partnership, initiated by former US President Obama, in which the governments of more than 70 countries, together with civil society, make concrete, binding commitments to advance openness. Openness of procurement data is also one of the concrete commitments in Finland’s third Open Government Action Plan (2017–2019). The action plan states the following.
  1. Commitment
Publish the state’s procurement data to citizens. Information on what the state buys, with what money and from where will be published openly online. The state’s procurement data will be published as open data in spring 2017. At the same time, a service open to everyone will be implemented, where citizens and companies can follow, in near real time, the use of money related to state procurement. The services’ content consists of the public data on procurements, showing what state organisations procure and from where. In itself, the Hansel Act and the procurement information services it describes are not necessarily designed “only” for citizens; the services naturally have many different users, such as companies, the media and the public sector itself. What matters above all is that the data is opened. Different actors can then build interesting applications from their own perspectives – one makes comparisons or visualisations, another sales and marketing tools, and so on! In this way, different actors complement Hansel’s expertise and offering – after all, information is not used up by sharing it. One point of comparison could be the state budget and the applications around it: www.valtionbudjetti.fi, built by Hahmota Oy to describe the state budget, in a way complements the Ministry of Finance’s own www.tutkibudjettia.fi service. Similarly, in addition to the online tool Hansel may produce (for exploring and analysing procurements by chosen criteria), it is quite possible that other services or analysis tools around procurement will emerge. To implement the planned act and the open government commitment mentioned above, Hansel has, as we understand it, been sketching a future online service in which procurements could be analysed. Below are a few example screenshots of the application, which give an idea of what the service could look like. These screenshots are of course indicative, but they look promising.
For more detailed analyses, anyone could then download data filtered by suitable criteria. More generally, it would make sense to keep increasing openness and open data in the economy. In our view, the goal should be that the “trinity” of public budgets, contracts and procurements is openly available in standard formats. Procurement data is an excellent step. At Open Knowledge, we look forward to the new Hansel Act and, more broadly, to the increasing openness of public procurement. Openness not only dispels possible mistrust, it also increases efficiency and fair competition. This is a matter of taxpayers’ interest and of fairness. The post “Valtion hankintatiedot avoimena datana – hieno edistysaskel tulossa?!” appeared first on Open Knowledge Finland.

csv,conf,v3

Daniel Fowler - May 30, 2017 in Events, Frictionless Data, OD4D, Open Spending

The third manifestation of everyone’s favorite community conference about data—csv,conf,v3—happened earlier this May in Portland, Oregon. The conference brought together data makers/doers/hackers from various backgrounds to share knowledge and stories about data in a relaxed, convivial, alpaca-friendly (see below) environment. Several Open Knowledge International staff working across our Frictionless Data, OpenSpending, and Open Data for Development projects made the journey to Portland to help organize, give talks, and exchange stories about our lives with data. Thanks to Portland and the Eliot Center for hosting us. And, of course, thanks to the excellent keynote speakers Laurie Allen, Heather Joseph, Mike Bostock, and Angela Bassa who provided a great framing for the conference through their insightful talks. Here’s what we saw.

Talks We Gave

The first priority for the team was to present on the current state of our work and Open Knowledge International’s mission more generally. In his talk, Continuous Data Validation for Everybody, developer Adrià Mercader updated the crowd on the launch and motivation of goodtables.io: It was a privilege to be able to present our work at one of my favourite conferences. One of the main things attendees highlight about csv,conf is how diverse it is: many different backgrounds were represented, from librarians to developers, from government workers to activists. Across many talks and discussions, the need to make published data more useful to people came up repeatedly. Specifically, how could we as a community help people publish better quality data? Our talk introducing goodtables.io presented what we think will be a dominant approach to this question: automated validation. Building on successful practices in software development like automated testing, goodtables.io integrates into the data publication process to allow publishers to identify issues early and ensure data quality is maintained over time. The talk was very well received, and many people reached out to learn more about the platform. Hopefully, we can continue the conversation to ensure that automated (frictionless) data validation becomes the standard in all data publication workflows. David Selassie Opoku presented When Data Collection Meets Non-technical CSOs in Low-Income Areas: csv,conf was a great opportunity to share highlights of the OD4D (and School of Data) team’s data collection work. The diverse audience seemed to really appreciate insights on working with non-technical CSOs in low-income areas to carry out data collection.
In addition to highlighting the lessons from the work and its potential benefit to other regions of the world, I got to connect with data literacy organisations such as Data Carpentry, who are currently expanding their work in Africa and could help foster potential data literacy training partnerships. As a team working with CSOs in low-income areas like Africa, School of Data stands to benefit from continuing conversations with data “makers” in order to present potential use cases. A clear example I cited in my talk was Kobo Toolbox, which continues to mitigate several daunting challenges of data collection through abstraction and simple user interface design. Staying in touch with the csv,conf community may highlight more such scenarios, which could lead to the development of new tools for data collection. Paul Walsh, in his talk titled Open Data and the Question of Quality (slides), talked about lessons learned from working on a range of government data publishing projects and what we can do as citizens to demand better quality data from our governments.
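The automated validation idea behind goodtables.io can be illustrated with a toy example. This is emphatically not the goodtables implementation, just a sketch of the kind of structural checks such a tool runs on every publication: catching blank rows and rows whose width disagrees with the header.

```python
# Toy structural validator in the spirit of automated tabular data checks.
# Error codes ("blank-row", "ragged-row") are made up for the example.

def validate_rows(rows):
    """Return a list of (line_number, error_code) for structural problems."""
    errors = []
    header = rows[0]
    for lineno, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            errors.append((lineno, "blank-row"))
        elif len(row) != len(header):
            errors.append((lineno, "ragged-row"))
    return errors

table = [
    ["id", "amount"],
    ["1", "100.00"],
    ["", ""],              # blank row
    ["2", "250.00", "x"],  # extra cell
]
print(validate_rows(table))  # [(3, 'blank-row'), (4, 'ragged-row')]
```

Hooked into a publication workflow, checks like these surface problems at publish time, the same way a failing test surfaces a bug before release.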

Talks We Saw

Of course, we weren’t there only to present; we were there to learn from others as well. Even before the conference, our Frictionless Data project had put us in contact with various developers and thinkers around the world who also presented talks. Eric Busboom presented Metatab, an approach to packaging metadata in spreadsheets. Jasper Heefer of Gapminder talked about DDF, a data description format and associated data pipeline tool to help us live a more fact-based existence. Bob Gradeck of the Western Pennsylvania Regional Data Center talked about data intermediaries in civic tech, a topic near and dear to our hearts here at Open Knowledge International.

Favorite Talks

Paul’s:
  • “Data in the Humanities Classroom” by Miriam Posner
  • “Our Cities, Our Data” by Kate Rabinowitz
  • “When Data Collection Meets Non-technical CSOs in Low Income Areas” by David Selassie Opoku
David’s:
  • “Empowering People By Democratizing Data Skills” by Erin Becker
  • “Teaching Quantitative and Computational Skills to Undergraduates using Jupyter Notebooks” by Brian Avery
  • “Applying Software Engineering Practices to Data Analysis” by Emil Bay
  • “Open Data Networks with Fieldkit” by Eric Buth
Jo’s:
  • “Smelly London: visualising historical smells through text-mining, geo-referencing and mapping” by Deborah Leem
  • “Open Data Networks with Fieldkit” by Eric Buth
  • “The Art and Science of Generative Nonsense” by Mouse Reeve
  • “Data Lovers in a Dangerous Time” by Brendan O’Brien

Data Tables

This csv,conf was the first csv,conf to have a dedicated space for working with data hands-on. In past events, attendees left with their heads buzzing full of new ideas, tools, and domains to explore but had to wait until returning home to try them out. This time we thought: why wait? During the talks, we had a series of hands-on workshops where facilitators could walk through a given product and chat about the motivations, challenges, and other interesting details you might not normally get to in a talk. We also prepared several data “themes” before the conference meant to bring people together on a specific topic around data. In the end, these themes proved a useful starting point for several of the facilitators and provided a basis for a discussion on cultural heritage data following on from a previous workshop on the topic. The facilitated sessions went well. Our own Adam Kariv walked through Data Package Pipelines, his ETL tool for data based on the Data Package framework. Jason Crawford demonstrated Fieldbook, a tool for easily managing a database in-browser as you would a spreadsheet. Bruno Vieira presented Bionode, going into fascinating detail on the mechanics of Node.js Streams. Nokome Bentley walked through a hands-on introduction to accessible, reproducible data analysis using Stencila, a way to create interactive, data-driven documents using the language of your choice to enable reproducible research. Representatives from data.world, an Austin startup we worked with on an integration for Frictionless Data, also demonstrated uploading datasets to data.world. The final workshop was conducted by several members of the Dat team, including co-organizer Max Ogden, with a super enthusiastic crowd. Competition from the day’s talks was always going to be fierce, but it seems that many attendees found some value in the more intimate setting provided by Data Tables.

Thanks

If you were there at csv,conf in Portland, we hope you had a great time. Of course, our thanks go to the Gordon and Betty Moore Foundation and to the Sloan Foundation for enabling me and my fellow organizers John Chodacki, Max Ogden, Martin Fenner, Karthik, Elaine Wong, Danielle Robinson, Simon Vansintjan, Nate Goldman and Jo Barratt, who all put in so much personal time and effort to bring this all together. Oh, and did I mention the Comma Llama Alpaca? You, um, had to be there.

Making European Subsidy Data Open

Michael Peters - April 24, 2017 in OK Germany, Open Government Data, Open Spending

One month after releasing subsidystories.eu, a joint project of Open Knowledge Germany and Open Knowledge International, we have some great news to share. Due to the extensive outreach of our platform and the data quality report we published, new datasets have been sent directly to us by several administrations. We have recently added new data for Austria, the Netherlands, France and the United Kingdom. Furthermore, the first Romanian data recently arrived and should be available in the near future. Now that the platform is up and running, we want to explain how we actually worked on collecting and opening all the beneficiary data. Subsidystories.eu is a tool that enables the user to visualize, analyze and compare subsidy data across the European Union, thereby enhancing transparency and accountability in Europe. To make this happen, we first had to collect the datasets from each EU member state and scrape, clean, map and then upload the data. Collecting the data was an incredibly frustrating process, since EU member states publish the beneficiary data in their own country-specific (and regional) portals, which had to be located and often translated.

A scraper’s nightmare: different websites and formats for every country

The variety in how data is published throughout the European Union is mind-boggling. Few countries publish information on all three concerned ESIF Funds (ERDF, ESF, CF) in one online portal, while most have separate websites distinguished by fund. Germany provides the most severe case of scatteredness: not only is the data published by its regions (Germany’s 16 federal states), but different websites exist for distinct funds (ERDF vs. ESF), leading to a total of 27 German websites – arguably making the German data collection just as tedious as collecting the data for the entire rest of the EU. Once the distinct websites were located through online searches, they often needed to be translated to English to retrieve the data.
As mentioned, the data was rarely available in open formats (counting CSV, JSON or XLS(X) as open formats), and we had to deal with a large number of PDFs (51) and webapps (15) out of a total of 122 files. The majority of the PDF files were extracted using Tabula, which sometimes worked fine and at other times required substantial work with OpenRefine – cleaning misaligned data. About a quarter of the PDFs could not be scraped using tools, but required hand-tailored scripts by our developer.
However, PDFs were not our worst nightmare: that was reserved for webapps such as this French app illustrating their 2007–2013 ESIF projects. While the idea of depicting the beneficiary data on a map may seem smart, it often makes the data useless. These apps do not allow for any cross-project analysis and make it very difficult to retrieve the underlying information. For this particular case, our developer had to decompile the Flash to locate the multiple datasets and scrape the data.

Open data: political reluctance or technical ignorance?

These websites often made us wonder what the public servants who planned them were thinking. They already put substantial effort (and money) into creating such maps – why didn’t they include a “download data” button? Was it an intentional decision to publish the data but make it difficult to access? Or is the difference between closed and open data formats simply not understood well enough by public servants? Similarly, PDFs always have to be created from an original file, while simply uploading that original CSV or XLSX file could save everyone time and money. In our data quality report we recognise that the EU has made progress in this regard with its 2013 regulation mandating that beneficiary data be published in an open format. While publication in open data formats has since increased, PDFs and webapps remain a tiring obstacle. The EU should ensure the member states’ compliance, because open spending data, and a thorough analysis thereof, can lead to substantial efficiency gains in distributing taxpayer money. This blog has been reposted from https://okfn.de/blog/2017/04/Making-EU-Data-Open/

New site SubsidyStories.eu shows where nearly 300bn of EU subsidies go across Europe

Diana Krebs - March 9, 2017 in Money flows, News, Open Spending, OpenSpending

Open Knowledge Germany and Open Knowledge International launched SubsidyStories.eu: a database containing all recipients of EU Structural Funds, accounting for €292.9 billion of EU subsidies.

The European Union allocates 44% of its total 7-year budget through the European Structural Funds. Until now, who received these funds – €347 billion for 2007-2013 and €477 billion for 2014-2020 – could only be traced through regional and local websites. SubsidyStories.eu changes this by integrating all regional datasets into one database of all recipients of the European Structural and Investment Funds from 2007 onwards.

“SubsidyStories is a major leap forward in bringing transparency to the spending of EU funds,” said Dr Ronny Patz, a researcher focused on budgeting in the European Union and in the United Nations system at the Ludwig-Maximilans-Universität (LMU) in Munich. “For years, advocates have asked the EU Commission and EU member state governments to create a single website for all EU Structural and Investment Funds, but where they have failed, civil society now steps in.”

SubsidyStories.eu makes the recipients of the largest EU subsidy programme visible across Europe. Recent and future debates on EU spending will benefit from the factual basis offered by the project, as spending can be traced at the member state, regional and local level. SubsidyStories.eu makes it possible to check which projects and organisations are receiving money and how it is spent across Europe. For example, the average amount given per project varies vastly by country: in Poland the average sum per project is €381,664, whereas in Italy it is only €63,539.

The data can be compared throughout the EU, enabling a thorough analysis of EU spending patterns. SubsidyStories.eu allows scientists, journalists and interested citizens to visualise the data directly and to run their own analytics using SQL. The data can also be downloaded as CSV, either for the entire European Union or for specific countries.
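The kind of SQL-based, cross-country analysis described above can be sketched with an in-memory SQLite database. The schema and rows below are invented for illustration (only the two country averages echo the figures quoted earlier); the real SubsidyStories.eu database is far larger:

```python
import sqlite3

# Toy subsidy table: one row per funded project.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subsidies (country TEXT, project TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO subsidies VALUES (?, ?, ?)",
    [
        ("PL", "road upgrade A", 400000.0),
        ("PL", "road upgrade B", 363328.0),
        ("IT", "school renovation", 63539.0),
    ],
)

# Average amount per project, per country -- the comparison made in the text.
rows = conn.execute(
    "SELECT country, AVG(amount) AS avg_amount "
    "FROM subsidies GROUP BY country ORDER BY country"
).fetchall()
# rows -> [("IT", 63539.0), ("PL", 381664.0)]
```

The same `GROUP BY` pattern extends to aggregating projects per beneficiary, per fund, or per programming period once the standardised data is in one table.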

Beneficiary data, which was previously scattered across the EU in different languages and formats, had to be opened, scraped, cleaned and standardised to allow for cross-country comparisons and detailed searches. That we are now able to run detailed searches, aggregate projects per beneficiary and across countries, is a big step for financial transparency in Europe.

SubsidyStories.eu is a joint project of Open Knowledge Germany and Open Knowledge International, funded by Adessium and OpenBudgets.eu.



Open Data by default: Lorca City Council is using OpenSpending to increase transparency and promote urban mobility.

Diana Krebs - February 7, 2017 in Fiscal transparency, Open Fiscal Data, Open Spending, OpenSpending, smart city, Smart Region

Castillo de Lorca. Torre Alfonsina (Public Domain)

Lorca, a city in the south of Spain with currently 92,000 inhabitants, launched its open data initiative on January 9th, 2014. Initially it offered 23 datasets containing transport, mobility, statistical and economic information. From the very beginning, OpenSpending was the tool selected by Lorca City Council because of its capabilities and powerful visualizations. The first upload of datasets was done in 2013, on the previous version of OpenSpending. With the OpenSpending relaunch last year, Lorca City Council continued to make use of the OpenSpending datastore, while the TreeMap view of the expenditure budget was embedded on the council’s open data website. In December 2016, the council’s open data website was redesigned, including budget datasets built with the new version at next.openspending.org. The accounting management software of Lorca allows the automatic conversion of data files to CSV format, so these datasets are compatible with the formats required by OpenSpending.

Towards more transparency and becoming a smart city

In 2015, when the City of Lorca transparency website was launched, the council decided to continue with the same strategy focused on visualization tools, engaging citizens with an intuitive approach to the budget data. Lorca is a pioneer city in the Region of Murcia in terms of open data and transparency. So far, 125 datasets have been released, and much information is available along with the raw data. One initiative worth highlighting is a pilot project to bring open data to schools, carried out during the past year. In 2017, we will continue teaching the culture of open data to school children, with the main goal of demonstrating how to work with data by using open data. In the near future, the council plans to open more data directly from the sources, i.e. to achieve a policy of open data by default.

And of course Lorca intends to continue exploring the other possibilities that OpenSpending offers for providing all this data to the citizenry. In addition, Lorca is working to become a smart city (article in Spanish only), and open data is a key element of this goal. Lorca’s open data initiative will therefore be part of the Smart Social City strategy from the very beginning.

Open State Foundation Netherlands wins OGP 2016 award for work to advance fiscal transparency through OpenSpending

Arjan El Fassed - January 31, 2017 in Open Spending

Open State Foundation is a non-profit based in the Netherlands, working on digital transparency by opening up public information as open data and making it accessible for re-use. Last December, the organization received one of the seven Open Government Partnership 2016 Awards for its work on OpenSpending at the OGP Global Summit in Paris, France. The awards celebrated civil society initiatives that are using government data to bring concrete benefits. This blog post describes Open State Foundation’s work on advancing fiscal transparency through OpenSpending.

The financial crisis and various budget cuts in the Netherlands made it more necessary than ever for citizens to gain real-time access to the financial data of all local and regional governments. Civil servants, journalists and citizens alike need data on budgets and spending to hold their own local governments to account.

Two years ago, Open State Foundation sat down with some civil servants of the Central District of the City of Amsterdam. We discovered that each quarter they were obliged to send an Excel file with financial data on budgets and spending to the Central Bureau of Statistics. We decided to ask for the same Excel file from all districts of the city of Amsterdam and built a website to visualise the data and make comparison possible. Each district could compare not only its own budget with its actual spending, but could also compare both with the other districts. We built a tool to show what unlocking all local government financial data would look like.

Image credit: Amsterdam Canal by Lies Thru a Lens CC BY 2.0

Open State Foundation then decided to approach all local governments ourselves and ask each of them for the data. It was a great opportunity to raise awareness about the importance of open data, not only for society but also for the local governments themselves. We thought the easiest thing to do would be to approach the Central Bureau of Statistics and ask for the files of all local governments. However, we were told that it was not allowed to share them: each of the 400 municipalities, 12 provinces, 24 water boards and a couple of hundred common arrangements decided on their own in what form they presented their financial records to their own citizens. It was the decision of the local governments themselves whether the data could be open or not.

We built a template for local advocacy and started by asking civil servants first. We asked for the data, and if they declined our request, we approached the alderman. And if the alderman rejected our request, we approached the municipal council, sometimes with the help of local journalists. And so, in various municipalities, council members raised questions and resolutions were even tabled. Within a year, using this approach, we managed to gain access to the financial data of more than 200 local governments in the Netherlands, collecting thousands of files containing millions of data points.

We then approached the Central Bureau of Statistics again, this time together with the Ministry of the Interior, which supported our mission. We could show that a huge number of cities and towns were willing to share their financial information with anyone. And so, not much later, the Central Bureau of Statistics sent out a memorandum to all local and regional governments in the Netherlands announcing that, by the end of that year, the budget and spending data of all local and regional governments would be released as open data.
Not only was historical data released; from that moment on, the data has also been published every quarter in a sustainable manner. Municipal council members can now hold their local government to account throughout the year. Civil servants can easily benchmark the financial performance of their own city and create their own benchmarks, something they used to spend a lot of money on. Journalists use the tool to see how their local governments are performing. Citizens are now able to challenge the government by showing that they could do things better and reduce costs.

Ultimately, this success depended on finding the right approach to engage the various local governments. With a strong community and a mix of technical and political knowledge, everyone should be able to hold power to account. By now, a number of cities are providing data down to transaction level. At the moment, Open State Foundation is working with a number of local governments to dive into deeper levels of detail and to make it possible to scale this up. Together with the process of unlocking local council data on minutes and decisions, we want to continue working towards connecting spending to the decisions behind it.

Brazil’s Public Spending project is looking for leaders in various regions of Brazil to increase participation in the budgeting process.

Diana Krebs - January 30, 2017 in network, OK Brazil, Open Spending

OK Brazil's public spending website

On the 11th of January, OK Brazil launched its new Public Spending website.

The website is part of a wider campaign to search for, recruit and support new leaders who wish to work on transparency, mainly public spending, in Brazilian municipalities, and it uses OKI’s OpenSpending technical architecture. Support will be provided by mentors specializing in law, transparency, technology and open data. The goal is to increase transparency in the budget execution, bidding processes and contract management of cities. So that the leaders can achieve concrete results, the OK Brazil team will develop a timeline with each and every one of them, using the existing legal framework, the support of mentors and digital tools to increase transparency and participation in the budgeting process.

“The new website demonstrates how to organize the missions and actions of the new leaders, empower civil society to monitor public spending, and give both academics and journalists access to cities’ budget data”, says Lucas Ansei, developer and one of the mentors of the new website.

According to Thiago Rondon, coordinator of the OK Brazil team, the mentors will play a fundamental role in the formation of the leaders. “They’re specialists with experience on the matter at hand and will support the leaders with online conferences that will offer direction, so that the impact of the actions of these new leaders is meaningful.” Another goal of this new phase of the project is to reach out to city mayors all over the country, with the intention of getting them both to sign the Public Spending Brazil Commitment Letter and to carry out the concrete actions foreseen in it.

Be a leader of the Open Spending project in 2017

According to Thiago, there will be an initial action agenda that functions like a step-by-step manual, so that anyone can help to increase transparency in the city where they reside. “We want to empower people so that they can do this on their own. To amplify outreach, we will have local leaders in pilot cities who will receive direct support from OK Brazil.” Those who want to participate as a local leader of the Public Spending project can sign up on the website. During this first phase, the OK Brazil team will select 15 local leaders based on the answers given in an application form.