
Open data highlights from European Data Forum 2013 in Dublin

- April 16, 2013 in Events, Featured, LOD2, Open Data


Europe’s data league convened in Dublin last week – Open Data increasingly taking the stage

Over 500 data professionals gathered last week at the European Data Forum conference in Dublin, the annual meeting place for industry, research, policy makers and community initiatives to discuss the challenges and opportunities of Big Data in Europe.

The Open Knowledge Foundation was represented by Sander van der Waal and myself. We took part on behalf of the LOD2 project (an EU-funded project on Linked Open Data) and the Apps for Europe project (which supports apps competitions around Europe), as well as to stimulate open data discussions in general. That message seemed to fall on increasingly fertile ground: one of the main sentiments throughout the conference was a profound interest not only in linking data, but also in making it legally and technically open.

Open Data on the political agenda

Irish Minister for Justice, Equality and Defence Alan Shatter was among the first in the official program – which was opened with a brief video message from EU Vice President Neelie Kroes – to address the need to embrace linked data, rightly calling it the new digital frontier. He hinted at the need for open technical standards and open licensing to become the norm, emphasizing that EU data protection regulation must change to enable maximum gain from the massive opportunities in linking vast datasets (commonly referred to as Big Data). This notion was supported by Roberto Viola, Deputy Director General of the European Commission's Directorate General for Communications Networks, Content and Technology (DG Connect), whose subsequent presentation highlighted, among other things, how open data is an optimal way to improve public health systems.

Two further DG Connect representatives, Malte Beyer-Katzenberger and Francesco Barbato, continued this thought by presenting a concept called the EU Data Value Chain, part of DG Connect's effort to ensure that digital technologies can help deliver the growth the EU needs. The initiative is working on creating a European data ecosystem in accordance with the EU's Open Data Policy, which covers, among other things, open government data, public sector information (PSI) and Open Access. The motivation is the need to pursue untapped business opportunities, to ensure better governance and citizen empowerment (through transparency), and to address societal challenges and accelerate scientific progress. To that end the European Commission has been pushing member states to open up data since the launch of the PSI Directive in 2003.

Malte Beyer-Katzenberger also presented the EU Open Data Portal later in the conference program, which we at the Open Knowledge Foundation have helped develop. The portal is part of the European open data infrastructure: it aggregates metadata from sources across the EU and acts as a single access point which helps to identify what data exists without needing to know who holds it. At the same time, Beyer-Katzenberger noted, it also acts as a driver for re-use policies inside the organization.

Open data as an innovation strategy for industry

The first day of the event also saw the announcement of the winner of the European Data Innovator Award, which was given to Michael Gorriz, CIO of car manufacturer Daimler, for his linked knowledge systems in Mercedes cars. Gorriz explained how data is connecting customers and enterprises more directly – calling it an emerging new economy of crowdsourcing and interaction – and highlighted the enormous business potential of linked open data. Specifically, he stressed the importance of getting data and information out of technical and legal "silos" (referring to proprietary data) in order to create value. This obviously requires overcoming not only the technical challenge, but also the cultural one of adapting to doing business and driving innovation through linked and open data. Here Gorriz referred specifically to Sir Tim Berners-Lee's principles for linked open data and the need to leverage standards such as RDF and SPARQL instead of developing proprietary technologies. As a key point he also urged other business leaders to step into the new economies by building trust and reducing the fear of data transparency – and to dare to use linked open data to drive the cultural change of their enterprise.
In the field of energy, Florian Bauer from REEEP (the Renewable Energy and Energy Efficiency Partnership) gave a presentation advocating open data as a way of helping the uptake of clean, sustainable energy in society in general. Based on experience from the more than 180 clean energy projects in 58 countries that REEEP has supported, Bauer pointed out that the power of open data lies in energy companies avoiding duplicated work: with joint access to data they can concentrate resources on their own expertise and keep maintenance to a minimum. Additionally, open data can help lower CO2 emissions by making better use of the data that is already there. However, Bauer explained that this road has only just begun. Data portals need to be connected through open standards and made interoperable, and the energy sector needs to publish more data – in raw, machine-readable formats and under licenses that allow re-use.

Another major industry representative, Knut Sebastian Tungland, Chief Engineer of IT at Statoil (responsible for technology strategies and professional practices), spoke on the second day of the conference. He started out with the main point he would take away from the conference: that they need to act on open data in general, something he felt Statoil had not contributed much to so far. In the same breath he acknowledged the difficulty of doing so and extended an invitation to help them leverage these ideas – to help them figure out how to share their data.

Open Knowledge Foundation projects enabling innovation

The European Data Forum also featured a presentation by the Open Knowledge Foundation (given by Sander van der Waal and me) about the portal developed as part of the LOD2 project (which focuses on Linked Open Data). The portal, which runs on the CKAN open source data management system, provides access to open, freely reusable datasets from local, regional and national public bodies across Europe. It has recently been updated with a new set of social features and visualization capabilities, inviting citizens to examine, discuss and share the datasets – making it easier to find relevant data for science, journalism and research in general, as well as for business and app development. It was highly motivating to see open data more and more widely acknowledged as a driver of innovation and growth. The Open Knowledge Foundation has been pushing for more openly licensed data for years, and we look forward to working with anyone to further stimulate innovation and wider uptake of openly licensed data and content.

Announcing Recline.JS: a Javascript library for building data applications in the browser

- July 5, 2012 in Featured, LOD2, OKF, Open Textbooks, Press, Sprint / Hackday, texts

Today we’re pleased to announce the first public release of Recline.JS, a simple but powerful open-source library for building data applications in pure Javascript. For those of you who want to get hands on right away, you can try the demos on the Recline website.

What Is It?

Recline is a Javascript library of data components, including a grid, graphing and data connectors. The aim of Recline is to weave together existing open-source components into an easy-to-use but powerful platform for building your own data apps. The views can be embedded into other apps, just as we've done for CKAN and the DataHub, where Recline powers our data viewer and visualisations. What makes Recline so versatile is its modularity: you take only what you need for the data app you want to build. Main features:
  • View (and edit) your data in a clean grid / table interface
  • Built in visualizations including graphs, maps and timelines
  • Load data from multiple sources including online CSV and Excel, local CSV, Google Docs, ElasticSearch and the DataHub
  • Bulk update/clean your data using an easy scripting UI
  • Easily extensible with new Backends so you can connect to your database or storage layer
  • Open-source, pure javascript and designed for integration — so it is easy to embed in other sites and applications
  • Built on the simple but powerful Backbone giving a clean and robust design which is easy to extend
  • Properly designed model with clean separation of data and presentation
  • Componentized design means you use only what you need
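Pulling these pieces together, embedding a view takes only a handful of lines. The snippet below is a schematic, browser-only sketch: it assumes jQuery, Backbone and the Recline scripts are already loaded, and the class names follow the tutorials of the time – check the Recline documentation for the exact API of the release you use:

```javascript
// Wrap some in-memory records as a Dataset (the in-memory backend is the default).
var dataset = new recline.Model.Dataset({
  records: [
    { id: 0, date: '2011-01-01', x: 1, y: 2 },
    { id: 1, date: '2011-02-02', x: 3, y: 4 }
  ]
});

// Fetch the data (a no-op for in-memory records), then render a grid view of it.
dataset.fetch().done(function () {
  var grid = new recline.View.Grid({ model: dataset });
  $('#mygrid').append(grid.el);
  grid.render();
});
```

Swapping the grid for a graph or map view, or the in-memory records for a CSV or ElasticSearch backend, follows the same pattern – which is the modularity point made above.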

Who’s Behind It?

Recline has been developed by Rufus Pollock and Max Ogden with substantial contributions from the CKAN team, including Adrià Mercader and Aron Carroll.


A selection of demos is now available on the Recline website for you to try out, including the Multiview Demo and the Data Explorer.

Data Catalog Schema and Protocol – Draft Specification

- June 13, 2012 in ckan, LOD2, Open Data, Open Government Data, Open Standards, Our Work, WG Open Government Data

Open Data is an idea that continues to gain momentum, and one sign of this is that the world has more and more data catalogs. This is great for many reasons, but it also brings its own problems, especially around interoperability and standardization – the lack of a standard schema and standard interfaces is something we've experienced in our work on projects that pull together dataset information from many different data catalogs around Europe. Last year we convened an international data catalogs meeting in Edinburgh. Since then we at the Open Knowledge Foundation, in collaboration and consultation with the W3C's DCAT team, have been working on a draft specification for a data catalog schema (format) and protocol for accessing and syncing data catalogs. A first draft of this standard is now ready and we're putting out a request for comments:


Roughly, the specification consists of two parts:
  1. A schema (in essence DCAT) specifying a serialization of Dataset information,
  2. A protocol / API for getting this information from a compliant data catalogue site.
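For a feel of part 1, a single dataset entry in such a catalogue might serialize to something like the following. The field names here are purely indicative, loosely echoing DCAT terms and CKAN conventions; the draft itself is the normative reference:

```json
{
  "title": "National energy statistics",
  "description": "Annual energy production and consumption figures.",
  "keyword": ["energy", "statistics"],
  "license": "cc-by",
  "distribution": [
    { "accessURL": "http://example.org/energy.csv", "format": "text/csv" }
  ]
}
```

Part 2 would then specify how a client lists and retrieves such entries from a compliant catalogue over HTTP.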
We emphasize that this is a first draft, and is intentionally fairly rough as an invitation to contribute. You can do this in several ways:

Talk at European Data Forum: Open Data, Where We’ve Come From, Where We’re Going

- June 12, 2012 in LOD2, Open Data, Our Work, Policy, Talks

Last week I was at the European Data Forum and gave a Keynote entitled Open Data, Where We’ve Come From, Where We’re Going. Here are the slides.

LOD2 plenary, Vienna, 21–23 March 2012

- March 23, 2012 in ckan, Events, linked-open-data, LOD2, OKF Austria

I am in Vienna, along with my colleague Ira, for a plenary meeting of the assorted partners of the LOD2 project. LOD2 is an EU-funded research project on Linked Open Data, the vision of an interlinked web of data known to many from Tim Berners-Lee’s TED talk. The meeting runs for 3 days, in which there will be discussions about the various work packages, but I have been given the task of blogging about the opening introductory session on Wednesday afternoon. (Full disclosure: I have received a handsome LOD2 mug as advance payment for my efforts.) The Open Knowledge Foundation is one of the partners, because the pan-European CKAN data portal is part of the project. But being personally a relative newcomer, I was looking forward to finding out in this introductory session what the project is really all about. [IMG: Delegates at LOD2 plenary]
Sören Auer, the project co-ordinator, kicked off, giving an overview of the overview. He described the lifecycle of Linked Data: extraction (from other structured or unstructured data), linking in to existing data, enrichment (perhaps by adding more structure), and finally exploration for interesting patterns. For each stage in the lifecycle, there are tools being developed by the project – many are already released. Collectively these tools, which are all Open Source, form the LOD2 'stack'. Sören also mentioned some recent milestones, including a Serbian CKAN portal holding a lot of data in RDF, the native format for Linked Data; and a planned new data-oriented conference, the European Data Forum.

The tools: Work Packages 2-6

WP2: Optimising the store

Peter Boncz of CWI spoke about Work Package 2. (What happened to WP1, you ask? It was a prototype which finished earlier in the project.) WP2 concerns Virtuoso, the database part of the LOD2 stack. The challenge with RDF is to make a database that runs efficiently with huge quantities of data, as the potential for rich interlinking means the data is not neatly segmented into tables as in a normal database. A lot of progress has already been made, and he hopes that Virtuoso 7 will be released soon. It will be structured to enable better compression (speeding up processing by reducing I/O), and use adaptive caching to try to minimise the number of queries that need to be done more than once.

WP3: Getting the data

Jens Lehmann of AKSW at the University of Leipzig was next, talking about WP3 on 'extraction, enrichment and repair': the creation of Linked Data from existing structured or unstructured sources, its enrichment with suitable taxonomies to describe it, and the detection of inconsistencies or other problems in its structure. If that sounds like a wide-ranging package, it is: as Jens told me later over dinner (not entirely seriously), 'anything that doesn't fit in one of the other packages gets stuffed into WP3'! There are currently over 20 tools playing a role in this stage, including Natural Language Processing techniques for extracting data from free text.

WP4: Creating links

Next up was Robert Isele of the Freie Universität Berlin. WP4 aims to enrich RDF data by adding links to other data sources, as well as linking data together by identifying duplicate entities within or between datasets. Automatic tools suggest links that a user can confirm or reject. WP4 also includes work to create an RDF-enabled version of the open source data cleaning tool Google Refine.
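To make the duplicate-detection idea concrete: tools in this space typically score candidate pairs of entities with string or set similarity measures and only surface high-scoring pairs for human review. A toy illustration (not the project's actual tooling):

```javascript
// Score two entity labels with token-set Jaccard similarity.
// A crude stand-in for the matching heuristics a link-discovery tool
// applies before proposing candidate links for a user to confirm or reject.
function jaccard(a, b) {
  const ta = new Set(a.toLowerCase().split(/\s+/));
  const tb = new Set(b.toLowerCase().split(/\s+/));
  let inter = 0;
  for (const t of ta) if (tb.has(t)) inter += 1;
  const union = ta.size + tb.size - inter;
  return union === 0 ? 0 : inter / union;
}
```

Pairs scoring above a threshold would be proposed as candidate `owl:sameAs` links; real tools combine many such measures and learn the thresholds.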

WP5: User interfaces

Sean Policarpio of DERI reported on WP5 on browsing, visualisation and authoring interfaces. He demonstrated geospatial data on a map, filtered with a structured (faceted) search – combining the power of Linked Data with a mapping search like Google Maps. Associated with this, they have produced a ‘semantic authoring’ tool, allowing the user to add or edit Linked Data via the map. Their next tasks are to implement ‘social semantic networking’ – for example, notifications based on semantic content – and mobile interfaces for their semantic tools.
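Faceted search, in essence, means counting the values of a field across the current result set and letting the user filter by one of them. A minimal sketch (record and field names hypothetical):

```javascript
// Count facet values over a set of records: the core of a faceted search UI.
function facetCounts(records, field) {
  const counts = {};
  for (const r of records) {
    counts[r[field]] = (counts[r[field]] || 0) + 1;
  }
  return counts;
}

// Selecting a facet value then narrows the records shown (e.g. on the map).
function filterByFacet(records, field, value) {
  return records.filter(function (r) { return r[field] === value; });
}
```

With Linked Data the facets can come straight from the data's typed properties, which is what makes the combination with a map view powerful.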

WP6: Integrating the tools

Finally, the engaging and very Belgian Bert van Nuffelen of TenForce spoke about WP6, which aims to make the various disparate tools in the LOD2 stack play nicely together. They have worked on making it easier for users to install the stack tools, a shared interface and shared authorisation using WebID. They have also recently released an intermediate version of the stack (version 1.1) with new and upgraded tools and better documentation. By now it was 3 o’clock and, against all expectations, the meeting was ahead of schedule. So we had a relatively luxurious half-hour break for tea. Your correspondent and another relative newcomer, Jan from Tenforce, took the opportunity to get some fresh air and a feel for the Viennese genius loci. Or should that be Ortsgeist?

The use cases

WP7: Publishing

We had heard about the tools that had been, and are being, developed to manipulate Linked Data. But how will they be used? Refreshed by tea, we returned to the meeting to hear about the three Work Packages concerned with use cases. Perhaps the most exciting talk of the afternoon came from Christian Dirschl of WP7 and Wolters Kluwer Germany (WKD). WKD is a legal and accountancy publisher that is already adapting and using the LOD2 stack tools to enhance its publishing business. Christian told us that 'semantic technologies enable publishing media to create added value', and WKD's first release of news and media datasets created using Linked Data tools is on course for publication in April. By December they will release an interlinked version of the datasets, including links to DBpedia and further optimised tools.

WP8: Enterprise

Amar-Djalil Mezaour of Exalead presented the 'enterprise' use case, WP8: an application to human resources, aiming to match job vacancies to applicants. Some early work trying to model CVs had met criticism, on the grounds (among others) that the EU reviewers doubted how much CV data is freely available. WP8 has therefore refocused its attention on job vacancies rather than CVs, for which there is plenty of data and better RDF support. They hope to release the results later this year, with vacancy 'dashboards' and analytics – faceted by sector, region, salary, etc. – using Linked Data, and enriched with mashups of other sites such as social networks.

WP9: Government data

After a long wait in the wings, it was time for the OKF's own Ira Bolychevsky to take centre stage at last. WP9 explores applications of Linked Data to making government data available and maximising its use. Its main visible output is a portal which republishes open data from government portals throughout the European Union. The portal has recently been upgraded and repaired: it now runs the latest version of CKAN, introducing features such as data previews and – live on the DataHub and coming soon to the portal – a data API for structured data. Two subjects we hope to discuss more later in the plenary are closer integration with the LOD2 stack, and metadata standards. [IMG: Ira Bolychevsky at LOD2 plenary]
Jindřich Mynarz briefly mentioned the new Czech CKAN portal. They have developed a detailed methodology as well as a 'Quick Start' guide for publishers, both of which they promise to make available in English soon (hurrah!). Finally, Vojtěch Svátek of UEP gave a quick overview of WP9a, which aims to use Linked Data technology in the field of public procurement, with ontologies for public sector contracts – providing matchmaking and analytics not dissimilar to those in WP8.

A jug of wine, a loaf of bread

Perhaps the reader has read enough of Work Packages for now. Anticipating your satiety, the organisers had decided to defer the presentations from WP10–12 until Friday. In their place an outsider to the LOD2 project, Allan Hanbury, gave a lightning talk on a slightly related EU project, Khresmoi, which aims to provide useful search tools for large medical databases.

Thus concluded the day's business, and we all dispersed to our various hotels. The OKF contingent, along with TenForce, are staying in one just a couple of roads away. Crossing a road is hazardous in Vienna, because there are sometimes cars parked in what seems to be the middle of the road. You keep half-expecting some lights to change and the cars to zoom off. In fact they are parked between the road and the tramlines, along which long and elderly trams snake through the city.

In the evening, everyone from the day's meetings reconvened and was whisked away on one such tram to one of the outlying districts of the city, for an evening at a (more or less) traditional Austrian Heurige, an untranslatable type of wine tavern. A true Heurige, Helmut from the Semantic Web Company explains to me as we hurtle along, is run by a vineyard, and gives people an opportunity to sample its new year's crop of wine. ('Heurige' in Austrian German literally means 'this year's'.) It will have a licence to open for only two or three weeks a year, and when open will hang out a spray of branches and a lamp to signify the fact. There is still some wine grown in Vienna, I am told, but most of the Viennese Heurigen are open all year round and are really just restaurants. But they recreate the atmosphere of the real thing.

Patrons are served wine and a mixed plate of traditional local foods, which, for readers not familiar with Austrian cuisine, mainly consists of various kinds of sausage, potato and cabbage. They are delicious, and so is the Apfelstrudel that comes along later. The only thing I cannot recommend in Vienna is the tea.
When will these foreigners learn that it must be made with boiling hot water? To follow blogs from the LOD2 plenary, see the blog parade from the project blog.

Open Data Search: finding useful datasets, worldwide

- March 16, 2011 in ckan, LOD2, Open Government Data, Technical, WG Open Government Data

The following post is from Friedrich Lindenberg, a developer at the Open Knowledge Foundation working on CKAN and OpenSpending.

Recently, there has hardly been a week without the announcement of a new local, regional or national open data initiative – including ever more extensive catalogues of the data being opened up (CKAN alone now runs in 20 or more places). While this is great news for those of us interested in re-using the data, it also means it becomes increasingly hard to keep a good overview of what kind of data is available for which places.

To get a better overview, we've now started a meta search engine for open data. It is a global version of the prototype site we announced in January: an aggregator for datasets, providing a simple, unified search interface to all of the catalogues it contains. At the moment, this includes all known instances of the CKAN software, the Sunlight Foundation's National Data Catalog (and with it a large number of US-based data sources), the World Bank data catalogue, Sweden's DCat-enabled catalogue and Nexedi's Data Publica portal. We've also put up a site which provides access to the combined index of the CKAN instances only.

Behind the scenes, the service is a web spider with a twist: all collected data is converted to DCat, DERI/W3C's RDF-based vocabulary for dataset descriptions. While this vocabulary is still in early development, it's interesting to see how well different kinds of catalogues can already be expressed in it. By harvesting a growing set of existing dataset descriptions, we hope to gather a comprehensive picture of the dataset properties that are widely used and that should be represented in a common format. Our goal is to establish some degree of interoperability between different data catalogues, leading towards a federated catalogue architecture for Europe and perhaps beyond.

These standardization concerns aside, we want to make the service useful on its own. For the immediate future this means adding support for more filter options, including licenses (and their compliance with open data principles), the languages used in the metadata and the data itself, and the geographic scope of the collected information. This is, of course, an open source development effort, and we'd be glad to welcome anyone interested in contributing comments, catalogue data or functionality on the ckan-discuss mailing list! Related posts:
  1. Launch of a community-driven French open data catalogue
  2. An Open Search Service: Regulating Search the Open Way
  3. CKAN and Finding Open Data in the Life Sciences
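The conversion step described above amounts to mapping each catalogue's native record format onto DCat terms. A much-simplified sketch for a CKAN-style record (property names on both sides are indicative only, not the normative mapping):

```javascript
// Map a CKAN-style package record onto DCat-style terms.
// Field names here are illustrative; real harvesters handle many more
// properties and emit proper RDF rather than plain objects.
function toDcat(pkg) {
  return {
    'dct:title': pkg.title,
    'dct:description': pkg.notes || '',
    'dcat:keyword': pkg.tags || [],
    'dcat:distribution': (pkg.resources || []).map(function (r) {
      return { 'dcat:accessURL': r.url, 'dct:format': r.format };
    })
  };
}
```

Writing one such mapping per catalogue type is what lets a single search index sit on top of CKAN, the National Data Catalog, the World Bank catalogue and the rest.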

Notes from EU meeting on “pan-European open data portal”

- December 13, 2010 in ckan, Events, External, Government, LOD2, OKF, OKF Projects, Open Data, Open Government Data, Policy, WG EU Open Data, WG Open Government Data, Working Groups

A report from an EU meeting on the "goals and requirements for a pan-European data portal" is now online. I was invited on behalf of the Open Knowledge Foundation to discuss our work on the CKAN project, including as part of the LOD2 project, which will bring together open data from local, regional and national public bodies across Europe. From the introduction:
On the 3rd of November 2010 the European Commission organised in Luxembourg a technical workshop on the goals and requirements for a possible pan-European data portal. Experts with practical experience in their respective countries were invited to share their experiences and ideas.

The experts consider that such a portal would add value to existing regional and national initiatives by improving transparency on issues of EU-wide interest, providing evidence for better policy making, improving the efficiency of data-dependent administrative and business processes and stimulating economic development through EU-wide reuse of data. Several issues of legal, technical and socio-political nature must be addressed for such a portal to function effectively, among them the need for high level political support, the systematic adoption of reuse-friendly data licences, the promotion of established data standards for maximal interoperability and the organic involvement of European software developers and data-literate citizens.

A pan-European portal should be able to expand rapidly in breadth (thus fostering the interest of the public with large numbers of relevant datasets) while at the same time also showing the value of deeper data integration, starting from a core set of statistical, financial, geospatial data of high quality. Agile prototyping and development models are recommended, given the extremely fast pace at which data initiatives are developing in Europe.

A small working group should be created to drive the issue forward and meet regularly to identify more precisely technical requirements. The group should connect with other open data stakeholder groups established at the national or European level and contribute to the definition of European datasets, government open data conferences and software development competitions, with first results visible and publicised by mid-2011.
The report identifies several reasons for developing a pan-European data portal:
A) For European citizens
  • Single point of access on European information
  • Enabling services for citizens that live at country borders and/or work abroad
  • Knowledge of successful open government data initiatives in some Member States can drive further initiatives in other Member States
B) For administrations
  • Improvement of interoperability across processes thanks to greater availability of data
  • Improved comparability of EU 27 information and data
  • Reduction in administrative costs
  • Avoiding / cutting existing costs of re-publication of official information
  • More efficiency in servicing Freedom of Information requests
  • Involvement of European citizens (crowd sourcing approach) can have positive effects on transparency and quality of data.
C) For economic development
  • Planning and monitoring resource for companies operating across EU borders
  • Driving the European innovation process
  • Driving force for the European economy (information technology, new location-based services, analytics services, etc.)
  • Harmonisation of standards and guidelines for open government data across Europe
It also highlighted the value of open licenses, which allow anyone to reuse the data for any purpose:
The participants of the workshop furthermore identified appropriate data licensing at the source as the conceptual precondition for any value to be extracted by data reuse (developers will not reuse data if it is not clear that they have the right to do so). This appears to be mostly an issue of educating data publishers on the selection of an appropriate license. There may be however contexts in which this might turn out to be a legislative issue, to be considered in the context of the review of the Public Sector Information Directive. There was also consensus on the fact that a clear licensing policy should be created and enforced on a pan-European data portal so as to maximise the opportunity for data reuse.
The report concludes:
[...] participants agreed that a pan-European data portal with the characteristics described above would add value to open data initiatives from the Member States. Such an initiative should be pursued without delay in order to exploit the current momentum of open government data initiatives across Europe
It is fantastic to see such interest in open government data from the European Commission, and we look forward to following further developments with great interest. If you’re interested in keeping in touch with the Open Knowledge Foundation’s work in this area you can follow: