
Introducing Datashades.info, a CKAN Community Service

- September 23, 2019 in ckan

Do you use CKAN to power an open data portal? In this guest post, Link Digital explains how you can take advantage of their latest open data initiative, Datashades.info.

Datashades.info is a tool designed to deliver insights for researchers, portal managers, and the wider tech community, to inform and support open data efforts relating to data hosted on CKAN platforms. Link Digital created the online service through a number of alpha releases and considers datashades.info, now in beta, a long-term initiative they expect to improve with more features in future releases.

Specifically, Datashades.info provides a publicly accessible index of metadata and statistics on CKAN data portals across the globe. For each portal, a number of statistics are aggregated and presented, covering the number of datasets, users, organisations and dataset tags. These statistics give portal managers the ability to quickly compare the size and scope of CKAN data portals to help inform their development roadmaps. Moreover, for each portal, installed plugin information is collected, along with the relative penetration of those plugins across all portals in the index. This lets CKAN developers quickly see which extensions are the most popular and on which portals they are being used. Finally, all historical data is persisted and kept publicly accessible, allowing researchers to analyse historical trends in any indexed CKAN portal.

Datashades.info was built to support a crowd-sourced indexing scheme. If a visitor searches for a CKAN portal that is not yet in the index, the system will immediately query that portal and attempt to generate a new index entry on the fly. Aggregation of a new portal’s statistics into Datashades.info also happens automatically. Make the most of the tool with the following features:
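Statistics like the version and installed-extension data described above are exposed by CKAN portals themselves through the public action API (for example the `status_show` action). As a rough illustration of how an indexer could read them, and not Datashades.info’s actual code, the sketch below parses a response shaped like CKAN’s `status_show` output (the sample values are invented):

```python
import json

# Sample JSON in the shape returned by a CKAN portal's
# /api/3/action/status_show endpoint; a real indexer would
# fetch this over HTTP from the portal being indexed.
sample_status = json.dumps({
    "success": True,
    "result": {
        "site_title": "Example Portal",
        "ckan_version": "2.7.2",
        "extensions": ["stats", "datastore", "datapusher"],
    },
})

def extract_portal_info(raw):
    """Pull the CKAN version and installed extensions out of a status_show reply."""
    body = json.loads(raw)
    if not body.get("success"):
        raise ValueError("API call failed")
    result = body["result"]
    return {
        "version": result.get("ckan_version"),
        "extensions": sorted(result.get("extensions", [])),
    }

info = extract_portal_info(sample_status)
print(info["version"])     # 2.7.2
print(info["extensions"])  # ['datapusher', 'datastore', 'stats']
```

Aggregating such per-portal results across many portals is what makes plugin-penetration comparisons possible.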

Globally Accessible open data

With Datashades.info, you can easily access an index of metadata and statistics on CKAN data portals across the globe. To do this, simply type in the portal’s URL on the homepage then click “Search“.

Integrated Values of All Metrics

After entering a portal’s URL, Datashades.info will load its information. After a few seconds, you will be able to see a range of data on portal users, datasets, resources, organisations, tags and plugins. Portal managers can access these via the individual portal page found on the site.

Easily-tracked Historical Data

Want to revisit data you previously explored? The tool also keeps old data in a historical index which users can explore any time on any portal page or by clicking “View All Data Portals” on the homepage.

Crowdsourcing

Datashades.info uses crowdsourcing to build its index. This means users can easily add any CKAN data portal not found on the site. To do this, simply search for a portal you know and it will be automatically added to the site and to the global statistics. As the project is still at a beta level of maturity, there is room for improvement in many areas. But with continuous feedback from the CKAN community, expect more data and features in future releases. For now, have a look around and stay tuned!

Statement from the Open Knowledge Foundation Board on the future of the CKAN Association

- June 6, 2019 in ckan, Open Data, Open Knowledge, Open Knowledge Foundation

The Open Knowledge Foundation (OKF) Board met on Monday evening to discuss the future of the CKAN Association.

The Board supported the CKAN Stewardship proposal jointly put forward by Link Digital and Datopian. As these are two of the longest-serving members of the CKAN community, it was felt their proposal would move CKAN forward, strengthening both the platform and the community.

In appointing joint stewardship to Link Digital and Datopian, the Board felt there was a clear practical path with strong leadership and committed funding to see CKAN grow and prosper in the years to come.

OKF will remain the ‘purpose trustee’ to ensure the Stewards remain true to the purpose and ethos of the CKAN project. The Board would like to thank everyone who contributed to the deliberations and we are confident CKAN has a very bright future ahead of it.

If you have any questions, please get in touch with Steven de Costa, managing director of Link Digital, or Paul Walsh, CEO of Datopian, by emailing stewards@ckan.org.

CKANconUS and Code for America Summit: some thoughts about the important questions

- June 20, 2018 in ckan, code for america, Events, OK US, USA

It’s been a few weeks since CKANConUS and the seventh Code for America Summit took place in Oakland. As always, it was a great place to meet old friends and new faces among technologists, policy experts and government innovators in the U.S. In this blogpost I share some of the experience of attending these two conferences and a few thoughts I’ve been ruminating on about the discussions that happened and, more importantly, those that didn’t.

CKAN is an open source open data portal platform that Open Knowledge International developed several years ago. It has been used and reused by many governments and civil society organizations around the world. For CKANconUS, the OK US group, led by Joel Natividad, organized a one-day event with different users and implementers of CKAN around the United States. We had the California-based LA Counts, gathering data from the 88 cities in the County of Los Angeles, and the California Data Collaborative, working to improve water management decisions. We also had some interesting presentations from the GreenInfo Network and the California Natural Resources Agency. And we had the chance to hear about the awesome process the Western Pennsylvania Regional Data Center went through to choose CKAN as its platform and how they maintain the project (the presentation included LEGOs in every slide).

On the more technical side, David Read, Ian Ward and our own Adrià Mercader talked about the new versions of CKAN, the Express Loader and the Technical Roadmap for CKAN, 11 years after its development started. You can view the slides by Adrià Mercader on the CKAN Technical Roadmap overview here. We closed with some great lightning talks about datamirror.org, which ensures access to federal research data, and about Human Centered Design and what Amanda Damewood learned about improving these processes while working in government.
The next two days in the Code for America Summit were full of interesting talks about building tools, innovating in our processes and making government work better for people. There were some interesting keynote speakers as well as breakout sessions where we discussed how certain projects were built and how we can rethink the way we engage our communities.

I would like to highlight two mainstage talks about collaboration (or the difficulty of it) between government and civil society. The first was a talk and panel about disasters in Puerto Rico, Houston and cities in Florida, where some key points were raised about the importance of having accurate, verifiable and usable information in these situations, as well as the importance of having a network of people who are willing to help their peers. The second was Code for Asheville’s presentation about their issues with homelessness and police data. This isn’t necessarily what you would call a success story, but Sabrah n’haRaven made a great point about working on social issues: “Trust effective communities to understand their own problems”. This may sound like a given in the work we do with data and the things we build with it, but it’s something we need to keep in mind.

Following this line of thought, it seems crucial to keep these conversations going. We need to understand our communities, be aware that there are policies that go against people’s right to live a fulfilling life, and work to change that. I hope that at the next CfA Summit and CKANConUS we can try to find some answers to these questions collectively.


Validation for Open Data Portals: a Frictionless Data Case Study

- December 18, 2017 in case study, ckan, Data Quality, Frictionless Data, goodtables

The Frictionless Data project is about making it effortless to transport high-quality data among different tools and platforms for further analysis. We are doing this by developing a set of software, specifications, and best practices for publishing data. The heart of Frictionless Data is the Data Package specification, a containerization format for any kind of data based on existing practices for publishing open-source software. Through its pilots, Frictionless Data is working directly with organisations to solve real problems managing data. The University of Pittsburgh’s Center for Urban and Social Research is one such organisation.

One of the main goals of the Frictionless Data project is to help improve data quality by providing easy-to-integrate libraries and services for data validation. We have integrated data validation seamlessly with different backends like GitHub and Amazon S3 via the online service goodtables.io, but we also wanted to explore closer integrations with other platforms. An obvious choice is Open Data portals. They are still one of the main channels for disseminating Open Data, especially for governments and other organizations. They provide a single entry point to data relating to a particular region or thematic area, and give users tools to discover and access different datasets. On the backend, publishers also have tools available for the validation and publication of datasets.

Data quality varies widely across different portals, reflecting the publication processes and requirements of the hosting organizations. In general, it is difficult for users to assess the quality of the data, and there is a lack of descriptors for the actual data fields. At the publisher level, while strong emphasis has been put on metadata standards and interoperability, publishers don’t generally have the same help or guidance when dealing with data quality or description.
We believe that data quality in Open Data portals can have a central place on both these fronts, user-centric and publisher-centric, and we started this pilot to showcase a possible implementation. To field-test our implementation we chose the Western Pennsylvania Regional Data Center (WPRDC), managed by the University of Pittsburgh Center for Urban and Social Research. WPRDC is a great example of a well-managed Open Data portal, where datasets are actively maintained and the portal itself is just one component of a wider Open Data strategy. It also has a good variety of publishers, including public sector agencies, academic institutions, and nonprofit organizations.

The portal software we are using for this pilot is CKAN, the world-leading open source software for Open Data portals (source). Open Knowledge International initially fostered the CKAN project and is now a member of the CKAN Association. We created ckanext-validation, a CKAN extension that provides a low-level API and readily available features for data validation and reporting that can be added to any CKAN instance. It is powered by goodtables, a library developed by Open Knowledge International to support the validation of tabular datasets. The ckanext-validation extension allows users to perform data validation against any tabular resource, such as CSV or Excel files. This generates a report, stored against the particular resource, describing issues found with the data, both at the structural level, such as missing headers and blank rows, and at the data-schema level, such as wrong data types and out-of-range values. Read the technical details about this pilot study, our learnings and the areas we have identified for further work here on the Frictionless Data website.
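The goodtables library performs many more checks than can be shown here, but the flavour of the structural validation described above (missing headers, blank rows, ragged rows) can be sketched with nothing but the standard library. This is a toy illustration of the idea, not the ckanext-validation or goodtables implementation:

```python
import csv
import io

def validate_table(csv_text):
    """Report blank headers, blank rows, and ragged rows in a CSV,
    in the spirit of goodtables' structural checks (toy version)."""
    errors = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [{"row": None, "error": "empty table"}]
    header = rows[0]
    # Structural check 1: every column must have a non-blank header.
    for i, name in enumerate(header, start=1):
        if not name.strip():
            errors.append({"row": 1, "error": "blank header in column %d" % i})
    # Structural checks 2 and 3: no fully blank rows, no ragged rows.
    for n, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            errors.append({"row": n, "error": "blank row"})
        elif len(row) != len(header):
            errors.append({"row": n, "error": "row has wrong number of cells"})
    return errors

report = validate_table("id,,name\n1,x,Alice\n,,\n2,y\n")
for issue in report:
    print(issue)
```

In the real extension, a report like this is stored against the CKAN resource so both publishers and users can see the data quality issues.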

CKAN 2.7.2 Installation and Deployment Guide

- October 18, 2017 in ckan

CKAN is widely used in many countries, including the United States and the United Kingdom, for sharing data and metadata catalogues.
Following the official CKAN documentation (http://docs.ckan.org/en/ckan-2.7.0/maintaining/installing/index.html) you can easily install the packaged version of CKAN,
but on operating systems other than Ubuntu 14.04 (LTS) you need to install CKAN from source.
When installing from source, you may run into errors not covered in the official documentation, caused by the dependencies and configuration of everything that has to be installed alongside CKAN (required packages, libraries, the web server, and so on).
This guide therefore shares the errors that can occur during a source install, together with ways to resolve them.
The installation environment is listed below; please refer to the official documentation as you follow along.
OS: Ubuntu 16.04 LTS (as of October 2017)
CKAN: CKAN 2.7.0 source
A. Installing CKAN

1. Install the required packages
> The packages required to run CKAN on Ubuntu 16.04 can be installed as follows:
$ sudo apt-get install python-dev postgresql libpq-dev python-pip python-virtualenv git-core solr-jetty openjdk-8-jdk redis-server

2. Install CKAN into a Python virtual environment
a. Create a Python virtual environment (virtualenv) to install CKAN into, and activate it:
$ sudo mkdir -p /usr/lib/ckan/default
$ sudo chown `whoami` /usr/lib/ckan/default
$ virtualenv --no-site-packages /usr/lib/ckan/default
$ . /usr/lib/ckan/default/bin/activate
> Note: all of the following steps are run inside the virtual environment.
> If one step fails, the next steps may fail too, so make sure each step finishes successfully before moving on. (Even if no error appears along the way, the final deployment may still fail to run.)
b. Install the CKAN source code into your virtualenv. To install the latest stable release of CKAN (CKAN 2.7.0), run:
$ pip install -e 'git+https://github.com/ckan/ckan.git@ckan-2.7.0#egg=ckan'
> As of September 2017, the latest version in git is 2.7.0; you can install a different version by changing the version tag.
[Error Tip] > If git reports a ‘HEAD’ mismatch error, you can install without pinning a specific version by omitting the @ckan-2.7.0 tag.
c. Install the recommended version of 'setuptools':
$ pip install -r /usr/lib/ckan/default/src/ckan/requirement-setuptools.txt
> The version of setuptools used here is 36.6.
d. Install the Python modules that CKAN requires into your virtualenv:
$ pip install -r /usr/lib/ckan/default/src/ckan/requirements.txt
[Note] Deactivate and reactivate your virtualenv, to make sure you’re using the virtualenv’s copies of commands like paster rather than any system-wide installed copies:
$ deactivate
$ . /usr/lib/ckan/default/bin/activate

3. Setup a PostgreSQL database
a. Check that PostgreSQL was installed correctly by listing the existing databases
$ sudo -u postgres psql -l
b. Next you’ll need to create a database user if one doesn’t already exist. Create a new PostgreSQL database user called ckan_default, and enter a password for the user when prompted. You’ll need this password later:
 $ sudo -u postgres createuser -S -D -R -P ckan_default
c. Create a new PostgreSQL database, called ckan_default, owned by the database user you just created:
$ sudo -u postgres createdb -O ckan_default ckan_default -E utf-8

4. Create a CKAN config file
a. Create a directory to contain the site’s config files:
$ sudo mkdir -p /etc/ckan/default
$ sudo chown -R `whoami` /etc/ckan/
$ sudo chown -R `whoami` ~/ckan/etc (set this permission only if you put the CKAN path under your /home directory)
b. Create the CKAN config file:
$ . /usr/lib/ckan/default/bin/activate
> Re-enter the virtual environment before continuing with all of the remaining setup.
$ paster make-config ckan /etc/ckan/default/development.ini
Edit the development.ini file in a text editor, changing the following options:
1) sqlalchemy.url
$ sqlalchemy.url = postgresql://ckan_default:pass@localhost/ckan_default
> Replace pass with the password you chose when creating the database user in step 3-b above.
2) site_id
Each CKAN site should have a unique site_id, for example:
$ ckan.site_id = default
3) site_url
Provide the site’s URL (used when putting links to the site into the FileStore, notification emails etc). For example:
$ ckan.site_url = http://demo.ckan.org
> Enter your domain if you have one; otherwise use http://localhost, or your IP address if you have a static IP. This URL becomes the base path for links on the site (sign-up and so on), so set it carefully.

5. Setup Solr
Edit the Jetty configuration file (/etc/default/jetty8) and change the following variables:
a. Setting Solr
$ sudo vi /etc/default/jetty8 (uncomment the three lines below)
NO_START=0            # (line 4)
JETTY_HOST=127.0.0.1  # (line 16)
JETTY_PORT=8983       # (line 19)
b. Start or restart the Jetty server. For Ubuntu 16.04:
$ sudo service jetty8 restart
c. Check welcome page of Solr, http://localhost:8983/solr/
d. Replace the default schema.xml file with a symlink to the CKAN schema file included in the sources.
$ sudo mv /etc/solr/conf/schema.xml /etc/solr/conf/schema.xml.bak
$ sudo ln -s /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema.xml
$ sudo service jetty8 restart

6. Create database tables
a. Create the database tables:
$ . /usr/lib/ckan/default/bin/activate
$ cd /usr/lib/ckan/default/src/ckan
$ paster db init -c /etc/ckan/default/development.ini

7. Link to who.ini
$ ln -s /usr/lib/ckan/default/src/ckan/who.ini /etc/ckan/default/who.ini
$ cd /usr/lib/ckan/default/src/ckan
$ paster serve /etc/ckan/default/development.ini
Open the site in a web browser and you should see the initial CKAN front page.

B. Once the CKAN install is complete, prepare for deployment (Deploying a source install)
1. Create a production.ini File
$ cp /etc/ckan/default/development.ini /etc/ckan/default/production.ini

2. Install Apache, modwsgi, modrpaf
$ sudo apt-get install apache2 libapache2-mod-wsgi libapache2-mod-rpaf

3. Install Nginx
$ sudo apt-get install nginx
[Note] If a dependency error occurs while installing nginx, reinstall it as follows:
$ sudo apt-get purge nginx-full nginx-common
$ sudo  apt-get install nginx-full

4. Create the WSGI script file
Create your site’s WSGI script file /etc/ckan/default/apache.wsgi with the following contents:
$ sudo vi /etc/ckan/default/apache.wsgi
import os
activate_this = os.path.join('/usr/lib/ckan/default/bin/activate_this.py')
execfile(activate_this, dict(__file__=activate_this))
from paste.deploy import loadapp
config_filepath = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'production.ini')
from paste.script.util.logging_config import fileConfig
fileConfig(config_filepath)
application = loadapp('config:%s' % config_filepath)

5. Create the Apache config file
$ sudo vi /etc/apache2/sites-available/ckan_default.conf
<VirtualHost 127.0.0.1:8080>
    ServerName <your assigned DNS name>
    ServerAlias <an alternate name for that DNS>
    WSGIScriptAlias / /etc/ckan/default/apache.wsgi
    # Pass authorization info on (needed for rest api).
    WSGIPassAuthorization On
    # Deploy as a daemon (avoids conflicts between CKAN instances).
    WSGIDaemonProcess ckan_default display-name=ckan_default processes=2 threads=15
    WSGIProcessGroup ckan_default
    ErrorLog /var/log/apache2/ckan_default.error.log
    CustomLog /var/log/apache2/ckan_default.custom.log combined
    <IfModule mod_rpaf.c>
        RPAFenable On
        RPAFsethostname On
        RPAFproxy_ips 127.0.0.1
    </IfModule>
    <Directory />
        Require all granted
    </Directory>
</VirtualHost>

6. Modify the Apache ports.conf file
$ sudo vi /etc/apache2/ports.conf
 Listen 8080 (change Listen 80 to Listen 8080)

7. Create the Nginx config file
Create your site’s Nginx config file at /etc/nginx/sites-available/ckan, with the following contents:
$ sudo vi /etc/nginx/sites-available/ckan
proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=cache:30m max_size=250m;
proxy_temp_path /tmp/nginx_proxy 1 2;
server {
    client_max_body_size 100M;
    location / {
        proxy_pass http://127.0.0.1:8080/;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_cache cache;
        proxy_cache_bypass $cookie_auth_tkt;
        proxy_no_cache $cookie_auth_tkt;
        proxy_cache_valid 30m;
        proxy_cache_key $host$scheme$proxy_host$request_uri;
        # In emergency comment out line to force caching
        # proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
    }
}

8. Enable your CKAN site
To prevent conflicts, disable your default nginx and apache sites. Finally, enable your CKAN site in Apache.
$ sudo a2ensite ckan_default
$ sudo a2dissite 000-default
$ sudo rm -vi /etc/nginx/sites-enabled/default
$ sudo ln -s /etc/nginx/sites-available/ckan /etc/nginx/sites-enabled/ckan_default
$ sudo service apache2 reload
$ sudo service nginx reload
Deployment complete.

[Additional note]
Command to grant admin rights to [User1] so they can add datasets:
$ paster --plugin=ckan sysadmin add [User1] -c /etc/ckan/default/production.ini

New open energy data portal set to spark innovation in energy efficiency solutions

- June 22, 2017 in ckan, Viderum

Viderum spun off as a company from Open Knowledge International in 2016 with the aim of providing services and products to further expand the reach of open data around the world. Last week they made a great step in this direction by powering the launch of the Energy Data Service portal, which will make Denmark’s energy data available to everyone. This press release has been reposted from Viderum‘s website at http://www.viderum.com/blog/2017/06/17/new-open-energy-data-portal-set-to-spark-innovation.

Image credit: Jürgen Sandesneben, Flickr CC BY

A revolutionary new online portal, which gives open access to Denmark’s energy data, is set to spark innovation in smart, data-led solutions for energy efficiency. The Energy Data Service, launched on 17 June 2017 by the CEO of Denmark’s state-owned gas and electricity provider Energinet and the Minister for Energy, Utilities and Climate, will share near real-time aggregated energy consumption data for all Danish municipalities, as well as data on CO2 emissions, energy production and the electricity market. Developers, entrepreneurs and companies will be able to access and use the data to create apps and other smart data services that empower consumers to use energy more efficiently and flexibly, saving them money and cutting their carbon footprint.

Viderum is the technology partner behind the Energy Data Service. It developed the portal using CKAN, the leading data management platform for open data, originally developed by non-profit organisation Open Knowledge International. Sebastian Moleski, CEO of Viderum, said: “Viderum is excited to be working with Energinet at the forefront of the open data revolution to make Denmark’s energy data available to everyone via the Energy Data Service portal. The portal makes a huge amount of complex data easily accessible, and we look forward to developing its capabilities further in the future, eventually providing real-time energy and CO2 emissions data.”

Energinet hopes that the Energy Data Service will be a catalyst for the digitalisation of the energy sector and for green innovation and economic growth, both in Denmark and beyond. “As we transition to a low carbon future, we need to empower consumers to be smarter with how they use energy. The Energy Data Service will enable the development of innovative data-based solutions to make this possible. For example, an electric car that knows when there is spare capacity on the electricity grid, making it a good time to charge itself. Or an app that helps local authorities understand energy consumption patterns in social housing, so they can make improvements that will save money and cut carbon”, said Peder Ø. Andreasen, CEO of Energinet. The current version of the Energy Data Service includes the following features:
  • API (Application Programme Interface) access to all raw data, which makes it easy to use in data applications and services
  • Downloadable data sets in regular formats (CSV and Excel)
  • Helpful user guides
  • Contextual information and descriptions of data sets
  • Online discussion forum for questions and knowledge sharing
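Since the portal is built on CKAN, its raw-data API is typically CKAN’s DataStore API. As a hedged sketch (the portal URL, resource id and field names below are invented for illustration, not taken from Energinet’s documentation), a client might build a request and parse the response like this:

```python
import json
from urllib.parse import urlencode

# Hypothetical portal URL and resource id, for illustration only.
PORTAL = "https://energy-portal.example"
RESOURCE_ID = "co2-emissions"

def datastore_search_url(portal, resource_id, limit=5):
    """Build a CKAN datastore_search request URL for a given resource."""
    query = urlencode({"resource_id": resource_id, "limit": limit})
    return "%s/api/3/action/datastore_search?%s" % (portal, query)

url = datastore_search_url(PORTAL, RESOURCE_ID)
print(url)

# A real client would fetch `url` over HTTP; here we parse a canned
# response of the shape CKAN's datastore_search returns.
canned = json.dumps({
    "success": True,
    "result": {"records": [{"municipality": "Aarhus", "co2_g_per_kwh": 182}]},
})
records = json.loads(canned)["result"]["records"]
print(records[0]["municipality"])
```

The same records are also available as the downloadable CSV and Excel files mentioned above; the API form is what makes the data easy to use in apps and services.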

Three ways ROUTETOPA promotes Transparency

- March 14, 2017 in ckan, Open Data, Open Knowledge, Route to PA

Data sharing has come a long way over the years. With open source tools, improvements and new features are always quickly on the rise. Serah Rono looks at how ROUTETOPA, a Horizon2020 project, advocates for transparency.

From as far back as the age of enlightenment, the human race has worked hard to keep authorities accountable. Long term advocates of open data agree that governments are custodians, rather than owners, of data in their keep and should, therefore, avail the information they are charged with safekeeping for public scrutiny and use. Privacy and national security concerns are some of the most common barriers to absolute openness in governments and institutions in general around the world.

As more governments and organisations embrace the idea of open data, some end up inadvertently holding back on releasing data they believe is not ready for the public eye, a phenomenon known as ‘data-hugging’. In other instances, governments and organisations mislead the general public about the actual quantity and quality of information they have made public. This is usually a play at politics, a phenomenon referred to as ‘open-washing’, and is very frustrating to the open data community. It does not always stop there: some organisations notoriously exaggerate the impact of their open data work, a phenomenon Andy Nickinson refers to as ‘open-wishing’.

The  Horizon2020 project, Raising Open and User-Friendly Transparency Enabling Technologies for Public Administrations (ROUTETOPA), works to bridge the gap between open data users and open data publishers. You can read the project overview in this post and find more information on the project here.

In an age of open-washing and data-hugging, how does ROUTETOPA advocate for transparency?

  1. ROUTETOPA leads by example!

The source code for ROUTETOPA tools is open source and lives in this repository. ROUTETOPA also used CKAN, a renowned data portal platform, as the basis for its Transparency Enabling Toolkit (TET). TET provides public administrators in ROUTETOPA’s pilot cities with a platform to publish and open up their data to the public. You can read more about it here. 

  2. Data publishers as pilot leads

ROUTETOPA pilots are led by public administrators. This ensures that public administrators are publishing new data regularly and that they are also at hand to answer community questions, respond to community concerns and spearhead community discussions around open data in the five pilot cities.

  3. Use of online and offline communication channels

Not only does ROUTETOPA have an active social media presence on Facebook, Twitter and YouTube, it also has its own social media platform, the Social Platform for Open Data (SPOD), which provides a much-needed avenue for open data discourse between data publishers and users. The pilots in Prato, Groningen, Dublin, Issy and Den Haag also hold regular workshops, focus groups and tool test parties. Offline engagement is more relatable, creates rapport between public administrations and citizens, and is also a great avenue for making data requests.

The ROUTETOPA consortium also runs an active blog that features project updates and lessons learnt along the way. Workshops and focus groups are a key part of the success of this project, as user feedback informs the development process of ROUTETOPA tools.

ROUTETOPA partners also attend open data conferences and seminars to spread the word, keep the open data community across Europe in the know, and invite the community to test the tools, give feedback and, if it suits, adapt the tools for use in their own organizations, institutions and public administrations.

Need clarification, or want to plug in and be a part of ROUTETOPA’s progress? Write to serah.rono@okfn.org. Stay open!

7 ways the ROUTE-TO-PA project has improved data sharing through CKAN

- February 27, 2017 in ckan, Route to PA

Data sharing has come a long way over the years. With open source tools, improvements and new features are always quickly on the horizon. Serah Rono looks at the improvements that have been made to the open source data management system CKAN through the course of the ROUTE-TO-PA project.

In the present day, 5MB worth of data would probably be a decent photo, a three-minute song, or a spreadsheet. Nothing worth writing home about, let alone splashing across the front pages of mainstream media. This was not the case in 1956, though: in September of that year, IBM made the news by creating a 5MB hard drive. It was so big that a crane was used to lift it onto a plane. Two years later, in 1958, the World Data Centre was established to allow users open access to scientific data. Over the years, data storage and sharing options have evolved to be more portable, more secure, and, with the blossoming of the Internet, virtual, too.

One such virtual data sharing platform, CKAN, has been up and running for ten years now. CKAN is a powerful data management system that makes data accessible by providing tools to streamline publishing, sharing, finding and using data. It is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available. It is no wonder, then, that ROUTE-TO-PA, a Horizon2020 project pushing for transparency in public administrations across the EU, chose CKAN as the foundation for its Transparency Enhancing Toolset (TET). As one of ROUTE-TO-PA’s tools, the TET provides data publishers with a platform on which they can open up the data in their custody to the general public. So, what improvements have been made to the CKAN base code to constitute the Transparency Enhancing Toolset? Below is a brief list:

1. Content management system support

Integrating CKAN with a content management system enables publishers to publish content related to datasets, and to post updates about the portal, in an easy way. The TET WordPress plugin integrates seamlessly with a TET-enabled CKAN instance, providing publishers with rich content publishing features and an elegantly organized entry point to the data portal.

2. PivotTable

Out of the box, CKAN has limited data analysis capabilities, which are essential for working with data. ROUTE-TO-PA added a PivotTable feature that allows users to view, summarize and visualize data. From the data explorer in this example, users can easily create pivot tables and even run SQL queries. See source code here.
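To illustrate what the feature does, here is a minimal sketch of the aggregation a pivot table performs, run over hypothetical records of the kind CKAN’s DataStore returns (the field names and values below are invented for illustration; the TET feature itself runs in the portal’s data explorer):

```python
from collections import defaultdict

def pivot_sum(records, row_key, col_key, value_key):
    """Summarize flat records into a {row: {column: sum}} table --
    the same aggregation a pivot table performs."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[value_key]
    return {row: dict(cols) for row, cols in table.items()}

# Hypothetical dataset rows, shaped like CKAN DataStore records
records = [
    {"city": "Dublin", "year": 2016, "bins": 120},
    {"city": "Dublin", "year": 2017, "bins": 150},
    {"city": "Cork",   "year": 2016, "bins": 80},
]
print(pivot_sum(records, "city", "year", "bins"))
```

The same summary can be produced server-side with a SQL `GROUP BY` query against the DataStore, which is what the “run SQL queries” option exposes.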

3. OpenID

ROUTE-TO-PA created an OpenID plugin for CKAN, enabling OpenID authentication on CKAN portals. See source code here.
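Assuming the plugin follows CKAN’s standard extension conventions, enabling it would be a one-line change to the portal’s configuration file (the plugin identifier `openid` below is an assumption; check the plugin’s README for the actual name):

```ini
; ckan.ini -- hypothetical snippet; plugin name may differ
ckan.plugins = stats text_view datastore openid
```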

4. Recommendation for related datasets

With this feature, the application recommends related datasets a user can look at based on the current selection and other contextual information. The feature guides users to find potentially useful and relevant datasets. See example in this search result for datasets on bins in Dublin, Ireland.

5. Combine Datasets Feature

This feature allows users to combine related datasets from their TET search results into one ‘wholesome’ dataset. Along with the Refine Results feature, the Combine Datasets feature is found in the top right corner of the search results page, as in this example. Please note that only datasets with the same structure can be combined at this point. Once combined, the resulting dataset can be downloaded for use.
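The same-structure restriction can be sketched in a few lines: concatenation is only well-defined when every dataset shares an identical set of columns. This is an illustrative sketch, not the TET implementation:

```python
def combine_datasets(*datasets):
    """Concatenate datasets (lists of row dicts) that share an
    identical column structure. Raises ValueError otherwise,
    mirroring the restriction that only datasets with the same
    structure can be combined."""
    if not datasets:
        return []
    columns = set(datasets[0][0].keys())
    combined = []
    for dataset in datasets:
        for row in dataset:
            if set(row.keys()) != columns:
                raise ValueError("datasets have different structures")
            combined.append(row)
    return combined
```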

6. Personalized search and recommendations

The personalized search feature gives logged-in users search results tailored to the details provided in their profile. In addition, logged-in users receive personalized recommendations based on their profile details.

7. Metadata quality check/validation

Extra validations have been added to the dataset entry form to prevent data entry errors and ensure consistency. You can find, borrow from and contribute to the CKAN and TET code repositories on GitHub, join CKAN’s global user group, or email serah.rono@okfn.org with any and all of your questions. Viva el open source!
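The kind of extra checks described above can be sketched as a plain validator function. The specific rules below (non-empty title, minimum description length, required licence) are illustrative assumptions, not the TET rule set, though `title`, `notes` and `license_id` are standard CKAN dataset fields:

```python
def validate_dataset(metadata):
    """Return a list of validation errors for a dataset entry,
    illustrating form-level metadata quality checks.
    The thresholds and rules here are hypothetical."""
    errors = []
    if not metadata.get("title", "").strip():
        errors.append("title: missing value")
    if len(metadata.get("notes", "")) < 20:
        errors.append("notes: description should be at least 20 characters")
    if not metadata.get("license_id"):
        errors.append("license_id: a licence must be selected")
    return errors
```

In a real CKAN extension such checks would be registered as validators on the dataset schema rather than called directly.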

Energinet.dk will use CKAN to launch Energy DataStore – a free and open portal for sharing energy data

- January 24, 2017 in åben data, ckan, energi data, english, klima, offentlige data

This is a repost of an entry from Open Knowledge International’s blog. Open data service provider Viderum is working with Energinet.dk, the gas and electricity transmission system operator in Denmark, to provide near real-time access to Danish energy data. Using CKAN, an open-source platform for sharing data originally developed by Open Knowledge International, Energinet.dk’s Energy DataStore will provide easy and open access to large quantities of energy data to support the green transition and enable innovation.
Image credit: Jürgen Sandesneben, Flickr CC BY


What is the Energy DataStore?

Energinet.dk holds energy consumption data from Danish households and businesses, as well as production data from windmills, solar cells and power plants. All this data will be made available in aggregated form through the Energy DataStore, including electricity market data and near-real-time information on CO2 emissions. The Energy DataStore will be built using the open-source platform CKAN, the world’s leading data management system for open data. Through the platform, users will be able to find and extract data manually or through an API.

“The Energy DataStore opens the next frontier for CKAN by expanding into large-scale, continuously growing datasets published by public sector enterprises”, writes Sebastian Moleski, Managing Director of Viderum. “We’re delighted Energinet.dk has chosen Viderum as the CKAN experts to help build this revolutionary platform. With our contribution to the success of the Energy DataStore, Viderum is taking the next step in fulfilling our mission: to make the world’s public data discoverable and accessible to everyone.”

Open Knowledge International’s commercial spin-off, Viderum, is using CKAN to build a responsive platform for Energinet.dk that publishes energy consumption data for every municipality in hourly increments, with a view to providing real-time data in the future. The Energy DataStore will give consumers, businesses and non-profit organizations access to information vital for consumer savings, business innovation and green technology. As Pavel Richter, CEO of Open Knowledge International, explains, “CKAN has been instrumental over the past 10 years in providing access to a wide range of government data. By using CKAN, the Energy DataStore signals a growing awareness of the value of open data and open source to society, not just for business growth and innovation, but for citizens and civil society organizations looking to use this data to address environmental issues.”

Energinet.dk hopes that by providing easily accessible energy data, citizens will feel empowered by the transparency and businesses will create new products and services, leading to more knowledge sharing around innovative business models.

Notes:
Energinet.dk
Energinet.dk owns the Danish electricity and gas transmission system – the ‘energy’ motorways. The company’s main task is to maintain the overall security of electricity and gas supply and create objective and transparent conditions for competition on the energy markets.
CKAN
CKAN is the world’s leading open-source data portal platform. It is a complete out-of-the-box software solution that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available. A slide-deck overview of CKAN can be found here.
Viderum
Viderum is an open data solutions provider spun off from Open Knowledge, an internationally recognized non-profit working to open knowledge and see it used to empower and improve the lives of citizens around the world.
Open Knowledge International
Open Knowledge International is a global non-profit organisation focused on realising open data’s value to society by helping civil society groups access and use data to take action on social problems. Open Knowledge International does this in three ways: 1) we show the value of open data for the work of civil society organizations; 2) we provide organisations with the tools and skills to effectively use open data; and 3) we make government information systems responsive to civil society.
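The API access mentioned for the Energy DataStore works through CKAN’s standard Action API. As a sketch, this builds a query URL for the `datastore_search` action, which is a real CKAN API action; the portal address and resource id below are illustrative placeholders, not the Energy DataStore’s actual endpoints:

```python
from urllib.parse import urlencode

def datastore_search_url(portal, resource_id, limit=100):
    """Build a CKAN DataStore Action API query URL.
    `datastore_search` is CKAN's standard action for reading
    DataStore records; portal and resource_id are placeholders."""
    params = urlencode({"resource_id": resource_id, "limit": limit})
    return f"{portal}/api/3/action/datastore_search?{params}"

url = datastore_search_url("https://example-energy-portal.dk",
                           "co2-emissions-hourly", limit=24)
print(url)
```

Fetching that URL on a live CKAN portal returns JSON with a `result.records` list, which is how near-real-time figures such as hourly CO2 emissions would be extracted programmatically.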