You are browsing the archive for Stephen Gates.

Data Curator – share usable open data

- March 14, 2019 in Frictionless Data, tools

Data Curator is a simple desktop editor to help describe, validate, and share usable open data.

Open data producers are increasingly focusing on improving open data so it can be easily used to create insight and drive positive change. Open data is more likely to be used if data consumers can:

  • understand the structure and quality of the data
  • understand why and how the data was collected
  • look up the meaning of codes used in the data
  • access the data in an open machine-readable format
  • know how the data is licensed and how it can be reused

Data Curator enables open data producers to define all this information, and validate the data, prior to publishing it on the Internet. The data is published as a Tabular Data Package following the Frictionless Data specification. This allows open data consumers to read the data using Frictionless Data applications and software libraries.

“We need to make it easy to manage data throughout its lifecycle and ensure it can be easily and reliably retrieved by people who want to reuse and repurpose it. We developed Data Curator to help publishers define certain characteristics to improve data and metadata quality” – Dallas Stower, Assistant Director-General, Digital Platforms and Data, Queensland Government – Project Sponsor

Data Curator allows you to create data from scratch or open an Excel or CSV file. Data Curator requires that each column of data is given a type (e.g. text, number). Data can be defined further using a format (e.g. text may be a URL or email). Constraints can be applied to data values (e.g. required, unique, minimum value). This definition process can be accelerated by using the Guess feature, which guesses the data types and formats for all columns.

Data can be validated against the column type, format and constraints to identify and correct errors. If it’s not appropriate to correct the errors, they can be added to the provenance information to help people understand why and how the data was collected and determine if it is fit for their purpose.
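To make the idea of types, formats and constraints concrete, here is a minimal sketch in plain Python. It is not how Data Curator or the Frictionless Data libraries are implemented. The column names, the `validate_rows` helper and the crude email check are assumptions made for illustration, and the schema is only written in the style of a Table Schema.

```python
# Hypothetical column definitions written in the style of a Table Schema.
# The column names are illustrative and do not come from Data Curator.
SCHEMA = {
    "fields": [
        {"name": "id", "type": "integer",
         "constraints": {"required": True, "unique": True}},
        {"name": "email", "type": "string", "format": "email",
         "constraints": {"required": True}},
        {"name": "score", "type": "number",
         "constraints": {"minimum": 0}},
    ]
}

# How each supported type is parsed from the text found in a CSV cell.
CASTS = {"integer": int, "number": float, "string": str}


def validate_rows(schema, rows):
    """Return a list of (row_number, field_name, message) tuples."""
    errors = []
    # Values already seen per column, used for the "unique" constraint.
    seen = {field["name"]: set() for field in schema["fields"]}
    for row_number, row in enumerate(rows, start=1):
        for field in schema["fields"]:
            name = field["name"]
            constraints = field.get("constraints", {})
            raw = row.get(name, "")
            if raw == "":
                if constraints.get("required"):
                    errors.append((row_number, name, "required value is missing"))
                continue
            try:
                value = CASTS[field["type"]](raw)
            except ValueError:
                errors.append((row_number, name, f"not a valid {field['type']}"))
                continue
            # Deliberately crude format check, for illustration only.
            if field.get("format") == "email" and "@" not in value:
                errors.append((row_number, name, "not a valid email"))
            if "minimum" in constraints and value < constraints["minimum"]:
                errors.append((row_number, name, "below the minimum value"))
            if constraints.get("unique"):
                if value in seen[name]:
                    errors.append((row_number, name, "duplicate value"))
                seen[name].add(value)
    return errors


# Sample rows as they might be read from a CSV file (all values are text).
SAMPLE_ROWS = [
    {"id": "1", "email": "a@example.org", "score": "3.5"},
    {"id": "1", "email": "not-an-email", "score": "-2"},
    {"id": "x", "email": "", "score": ""},
]
```

Calling `validate_rows(SCHEMA, SAMPLE_ROWS)` returns one entry per problem found, which can either be corrected or recorded alongside the data as a known issue.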

Data Curator screenshot

Often a set of codes used in the data is defined in another table. Data Curator lets you validate data across tables. This is really useful if you want to share a set of standard codes across different datasets or organisations.
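Validation across tables can be pictured as a foreign key check. The sketch below is a simplified illustration in plain Python, not Data Curator's implementation. The tables, the column names and the `find_broken_references` helper are hypothetical; the `foreignKeys` entry follows the shape used by the Table Schema specification.

```python
# Hypothetical lookup table of codes and a data table that refers to it.
STATES = [
    {"code": "QLD", "name": "Queensland"},
    {"code": "NSW", "name": "New South Wales"},
]
OFFICES = [
    {"office": "Brisbane", "state_code": "QLD"},
    {"office": "Sydney", "state_code": "NSW"},
    {"office": "Hobart", "state_code": "TAS"},
]

# Schema for the data table, declaring that state_code must exist in states.code.
OFFICE_SCHEMA = {
    "fields": [
        {"name": "office", "type": "string"},
        {"name": "state_code", "type": "string"},
    ],
    "foreignKeys": [
        {"fields": "state_code",
         "reference": {"resource": "states", "fields": "code"}},
    ],
}


def find_broken_references(rows, foreign_key, tables):
    """Return (row_number, value) for every value missing from the lookup table."""
    field = foreign_key["fields"]
    reference = foreign_key["reference"]
    # Collect the set of codes defined in the referenced table.
    known = {row[reference["fields"]] for row in tables[reference["resource"]]}
    return [
        (row_number, row[field])
        for row_number, row in enumerate(rows, start=1)
        if row[field] not in known
    ]
```

Here `find_broken_references(OFFICES, OFFICE_SCHEMA["foreignKeys"][0], {"states": STATES})` reports the third row, because `TAS` is not defined in the lookup table.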

Data Curator lets you save data as a comma, semicolon, or tab separated value file. After you’ve applied an open license to the data, you can export a data package containing the data, its description, and provenance information. The data package can then be published to the Internet. Some open data platforms support uploading, displaying, and downloading data packages. Open data consumers can then confidently access and use quality open data.
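As a rough picture of what ends up in an exported package, the sketch below writes a delimited file and builds a matching `datapackage.json` descriptor using only the Python standard library. The helper names, the file name and the choice of licence are assumptions for illustration. The descriptor that Data Curator actually produces may contain more properties than shown here.

```python
import csv
import io
import json


def rows_to_text(rows, fieldnames, delimiter=","):
    """Serialise rows as delimited text (comma, semicolon or tab separated)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_descriptor(name, path, delimiter, fields):
    """Build a minimal Tabular Data Package descriptor as a dictionary."""
    return {
        "profile": "tabular-data-package",
        "name": name,
        # Example open licence only; choose the one that applies to your data.
        "licenses": [{
            "name": "CC-BY-4.0",
            "path": "https://creativecommons.org/licenses/by/4.0/",
            "title": "Creative Commons Attribution 4.0",
        }],
        "resources": [{
            "profile": "tabular-data-resource",
            "name": name,
            "path": path,
            # The dialect records which separator the data file uses.
            "dialect": {"delimiter": delimiter},
            "schema": {"fields": fields},
        }],
    }


# Hypothetical example: a semicolon separated file and its descriptor.
text = rows_to_text([{"code": "QLD", "name": "Queensland"}], ["code", "name"], ";")
descriptor = build_descriptor(
    "states", "states.csv", ";",
    [{"name": "code", "type": "string"}, {"name": "name", "type": "string"}],
)
descriptor_json = json.dumps(descriptor, indent=2)
```

Saving `text` as `states.csv` next to `descriptor_json` as `datapackage.json` gives a consumer both the data and the description needed to read it.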

Get Started

Download Data Curator for Windows or macOS.

Learn more about Data Curator and Frictionless Data.

Who made Data Curator?

Data Curator was made possible with funding and guidance from the Queensland Government.

The project was led by Stephen Gates from the ODI Australian Network. Software development was made possible by Gavin Kennedy and Matt Mulholland from the Queensland Cyber Infrastructure Foundation (QCIF).

Data Curator uses the Frictionless Data software libraries maintained by Open Knowledge International. Data Curator started life as Comma Chameleon, an experiment by the Open Data Institute.

Walkthrough: My experience building Australia’s Regional Open Data Census

- March 6, 2015 in australia, census, Featured Project, OKF Australia, Open Data Census, regional

On International Open Data Day (21 Feb 2015), Australia’s Regional Open Data Census launched. This is the story of the trials and tribulations in launching the census.

Getting Started

As many open data initiatives come to realise, after filling up a portal with lots of open data, there is a need for quality as well as quantity. I decided to tackle improving the quality of Australia’s open data as my Christmas holiday project. I requested a local open data census on 23 Dec (I’d finished my Christmas shopping a day early). While I was waiting for a reply, I read the documentation – it was well written and configuring a web site using Google Sheets seemed easy enough. The Open Knowledge Local Groups team contacted me early in the new year and introduced me to Pia Waugh and the team at Open Knowledge Australia. Pia helped propose the idea of the census to the leaders of Australia’s state and territory government open data initiatives. I was invited to pitch the census to them at a meeting on 19 Feb – two days before International Open Data Day.

A plan was hatched

On 29 Jan I was informed by Open Knowledge that the census was ready to be configured. Could I be ready to launch in 25 days’ time? Configuring the census was easy: fill in the blanks, add a list of places and some words on the homepage, look at other censuses and re-use some FAQs, add a logo and some custom CSS. However, deciding on what data to assess brought me to a screeching halt.

Deciding on data

The Global census uses data based on the G8 key datasets definition. The Local census template datasets are focused on local government responsibilities. There was no guidance for countries with three levels of government. How could I get agreement on the datasets and launch in time for Open Data Day? I decided to make a Google Sheet with tabs for datasets required by the G8, Global Census, Local Census, Open Data Barometer, and Australia’s Foundation Spatial Data Framework. Based on these references I proposed 10 datasets to assess. An email was sent to the open data leaders asking them to collaborate on selecting the datasets.

GitHub is full of friends

When I encountered issues configuring the census, I turned to GitHub. Paul Walsh, one of the team on the OpenDataCensus repository, was my guardian there – steering my issues to the right place, fixing Google Sheet security bugs, deleting a place I created called “Try it out” that I used for testing, and encouraging me to post user stories for new features. If you’re thinking about building your own census, get on GitHub and read what the team has planned and is busy fixing.

The meeting

I presented to the leaders of Australia’s state and territory open data initiatives on 19 Feb and they requested more time to add extra datasets to the census. We agreed to put a Beta label on the census and launch on Open Data Day.

Ready for lift off

The following day CIO Magazine emailed asking for, “a quick comment on International Open Data Day, how you see open data movement in Australia, and the importance of open data in helping the community”. I told them and they wrote about it. The Open Data Institute Queensland and Open Knowledge blogged and tweeted encouraging volunteers to add to the census on Open Data Day. I set up Gmail and Twitter accounts for the census and requested the census to be added to the big list of censuses.

Open Data Day

No support requests were received from volunteers submitting entries to the census (it is pretty easy). The Open Data Day projects included:
  • drafting a Contributor Guide
  • creating a Google Sheet to allow people to collect census entries prior to entering them online
  • adding Google Analytics to the site

What next?

We are looking forward to a few improvements including adding the map visualisation from the Global Open Data Index to our regional census. That’s why our Twitter account is @AuOpenDataIndex. If you’re thinking about creating your own Open Data Census then I can highly recommend the experience and there is a great team ready to support you. Get in touch if you’d like to help with Australia’s Open Data Census.

Stephen Gates lives in Brisbane, Queensland, Australia. He has written Open Data strategies and driven their implementation. He is actively involved with the Open Data Institute Queensland, contributing to their response to Queensland’s proposed open data law and helping coordinate the localisation of ODI Open Data Certificates. Stephen is also helping organise GovHack 2015 in Brisbane. Australia’s Regional Open Data Census is his first project working with Open Knowledge.