
That to Study Philosophy is to Learn to Die (1580)

- August 29, 2013 in collections, Digital Copy: No Additional Rights, Essays, Internet Archive, Michel de Montaigne, texts, Texts: 16th and older, Texts: Non-fiction, That to study philosophy is to learn to die. death, Underlying Work: PD Worldwide, University of Toronto Libraries

…let us learn bravely to stand our ground, and fight him. And to begin to deprive him of the greatest advantage he has over us, let us take a way quite contrary to the common course. Let us disarm him of his novelty and strangeness, let us converse and be familiar with him, and have nothing so frequent in our thoughts as death. Upon all occasions represent him to our imagination in his every shape; at the stumbling of a horse, at the falling of a tile, at the least prick with a pin, let us presently consider, and say to ourselves, ‘Well, and what if it had been death itself?’ and, thereupon, let us encourage and fortify ourselves. Let us evermore, amidst our jollity and feasting, set the remembrance of our frail condition before our eyes, never suffering ourselves to be so far transported with our delights, but that we have some intervals of reflecting upon, and considering how many several ways this jollity of ours tends to death, and with how many dangers it threatens it. The Egyptians were wont to do after this manner, who in the height of their feasting and mirth, caused a dried skeleton […]

Prizewinning bid in ‘Inventare il Futuro’ Competition

- November 5, 2011 in Annotator, Bibliographic, Essays, Featured Project, Free Culture, Musings, News, OKF Projects, Open Shakespeare, Public Domain, Public Domain Works, texts, WG Humanities, WG Open Bibliographic Data

By James Harriman-Smith and Primavera De Filippi

On the 11th July, the Open Literature (now Open Humanities) mailing list got an email about a competition being run by the University of Bologna called ‘Inventare il Futuro’ or ‘Inventing the Future’. On the 28th October, having submitted an application on behalf of the OKF, we got an email saying that our idea had won us €3 500 of funding. Here’s how.

The Idea: Open Reading

The competition was looking for “innovative ideas involving new technologies which could contribute to improving the quality of civil and social life, helping to overcome problems linked to people’s lives.” Our proposal, entered into the ‘Cultural and Artistic Heritage’ category, proposed joining the OKF’s Public Domain Calculators and Annotator together, creating a site that allowed users more interaction with public domain texts, and those texts a greater status online. To quote from our finished application:
Combined, the annotator and the public domain calculators will power a website on which users will be able to find any public domain literary text in their jurisdiction, and either download it in a variety of formats or read it in the environment of the website. If they chose the latter option, readers will have the opportunity of searching, annotating and anthologising each text, creating their own personal response to their cultural literary heritage, which they can then share with others, both through the website and as an exportable text document.

As you can see, with thirty thousand Euros for the overall winner, we decided to think very big. The full text, including a roadmap, is available online. Many thanks to Jason Kitkat and Thomas Kandler, who gave up their time to proofread and suggest improvements.

The Winnings: Funding Improvements to OKF Services

The first step towards Open Reading was always to improve the two services it proposed marrying: the Annotator and the Public Domain Calculators. With this in mind we intend to use our winnings to help achieve the following goals, although more ideas are always welcome:
  • Offer bounties for flow charts regarding the public domain in as yet unexamined jurisdictions.
  • Contribute, perhaps, to the bounties already available for implementing flowcharts into code.
  • Offer mini-rewards for the identification and assessment of new metadata databases.
  • Modify the annotator store back-end to allow collections.
  • Make the importation and exportation of annotations easier.
Please don’t hesitate to get in touch if any of this is of interest. An Open Humanities Skype meeting will be held on 20th November 2011 at 3pm GMT.

Scaling the Open Data Ecosystem

- October 31, 2011 in Essays, Musings, News, Open Data, Open Knowledge

This is a post by Rufus Pollock, co-Founder of the Open Knowledge Foundation. As reported elsewhere I’ve been fortunate enough to have my Shuttleworth Fellowship renewed for the coming year so that I can continue and extend my work at the Open Knowledge Foundation on developing the open data ecosystem. The following text and video formed the main part of my renewal application.

Scaling the Open Data Ecosystem

Describe the world as it is.

Over the last several decades the world has seen an explosion of digital technologies which have the potential to transform the way knowledge is disseminated. This world is rapidly evolving and one of its more striking possibilities is the creation of an open data ecosystem in which information is freely used, extended and built on. The resulting open data ‘commons’ is valuable in and of itself, but also, and perhaps even more importantly, because of the social and commercial benefits it generates – whether in helping us to understand climate change, speeding the development of life-saving drugs, or improving governance and public services. In developing this open data ecosystem three key things are needed: material, tools and people. This is a key point: open information without tools and communities to utilise it is not enough. After all, openness isn’t an end in itself – open material has no value if it isn’t used. We need therefore to have widely available the capabilities for utilising open material, for processing, analysing and sharing it, especially on a large scale. Relevant tools need to be freely and openly available and the related infrastructure – after all, tools need somewhere to run, and data needs somewhere to be stored – should be capable of effective deployment by distributed communities. Over the last few years we’ve started to see increasing amounts of open material made available, with release of open data really starting to take off in the last couple of years. But the (open) tools and the communities to use them are still very limited – we’re just starting to see the first self-identified “data wranglers / data hackers / data scientists” (note how the terms have not settled yet!). Key architectural elements of the ecosystem, such as how we create and share data in an open componentized way, are only just beginning to be worked through.
We are therefore at a key moment where we transition from just ‘getting the data’ (and building the app) to a real data ecosystem in which data is transformed, shared and reintegrated and we replace a ‘data pipeline’ with ‘data cycles’.

What change do you want to make?

I want to see a world in which open data – data that can be freely shared and used without restriction – is ubiquitous and in which that data is used to improve the world around us, whether by finding you a better route to work, helping us to prevent climate change, or improving reportage. I want open data to allow us to build the tools and systems to help us navigate and manage the increasingly complex information-based world in which we now live. Specifically, I want to help grow the emerging open data ecosystem. While part of this involves supporting and expanding the ongoing release of material – building on the major progress of the last few years – the biggest change I want to make is to develop the tools and communities so that we can make effective use of the increasing amounts of open data that are now becoming available. Particular changes I want to make are:
  • Development of real ‘data cycles’ (especially for government data). By data cycles I mean a process whereby material is released, it’s used and improved by the community and then that work finds its way back to the data source.
  • Greater connection of open data to journalists and other types of reporters/analysts who can use this data and bring it to a wider audience.
  • Development of an active and globally-connected community of open data wranglers.
  • Development of better open tools and infrastructure for working with data, especially in a distributed community using a componentization approach that allows us to scale rapidly and efficiently.

What do you want to explore?

I’m interested in learning more about the actual and potential user communities for open data. I want to explore what they want – in relation to both tools and data – and also their awareness of what is already out there. I’m especially interested in areas like journalism, government, and the general civic hacker community. I want to explore the processes around ‘data refining’ – obtaining, cleaning and transforming source data into something more useful – and data ‘analysis’ (usually closely related tasks). I’m especially interested in existing business activity in this area, often labelled with headings like business intelligence and data warehousing. I want to see what we could learn from business regarding tools and process that could be used in the wider open data community as well as how the business community can take advantage of open data. I want to explore how we can connect together the distributed community of data wranglers and hacktivists, focusing on a specific area like civic information or finances. How do we allow for loose networks across different locations and different organisations while sharing information and collaborating on the development of tools? Lastly, I want to explore the tools and processes needed to support decentralised, collaborative, and componentised development of data. How can we build robust and scalable infrastructures? How can we build the technology to allow people to combine multiple sources of official data in a wiki-like manner – so that changes can be tracked, and provenance can be traced? How can we break down data into smaller manageable components, and then successfully recombine them again? How can we ‘package’ data and create knowledge APIs to enable automated distribution and reuse of datasets? How can we achieve real read/write status for official information – not just access alone?

What are you going to do to get there?

I want to focus my efforts in this next year on 3 key areas, breaking new ground but also building on existing work I’ve been doing with the Open Knowledge Foundation. First, I want to build out CKAN software and community from a registry to a data hub – a platform for working with data not just listing it. The last year has seen very significant uptake of CKAN with dozens of CKAN instances around the world including several official government and institutional deployments. Improving and expanding CKAN will allow us to capitalize on this success to make CKAN into an essential tool and platform for open data “development”. The most important aspect of the software side of this will be the development of a datastore component supporting the processing and visualization of data within CKAN. With features like these CKAN can become a valuable tool not just for tech-savvy data ‘geeks’ but for the more general users of data such as journalists and civil servants. Engaging this wider, “non-techy” audience is a key part of scaling up the ecosystem. It is important to emphasize that this won’t just be about developing software but is about understanding and engaging with a variety of data-user communities, exploring how they work, what they want and how they can be helped. Second, I want to build out the OpenSpending platform and community. OpenSpending is Where Does My Money Go globalized – a worldwide project to ‘map the money’. Following the successful launch of Where Does My Money Go last autumn in the UK, in the last 6 months we have dramatically expanded coverage with data now from more than 15 countries (in May our work on Italy received coverage in La Stampa, the Guardian and other major newspapers). Working with OpenSpending complements work on CKAN because it is a chance to act as a data user and refiner – we already have some integration with CKAN but it’s still very basic.
Furthermore, OpenSpending presents the chance to develop a specific data wrangler / data user community and one which can and should have close links with users and analysts of data including journalists and civic ‘hacker’ groups. In this way OpenSpending can act as a microcosm and prototype for developments in the wider open data community. Third, I want to develop the OKF Open Data Labs. Much like the “Google Labs” for Google’s web services, Mozilla Labs for the Web, and the “Sunlight Labs” for US transparency websites, I would like the “Open Data Labs” to be a place for coders and data wranglers to collaborate, experiment, share ideas and prototypes, and ultimately build a new generation of open source tools and services for working with open data. The labs would form a natural complement to my other activities with CKAN and OpenSpending – the Labs could build on material and tools from those projects while simultaneously acting as an incubator for new extensions and ideas useful both there and elsewhere.

Open Data: a means to an end, not an end in itself

- September 15, 2011 in Essays, Open Data, Open Knowledge

The following is a post by Rufus Pollock, co-Founder of the Open Knowledge Foundation. In almost all the talks I give about open data or content, I aim, at least once, to make a statement along the lines of: “Openness for data and content is not an end in itself, it’s a means to an end.” This, of course, begs the question: if open data is a means and not an end in itself, what are the real ends that we are seeking? The real ends are the improved creation, processing and use of information for the purpose of bettering our lives and the world around us – finding a better way to travel to work, understanding and addressing climate change, finding better ways to cure and prevent disease, deciding who to vote for. The list goes on and on because it includes almost anything where information, and more specifically digital information, is or could be important. Now, there are many things that contribute to us improving the “creation, processing and use of information” but the following are especially important (and interlink):
  1. Scalability — i.e. dealing with larger and larger amounts of information
  2. Improved tools, techniques and process for handling that information
  3. Wide access to the raw data and content
(I’d also add a fourth item: to create, process and use information in a collaborative, distributed and decentralized manner that puts ‘information power’ – the power to access, understand and utilize information – in the hands of the many rather than concentrating it in the hands of the few. However, I have left this out as it could be argued that this is not a requirement for improvement but an additional, and separate, desideratum.) It is at this point that openness enters: openness – both of data and of tools – is central to making rapid progress in each of these areas:
  1. Scalability: successful ‘data scaling’ requires componentization – the breaking up of material into maintainable chunks (components) that can be recombined. However, without openness componentization cannot function because the recombination of components will rapidly become impossible due to the need to check and clear rights with so many different sources of data (and incompatibilities between the conditions imposed by different sources).

  2. Tools, technique and process. Open data makes it much easier to develop and share tools, techniques and processes for working with data. Moreover, without open data the application of those tools can be severely limited.

  3. Wider access to the material: given the vast amount of material becoming available we’re going to want as many people as possible (and not just ‘professionals’) to be able to access, experiment with and redistribute that data as easily as possible. Remember the many minds principle: the best thing to do with your data will be thought of by someone else.

Summing Up

Open data, then, is a means to an end not an end in itself. Openness is important to the extent it helps us do something “useful” — not because it is valuable in and of itself. I think it’s important to emphasize this point because as the open data movement grows, we need to be clear that open data is not some magic potion that, on its own, will automatically solve problems. Fundamentally, to be useful data (open or otherwise) needs to be used: it needs individuals and institutions to analyze it and to act on that analysis, it needs companies and communities to build apps and services with it, and it needs tools and processes developed to facilitate doing those activities. This is not to underestimate the value of openness: as argued above, it is central to making significant progress in “doing useful stuff”, but we must also avoid the trap of confusing means with ends, and thereby neglecting the many other changes that are needed if open data is to deliver full value.

Forthcoming Series of Open Articles on Open Shakespeare

- September 5, 2011 in Essays, News, OKF Projects, Open Shakespeare, texts

This is a cross-posting from Open Shakespeare to announce the culmination of a project run over the summer to encourage greater participation in the website and greater awareness of its goals of promoting open critical commentary. From Monday 12th September to Monday 10th October, Open Shakespeare will host a series of articles on the topic of ‘Shakespeare and the Internet’. When we invited contributions, the theme was deliberately kept as broad as possible in order to facilitate a wide and diverse range of responses from each of those who have written a post for us. Our contributors range from teachers and students of Shakespeare to an experimental theatre company. Having already read the majority of the contributions, I can say now that the series fulfils its goal of offering what the Bard would call a “multitudinous” range of approaches to the topic of Shakespeare and the Internet; subjects range from why Polonius would appreciate hypertext to the problems and opportunities of online abundance. The contributions will appear in the following order: Every article in this series is published under a Creative Commons 3.0 SA BY licence; as with all the other material on Open Shakespeare, we hope that publication under such a licence will encourage the diffusion and development of our contributors’ ideas. My thanks to all those who have contributed their time and thoughts to this project, particularly Erin Weinberg, whose proof-reading skills have been extremely useful in the preparation of these pieces for publication. Depending on the success of this series, we intend to publish similar, themed posts under an open licence in the future: if you would like to participate as either a writer or an editor, please get in touch through the usual channels. Now, to conclude, I leave you, I hope, in approximately the same state of anticipation as Leonato leaves an impatient Claudio in Much Ado about Nothing:
till Monday [...] which is hence a just seven-night; and a time too brief too, to have all things answer my mind.

Sand dunes, civil society and legal structures in the cloud

- June 24, 2011 in Essays, Featured Project, Guest post, OKCon

The following guest post is by Charles Armstrong, social scientist, entrepreneur, and Founder of the One Click Orgs project, which the OKF supports. Charles will be joining us at OKCon2011 for his talk, One Click Orgs: simple democratic organisation.

Along the shoreline of the North Atlantic, marram grass plays a vital role in the coastal ecosystem. Its tough root networks enable it to stabilise shifting sand dunes and create conditions where a broader ecology can start to develop. It is tempting to see the role that clubs, associations and cooperatives play in society in a similar light. They forge durable networks of trust and shared interest which bind people together and help to stabilise the ever-shifting ferment of society, laying the ground for other kinds of collaboration to flourish.
Image: Marram grass. Photo: Kieran Campbell. CC-BY-SA.

Civil society organisations possess two qualities which are particularly useful from this perspective. First, they are formally constituted. A loose informal collaboration is likely to have a markedly different impact from the same set of people collaborating as a constituted organisation. The process of creating an organisation propagates a fixed consciousness of shared identity and purpose which subtly alters the perspective of members and the outside world. A constitution provides a set of rules and processes which structure the collaboration and increase the members’ ability to achieve their objectives. Forming an organisation gives birth to an enduring entity which may outlive every one of its original members. Second, civil society organisations are democratic. The vast majority are structured to be governed collectively by their members. In some cases decisions are voted on directly by all a group’s members. In other cases members elect officers or a representative committee to make day to day decisions. But either way a member who feels strongly about some aspect of what the organisation is doing has ways to control the choices that are made. The importance of these small-scale democracies is inestimable. Civil society organisations provide a setting where people can increase their literacy in democratic participation, focused on issues and decisions which directly concern them. There is no better nursery for active and effective involvement in the larger democracies of cities and nations. A thriving ecosystem of civil society organisations has been recognised as an essential part of a healthy society since at least the eighteenth century. But it is arguably more important now than in any previous period. Throughout the rich world states are scaling back social services at the same time as local resources such as schools, shops, post offices and pubs are disappearing.
In many cases the best hope of filling the gaps is for citizen groups to come together and form local organisations to take over provision of vital services. In the UK this approach is openly advocated by the government’s Big Society initiative. But the challenges facing communities trying to form effective organisations and draw in people to sustain them are formidable. At the same time the poor world is being swept by a quite different wave of change. A growing number of authoritarian regimes are waking up to find citizen groups organising themselves to press demands for democratic reform. Many of these societies have little or no civil society infrastructure. Turning the anger and idealism of citizen uprisings into sustainable democratic fabric requires an ecosystem of civil society organisations to spring up and take root a hundred times faster than happened in Western Europe or the USA. These circumstances throw a spotlight on the factors which inhibit the formation of civil society organisations or participation in them. Under this spotlight two barriers stand out in particular: the complexity of forming an organisation, and the effort required to sustain active involvement. On the first point, anyone contemplating setting up a new organisation is faced with a bewildering variety of different legal structures, for each of which a multitude of different constitutions is available. Founders typically go through a crash course to educate themselves on the arcana of corporate law, contract law and charity law. The result is fewer organisations being formed and too many organisations being created with an off-the-shelf structure which fits their needs poorly. On the second point, once an organisation is formed anyone who wants to be an active participant in shaping its direction must be prepared to attend meetings, study agendas and minutes, get their head around voting procedures and know how to table resolutions in the proper fashion.
Not surprisingly this discourages an awful lot of people from getting involved, even if they strongly support a group’s objectives. In the past little attention has been paid to these two barriers because, well, that’s just how it was. However, just at the moment a dramatic increase in new organisation formation and participation is needed most urgently, an innovation has appeared which makes it possible. The internet has already triggered revolutions in countless complex processes such as airline reservations, organising auctions and filing tax returns. Now organisational structures are being reinvented to take advantage of the internet’s capacity for instantaneous communication, automation and complex many-to-many interaction. Say hallo to the virtual organisation. From a legal perspective virtual organisations are no different from traditional ones. They maintain all the outside characteristics of established structures such as cooperatives, corporations or partnerships. But whilst the exterior form remains the same most of what lies under the skin is re-engineered. In a virtual organisation the constitution is welded onto an electronic system which automates all the logic governing membership, board appointments, voting and constitutional amendments. One result of this automation is that processes which previously required meetings or written resolutions can now happen online with the system taking care of all the bureaucratic tedium. The experience of participating in a virtual organisation resembles many familiar web activities. Founding an organisation is like creating an email group. Proposing a resolution is similar to posting on a forum. Voting is like responding to an online poll. Proposing an amendment to the constitution is no more complex than changing the moderation settings on a blog (triggering an automatic vote to authorise it). Different jurisdictions vary enormously in their openness to virtual organisations.
Currently two of the most favourable are the State of Vermont, which in 2008 reformed its corporation law with this specific objective, and the UK, whose 2006 Companies Act achieved the same end largely by accident. It is perhaps no coincidence that the two initiatives which have pioneered the development of virtual organisations are the Digital LLC project at Harvard University’s Berkman Centre, led by Oliver Goodenough – who happens to be Professor at Vermont Law School; and One Click Orgs, an open source project based in London (which I helped set up). Whilst the Harvard project is working to virtualise profit-making corporations One Click Orgs has focused on the opportunities for civil society. In March 2011 the project launched a website (oneclickorgs.com) where community groups can create an unincorporated association and the tools to run it, completely free of charge. If you’re interested you can go there, right now, and set up your own association. It’s a 100% Open Service so you’re also free to reuse and adapt the code and constitutions.
The One Click Orgs platform generates a constitution which most banks will accept to open a shared account, a simple voting system for group decisions, a record of resolutions the group has passed and an official list of members who have been voted in. One Click Orgs is also working with London Hackspace to create a virtualised platform for Companies Limited by Guarantee, the UK equivalent of a non-profit corporation. Finally it has just started scouting for funders to support development of a platform for Industrial and Provident Societies, recently rebranded as Community Benefit Societies. These are the ideal structures for community groups to take on the running of local facilities and services such as post offices, shops and village halls.
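To make the idea of automated governance a little more concrete, here is a minimal sketch in Python of the kind of logic such a platform might encode: a member list, a vote, and a record of passed resolutions. It is an illustration only and not One Click Orgs code; the class names, the simple-majority rule and the sample members are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """A resolution put to the members (hypothetical structure)."""
    title: str
    votes: dict = field(default_factory=dict)  # member name -> True (for) / False (against)


@dataclass
class Association:
    """A toy association whose voting rules are enforced by code."""
    name: str
    members: set = field(default_factory=set)
    pass_threshold: float = 0.5  # assumed rule: more than half of the votes cast
    resolutions: list = field(default_factory=list)

    def vote(self, proposal, member, in_favour):
        # Only people on the official member list may vote.
        if member not in self.members:
            raise PermissionError(f"{member} is not a member of {self.name}")
        proposal.votes[member] = in_favour

    def close(self, proposal):
        # Count the votes and record the resolution if it passed.
        cast = len(proposal.votes)
        in_favour = sum(proposal.votes.values())
        passed = cast > 0 and in_favour / cast > self.pass_threshold
        if passed:
            self.resolutions.append(proposal.title)
        return passed


club = Association("Village Hall Group", members={"ana", "ben", "cy"})
proposal = Proposal("Open a shared bank account")
club.vote(proposal, "ana", True)
club.vote(proposal, "ben", True)
club.vote(proposal, "cy", False)
passed = club.close(proposal)
print(passed, club.resolutions)
```

In a real platform the threshold and the voting period would come from the constitution itself, so that amending the constitution changes how the system behaves.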
From an open knowledge perspective virtual organisations are significant because they represent a radical increase in transparency around an organisation’s constitution and governance processes. Few members of traditional community associations or non-profits ever actually read their group’s constitution or think about how it could be improved. With a virtual organisation everything is visible and malleable, helping members participate effectively and increasing the chance the organisation will be able to evolve over time and continue to advance its objectives. Virtual organisations have all kinds of other interesting implications. The time is not far away when email groups and World of Warcraft guilds will be able to incorporate themselves, sell shares and enter into contracts with each other. Approaches to investment are also liable to change dramatically as a fusion of virtual organisations and crowdfunding offers the capacity to tie investment to direct participation in decisions, potentially scaling to millions of shareholders. Eventually the governance of nation states is bound to be affected by these new possibilities, which I have explored a little in my writings on Emergent Democracy. But these speculations belong in a different article. In the meantime I would urge anyone working to increase civil society capacity to start looking for ways to enable virtual organisations in the part of the world where they’re active. There is still a lot of work to do to update legal frameworks in more conservative jurisdictions (California is notably backward). But if that leads to a doubling or tripling of the rate at which new organisations are created, and a similar increase in levels of citizen participation, the effort will seem trivial in comparison to the benefits for society. It is as if, just at the moment one’s village is about to be swamped by sand dunes, a new variety of marram grass is discovered which grows twenty times faster than the old stock. 
It’s time to get planting.

Building the (Open) Data Ecosystem

- March 31, 2011 in Essays, Ideas, Musings, Open Data, Open Knowledge

The following is a post by Rufus Pollock, co-Founder of the Open Knowledge Foundation.

The Present: A One-Way Street

At the current time, the basic model for data processing is a “one way street”. Sources of data, such as government, publish data out into the world, where (if we are lucky) it is processed by intermediaries such as app creators or analysts, before finally being consumed by end users [1]. It is a one way street because there is no feedback loop, no sharing of data back to publishers and no sharing between intermediaries. So what should be different?

[Figure: Data Ecosystem - Current]

The Future: An Ecosystem

What we should have is an ecosystem. In an ecosystem there are data cycles: infomediaries – intermediate consumers of data such as builders of apps and data wranglers – should also be publishers who share back their cleaned / integrated / packaged data into the ecosystem in a reusable way – these cleaned and integrated datasets being, of course, often more valuable than the original source.

[Figure: Data Ecosystem - Future]

In addition, corrected data, or relevant “patches”, should find their way back to data producers so data quality improves at the source. Finally, end users of data need not be passive consumers but should also be able to contribute back – flagging errors, or submitting corrections themselves. With the introduction of data cycles we have a real ecosystem not a one way street and this ecosystem thrives on collaboration, componentization and open data. What is required to develop this ecosystem model rather than a one way street? Key changes include (suggestions welcome!):
  • Infomediaries to publish what they produce (and tools to make this really easy)
  • Data packaging and patching format (better ways to publish and share data)
  • Publisher notification of patches (pull requests) with automated integration (merge) tools
We’re starting to see the first versions of tools that will help here [2]: for example Google Refine with its JavaScript “patch” format, Scraperwiki and our own CKAN Data Management System. And for some items, such as increased sharing and publication, evangelism and a change in attitude (to a world in which the default is to publish) may be more important than tools – though even here better tools can make it easier to publish as well as provide incentives to do so (because publishing gives you access to those tools). But it is just a beginning and there’s still a good way to go before we’ve really made the transition from the one-way street to a proper (open) data ecosystem.

Annexe: Some Illustrations of Our Current One Way Street

Currently it’s common to hear people describe web-apps or visualizations that have been built using some particular dataset. However, it’s unusual to hear them then say “and I published the cleaned data, and the data cleaning code, back to the community in a way that was reusable” and even rarer to hear them say “and the upstream provider has corrected the errors we found in the data based on our reports”. (It’s also rare to hear people talk about the datasets they’ve created as opposed to the apps or visualizations they’ve built.) We know about this first hand. When the UK government first published its 25k spending data last Autumn we worked hard as part of our Where Does My Money Go? project to process and load the data so it was searchable and visualizable. Along the way we found data ‘bugs’ of the kind that are typical in this area – dates presented as 47653 (time since epoch), dates presented in inconsistent styles (US format versus UK format), occasional typos in names of departments or other entities etc. We did our best to correct these as part of the load. However, it’s doubtful any of the issues we found got fixed (and certainly not as a result of our work) and we also didn’t do much to share with other data wranglers who were working on the data. Why was this? First, there is no mechanism to feed back to the publishers. We did notify data.gov.uk of some of the issues we found, but it is very hard for them to act on this: the precise ‘publisher’ within a department may be hard to identify and may even be a machine (if the data is automatically produced in some way). Second, there is no easy format in which to share fixes. Our cleaning code was public on the web but a bunch of Python if statements is not the best ‘patch’ format for data.
In a perfect world we’d have a patch format for data (or even just for CSVs) – one that was algorithmic, not line-based (so one could specify in one statement that column X is wrongly formatted in this way, rather than have 10k line changes); that was easily reviewable (we’re patching government spending data here!); and that was automatically applicable (in short, a patch format with tool support).
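As a rough illustration of what an algorithmic, column-level patch could look like, here is a small Python sketch. Everything in it is hypothetical: the rule vocabulary (`normalise_date`, `replace`), the column names and the sample rows are invented for the example, and it assumes that numeric dates are spreadsheet-style day counts from 1899-12-30, which may not match the actual spending data.

```python
import csv
import io
from datetime import date, datetime, timedelta

# Assumption: serial dates count days from 1899-12-30, as many spreadsheet exports do.
SERIAL_EPOCH = date(1899, 12, 30)


def fix_date(value, day_first=True):
    """Return an ISO 8601 date string, or the input unchanged if it is not recognised."""
    value = value.strip()
    if value.isdigit():
        return (SERIAL_EPOCH + timedelta(days=int(value))).isoformat()
    fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    try:
        return datetime.strptime(value, fmt).date().isoformat()
    except ValueError:
        return value


# A hypothetical patch: each rule describes a whole column, not individual lines.
PATCH = [
    {"column": "date", "op": "normalise_date", "day_first": True},
    {"column": "department", "op": "replace",
     "from": "Deprtment of Health", "to": "Department of Health"},
]


def apply_patch(rows, patch):
    """Apply every rule to every row and return new rows; the input is left untouched."""
    patched_rows = []
    for row in rows:
        row = dict(row)
        for rule in patch:
            column, op = rule["column"], rule["op"]
            if op == "normalise_date":
                row[column] = fix_date(row[column], rule.get("day_first", True))
            elif op == "replace":
                if row[column] == rule["from"]:
                    row[column] = rule["to"]
            else:
                raise ValueError(f"unknown patch operation: {op}")
        patched_rows.append(row)
    return patched_rows


SAMPLE = """date,department,amount
40000,Deprtment of Health,25000.00
06/07/2009,Department of Health,31000.50
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
patched = apply_patch(rows, PATCH)
for row in patched:
    print(row)
```

Because the patch is plain data rather than code, a publisher could review two short rules instead of thousands of changed lines, and the same rules could be re-applied automatically to the next release of the file.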

  1. I’m inevitably simplifying here. For example, there is of course some direct consumption by end users. There is some sharing between systems etc. 
  2. I discussed some of the work around data revisioning in this previous post: We Need Distributed Revision/Version Control for Data