
OKCon 2013 Guest Post: Open Data Portal on Land Rights

- July 30, 2013 in Events, External, OKCon, Open Data

Cross-posted from the OKCon Blog. Introducing a series of guest posts by OKCon 2013 speakers that we will publish over the coming weeks. This first post is by Laura Meggiolaro, Land Portal Coordinator, International Land Coalition, who will be speaking on the main stage during the Open Development and Sustainability session on Wednesday 18th September at 10:15.
There is a wealth of information and data online about land governance. However, much of this content is fragmented and difficult to locate, and often it is not openly licensed to enable wide dissemination and reuse. Bringing this information together in one place, actively addressing gaps in the available information, and providing a range of ways for the information to be accessed and shared increases use of the available information. This supports more informed debates and policy making, and greater adoption and up-scaling of best practices and promising innovations, leading to improved land governance practice. Through a focus on localisation of content creation and use, the Land Portal aims to tip the balance of power towards the most marginalised and insecure, promoting greater social justice in land tenure practices across the world.

Access to knowledge is essential for individuals and communities seeking to secure land rights, particularly for women. Stronger networks between government agencies, CSOs, and emerging social movements are needed to support more just, equitable and gender-aware land governance. Over recent decades land governance groups have come to use the Internet in their practice, but its full potential is by no means realised. The Land Portal can support land advocacy and governance, drawing on learning from current practice and highlighting emerging frontiers of relevance to the field. A recent online dialogue focused on monitoring women's land rights in Madagascar demonstrated that the Land Portal, as a platform for open content and open data, offers a collaborative approach to land governance.
As Madagascar has recently been debating its new progressive tenure reform, it provides an interesting case study of how internet-based tools such as the Land Portal give the opportunity – provided the basic infrastructure is available and those accessing it have functional literacy skills – to enhance participation and allow for a diversity of insights and perspectives on questions like "is land reform in Madagascar a model for replication?" or "how may legal pluralism restrict or promote women's access to land?". Over the last years we have found that online discussions, in particular, are an effective means of promoting inclusion, knowledge sharing and social change.

The discussion on "land reform" had the objective of involving civil society in debating experiences of land reform implementation and which key lessons could be transferred to other countries. The more recent discussion provides an interesting insight into how women's access to land might be affected by legal pluralism. Insights from Malagasy people and land experts in the region aimed at revising and improving data in the FAO Gender and Land Rights Database (GLRD).

The Land Portal is based on open source, open data and open content, and applies principles of openness in its governance, its use of technology and its outputs. Through the pursuit of more transparent and open information on land governance, the Portal seeks to become a leading example of open development in action. However, the Land Portal does not adopt openness uncritically; instead it focuses in particular on identifying where openness can help tip the balance of power in favour of the marginalised, rather than where openness could 'empower the already empowered' (1).
The Land Portal seeks to ensure that a diversity of knowledge is included and represented, and that those best placed to act in the interests of people with the most insecure land rights and the greatest vulnerability to landlessness have effective access to the open data and knowledge that is made available. Besides documenting land rights, the Portal also encourages social information exchange, debate and networking. It aims to become the leading online destination for information, resources, innovations and networking on land issues; to support more inclusive and informed debate and action on land governance; and to increase adoption and up-scaling of best practices and emerging innovations on land tenure. The Land Portal is a partnership project supported by a network of international and grassroots land organisations focused on land governance, development and social justice. Its innovative approach to engaging stakeholders on the highly complex issue of land governance ensures that the Portal is coordinated, managed and populated with content by the stakeholders and users who are actively involved with land from far and wide.
(1) Gurstein, M. (2011). "Open data: Empowering the empowered or effective data use for everyone?" First Monday, 16(2).
With almost 10 years' work experience in the land governance sector, collaborating with both UN agencies and civil society organisations on information and knowledge management, partnership building and communication for development, Laura is strongly committed to social change and to improving the living conditions of disadvantaged groups within societies, focusing in particular on gender dynamics.
Since being assigned overall coordination of the Land Portal in 2012, she has led an in-depth project self-assessment and promoted a major re-development of the Portal: to better address its main target audiences, to respond to ever-evolving technological innovations and opportunities for better quality and reach, and to increasingly make the Portal a hub for Open Data and a clear example of open development in action, contributing to open land governance information and knowledge in order to increase transparency on land-related issues.

Building the foundation for an Open Data Directory

- April 24, 2013 in External, Open Data

Open (Government) Data as it is understood nowadays can still be considered a new concept. It started to gain traction worldwide after the Obama memo in early 2009 and the launch of data.gov a few months later. Following the successful leading examples of the US and UK governments, we have seen Open Data flourish all over the world over the last three years: about three hundred Open Data catalogues have been identified so far. But still, it's not always clear how to deliver good solutions, and many questions remain unanswered. In order to build sustainable Open Data initiatives in a varied range of countries, a broader view to addressing challenges is needed. New and existing initiatives will benefit from shared knowledge, and will also produce a range of resources that should be published freely and openly for others to reuse. As the Open Data movement grows worldwide, the number of available resources is also increasing. The scarcity of only 3-4 years ago is ending, but the resources are appearing in disparate places and formats, sometimes difficult to find and share. There is a pressing need to compile and document existing resources that are verified, trustworthy, comparable, and searchable.

The Open Data Directory

Following discussions with many in the Open Data community, an initial analysis of their own project needs and preliminary research on existing public resources, the Web Foundation believes that the community at large would benefit from a central entry point to Open Data related resources at a neutral source: the Open Data Directory (ODD). The ODD will help to produce a clear evidence base of the benefits of Open Data, holding a wide range of resource types such as: use cases, case studies, stories and anecdotes, methodologies, strategies, business cases, papers, reports, articles, blog posts, training materials, slide sets, software tools, applications and visualisations.
The directory will not focus on compiling a vast number of references; instead it will give priority to high-quality references endorsed by the Open Data community. As a first step towards the ODD, we are making the Use Cases and Requirements Draft public in order to get comments from the wider community, not only on the content of the document itself but also on the overall idea of the ODD. We've published it as a Google Document with comments turned on. This is a tool for you, the Open Data community, so suggestions, feedback and comments are very welcome. The deadline for submitting comments is April 29th, 2013.

The new PSI Directive – as good as it seems?

- April 19, 2013 in Access to Information, External, Open Data

A closer look at the new PSI Directive by Ton Zijlstra and Katleen Janssen. EPP image by European People's Party, CC-BY-2.0, via Wikimedia Commons.

On 10 April, the European Commission's Vice-President Neelie Kroes, responsible for the Digital Agenda for Europe, announced that the European Union (EU) Member States have approved a text for the new PSI Directive. The PSI Directive governs the re-use of public sector information, otherwise known as Open Government Data. In this post we take a closer look at the progress the EC press release claims, and make a comparison with the current PSI Directive. We base this comparison on the (not officially published) text that came out of the final trialogue of 25 March and was apparently accepted by the Member States last week.

The final step now, after this acceptance by the Member States, is the adoption of the same text by the European Parliament, which has been part of the trialogue and is thus likely to be in agreement. The vote in the ITRE Committee is planned for 25 April, and the plenary Parliament vote for 11 June. Member States will then have 24 months to transpose the new directive into national law, which means it should be in force across the EU towards the end of 2015.

The Open Data yardstick

The existing PSI Directive was adopted in 2003, well before the emergence of the Open Data movement, and was written mostly with 'traditional', existing re-users of government information in mind. Within the wider Open Data community, the new PSI Directive will largely be judged by a) how far it moves towards embracing Open Data as the norm, in the sense of the Open Definition, and b) to what extent it makes this mandatory for EU Member States. This means that scope and access rights (and redress options where those rights are denied), charging and licensing practices, and standards and formats are of interest here. We will go through these points one by one:

Access rights and scope

  • The new PSI Directive brings museums, libraries and archives within its scope; however a range of exceptions and less strict rules apply to these data holders;
  • The Directive builds, as before, on existing national legislation concerning freedom of information and privacy and data protection. This means it only looks at re-use in the context of what is already legally public, and it does not make pro-active publishing mandatory in any way;
  • The general principle for re-use has been revised. Where the old directive describes cases where re-use has been allowed (making it dependent on that approval and thus leaving the choice to the Member States or the public bodies), the new directive says all documents within scope (i.e. legally public) shall be re-usable for commercial or non-commercial purposes. This is the source of the statement by Commissioner Neelie Kroes that a “genuine right to re-use public information, not present in the original 2003 Directive” has been created. For documents of museums, libraries, and archives the old rule applies: re-use needs to be allowed first (except for cultural resources that are opened up after exclusive agreements for their digitisation have ended – see below).

Asking for documents to re-use, and redress mechanisms if denied

  • The way in which citizens can ask to be provided with documents for re-use, or the way government bodies can respond, has not changed;
  • The redress mechanisms available to citizens have been specified in slightly more detail. The directive specifies that one of the avenues of redress should be an "impartial review body with appropriate expertise" that is "swift" and has binding authority, "such as the national competition authority, the national access to documents authority or the national judicial authority". Although more specific than before, this is not the creation of the specific, speedy and independent redress procedure many had hoped for.

Charging practices

  • When charges apply, they shall be limited to the “marginal costs of reproduction, provision and dissemination”, which is left open to interpretation. Marginal costing is an important principle, as in the case of digital material it would normally mean no charges apply;
  • The PSI Directive leaves room for exceptions to the stated norm of marginal costing, for public sector bodies who are required to generate revenue and for specifically excepted documents: firstly, they rely once more on the concept of the public task, which in the previous version of the directive has raised so much discussion; secondly, a distinction is made between institutions that have to generate revenue to cover a substantial part of all their costs and those that may generally be fully-funded by the State (except for particular datasets of which the collection, production, reproduction and dissemination has to be covered for a substantial part by revenue). Could this be a way to cover economic or even commercial activities, by defining them as a ‘public task’, thereby avoiding the non-discrimination rules requiring equal treatment of possible competitors?
  • The exceptions remain bound to an upper limit, that of the old PSI directive for the exceptions relating to institutions having to generate revenue. For cultural institutions, the upper limit of the total income includes the costs of collection, production, preservation and rights clearance, reproduction and dissemination, together with a reasonable return on investment;
  • How costs are structured and determined, and used to motivate standard charges, needs to be pre-established and published. In the case of the mentioned exceptions, charges and criteria applied need to be pre-established and published, with the calculation used being made transparent on request (as was the general rule before);
  • This requirement for standard charges to be fully transparent up-front, meaning before any request for re-use is submitted, might prove to have an interesting impact: it is unlikely that public sector bodies will go through establishing marginal costs and the underlying calculations for every data set they hold, and where they have not, charges can no longer be applied, since they have not been pre-established, motivated and published.

Licensing

  • The new PSI Directive contains no changes concerning licensing, so no explicit move towards open licenses;
  • Where Member States attach conditions to re-use, a standard license should be available, and public sector bodies should be encouraged to use it;
  • Conditions to re-use should not unnecessarily restrict re-use, nor restrict competition;
  • The Commission is asked to assist the Member States by creating guidelines, particularly relating to licensing.

Non-discrimination and Exclusive agreements

  • The existing rules ensuring non-discrimination in how conditions for re-use are applied, including for commercial activities by the public sector itself, are continued;
  • As before, exclusive arrangements are not allowed, except for ensuring public interest services or for digitisation projects by museums, libraries and archives. For the former, reviews are mandated every 3 years; for the latter, after 10 years and then every 7 years. However, it is only the duration that must be reviewed, not the arrangement's existence itself. In return for the exclusivity, the public body must receive a free copy of the cultural resources, which must be available for re-use once the exclusive agreement ends. Here, the cultural institutions no longer have a choice over whether to allow re-use, but it may be several years before a resource actually becomes available.

Formats and standards

  • Open standards and machine readable formats should be used for both documents and their metadata, where easily possible, but otherwise any pre-existing format and language is acceptable.
In summary, the new PSI Directive does not seem to take the bold steps the open data movement has been clamouring for over the past five years. At the same time, real progress has been made. Member States with a constructive approach will feel encouraged to do more, and the required transparency in charging may dissuade public sector bodies from applying charges at all. But the new PSI Directive will not serve as a tool for citizens aiming for openness by default and by design. Even with the new redress mechanisms, getting your rights acknowledged and acted upon will remain as long and arduous a path as before. It will be interesting to see the European Parliament, as a representative body, debate this in plenary.

About the authors

Katleen Janssen is a postdoctoral researcher in information law at the Interdisciplinary Centre for Law and ICT of KU Leuven, coordinator of the LAPSI 2.0 thematic network, and a board member of OKFN Belgium. She specialises in re-use of PSI, open data, access to information and spatial data infrastructures, and is currently working on open data licensing for the Flemish Region.

Ton Zijlstra has been involved in open government data since 2008. He works with local, national and international public sector bodies to help them 'do open data well', both as an activist and as a consultant. Ton wrote the first plans for the Dutch national data portal, did a stint as project lead for the European Commission, and is now a partner at The Green Land, a Netherlands-based open data consultancy. He is a regular keynote speaker on open data, open government, and disruptive change and complexity across Europe.

Boundless Learning demands a jury trial

- February 15, 2013 in External, Open Content, Open Textbooks

We've been following the case of Boundless Learning on the OKF blog (see here and here), in which the world's most prominent producer of Open Access textbooks online is being sued by the world's biggest producers of physical, copyrighted textbooks. In the latest twist to the tale, Boundless have filed their answer, requesting a trial by jury.

The publishers who are pursuing Boundless – Pearson, Cengage and Macmillan's Bedford, Freeman & Worth – do not allege that any of their content has been plagiarised, or claim copyright on any of the facts or ideas in their books (since it is impossible to claim copyright on such things). Instead they allege that the "selection, coordination and arrangement" of these unprotectable elements has been pilfered. Boundless counter that following the same basic order in textbooks "is necessitated by the subject matter and standard in these fields" – a claim which they believe will be borne out through the trial over the coming months. In their press release they say:
At a time when textbook prices have risen at three times the rate of inflation, Boundless is well along the way to turning around this escalation by offering equivalent quality, openly-licensed educational materials online at dramatically lower costs … Boundless will vigorously deny the overly broad and legally flawed allegations made by the publishers … Boundless is confident that it will become evident that its digital textbooks do not violate copyright or any other rights of the plaintiffs.
Boundless have been at the forefront of challenging the oligopoly of the big textbook publishers, and the outcome of this case will have implications for everyone in the sector. Boundless seem confident that a jury of peers will agree that their efforts are a development in the right direction. The rapidly-expanding world of Open Online Education is watching with bated breath.

Sovereign Credit Risk: An Open Database

- January 31, 2013 in External, Featured, Open Data, Open Economics, WG Economics

This blog post is cross-posted from the Open Economics Blog. Sign up to the Open Economics mailing list for regular updates.

Throughout the Eurozone, credit rating agencies have been under attack for their lack of transparency and for their pro-cyclical sovereign rating actions. In the humble belief that the crowd can outperform the credit rating oracles, we are introducing an open database of historical sovereign risk data. It is available online, where community members can both view and edit the data. Once the quality of this data is sufficient, the data set can be used to create unbiased, transparent models of sovereign credit risk. The database contains central government revenue, expenditure, public debt and interest costs from the 19th century through 2011 – along with crisis indicators taken from Reinhart and Rogoff's public database.

Why This Database?

Prior to the appearance of This Time is Different, discussions of sovereign credit more often revolved around political and trade-related factors. Reinhart and Rogoff have more appropriately focused the discussion on debt sustainability. As with individual and corporate debt, government debt becomes more risky as a government's debt burden increases. While intuitively obvious, this truth too often gets lost among the multitude of criteria listed by rating agencies and within the politically charged fiscal policy debate.

In addition to emphasizing the importance of debt sustainability, Reinhart and Rogoff showed the virtues of considering a longer history of sovereign debt crises. As they state in their preface: "Above all, our emphasis is on looking at long spans of history to catch sight of 'rare' events that are all too often forgotten, although they turn out to be far more common and similar than people seem to think. Indeed, analysts, policy makers, and even academic economists have an unfortunate tendency to view recent experience through the narrow window opened by standard data sets, typically based on a narrow range of experience in terms of countries and time periods. A large fraction of the academic and policy literature on debt and default draws conclusions on data collected since 1980, in no small part because such data are the most readily accessible. This approach would be fine except for the fact that financial crises have much longer cycles, and a data set that covers twenty-five years simply cannot give one an adequate perspective…"

Reinhart and Rogoff greatly advanced what had been an innumerate conversation about public debt by compiling, analyzing and promulgating a database containing a long time series of sovereign data. Their metric for analyzing debt sustainability – the ratio of general government debt to GDP – has now become a central focus of analysis. We see this as a mixed blessing.
While the general government debt to GDP ratio properly relates sovereign debt to the ability of the underlying economy to support it, the metric has three important limitations.

First, the use of a general government indicator can be misleading. General government debt refers to the aggregate borrowing of the sovereign and the country's state, provincial and local governments. If a highly indebted local government – like Jefferson County, Alabama, USA – can default without being bailed out by the central government, it is hard to see why that local issuer's debt should be included in the numerator of a sovereign risk metric. A counter to this argument is that the United States is almost unique in not guaranteeing sub-sovereign debts. But clearly neither the rating agencies nor the market believe that such guarantees are ironclad: otherwise all sub-sovereign debt would carry the sovereign rating and there would be no spread between sovereign and sub-sovereign bonds – other than perhaps a small differential to accommodate liquidity concerns and transaction costs.

Second, governments vary in their ability to harvest tax revenue from their economic base. For example, the Greek and US governments are less capable of realizing revenue from a given amount of economic activity than a Scandinavian sovereign. Widespread tax evasion (as in Greece) or political barriers to tax increases (as in the US) can limit a government's ability to raise revenue. Thus, government revenue may be a better metric than GDP for gauging a sovereign's ability to service its debt.

Finally, the stock of debt is not the best measure of its burden. Countries that face comparatively low interest rates can sustain higher levels of debt. For example, the United Kingdom avoided default despite a debt/GDP ratio of roughly 250% at the end of World War II. The amount of interest a sovereign must pay on its debt each year may thus be a better indicator of debt burden.
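To make the contrast concrete, here is a quick sketch with purely illustrative, made-up figures (not actual sovereign data) showing how the two metrics can rank the same pair of sovereigns in opposite order:

```python
# Purely illustrative figures for two hypothetical sovereigns,
# with debt, GDP, interest and revenue in the same currency units.
sovereigns = {
    "A": {"debt": 250.0, "gdp": 100.0, "interest": 3.0, "revenue": 30.0},
    "B": {"debt": 80.0, "gdp": 100.0, "interest": 6.0, "revenue": 20.0},
}

for name, s in sovereigns.items():
    debt_to_gdp = s["debt"] / s["gdp"]                  # the conventional metric
    interest_to_revenue = s["interest"] / s["revenue"]  # the burden-based metric
    print(name, debt_to_gdp, round(interest_to_revenue, 2))
```

Sovereign A looks far riskier by debt/GDP (2.5 against 0.8), yet B carries the heavier burden: interest consumes 30% of its revenue against A's 10%. This reversal is exactly why the database layers interest and revenue data on top of the debt stock.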
Our new database attempts to address these concerns by layering central government revenue, expenditure and interest data on top of the statistics Reinhart and Rogoff previously published.

A Public Resource Requiring Public Input

Unlike many financial data sets, this compilation is being offered free of charge and without a registration requirement. It is offered in the hope that it, too, will advance our understanding of sovereign credit risk. The database contains a large number of data points and we have made efforts to quality-control the information. That said, there are substantial gaps, inconsistencies and inaccuracies in the data we are publishing. Our goal in releasing the database is to encourage a mass collaboration process directed at enhancing the information. Just as Wikipedia articles asymptotically approach perfection through participation by the crowd, we hope that this database can be cleansed by its user community. There are tens of thousands of economists, historians, fiscal researchers and concerned citizens around the world who are capable of improving this data, and we hope that they will find us.

To encourage participation, we have added Wiki-style capabilities to the user interface. Users who wish to make changes can log in with an OpenID and edit individual data points. They can also enter comments to explain their changes. User changes are stored in an audit trail, which moderators will periodically review – accepting only those that can be verified while rolling back others. This design leverages the trigger functionality of MySQL to build a database audit trail that moderators can view and edit. We have thus married the collaborative strengths of a Wiki to the structure of a relational database. Maintaining a consistent structure is crucial for a dataset like this because it must ultimately be analyzed by a statistical tool such as R. This unique approach to editing database fields Wiki-style was developed by my colleague, Vadim Ivlev. Vadim will contribute the underlying Python, JavaScript and MySQL code to a public GitHub repository in a few days.
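The trigger-based audit trail can be sketched in miniature. This sketch uses Python's built-in sqlite3 rather than MySQL, and the table and column names are hypothetical, but the idea is the same: a trigger copies the old and new values into an audit table on every edit, so a moderator can later verify or roll back the change:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_points (
    id INTEGER PRIMARY KEY,
    country TEXT, year INTEGER, revenue REAL
);
CREATE TABLE audit_trail (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    point_id INTEGER, old_revenue REAL, new_revenue REAL,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- The trigger records every edit so moderators can review and roll back.
CREATE TRIGGER log_edit AFTER UPDATE OF revenue ON data_points
BEGIN
    INSERT INTO audit_trail (point_id, old_revenue, new_revenue)
    VALUES (OLD.id, OLD.revenue, NEW.revenue);
END;
""")

conn.execute("INSERT INTO data_points VALUES (1, 'United Kingdom', 1946, 3.9)")
conn.execute("UPDATE data_points SET revenue = 4.1 WHERE id = 1")  # a user edit
row = conn.execute("SELECT old_revenue, new_revenue FROM audit_trail").fetchone()
# row is (3.9, 4.1): the original value survives for moderator review
```

A moderator rolling back the edit would simply write `old_revenue` back into `data_points`, which the trigger would itself record as a further audited change.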

Implications for Sovereign Ratings

Once the dataset reaches an acceptable quality level, it can be used to support logit or probit analysis of sovereign defaults. Our belief – based on case study evidence at the sovereign level and statistical modeling of US sub-sovereigns – is that the ratio of interest expense to revenue and annual revenue change are statistically significant predictors of default. We await confirmation or refutation of this thesis from the data set. If statistically significant indicators are found, it will be possible to build a predictive model of sovereign default that could be hosted by our partners at Wikirating. The result, we hope, will be a credible, transparent and collaborative alternative to the credit ratings status quo.
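Since the thesis above is a concrete statistical claim, a minimal sketch may help show its shape. The following fits a logit model by plain gradient descent on synthetic observations; the data-generating rule and every number here are invented for illustration, not drawn from the actual database:

```python
import math
import random

def fit_logit(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights (w[0] is the intercept) by gradient descent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted default probability
            grad[0] += p - yi
            for j in range(k):
                grad[j + 1] += (p - yi) * xi[j]
        for j in range(k + 1):
            w[j] -= lr * grad[j] / n
    return w

def predict(w, xi):
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic observations: [interest expense / revenue, annual revenue change].
# The rule below is invented: defaults occur when a high interest burden
# coincides with shrinking revenue.
random.seed(42)
X, y = [], []
for _ in range(200):
    ratio = random.uniform(0.0, 0.5)
    rev_change = random.uniform(-0.2, 0.2)
    X.append([ratio, rev_change])
    y.append(1 if ratio - rev_change > 0.3 else 0)

w = fit_logit(X, y)
```

On data generated this way, the fitted coefficient on the interest/revenue ratio comes out positive and the coefficient on revenue change negative, matching the thesis; with the real dataset one would instead use an established package and test significance properly.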

Sources and Acknowledgements

Aside from the data set provided by Reinhart and Rogoff, we also relied heavily upon the Center for Financial Stability’s Historical Financial Statistics. The goal of HFS is “to be a source of comprehensive, authoritative, easy-to-use macroeconomic data stretching back several centuries.” This ambitious effort includes data on exchange rates, prices, interest rates, national income accounts and population in addition to government finance statistics. Kurt Schuler, the project leader for HFS, generously offered numerous suggestions about data sources as well as connections to other researchers who gave us advice. Other key international data sources used in compiling the database were:
  • International Monetary Fund’s Government Finance Statistics
  • Eurostat
  • UN Statistical Yearbook
  • League of Nations' Statistical Yearbook
  • B. R. Mitchell’s International Historical Statistics, Various Editions, London: Palgrave Macmillan.
  • Almanach de Gotha
  • The Statesman’s Year Book
  • Corporation of Foreign Bondholders Annual Reports
  • Statistical Abstract for the Principal and Other Foreign Countries
  • For several countries, we were able to obtain nation-specific time series from finance ministry or national statistical service websites.
We would also like to thank Dr. John Gerring of Boston University and Co-Director of the CLIO World Tables project, for sharing data and providing further leads as well as Dr. Joshua Greene, author of Public Finance: An International Perspective, for alerting us to the IMF Library in Washington, DC. A number of researchers and developers played valuable roles in compiling the data and placing it on line. We would especially like to thank Charles Tian, T. Wayne Pugh, Amir Muhammed, Anshul Gupta and Vadim Ivlev, as well as Karthick Palaniappan and his colleagues at H-Garb Informatix in Chennai, India for their contributions. Finally, we would like to thank the National University of Singapore’s Risk Management Institute for the generous grant that made this work possible.

One Graduate Student’s Commitment to Open Knowledge

- January 25, 2013 in External, Open Access

The following post was originally published on Alex Leavitt's website.

Fed Up

I have been a PhD student for less than two years. On the other hand, for six years I have been a member of the free culture movement, which emphasizes the importance of access to and openness of technology and information. Recently, I've been frustrated… sad… angry. Just over a year ago, a friend and fellow member of the free culture community, Ilya Zhitomirskiy, committed suicide. He was 22. Just one week ago, an acquaintance – a friend of many close friends and, really, a role model, just one year older than myself and networked with many institutions and individuals I have come to work with and/or admire* – committed suicide. Aaron Swartz was admired for his bravery in standing up for his ideals, and the work he put into the world demonstrated no less than exactly those ideals. I followed his actions with awe and complete understanding.

*Such as MIT, the Berkman Center, the EFF, Creative Commons; and frankly, there are too many individual people to list.

When I look at the goals that Aaron pursued, I feel disappointed in myself for not also working harder toward similar aspirations. But I want that to change. Though Aaron frequently called for more extreme forms of activism, such as through his Guerilla Open Access Manifesto, I want to begin with an easy solution to a problem that has been solvable for years – one I do not even think deserves to be called activism. It's merely what should be. So I have decided that I will further my ideals by refusing to restrict the knowledge I create within outdated publishing models that retain and maintain a detrimental status quo in the academic community. I know that Aaron detested the absurdity of contemporary academic establishments and norms; in many regards, I agree with his sentiments wholeheartedly. But even as I commit to furthering my career as a pursuer and producer of knowledge, I recognize that things just have to fucking change already. We need to fix this.

Accessible by Default

While I’ve supported and campaigned for open access in the past as a member of Students for Free Culture, I can no longer support the outdated, profit-driven models of modern academic publishing companies. I feel it is finally time to stand up and challenge the status quo, in which academics send knowledge to journals whose sole purpose is to disseminate that knowledge to others. By blocking access to and charging fees for that knowledge, journals have failed in the primary purpose of the education system as a whole: to teach and share knowledge with others.

There’s an inherent flaw in modern academia: scholars are expected to publish in “high-ranking” journals, foundational compilations of academic articles that have, over time, become ingrained in the institutional social fabric of knowledge production within the academy. They are, however, closed to the public. These journals do nothing for someone like me, a young, digitally networked, curious researcher. Unlike scholars of the past, I no longer wait for journals to appear on my doorstep to gain access to the latest scholarship; the internet, search engines, and personal homepages are my distributors and discovery mechanisms.

It angers me that scholars think the solution to this status quo is simply to post copies of their articles online. Some academics must, in reality, publish in closed journals, and so decide to free their own writing individually. But in my opinion, that is not enough. By continuing to publish in and thereby support closed journals, we maintain and uphold an outdated mode of knowledge circulation. Scholars need to realize that the base act of publishing in a closed journal perpetuates its existence: even if you make your own knowledge available, others’ may not be. I’m no longer afraid that refusing to publish in certain closed journals might hurt my career. My future depends on my work being relevant and widely read.
And I will never support nor desire an institution that would punish me for pursuing those goals.

What I Must Do

I have come to the conclusion that my knowledge should and will be accessible. Therefore, I will only publish openly. I will only publish in open access journals. I will only review for open access publications. I will only sign book and chapter contracts that share copies of the text online (whether licensed through Creative Commons or made available in some other, free form). I will only attend conferences that make any related publications accessible for free. And I will only contribute to open access publications that do not charge authors inordinate fees for publishing.

What You (and We) Can Do

Change begins when we as a community move forward together. However, absolute change can only come about with absolute decisions. If you are a graduate student:
  • Adopt the same stance: only publish in open access academic venues; refuse all others.
  • Encourage your cohort, classmates, friends, colleagues, teachers, and advisors to do the same.
If you are a senior scholar:
  • Help young scholars like me establish a variety of new and current open venues for publication.
  • Refuse to review for closed journals; volunteer to review for open access publications.
  • Cite scholarly works from open access venues when research is worthy of it.
  • Recognize junior faculty’s efforts in the tenure process for pursuing open access ideals.
  • Petition closed journals to shift their policies to open access.
  • Help spread the word that closed publications are no longer acceptable.
The movement toward open access as a norm within academia has been, and will likely remain, a slow and ongoing process, and many people better than I have contributed to changing the status quo in substantial ways. But I feel that individual decisions like the one made on this page can contribute to that shift and ultimately change this situation. If you are ready to take the same step, I encourage you to promote your thoughts on your own webpage and spread the word. There’s no image to share, no petition to sign, no badge to display: at this critical point, there is only action.

Alex Leavitt is a PhD student at the Annenberg School for Communication & Journalism at the University of Southern California. Follow him on Twitter at @alexleavitt.

My Environment App Competition

- January 22, 2013 in External, Featured Project, Open Data

Natural England, in partnership with the Environment Agency, is launching a new web-portal service called My Environment. To celebrate its launch, My Environment is running an Apps competition. From the announcement:
Could you create an app that will appeal to mobile device users and help them to engage with nature? The app that best helps bring people closer to nature will win a £5,000 prize. The app should be innovative, available to as many people as possible on Android, Windows Mobile, or iOS and offer new or improved ways for people to get the most from nature.
The My Environment project aims to bring together information about the world around us into one place, to help people connect with nature. It looks at how the UK is responding to environmental challenges, and helps you find out about your local flora and fauna. They’ve made a whole load of data openly available on their data download site, which you can use for the App competition. The competition deadline is 17th March 2013; all the details are on the competition page. Good luck!

Content as Data: Towards Open Digital Publishing with Substance

- January 15, 2013 in External, Featured Project

I’m the maintainer of Substance, an open software platform for collaborative composition and sharing of digital documents. In this little essay, I’d like to sketch the challenges that modern publishing systems face today, and to suggest how that complexity can be handled with the help of specialized, user-driven software. With Substance, our team is implementing the proposed ideas, not only to verify their feasibility but also to release a crowd-funded open source publishing system as a modern alternative to existing publishing solutions.

Challenges of modern digital publishing

Content Creation

In the beginning there is content creation. Text needs to be written, annotated and structured. Images, tables and lists need to be added. Today, the addition of interactive content (such as visualizations, interactive maps, etc.) is becoming increasingly important and should be supported by publishing systems.

Quality Management

In order to ensure the quality and consistency of digital content, publishers need to set up a peer-review workflow. Publishing systems must provide means for reviewers to suggest ideas and report errors, and should offer easy communication between reviewers and authors.


Publishing

Once the content is complete, it’s time to share it with the public. Publishing used to be a manual process involving expert knowledge and consuming a substantial amount of time, but software tools have now automated it. Authors should be able to share their work online themselves, without any manual steps being necessary.

Change Management

Even if the best review processes were applied, all published content is prone to inconsistencies and errors. Over time, pieces of content need to be updated, additional information added, and new editions of the document published to ensure eventual consistency. Therefore publishing systems must support incremental updates, and offer an easy way to release new editions.

Content distribution

Content distribution is becoming an increasingly complex issue. Back in the day there was just paper, but today content is also consumed electronically on different platforms: on the web, on e-book readers, and on smartphones. And they are all different. There needs to be a smart mechanism that automagically turns one publication into a variety of formats (ePub, HTML, PDF), so that as a publisher you don’t have to prepare versions for each and every targeted platform.
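The one-source, many-formats idea can be pictured with a small sketch (a hypothetical Python illustration only; Substance itself is a JavaScript platform, and none of these names are its real API). One document structure is passed through one renderer per target format:

```python
# Hypothetical sketch of single-source, multi-format publishing:
# one document, one renderer per target platform.
document = {"title": "My Essay", "body": "Hello world."}

renderers = {
    "html": lambda d: f"<h1>{d['title']}</h1><p>{d['body']}</p>",
    "markdown": lambda d: f"# {d['title']}\n\n{d['body']}",
    "plain": lambda d: f"{d['title']}\n\n{d['body']}",
}

def publish(document, formats):
    # The publisher just names the formats; no per-platform manual work.
    return {fmt: renderers[fmt](document) for fmt in formats}
```

Adding a new target platform then means adding one renderer, not reworking every publication.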


Over the last 2 years we have done a lot of research, and built a working prototype, Substance. Here are some of our conclusions about what makes a good digital publishing system.

Separate content from presentation

Many traditional publishing systems store rich markup (such as HTML) and thus only allow publishing in that same format, which is a serious limitation when it comes to multi-platform publishing. How would you turn an HTML document into a printable version? How would you optimize it for Ebook Readers that use special document formats (e.g. Amazon Kindle)? By storing content (plain text) and annotations separately, it becomes possible for users to transform their documents into any format.
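One way to picture this separation (a hypothetical Python sketch, not Substance’s actual data model) is to store the text as a plain string and keep annotations as character offsets into it; each output format is then produced by a renderer that interprets those offsets:

```python
# Hypothetical sketch of content/annotation separation: the content is a
# plain string, and annotations reference character offsets into it.
text = "Substance stores content as plain text."
annotations = [
    {"type": "emphasis", "start": 0, "end": 9},    # "Substance"
    {"type": "strong", "start": 28, "end": 38},    # "plain text"
]

def render_html(text, annotations):
    """One renderer among many; a Kindle or LaTeX renderer would reuse
    the same text and annotations with different markup."""
    tags = {"emphasis": "em", "strong": "strong"}
    out, pos = [], 0
    for a in sorted(annotations, key=lambda a: a["start"]):
        tag = tags[a["type"]]
        out.append(text[pos:a["start"]])
        out.append(f"<{tag}>{text[a['start']:a['end']]}</{tag}>")
        pos = a["end"]
    out.append(text[pos:])
    return "".join(out)
```

Because no markup is baked into the stored text, the same document can be rendered to HTML, print, or an e-book format without conversion loss.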

Structured Composition

By considering a document as a sequence of content elements rather than one big chunk of formatted text, document editors can provide a better way of writing semantically structured content, as opposed to the visually optimized layouts offered by traditional word-processors.

Offline Editing

We’ve seen the drawbacks of fully web-based publishing solutions, such as high latency and data loss. It turns out that providing an offline editor can significantly improve the user experience during editing and guarantee data security. Importantly, information stays private until users synchronize their publication with a public web server. Moreover, authors can continue editing even when there’s no internet connection.

Support for Collaboration

Authoring a publication is an iterative process that may involve many authors simultaneously editing a single document. During that process, users should be able to add markers and comments to certain text fragments. Markers and comments should stay out of the user’s way, and only be shown contextually (e.g. when the text cursor enters the corresponding marker).
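Contextual display of markers can be sketched roughly like this (hypothetical names, not Substance’s API): each comment is anchored to a character range, and the editor surfaces only the comments whose range contains the cursor:

```python
# Hypothetical sketch of contextual comments: each comment is anchored
# to a character range in the text.
comments = [
    {"start": 10, "end": 20, "note": "Please cite a source here."},
    {"start": 40, "end": 55, "note": "Reword this sentence?"},
]

def comments_at(cursor, comments):
    # Only comments whose range contains the cursor are shown;
    # everything else stays out of the author's way.
    return [c["note"] for c in comments if c["start"] <= cursor < c["end"]]
```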


Extensibility

Editing requirements differ fundamentally among application domains. While book authors are comfortable with headings, text and images, scientists may demand more advanced content types, such as formulas, data tables and visualizations, to present their research findings. Publishing systems should feature a simple plugin system in order to allow user-specific content types.


Substance is a content creation tool and a simple multi-format publishing platform. Whether you produce stories, books, documentation or scientific papers, Substance provides the tools that allow you to create, manage and share your content. It is being built with the idea of Open Knowledge in mind, so if you’d like to publish, comment on and collaborate around public documents, open access research or public domain texts, Substance may be for you. The Substance eco-system consists of an offline editing tool (the Substance Composer) and an online multi-platform publishing system.

Open Source

Behind the scenes, Substance is mainly composed of a stack of open source modules that are publicly released under the MIT license, so anyone can contribute and help develop an open standard for annotated text. Substance is meant to be an interoperable platform rather than a product. Its infrastructure can be used by anyone to build specialized applications on top of it.

Content is data

Substance considers content as data, which makes documents accessible to computers and enables new ways of processing them. Documents can not only be viewed; they can be queried, just like a database.
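The “queryable like a database” idea might look like this in a simplified Python sketch (illustrative only; these structures are assumptions, not Substance’s real format):

```python
# Hypothetical sketch of "content as data": a document is a sequence of
# typed elements, so it can be filtered and queried programmatically.
document = [
    {"type": "heading", "content": "Introduction"},
    {"type": "text", "content": "Open publishing matters."},
    {"type": "image", "content": "figure-1.png"},
    {"type": "heading", "content": "Methods"},
]

def query(document, element_type):
    # e.g. build a table of contents, or list every figure, in one pass
    return [el["content"] for el in document if el["type"] == element_type]
```

With formatted text alone, extracting something like a table of contents requires fragile parsing; with content as data, it is a one-line query.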

Semantic editing

Instead of offering one big editable canvas, Substance composes documents out of content elements. While existing solutions (like Google Docs) bring traditional word-processing to the web, Substance focuses on content, leaving the layout to the system rather than the user. By omitting formatting utilities, it encourages structured, content-oriented writing.

Custom Element Types

Substance comes with a predefined set of element types. Out of the box you can choose from headings, text and images. However, it is easy to implement your own element types and use them with your tailored version of the Substance Composer.
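A plugin-style element-type registry could be sketched as follows (again a hypothetical illustration, not the Composer’s real extension API): the defaults ship built in, and a tailored editor registers its own types on top.

```python
# Hypothetical sketch of an extensible element-type registry:
# headings, text and images ship by default.
ELEMENT_TYPES = {"heading", "text", "image"}

def register_element_type(name):
    ELEMENT_TYPES.add(name)

def validate(document):
    """A document is a sequence of typed elements; unknown types are rejected."""
    return all(el["type"] in ELEMENT_TYPES for el in document)

# A scientific edition of the editor might add a formula element:
register_element_type("formula")

doc = [
    {"type": "heading", "content": "Results"},
    {"type": "formula", "content": "E = mc^2"},
]
```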

Substance needs your help!

After a two-year experimental phase, our next goal is getting Substance ready for production. So we sat down, worked out a roadmap and launched a campaign to cover development costs. With the help of your donations we can make it happen, which is exciting! If you like the ideas proposed here and want to see them in action, please support us. Lastly, here’s how the Substance Composer looks in action:

Open Data BC Summit – Call for Speakers

- January 7, 2013 in Events, External

A little note on behalf of Nelson Lah, Chair of the Open Data Society of British Columbia, Canada. The Open Data Society of BC is hosting the BC Open Data Summit on February 19, 2013 in downtown Vancouver at the SFU Segal Graduate School of Business, 500 Granville Street. We want you to be part of this conversation. We’re especially proud of what the BC open data community has accomplished over the past three years since we first started growing our community. At the beginning, the thought of accessing data on a government web site was tricky business (imagine wanting access to government data!). We had to sort out challenges around licensing, formats, methods of access… the works. Through many great conversations and threads here and elsewhere, we have thankfully sorted out much of the “WHAT” of open data. Now that we have some open data accessible to us, we thought it would be helpful to focus on the value it can bring. We want to explore how it is being used in academia, in evidence-based decision making, in fighting corruption, in uncovering new opportunities, and in creating economic value, businesses and jobs. That’s why we’re inviting you to join us at the BC Open Data Summit in February. We’re calling on you to come out and make a short presentation. Share your ideas, and what you or your organization has done with open data to create value. Proposals are due by January 13. More information is available here.

Prescribing Analytics: how drug data can save the NHS millions

- December 17, 2012 in Exemplars, External, Open Data, Open Science

Last week saw the launch of Prescribing Analytics (covered in The Economist and elsewhere). At present it’s “just” a nice data visualisation of some interesting open data showing that the NHS could potentially save millions from its drug budget. I say “just” because we’re in discussions with several NHS organizations about providing a richer, tailored prescribing analytics service to support the best use of NHS drug budgets. Working on the project was a lot of fun, and to my mind the work nicely shows the spectacular value of open data when combined with people and the internet. The data was, and is, out there: all 11 million or so rows of it per month, detailing every GP prescription in England. Privately, some people expressed concern that the failure to do anything with the data so far was undermining efforts to make it public at all. Once data is open, it takes time for people to discover a reason for doing something interesting with it, and to get organized to do it. There’s no saying what people will use the data for, but provided the data isn’t junk, it’s a good bet that sooner or later something will happen. The story of how the site came to be is illustrative, so I’ll briefly tell my version of it here. Fran (CEO of Mastodon C) emailed me a few months ago with the news that she was carrying out some testing using the GP prescribing data. I replied and suggested looking at prescriptions of proprietary vs generic ACE inhibitors (a class of drugs that lower blood pressure) and a few other things. I also cc’d Ben Goldacre and my good friend Tom Yates. Ben shared an excellent idea he’d had a while ago for a website with a naughty name that showed how much money was wasted on expensive drugs where there was an identically effective cheaper option, and suggested looking first at statins (a class of drugs that reduce the risk of stroke, heart attack, and death). Fran did the data analysis and made beautiful graphics.
Ben, Tom, and I, with help from a handful of other friends, academics, and statisticians, provided the necessary domain expertise to come up with an early version of the site, which had a naughty name. We took counsel and decided it’d be more constructive, and more conducive to our goals, not to launch under that name. A while later we were offered support in delivering the final site. In no particular order, Bruce, Ayesha, Sym Roe, Ross Jones, and David Miller collaborated with the original group to make the final version. I’d call the way we worked peer production: a diverse group of people with very different skill sets and motivations formed a small self-organizing community to achieve the task of delivering the site. I think the results speak for themselves. It’s exciting, and this is just the beginning :-)

Notes
  1. Mastodon C is a start-up company currently based at The Open Data Institute. The Open Data Institute’s mission is to catalyse the evolution of an open data culture to create economic, environmental, and social value.

  2. Open Health Care UK is a health technology start-up.

  3. About Ben Goldacre

  4. Full research findings and details on methodology can be found at: