Global Open Data Index 2015 – Uruguay Insight
Open Knowledge Foundation - December 9, 2015 in Global Open Data Index, Ideas and musings, Open Knowledge
Heather Leson - September 11, 2014 in community, Community Session, Events, Ideas and musings
Heather Leson - July 9, 2014 in Events, Ideas and musings, Interviews, network, OKFest, OKFestival, Open Knowledge Foundation
Rufus Pollock - May 22, 2014 in Featured, Ideas and musings, privacy
Court holds that the operator [e.g. Google] is, in certain circumstances, obliged to remove links to web pages that are published by third parties and contain information relating to a person from the list of results displayed following a search made on the basis of that person’s name. The Court makes it clear that such an obligation may also exist in a case where that name or information is not erased beforehand or simultaneously from those web pages, and even, as the case may be, when its publication in itself on those pages is lawful.
At first glance, this decision has some rather substantial implications, for example:
In 2010 Mario Costeja González, a Spanish national, lodged with the Agencia Española de Protección de Datos (Spanish Data Protection Agency, the AEPD) a complaint against La Vanguardia Ediciones SL (the publisher of a daily newspaper with a large circulation in Spain, in particular in Catalonia) and against Google Spain and Google Inc. Mr Costeja González contended that, when an internet user entered his name in the search engine of the Google group (‘Google Search’), the list of results would display links to two pages of La Vanguardia’s newspaper, of January and March 1998. Those pages in particular contained an announcement for a real-estate auction organised following attachment proceedings for the recovery of social security debts owed by Mr Costeja González. With that complaint, Mr Costeja González requested, first, that La Vanguardia be required either to remove or alter the pages in question (so that the personal data relating to him no longer appeared) or to use certain tools made available by search engines in order to protect the data. Second, he requested that Google Spain or Google Inc. be required to remove or conceal the personal data relating to him so that the data no longer appeared in the search results and in the links to La Vanguardia. In this context, Mr Costeja González stated that the attachment proceedings concerning him had been fully resolved for a number of years and that reference to them was now entirely irrelevant.
The AEPD rejected the complaint against La Vanguardia, taking the view that the information in question had been lawfully published by it. On the other hand, the complaint was upheld as regards Google Spain and Google Inc. The AEPD requested those two companies to take the necessary measures to withdraw the data from their index and to render access to the data impossible in the future. Google Spain and Google Inc. brought two actions before the Audiencia Nacional (National High Court, Spain), claiming that the AEPD’s decision should be annulled. It is in this context that the Spanish court referred a series of questions to the Court of Justice. [The ECJ then summarizes its interpretation. Basically Google can be treated as a data controller and ...]
… the Court holds that the operator is, in certain circumstances, obliged to remove links to web pages that are published by third parties and contain information relating to a person from the list of results displayed following a search made on the basis of that person’s name. The Court makes it clear that such an obligation may also exist in a case where that name or information is not erased beforehand or simultaneously from those web pages, and even, as the case may be, when its publication in itself on those pages is lawful. Finally, in response to the question whether the directive enables the data subject to request that links to web pages be removed from such a list of results on the grounds that he wishes the information appearing on those pages relating to him personally to be ‘forgotten’ after a certain time, the Court holds that, if it is found, following a request by the data subject, that the inclusion of those links in the list is, at this point in time, incompatible with the directive, the links and information in the list of results must be erased.
Image: Forgotten by Stephen Nicholas, CC-BY-NC-SA
Laura James - August 27, 2013 in Featured, Ideas and musings, Open Data, Open Data and My Data, Open Government Data, privacy
“yes, the government should open other people’s data”
Traditionally, the Open Knowledge Foundation has worked to open non-personal data – things like publicly-funded research papers, government spending data, and so on. Where individual data was a part of some shared dataset, such as a census, great amounts of thought and effort had gone into ensuring that individual privacy was protected and that the aggregate data released was a shared, communal asset. But times change. Increasing amounts of data are collected by governments and corporations, vast quantities of it about individuals (whether or not they realise that it is happening). The risks to privacy through data collection and sharing are probably greater than they have ever been. Data analytics – whether of “big” or “small” data – has the potential to provide unprecedented insight; however, some of that insight may come at the cost of personal privacy, as separate datasets are connected and correlated.
Francis Irving - July 18, 2013 in Business, Featured, Ideas and musings, Open Data
Rufus Pollock - July 2, 2013 in Featured, Ideas and musings, Open Data, Small Data, Technical
- Store data as line-oriented text, and specifically as CSV (comma-separated values) files. “Line-oriented text” just means that each individual unit of the data, such as a row of a table (or an individual cell), corresponds to one line.
- Use best-of-breed (code) versioning tools like git or mercurial to store and manage the data.
Line-oriented text is important because it enables powerful distributed version control tools like git and mercurial to work effectively (this, in turn, is because those tools are built for code, which is usually line-oriented text). It’s not just version control though: there is a large and mature set of tools for managing and manipulating these types of files (from grep to Excel!). In addition to the basic pattern, there are a few optional extras you can add:
- Store the data in GitHub (or Gitorious or Bitbucket or …) – all the examples below follow this approach
- Turn the collection of data into a Simple Data Format data package by adding a datapackage.json file which provides a small set of essential information like the license, sources, and schema (this column is a number, this one is a string)
- Add the scripts you used to process and manage data — that way everything is nicely together in one repository
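To make the pattern above concrete, here is a minimal Python sketch using only the standard library. It shows two things: why one-record-per-line CSV produces small, readable diffs (the same kind of line diff that git or mercurial would record), and what a small datapackage.json descriptor might look like. The function names, the sample rows, and the dataset name are hypothetical, and the descriptor keys (name, licenses, sources, resources, schema.fields) are an assumption based on the Data Package specification, whose exact field layout has changed between versions, so check the current spec before relying on it.

```python
import csv
import difflib
import io
import json


def to_csv_text(rows):
    """Serialise rows as CSV text, one record per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def changed_lines(old_text, new_text):
    """Return only the removed ("- ") and added ("+ ") lines of a line diff."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


def build_descriptor(name, csv_path, fields, license_name, source_name):
    """Build a minimal datapackage.json-style descriptor as a dict.

    The key names are an assumption based on the Data Package spec,
    not a guaranteed match for any particular version of it.
    """
    return {
        "name": name,
        "licenses": [{"name": license_name}],
        "sources": [{"name": source_name}],
        "resources": [
            {
                "path": csv_path,
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical sample data: a single cell changes between two versions.
    before = to_csv_text([["id", "amount"], ["a", "1"], ["b", "2"]])
    after = to_csv_text([["id", "amount"], ["a", "1"], ["b", "3"]])

    # Only the edited row shows up, which is what makes reviews and merges easy.
    for line in changed_lines(before, after):
        print(line)

    descriptor = build_descriptor(
        name="example-dataset",  # hypothetical package name
        csv_path="data.csv",
        fields=[("id", "string"), ("amount", "number")],
        license_name="CC0-1.0",
        source_name="Example source",
    )
    print(json.dumps(descriptor, indent=2))
```

In a real repository you would write the CSV and the descriptor to files, commit them alongside the processing scripts, and let the version control tool produce the diffs; difflib is used here only to illustrate the line-level change without needing git installed.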
Martin Tisne - May 3, 2013 in Ideas and musings, Open Data, Open Government Data, Open Spending
Laura James - May 1, 2013 in Featured, Ideas and musings, Join us, OKF, Open Data, Our Work
Rufus Pollock - April 26, 2013 in Featured, Ideas and musings, Labs, Open Data, Small Data
“Small data is the amount of data you can conveniently store and process on a single machine, and in particular, a high-end laptop or server”