Mobile data collection

- December 16, 2014 in data-collection, mobile, Skillshare, tech

This blog post is based on the School of Data skillshare I hosted on mobile data collection. Thanks to everyone who took part in it!
Recently, mobile devices have become an increasingly popular means of data collection. Data is gathered through an application or electronic form on a mobile device such as a smartphone or a tablet. These devices offer innovative ways to gather data regardless of the time and location of the respondent. The benefits of mobile data collection include quicker response times and the ability to reach previously hard-to-reach target groups. In this blog post I share some of the tools that I have been using, and developing applications on top of, for the past five years.
  1. Open Data Kit
Open Data Kit (ODK) is a free and open-source set of tools which help researchers author, field, and manage mobile data collection solutions. ODK provides an out-of-the-box solution for users to:
  • Build a data collection form or survey;
  • Collect the data on a mobile device and send it to a server; and
  • Aggregate the collected data on a server and extract it in useful formats.
ODK allows data collection using mobile devices and data submission to an online server, even without an Internet connection or mobile carrier service at the time of data collection.
ODK, which uses the Android platform, supports a wide variety of question types in its electronic forms, such as text, number, location, audio, video, image and barcode.
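To illustrate the idea of collecting data offline and submitting it later, here is a minimal Python sketch of a store-and-forward queue. It is not how ODK itself is implemented; the class name `OfflineSubmissionQueue` and the `send` callback are hypothetical names chosen for this example.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict


class OfflineSubmissionQueue:
    """Hold completed forms locally until a connection is available."""

    def __init__(self, send: Callable[[str], bool]) -> None:
        # `send` is a hypothetical uploader that returns True on success.
        self._send = send
        self._pending: Deque[str] = deque()

    def save(self, answers: Dict[str, object]) -> None:
        # Serialize at collection time so the stored record is fixed.
        self._pending.append(json.dumps(answers, sort_keys=True))

    def flush(self) -> int:
        """Upload pending forms in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # still offline: keep the form for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

A form is only removed from the queue after the uploader reports success, so a dropped connection does not lose data.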
  2. CommCare
CommCare is an open-source mobile platform designed for data collection, client management, decision support, and behavior change communication. CommCare consists of two main technology components: CommCare Mobile and CommCareHQ. The mobile application is used by client-facing community health workers and enumerators during visits as a data collection and educational tool, and includes optional audio, image, GPS location and video prompts. Users access the application-building platform through the CommCareHQ website, which is operated on a cloud-based server.
CommCare supports J2ME feature phones, Android phones, and Android tablets and can capture photos and GPS readings. It supports multiple languages and non-Roman character scripts as well as the integration of multimedia (image, audio, and video). CommCare mobile versions allow applications to run offline, and collected data can be transmitted to CommCareHQ when wireless (GPRS) or Internet (Wi-Fi) connectivity becomes available.
  3. GeoODK
GeoODK provides a way to collect and store geo-referenced information, along with a suite of tools to visualize, analyze and manipulate ground data for specific needs. It enables an understanding of the data for decision-making, research, business, disaster management, agriculture and more. It is based on the Open Data Kit (ODK), but has been extended with offline/online mapping functionalities, the ability to have custom map layers, and new spatial widgets for collecting points, polygons and GPS traces.
This one blog post cannot cover each and every tool for mobile data collection. Other tools that can be used for mobile data collection, each with its own unique features, include OpenXData and EpiSurveyor.
Why Use Mobile Technology in Collecting Data
There are several advantages to using mobile technology to collect data, including:
  • It is harder to skip questions.
  • Immediate (real-time) access to the data from the server, which also makes data aggregation and analysis very rapid.
  • A smaller workforce and therefore a lower cost of data collection, since data entry personnel are no longer needed.
  • Enhanced data security through data encryption.
  • Collection of many data types, such as audio, video, barcodes and GPS locations.
  • Increased productivity by skipping the data entry middleman.
  • Savings on the printing, storage and management of documents associated with paper-based data collection.

Give me *my* data: online crowd-sourcing platform

- December 19, 2011 in community, data-collection, Europe, open, privacy

This particular idea is quite large in scale. The foundation for this idea is that in Europe, companies (e.g. supermarkets, energy companies, telephone companies, marketing firms, even Facebook and Google, ...) are obliged to turn over 'any' information they have about any citizen, should he or she personally request it. So, the idea is to make such public information requests happen easily and intuitively, on a massive scale, and to make the aggregation and (anonymous) sharing of this data possible.
As the first step, there should be a website where any citizen can download a relevant letter template and the relevant address of any company of which she is a customer. This contact data and template could be created through crowd-sourcing. As a result, the citizen should be able to mail, fax or otherwise send a personal and legally binding request for her data in 5 minutes or less. Dedicated company profiles can then track how the companies perform on these requests (e.g. average time to answer, quality of response, etc.), allow people to share tips on how to get an answer, and so on. Through this public platform, specific companies can be more easily compared and reviewed, similar to TripAdvisor, for instance. Data can be aggregated and compared by company, by business area, and so on.
As the second step, there should be a collection of easy-to-use tools that can digitize and upload any information that the companies have given to the citizen. Such data might be provided on paper, in weird formats, for unconventional time periods, and so on. There might thus be a need to develop a specific tool for a specific company. Again, this could happen through crowd-sourcing and by encouraging private developers to share their tools.
As the third step, there should be some sort of online platform that allows the citizen to open up her information to others, even anonymously, so this data can be shared and compared (e.g. does a similar family, in terms of number and age of children, house, etc., consume the same amount of energy as my family, and if they do, what do they pay for it). It would be interesting to compare such data locally, but just as much internationally, for instance, based on real, 'individual' data. It may well be the case that people are intrinsically inclined to share their data if they can benefit from it, for instance by learning from others...
Greetings - infoscape.
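The company profiles mentioned in the first step (for example, average time to answer) could be computed along these lines. This is only a sketch in Python under assumed names: `DataRequest`, `company_profiles` and the field names are hypothetical and not part of any existing platform.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, Iterable, NamedTuple, Optional


class DataRequest(NamedTuple):
    company: str
    sent: date
    answered: Optional[date]  # None while the company has not replied


def company_profiles(requests: Iterable[DataRequest]) -> Dict[str, dict]:
    """Aggregate request counts and average response time per company."""
    grouped = defaultdict(list)
    for request in requests:
        grouped[request.company].append(request)

    profiles = {}
    for company, items in grouped.items():
        # Only answered requests contribute to the average.
        days = [(r.answered - r.sent).days for r in items if r.answered is not None]
        profiles[company] = {
            "requests": len(items),
            "answered": len(days),
            "avg_days_to_answer": mean(days) if days else None,
        }
    return profiles
```

Unanswered requests are counted but kept out of the average, so a company that never replies shows up with no average rather than a misleadingly good one.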

Tag CKAN datasets with target audience experience level

- March 4, 2011 in audience, ckan, data-collection, tagging, users

Tag #CKAN @okfn #opendata sets with “experience level” & show who they’re aimed at, via @adrianshort
"Public datastores could tag datasets with an “experience level” to show who they’re aimed at. Find me all the spending data suitable for power users…" Adrian Short
"There are five types of potential users for open data and data-driven apps:
  • data experts and computer scientists who can use semantic web technologies;
  • software developers who can use XML, JSON, etc.;
  • power users who can use CSV, spreadsheets, RSS, KML/Google Earth, perhaps Yahoo Pipes;
  • general users who can use a web browser;
  • offliners who need printed materials, ambient displays, public screens etc."
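One way to derive such tags is from the file formats a dataset offers. The Python sketch below makes no CKAN API calls and only computes tag strings. The tag names and the exact format-to-level mapping (for instance treating RDF as "semantic web" and PDF as printable material) are assumptions made for this example.

```python
from typing import Iterable, Set

# Hypothetical mapping from resource format to an "experience level" tag,
# loosely following the five user types quoted above.
LEVEL_BY_FORMAT = {
    "rdf": "experience-expert",
    "xml": "experience-developer",
    "json": "experience-developer",
    "csv": "experience-power-user",
    "xls": "experience-power-user",
    "rss": "experience-power-user",
    "kml": "experience-power-user",
    "html": "experience-general",
    "pdf": "experience-offline",
}


def experience_tags(resource_formats: Iterable[str]) -> Set[str]:
    """Return one tag per audience served by the dataset's formats."""
    tags = set()
    for fmt in resource_formats:
        tag = LEVEL_BY_FORMAT.get(fmt.strip().lower())
        if tag is not None:  # unknown formats are ignored
            tags.add(tag)
    return tags
```

A dataset published as both CSV and JSON would get two tags, so it would appear in searches for power users as well as developers.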

AuthorClaim

- February 17, 2011 in Bibliographic, data-collection, openbiblio-challenge, openservice, social

AuthorClaim is a scholarly service that lets professionals and students in the academic community claim authorship of published works and, most significantly, obtain statistics on how they rank relative to their coauthors and to other authors within the network. Open access repositories are used to create the bibliographic records for each document, similar in structure to services such as RePEc (http://repec.org) and arXiv (http://www.arxiv.org). The service ultimately aims to let both registered and unregistered users visualize their unique position and relationships with other individuals throughout the network.
More information can be found at the following page: http://authorclaim.org/about
Details of our integration of the IUCR collection: http://authorclaim.org/collections/iucr
Information about our current collections can be found here: http://authorclaim.org/collections
Registration (our service is, of course, entirely free of charge) can be found on the index page: http://authorclaim.org/

AuthorProfile

- February 17, 2011 in Bibliographic, data-collection, openbiblio-challenge, openservice, social

Objective
The overall objective is to invert bibliographic data from its traditional format, where each record describes a document. We want to create a CV-style format that has the author as the heading and the documents written by that author underneath it. This allows navigation of the bibliographic space by author. It also prepares for performance evaluation of authors.
Sources
There are two sources. One is a set of simple document data from the OKFN-sponsored 3lib project. These data are de facto open because they contain only factual descriptions of documents: titles, author names, identifiers. The document data describe scientific articles and preprints. The other source is a set of author profiles that are openly available from AuthorClaim.
Method
Authors are referenced in bibliographic information by names. Names are ambiguous. There are many ways to write the name of a single person. We call these "name expressions". Several persons may share valid name expressions. Since names don't identify authors, AuthorProfile cannot do a reliable job using names alone. The AuthorClaim project allows authors to claim documents. Only a very small share of documents is subject to author claims at this time. The claiming authors are the people for whom authoritative publication lists are available. For the others we have to use name expressions. We look at bibliographic data records containing such author name expressions, and create files, one for each author name expression. We call this process "auversion".
The system will have a list of author pages as top-level navigation. Author pages can only be constructed for AuthorClaim registrants. However, most AuthorClaim registrants have coauthors, and most of these are not yet registered. These non-registered coauthors then provide entry points to author name expressions, etc. Thus a substantial part of the "auverted" bibliographic data can be linked from the authors.
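The "auversion" step described under Method can be sketched in a few lines of Python. This is an illustration only, not the project's code: the record layout (a dict with `id` and `authors` keys) and the function name `auvert` are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def auvert(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Invert document records into one document list per author name expression."""
    by_name = defaultdict(list)
    for record in records:
        for name_expression in record["authors"]:
            # Name expressions are kept exactly as written: "J. Griffin" and
            # "James Griffin" stay separate entries at this stage.
            by_name[name_expression].append(record["id"])
    return dict(by_name)
```

Each key of the result corresponds to one per-name file in the description above, with the documents listed underneath it.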
System
In addition to navigating a set of authors (not implemented yet), we plan two navigational features. First, we want to link from an "auverted" author name page to the closest registered author. By "closest" we mean the shortest path of intermediate author name expressions through co-authorship. This is partly implemented on our test set system. We call this "vertical integration". Second, we want to provide links between related author name expressions. Assume, for example, that we have the author name J. Griffin but also James Griffin: we want to create a link from the J. Griffin author name expression to the James Griffin author name expression. We want to do a similar thing for diacritics, linking from expressions with diacritics to those without and back. We call links between author name expressions that may refer to the same person "horizontal integration".
Current state
A debugging/testing demonstrator of the system is available here.
Eligibility
The 3lib dataset includes the IUCR data from the JISC-funded open bibliography project. However, since that data set is very small, it is not likely to be seen in the actual demonstrator that we have running.
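The two navigational features from the System section, "vertical" and "horizontal" integration, can be sketched as follows. This is a hedged Python illustration, not the demonstrator's code. The co-authorship graph layout, the function names and the matching heuristic (same surname, same first initial, diacritics ignored) are all assumptions.

```python
import unicodedata
from collections import deque
from typing import Dict, Optional, Set


def closest_registered(start: str, coauthors: Dict[str, Set[str]],
                       registered: Set[str]) -> Optional[str]:
    """Vertical integration: breadth-first search over co-authorship,
    so the first registered author reached is one at the shortest distance."""
    seen = {start}
    queue = deque([start])
    while queue:
        name = queue.popleft()
        if name in registered:
            return name
        for other in sorted(coauthors.get(name, ())):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return None  # no registered author is reachable


def strip_diacritics(name: str) -> str:
    """Decompose characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def may_be_same_person(a: str, b: str) -> bool:
    """Horizontal integration heuristic: same surname and same first
    initial, ignoring diacritics, case and full stops."""
    parts_a = strip_diacritics(a).replace(".", "").lower().split()
    parts_b = strip_diacritics(b).replace(".", "").lower().split()
    if not parts_a or not parts_b or parts_a[-1] != parts_b[-1]:
        return False
    return parts_a[0][0] == parts_b[0][0]
```

The heuristic only suggests links between name expressions that may refer to the same person. It does not establish identity, which is what author claims are for.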

Was It On Time? App

- December 28, 2010 in data-collection, public-transport, webapp

Purpose: allow users to record whether trains, planes, etc. are on time via a simple webapp or phone app.
Why? Official on-time statistics do now exist for some modes of transport in some countries. However, they are far from comprehensive. More importantly, it is unclear whether 'official' figures (collected by the operators) record what users actually experience.
Required:
  • Identifiers for services (e.g. train route and timing) so that you can record on-time stats against that service
  • Simple phone or webapp that lets you record whether a given service is on time. It should allow you to quickly select the service and then either (a) push a button to indicate that the service arrived at that moment or (b) enter an arrival time. The service should optionally record user information so that one can deal with spamming or poor data entry.
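The requirements above could map onto a small data model like the following Python sketch. The names `Report`, `record_arrival` and `on_time_share`, and the five-minute tolerance, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Report:
    service_id: str              # identifier for the service, e.g. route plus timing
    scheduled: datetime
    arrived: datetime
    user: Optional[str] = None   # optional, to help deal with spam or poor entries


def record_arrival(service_id: str, scheduled: datetime,
                   arrived: Optional[datetime] = None,
                   user: Optional[str] = None) -> Report:
    """Option (a): no time given means "it arrived just now".
    Option (b): the user enters an explicit arrival time."""
    if arrived is None:
        arrived = datetime.now()
    return Report(service_id, scheduled, arrived, user)


def on_time_share(reports: List[Report],
                  tolerance: timedelta = timedelta(minutes=5)) -> float:
    """Fraction of reports that arrived within `tolerance` of schedule."""
    if not reports:
        raise ValueError("no reports to summarize")
    on_time = sum(1 for r in reports if r.arrived - r.scheduled <= tolerance)
    return on_time / len(reports)
```

Keeping the raw scheduled and arrival times, rather than just an on-time flag, means the tolerance can be changed later without collecting the data again.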