
Open in order to ensure healthy lives and promote well-being for all at all ages

- November 12, 2018 in Open Access, Open Access Button, Open Science

The following blog post is an adaptation of a talk given at the OpenCon 2018 satellite event hosted at the United Nations Headquarters in New York City. Slides for the talk can be found here.

When I started medical school, I had no idea what Open Access was, what subscriptions were, or how they would affect my everyday life. Open Access is important to me because I have experienced first-hand, on a day-to-day basis, the frustration of not being able to keep up to date with recent discoveries and offer patients up-to-date, evidence-based treatment. For health professionals based in low- and middle-income countries, the task of accessing research papers is extremely time-consuming and often unsuccessful. In countries where resources are scarce, hospitals and institutions don't pay for journal subscriptions, and patients ultimately pay the price.

Last week, while I was doing rounds with my mentor, we came across a patient in a critical state. The patient had been bitten by a snake and treated with antivenom serum, but was now developing a severe acute allergic reaction to the treatment he had received. The patient was unstable, so we quickly googled different papers to make an informed treatment decision. Unfortunately, we hit a lot of paywalls. Searching for the right paper was time-consuming, and if we did not make a quick decision the patient could enter anaphylactic shock.

I remember my mentor going up and down the hospital looking for colleagues to ask for opinions. I remember us searching for papers, constantly hitting paywalls, and not being able to do much to help. At the end of the day, the doctor made some calls, took a treatment decision, and the patient got better. I was able to find a good paper in SciELO, a Latin American repository, but that is because I know where to look; most physicians don't. If Open Access were the norm, we could have saved ourselves and the patient a lot of time. This is a normal day in our lives. This is what we have to go through every time we want to access medical research, and even though we do not want it to, it ends up affecting our patients.
This is my story, but I am not a one-in-a-million case. I read stories just like mine from patients, doctors, and policy makers on a daily basis at the Open Access Button, where we build tools that help people access the research they need without the training I have received. It is a common misconception that when research is published in a prestigious journal, to which most institutions in Europe and North America subscribe, the research is easily accessible and therefore impactful; this is usually not the case. Often, the very people we do medical research to help are the ones who end up being excluded from reading it.

Why does open matter at the scale of diseases?

A few years ago, when Ebola was declared a public health crisis, the whole world turned to West Africa. Conventional wisdom among public health authorities held that Ebola was a new phenomenon, never seen in West Africa before 2013. As it turned out, the conventional wisdom was wrong. In 2015, the New York Times published a report stating that Liberia's Ministry of Health had found a paper proving that Ebola had existed in the region before. In the future, the authors asserted, "Medical personnel in Liberian health centers should be aware of the possibility that they may come across active cases and thus be prepared to avoid nosocomial epidemics." This paper was published in 1982, in an expensive, subscription-based European journal.

Why did Liberians not have access to the research article that could have warned them about the outbreak? The paper was published in a European journal, and there were no Liberian co-authors on the study. The paper costs $45, the equivalent of 4 days of salary for a medical professional in Liberia. The average price of a health science journal is $2,021: the equivalent of 2.4 years of preschool education, 7 months of utilities, or 4 months of salary for a medical professional in Liberia.

Let's think about the impact open could have had in this public health emergency. If the paper had been openly accessible, Liberians could easily have read it. They could have been warned, and who knows? Maybe they could even have caught the disease before it became a problem. They could have been equipped with the knowledge they needed to face the outbreak. They could have asked for funds and international help well before things went bad. Patients could have been informed and campaigns could have been created. These are only a few of the benefits of Open Access that we did not get during the Ebola outbreak.

What happens when open wins the race?

The Ebola outbreak is a good example of what happens when health professionals do not get access to research. However, sometimes Open Access wins and great things happen. The Human Genome Project was a pioneer in encouraging access to scientific research data. Those involved in the project decided to release all the data publicly: the Human Genome data could be downloaded in its entirety, chromosome by chromosome, by anyone in the world. The data-sharing agreement required all parts of the human genome sequenced during the project to be placed in the public domain within 24 hours of completion. Scientists believed these efforts would accelerate the production of the human genome. This was a deeply unusual approach at the time, when scientists by default did not publish their data.

When a private company wanted to patent some of the sequences, everyone was worried, because this would mean that advances arising from the work, such as diagnostic tests and possibly even cures for certain inherited diseases, would be under its control. Luckily, the Human Genome Project was able to accelerate its work, and this time, open won the race. In 2003, the human genetic blueprint was completed. Since that day, because of open access to the research data, the Human Genome Project has generated $965 billion in economic output and $295 billion in personal income, and has helped develop at least 30% more diagnostic tools for diseases (source). It facilitated the scientific understanding of the role of genes in specific diseases, such as cancer, and led to the development of a number of DNA screening tests that provide early identification of risk factors for diseases such as colon cancer and breast cancer. The data-sharing initiative of the Human Genome Project was agreed after a private company decided to patent the genes BRCA1 & 2, used for screening for breast and colon cancer.
The company charged nearly $4,000 for a complete analysis of the two genes. About a decade after the discovery, the patents on the genes were ruled invalid: it was concluded that gene patents interfere with diagnosis and treatment, quality assurance, access to healthcare, and scientific innovation. With the patents invalidated, people can now get tested for much less money. The Human Genome Project proved that open can be the difference between a whole new field of medicine and private companies owning genes.

Call to action

We have learned how research behind a paywall could have warned us about Ebola 30 years before the crisis. In my work, open would save us crucial minutes while our patients suffer. Open Access has the power to accelerate advancement not only towards good health and well-being, but towards all the Sustainable Development Goals.

I have learned a lot about open because of excellent librarians, who have taken the time to train me and help me understand everything I've discussed above. I encourage everyone to become leaders and teachers in open practices within your local institutions. Countries and organizations all over the world look to the United Nations for leadership and guidance on what is right and what is practical. By being bold on open, the UN can inspire and even enable action towards open and accelerate progress on the SDGs. When inspiration doesn't cut it, the UN and other organizations can use their power as funders to mandate open. We can make progress without Open Access, and we have for a long time, but with open as a foundation, things happen faster and more equitably. Health inequality and access inequality exist today, but we have the power to change that. We need open to be central, and for that to happen we need you to see it as foundational as well.

Written by Natalia Norori with contributions by Joseph McArthur, CC-BY 4.0.

Introducing our new Product Manager for Frictionless Data

- November 5, 2018 in Frictionless Data, Open Science

Earlier this year OKI announced new funding from The Alfred P. Sloan Foundation to explore "Frictionless Data for Reproducible Research". Over the next three years we will be working closely with researchers to support the way they are using data with the Frictionless Data software and tools. The project is delighted to announce that Lilly Winfree has come on board as Product Manager, to work with research communities on a series of focussed pilots in the research space and to help us develop focussed training and support for researchers. Data practices in scientific research are transforming as researchers face a reproducibility revolution; there is a growing push to make research data more open, leading to more transparent and reproducible science.

I'm really excited to join the team at OKI, whose mission of creating a world where knowledge creates power for the many, not the few, really resonates with me and my desire to make science more open. During my grad school years as a neuroscience researcher, I was often frustrated with "closed" practices (inaccessible data, poorly documented methods, paywalled articles), and I became an advocate for open science and open data. While investigating brain injury in fruit flies (yes, fruit fly brains are actually quite similar to human brains!), I taught myself coding to analyse and visualise my research data. After my PhD research, I worked on integrating open biological data with the Monarch Initiative, and delved into the open data licensing world with the Reusable Data Project. I am excited to take my passion for open data and join OKI to work on the Frictionless Data project, where I will get to go back to my scientific research roots and work with researchers to make their data more open, shareable, and reproducible.

Most people who use data know the frustrations of missing values, unknown variables, and confusing schemas (just to name a few).
This "friction" in data can lead to massive amounts of time being spent on data cleaning, with little time left for analysis. The Frictionless Data for Reproducible Research project will build upon years of work at OKI focused on making data more structured, discoverable, and usable. The core of Frictionless Data is the data preparation and validation stages, and the team has created specifications and tooling centred around these steps. For instance, the Data Package Creator packages tabular data with its machine-readable metadata, allowing users to understand the data structure, the meaning of values, how the data was created, and the licence. Users can also validate their data for structure and content with Goodtables, which reduces errors and increases data quality. By creating specifications and tooling and promoting best practices, we aim to make data more open and more easily shareable among people and between various tools.

For the next stage of the project, I will be working with organisations on pilots with researchers to reduce the friction in scientists' data. I will be building a network of researchers interested in open data and open science, and giving trainings and workshops on using the Frictionless Data tools and specs. Importantly, I will work with researchers to integrate these tools and specs into their current workflows, to help shorten the time between experiment → data → analysis → insight. Ultimately, we aim to make science more open, efficient, and reproducible.

Are you a researcher interested in making your data more open? Do you work in a research-related organisation and want to collaborate on a pilot? Are you an open source developer looking to build upon frictionless tools? We'd love to chat with you! We are eager to work with scientists from all disciplines. If you are interested, connect with the project team on the public gitter channel, join our community chat, or email Lilly at lilly.winfree@okfn.org!
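To make the idea of packaging data with its metadata concrete, here is a minimal sketch of a Data Package descriptor built in Python. The dataset, file names, and field names are invented for illustration; the descriptor shape (name, licenses, resources, and a per-resource schema listing each column's name and type) follows the general Frictionless Data conventions described above.

```python
import json

# A minimal Data Package descriptor for an invented tabular dataset.
# "resources" lists the data files; each resource's "schema" documents
# the name and type of every column, so consumers need not guess what
# the values mean or how they are typed.
descriptor = {
    "name": "fly-brain-injury-measurements",
    "licenses": [{"name": "CC-BY-4.0"}],
    "resources": [
        {
            "name": "measurements",
            "path": "measurements.csv",
            "schema": {
                "fields": [
                    {"name": "fly_id", "type": "integer"},
                    {"name": "condition", "type": "string"},
                    {"name": "recovery_hours", "type": "number"},
                ]
            },
        }
    ],
}

# Serialising the descriptor gives a datapackage.json that travels
# alongside the CSV file it describes.
print(json.dumps(descriptor, indent=2))
```

In practice a tool like the Data Package Creator generates this file for you, and a validator can then check that the CSV actually conforms to the declared schema.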

Lilly in the fruit fly lab


Participatory Smart Environment Lab: an open, hands-on air quality sensor workshop

- August 28, 2018 in Open Science, Working Group Meetup

Is your street often noisy at night?
Does the smoke from a neighbour smoking on their balcony drift inside?
Can you make out the songs played at a nearby festival from your home?
Does the indoor air feel heavy after burning Christmas candles?

Everyone has experienced disturbances in their own living environment, but it can be hard to gauge and communicate their extent based on personal experience alone. The prices of various IoT sensors for measuring climate and the environment have dropped to the point where it does not cost much (from roughly €25-50 at the cheapest, depending on the configuration) to deploy them in your own surroundings. When sensors regularly record data about our living environments, we can put that data to use not only for ourselves but also for the community.

A group of experts from Open Knowledge Finland, Forum Virium Helsinki, Mehackit and XAMK is organising a workshop on Thursday 27 September, 11:00-15:30, at Maria 01. Its aim is to explore together what it would take for air quality sensors to be easy to deploy, and how the data they produce could readily be turned into information for understanding both one's own and the shared living environment. Materials for building roughly 6-8 devices will be available at the workshop. You are also welcome to bring your own or previously acquired devices. The space fits about 20 people. Register for the event on Facebook: https://www.facebook.com/events/484874521998267/

Who is this workshop for? We are hoping for experts in IoT sensors, air quality and environmental measurement, data visualisation, and so on. The workshop is technical in nature, and participants are expected to have strong skills in some of these areas:
  • electronics
  • microcontrollers
  • sensors (especially air quality)
  • networking technologies
  • storing collected data in a backend system
  • understanding of air quality measurement, VOCs, pollutants, gases, particulate matter, and their significance for people and the environment
If you are interested in advancing or following this topic, join the open Participatory Smart Environment Lab Facebook group: https://www.facebook.com/groups/206606553247690/ The post Participatory Smart Environment Lab: Avoin tekninen ilmanlaatusensorityöpaja appeared first on Open Knowledge Finland.

Finland remains a leading country in the transparency of academic publishing costs

- August 27, 2018 in Freedom of Information, Open Science

The Finnish Ministry of Education and Culture (MoE) has just released the price information for academic publishing agreements for 2017. With this, the price information for virtually all academic publishers is now openly available for most academic institutions in Finland for 2010-2017. The data is available for download under a Creative Commons licence, and further information on this data release is available in Finnish. The data release is intended to increase the transparency of publishing prices, support international discussion on licensing fees, and promote open science. With this, Finland maintains its leading position in the transparency of academic publishing prices and agreements.

The 2017 price data release by the MoE follows the recent release of full-text agreements with several major publishers by FinELib, the consortium of Finnish academic libraries, and a report commissioned by the MoE that develops systematic evaluation criteria to assess the openness of major academic publishers. Notably, these price data releases were triggered by the initial freedom of information (FOI) requests and a 2014 court appeal by Finnish open science advocates, coordinated by the Open Science working group of Open Knowledge Finland. This was initially inspired by related efforts in the UK and USA. To our knowledge, the Finnish pricing data is, however, the most complete national data set to date in terms of institutional and temporal coverage. Related efforts have subsequently taken place in several other countries. Without the dedicated grass-roots activities of the Finnish open science advocates, this information might still remain closed today; in fact, this is the prevailing situation in most countries.

At Open Knowledge Finland we hope that the Ministry of Education and Culture will continue to support the collection and availability of information on academic publishing costs in the long term. This will enable continuous, transparent monitoring of the development of academic publishing prices over time, and sets a unique example for other countries to follow. The post Finland remains a leading country in the transparency of academic publishing costs appeared first on Open Knowledge Finland.

Scaling up paywalled academic article sharing by legal means

- August 23, 2018 in Featured, Open Access, Open Science, r4r

"If you read a paper, 100% goes to the publisher. If you just email us to ask for our papers, we are allowed to send them to you for free, and will be genuinely delighted to do so." This recent tweet by Holly Witteman inspired Iris.ai to launch the R4R initiative (Research for Researchers), which is intended to facilitate the sharing of research articles by legal means. It is implemented as an application that automates article requests and sharing among researchers via email. Sharing an article you authored with your peers via email is generally allowed. While far from the most efficient way to share knowledge, email still remains the last resort when the alternative is content behind an expensive paywall. Technically, R4R is a fairly simple tool, implemented as a browser extension. The Iris.ai blog post explains it in more detail, but here's the idea in a nutshell:
  1. Imagine you just found an interesting academic paper using search engines. It’s relevant, but behind a paywall.
  2. Having installed the R4R browser extension, a tab on your screen lets you know whether an email can be sent to the author automatically. A single click on the tab sends an email requesting the paper from the author.
  3. R4R automatically drafts a response to the person requesting the paper and adds the relevant scholarly article as an attachment.
  4. The author reviews the request and makes the final decision on whether or not to share the paper with the requester.
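The four steps above can be sketched as a toy model in Python. Everything here, the function names, the dictionary fields, and the flow, is a hypothetical illustration of the described request/approve sequence, not the extension's actual implementation, which was not public at the time of writing.

```python
# A toy model of the R4R request/approve flow described above.
# The key property it illustrates: nothing is sent until the
# author explicitly approves the request.

def request_paper(requester_email, paper_title, author_email):
    """Step 2: the extension drafts a request email to the author."""
    return {
        "to": author_email,
        "from": requester_email,
        "subject": f"Request for a copy of '{paper_title}'",
        "approved": False,
    }

def draft_reply(request, attachment):
    """Step 3: a reply with the article attached is pre-drafted."""
    return {
        "to": request["from"],
        "attachment": attachment,
        "sent": False,
    }

def author_approves(request, reply):
    """Step 4: only the author's explicit approval releases the reply."""
    request["approved"] = True
    reply["sent"] = True
    return reply

# Walk through the flow for an invented paper and addresses.
req = request_paper("reader@example.org",
                    "Snakebite antivenom outcomes",
                    "author@example.org")
reply = draft_reply(req, "paper.pdf")
final = author_approves(req, reply)
print(final["sent"])  # the paper is shared only after approval
```

The point of the design, as the post explains, is that automation removes the friction of writing emails while the author keeps full control over each individual share.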
In the beginning, the browser plug-in will only allow sending emails to authors who have expressed their willingness to participate. If you are happy to share your publications with peers this way, you can add your name to this list. Or, if you would like to be among the first to be notified when the software is ready, sign up for the waitlist via this link.

At the time of writing this blog post, OKF Finland could not yet confirm whether the full source code of the service will be open, but we support the general idea of promoting the free sharing of articles that the plug-in implements. While the R4R initiative does not make copyrighted and paywalled articles open access, it increases knowledge exchange and thus hopefully also encourages openness on a personal level. This is why we at Open Knowledge Finland fully support this initiative. We hope that R4R will help researchers around the world share their discoveries with those who need them, while working to advance more comprehensive shifts towards open access in the overall publishing system. Read more on Medium! Engage with us on Twitter: @mariaritola, @antagomir, @okffi The post Scaling up paywalled academic article sharing by legal means appeared first on Open Knowledge Finland.

Evidence Appraisal Data-Thon: A recap of our Open Data Day event

- May 23, 2018 in health, Open Data Day, open data day 2018, Open Research, open research data, Open Science

This blog has been reposted from Medium. This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Open Research Data theme.

Research can save lives, reduce suffering, and help with scientific understanding. But research can also be unethical, unimportant, invalid, or poorly reported. These issues can harm health, waste scientific and health resources, and reduce trust in science. Differentiating good science from bad, therefore, has big implications. This is happening in the midst of broader discussions about differentiating good information from misinformation; the current controversy over political 'fake news' in particular has received significant attention. Scientific misinformation, both public and academic, is also published, much of it derived from low-quality science.

EvidenceBase is a global, informal, voluntary organization aimed at starting and boosting tools and infrastructure that enhance scientific quality and usability. The critical appraisal of science is one of many mechanisms seeking to evaluate and clarify published science, and evidence appraisal is a key area of EvidenceBase's work. On March 3rd we held an Open Data Day event to introduce the public to evidence appraisal and to explore and work on an open dataset of appraisals. We reached out to a NYC network of data scientists, software developers, public health professionals, and clinicians, and invited them and their interested friends (including any without health, science, or data training).

 

Our data came from the US National Library of Medicine's PubMed and PubMed Central datasets. PubMed offers indexing, metadata, and abstracts for biomedical publications, and PubMed Central (PMC) offers full text in PDF and/or XML. PMC has an open-access subset. We explored the portion of this subset that 1) was indexed in PubMed as a "journal comment" and 2) was a comment on a clinical trial. Our 10-hour event began with an initial session introducing the general areas of health trials, research issues, and open data; the remainder of the day consisted of parallel groups tackling three areas: lay exploration and Q&A; dataset processing and word embedding development; and health-expertise-guided manual exploration and annotation of comments. We had 2 data scientists, 4 trial experts, 3 physicians, 4 public health practitioners, 4 participants without background but with curiosity, and 1 infant. Our space was donated, and the food was provided through a mix of an Open Data Day grant provided by SPARC and Open Knowledge International (thank you!) and voluntary participant donations.

On the dataset front, we leveraged the clinical trial and journal comment metadata in PubMed, the links between PubMed and PMC, and PMC's open-subset IDs to create a data subset consisting solely of journal comments on clinical trials that were in PMC's open subset with XML data. Initial exploration of this subset for quality issues showed us that PubMed metadata tags misindex non-trials as trials and non-comments as comments, so further data curation will be needed. We did use the subset to create word embeddings and do some brief similarity-based expansion.
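The subsetting step described above can be sketched in Python. The record structure here is invented purely for illustration (real PubMed/PMC records are XML and considerably messier, which is exactly the curation problem noted above); the filter simply expresses the three conditions the event used: indexed as a comment, commenting on a clinical trial, and open-access XML available in PMC.

```python
# Toy records standing in for parsed PubMed/PMC metadata.
# Field names are hypothetical, not actual PubMed fields.
records = [
    {"pmid": 1, "types": ["Comment"], "comments_on": ["Clinical Trial"],
     "pmc_open_xml": True},
    {"pmid": 2, "types": ["Clinical Trial"], "comments_on": [],
     "pmc_open_xml": True},
    {"pmid": 3, "types": ["Comment"], "comments_on": ["Clinical Trial"],
     "pmc_open_xml": False},
]

def in_subset(rec):
    """Keep only journal comments on clinical trials with open XML."""
    return ("Comment" in rec["types"]
            and "Clinical Trial" in rec["comments_on"]
            and rec["pmc_open_xml"])

subset = [r["pmid"] for r in records if in_subset(r)]
print(subset)  # → [1]
```

Even in this toy form, the sketch shows why the full-text and licensing restrictions mentioned later cut the usable subset sharply: each extra condition drops records.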


The domain experts reviewed trials in their areas of expertise. Some participants manually extracted text fragments expressing a single appraisal assertion and attempted to generalize the assertion for future structured knowledge representation work. Overall, participants had a fun, productive, and educational time! From EvidenceBase's standpoint, the event was a success. We are mainly virtual and global, so this in-person event was new for us, energizing, and helped forge new relationships for the future.

We also learned:

  • We can’t have too much on one person’s plate for logistics and for facilitation. Issues will happen (e.g. food cancellation last minute).
  • Curiosity abounds, and people are thirsty for meaningful and productive social interactions beyond their jobs. They just need to be invited; otherwise this potential pool goes untapped.
  • Many people with data science skills have jobs in industries they don't love, and they are especially eager to use those skills for good.
  • People without data science expertise but with domain expertise are keen to explore the data and offer insight. This helps make sense of the data and can surface issues (e.g. data quality problems, synonyms, subfield-specific differences).
  • People with neither domain expertise nor data science skills still add vibrancy to these events, though the event organizers need more bandwidth to help orient and facilitate the involvement of these attendees.
  • Public research data sets are messy, and often require further subsetting or transformation to make them usable and high quality.
  • Open data can have license and accessibility barriers. For us, this meant that only a fraction of journal comments had full text available, and of those, only a fraction were open access and licensed for use in text mining.

We’ll be continuing to develop the data set and annotations started here, and we look forward to the next Open Data Day. We may even host a data event before then!

Open Data Day: From entrepreneurship to open science

- April 26, 2018 in mexico, Open Data Day, open data day 2018, open research data, Open Science, spain

Authors: Virginia De Pablo (ODI Madrid) and Karla Ramos (Epicentro Inefable A.C.). This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Open Research Data theme. For this edition of Open Data Day, two very different cities, Madrid (Spain) and Puebla (Mexico), joined efforts to demonstrate that open data is an essential tool for social development. We could see this in the sessions that took place that day, where students, journalists, political scientists, technologists and public servants gathered to show that open data is useful for shaping the future of research and science, as well as for building bridges between citizens and decision makers.

Puebla

During Open Data Day in Puebla, Epicentro Inefable A.C. and the State Coordinator for Transparency and Open Government (CETGA for its Spanish initials), along with the Engineering faculty of the Benemérita Universidad Autónoma de Puebla, organized the Open Data Day Puebla Bootcamp, with the goal of disseminating the benefits of data in open formats. During the welcome, we called on teachers, students and the general public to use the data that the government of Puebla publishes openly. We also noted that open data can be a bridge between government and people, and that it helps generate better public policies and strengthen civic participation in decision making for social good.

We had presentations for students from different public universities in Puebla by Karla Ramos, director of Epicentro Inefable A.C.; Boris Cuapio and Hugo Osorio, founders and partners of Gobierno Fácil; Tony Rojas, director of Open Government at the CETGA; Juan Carlos Espinosa, youth ambassador of My World Mexico; and Luis Oidor, chief of the Open Government department at the CETGA.

In the panel "Morning Data: what is open data and what is it for?", the presenters highlighted the qualities that open data should have, such as being free and easy to access. They also emphasized its usefulness as a digital tool that anyone can use as a source of information to improve the quality of life in their community. During his talk, Hugo Osorio highlighted that open data can be a tool for entrepreneurship; for example, he mentioned that apps like Waze and Uber use open data, and that by 2013 they had generated more than 920 million USD in the US. To close the session, Luis Oidor presented the actions the government of Puebla is taking to train public officers to publish new data sets. He mentioned that so far, 91% of the agencies and 81% of the municipalities have received training in this subject.
As a result, they have published 416 data sets on topics like health, education, transportation, finance, employment, business, security and service delivery, which can be accessed through http://www.datos.puebla.gob.mx/. As a final activity, 100 students and teachers, in 20 different teams, navigated through the datasets available in the government portal. Hugo, Boris and Karla were in charge of grading the answers to the 12 questions we asked during the event and named the winners. The Bootcamp took place in the university's auditorium, where we gathered 271 students and teachers from the BUAP, the Instituto Tecnológico Superior de San Martín, the Instituto de Estudios Superiores A.C., the Instituto Tecnológico Superior de Atlixco, the Instituto de Capacitación para el Trabajo del Estado de Puebla and the Colegio de Estudios Científicos y Tecnológicos del Estado de Puebla, as well as participants from civil society organizations.

Madrid

Open Data Day in Madrid focused on Open Science. For two days, March 2 and 3, we gathered a distinguished group of professionals and students from many disciplines at Medialab Prado. The participants took part in sessions organized by the Ontology Engineering Group (OEG), ODI Madrid and Datalab. Among the speakers were David Abián, from Wikimedia Spain; María Poveda, from the Ontology Engineering Group (OEG) and ODI Madrid; Mariano Rico, a member of the OEG, responsible for explaining the use and utility of DBpedia; Olga Giraldo, who presented "SMART protocols for Open Science"; and Fernando Blat, from Populate. Bastien Guerry, from the Office of the Prime Minister of France and maintainer of the org-mode software, closed the day.

During the morning, David Abián showed us how to extract data from Wikimedia for any research that might interest us. He explained the formats in which we can obtain and generate information in this wiki, and walked us through a simple practical exercise: extracting data about a specific topic, nuclear plants. Along the way, he explained what this information could be useful for, making clear that open data can serve everything from scientific research and open science to journalism and information for policy decisions. María Poveda explained what ontologies are for, in an accessible talk that helped us understand how to develop them and how to use them in the open data context.

After the lunch break, Olga Giraldo presented the keynote, a talk about open science entitled "SMART protocols for Open Science". She explained how, since when and why we gather and publish scientific data. "Data by itself doesn't explain its use," Giraldo said. The researcher insisted that data should go "along with a document, a lab protocol, where we explain how we got the data and how it can be used".
The importance of protocols and their content lies in their design and accessibility, two keys to finding scientific data and the information you might need. Her work on the SMART protocols platform, where researchers can publish their protocols along with other supporting information, is an example of this. Afterwards, Mariano Rico told us about the Spanish DBpedia: how its data is obtained, edited and downloaded, how many datasets it has, when it started operating, and so on. DBpedia contains an immense information repository, a full set of structured data edited with controlled vocabularies, which makes it a link between many vocabularies and a useful tool for all kinds of solutions, from visualizations to apps, whether for scientific, industrial or business ends. Finally, Bastien Guerry outlined his work leading org-mode, and his role as its editor and maintainer while working for the French government.