
Frictionless DarwinCore Tool by André Heughebaert

- December 9, 2019 in Frictionless Data, Open Knowledge, Open Research, Open Science, Open Software, Technical

This blog is part of a series showcasing projects developed during the 2019 Frictionless Data Tool Fund. The 2019 Frictionless Data Tool Fund provided four mini-grants of $5,000 to support individuals or organisations in developing an open tool for reproducible research built using the Frictionless Data specifications and software. This fund is part of the Frictionless Data for Reproducible Research project, which is funded by the Sloan Foundation. This project applies our work in Frictionless Data to data-driven research disciplines, in order to facilitate reproducible data workflows in research contexts.

Frictionless DarwinCore, developed by André Heughebaert

André Heughebaert is an open biodiversity data advocate in his work and his free time. He is an IT Software Engineer at the Belgian Biodiversity Platform and is also the Belgian GBIF (Global Biodiversity Information Facility) Node manager. In these roles, he has worked with the Darwin Core Standards and open biodiversity data on a daily basis. This work inspired him to apply for the Tool Fund, where he has developed a tool to convert DarwinCore Archives into Frictionless Data Packages.

The DarwinCore Archive (DwCA) is a standardised container for biodiversity data and metadata widely used in the GBIF community, which consists of more than 1,500 institutions around the world. The DwCA is used to publish biodiversity data about observations, collection specimens, species checklists and sampling events. However, this domain-specific standard has some limitations, mainly the star schema (core table + extensions), rules that are sometimes too permissive, and a lack of controlled vocabularies for certain terms. These limitations encouraged André to investigate emerging open data standards. In 2016, he discovered Frictionless Data and published his first data package on historical data from the 1815 Napoleonic Campaign of Belgium. He was then encouraged to create a tool that would, in part, build a bridge between these two open data ecosystems.

As a result, the Frictionless DarwinCore tool converts DwCA into Frictionless Data Packages, and also gives access to the vast Frictionless Data software ecosystem, enabling constraint validation and support for a fully relational data schema. Technically speaking, the tool is implemented as a Python library and is exposed as a Command Line Interface.
The tool automatically converts:

* the DwCA data schema into datapackage.json
* EML metadata into a human-readable markdown readme file
* data files, when necessary, that is, when default values are described

The resulting zip file complies with both DarwinCore and Frictionless specifications.

André hopes that bridging the two standards will give an excellent opportunity for the GBIF community to provide open biodiversity data to a wider audience. He says this is also a good opportunity to discover the Frictionless Data specifications and assess their applicability to the biodiversity domain. In fact, on 9th October 2019, André presented the tool at a GBIF Global Nodes meeting. It was perceived by the node managers' community as an exploratory and pioneering work. While the command line interface offers a simple user interface for non-programmers, others might prefer the more flexible and sophisticated Python API. André encourages anyone working with DarwinCore data, including all data publishers and data users of the GBIF network, to try out the new tool.
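To make the schema conversion listed above more concrete, here is a minimal sketch of how a DwCA descriptor (meta.xml) could be turned into a Data Package resource. It is not the Frictionless DarwinCore tool's own code or API: the `TERM_TABLE` excerpt, the `core_to_resource` function, the `id` field name and the sample archive are all assumptions made for illustration, and the sketch assumes fields are listed in column order. Only the Python standard library is used.

```python
import json
import xml.etree.ElementTree as ET

# XML namespace used by DwCA meta.xml descriptors.
NS = "{http://rs.tdwg.org/dwc/text/}"

# Hypothetical excerpt of a term table: DwC term -> Table Schema type/constraints.
TERM_TABLE = {
    "scientificName": {"type": "string", "constraints": {"required": True}},
    "decimalLatitude": {
        "type": "number",
        "constraints": {"minimum": -90, "maximum": 90},
    },
}

# Made-up sample descriptor; basisOfRecord has a default value but no column.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence" ignoreHeaderLines="1">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/decimalLatitude"/>
    <field term="http://rs.tdwg.org/dwc/terms/basisOfRecord" default="HumanObservation"/>
  </core>
</archive>"""


def core_to_resource(meta_xml):
    """Build a resource descriptor for the core table of a DwCA descriptor.

    Returns the resource plus the default values that would have to be
    written into the data file as extra columns.
    """
    core = ET.fromstring(meta_xml).find(f"{NS}core")
    path = core.find(f"{NS}files/{NS}location").text
    fields = [{"name": "id", "type": "string"}]  # assumed name for the <id> column
    defaults = {}
    for element in core.findall(f"{NS}field"):
        # Use the last segment of the term URI as the field name.
        name = element.get("term").rsplit("/", 1)[-1]
        # Unknown terms fall back to a plain string field.
        fields.append({"name": name, **TERM_TABLE.get(name, {"type": "string"})})
        if element.get("index") is None and element.get("default") is not None:
            defaults[name] = element.get("default")
    resource = {
        "name": "core",
        "path": path,
        "schema": {"fields": fields, "primaryKey": "id"},
    }
    return resource, defaults


if __name__ == "__main__":
    resource, defaults = core_to_resource(META_XML)
    print(json.dumps({"name": "example-dwca", "resources": [resource]}, indent=2))
    print("Columns to add from defaults:", defaults)
```

The defaults are returned separately because they only exist in the descriptor: a converter has to add them as columns to the data file, which is the case where data files need converting. Extensions, delimiters and EML metadata are left out of this sketch.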
“I’m quite optimistic that the project will feed the necessary reflection on the evolution of our biodiversity standards and data flows.”

To get started, install the tool with a single pip install command (full directions can be found in the project README). Central to the tool is a table of DarwinCore terms linking a Data Package type, format and constraints for every DwC term. The tool can be used as a CLI directly from your terminal window or as a Python library for developers, and it can work with either locally stored or online DwCA. Once converted to a Tabular Data Package, the DwC data can then be ingested and further processed by software such as Goodtables, OpenRefine or any other Frictionless Data software.

André has aspirations to take the Frictionless DarwinCore tool further by encapsulating it in a web service that will directly deliver Goodtables reports from a DwCA, which will make it even more user friendly. Another idea for further improvement is an import pathway for DarwinCore data into OpenRefine, which is a popular tool in the GBIF community. André’s long-term hope is that the Data Package will become an optional format for data download on GBIF.org.

Further reading:

Repository: https://github.com/frictionlessdata/FrictionlessDarwinCore
Project blog: https://andrejjh.github.io/fdwc.github.io/

csv,conf returns for version 5 in May

- October 15, 2019 in #CSVconf, Events, Frictionless Data, News, Open Data, Open Government Data, Open Research, Open Science, Open Software

Save the data for csv,conf,v5! The fifth version of csv,conf will be held at the University of California, Washington Center in Washington DC, USA, on May 13 and 14, 2020.

If you are passionate about data and its application to society, this is the conference for you. Submissions for session proposals for 25-minute talk slots are open until February 7, 2020, and we encourage talks about how you are using data in an interesting way (for example, to uncover a crossword puzzle scandal). We will be opening ticket sales soon, and you can stay updated by following our Twitter account @CSVconference.

csv,conf is a community conference that is about more than just comma-separated values: it brings together a diverse group to discuss data topics including data sharing, data ethics, and data analysis from the worlds of science, journalism, government, and open source. Over two days, attendees will have the opportunity to hear about ongoing work, share skills, exchange ideas (and stickers!) and kickstart collaborations.

Attendees of csv,conf,v4

First launched in July 2014, csv,conf has expanded to bring together over 700 participants from 30 countries with backgrounds from varied disciplines. If you’ve missed the earlier years’ conferences, you can watch previous talks on topics like data ethics, open source technology, data journalism, open internet, and open science on our YouTube channel. We hope you will join us in Washington D.C. in May to share your own data stories and join the csv,conf community!

csv,conf,v5 is supported by the Sloan Foundation through OKF’s Frictionless Data for Reproducible Research grant as well as by the Gordon and Betty Moore Foundation, and the Frictionless Data team is part of the conference committee. We are happy to answer any questions you may have or offer clarifications if needed. Feel free to reach out to us at csv-conf-coord@googlegroups.com, on Twitter @CSVconference or on our dedicated community Slack channel.

We are committed to diversity and inclusion, and strive to be a supportive and welcoming environment to all attendees. To this end, we encourage you to read the Conference Code of Conduct.

While we won’t be flying Rojo the Comma Llama to DC for csv,conf,v5, we will have other mascot surprises in store.

A recap of the 2019 eLife Innovation Sprint

- September 26, 2019 in Events, Frictionless Data, Open Science

Over 36 hours, Jo Barratt and Lilly Winfree from Open Knowledge Foundation’s Frictionless Data team joined 60 people from around the world to develop innovative solutions to open science obstacles at the 2019 eLife Innovation Sprint. This quick, collaborative event in Cambridge, UK, on September 4th and 5th brought together designers, scientists, coders, project managers, and communications experts to develop their budding ideas into functional prototypes. Projects focused on all aspects of open science, including but not limited to improving scientific publishing, data management, and increasing diversity, equity, and inclusion. Both Jo and Lilly pitched projects and thoroughly enjoyed working with their teams on these projects.

Lilly pitched creating an open science game that could be used to teach scientists about open best practices in a fun and informative way. Jo proposed making a podcast documenting the Sprint experience, projects, and people, with the aim of fully producing, editing, and publishing the piece during the Sprint. Read on to learn more about these projects, and their experiences at the Sprint.

Lilly’s inspiration to create an open science game came from her experience at Force11 in 2018, where she played a game about FAIR data (Findable, Accessible, Interoperable, and Reusable). She realized that playing a game can be a great way to learn about a subject that might otherwise seem dry, and creating a game prototype seemed like a fun, accessible, and achievable goal for the Sprint. The open science game team formed with eight people from diverse backgrounds, including a game designer, board game enthusiasts, publishers, and scientists. This mix of backgrounds was a big asset to the team, and played a large role in the development of a functional game prototype.
To start designing the game, the team first decided that the goal of the game should be to teach scientists about open science best practices, while the collaborative goal for the players would be to make an important scientific discovery, like curing a disease. The team crafted the storyline of the game, and finally worked on the gameplay mechanics. In the end, the game was made for 2-5 players and ideally would take about 30-45 minutes to play. To play, each player gets a role card: Lab Principal Investigator, Graduate Student, Data Management Librarian, Teaching Assistant, or Data Scientist. Each of these roles has personas and attributes that impact the game. For instance, the Principal Investigator has negative attributes that make sharing research openly harder, while the Teaching Assistant has positive attributes that make it easier to teach new tools to other players. On each turn, the players can draw research object cards or tool cards that help advance the game, but might also draw an event card, which can have positive or negative effects on the gameplay. The ultimate goal is for the players to share their research findings, which requires the player to draw and “research” an insight card and its related methods card, data collection card, and analysis card. The game ends once enough research findings are shared (either openly or with restricted access). A fun and interesting part of the game is that the players can role-play their characters and see how attitudes towards open science differ and how those attitudes affect the progression of science. Hint: to win the game, the players have to cooperate with each other and openly share at least some of their research findings. The team is currently digitising the game so others can play it; keep track of their progress on their GitHub repository.
“My team was fantastic to work with. I came to the Sprint with a basic idea and a hope that we could create a fun, educational game on open science, but my team really ran with the idea and created a game that is so much more than I had hoped for!” – Lilly Winfree, OKF

OKF delivery manager, Jo Barratt, brought his storytelling talents to the forefront for the eLife Sprint by proposing the creation of a podcast to document the people and ideas at the Sprint. Jo has produced many podcasts over the years, and thought the podcast format would offer a unique perspective into the inner workings of the Sprint. He was delighted to have two other Sprint members join his podcast team: Hannah Drury and Elsa Loissel from eLife. Neither Hannah nor Elsa had worked on a podcast before, but both were eager and quick learners. Their project started with Jo giving Hannah and Elsa quick lessons on interviewing, using recording equipment, editing and sound design. Jo was really excited to have such collaborative team members to work with, which was very much in line with the synergistic spirit of the Sprint.

To capture the essence of the Sprint, Hannah and Elsa began by interviewing most Sprint members, asking them about their backgrounds and what they hoped to get out of the Sprint. Interviewees were also asked what ‘open science’ means to them. Next, the team interviewed several project teams for a more in-depth discussion of how the Sprint works and what types of projects were being developed. In the final podcast, there are interviews with the teams from the open science game project, one on equitable preprints, the project looking at computational training best practices, and the high performance computing in Africa team. Each of these segments shows the people, methods, and progress of the projects, highlighting the diverse people and ideas at the Sprint and giving listeners insight into the process of this type of event as well as many of the problems that face the open science community. Jo’s highlight of the podcast was a conversation between the current Innovation Officer at eLife, Emmy Tsang, and the past officer, Naomi Penfold.
They discussed their experiences hosting the Sprint, and commented on changes they have witnessed in the open science movement. Listeners to the podcast will notice the overarching themes of openness, collaboration, excitement, and hope for the future of science, while also being challenged to think about who is being left behind in the progress towards a more open world. You can hear the full podcast (and see pictures from the Sprint) here, or listen on Soundcloud here.
“I supported them but really this was made by two scientists who had zero experience in this and I think making this in 2 days is really quite impressive!” – Jo Barratt, OKF
The OKF team would like to thank Emmy and eLife for a great experience at the Sprint!

Part of the Open Knowledge Foundation team met up in Cambridge the day before the Sprint began, and saved the world from a meteor (at an escape room)!

Frictionless Data at the EPFL Open Science in Practice Summer School

- September 16, 2019 in Featured, Frictionless Data, Open Science

In early September our Frictionless Data for Reproducible Research product manager, Lilly Winfree, presented a workshop at the Open Science in Practice Summer School at EPFL University in Lausanne, Switzerland. Lilly’s workshop focused on teaching early career researchers about using Frictionless software and specs to make their research data more interoperable, shareable, and open. The audience learned about metadata, data schemas, creating data packages, and validating their data with Goodtables. The slides for her workshop are available here, and are licensed as CC-BY-4.0.

The Summer School was organized by Luc Henry, Scientific Advisor at EPFL, and was a week-long series of talks and workshops on open science best practices for research students and early career researchers. A highlight of the workshop for Lilly was having the opportunity to work with Oleg Lavrovsky in person. Oleg is on the board of the Swiss chapter of OKF, Opendata.ch, and created the Frictionless Data Julia libraries as a Tool Fund grantee two years ago. Oleg wrote a recap of the workshop, which we are republishing below. The original can be read here. Thanks for your help, Oleg, and thanks to Luc for organizing!

“Open” is the new black. Everybody talks about open science. But what does it mean exactly?

Lilly Winfree of the Frictionless Data for Reproducible Research project at OKF ran a workshop at Open Science in Practice, a week-long training organized by the EPFL with Eurotech Universities. It was a top grade workshop delivered to a diverse room of doctoral students, early career researchers, “and beyond” in Lausanne. I had the opportunity to assist her, learn from her professional delivery, and get up to speed with key points about Open Knowledge Foundation and the latest news from the small, diligent team working to make open data more accessible and useful. With a fascinating science background, she connected well with the audience and made a strong case for well published open research data. The workshop reignited my desire to continue publishing Data Packages, contribute to the project, develop better support in various software environments, and be present in community channels. In our conversation afterwards, we talked about the remote work culture and global reach of the team, expectations management, and the challenges ahead. Thanks very much to @heluc and the rest of the #OSIP2019 team for organizing a great event, to all who participated in the workshop for patiently and interestedly hacking their first Data Packages together, and kudos to Lilly for crossing distances to bridge gaps and support Open Science in Switzerland.

Next events

There are two upcoming events that Oleg is involved with that might be of interest to the Frictionless Data and OKF communities: the DINAcon Digital Sustainability Conference, on October 18 in Bern, and the Opendata.ch Tourism Hackathon on November 29 in Lucerne.

A warm welcome to our Frictionless Data for Reproducible Research Fellows

- August 29, 2019 in Featured, Frictionless Data, Open Science

As part of our commitment to opening up scientific knowledge, we recently launched the Frictionless Data for Reproducible Research Fellows Programme, which will run from mid-September until June 2020.  We received over 200 impressive applications for the Programme, and are very excited to introduce the four selected Fellows:
  • Monica Granados, a Mitacs Canadian Science Policy Fellow; 
  • Selene Yang, a graduate student researcher at the National University of La Plata, Argentina; 
  • Daniel Ouso, a postgraduate researcher at the International Centre of Insect Physiology and Ecology; 
  • Lily Zhao, a graduate student researcher at the University of California, Santa Barbara. 
Next month, the Fellows will be writing blogs to further introduce themselves to the Frictionless Data community, so stay tuned to learn more about these impressive researchers. The Programme will train early career researchers to become champions of the Frictionless Data tools and approaches in their field. Fellows will learn about Frictionless Data, including how to use Frictionless tools in their domains to improve reproducible research workflows, and how to advocate for open science. Working closely with the Frictionless Data team, Fellows will lead training workshops at conferences, host events at universities and in labs, and write blogs and other communications content. As the programme progresses, we will be sharing the Fellows’ work on making research more reproducible with the Frictionless Data software suite by posting a series of blogs here and on the Fellows website. In June 2020, the Programme will culminate in a community call where all Fellows will present what they have learned over the nine months: we encourage attendance by our community. If you are interested in learning more about the Programme, the syllabus, lessons, and resources are open.

More About Frictionless Data

The Fellows Programme is part of the Frictionless Data for Reproducible Research project at Open Knowledge Foundation. This project, funded by the Sloan Foundation, applies our work in Frictionless Data to data-driven research disciplines, in order to facilitate data workflows in research contexts. Frictionless Data is a set of specifications for data and metadata interoperability, accompanied by a collection of software libraries that implement these specifications, and a range of best practices for data management. Frictionless Data’s other current projects include the Tool Fund, in which four grantees are developing open source tooling for reproducible research. The Fellows Programme will be running until June 2020, and we will post updates to the Programme as they progress.

Naturalist Datathon: Bogotá (Datatón Naturalista)

- May 15, 2019 in colombia, Open Data Day, open data day 2019, Open Science

This report is part of the event report series on International Open Data Day 2019. On Saturday 2nd March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. The Karisma Foundation from Colombia received funding through the mini-grant scheme by the Frictionless Data for Reproducible Research project to organise an event under the Open Science theme. This report was written by Karen Soacha: her biography is included at the bottom of this post.

Open data, naturalists and pizza were part of the Open Data Day celebration in Bogotá

Why and how to improve the quality of open data on biodiversity available in citizen science platforms were the questions that brought together more than 40 naturalists at the event organized by the Karisma Foundation, the Humboldt Institute and the Biodiversity Information System of Colombia (SiB Colombia) on 2nd March 2019 as part of the global celebration of Open Data Day. Expert naturalists, amateurs and those interested in citizen science came together to review the open data generated for Bogotá through the City Nature Challenge 2018. The City Nature Challenge is an annual event that invites city-dwellers across the world to hit the streets for two days to capture and catalogue nature which they might be too occupied to notice otherwise. Using their smartphones, hundreds of people generate thousands of observations of plants, birds, insects and more, which they share through citizen science platforms such as iNaturalist.

Generating the data is just the beginning of the process: improving its quality, so that it has the greatest possibility of being used, is the next step. During the Naturalist Datathon we shared guides to facilitate the identification of species, tips to review observations, as well as good practices for users and reviewers to improve the quality of the data. After a morning of collaborative work, the groups shared their learning and engaged in a discussion about the importance of data quality and its potential use in environmental monitoring, especially in the context of environmental issues in Bogotá.

1. Introduction and guides for the activity
2. Roles of the participants
4. Organization of work groups
5. Collaborative review of observations
6. Discussion
7. Naturalist Kit for all the participants

The Datatón Naturalista left us with a set of outputs, specific lessons learned and a set of good practices for the participants, the organizers and the community of naturalists and open data. To begin with, this activity contributed to increasing the community of experts who actively participate in the “curation” of observations published in Naturalista Colombia, which is necessary in order to improve the quality of the data. At the end of the datathon, the quality of the data the participants worked on was vastly improved, so much so that the data will be integrated into SiB Colombia (the official national continental biodiversity portal). As a result of this datathon, more Colombians were encouraged to participate as urban/rural naturalists.

Participants also shared good practices for taking photographs and collecting the data needed for observations to serve multiple uses, and they mentioned the importance of use licenses (Creative Commons) for facilitating the reuse and sharing of the information. They also gave recommendations for the 2019 City Nature Challenge (CNC), such as the need for guides in easy-to-consume formats (such as short videos) that ought to be shared in advance of the CNC. These guides should go beyond basic information on data capturing, and should include good practices, as well as ethical recommendations for the creators, curators, and users of information. One of the challenges that the participants highlighted was the need to recognize and integrate citizen science data as a source of information for the environmental management of the city. For the organizers, the datathon turned out to be an effective means to create conversation, connections and reflections on how and for what purpose open data is used, while also strengthening capacities and contributing quality open data.
Finally, this event showed that more and more citizens are becoming involved in citizen science, actively contributing to our knowledge of biodiversity, and are working collaboratively to further understand their environment and to generate information that is useful for decision-making. Therefore, it is necessary to continue promoting spaces that allow community-building and facilitate networking around open science and citizen science. For that reason, we in Bogotá are looking forward to the next Open Data Day.

Biography

Karen Soacha is interested in the connection between knowledge management, citizen science, governance and nature. She’s been working with environmental organizations for over 10 years, in the management of data and information networks, especially with open data on biodiversity. She is convinced that science is a way to build dialogue within society. She is also a teacher, an amateur dancer, and an apprentice naturalist.

Open Data Day: Open Science events in the Democratic Republic of the Congo and Costa Rica

- May 14, 2019 in congo, Costa Rica, Open Data Day, open data day 2019, Open Science

This report is part of the event report series on International Open Data Day 2019. On Saturday 2nd March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. AfricaTech from the Democratic Republic of the Congo and the Society for Open Science and Biodiversity Conservation (SCiAC) from Costa Rica received funding through the mini-grant scheme by the Frictionless Data for Reproducible Research project and by the Latin American Initiative for Open Data (ILDA), to organise events under the Open Science theme. This report was written by Stella Agama Mbiyi and Diego Gómez Hoyos.

AfricaTech

We organized the Open Data Day 2019 event at the UCC in Kinshasa on March 2, 2019. Our event was focused on Open Science in the Democratic Republic of the Congo. We had about 50 participants in the event, mostly students and some researchers, who participated positively in the different sessions and discussions on Open Science in the Democratic Republic of the Congo and its implications for sustainable development. Five speakers, four of them women, presented various concepts related to Open Science to participants. The conference started at 8:00 and ended at 17:30. Several participants made positive comments about the event, such as Florent Nday, a biology student at the University of Kinshasa, who said: “This is my first time to hear about Open Science, it’s a huge opportunity for us students from developing countries. Because we will have access to a wide range of knowledge easily.” The social science researcher at Kinshasa’s Institute of Social Science, Mr. Jiress Mbumba, commented, “It’s time for us Congolese researchers to promote Open Science in the Democratic Republic of the Congo, we have an interest to share our researches, and findings with everyone to spur the development of science.” The event ended with a dinner offered to all participants.

Society for Open Science and Biodiversity Conservation (SCiAC)

The training workshop on Reproducibility in Science as a link between Open Data, Open Science and Open Education was organized by SCiAC (Society for Open Science and Biodiversity Conservation) in collaboration with the Biology Department of the University of Costa Rica, ProCAT International, Abriendo Datos Costa Rica and CR Wildlife Foundation. The workshop included general presentations on open ecosystems and data management plans during research projects, as well as training in the use of GitHub and the R language for releasing data and data analysis code in the context of Open Science practices. The four speakers in the workshop were Diego Gómez Hoyos and Rocío Seisdedos from SCiAC, Susana Soto from Abriendo Datos Costa Rica and Ariel Mora from the University of Costa Rica. Fifteen people (66% women) from different provinces of Costa Rica (Puntarenas, Guanacaste, Heredia and San José) participated in the activity.

In Central America, especially in Costa Rica, considerable advances have been made regarding open data and open government issues. Our workshop has been one of the first efforts to offer researchers tools for open science and open education practices. This workshop was inspired by the project Open Science MOOC and the “Panama Declaration for Open Science”, led by Karisma Foundation and in which SCiAC took part. From this experience we see great potential and interest among researchers in learning about the tools with which they can share the elements of their research processes. We also recognize that open science practices could have a significant impact on the teaching of scientific practice. Finally, we identify the need to carry out these training activities as a tool that seeks to democratize access to and generation of knowledge in order to face the environmental, social and economic problems faced by our society.

Open call: become a Frictionless Data Reproducible Research Fellow

- May 8, 2019 in Featured, fellowship program, Frictionless Data, grant, Open Science

The Frictionless Data Reproducible Research Fellows Program, supported by the Sloan Foundation, aims to train graduate students, postdoctoral scholars, and early career researchers how to become champions for open, reproducible research using Frictionless Data tools and approaches in their field. Fellows will learn about Frictionless Data, including how to use Frictionless tools in their domains to improve reproducible research workflows, and how to advocate for open science. Working closely with the Frictionless Data team, Fellows will lead training workshops at conferences, host events at universities and in labs, and write blogs and other communications content. In addition to mentorship, we are providing Fellows with stipends of $5,000 to support their work and time during the nine-month long Fellowship. We welcome applications using this form from 8th May 2019 until 30th July 2019, with the Fellowship starting in the fall. We value diversity and encourage applicants from communities that are under-represented in science and technology, people of colour, women, people with disabilities, and LGBTI+ individuals.

Frictionless Data for Reproducible Research

The Fellowship is part of the Frictionless Data for Reproducible Research project at Open Knowledge International. Frictionless Data aims to reduce the friction often found when working with data, such as when data is poorly structured, incomplete, hard to find, or is archived in difficult to use formats. This project, funded by the Sloan Foundation, applies our work to data-driven research disciplines, in order to help researchers and the research community resolve data workflow issues.  At its core, Frictionless Data is a set of specifications for data and metadata interoperability, accompanied by a collection of software libraries that implement these specifications, and a range of best practices for data management. The core specification, the Data Package, is a simple and practical “container” for data and metadata. The Frictionless Data approach aims to address identified needs for improving data-driven research such as generalized, standard metadata formats, interoperable data, and open-source tooling for data validation.
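As a rough illustration of the Data Package as a “container” for data and metadata, the sketch below builds a small descriptor as a plain Python dictionary and checks some sample rows against its schema. The dataset, the field names and the `check_rows` helper are made up for this example; the helper is a deliberately tiny stand-in that only handles `required` and integer `minimum`, not the validation performed by the Frictionless Data libraries.

```python
import csv
import io
import json

# A Data Package descriptor: metadata plus a schema for each data file.
descriptor = {
    "name": "field-observations",
    "title": "Example field observations",
    "licenses": [{"name": "CC-BY-4.0"}],
    "resources": [
        {
            "name": "observations",
            "path": "observations.csv",
            "schema": {
                "fields": [
                    {"name": "species", "type": "string",
                     "constraints": {"required": True}},
                    {"name": "count", "type": "integer",
                     "constraints": {"minimum": 0}},
                ]
            },
        }
    ],
}

# Made-up sample data with two problems: a missing species and a negative count.
SAMPLE_CSV = "species,count\nParus major,3\n,2\nPica pica,-1\n"


def check_rows(text, schema):
    """Return (line number, field name, violated rule) for each problem found."""
    errors = []
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        for field in schema["fields"]:
            value = row[field["name"]]
            rules = field.get("constraints", {})
            if value == "":
                if rules.get("required"):
                    errors.append((line_number, field["name"], "required"))
                continue
            if field["type"] == "integer":
                try:
                    number = int(value)
                except ValueError:
                    errors.append((line_number, field["name"], "type"))
                    continue
                if "minimum" in rules and number < rules["minimum"]:
                    errors.append((line_number, field["name"], "minimum"))
    return errors


if __name__ == "__main__":
    print(json.dumps(descriptor, indent=2))
    schema = descriptor["resources"][0]["schema"]
    for error in check_rows(SAMPLE_CSV, schema):
        print(error)
```

In practice the descriptor would be saved as datapackage.json next to the data files, so that the metadata and schema travel with the data.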

Fellowship program

During the Fellowship, our team will be on hand to work closely with you as you complete the work. We will help you learn Frictionless Data tooling and software, and provide you with resources to help you create workshops and presentations. We will also announce Fellows on the project website and publish your blogs and workshop slides within our network channels. We will provide mentorship on how to work on an open project, and will work with you to achieve your Fellowship goals.

How to apply

We welcome applications using this form from 8th May 2019 until 30th July 2019, with the Fellowship starting in the fall. The Fellowship is open to early career researchers, such as graduate students and postdoctoral scholars, anywhere in the world, and in any scientific discipline. Successful applicants will be enthusiastic about reproducible research and open science, have some experience with communications, writing, or giving presentations, and have some technical skills (basic experience with Python, R, or Matlab, for example), but do not need to be technically proficient. If you are interested but do not have all of the qualifications, we still encourage you to apply. If you have any questions, please email the team at frictionlessdata@okfn.org, ask a question on the project’s Gitter channel, or check out the Fellows FAQ section. Apply soon, and share with your networks!

Citizen repository of open data and exploration of the contracting system in Bolivia

- April 1, 2019 in bolivia, Open Contracting, Open Data Day, open data day 2019, Open Science

This report is part of the event report series on International Open Data Day 2019. On Saturday 2nd March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. Fundación Internet Bolivia.org (FIB.org) and Pamela Gonzales, one of our School of Data Fellows from Bolivia, received funding through the mini-grant scheme by the Latin American Initiative for Open Data (ILDA) and by Hivos / Open Contracting Partnership, to organise events under the Open Science and Open Contracting themes respectively. This report was written by Pamela Gonzales and Wilfredo Jordan: their biographies are included at the bottom. Bolivia lags behind on access to information and data. This is why Open Data Day is an opportunity to talk about data and how to generate public value through specific projects. We celebrated it with two activities.

A public repository of open data

The Fundación Internet Bolivia (internetbolivia.org), along with 14 volunteers, came together to identify and save data sets about social research in Bolivia. In our country, several institutions publish data sets, studies and other information to promote their research. However, over time, some entities cease to exist, and with them the data they collected over the years. The most recent case is the Center for Information and Development of Women (cidem.org.bo), which until 2015 published data on femicide and gender violence towards women in Bolivia. This organization closed its doors and shut down its website, locking us out of the data it had collected over 32 years. This is why we decided to create an open data repository that works as a backup for these lost data sets and ensures permanent access to databases in a single place. With this in mind, we identified databases and updated their metadata. This was a good occasion to talk about the importance of open data, how they are being used, their principles, characteristics and potential uses. The second part of our workshop was devoted to GitLab (https://gitlab.com) and its potential to become a collaborative open data repository. After a few hours of work, we created our accounts and published some databases with their respective metadata, which you can see here: https://gitlab.com/bases-bolivia/test-of-base The participants were interested in working on more datasets and learning the ropes of freeing information, so we decided to meet again every 15 days to improve our work and expand the data community in La Paz. If the interest in cataloging more databases continues and more interested citizens join, we will be ready to take a more significant step, e.g. creating a catalog of open data of Bolivia. The community will have to decide.
We would like to thank the facilitators for their support: Miriam Jemio, Wilfredo Jordán and Guillermo Movia, activists and open data enthusiasts who took charge of the workshop’s dynamics, and the Internet Bolivia Foundation, for letting us work in their offices and for its interest in a subject as important as this one.

Exploration of the public contracting system of Bolivia

In order to learn more about public budgets allocated to gender, on Saturday, March 9, 2019, we marked the occasion with a workshop held at the offices of Bolivia Tech Hub (the most influential collaborator of the technological ecosystem in Bolivia). This event was held in the city of La Paz, from 9:30 a.m. to 2:30 p.m., with the participation of 22 people with a technical background. The workshop was led by Pamela Gonzales, a fellow of the School of Data, and began with an explanation of basic concepts of open data, followed by an introduction to open contracting. In the second part, we reviewed the Public Contracting System of the Plurinational State of Bolivia (SICOES) and the procedure to extract data from its website sicoes.gob.bo, including the types of searches, the documents obtained and their formats. Although the data and information on contracting are public, the website is designed for public officials to publish procedures and not for the users or citizens who want to navigate it, which hinders accessibility, search and collection of data. A second problem is that the full trail of a specific contract can only be retrieved using the Unique Code of State Contracting (CUCE in Spanish), which makes it difficult to search for and obtain information. Standardizing this transparency platform according to open contracting standards is therefore still a pending issue in Bolivia. In the third part of the event, we started the data expedition. We searched for contracts related to gender, putting into practice the techniques learned in the first part of the workshop, for example using OpenRefine to clean the data obtained. Finally, the participants exchanged contact details in order to develop common projects. As a result of this workshop, a survey we conducted showed that 70% of respondents were interested in knowing more about open contracting and learning its standards.

Biographies

Pamela Gonzales: I’m a serial entrepreneur at Bolivia Tech Hub, a collaborative space for technological projects. I also organize hackathons. My main goal is to manage and develop projects. I very much enjoy building teams that like taking on challenging projects.

Wilfredo Jordan: Digital journalist specialized in new media. Open data activist. Blog: https://wilfredojordan.blogspot.com/

Open Data Day in Venezuela

- March 29, 2019 in Open Data Day, open data day 2019, Open Science, venezuela

This report is part of the event report series on International Open Data Day 2019. On Saturday 2nd March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. Centro Latinoamericano de Investigaciones Sobre Internet from Venezuela received funding through the mini-grant scheme by the Latin American Initiative for Open Data (ILDA) to organise an event under the Open Science theme. This report was written by Jose Luis Mendoza. We organized events simultaneously in two of the most important cities in Venezuela, Mérida and Caracas, which are almost 700 km apart. The events took place on the campuses of the country’s main universities, the Central University of Venezuela and the University of the Andes. We used audiovisual aids and internet tools to host international speakers and to exchange experiences between the two cities. In total, 52 students attended, along with university authorities and speakers from Geneva, London, Santiago de Chile, Caracas and Mérida. The mood of the event moved between two poles. On one hand, there was the melancholy of remembering not-so-distant times of ingenuity and development, with great advances in open data and access to information, set against the severe crisis the country is going through and the challenges it brings. On the other hand, there were new proposals, programs and planned solutions that excited the attendees. We remembered the times of the “bibliobus”, a mobile library that the University of the Andes created to bring reading to towns across the Andean mountain range, where children with scarce resources have difficulty reaching a traditional library. Unfortunately, the project was shut down when the central government took back the vehicle it had donated; the crisis in the country had also made the vehicle hard to maintain.
The local chapter of the Internet Society presented its history and the resources it makes freely available to students and researchers on its website, as well as its telecommunications infrastructure development projects for low-income schools. Speaking from Geneva and London, the founders of the Virtual Center of High Studies of High Energy for Venezuela, who belong to the European Organization for Nuclear Research and the Alan Turing Institute, explained how the open data of the Large Hadron Collider made possible one of the discoveries that have revolutionized science in our century, and described the educational activities their organization runs remotely. The University of the Andes has been recognized worldwide for its online library and the resources it makes available to every user, so we could not but invite the Director of the library to tell us how they make this possible even in the midst of the crisis in the country. We also heard from the university’s telematics infrastructure management and, finally, from the Director of the library network of the state of Mérida. The attendees were visibly enthusiastic about this type of activity, in which they learn first-hand about the opportunities for both work and study that open data gives them access to, and about the possibility of carrying out research that is not visibly affected by the structural impoverishment of the country’s universities. They asked loudly for activities like these to be repeated, and for them to cover, in addition to open data itself, other areas of interest to them, namely the application of open data to specific areas of knowledge.