You are browsing the archive for Lilly Winfree.

Join #Hacktoberfest 2019 with Frictionless Data

- October 3, 2019 in Frictionless Data, hackathon

The Frictionless Data team is excited to participate in #Hacktoberfest 2019! Hacktoberfest is a month-long event where people from around the world contribute to open source software (and you can win a t-shirt!).

How does it work? All October, the Frictionless Data repositories will have issues ready for contributions from the open source community. These issues will be labeled with ‘Hacktoberfest’ so they can be easily found. Issues will range from beginner level to more advanced, so anyone who is interested can participate. Even if you’ve never contributed to Frictionless Data before, now is the time!

To begin, sign up on the official website (https://hacktoberfest.digitalocean.com) and then read the OKF project participation guidelines, code of conduct, and coding standards. Then find an issue that interests you by searching through the issues on the main Frictionless libraries and on our participating Tool Fund repositories. Next, write some code to help fix the issue, and open a pull request for the Frictionless Team to review. Finally, celebrate your contribution to an open source project!

We value and rely on our community, and are really excited to participate in this year’s #Hacktoberfest. If you get stuck or have questions, reach out to the team via our Gitter channel, or comment on an issue. Let’s get hacking!

A halfway point update from the 2019 Frictionless Data Tool Fund

- September 25, 2019 in Featured, Frictionless Data, tool fund

In June 2019, we launched the Frictionless Data Tool Fund to facilitate reproducible data workflows in research contexts. Our four Tool Fund grantees are now at the halfway point of their projects, and have made great progress. Read on to learn more about these projects, their next steps, and how you can also contribute.

Stephan Max: Data Package tools for Google Sheets

Stephan’s Tool Fund work is focused on creating an add-on for Google Sheets to allow for Data Package import and export. With this tool, researchers (and other data wranglers) who use Google Sheets will be able to quickly and easily incorporate Data Packages into their existing data processing workflows. Recently, Stephan created a prototype that you can test at the project’s GitHub repo by following the steps outlined in the README file: https://github.com/frictionlessdata/googlesheets-datapackage-tools. Next steps for Stephan’s project include enhancing the user interface and adding information, such as licensing options, to the export button. If you try the prototype, please leave Stephan feedback as an issue in the repository.

João Peschanski and team: Neuroscience Experiments System (NES)

To improve the way neuroscience experimental data and metadata are shared, João and the team at the Research, Innovation and Dissemination Center for Neuromathematics (RIDC NeuroMat) are working on implementing Data Packages into their Neuroscience Experiments System (NES). NES is an open-source tool for data collection that stores large amounts of data in a structured way. This tool aims to assist neuroscience research laboratories in routine experimental procedures. During the Tool Fund, João and team have created a Data Package export module within NES that reflects the Frictionless specifications for data and metadata interoperability. This export includes a JSON descriptor file (a datapackage.json file) with information related to how the experiment was performed, with a goal of increasing reproducibility. Next steps for the team include more testing and gathering feedback, and then a public release. The NES GitHub repository can be seen here: https://github.com/neuromat/nes.

André Heughebaert: DarwinCore Archive Data Package support

Inspired by his work with the Global Biodiversity Information Facility (GBIF), André is converting DarwinCore Archives into Data Packages for his Tool Fund project. DarwinCore is a standard for describing biological diversity that is intended to increase the interoperability of biological data. André has recently completed a first release of the tool, which appends datapackage.json and README.md files containing the data descriptors and human-readable metadata to the DarwinCore archive. This release supports all standard DarwinCore terms, and has been tested with several use cases. You can read more about Frictionless DarwinCore and see all of the use cases André tested for the beta release in the repo’s README file. If you want to test or contribute to this Tool Fund project, please open an issue in the repository.

Shelby Switzer and Greg Bloom: Open Referral Human Services data package support

Shelby’s Tool Fund work is building out Data Package support for Open Referral’s Human Service Data Specification (HSDS) and Human Service Data API Suite (HSDA). Open Referral develops data standards and open source tools for health, human, and social services. For the Tool Fund, Shelby has been developing their HSDS-Transformer, which takes raw data, transforms it to HSDS format, and then packages it as a Data Package within a zip file, so users can work with tidily packaged data. For example, Shelby and the Open Referral team have been working with 2-1-1 in Miami-Dade, Florida, to help transform and share their resource directory database with their partners in a more sustainable fashion. Next steps for Shelby include creating a UI for their HSDS-Transformer so that anyone can access HSDS-compliant Data Packages. Shelby will also be contributing to the improvement of the datapackage Ruby gem during this project.
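To make the "packages it as a datapackage within a zip file" step concrete, here is a minimal sketch in Python using only the standard library. It is not the HSDS-Transformer's own code; the function name `package_as_zip`, the package name, and the sample CSV content are all hypothetical and chosen only for illustration.

```python
import io
import json
import zipfile


def package_as_zip(descriptor, files):
    """Bundle a descriptor and its data files into an in-memory zip archive.

    `files` maps archive paths (matching the resource paths in the
    descriptor) to their text contents. Hypothetical helper, for illustration.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The descriptor sits at the root of the archive, next to the data.
        archive.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


# Hypothetical example data: one tabular resource describing services.
descriptor = {
    "name": "services-directory",
    "resources": [{"name": "services", "path": "services.csv"}],
}
zip_bytes = package_as_zip(descriptor, {"services.csv": "id,name\n1,Food bank\n"})

# Reopen the archive to confirm what it contains.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    names = sorted(archive.namelist())
print(names)
```

Keeping the descriptor and the data files in one archive means a recipient gets the metadata and the data together, which is the point of shipping "tidily packaged data".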

A warm welcome to our Frictionless Data for Reproducible Research Fellows

- August 29, 2019 in Featured, Frictionless Data, Open Science

As part of our commitment to opening up scientific knowledge, we recently launched the Frictionless Data for Reproducible Research Fellows Programme, which will run from mid-September until June 2020.  We received over 200 impressive applications for the Programme, and are very excited to introduce the four selected Fellows:
  • Monica Granados, a Mitacs Canadian Science Policy Fellow; 
  • Selene Yang, a graduate student researcher at the National University of La Plata, Argentina; 
  • Daniel Ouso, a postgraduate researcher at the International Centre of Insect Physiology and Ecology; 
  • Lily Zhao, a graduate student researcher at the University of California, Santa Barbara. 
Next month, the Fellows will be writing blogs to further introduce themselves to the Frictionless Data community, so stay tuned to learn more about these impressive researchers. The Programme will train early career researchers to become champions of the Frictionless Data tools and approaches in their field. Fellows will learn about Frictionless Data, including how to use Frictionless tools in their domains to improve reproducible research workflows, and how to advocate for open science. Working closely with the Frictionless Data team, Fellows will lead training workshops at conferences, host events at universities and in labs, and write blogs and other communications content. As the programme progresses, we will be sharing the Fellows’ work on making research more reproducible with the Frictionless Data software suite by posting a series of blogs here and on the Fellows website. In June 2020, the Programme will culminate in a community call where all Fellows will present what they have learned over the nine months: we encourage attendance by our community. If you are interested in learning more about the Programme, the syllabus, lessons, and resources are open.

More About Frictionless Data

The Fellows Programme is part of the Frictionless Data for Reproducible Research project at Open Knowledge Foundation. This project, funded by the Sloan Foundation, applies our work in Frictionless Data to data-driven research disciplines, in order to facilitate data workflows in research contexts. Frictionless Data is a set of specifications for data and metadata interoperability, accompanied by a collection of software libraries that implement these specifications, and a range of best practices for data management. Frictionless Data’s other current projects include the Tool Fund, in which four grantees are developing open source tooling for reproducible research. The Fellows Programme will be running until June 2020, and we will post updates on the Programme as it progresses.

Meet our 2019 Frictionless Data Tool Fund grantees

- July 4, 2019 in Featured, Frictionless Data

In order to facilitate reproducible data workflows in research contexts, we recently launched the Frictionless Data Tool Fund. This one-time $5,000 grant attracted over 90 applications from researchers, developers, and data managers from all over the world. We are very excited to announce the four grantees for this round of funding, and have included a short description of each grantee and their project in this announcement. For a more in-depth profile of each grantee and their Tool Fund project, as well as information about how the community can contribute to their work, follow the links in each profile. We look forward to sharing their work on developing open source tooling for reproducible research built using the Frictionless Data specifications and software.

Stephan Max

Stephan Max is a computer scientist based in Cologne, Germany, who is passionate about making the web a fair, open, and safe place for everybody. Outside of work, Stephan has contributed to the German OKF branch as a mentor for the teenage hackathon weekends project “Jugend Hackt” (Youth Hacks). Stephan’s Tool Fund project will be to create a Data Package import/export add-on to Google Sheets.
“How can we feed spreadsheets back into a Reproducible Research pipeline? I think Data Packages is a brilliant format to model and preserve exactly that information.”

Read more about Stephan and the Google Sheets Data Package add-on here.  

Carlos Ribas and João Peschanski

João Alexandre Peschanski and Carlos Eduardo Ribas work with the Research, Innovation and Dissemination Center for Neuromathematics (RIDC NeuroMat), from the São Paulo Research Foundation. They are focused on developing open-source computational tools to advance open knowledge, open science, and scientific dissemination. They will be using the Tool Fund to work on the Neuroscience Experiments System (NES), which is an open-source tool that aims to assist neuroscience research laboratories in routine procedures for data collection.
“The advantages of the Frictionless Data approach for us is fundamentally to be able to standardize data opening and sharing within the scientific community.”
Read more about Carlos, João, and NES here.  

André Heughebaert

André Heughebaert is an IT Software Engineer at the Belgian Biodiversity Platform and is the Belgian GBIF Node manager. As an Open Data advocate, André works with GBIF and the Darwin Core standards and related Biodiversity tools to support publication and re-use of Open Data. André’s Tool Fund project will automatically convert Darwin Core Archive into Frictionless Data Packages. 
“I do hope Frictionless and GBIF communities will help me with issuing/tracking and solving incompatibilities, and also to build up new synergies.”
Read more about André and the Darwin Core Data Package project here.  

Greg Bloom and Shelby Switzer

Shelby Switzer and Greg Bloom work with Open Referral, which develops data standards and open source tools for health, human, and social services. Shelby is a long-time civic tech contributor, and Greg is the founder of the Open Referral Initiative. For the Tool Fund, they will be building out Data Package support for all their interfaces, from the open source tools that transform and validate human services data to the Human Services API Specification.
“With the Frictionless Data approach, we can more readily work with data from different sources, with varying complexity, in a simple CSV format, while preserving the ability to easily manage transformation and loading.”
Read more about Greg, Shelby, and their Tool Fund project here.  

More About Frictionless Data

The Tool Fund is part of the Frictionless Data for Reproducible Research project at Open Knowledge Foundation. This project, funded by the Sloan Foundation, applies our work in Frictionless Data to data-driven research disciplines. Frictionless Data is a set of specifications for data and metadata interoperability, accompanied by a collection of software libraries that implement these specifications, and a range of best practices for data management. The Tool Fund projects will be running through the end of 2019, and we will post updates to the projects as they progress.

Open call: become a Frictionless Data Reproducible Research Fellow

- May 8, 2019 in Featured, fellowship program, Frictionless Data, grant, Open Science

The Frictionless Data Reproducible Research Fellows Program, supported by the Sloan Foundation, aims to train graduate students, postdoctoral scholars, and early career researchers how to become champions for open, reproducible research using Frictionless Data tools and approaches in their field. Fellows will learn about Frictionless Data, including how to use Frictionless tools in their domains to improve reproducible research workflows, and how to advocate for open science. Working closely with the Frictionless Data team, Fellows will lead training workshops at conferences, host events at universities and in labs, and write blogs and other communications content. In addition to mentorship, we are providing Fellows with stipends of $5,000 to support their work and time during the nine-month long Fellowship. We welcome applications using this form from 8th May 2019 until 30th July 2019, with the Fellowship starting in the fall. We value diversity and encourage applicants from communities that are under-represented in science and technology, people of colour, women, people with disabilities, and LGBTI+ individuals.

Frictionless Data for Reproducible Research

The Fellowship is part of the Frictionless Data for Reproducible Research project at Open Knowledge International. Frictionless Data aims to reduce the friction often found when working with data, such as when data is poorly structured, incomplete, hard to find, or is archived in difficult to use formats. This project, funded by the Sloan Foundation, applies our work to data-driven research disciplines, in order to help researchers and the research community resolve data workflow issues.  At its core, Frictionless Data is a set of specifications for data and metadata interoperability, accompanied by a collection of software libraries that implement these specifications, and a range of best practices for data management. The core specification, the Data Package, is a simple and practical “container” for data and metadata. The Frictionless Data approach aims to address identified needs for improving data-driven research such as generalized, standard metadata formats, interoperable data, and open-source tooling for data validation.
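As a rough illustration of the Data Package as a “container” for data and metadata, the sketch below builds a minimal descriptor (the content of a datapackage.json file) in Python with the standard library only. The helper `build_descriptor`, the package name, the file path, and the field names are hypothetical; only the overall layout (a `resources` list whose entries carry a `path` and a `schema` with typed `fields`) follows the published specifications.

```python
import json


def build_descriptor(name, csv_path, fields):
    """Return a minimal Data Package descriptor as a dict.

    `fields` is a list of (field_name, field_type) pairs, for example
    ("age_days", "integer"). Hypothetical helper, for illustration.
    """
    return {
        "name": name,
        "resources": [
            {
                "name": name,
                "path": csv_path,
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


# Hypothetical dataset: one CSV file with two columns.
descriptor = build_descriptor(
    "fly-experiments",
    "data/experiments.csv",
    [("fly_id", "string"), ("age_days", "integer")],
)

# This JSON text is what would be saved next to the data as datapackage.json.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Because the descriptor is plain JSON stored alongside the data, any tool or language that can read JSON can discover what the columns are and how they should be interpreted.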

Fellowship program

During the Fellowship, our team will be on hand to work closely with you as you complete the work. We will help you learn Frictionless Data tooling and software, and provide you with resources to help you create workshops and presentations. Also, we will announce Fellows on the project website and will be publishing your blogs and workshops slides within our network channels.  We will provide mentorship on how to work on an Open project, and will work with you to achieve your Fellowship goals.

How to apply

We welcome applications using this form from 8th May 2019 until 30th July 2019, with the Fellowship starting in the fall. The Fellowship is open to early career researchers, such as graduate students and postdoctoral scholars, anywhere in the world, and in any scientific discipline. Successful applicants will be enthusiastic about reproducible research and open science, have some experience with communications, writing, or giving presentations, and have some technical skills (basic experience with Python, R, or Matlab, for example), but do not need to be technically proficient. If you are interested, but do not have all of the qualifications, we still encourage you to apply. If you have any questions, please email the team at frictionlessdata@okfn.org, ask a question on the project’s Gitter channel, or check out the Fellows FAQ section. Apply soon, and share with your networks!

Open Knowledge Foundation community meet up at csv,conf,v4

- April 16, 2019 in #CSVconf, Events

  • When: May 7th, 5-7pm
  • Location: Eliot Center, Portland, OR
  • Cost: Free; pizza & beverages available
Join Open Knowledge Foundation (OKF) for a community event the night before csv,conf,v4! This meet and greet happy hour will feature lightning talks on open projects, designated networking time, and pizza. We invite OKF community members to submit ideas for short lightning talks (5 minutes maximum). Do you want to give a talk, but aren’t already a member of the OKF community? No problem! We are an inclusive community of Open enthusiasts (open data, open science, open source, open government, etc), and the evening is open to anyone who wants to share their ideas. Come learn more about what we do, the open projects our members are working on, ways to get involved with an open project, and meet others! This event is open to all (including csv,conf,v4 attendees as well as other open enthusiasts).

More about csv,conf,v4

csv,conf is a community conference for data makers everywhere, bringing diverse groups together to discuss data topics, and featuring stories about data sharing and data analysis from science, journalism, government, and open source. It takes place from May 8-9 2019 at the Eliot Center in Portland, Oregon. More information on the program is available from the website, and you can still get your conference tickets on Eventbrite.

More about Open Knowledge Foundation (OKF)

OKF is a global non-profit organisation and worldwide network of people passionate about openness, and using advocacy, technology and training to unlock information and enable people to work with it to create and share knowledge. Chat with us on Gitter, join a discussion on our Forum, or check out our projects for ways to get involved!

Announcing the Frictionless Data Tool Fund

- February 18, 2019 in Frictionless Data

Warming up to csv,conf,v4

- February 1, 2019 in #CSVconf, Events, Frictionless Data

On May 8 and 9, 2019, the fourth version of csv,conf is set to take place at Eliot Center in Portland, Oregon, United States. csv,conf is a community conference bringing together diverse groups to discuss data topics, and features stories about data sharing and data analysis from science, journalism, government, and open source. Over two days, attendees will have the opportunity to hear about ongoing work, share skills, exchange ideas (and stickers!) and kickstart collaborations.

This year, our keynotes include Teon L. Brooks, a data scientist at Mozilla, and Kirstie Whitaker, a research fellow at the Alan Turing Institute, with more announcements to come soon. If you would like to share your work, submissions for session proposals for our 25-minute talk slots are open from now until end of day, February 9, 2019.

When csv,conf first launched in July 2014 as a conference for data makers everywhere, it adopted the comma-separated-values format in its branding metaphorically. However, as a data conference that brings together people from different disciplines and domains, conversations and anecdotes shared at csv,conf are not limited to the CSV file format.

We are keen on getting as many people as possible to csv,conf,v4, and the conference will award travel grants to subsidize travel and associated costs for interested parties that lack the resources and support to get them to Portland. To that end, we have set up our honor-system conference ticketing page on Eventbrite. We encourage you to get your conference tickets as soon as possible, keeping in mind that as a non-profit and community-run conference, proceeds from ticket sales will help cover our catering and venue costs in addition to offering travel support for speakers and attendees where needed.

Additionally, Open Knowledge International will host a community event during the main csv,conf meeting where you can learn more about our Network and catch up with what the community has been doing. From the work on data literacy with School of Data, to the community involved on Open Data Day and initiatives on OpenGLAM, personal data and open education, we want to share with you the state of open knowledge in our Network. We will be announcing more details about our community event soon!

From the first three conferences held in the last four years, csv,conf has brought together over 500 participants from 30 countries. More than 300 talks spanning over 180 hours have been presented, packaged and shared on our YouTube channel. Many post-conference narratives and think pieces, as well as interdisciplinary collaborations, have also surfaced from previous conferences. This is only part of the story, and we can’t wait to see and hear from you in Portland in May, and are excited for all that awaits!

csv,conf,v4 is supported by the Sloan Foundation through OKI’s Frictionless Data for Reproducible Research grant, and the Frictionless Data team is part of the conference committee. We are happy to answer all questions you may have or offer any clarifications if needed. Feel free to reach out to us on csv-conf-coord@googlegroups.com.

The commallama at csv,conf,v3 will return this year!

Introducing our new Product Manager for Frictionless Data

- November 5, 2018 in Frictionless Data, Open Science

Earlier this year OKI announced new funding from The Alfred P. Sloan Foundation to explore “Frictionless Data for Reproducible Research”. Over the next three years we will be working closely with researchers to support the way they are using data with the Frictionless Data software and tools. The project is delighted to announce that Lilly Winfree has come on board as Product Manager to work with research communities on a series of focussed pilots in the research space and to help us develop focussed training and support for researchers.

Data practices in scientific research are transforming as researchers are facing a reproducibility revolution; there is a growing push to make research data more open, leading to more transparent and reproducible science. I’m really excited to join the team at OKI, whose mission of creating a world where knowledge creates power for the many, not the few, really resonates with me and my desire to make science more open. During my grad school years as a neuroscience researcher, I was often frustrated with “closed” practices (inaccessible data, poorly documented methods, paywalled articles) and I became an advocate for open science and open data. While investigating brain injury in fruit flies (yes, fruit fly brains are actually quite similar to human brains!), I taught myself coding to analyse and visualise my research data. After my PhD research, I worked on integrating open biological data with the Monarch Initiative, and delved into the open data licensing world with the Reusable Data Project. I am excited to take my passion for open data and join OKI to work on the Frictionless Data project, where I will get to go back to my scientific research roots and work with researchers to make their data more open, shareable, and reproducible.

Most people who use data know the frustrations of missing values, unknown variables, and confusing schema (just to name a few). This “friction” in data can lead to massive amounts of time being spent on data cleaning, with little time left for analysis. The Frictionless Data for Reproducible Research project will build upon years of work at OKI focused on making data more structured, discoverable, and usable. The core of Frictionless Data is the data preparation and validation stages, and the team has created specifications and tooling centered around these steps. For instance, the Data Package Creator packages tabular data with its machine-readable metadata, allowing users to understand the data structure, meaning of values, how the data was created, and the license. Also, users can validate their data for structure and content with Goodtables, which reduces errors and increases data quality. By creating specifications and tooling and promoting best practices, we are aiming to make data more open and more easily shareable among people and between various tools.

For the next stage of the project, I will be working with organisations on pilots with researchers to work on reducing the friction in scientists’ data. I will be amassing a network of researchers interested in open data and open science, and giving trainings and workshops on using the Frictionless Data tools and specs. Importantly, I will work with researchers to integrate these tools and specs into their current workflows, to help shorten the time between experiment → data → analysis → insight. Ultimately, we are aiming to make science more open, efficient, and reproducible.

Are you a researcher interested in making your data more open? Do you work in a research-related organization and want to collaborate on a pilot? Are you an open source developer looking to build upon frictionless tools? We’d love to chat with you! We are eager to work with scientists from all disciplines. If you are interested, connect with the project team on the public Gitter channel, join our community chat, or email Lilly at lilly.winfree@okfn.org!
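To give a feel for what validating data “for structure and content” involves, here is a small hand-written sketch in Python. It does not use Goodtables or reproduce its API; the functions `cast_ok` and `validate`, the sample data, and the rule that an empty cell counts as an error are all assumptions made for illustration.

```python
import csv
import io


def cast_ok(value, field_type):
    """Return True if `value` is acceptable for `field_type` (illustrative rules)."""
    if value == "":
        return False  # assumption: an empty cell is reported as a missing value
    if field_type == "integer":
        try:
            int(value)
        except ValueError:
            return False
    return True


def validate(csv_text, fields):
    """Check a CSV string against a list of (name, type) pairs.

    Returns a list of (row_number, field_name, message) tuples; an empty
    list means no problems were found.
    """
    errors = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    expected = [name for name, _ in fields]
    # Structure check: the header must match the declared field names.
    if header != expected:
        errors.append((1, "header", f"expected {expected}, got {header}"))
        return errors
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(fields):
            errors.append((row_number, "row", "wrong number of cells"))
            continue
        # Content check: each cell must fit its declared type.
        for (name, field_type), value in zip(fields, row):
            if not cast_ok(value, field_type):
                errors.append((row_number, name, f"invalid {field_type}: {value!r}"))
    return errors


# Hypothetical sample with one missing value and one non-numeric age.
sample = "fly_id,age_days\nA1,3\nA2,\nA3,old\n"
errors = validate(sample, [("fly_id", "string"), ("age_days", "integer")])
for error in errors:
    print(error)
```

Reporting the row number and field name with each problem is what lets a researcher fix the source data directly instead of discovering the issue later during analysis.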

Lilly in the fruit fly lab