Open Knowledge Russia: Experimenting with data expeditions

- March 11, 2015 in #openeducationwk, Featured, OKF Russia, Open Knowledge, open-education, WG Open Education

As part of Open Education Week #openeducationwk activities we are publishing a post on how Open Knowledge Russia have been experimenting with data expeditions. This is a follow-up post to one that appeared on the Open Education Working Group website, which gave an overview of Open Education projects in Russia.

The authors of this post are Anna Sakoyan and Irina Radchenko, who together have founded DataDrivenJournalism.RU.

Anna currently works as a journalist and translator for the Russian analytical resource Polit.ru and is also involved in the activities of NGO InfoCulture. You can reach Anna on Twitter at @ansakoy, on Facebook and on LinkedIn. She blogs in English at http://ourchiefweapons.wordpress.com/. Irina Radchenko is an Associate Professor at ITMO University and Chief Coordinator of Open Knowledge Russia. You can reach Irina on Twitter at @iradche, on Facebook and on LinkedIn. She blogs in Russian at http://iradche.ru//.

1. DataDrivenJournalism.RU project and Russian Data Expeditions

The open educational project DataDrivenJournalism.RU was launched in April 2013 by a group of enthusiasts. Initially it was predominantly a blog, which accumulated translated and original manuals on working with data, as well as more general articles about data-driven journalism. Its mission was formulated as promoting the use of data (Open Data first of all) in the Russian-language environment, and its main objective was to create an online platform bringing together Russian-speaking people interested in working with data, so that they could exchange experiences and learn from each other. As the number of published materials grew, they had to be structured in a searchable way, which made the project look more like a website with special sections for learning materials, interactive educational projects (data expeditions), helpful links, etc.

On the one hand, it operates as an educational resource with a growing collection of tutorials, a glossary and lists of helpful external links, as well as the central platform of its data expeditions; on the other hand, as a blog, it provides a broader context of open data application to various areas of activity, including data-driven journalism itself.

After almost two years of existence, DataDrivenJournalism.RU has a team of 10 regular authors (enthusiasts from Germany, Kazakhstan, Russia, Sweden and the UK). More than a hundred posts have been published, including 15 tutorials. It has also launched 4 data expeditions, the most recent in December 2014.

The term data expedition was coined by Open Knowledge’s School of Data, which launched such peer-learning projects in both online and offline formats. We took this model as the basic principle and tried to apply it to the Russian environment. It turned out to be rather promising, so we began experimenting with it in order to make this format a more efficient educational tool. In particular, we have tried a very loose organisational approach, where the participants only had a general subject in common but were free to choose their own strategy in working with it; a rather rigid approach with a scenario and tasks; and a model that included experts who could guide the participants through the area they had to explore. These have been discussed in our guest post on Brian Kelly’s blog ‘UK Web Focus’.

Our fourth data expedition was part of a hybrid learning model. Namely, it was the practical part of a two-week offline course taught by Irina Radchenko in Kazakhstan. This experience proved to be rather inspiring and instructive.

2. International Data Expedition in Kazakhstan

The fourth Russian-language data expedition (DE4) was part of a two-week course taught by Irina Radchenko under the auspices of Karaganda State Technical University. After the course was over, the university participants who successfully completed all the tasks within DE4 received a certificate. The most interesting projects were later published at DataDrivenJournalism.RU. One of them, about industry in Kazakhstan, is by Asylbek Mubarak, who also describes (in Russian) his experience of participating in DE4 and the key stages of his work with data. The other, by Roman Ni, is about some aspects of the Kazakhstan budget.

First off, it was a unique experience of launching a data expedition outside Russia. It was also interesting that DE4 was part of a hybrid learning format, which combined traditional offline lectures and seminars with a peer-learning approach. A specific feature of the peer-learning part was that it was open, so that any online user could participate. The problem was that the decision to make it open was taken rather late, so there was not much time to properly promote its announcement. However, several people from Russia and Ukraine registered for participation. Unfortunately, none of them participated actively, but hopefully they managed to make some use of the course materials and tasks published in the DE4 Google group.

This mixed format was rather time-consuming, because it required not only preparation for regular lectures, but also a lot of online activity, including interaction with the participants, answering their questions in the Google group and checking their online projects. The participants of the offline course seemed enthusiastic about the online part; many found it interesting and intriguing. In the final survey following DE4, most of the respondents emphasised that they liked the online part.

The initial level of the participants was very uneven. Some of them knew how to program and work with databases, others had hardly ever been exposed to working with data. The main DE4 tasks were built so that they could be done from scratch, based only on the knowledge provided within the course. Meanwhile, there were also more advanced tasks and techniques for those who might find them interesting. Unfortunately, many participants could not complete all the tasks, because they were students and were right in the middle of their midterm exams at university.

Nevertheless, compared to our previous DEs, the percentage of completed tasks was much higher. The DE4 participants were clearly better motivated in terms of demonstrating their performance. Most importantly, some of them were interested in receiving a certificate. Another considerable motivation was participation in offline activities, including face-to-face discussions, as well as interaction during Irina’s lectures and seminars.

Technically, like all the previous expeditions, DE4 was centered around a closed Google group, which was used by the organisers to publish materials and tasks and by participants to discuss tasks, ask questions, exchange helpful links and coordinate their working process (as most of them worked in small teams). The chief tools within DE4 were Google Docs, Google Spreadsheets, Google Refine and Infogr.am. Participants were also encouraged to suggest or use other tools if they found them appropriate.

42 people registered for participation. 36 of them took the offline course at Karaganda State Technical University. They were the most active, so most of our observations are based on their results and feedback. Also, due to the university base of the course, 50% of the participants were undergraduate students, while the other half included postgraduate students, people with higher education and PhD holders. Two thirds of the participants were women. As to age groups, almost half of the participants were between 16 and 21 years old, but there was also a considerable number of those between 22 and 30 years old, and two were above 50.

13 per cent of the participants completed all the tasks, including the final report. According to their responses to the final survey, most of them did their practical tasks in small pieces, but regularly. As to online interaction, the majority of respondents said they were quite satisfied with their communication experience. About half of them, though, admitted that they did not contribute to online discussions, although they found others’ contributions helpful. General feedback was very positive. Many pointed out that they were inspired by the friendly atmosphere and mutual helpfulness. Most said they were going to keep learning how to work with open data on their own. Almost all said they would like to participate in other data expeditions.

3. Conclusions

DE4 was an interesting step in the development of the format. In particular, it showed that an open peer-learning format can be an important integral part of a traditional course. It had a ready-made scenario and an instructor, but at the same time it relied heavily on the participants’ mutual help and exchange of experience, and also provided a great degree of freedom and flexibility regarding the choice of subjects and tools. It is also yet another contribution to the collection of materials that might be helpful in future expeditions, alongside the materials from all the previous DEs. It is part of the gradual formation of a base of educational resources, as well as a supportive social base. As new methods are applied and tested in DEs, the practices that prove best are retained and reused, which helps to make this format more flexible and helpful. Most importantly, this model can be applied to almost any educational initiative, because it is easily replicated and based on free online services.

BudgetApps: The First All-Russia Contest on Open Finance Data

- January 16, 2015 in Budget Data, historical data, OKF Russia, Open Data

This is a guest post by Ivan Begtin, Ambassador for Open Knowledge in Russia and co-founder of the Russian Local Group.

Dear friends, the end of 2014 and the beginning of 2015 have been marked by an event that is terrific for all those who are interested in working with open data, participating in challenges for apps developers, and generally for all people who are into the Open Data Movement. I’m also sure, by the way, that people who are fond of history will find it particularly fascinating to be involved in this event.

On 23 December 2014, the Russian Ministry of Finance together with NGO Infoculture launched BudgetApps, an apps developers’ challenge based on the open data that have been published by the Ministry of Finance over the past several years. There are a number of datasets, including budget data, audit organisations registries, public debt, national reserve and many other kinds of data. As it happens, I have joined the jury, so I won’t be able to participate, but let me provide some details regarding this initiative.

All the published data can be found at the Ministry website. Lots of budget datasets are also available at The Single Web Portal of the Russian Federation Budget System. That includes the budget structure in CSV format, the data itself, reference books and many other instructive details. Data regarding all official institutions are published at bus.gov.ru. This resource is particularly interesting, because it contains indicators, budgets, statutes and numerous other characteristics regarding each state organisation or municipal institution in Russia. Such data would be invaluable for anyone who considers creating a regional data-based project.

One of the challenge requirements is that the submitted projects should be based on the data published by the Ministry of Finance. However, this does not mean that participants cannot use data from other sources alongside the Ministry data. It is actually expected that the apps developers will combine several data sources in their projects.

To my mind, one should not even restrict oneself to machine-readable data, because there are also human-readable data that participants can convert to open data formats. Many potential participants know how to write parsers on their own. For those who have never had such an experience there are great reference resources, e.g. ScraperWiki, which can be helpful for scraping web pages. There are also various libraries for analysing Excel files or extracting spreadsheets from PDF documents (for instance, PDFTables, ABBYY FineReader software or other ABBYY services). Moreover, other web resources of the Ministry of Finance hold a lot of interesting information that can be converted to data, including news items that have recently become especially relevant for the Russian audience.
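For readers who have never written a parser, here is a minimal sketch of what converting human-readable data to an open format can look like. It assumes the data sit in a plain HTML table and uses only the Python standard library. The class and function names (`TableExtractor`, `html_table_to_csv`) and the sample table are made up for illustration and do not come from any Ministry dataset; real pages are usually messier and may need a dedicated scraping tool.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample data, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Amount</th></tr>
  <tr><td>Education</td><td>100</td></tr>
  <tr><td>Health care</td><td>250</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

The sketch treats every row of every table on the page alike; a real project would probably select a specific table and clean the numbers before publishing them.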

Historical budgets

There is a huge and powerful direction in the general process of opening data that has long been missing in Russia. What I mean here is publishing open historical data that are kept in archives as large paper volumes of reference books containing myriads of tables with data. These are practically indispensable when we turn to history, referring to facts and creating projects devoted to a certain event.

The time has come at last. Any day now the first scanned budgets of the Russian Empire and the Soviet Union will be openly published. A bit later, but also in the near future, the rest of the existing budgets of the Russian Empire, the Soviet Union, and the Russian Soviet Federated Socialist Republic will be published as well. These scanned copies are gradually being converted to machine-readable formats, such as Excel and CSV data reconstructed from these reference books, both as raw data and as initially processed and ordered data. We created these ordered, normalised versions to make it easier for developers to use them in further visualisations and projects. A number of such datasets have already been openly published.

It is also worth mentioning that a considerable number of scanned copies of budget reference books (from both the Russian Empire and the USSR) have already been published online by Historical Materials, a Russian-language grassroots project launched by a group of statisticians, historians and other enthusiasts.

Here are the historical machine-readable datasets published so far:

I find this part of the challenge particularly inspiring. If I were not part of the jury, I would create my own project based on historical budget data. Actually, I may well do something like that after the challenge is over (unless somebody does it earlier).

More data?

There is a greater stock of data sources that might be used alongside the Ministry data. Here are some of them:

These are just a few examples of the numerous available data sources. I know that many people also use data from Wikipedia and DBPedia.

What can be done?

First and foremost, there are great opportunities for creating projects aimed at making public finance easier to understand. Among other things, these could be visual demos of how the budget (or public debt, or some particular area of finance) is structured. Second, lots of projects could be launched based on the data on official institutions at bus.gov.ru. For instance, it could be a comparative registry of all hospitals in Russia. Or a project comparing all state universities. Or a map of available public services. Or a visualisation of the budgets of Moscow State University (or any other Russian state university for that matter). As to the historical data, for starters it could be a simple visualisation comparing the current situation to the past. This might be a challenging and fascinating problem to solve.

Why is this important?

BudgetApps is a great way of promoting open data among apps developers, as well as data journalists. There are good reasons for participating. First off, there are many sources of data that provide a good opportunity for talented and creative developers to implement their ambitious ideas. Second, the winners will receive considerable cash prizes. And last, but not least, the most interesting and promising projects will be referenced on the Ministry of Finance website, which is good promotion for any worthy project. Considerable amounts of data have become available. It’s time now for a wider audience to become aware of what they are good for.