OK Greece supports the City Challenge crowdhackathon #smartcity 2 innovation marathon

Χριστίνα Καρυπίδου - May 26, 2018 in Featured, Featured @en, hackathon, News, Εκδηλώσεις, Εφαρμογές, Νέα

The Open Knowledge Foundation Greece (OK Greece), a steadfast supporter of events dedicated to innovation, is backing the City Challenge crowdhackathon #smartcity 2. The “Citylab Thessaloniki” will take place on 30 May 2018, at 15:00, at OK!THESS. Come to hear or present ideas and current trends in digital innovation. Come to propose ideas for […]

General Data Protection Regulation (GDPR)

Χριστίνα Καρυπίδου - May 26, 2018 in Featured, Featured @en, News, Νέα, προσωπικά δεδομένα

ANNOUNCEMENT The Open Knowledge Foundation Greece (OK Greece), in compliance with the General Data Protection Regulation (GDPR), would like to inform you that the databases it held containing the personal data of those who had subscribed to its newsletter have been deleted. If you wish to continue receiving updates from OK Greece, you can subscribe again, […]

The Sky: A Film Lesson in “Nature Study” (1928)

Adam Green - May 24, 2018 in astrology, astronomy, moon, planets, sky, sun

Short educational film on what can be seen in the night sky through a telescope, including a look at constellations, the mountains of the moon, the planets, and the sun.

Lava Jato Hackathon: Journalists and developers creating algorithms and web apps to fight corruption

Convoca - May 24, 2018 in Follow the Money, Open Data Day, open data day 2018, peru

This blog has been translated from the Spanish blog report at Convoca.

This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The event in this blog was supported through the mini-grants scheme under the Follow the Money theme.

In Peru, we organised the first hackathon to develop apps to fight corruption. Using open data about public works and people involved in the “Lava Jato” case (Operation Car Wash), we gathered journalists, developers, professionals from different fields and young students to work together on innovative proposals for more than 18 hours.

Four years after the investigation of Latin America’s biggest corruption scandal started in Brazil, Convoca organized the “Hackathon Lava Jato” on March 16 and 17 to celebrate Open Data Day. This event brought together anti-corruption experts and young professionals. To do this, we made open data available about Odebrecht contracts and their increased costs. We gathered the contracts through Freedom of Information (FOI) requests, information from official websites and our own sources, built together with the 20 Latin American and African journalists of the “Investiga Lava Jato” project.

Avelino Guillén, the former prosecutor in the most important corruption cases in the country, including that of former president Alberto Fujimori, and Vanessa Zorrilla, a lawyer and expert in public procurement, presented to about 70 participants. Guillén talked to them about the judicial system’s role in tackling corruption and its weaknesses in fighting it, as well as the sophisticated strategies used to hide ill-gotten gains. Zorrilla highlighted the importance of transparency in the public procurement process and invited the young participants to request information about contracts and transactions whenever public money is involved, using the FOI and transparency laws.

Journalists, web developers, designers, lawyers and academics created new tools to access information about the Lava Jato case. The criteria for selecting the winning projects were: project impact and viability; meeting the goals of the event; innovation and creativity; and how developed the project was. The jury was made up of experts in the different topics: Avelino Guillén, former state prosecutor; Irina Ávilna, the founding director of MakerLAB; Milagros Salazar, journalist and director of Convoca.pe; and Elvis Rivera, the developer and lead of Convoca Lab. Based on these criteria, three winners were chosen:
  1. “Face to Face”, a project developed by David Chapuis, Luis Castillo, José Osnar, Randy Ortiz and Joseph Patiño. A detector of gesture patterns that analyzes potentially corrupt figures through an algorithm. People can also access public interest information such as their bios, court cases and more. This project seeks to prevent cases like Lava Jato in Peru.
  2. ‘Lava Jabot’, built by Jean Pierre Tincopa, Dulce Alarcon and Jorge Tuanama. This team built a bot using AI. They seek to use its preset responses to bring people closer to the information about contracts, public works and people involved in Lava Jato. They decided to show simple and interactive information to their users. Through Facebook Messenger, people can access infographics, audio recordings (of the depositions), or geolocated information about the closest Odebrecht works and how big their cost overruns were.
  3. “Sin Justicia” (Without Justice), developed by Luis Enrique Pérez, Luis Vertiz, Yesenia Chavarry, Edson Torres and Rocío Arteaga, seeks to emphasize the consequences and inequalities caused by corruption. Their web app shows which law offices defend corrupt politicians and how much they are paid with public money. This is compared with the public funds used to defend other citizens. It also compares the amount spent on defending public officials with the cost of improvements in the country.
Besides these projects, we gave honorable mentions to two initiatives that seek to bring attention to corruption through comics, infographics and illustrations: the website “Jóvenes en acción” (“Youth in action”), built by Carolina Cortez López, Daniel Pumayauli, Tania Angulo, Rosio Ramos and Abel Salazar, and ‘Divina Aceitada’, a project developed by Patrick Valentín, Joel Romero, Rolly Rodríguez, Rodolfo Carrillo and Fernando Tincopa. This hackathon showed that there is great interest among young people in fighting corruption. The projects they developed are also an example of creativity and of the symbiosis of journalism and technology to benefit people. We spread the word about the results through social media and in the different open data, journalism and technology communities. Convoca published these achievements in its digital medium and interviewed the winners on the radio program “Café Convoca”. The next step is to keep supporting these initiatives, which contribute significantly to transparency and accountability. The Lava Jato Hackathon was run with support from Hivos and Open Knowledge International as part of the “Investiga Lava Jato” initiative, the Centro de Innovación y Desarrollo Emprendedor de la Pontificia Universidad Católica del Perú (PUCP) and Lab San Isidro.

Transparency, algorithms and data

Georgia Panagiotidou - May 24, 2018 in algorithmic transparency, blog, data transparency, Events, International, Open Democracy, travel

Notes from the Data Transparency Lab 2017 in Barcelona

As part of Open Knowledge Finland, I attended the Data Transparency Lab (DTL) Conference in Barcelona last December. Unfortunately I only managed to attend the second day, but since the sessions were recorded, I will be watching the rest online soon. Nevertheless, the second day had a lot of interesting talks and people, and I thought I’d note here some of the highlights.

The sessions seemed to emphasise transparency in times of algorithmic complexity, which was great to listen to, all the more so because the topic was covered from multiple angles: policy, design and ethics, as well as practical demos, each in its own session. Overall, the day mixed high-level discussion with practical applications. In this post I’ll try to summarise a few of the talks, give some links to the people when possible, and add a few of my own thoughts as well.

The day started with a keynote from Isabella de Michelis from ErnieApp, who described their startup’s point of view on companies and personal data. They argue that once users are aware of how they generate value for companies, they will be more willing to shift their permissions towards a vendor-type relationship rather than a customer one. For example, a household owning an IoT-connected fridge should be able to get a discount on its electricity bill (since information about the fridge’s contents would be shared, e.g. with supermarkets, for profit). An interesting position for sure, and one that arguably clashes with a later presentation from Ilaria Liccardi.

Image from Liccardi’s research on user permission sharing. http://people.csail.mit.edu/ilaria/papers/ShihCHI15.pdf

Ilaria Liccardi, as part of the ‘How to foster transparency’ session, presented her research from MIT on how people’s privacy permissions for the apps they use change depending on the information provided. The results are possibly not as obvious as expected. They found that people are much more willing to give permission for the use of their data when no indication of how it will be used is given. However, they are less willing to give permission when the reason for use is vaguely worded, and yet somewhat permissive when companies provide more detailed information. The full research can be found on her page.

These are interesting findings. They imply that for the general public to understand and consent to giving out personal data for company profit, there needs to be both an initial motivator from the company side and a good balance of sufficient and clear information on its use.

Namely, if people are more permissive when knowing less, then it is possible that the path to transparency won’t be as user-driven as expected.

Nozha Boujemaa from the DATAIA institute (who was also the chair of that session) nicely put it that ‘data-driven does not mean objective’, which is something I can personally imagine myself repeating endlessly, especially in the context of data-driven journalism. A personal favourite entry point to the topic is the work on feminist data visualisation by Catherine D’Ignazio, who explains how data and datasets can be wrongly assumed to be objective or to present the truth. Nozha Boujemaa also discussed what computational decision-making systems would need in order to be considered transparent. She noted, for example, that many Machine Learning (ML) algorithms are open-sourced yet not ‘open-data-ed’: the data they have been trained on is proprietary, and since the decisions are based on exactly these trained models, the openness of the algorithms is less useful.

The CNNum, the French national digital council, is actually trying to figure out how to incorporate and implement these accountability principles in the French regulatory system. Moreover, their approach seemed really aware of the environment they are trying to regulate, in terms of diversity and the speed of change cycles, and they made a good point about the difficulty of actually assessing the impact of power asymmetry caused by algorithms.

Jun Huan (NSF) with the building blocks to model transparency holistically. Photo by author.


From the National Science Foundation (USA), Jun Huan went a step beyond accountability for algorithmic systems, towards their explainability and interpretability. In their research they are creating ways for ML systems to identify, and therefore indicate, when they are actually ‘learning new tasks’, inspired by the constructivist theory of human learning. It definitely sounded promising, though due to my shallow knowledge of the topic, I am prone to algo-sensationalism!

Simone Fischer-Hübner, professor at Karlstad University, chaired the session on ‘Discrimination and Data Ethics’. She presented real-world cases of discrimination and bias, such as racial discrimination in online ad serving, price discrimination based on the operating system used, predictive policing, and a case of gender discrimination in bibliometric profiling based on biased input data (women are less likely to self-cite). The last example is especially interesting because it highlights that when we refer to biased computer decision-making systems, we tend to refer to the computational side of the system. However, as the bibliometric profiling example shows, because women are less likely to self-cite, the ranking already carries the bias of the initial sample of reality.

Julia Stoyanovich explaining why we should assess data science in its whole lifecycle.

Julia Stoyanovich argued that we should be assessing fairness, accountability and transparency throughout the full data science lifecycle in order to be able to draw valid conclusions about such systems. Her research with Data Responsibly is centered on this topic as well. Last but not least, Gemma Galdon Clavell from Eticas added their own approach to research and consulting on ethical issues arising from technology, and stressed the importance of creating criteria for assessing the impact of technologies on society.

DTL funds projects globally, many of them awarded to university research groups, to create tools on the theme of data transparency. Part of the sessions was devoted to the tools developed during 2017. Most of the demos seemed to be browser add-ons (and an app, if I recall correctly) that inform users about privacy leaks and ad targeting. Relevant topics for sure, though I admit I did catch myself pondering the irony of most privacy-related add-ons being developed for Chrome.

The way I see it, these subjects should eventually be discussed with even wider audiences, since they will affect the majority of us in our daily lives. It is therefore great to hear first-hand from the researchers, dedicated circles and organisations who are actively working on these topics.

Keep it open!

The post Transparency, algorithms and data appeared first on Open Knowledge Finland.

A breath of fresh air at Open Knowledge Brasil

ariel-kogan - May 23, 2018 in Open Knowledge Brasil

I took over as executive director of Open Knowledge Brasil in July 2016 with the main goal of helping to organize, restructure and build a medium- to long-term plan for the organization. That has been our mission ever since and today, almost two years later, I can say that we have made good progress in that direction. We currently have four strategic programs – Escola de Dados, Gastos Abertos, Open Data Index and Ciência de Dados – all of them run by qualified people, with high potential to generate impact on society, and with good prospects for sustainability in the coming years. The organization is recognized in the country as one of the most relevant on the transparency agenda, with fundamental work in improving the ecosystem of data opening, structuring, analysis and data journalism. All of this while always maintaining a lean, flexible, dynamic and very efficient operational structure. The connections and partnerships with civil society organizations, academia, the media and public institutions at all three levels (federal/state/municipal) have been key in the process of improving the mechanisms of transparency and social oversight in the country. Knowledge of how to open public data to the best international standards and the use of innovative data science and machine learning technologies are, without a doubt, distinguishing features of our work.

I believe there could be no better person than Natália Mazotte to take over as executive director of the organization. She is well qualified for the role, in terms of knowledge and leadership ability, and shows an excellent disposition to face the challenges and crises the organization deals with day to day, and to see good opportunities in them. In the coming months, OKBR will work on consolidating three strategic units for the organization: communications, administration/finance and institutional development. These areas are fundamental to providing the support that the four programs and the various fronts of activity need. I am honored to have been invited to join the institutional development unit, which will focus mainly on fundraising and on coordination with strategic partners for the organization. I will also continue contributing to the Open Data Index program, in partnership with FGV-DAPP.

Transparency is perhaps the most important agenda for Brazil today. It is a vaccine against two of the main social ills we face: corruption and the polarization of society. Brazil could not have advanced in the fight against corruption without first having made great strides on the agenda of transparency, access to public information and integrity. In times of disinformation and polarization, the opening of public data, analysis and data journalism are key to offering the population quality information backed by primary data. Long live Open Knowledge Brasil, which will play a fundamental role in improving transparency, integrity and access to public information at the local level, a great challenge for Brazilian municipalities. The organization will also play a strategic role in the region, leading this agenda and helping with coordination and mobilization in Latin America.

MyData principles turn the GDPR’s clauses into services

Open Knowledge Finland - May 23, 2018 in aalto university, antti poikola, bbc, data ethics, Events, f-secure, Featured, fing, Google, human-centered, ihmiskeskeinen, konferenssi, My Data, mydata, mydata 2018, omadata, projects, viivi lähteenoja

MyData is an ethically sustainable approach to collecting and using personal data. The MyData 2018 conference brings the world’s leading experts to Helsinki on 29–31 August.

The European Union is showing the rest of the world the way in promoting privacy protection and digital rights as the EU’s General Data Protection Regulation (GDPR) takes effect on Friday 25 May.

“The data protection regulation is a significant step in the right direction. Legislation alone, however, is not enough to guarantee a fair information society or to feed innovative new business and technology. New practices and tools are needed to put these rights into practice. That is why we need MyData,” explains Antti Poikola, a researcher at Aalto University and one of the founders of the international MyData network.

The central aim of MyData is for people to be better able to control the data trails they leave behind every day. The MyData principles help citizens make use of information about themselves. The MyData model thus strengthens digital human rights and opens up opportunities for a new, human-centered data economy that respects its users.

“We need to get from the Wild West of the web to a situation in which the ethical use of personal data is always the most profitable course of action for companies too,” Poikola stresses.

Services that implement the MyData principles make users’ everyday lives easier by drawing on information from several different sources. For example, by combining data accumulated from activity wristbands and grocery purchases with health information, it is possible to offer an up-to-date overview and tailored advice.

Giant event digs into the first effects of the data protection regulation

The regulation’s first effects on companies and citizens will be weighed up in August, when the MyData conference, now being held for the third time, brings personal data professionals from around the world to Helsinki. The line-up of speakers includes more than a hundred top Finnish and international experts from, among others, Google, the BBC and F-Secure.

“More than 800 people from over 30 countries are expected to attend the MyData conference. They include business leaders, entrepreneurs, technology developers, lawyers, social scientists and activists. Besides the GDPR, the focus is on new business, the ethics of artificial intelligence and personal data, the interoperability of information systems, and the societal impacts of personal data,” says Viivi Lähteenoja, project director of the MyData conference.

This year the event’s main partner is the Finnish Innovation Fund Sitra, and it is organized by Open Knowledge Finland ry and Aalto University in cooperation with the French think tank Fing.


Viivi Lähteenoja, Project Director, MyData 2018, Open Knowledge Finland ry, viivi@mydata.org, +358 50 375 8274

Antti ‘Jogi’ Poikola, Programme Manager, MyData 2018, Aalto University, jogi@mydata.org, +358 44 337 5439


The MyData conference will be held for the third time on 29–31 August 2018 at the Helsinki Hall of Culture (Kulttuuritalo). The conference is the flagship event of the global MyData network, bringing together a multidisciplinary audience to learn from one another and to build a working data economy and a fair information society.

Open Knowledge Finland ry is a community-driven, non-profit civil society organization founded in 2012 that operates as part of the international Open Knowledge network. The association promotes the openness of information, the use of open data and the development of an open society.

Aalto University is a multidisciplinary community where science and art meet technology and business.

Fing is an independent French non-profit research organization that discovers, creates and shares new and practical ideas that anticipate digital changes.

The post MyData-periaatteilla luodaan GDPR:n pykälistä palveluita appeared first on Open Knowledge Finland.

Workshop on “Content verification tools” by the Department of Journalism and Mass Media of the Aristotle University of Thessaloniki, OK Greece and ΕΣΗΕΜ-Θ

Despoina Mantziari - May 23, 2018 in Εκδηλώσεις, Νέα

The Laboratories of Informatics Applications in Media and of Electronic Media of the Department of Journalism & Mass Media of the Aristotle University of Thessaloniki, in collaboration with the Open Knowledge Foundation Greece (OK Greece) and the Journalists’ Union of Macedonia and Thrace Daily Newspapers (ΕΣΗΕΜ-Θ), invite you to a workshop on “Content verification tools”. The workshop will take place on Wednesday 6 June 2018, and […]

Evidence Appraisal Data-Thon: A recap of our Open Data Day event

EvidenceBase - May 23, 2018 in health, Open Data Day, open data day 2018, Open Research, open research data, Open Science

This blog has been reposted from Medium. This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Open Research Data theme.

Research can save lives, reduce suffering, and help with scientific understanding. But research can also be unethical, unimportant, invalid, or poorly reported. These issues can harm health, waste scientific and health resources, and reduce trust in science. Differentiating good science from bad, therefore, has big implications. This is happening in the midst of broader discussions about differentiating good information from misinformation. The current controversy over political ‘fake news’ has received significant attention. Public scientific misinformation and academic scientific misinformation are also published, much of it derived from low-quality science.

EvidenceBase is a global, informal, voluntary organization aimed at boosting and starting tools and infrastructure that enhance scientific quality and usability. The critical appraisal of science is one of many mechanisms seeking to evaluate and clarify published science, and evidence appraisal is a key area of EvidenceBase’s work. On March 3rd we held an Open Data Day event to introduce the public to evidence appraisal and to explore and work on an open dataset of appraisals. We reached out to a network in NYC of data scientists, software developers, public health professionals, and clinicians and invited them and their interested friends (including any without health, science, or data training).


Our data came from the US National Library of Medicine’s PubMed and PubMed Central datasets. PubMed offers indexing, metadata, and abstracts for biomedical publications, and PubMed Central (PMC) offers full text in PDF and/or XML. PMC has an open-access subset. We explored the portion of this subset that 1) was indexed in PubMed as a “journal comment” and 2) was a comment on a clinical trial. Our 10-hour event began with a session introducing the general areas of health trials, research issues, and open data; for the remainder of the day, parallel groups tackled three areas: lay exploration and Q&A; dataset processing and word embedding development; and health expertise-guided manual exploration and annotation of comments. We had 2 data scientists, 4 trial experts, 3 physicians, 4 public health practitioners, 4 participants without background but with curiosity, and 1 infant. Our space was donated, and the food was paid for by a mix of an Open Data Day grant provided by SPARC and Open Knowledge International (thank you!) and voluntary participant donations.

On the dataset front, we leveraged the clinical trial and journal comment metadata in PubMed, the links between PubMed and PMC, and PMC’s open subset IDs to create a data subset consisting solely of journal comments on clinical trials that were in PMC’s open subset with XML data. Initial exploration of this subset for quality issues showed us that PubMed metadata tags misindex non-trials as trials and non-comments as comments. Further data curation will be needed. We did use it to create word embeddings and do some brief similarity-based expansion.
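
As a rough illustration of the subsetting described above, the sketch below keeps only journal comments that point at a clinical trial and whose full text is in an open, XML-available set. The record layout (`pmid`, `publication_types`, `comment_on`, `pmcid`) and the function name are assumptions made for this example, not the actual PubMed or PMC schema, and this is not the code used at the event.

```python
def select_open_trial_comments(records, open_xml_pmcids):
    """Return PMIDs of comments on clinical trials with open XML full text.

    `records` is a list of dicts in a hypothetical layout; `open_xml_pmcids`
    is a set of PMC IDs assumed to be in the open subset with XML available.
    """
    by_pmid = {record["pmid"]: record for record in records}
    selected = []
    for record in records:
        # Step 1: the record must be indexed as a journal comment.
        if "Comment" not in record["publication_types"]:
            continue
        # Step 2: the article it comments on must be indexed as a trial.
        target = by_pmid.get(record.get("comment_on"))
        if target is None or "Clinical Trial" not in target["publication_types"]:
            continue
        # Step 3: the comment itself must have open full text in XML.
        if record.get("pmcid") in open_xml_pmcids:
            selected.append(record["pmid"])
    return selected


# Toy records, invented purely for illustration.
sample_records = [
    {"pmid": 1, "publication_types": ["Clinical Trial"], "pmcid": "PMC1"},
    {"pmid": 2, "publication_types": ["Comment"], "comment_on": 1, "pmcid": "PMC2"},
    {"pmid": 3, "publication_types": ["Comment"], "comment_on": 1, "pmcid": None},
    {"pmid": 4, "publication_types": ["Review"], "pmcid": "PMC4"},
    {"pmid": 5, "publication_types": ["Comment"], "comment_on": 4, "pmcid": "PMC5"},
]
open_subset = {"PMC2", "PMC5"}
trial_comment_pmids = select_open_trial_comments(sample_records, open_subset)
```

Because the metadata tags themselves can be wrong, as noted above, a filter like this only narrows the candidates; manual review of the result is still needed.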


The domain experts reviewed trials in their area of expertise. Some participants manually extracted text fragments expressing a single appraisal assertion, and attempted to generalize the assertion for future structured knowledge representation work. Overall participants had a fun, productive, and educational time! From the standpoint of EvidenceBase, the event was a success and was interesting. We are mainly virtual and global, so this in person event was new for us, energizing, and helped forge new relationships for the future.

We also learned:

  • We can’t have too much on one person’s plate for logistics and for facilitation. Issues will happen (e.g. food cancellation last minute).
  • Curiosity abounds, and people are thirsty for meaningful and productive social interactions beyond their jobs. They just need to be invited, otherwise this potential group will not be involved.
  • Many people who have data science skills have jobs in industries they don’t love; they have a particular thirst to leverage their skills for good.
  • People without data science expertise but who have domain expertise are keen on exploring the data and offering insight. This can help make sense of it, and can help identify issues (e.g. data quality issues, synonyms, subfield-specific differences).
  • People with neither domain expertise nor data science skills still add vibrancy to these events, though the event organizers need more bandwidth to help orient and facilitate the involvement of these attendees.
  • Public research data sets are messy, and often require further subsetting or transformation to make them usable and high quality.
  • Open data might have license and accessibility barriers. For us, this resulted in a large reduction in journal comments with full-text vs. not, and of those with full-text, a further large reduction in those where the text was open-access and licensed for use in text mining.

We’ll be continuing to develop the data set and annotations started here, and we look forward to the next Open Data Day. We may even host a data event before then!

As the LAI turns 6, information requests without anonymity expose citizens and weaken democracy

Open Knowledge Brasil - May 22, 2018 in Destaque, Lei de Acesso

By Vitor Baptista*

Brazil is one of the most developed countries in the world when it comes to open data. According to the Global Open Data Index, we rank 8th among the countries with the greatest transparency. Much of this is due to the Access to Information Law (Lei de Acesso à Informação, LAI), which has now been in force for 6 years and guarantees that public information is made available. It is one of the most important tools for social oversight, with more than 500,000 requests already made at the federal level alone. The LAI does not restrict requests to any citizen. Anyone can request information, but there is one condition: the requester must be identified. This identification opens the door to reprisals.

Last week the organization Artigo 19 released the report “Identidade revelada: Entraves na busca por informações públicas no Brasil” (“Identity revealed: obstacles in the search for public information in Brazil”), which contains accounts of 16 cases in which civil servants treated an information request differently because they knew the requester’s identity. The reprisals include embarrassment, intimidation and even persecution. Among the cases cited is that of journalist Luiz Fernando Toledo, who discovered that the chief of staff of São Paulo’s Secretariat of Communication said he made it harder to release data to journalists, trying to get them to give up on the story.

The problem is not exclusive to Brazil, and can have more serious consequences. Recently an investigative journalist and his fiancée were murdered in Slovakia. The suspicion is that his identity was leaked by a civil servant who realized he was investigating a potential relationship between the Italian mafia and the Slovak government. All these cases could be avoided with a simple measure: anonymous information requests. One of the Brazilian government’s commitments in the Open Government Partnership (OGP) is to “(…) allow the protection of requesters’ identity, in justifiable cases, by means of adjustments to request procedures and channels”. The only way to guarantee that requesters’ identities are not known is to allow anonymous requests.

Since the Access to Information Law does not allow this, people who want to protect their identity turn to their circle of friends. Instead of making the request in their own name, the person asks an acquaintance who has no relationship with the agency (better still if they live in another state). This works, but it depends on the goodwill of a friend, who may still be putting themselves at risk by making the request in their name. It is possible to do better. With the support of Open Knowledge Brasil, a group of activists is developing a system that will allow the public to make access-to-information requests without revealing their identity. It is much like asking a friend to make the request in their name, only without putting anyone at risk. It works like this:
  1. You visit a website of ours, choose the agency and write the request.
  2. The website generates a protocol number. This number is what lets you track the request.
    1. Since this website is ours, we require no form of identification and keep no access logs. Not even we know who made the request. This is the only way to guarantee the citizen’s anonymity.
  3. The request goes through moderation to confirm that it is valid and not (for example) an anonymous threat.
  4. Once it passes moderation, the system creates the request in the system of the agency in question, using the name of an organization that has donated its identity (e.g. Open Knowledge Brasil).
  5. The system then monitors the request in the agency’s system, recording any responses in our system. That way, the next time the user looks up the request by its protocol number, they will see the agency’s responses.
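
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class, method and status names are hypothetical, and the in-memory dictionary stands in for whatever storage and agency integration the real system uses.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InformationRequest:
    # Deliberately holds nothing about the requester (no name, e-mail or IP).
    agency: str
    text: str
    status: str = "awaiting_moderation"
    filed_as: Optional[str] = None
    responses: List[str] = field(default_factory=list)


class AnonymousRelay:
    """Hypothetical relay that files requests under a donor organization."""

    def __init__(self, donor_organizations: List[str]):
        self.donors = list(donor_organizations)
        self.requests: Dict[str, InformationRequest] = {}

    def submit(self, agency: str, text: str) -> str:
        # Steps 1-2: store the request and return a random protocol number,
        # the only handle the requester needs to track it later.
        protocol = secrets.token_hex(8)
        self.requests[protocol] = InformationRequest(agency, text)
        return protocol

    def moderate(self, protocol: str, approved: bool) -> None:
        # Steps 3-4: only approved requests are filed, under a donor's name.
        request = self.requests[protocol]
        if approved:
            request.filed_as = secrets.choice(self.donors)
            request.status = "filed"
        else:
            request.status = "rejected"

    def record_response(self, protocol: str, response: str) -> None:
        # Step 5: copy the agency's answer into the relay's own records.
        self.requests[protocol].responses.append(response)

    def lookup(self, protocol: str) -> dict:
        request = self.requests[protocol]
        return {"status": request.status, "responses": list(request.responses)}


relay = AnonymousRelay(["Open Knowledge Brasil"])
protocol = relay.submit("Example agency", "Please provide the list of contracts.")
relay.moderate(protocol, approved=True)
relay.record_response(protocol, "See the attached spreadsheet.")
result = relay.lookup(protocol)
```

The point of the design is that the protocol number is the only link between a person and a request, so anyone who holds the number can read the responses and nobody, including the operators, can tell who asked.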
From the agency’s point of view, the one making the request is the organization. In this way we guarantee the anonymity of the real requester, allowing anyone to exercise their right to information without fear of reprisals. In the future, we hope more organizations will “donate” their identity, so that not all requests are in the name of a single organization. The project is still under development and takes its name from an earlier Open Knowledge project that also aimed to make access-to-information requests easier in Brazil: Queremos Saber. We will soon launch an initial version with support for some public agencies. To expand it, however, we will need your help. Can you help with developing or testing the tools? Want to understand how you can help the project by “donating” your organization’s identity, so that requests are made in its name? Are you, or do you know, a potential funder? Or do you simply have a suggestion or question? Get in touch with us by writing to queremossaber@ok.org.br. The less fear, the greater the public’s participation and the more effective the Access to Information Law. This is our 6th-birthday present to the LAI.

* Vitor Baptista is an associate member of Open Knowledge Brasil and was a developer at Open Knowledge International.