You are browsing the archive for Data Journalism.

OK Greece supports the “Frictionless Data” project

Spyridoula Markou - July 22, 2017 in Data Journalism, Featured, Featured @en, Applications, News

The article that follows is a translated republication of the interview given by OK Greece to http://frictionlessdata.io. The Open Knowledge Foundation Greece, the official Chapter of Open Knowledge International, was founded in 2012 by a group of academics, developers, citizens, hackers and public-sector representatives. We are supported by a national network of volunteers, most of whom are […]

Data is a Team Sport: One on One with Daniela Lepiz

Dirk Slater - July 3, 2017 in community, Data Blog, Data Journalism, data literacy, Event report, Fabriders, Nigeria, research, Team Sport, West Africa

Data is a Team Sport is our open-research project exploring the data literacy ecosystem and how it is evolving in the wake of post-fact, fake news and data-driven confusion. We are producing a series of videos, blog posts and podcasts based on a series of online conversations we are having with data literacy practitioners. To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher. This episode features a one-on-one conversation with Daniela Lepiz, a Costa Rican data journalist and trainer, who is currently the Investigation Editor for CENOZO, a West African investigative journalism project that aims to promote and support cross-border data investigation and open data in the region. She has a master’s degree in data journalism from the Rey Juan Carlos University in Madrid, Spain. She was previously involved with OpenUP South Africa, working with journalists to produce data-driven stories. Daniela is also a trainer for the Tanzania Media Foundation and has been involved in many other projects with South African media, La Nacion in Costa Rica and other international organisations.

Notes from the conversation

Daniela spoke to us from Burkina Faso and reflected on the role of journalism, and particularly data-driven journalism, in functioning democracies. The project she is working on empowers journalists working across borders in western Africa to use data to expose corruption and human rights violations. To identify journalists to participate in the project, they have looked for individuals who are experienced, passionate and curious. The project engages existing media houses, such as Premium Times in Nigeria, to ensure that there are places for their stories to appear. Important points Daniela raises:
  • Media is continually evolving, and Daniela expects data literacy to become a required proficiency within the next five years.
  • The biggest barrier to achieving open data in government is government officials who resist transparency.
  • Journalists have a real fear of having to be proficient in maths when they consider improving their skills to produce data-driven stories. They often fail to realise that it’s about working with others who have skills in statistics and data analysis.
  • Trust in media has declined sharply, which means journalists have to work that much harder, particularly in labelling content as opinion or as biased.

Resources she finds inspiring

Her blog posts

The full online conversation:

Daniela’s bookmarks!

These are the resources she uses most often.

.Rddj – Resources for doing data journalism with R
Comparing Columns in Google Refine | OUseful.Info, the blog…
Journalist datastores: where can you find them? A list. | Simon Rogers
AidInfoPlus – Mastering Aid Information for Change

Data skills

Mapping tip: how to convert and filter KML into a list with Open Refine | Online Journalism Blog
Mapbox + Weather Data
Encryption, Journalism and Free Expression | The Mozilla Blog
Data cleaning with Regular Expressions (NICAR) – Google Docs
NICAR 2016 Links and Tips – Google Docs
Teaching Data Journalism: A Survey & Model Curricula | Global Investigative Journalism Network
Data bulletproofing tips for NICAR 2016 – Google Docs
Using the command line tabula extractor tool · tabulapdf/tabula-extractor Wiki · GitHub
Talend Downloads

Github

Git Concepts – SmartGit (Latest/Preview) – Confluence
GitHub For Beginners: Don’t Get Scared, Get Started – ReadWrite
Kartograph.org
LittleSis – Profiling the powers that be

Tableau customized polygons

How can I create a filled map with custom polygons in Tableau given point data? – Stack Overflow
Using Shape Files for Boundaries in Tableau | The Last Data Bender
How to make custom Tableau maps
How to map geographies in Tableau that are not built in to the product (e.g. UK postcodes, sales areas) – Dabbling with Data
Alteryx Analytics Gallery | Public Gallery
TableauShapeMaker – Adding custom shapes to Tableau maps | Vishful thinking…
Creating Tableau Polygons from ArcGIS Shapefiles | Tableau Software
Creating Polygon-Shaded Maps | Tableau Software
Tool to Convert ArcGIS Shapefiles into Tableau Polygons | Tableau and Behold!
Polygon Maps | Tableau Software
Modeling April 2016
5 Tips for Making Your Tableau Public Viz Go Viral | Tableau Public
Google News Lab
HTML and CSS
Open Semantic Search: Your own search engine for documents, images, tables, files, intranet & news
Spatial Data Download | DIVA-GIS
Linkurious – Linkurious – Understand the connections in your data
Apache Solr
Apache Tika – Apache Tika
Neo4j Graph Database: Unlock the Value of Data Relationships
SQL: Table Transformation | Codecademy
dc.js – Dimensional Charting Javascript Library
The People and the Technology Behind the Panama Papers | Global Investigative Journalism Network
How to convert XLS file to CSV in Command Line [Linux]
Intro to SQL (IRE 2016) · GitHub
Malik Singleton – SELECT needle FROM haystack;
Investigative Reporters and Editors | Tipsheets and links
Investigative Reporters and Editors | Tipsheets and Links

SQL_PYTHON

More data

2016-NICAR-Adv-SQL/SQL_queries.md at master · taggartk/2016-NICAR-Adv-SQL · GitHub
advanced-sql-nicar15/stats-functions.sql at master · anthonydb/advanced-sql-nicar15 · GitHub
Malik Singleton – SELECT needle FROM haystack;
Statistical functions in MySQL • Code is poetry
Data Analysis Using SQL and Excel – Gordon S. Linoff – Google Books
Using PROC SQL to Find Uncommon Observations Between 2 Data Sets in SAS | The Chemical Statistician
mysql – Query to compare two subsets of data from the same table? – Database Administrators Stack Exchange
sql – How to add “weights” to a MySQL table and select random values according to these? – Stack Overflow
sql – Fast mysql random weighted choice on big database – Stack Overflow
php – MySQL: Select Random Entry, but Weight Towards Certain Entries – Stack Overflow
MySQL Moving average
Calculating descriptive statistics in MySQL | codediesel
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, …
R, MySQL, LM and quantreg
26318_AllText_Print.pdf
ddi-documentation-english-572 (1).pdf
Categorical Data — pandas 0.18.1+143.g3b75e03.dirty documentation
python – Loading STATA file: Categorial values must be unique – Stack Overflow
Using the CSV module in Python
14.1. csv — CSV File Reading and Writing — Python 3.5.2rc1 documentation
csvsql — csvkit 0.9.1 documentation
weight samples with python – Google Search
python – Weighted choice short and simple – Stack Overflow
7.1. string — Common string operations — Python v2.6.9 documentation
Introduction to Data Analysis with Python | Lynda.com
A Complete Tutorial to Learn Data Science with Python from Scratch
GitHub – fonnesbeck/statistical-analysis-python-tutorial: Statistical Data Analysis in Python
Verifying the email – Email Checker
A little tour of aleph, a data search tool for reporters – pudo.org (Friedrich Lindenberg)
Welcome – Investigative Dashboard Search
Investigative Dashboard
Working with CSVs on the Command Line
FiveThirtyEight’s data journalism workflow with R | useR! 2016 international R User conference | Channel 9
Six issue when installing package · Issue #3165 · pypa/pip · GitHub
python – Installing pip on Mac OS X – Stack Overflow
Source – Journalism Code, Context & Community – A project by Knight-Mozilla OpenNews
Introducing Kaggle’s Open Data Platform
NASA just made all the scientific research it funds available for free – ScienceAlert
District council code list | Statistics South Africa
How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code – Cloudera Engineering Blog
GitHub – gavinr/geojson-csv-join: A script to take a GeoJSON file, and JOIN data onto that file from a CSV file.
7 command-line tools for data science
Python Basics: Lists, Dictionaries, & Booleans
Jupyter Notebook Viewer

PYTHON FOR JOURNALISTS

New folder

Reshaping and Pivot Tables — pandas 0.18.1 documentation
Reshaping in Pandas – Pivot, Pivot-Table, Stack and Unstack explained with Pictures – Nikolay Grozev
Pandas Pivot-Table Example – YouTube
pandas.pivot_table — pandas 0.18.1 documentation
Pandas Pivot Table Explained – Practical Business Python
Pivot Tables In Pandas – Python
Pandas .groupby(), Lambda Functions, & Pivot Tables
Counting Values & Basic Plotting in Python
Creating Pandas DataFrames & Selecting Data
Filtering Data in Python with Boolean Indexes
Deriving New Columns & Defining Python Functions
Python Histograms, Box Plots, & Distributions
Resources for Further Learning
Python Methods, Functions, & Libraries
Python Basics: Lists, Dictionaries, & Booleans
Real-world Python for data-crunching journalists | TrendCT
Cookbook — agate 1.4.0 documentation
3. Power tools — csvkit 0.9.1 documentation
Tutorial — csvkit 0.9.1 documentation
4. Going elsewhere with your data — csvkit 0.9.1 documentation
2. Examining the data — csvkit 0.9.1 documentation
A Complete Tutorial to Learn Data Science with Python from Scratch
For Journalism
ProPublica Summer Data Institute
Percentage of vote change | CARTO
Data Science | Coursera
Data journalism training materials
Pythex: a Python regular expression editor
A secure whistleblowing platform for African media | afriLEAKS
PDFUnlock! – Unlock secured PDF files online for free.
The digital journalist’s toolbox: mapping | IJNet
Bulletproof Data Journalism – Course – LEARNO
Transpose columns across rows (grefine 2.5) ~ RefinePro Knowledge Base for OpenRefine
Installing NLTK — NLTK 3.0 documentation
1. Language Processing and Python
Visualize any Text as a Network – Textexture
10 tools that can help data journalists do better work, be more efficient – Poynter
Workshop Attendance
Clustering In Depth · OpenRefine/OpenRefine Wiki · GitHub
Regression analysis using Python
DataBasic.io
R for Every Survey Analysis – YouTube
Git – Book
NICAR17 Slides, Links & Tutorials #NICAR17 // Ricochet by Chrys Wu
Register for Anonymous VPN Services | PIA Services
The Bureau of Investigative Journalism
dtSearch – Text Retrieval / Full Text Search Engine
Investigation, Cybersecurity, Information Governance and eDiscovery Software | Nuix
How we built the Offshore Leaks Database | International Consortium of Investigative Journalists
Liz Telecom/Azimmo – Google Search
First Python Notebook — First Python Notebook 1.0 documentation
GitHub – JasonKessler/scattertext: Beautiful visualizations of how language differs among document types

Data is a Team Sport: Data-Driven Journalism

Dirk Slater - June 20, 2017 in Anti-corruption, community, Data Blog, data driven journalism, Data Journalism, data literacy, Event report, Fabriders, Gender Data, research, Rights

Our podcast series explores the ever-evolving data literacy ecosystem. Cut and paste this link into your podcast app to subscribe: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher. In this episode we speak with two veteran data literacy practitioners who have been involved with developing data-driven journalism teams. Our guests:
  • Eva Constantaras is a data journalist who specializes in building data journalism teams in developing countries. These teams have reported from across Latin America, Asia and East Africa on topics ranging from displacement and kidnapping by organized crime networks to extractive industries and public health. As a Google Data Journalism Scholar and a Fulbright Fellow, she developed a course for investigative and data journalism in high-risk environments.
  • Natalia Mazotte is Program Manager of School of Data in Brazil and founder and co-director of the digital magazine Gender and Number. She has a Master’s degree in Communications and Culture from the Federal University of Rio de Janeiro and a specialization in Digital Strategy from Pompeu Fabra University (Barcelona/Spain). Natalia has been teaching data skills in different universities and newsrooms around Brazil. She also works as an instructor in online courses at the Knight Center for Journalism in the Americas, a project of the University of Texas, and writes for international publications such as SGI News, Bertelsmann-Stiftung, EurActiv and Nieman Lab.

Notes from this episode

They both describe the lessons learned in getting journalists to use data that can drive social change. For Eva, journalists need to work harder: just reporting that corruption exists is not enough. Natalia talks about how they use data on gender to drive debate and discussion around equality. Good journalism is critical for democracy, and this includes data-driven journalism that uncovers facts and gets at the root causes.

Gaps in the Data Literacy Ecosystem:

Natalia points out that corporations and governments have the power because they are data-literate and can use data effectively, while people in low-income communities, such as favelas, really suffer because they are at the mercy of whatever story gets told by looking at the ‘official’ data. Eva feels that there has been too much emphasis on short-term, quick solutions from individuals who have put a lot of money into making sure that data is ready and accessible. Donors need to support more long-term efforts and engagement around data literacy.

Adjusting to a ‘post-fact’ world means:

Western journalists have spent too much time reporting on polling data rather than on policies, and it’s important for newer journalists to understand why that was problematic. In Brazil, the mainstream media is focusing on ‘what’s happened’ while independent media is focusing on ‘why it’s happened’, and this means the media landscape is changing.

They also talked about:

  • Ethics and the responsibility inherent in gathering and storing data, along with the grey areas around privacy.
  • How to get media outlets to value data-driven journalism by getting them to understand that people are increasingly getting their ‘breaking news’ from social media, so they need to look at providing more in-depth stories.

They wanted to plug:

Readings/Resources they find inspiring for their work.

Resources contributed from the participants:

View the online conversation in full:

Public bodies and open data in Greece

Spyridoula Markou - June 18, 2017 in Data Journalism, Featured, Featured @en, News, Uncategorized @en

In recent years, open data has dominated discussions about transparency and the flow of information between organisations (public and otherwise) and citizens. Greece, in compliance with the EU directive (2013/37/EU) and the international practice followed for open data, has had a legislative framework in place since 2006 […]

#DataJournalismDisclosure: Secret surveillance… from the air

Despoina Mantziari - May 28, 2017 in Data Journalism, Featured, Featured @en

By Despoina Mantziari and Katerina Bakirtzi. In April 2016, journalists Peter Aldhous and Charles Seife “shook” America with their revelations about systematic aerial surveillance by US security agencies (the FBI and DHS) inside the country, particularly over major cities. The aircraft of the secret government agencies were flying just one […]

How can journalists best handle public fiscal data and produce stories based on it? An interview with Nicolas Kayser-Bril

Despoina Mantziari - May 22, 2017 in Data Journalism, Featured, Featured @en

By Diana Krebs. Nicolas Kayser-Bril is the former CEO and co-founder of Journalism++ (J++), a group of investigative journalists that specialises in data journalism. As part of OKI’s involvement in Openbudgets.eu, we had the good fortune to work with J++ on the question of how public budgets and spending data can […]

How can journalists best handle public fiscal data to produce data-driven stories? An interview with Nicolas Kayser-Bril

Diana Krebs - May 16, 2017 in Data Journalism

Nicolas Kayser-Bril is the former CEO and co-founder of Journalism++ (J++), a group of investigative journalists that specialises in data-driven reporting. As part of OKI’s own involvement in Openbudgets.eu, we had the good fortune of working with J++ on the question of how public budget and spending data can be used to tackle corruption. In this short interview, Diana Krebs (Project Manager for Fiscal Projects at OKI) asked Nicolas about his experience of how journalists today can best handle public fiscal data to produce data-driven stories.

Are journalists today equipped to work with fiscal data such as budget and spending data?

Different sorts of journalists use budget and spending data in different ways. Investigative outlets such as the International Consortium of Investigative Journalists (of Panama-Papers fame) or investigative lone wolves such as Dirk Laabs (who investigated privatizations in East Germany) are very much able to seek and use such data. Most other types of journalists are not able to do so.

Where do you see the gaps? What kind of skill sets, technical and non-technical, do journalists need to have to write data-driven stories that stick and are watertight?

The largest gap is the lack of incentive. Very few journalists are tasked with investigating government spending and budgets. The ones who do, either because they are interested in the topic or because they are paid investigative journalists, sometimes lack the field-specific expertise that allows for quick judgments. One can only know what’s abnormal (and therefore newsworthy) if one knows what the normal state of things is. In public budgets, few journalists know what is normal and what’s not.

Do you think it’s helpful for journalists to, when in doubt, work closely with experts from the public administration to enhance their fiscal data knowledge?

Journalists are trained to find experts to illustrate their articles or to provide information. It would help to have easy-to-reach experts on public funding that journalists could contact.

What are the ingredients for a sustainable increase of fiscal data knowledge among journalists, so that the public can be informed in a credible and informative way?

These are two different issues; it would be a mistake to believe that the information the public receives is in any way linked to the work of journalists. This was true in the last century, when journalists were de facto intermediaries between what happened and reports of what had happened. (They were de facto intermediaries because all means of communication involved a need to package information for film, radio, TV or newspapers.)

For journalists to produce more content on budget and spending issues, they must be incentivised to do so by their organizations. This could mean news organizations shifting their focus towards public accountability. Organizations that have done so, such as ProPublica in the USA and Correctiv in Germany, happen to employ journalists who know how to decipher budget data.

For the public to be informed about public budgets and spending, the availability of interesting and entertaining content on the issue would help. However, demand for such content could also be boosted by the administration, which could celebrate citizens who ask questions about public budgets, something that is currently not the case. It could also teach the basics of how government, and government finance, works at school, which is barely done, if at all.

J++ has developed several projects around unlocking fiscal data, such as Cookingbudgets.com, a quite serious satirical tutorial website that helps journalists and civil society activists look for budget stories in the public administration. Their latest coup is “The Good, the Bad and the Accountant”, an interactive online application that puts users in the shoes of the manager of a big city to learn about and recognize patterns of corruption within the public administration.

First day of the data journalism workshop completed: Red Flags in ESPA

Spyridoula Markou - April 2, 2017 in Data Journalism, Featured, Featured @en, Events, Applications

The first day of the workshop “Data Journalism Hackathon: Red Flags in ESPA programmes” was completed successfully. It was held on 31 March by the Open Knowledge Foundation Greece, the Media Informatics Lab (School of Journalism and Mass Communications, Aristotle University of Thessaloniki) and the Journalists’ Union of Macedonia and Thrace Daily Newspapers. In the first part of the workshop, Mr. […]

Open Knowledge Greece at the European Parliament conference on fake news

Spyridoula Markou - March 3, 2017 in Data Journalism, Featured, Featured @en, Events

A speaker at the one-day conference on “‘Fake News’ in Social Media as Reality Shapers: Unfounded information and legitimacy crisis at the new media period” will be the coordinator of the School of Data of the Open Knowledge Foundation Greece, Mr. Andreas Veglis, Professor of Journalism and Mass Media at the Aristotle University of Thessaloniki. The event, which will take place on 8 March 2017 at […]