You are browsing the archive for Data Journalism.

New edition of the Data Journalism Handbook to explore journalistic interventions in the data society

Χριστίνα Καρυπίδου - January 20, 2018 in Data Journalism, Featured, Featured @en, News

By Jonathan Gray. This post has been republished from http://jonathangray.org/2017/12/20/new-edition-data-journalism-handbook/ The first edition of The Data Journalism Handbook has been widely used and widely cited by students, practitioners and researchers alike, serving as both textbook and sourcebook for an emerging field. It has been translated into more than 12 languages and is used for […]

Publication: A Field Guide to “Fake News” and Other Information Disorders

Jonathan Gray - January 18, 2018 in Data Journalism, News

This blog has been reposted from http://jonathangray.org/2018/01/08/field-guide-to-fake-news/
Last week saw the launch of A Field Guide to “Fake News” and Other Information Disorders, a new free and open access resource to help students, journalists and researchers investigate misleading content, memes, trolling and other phenomena associated with recent debates around “fake news”.

The field guide responds to an increasing demand for understanding the interplay between digital platforms, misleading information, propaganda and viral content practices, and their influence on politics and public life in democratic societies. It contains methods and recipes for tracing trolling practices, the publics and modes of circulation of viral news and memes online, and the commercial underpinnings of this content. The guide aims to be an accessible learning resource for digitally-savvy students, journalists and researchers interested in this topic.

The guide is the first project of the Public Data Lab, a new interdisciplinary network to facilitate research, public engagement and debate around the future of the data society – which includes researchers from several universities in Europe, including King’s College London, Sciences Po Paris, Aalborg University in Copenhagen, Politecnico of Milano, INRIA, École Normale Supérieure of Lyon and the University of Amsterdam. It has been undertaken in collaboration with First Draft, an initiative dedicated to improving skills and standards in the reporting and sharing of information that emerges online, which is now based at the Shorenstein Center on Media, Politics, and Public Policy at the John F. Kennedy School of Government at Harvard University.

Claire Wardle, who leads First Draft, comments on the release: “We are so excited to support this project as it provides journalists and students with concrete computational skills to investigate and map these networks of fabricated sites and accounts. Few people fully recognize that in order to understand the online disinformation ecosystem, we need to develop these computational mechanisms for monitoring this type of manipulation online. This project provides these skills and techniques in a wonderfully accessible way.”

A number of universities and media organisations have been testing, using and exploring a first sample of the guide which was released in April 2017. Earlier in the year, BuzzFeed News drew on several of the methods and datasets in the guide in order to investigate the advertising trackers used on “fake news” websites.

The guide is freely available on the project website at fakenews.publicdatalab.org, as well as on Zenodo at doi.org/10.5281/zenodo.1136271. It is released under a Creative Commons Attribution license to encourage readers to freely copy, translate, redistribute and reuse the book. A translation is underway into Japanese. All the assets necessary to translate and publish the guide in other languages are available on the Public Data Lab’s GitHub page. Further details about contributing researchers, institutions and collaborators are available on the website.

The project is being launched at the Digital Methods Winter School 2018 organised by the Digital Methods Initiative at the University of Amsterdam, a year after we first started working on the project at the Winter School 2017. We are also in discussion with Sage about a book drawing on this project.

New edition of Data Journalism Handbook to explore journalistic interventions in the data society

Jonathan Gray - January 12, 2018 in Data Journalism, data journalism handbook, data literacy, journalism, Open Access

This blog has been reposted from http://jonathangray.org/2017/12/20/new-edition-data-journalism-handbook/
The first edition of The Data Journalism Handbook has been widely used and widely cited by students, practitioners and researchers alike, serving as both textbook and sourcebook for an emerging field. It has been translated into over 12 languages – including Arabic, Chinese, Czech, French, Georgian, Greek, Italian, Macedonian, Portuguese, Russian, Spanish and Ukrainian – and is used for teaching at many leading universities, as well as teaching and training centres around the world. A huge amount has happened in the field since the first edition in 2012. The Panama Papers project undertook an unprecedented international collaboration around a major database of leaked information about tax havens and offshore financial activity. Projects such as The Migrants’ Files, The Guardian’s The Counted and ProPublica’s Electionland have shown how journalists are not just using and presenting data, but also creating and assembling it themselves in order to improve data journalistic coverage of issues they are reporting on.

The Migrants’ Files saw journalists in 15 countries work together to create a database of people who died in their attempt to reach or stay in Europe.

Changes in digital technologies have enabled the development of formats for storytelling, interactivity and engagement with the assistance of drones, crowdsourcing tools, satellite data, social media data and bespoke software tools for data collection, analysis, visualisation and exploration. Data journalists are not simply using data as a source, they are also increasingly investigating, interrogating and intervening around the practices, platforms, algorithms and devices through which it is created, circulated and put to work in the world. They are creatively developing techniques and approaches which are adapted to very different kinds of social, cultural, economic, technological and political settings and challenges. Five years after its publication, we are developing a revised second edition, which will be published as an open access book with an innovative academic press. The new edition will be significantly overhauled to reflect these developments. It will complement the first edition with an examination of the current state of data journalism which is at once practical and reflective, profiling emerging practices and projects as well as their broader consequences.

“The Infinite Campaign” by Sam Lavigne (New Inquiry) repurposes ad creation data in order to explore “the bizarre rubrics Twitter uses to render its users legible”.

Contributors to the first edition include representatives from some of the world’s best-known newsrooms and data journalism organisations, including the Australian Broadcasting Corporation, the BBC, the Chicago Tribune, Deutsche Welle, The Guardian, the Financial Times, Helsingin Sanomat, La Nacion, the New York Times, ProPublica, the Washington Post, the Texas Tribune, Verdens Gang, Wales Online, Zeit Online and many others. The new edition will include contributions from both leading practitioners and leading researchers of data journalism, exploring a diverse constellation of projects, methods and techniques in this field from voices and initiatives around the world. We are working hard to ensure a good balance of gender, geography and themes. Our approach in the new edition draws on the notion of “critical technical practice” from Philip Agre, which he formulates as an attempt to have “one foot planted in the craft work of design and the other foot planted in the reflexive work of critique” (1997). Similarly, we wish to provide an introduction to a major new area of journalism practice which is at once critically reflective and practical. The book will offer reflection from leading practitioners on their experiments and experiences, as well as fresh perspectives on the practical considerations of research on the field from leading scholars. The structure of the book reflects different ways of seeing and understanding contemporary data journalism practices and projects. The introduction highlights the renewed relevance of a book on data journalism in the current so-called “post-truth” moment, examining the resurgence of interest in data journalism, fact-checking and strengthening the capacities of “facty” publics in response to fears about “alternative facts” and the speculation about a breakdown of trust in experts and institutions of science, policy, law, media and democracy.
As well as reviewing a variety of critical responses to data journalism and associated forms of datafication, it looks at how this field may nevertheless constitute an interesting site of progressive social experimentation, participation and intervention. The first section on “data journalism in context” will review histories, geographies, economics and politics of data journalism – drawing on leading studies in these areas. The second section on “data journalism practices” will look at a variety of practices for assembling data, working with data, making sense with data and organising data journalism from around the world. This includes a wide variety of case studies – including the use of social media data, investigations into algorithms and fake news, the use of networks, open source coding practices and emerging forms of storytelling through news apps and data animations. Other chapters look at infrastructures for collaboration, as well as creative responses to disappearing data and limited connectivity. The third and final section on “what does data journalism do?”, examines the social life of data journalism projects, including everyday encounters with visualisations, organising collaborations across fields, the impacts of data projects in various settings, and how data journalism can constitute a form of “data activism”. As well as providing a rich account of the state of the field, the book is also intended to inspire and inform “experiments in participation” between journalists, researchers, civil society groups and their various publics. This aspiration is partly informed by approaches to participatory design and research from both science and technology studies as well as more recent digital methods research. 
Through the book we thus aim to explore not only what data journalism initiatives do, but how they might be done differently in order to facilitate vital public debates about both the future of the data society as well as the significant global challenges that we currently face.

“Linked Data in Libraries too” OpenAccessWeek 2017

marilia mavrikiou - October 30, 2017 in Data Journalism, Events, News

The one-day conference “Open access for …”, organised by the Library & Information Centre of the Aristotle University of Thessaloniki as part of OpenAccessWeek, took place successfully on Tuesday 24 October in the amphitheatre of the Central Library. The topics that attracted the most interest concerned the usefulness of open and linked data, with the aim of equal and comprehensive access […]

Memorandum of cooperation between Open Knowledge Greece and the Media Informatics Lab

Χριστίνα Καρυπίδου - October 15, 2017 in Data Journalism, Featured, Featured @en, News, memorandum of cooperation

A memorandum of cooperation was signed by Dr. Χαράλαμπος Μπράτσας, President of Open Knowledge Greece (OK Greece), and Ανδρέας Βέγλης, Director of the Media Informatics Lab (M.I.L.) and Professor and Head of the Department of Journalism and Mass Media of the Aristotle University of Thessaloniki, on Friday 13 October 2017. The signing of the […]

Bridging the gap between journalism and data analysis

Kosmas Panagiotidis - October 8, 2017 in Data Journalism, Featured, Featured @en, News

By Chikezie Omeje. This article was written by Chikezie Omeje, Kunle Adelowo and Vershima Tingir as part of the Open Data for Development (OD4D) fellowship programme. The recently launched programme was designed to build the organisational capacity of civil society organisations to use data effectively: raising the level […]

Bridging the gap between journalism and data analysis

Chikezie Omeje - October 5, 2017 in Data Journalism, OD4D

This blogpost was written by Chikezie Omeje, Kunle Adelowo and Vershima Tingir as part of the Open Data for Development (OD4D) embedded fellowship programme. This recently initiated programme is designed to build the organisational capacity of civil society organisations to use data effectively by raising the level of data literacy of the staff of the partner organisation(s), supporting the organisation(s) to deliver a specific data project, and developing an initial data strategy for the organisation’s future engagement. Chikezie Omeje is a journalist at the International Centre for Investigative Reporting (ICIR); Kunle Adelowo and Vershima Tingir are developers at the Public and Private Development Centre (PPDC). They are all based in Abuja. OD4D is a global network of leaders in the open data community, of which Open Knowledge International forms part, working together to develop open data solutions around the world. For this fellowship, the PPDC will develop the ICIR’s capacity to investigate and report on open contracting related stories.

The threat to traditional journalism

Older journalists will agree that journalism is no longer what it used to be. It is rapidly changing. Within the past decade, the profession has been disrupted to the extent that the question of who is a journalist is now difficult to answer. Technology has democratised journalism in a way that puts it within the reach of anyone who is interested. The rise of social media and digital publishing platforms has made it easier for those who were formerly referred to as the audience to become news producers. Members of the audience who are interested in journalism can now do the work of a journalist comfortably. The traditional line between journalists and audience has been blurred. Anyone who has the digital tools can produce and publish news without the help of a journalist.

This disruption in the media industry presents both a threat and an opportunity for journalists. The threat can be seen in the declining revenue of legacy media organisations, which means traditional journalists now stand to lose their jobs. The old business model of journalism is no longer sustainable and journalists are facing fierce competition from a multitude of individual online publishers. The implication is that being just a journalist who covers and writes news stories is no longer enough. Anybody who is willing and able can do that now.

To survive the existential threat facing traditional journalism, journalists need to build new skills that were not even taught in journalism schools a decade ago. The emergence of buzzwords such as “tech-savvy journalist” and “data journalist” in the newsroom is evidence of this shift. To practise journalism, every journalist needs the digital skills that are imperative for 21st century journalism. As ordinary citizens are increasingly able to perform the work of journalism, a professional journalist needs to take further steps to acquire the necessary skills beyond the old concept of journalism. A journalist must therefore have both reporting and technical skills.

Among the skills that a journalist should have is the ability to process, analyse and visualise data. Despite the increasing amount of information that is now available to citizens, people are still not adequately informed on critical issues relating to data. This is why data journalism is receiving attention around the world. Data analysis and visualisation are useful skills for today’s journalist. A lot of critical information is buried in data, and a journalist must now have the skills (or access to the skills) to harness and report on that data. When journalists have data skills, they can produce high-value, impactful information in a timely way.

But many journalists question why they should acquire these technical skills. They often complain that the new skills being demanded of them are too technical and complicated. For example, some journalists want to know why they should learn certain aspects of computer programming, arguing that it is too difficult. The truth is that a growing number of digital tools have made these essential skills easier to acquire, which lowers the initial technical barrier for most journalists. To become proficient in data journalism, there are three essential technical skills we think journalists need.

Data Gathering, Conversion and Extraction Techniques

Reporters often get information from different sources. Information may be presented in different formats, some of which may not be directly usable until they are converted to another format. Formats like PDF, HTML and hard copy documents make it hard to gather data in a structured and reusable way. Data presented in these formats therefore has to be converted to more flexible, structured and reusable formats such as Excel, Word and CSV. There are tools that make conversion easier and require minimal technical capabilities to use. These include Tabula, for extracting tables from PDF to CSV; online optical character recognition (OCR), a handy tool for converting tables in scanned documents to CSV; Smallpdf; and others.

Screenshot of Tabula interface

Screenshot of Online OCR web application
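For readers who want to see what such a conversion involves, here is a minimal sketch of turning an HTML table into CSV using only the Python standard library. It does not reproduce Tabula or any OCR tool; the class name `TableExtractor`, the function `html_table_to_csv` and the sample table are hypothetical, chosen only for illustration, and the sketch assumes a simple table without nested tables or merged cells.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample data, not taken from the article.
SAMPLE_HTML = """
<table>
  <tr><th>Contractor</th><th>Amount</th></tr>
  <tr><td>Acme Ltd</td><td>1,200</td></tr>
  <tr><td>Beta Works</td><td>850</td></tr>
</table>
"""

csv_text = html_table_to_csv(SAMPLE_HTML)
print(csv_text)
```

The `csv` module quotes the value `1,200` automatically because it contains a comma, which is the kind of detail that makes hand-copying tables error-prone.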

 Data cleaning tools

After gathering, conversion and extraction, you will have all the data you want, and more, at your fingertips. Most of the time, the data you are looking for comes as part of a large dataset, and what you need may be only a small portion of it. The next skill you need, to get exactly what you are looking for, is how to use data cleaning tools. Data cleaning is the process of correcting wrong fields by removing, adding or rearranging parts of a dataset. For this purpose, the go-to tool is Microsoft Excel. It is very powerful and can be used for simple tasks like sorting, filtering, simple maths and text functions, pivot tables and data validation.

Sort and filter buttons on Microsoft Excel
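The article recommends Excel for these tasks. As a rough illustration of the same operations (trimming stray spaces, dropping incomplete and duplicate rows, filtering and sorting), here is a minimal sketch in Python's standard library. The column names, the sample rows and the helper `clean_rows` are all hypothetical and not taken from the article.

```python
import csv
import io

# Hypothetical messy export: stray spaces, a duplicate row and a missing amount.
RAW_CSV = """name,state,amount
 Acme Ltd ,Abuja,"1,200"
Beta Works,Lagos,850
Acme Ltd,Abuja,"1,200"
Gamma Co,Abuja,
Delta Ltd,Abuja,300
"""


def clean_rows(text):
    """Trim whitespace, drop rows without an amount, remove exact duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        row = {key: value.strip() for key, value in row.items()}
        if not row["amount"]:
            continue  # incomplete record: no amount given
        # Turn "1,200" into the number 1200 so it can be sorted and summed.
        row["amount"] = int(row["amount"].replace(",", ""))
        key = (row["name"], row["state"], row["amount"])
        if key in seen:
            continue  # same record already kept
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = clean_rows(RAW_CSV)

# Filter to one state and sort by amount, largest first.
abuja_by_amount = sorted(
    (row for row in rows if row["state"] == "Abuja"),
    key=lambda row: row["amount"],
    reverse=True,
)

for row in abuja_by_amount:
    print(row["name"], row["amount"])
```

Whether a row with a missing amount should be dropped or followed up with the source is an editorial decision; the sketch drops it only to keep the example short.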

Visualisation tools

So now you have your data and it makes sense to you, but your job as a journalist is to gather this information and present it to your audience. Among your audience are people who like numbers: they like to see the exact figures in their rawest form, while others are suckers for aesthetics. They want to see colours and animations that tell stories. For the latter, the solution is visualisation: as you would guess, data visualisation means transforming datasets and presenting them in graphical form. A typical example would be creating a bar chart of the annual salary received by each employee of an organisation. Simple tools that can be used to create data visualisations include Microsoft Excel and Google Charts.

To be a tech-savvy journalist, you need to step out of your comfort zone to acquire these essential skills. Journalism is changing rapidly and nobody has a complete idea of how journalism will be practised in the next decade. This change will not slow down as long as there are emerging technologies. The internet has made basic information readily and easily available: anybody with a computer and internet access can start a blog and become a journalist. This has lowered the value of basic everyday information. Journalists therefore have to go the extra mile in using technology to do more factual reporting. Journalism is at the mercy of technology, and those who cannot master these new technical tools cannot report on more meaningful, factual and high-value information. The worst thing that can happen to a journalist is to be outdated or irrelevant to the new demands of the profession.
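As a footnote to the bar chart example in the visualisation section, here is a minimal sketch that draws a plain-text bar chart of salaries using only the Python standard library. It is not a substitute for Excel or Google Charts; the employee names, the salary figures and the helper `text_bar_chart` are hypothetical.

```python
# Hypothetical annual salaries, for illustration only.
SALARIES = {"Ada": 52000, "Bola": 38000, "Chidi": 45000}


def text_bar_chart(values, width=40):
    """Return one line per item, longest bar first, scaled to `width` characters."""
    largest = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    for name, amount in sorted(values.items(), key=lambda item: item[1], reverse=True):
        # Scale each bar relative to the largest value.
        bar = "#" * round(amount / largest * width)
        lines.append(f"{name.ljust(label_width)} | {bar} {amount:,}")
    return "\n".join(lines)


chart = text_bar_chart(SALARIES)
print(chart)
```

Sorting the bars from largest to smallest is a design choice that makes comparisons easier to read; a chart ordered by name or by date may suit other stories better.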

OK Greece at the International Journalism Summer School

Spyridoula Markou - August 28, 2017 in Data Journalism, Featured, Featured @en, Events

OK Greece took part in the summer school International Journalism and media organizations in a turbulent age: European and Asian perspectives, held in Thessaloniki from 16 to 23 July 2017. As part of the summer school’s lectures, our team ran a workshop on data journalism and the need for understanding by those who work […]

OK Greece supports the “Frictionless Data” project

Spyridoula Markou - July 22, 2017 in Data Journalism, Featured, Featured @en, Applications, News

The following article is a translated republication of the interview given by OK Greece to http://frictionlessdata.io. Open Knowledge Greece, the official Chapter of Open Knowledge International, was founded in 2012 by a group of academics, developers, citizens, hackers and public-sector representatives. We are supported by a national network of volunteers, most of whom are […]

Data is a Team Sport: One on One with Daniela Lepiz

Dirk Slater - July 3, 2017 in community, Data Blog, Data Journalism, data literacy, Event report, Fabriders, Nigeria, research, Team Sport, West Africa

Data is a Team Sport is our open-research project exploring the data literacy eco-system and how it is evolving in the wake of post-fact, fake news and data-driven confusion. We are producing a series of videos, blog posts and podcasts based on a series of online conversations we are having with data literacy practitioners. To subscribe to the podcast series, cut and paste the following link into your podcast manager: http://feeds.soundcloud.com/users/soundcloud:users:311573348/sounds.rss or find us in the iTunes Store and Stitcher. This episode features a one-on-one conversation with Daniela Lepiz, a Costa Rican data journalist and trainer, who is currently the Investigation Editor for CENOZO, a West African Investigative Journalism Project that aims to promote and support cross-border data investigation and open data in the region. She has a master’s degree in data journalism from the Rey Juan Carlos University in Madrid, Spain. She was previously involved with OpenUP South Africa, working with journalists to produce data-driven stories. Daniela is also a trainer for the Tanzania Media Foundation and has been involved in many other projects with South African Media, La Nacion in Costa Rica and other international organisations.

Notes from the conversation

Daniela spoke to us from Burkina Faso and reflected on the role of journalism, and particularly data-driven journalism, in functioning democracies. The project she is working on is empowering journalists working cross-border in western Africa to utilise data to expose corruption and violations of human rights. To identify journalists to participate in the project, they have looked for individuals who are experienced, passionate and curious. The project engages existing media houses, such as Premium Times in Nigeria, to ensure that there are places for their stories to appear. Important points Daniela raises:
  • Media is continually evolving and learning to evolve, and Daniela can see that data literacy will be a required proficiency in the next five years.
  • The biggest barrier to achieving open data in government is government officials who resist transparency.
  • There is a real fear among journalists of having to be proficient in maths when they are considering improving their skills to produce data-driven stories. They often fail to realise that it’s about working with others who have skills in statistics and data analysis.
  • Trust in media has declined in such a big way that journalists have to work that much harder, particularly in labelling things as opinion or as biased.

Resources she finds inspiring

Her blog posts

The full online conversation:

Daniela’s bookmarks!

These are the resources she uses the most often.
.Rddj – Resources for doing data journalism with R
Comparing Columns in Google Refine | OUseful.Info, the blog…
Journalist datastores: where can you find them? A list. | Simon Rogers
AidInfoPlus – Mastering Aid Information for Change

Data skills

Mapping tip: how to convert and filter KML into a list with Open Refine | Online Journalism Blog
Mapbox + Weather Data
Encryption, Journalism and Free Expression | The Mozilla Blog
Data cleaning with Regular Expressions (NICAR) – Google Docs
NICAR 2016 Links and Tips – Google Docs
Teaching Data Journalism: A Survey & Model Curricula | Global Investigative Journalism Network
Data bulletproofing tips for NICAR 2016 – Google Docs
Using the command line tabula extractor tool · tabulapdf/tabula-extractor Wiki · GitHub
Talend Downloads

Github

Git Concepts – SmartGit (Latest/Preview) – Confluence
GitHub For Beginners: Don’t Get Scared, Get Started – ReadWrite
Kartograph.org
LittleSis – Profiling the powers that be

Tableau customized polygons

How can I create a filled map with custom polygons in Tableau given point data? – Stack Overflow
Using Shape Files for Boundaries in Tableau | The Last Data Bender
How to make custom Tableau maps
How to map geographies in Tableau that are not built in to the product (e.g. UK postcodes, sales areas) – Dabbling with Data
Alteryx Analytics Gallery | Public Gallery
TableauShapeMaker – Adding custom shapes to Tableau maps | Vishful thinking…
Creating Tableau Polygons from ArcGIS Shapefiles | Tableau Software
Creating Polygon-Shaded Maps | Tableau Software
Tool to Convert ArcGIS Shapefiles into Tableau Polygons | Tableau and Behold!
Polygon Maps | Tableau Software
Modeling April 2016
5 Tips for Making Your Tableau Public Viz Go Viral | Tableau Public
Google News Lab
HTML and CSS
Open Semantic Search: Your own search engine for documents, images, tables, files, intranet & news
Spatial Data Download | DIVA-GIS
Linkurious – Linkurious – Understand the connections in your data
Apache Solr
Apache Tika – Apache Tika
Neo4j Graph Database: Unlock the Value of Data Relationships
SQL: Table Transformation | Codecademy
dc.js – Dimensional Charting Javascript Library
The People and the Technology Behind the Panama Papers | Global Investigative Journalism Network
How to convert XLS file to CSV in Command Line [Linux]
Intro to SQL (IRE 2016) · GitHub
Malik Singleton – SELECT needle FROM haystack;
Investigative Reporters and Editors | Tipsheets and links

SQL_PYTHON

More data

2016-NICAR-Adv-SQL/SQL_queries.md at master · taggartk/2016-NICAR-Adv-SQL · GitHub
advanced-sql-nicar15/stats-functions.sql at master · anthonydb/advanced-sql-nicar15 · GitHub
Malik Singleton – SELECT needle FROM haystack;
Statistical functions in MySQL • Code is poetry
Data Analysis Using SQL and Excel – Gordon S. Linoff – Google Books
Using PROC SQL to Find Uncommon Observations Between 2 Data Sets in SAS | The Chemical Statistician
mysql – Query to compare two subsets of data from the same table? – Database Administrators Stack Exchange
sql – How to add “weights” to a MySQL table and select random values according to these? – Stack Overflow
sql – Fast mysql random weighted choice on big database – Stack Overflow
php – MySQL: Select Random Entry, but Weight Towards Certain Entries – Stack Overflow
MySQL Moving average
Calculating descriptive statistics in MySQL | codediesel
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, …
R, MySQL, LM and quantreg
26318_AllText_Print.pdf
ddi-documentation-english-572 (1).pdf
Categorical Data — pandas 0.18.1+143.g3b75e03.dirty documentation
python – Loading STATA file: Categorial values must be unique – Stack Overflow
Using the CSV module in Python
14.1. csv — CSV File Reading and Writing — Python 3.5.2rc1 documentation
csvsql — csvkit 0.9.1 documentation
weight samples with python – Google Search
python – Weighted choice short and simple – Stack Overflow
7.1. string — Common string operations — Python v2.6.9 documentation
Introduction to Data Analysis with Python | Lynda.com
A Complete Tutorial to Learn Data Science with Python from Scratch
GitHub – fonnesbeck/statistical-analysis-python-tutorial: Statistical Data Analysis in Python
Verifying the email – Email Checker
A little tour of aleph, a data search tool for reporters – pudo.org (Friedrich Lindenberg)
Welcome – Investigative Dashboard Search
Investigative Dashboard
Working with CSVs on the Command Line
FiveThirtyEight’s data journalism workflow with R | useR! 2016 international R User conference | Channel 9
Six issue when installing package · Issue #3165 · pypa/pip · GitHub
python – Installing pip on Mac OS X – Stack Overflow
Source – Journalism Code, Context & Community – A project by Knight-Mozilla OpenNews
Introducing Kaggle’s Open Data Platform
NASA just made all the scientific research it funds available for free – ScienceAlert
District council code list | Statistics South Africa
How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code – Cloudera Engineering Blog
GitHub – gavinr/geojson-csv-join: A script to take a GeoJSON file, and JOIN data onto that file from a CSV file.
7 command-line tools for data science
Python Basics: Lists, Dictionaries, & Booleans
Jupyter Notebook Viewer

PYTHON FOR JOURNALISTS

New folder

Reshaping and Pivot Tables — pandas 0.18.1 documentation
Reshaping in Pandas – Pivot, Pivot-Table, Stack and Unstack explained with Pictures – Nikolay Grozev
Pandas Pivot-Table Example – YouTube
pandas.pivot_table — pandas 0.18.1 documentation
Pandas Pivot Table Explained – Practical Business Python
Pivot Tables In Pandas – Python
Pandas .groupby(), Lambda Functions, & Pivot Tables
Counting Values & Basic Plotting in Python
Creating Pandas DataFrames & Selecting Data
Filtering Data in Python with Boolean Indexes
Deriving New Columns & Defining Python Functions
Python Histograms, Box Plots, & Distributions
Resources for Further Learning
Python Methods, Functions, & Libraries
Python Basics: Lists, Dictionaries, & Booleans
Real-world Python for data-crunching journalists | TrendCT
Cookbook — agate 1.4.0 documentation
3. Power tools — csvkit 0.9.1 documentation
Tutorial — csvkit 0.9.1 documentation
4. Going elsewhere with your data — csvkit 0.9.1 documentation
2. Examining the data — csvkit 0.9.1 documentation
A Complete Tutorial to Learn Data Science with Python from Scratch
For Journalism
ProPublica Summer Data Institute
Percentage of vote change | CARTO
Data Science | Coursera
Data journalism training materials
Pythex: a Python regular expression editor
A secure whistleblowing platform for African media | afriLEAKS
PDFUnlock! – Unlock secured PDF files online for free.
The digital journalist’s toolbox: mapping | IJNet
Bulletproof Data Journalism – Course – LEARNO
Transpose columns across rows (grefine 2.5) ~ RefinePro Knowledge Base for OpenRefine
Installing NLTK — NLTK 3.0 documentation
1. Language Processing and Python
Visualize any Text as a Network – Textexture
10 tools that can help data journalists do better work, be more efficient – Poynter
Workshop Attendance
Clustering In Depth · OpenRefine/OpenRefine Wiki · GitHub
Regression analysis using Python
DataBasic.io
R for Every Survey Analysis – YouTube
Git – Book
NICAR17 Slides, Links & Tutorials #NICAR17 // Ricochet by Chrys Wu
Register for Anonymous VPN Services | PIA Services
The Bureau of Investigative Journalism
dtSearch – Text Retrieval / Full Text Search Engine
Investigation, Cybersecurity, Information Governance and eDiscovery Software | Nuix
How we built the Offshore Leaks Database | International Consortium of Investigative Journalists
Liz Telecom/Azimmo – Google Search
First Python Notebook — First Python Notebook 1.0 documentation
GitHub – JasonKessler/scattertext: Beautiful visualizations of how language differs among document types