You are browsing the archive for github.

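CKAT (Connected Knowledge and Tools) #1 – Seoul City Project

- October 14, 2015 in Featured, github, linked-open-data, seoul, project

As announced earlier on Facebook and on the blog, we are releasing the deliverables of the Seoul City project, a representative example of the many projects that Open Knowledge Korea members have carried out so far. The release covers the open source software used to provide the Linked Data service, the Linked Data system architecture diagram, and an ontology modeling case (the administrative district ontology) that we worked through in order to link datasets. Explanatory material is available on SlideShare, and the open source code can be accessed on GitHub. If you have questions about the documents or the source code, please leave a comment or post to the Facebook group. We hope that releasing these deliverables helps build a space where you can share your own data more easily and connect it to the other data you need. Open Knowledge Korea is open to anyone with the will and the passion, under the motto of sharing and openness.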

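Release of an Open Source Collection for Building Linked Data

- October 5, 2015 in github, OKF, Open Source, news

Since releasing the Linked Open Data (beta) project, Open Knowledge Korea (OK-Korea) has continued to run hackathons and open data activities. Recently in Korea, public information is being actively opened and shared in line with Government 3.0, a new paradigm for government operation, and efforts to publish and use data more efficiently are continuing. Open Knowledge Korea is therefore releasing open source software for building linked data, as one way to share and connect the data used across many domains more easily. With this software, you can build linked data yourself from the data you hold.
To make building linked data easier, we are also releasing a ‘User Guide’, a ‘System Overview Diagram’ and related material.
  • Reference site: The first case of public data meeting linked data
We believe that, in time, your diverse data will come together to form an ‘Open Knowledge’ ecosystem. The open source software being released is what was used to build the Connected Data Hub, one of the Linked Open Data (beta) hubs, and it is published through GitHub so that anyone can take part. The GitHub address will be added in a later update. If you have questions, please leave a comment or post to the Facebook group. ^^ Thank you.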

The White House Releases Its 2016 Budget as Open Source!

- February 3, 2015 in budget, github, Medium, Open Source, open-government, Vita Huset, White House

The White House’s 2016 Budget. CC: WhiteHouse.gov

Today the White House and the US government released the 2016 budget. The exciting thing about this year’s budget is that it is published as open source on GitHub. The repository currently consists of three files that are an extract of the budget figures, so it is perhaps not entirely open source. However, further files are available to download if needed, and there is a user guide on how to use the budget data. This initiative leaves anyone free to analyze, combine and share the data, in accordance with the Creative Commons Zero license! The budget can also be read openly on the blogging platform Medium.

This exciting move is a potential success for the Open Government movement within Open Knowledge and may pave the way for more! So far, Finansdepartementet (the Swedish Ministry of Finance) publishes only key tables from the Government’s budgets. When will we see Sweden open up its budget in this way? It is time to take the digital possibilities seriously! The initiative also raises some questions: How can it strengthen democracy? How can it affect trust in those entrusted with power? What stories and lessons will come out of this data? Those who open up will see! Read more about the news on the White House website!

New version of open source visualization Head Start released

- February 24, 2014 in altmetrics, github, Open Source, Panton Fellowships, scholarly communication, timeline, visualization

In July last year, I released the first version of a knowledge domain visualization called Head Start. Head Start is intended for scholars who want to get an overview of a research field. They could be young PhDs getting into a new field, or established scholars who venture into a neighboring field. The idea is that you can see the main areas and papers in a field at a glance without having to do weeks of searching and reading.
Interface of Head Start

You can find an application for the field of educational technology on Mendeley Labs. Papers are grouped by research area, and you can zoom into each area to see the individual papers’ metadata and a preview (or the full text in the case of open access publications). The closer two areas are, the more related they are subject-wise. The prototype is based on readership data from the online reference management system Mendeley. The idea is that the more often two papers are read together, the closer they are subject-wise. More information on this approach can be found in my dissertation (see chapter 5), or if you like it a bit shorter, in this paper and in this paper.

Head Start is a web application built with D3.js. The first version worked very well in terms of user interaction, but it was a nightmare to extend and maintain. Luckily, Philipp Weißensteiner, a student at Graz University of Technology, became interested in the project. Philipp worked on the visualization as part of his bachelor’s thesis at the Know-Center. Not only did he modularize the source code, he also introduced Javascript Finite State Machine, which lets you easily describe the different states of the visualization. Setting up a new instance of Head Start is now only a matter of a couple of lines. Philipp developed a cool proof of concept for his approach: a visualization that shows the evolution of a research field over time using small multiples. You can find his excellent bachelor’s thesis in the repository (German).
Head Start Timeline View

In addition, I cleaned up the pre-processing scripts that do all the clustering, ordination and naming. The only thing that you need to get started is a list of publications and their metadata as well as a file containing similarity values between papers. Originally, the similarity values were based on readership co-occurrence, but there are many other measures that you can use (e.g. the number of keywords or tags that two papers have in common). So without further ado, here is the link to the GitHub repository. If you have any questions or comments, please send them to me or leave a comment below.
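As a rough illustration of the two kinds of similarity values mentioned above, here is a minimal Python sketch. It is not taken from the Head Start pre-processing scripts, and it does not reproduce the input file format they expect. The function names, the toy reader libraries and the keywords are all assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(libraries):
    """Count how often each pair of papers appears in the same reader library."""
    counts = Counter()
    for library in libraries:
        # Sort so that every pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(library)), 2):
            counts[(a, b)] += 1
    return counts


def shared_keyword_counts(keywords_by_paper):
    """Count the keywords that each pair of papers has in common."""
    counts = {}
    for a, b in combinations(sorted(keywords_by_paper), 2):
        shared = set(keywords_by_paper[a]) & set(keywords_by_paper[b])
        if shared:
            counts[(a, b)] = len(shared)
    return counts


# Hypothetical toy data: three reader libraries and three tagged papers.
libraries = [["p1", "p2", "p3"], ["p1", "p2"], ["p2", "p3"]]
keywords = {
    "p1": ["moocs", "assessment"],
    "p2": ["moocs", "assessment", "games"],
    "p3": ["games"],
}

print(cooccurrence_counts(libraries))
print(shared_keyword_counts(keywords))
```

Either dictionary could then be written out as one row per pair of papers. Raw counts like these favour popular papers, so some normalization is probably worth considering before clustering.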

Working on the Open Design Definition

- February 26, 2013 in definition, Featured, github, open design, Social Network Analysis

After a few months’ pause, it is finally time to give an update about the development of the Open Design Definition, a project I started while working at Aalto Media Factory (and co-organizing the Open Knowledge Festival) together with many others from this group. 01. The Open Design Definition workshop at Open Knowledge Festival During […]

Bundes-Git – German Laws on GitHub

- January 4, 2013 in bundesgit, Bundestag, Featured, gesetze, github, innovation, offene Daten, Politik

If you compare software code and legislation you can find many similarities: both are big bodies of text spread over multiple units (laws/files). The total amount of text inevitably grows bigger over time with many small changes to existing parts while most of the corpus stays the same. However, the tooling and editing process for these domains is very different: while developers are in the fortunate position that they can build and improve their own tools, legislators are stuck with proprietary tools like MS Word that are simply not built to collaboratively work on a big corpus of text. But if source code and laws have a similar information structure, why not apply the tools used in software development to the legislative process? That is what Bundes-Git (“Federal Git”) is currently trying out in Germany. Bundes-Git is a Git version control repository of all German Federal Laws and Regulations as Markdown. The goal was to come up with the simplest solution to handle laws that could possibly work and integrate it well into the existing developer ecosystem.
The idea has been well received, with an article on Wired.com and articles on the German IT news sites Heise and Golem. The popularity can surely also be attributed to our marvelous Bundes-Git mascot, dubbed octo eagle, thought up by me, designed by Konstantin Käfer and released under CC0 (please go this way if you are interested in a t-shirt or hoodie).

Design decisions explained

All other law storage formats use XML. But to me XML is neither human readable nor human writable. Let me get into the details of some of the design decisions:
  • Git because it’s the most popular distributed version control system right now.
  • GitHub because it’s the most popular Git host right now and comes with some nice perks like pull requests and GitHub Pages.
  • Markdown because any more structure like XML or JSON would make it harder for humans to read or write the format and diffs would be difficult to read.
  • Naming files index.md because it works nicely with Jekyll and GitHub Pages renders all laws into a currently very simple page.
  • YAML Front Matter is necessary for Jekyll but also serves as a nice metadata store on laws.
  • Committing from branches with non-fast-forward merges because… uhmm. This is really up for discussion. I want to keep track of where changes originate, and branches are created for each law publication, but this heavily diverges from the clean commit history philosophy that e.g. the Linux kernel lives by.
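To make the index.md plus YAML Front Matter layout a bit more concrete, here is a minimal Python sketch that splits such a file into its metadata and its Markdown body. The sample law and its field names (`title`, `slug`, `layout`) are invented for illustration and are not taken from the Bundes-Git repository. The parser only handles flat `key: value` lines; real front matter should go through a proper YAML parser.

```python
def split_front_matter(text):
    """Split a Jekyll-style document into (metadata dict, Markdown body).

    Only flat `key: value` pairs are handled; this is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter block at the top
    # Find the line that closes the front matter block.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}, text  # opening marker without a closing one
    metadata = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return metadata, body


# Hypothetical law file; the fields are assumptions, not the real schema.
SAMPLE = """---
title: Example Act
slug: example-act
layout: default
---

# Example Act

## § 1 Scope

This Act applies to examples.
"""

metadata, body = split_front_matter(SAMPLE)
print(metadata)
print(body.splitlines()[0])
```

Because the metadata sits in the same plain-text file as the law, a change to either shows up as an ordinary line diff in Git.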
There are some more software development concepts that can be applied to the legislation process. Here are some fun things I’d like to try:
  • A prose.io-like editor to easily create law proposals and make a pull request.
  • Measuring the complexity of corpus/laws/paragraphs and using Travis CI to test pull requests if they make the complexity worse. Pattern is a Python NLP library and they recently released a German module which I want to try on our laws.
  • Testing foreign key integrity: are all referenced paragraphs still available?
  • Create an informative visualization out of the Git log automatically like Gregor Aisch did by hand for the German political party law.
  • Let the German president sign off on commits to master.
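The “foreign key integrity” idea from the list above could be sketched roughly as follows. This is only an assumption about how such a test might look, not an existing Bundes-Git tool. It presumes that each section of a law is a Markdown heading starting with “§” and a number, and that references within the same law are written as “§ n”. Real laws also cite other laws, paragraphs and sentences, which this sketch ignores.

```python
import re

# Assumed conventions: headings look like "## § 12 Title",
# and references inside the text look like "§ 12".
HEADING = re.compile(r"^#+[ \t]*§[ \t]*(\d+[a-z]?)\b", re.MULTILINE)
REFERENCE = re.compile(r"§\s*(\d+[a-z]?)\b")


def dangling_references(markdown):
    """Return section numbers that are cited but have no heading in the text."""
    defined = set(HEADING.findall(markdown))
    # Strip the heading markers so that headings do not count as citations.
    body = HEADING.sub("", markdown)
    cited = set(REFERENCE.findall(body))
    return sorted(cited - defined)


# Hypothetical law text with one broken reference (§ 4 does not exist).
LAW = """# Example Act

## § 1 Scope

This Act applies to examples as defined in § 2.

## § 2 Definitions

Exceptions are listed in § 4.
"""

print(dangling_references(LAW))
```

A continuous integration job could run a check like this on every pull request and fail when the returned list is not empty.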
The design decisions around Bundes-Git fit nicely into the Git/GitHub ecosystem, but they are not set in stone. They also create some problems and annoyances that need to be fixed or circumvented. While I believe the general philosophy and the freshness of the approach are the right direction, we clearly need more discussion.

Future happenings around Bundes-Git:

  • We applied for funding at Testing 123 Global Integrity Innovation Fund. Bundes-Git definitely fits their criteria of brand new, innovative and high-risk. The decision will be made later this month, fingers crossed!
  • There will be a Bundes-Git Hacker Meetup in mid-January. If you are interested, sign up here.
We decided that the language of discussion on GitHub will be German, but feel free to start a conversation on the OKF Open Legislation mailing list. Also be sure to follow @bundesgit on Twitter!

Open Design Definition at FAB*, Future Everything (Manchester)

- June 25, 2012 in definition, Featured, Git, github, Workshop


Open Design Working Group at Future Everything

This post is a slightly edited version of a message sent to our discussion list here, which is why it addresses the participants of this working group! Interested in getting involved? Join us!

A quick recap of the last …
