
Final report: JISC Open Bibliography 2

- August 23, 2012 in BibServer, JISC OpenBib, jiscopenbib2, wp10, wp9

Following on from the success of the first JISC Open Bibliography project we have now completed a further year of development and advocacy as part of the JISC Discovery programme. Our stated aims at the beginning of the second year of development were to show our community (namely all those interested in furthering the cause of Open via bibliographic data, including: coders; academics; those with interest in supporting Galleries, Libraries, Archives and Museums; etc) what we are missing if we do not commit to Open Bibliography, and to show that Open Bibliography is a fundamental requirement of a community committed to discovery and dissemination of ideas. We intended to do this by demonstrating the value of carefully managed metadata collections of particular interest to individuals and small groups, thus realising the potential of the open access to large collections of metadata we now enjoy. We have been successful overall in achieving our aims, and we present here a summary of our output to date (it may be useful to refer to this guide to terms).

Outputs

BibServer and FacetView

The BibServer open source software package enables individuals and small groups to present their bibliographic collections easily online. BibServer utilises elasticsearch in the background to index supplied records, and these are presented via the frontend using the FacetView javascript library. This use of javascript at the front end allows easy embedding of result displays on any web page.
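As a rough illustration of what faceted browsing computes, the sketch below counts the values of a field across a few records and then narrows the set by one selected value. It is a plain in-memory Python example, not BibServer's or FacetView's actual code; the records, field names and helper functions (`facet_counts`, `filter_by`) are all hypothetical.

```python
from collections import Counter

# Hypothetical records; the field names are illustrative, not a BibServer schema.
records = [
    {"title": "Paper A", "year": "2011", "journal": "Journal X"},
    {"title": "Paper B", "year": "2012", "journal": "Journal X"},
    {"title": "Paper C", "year": "2012", "journal": "Journal Y"},
]


def facet_counts(records, field):
    """Count how many records carry each value of `field`."""
    return Counter(r[field] for r in records if field in r)


def filter_by(records, field, value):
    """Keep only the records whose `field` equals `value`."""
    return [r for r in records if r.get(field) == value]


# A facet is a list of values with counts; selecting a value narrows the results.
year_facet = facet_counts(records, "year")
selected = filter_by(records, "year", "2012")
journal_facet = facet_counts(selected, "journal")

print(dict(year_facet))
print([r["title"] for r in selected])
print(dict(journal_facet))
```

In a real deployment the counting would be done by the search index rather than in application code; the sketch only shows the shape of the result a facet display works from.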

BibSoup and more demonstrations

Our own version of BibServer is up and running at http://bibsoup.net, where we have seen over 100 users sharing more than 14000 records across over 60 collections. Some particularly interesting example collections include: Additionally, we have created some niche instances of BibServer for solving specific problems – for example, check out http://malaria.bibsoup.net; here we have used BibServer to analyse and display collections specific to malaria researchers, as a demonstration of the extent of open access materials in the field. Further analysis allowed us to show where best to look for relevant materials that could be expected to be openly available, and to begin work on the concept of an Open Access Index for research. Another example is the German National Bibliography, as provided by the German National Library, which is in progress (as explained by Adrian Pohl and Etienne Posthumus here). We have built, and are continuing to build, similar collections for all other national bibliographies that we receive.

BibJSON

At http://bibjson.org we have produced a simple convention for presenting bibliographic records in JSON. This has seen good uptake so far, with additional use in the JISC TEXTUS project and in Total Impact, amongst others.
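To give a feel for what a bibliographic record in JSON might look like, here is a minimal sketch in Python. The record is a placeholder (title, author, URL and DOI are invented), and the key names (`author` as a list of objects with `name`, `identifier` with `type` and `id`, `link` with `url`) follow our reading of the convention; please check http://bibjson.org for the authoritative description.

```python
import json

# Placeholder record: every value here is invented for illustration.
record = {
    "type": "article",
    "title": "An example article",
    "author": [{"name": "Example Author"}],
    "year": "2012",
    "identifier": [{"type": "doi", "id": "10.1234/example"}],
    "link": [{"url": "http://example.org/article/1"}],
}


def dump_records(records):
    """Serialise a list of record dicts to a JSON string."""
    return json.dumps(records, indent=2, sort_keys=True)


def load_records(text):
    """Parse a JSON string back into a list of record dicts."""
    return json.loads(text)


text = dump_records([record])
round_tripped = load_records(text)
print(text)
```

Because a record is just a JSON object, any language with a JSON library can read and write it without a dedicated parser.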

Pubcrawler

Pubcrawler collects bibliographic metadata, via parsers created for particular sites, and we have used it to create collections of articles. The full post provides more information.

datahub collections

We have continued to collect useful bibliographic collections throughout the year, and these along with all others discovered by the community can be found on the datahub in the bibliographic group.

Open Access / Bibliography advocacy videos and presentations

As part of a Sprint in January we recorded videos of the work we were doing and the roles we play in this project and wider biblio promotion; we also made a how-to for using BibServer, including feedback from a new user: Setting up a Bibserver and Faceted Browsing (Mark MacGillivray) from Bibsoup Project on Vimeo. Peter and Tom Murray-Rust’s video, made into a prezi, has proven useful in explaining the basics of the need for Open Bibliography and Open Access:

Community activities

The Open Biblio community have gathered for a number of different reasons over the duration of this project: the project team met in Cambridge and Edinburgh to plan work in Sprints; Edinburgh also played host to a couple of Meet-ups for the wider open community, as did London; and London hosted BiblioHack – a hackathon / workshop for established enthusiasts as well as new faces, both with and without technical know-how. These events – particularly BiblioHack – attracted people from all over the UK and Europe, and we were pleased that the work we are doing is gaining attention from similar projects world-wide.

Further collaborations

Lessons

Over the course of this project we have learnt that open source development provides great flexibility and power to do what we need to do, and open access in general frees us from many difficult constraints. There is now a lot of useful information available online for how to do open source and open access. Whilst licensing remains an issue, it becomes clear that making everything publicly and freely available to the fullest extent possible is the simplest solution, causing no further complications down the line. See the open definition as well as our principles for more information.

We discovered during the BibJSON spec development that it must be clear whether a specification is centrally controlled, or more of a communal agreement on use. There are advantages and disadvantages to each method, however they are not compatible – although one may become the other. We took the communal agreement approach, as we found that in the early stages there was more value in exposing the spec to people as widely and openly as possible than in maintaining close control. Moving to a close control format requires specific and ongoing commitment.

Community building remains tricky and somewhat serendipitous. Just as word-of-mouth can enhance reputation, failure of certain communities can detrimentally impact other parts of the project. Again, the best solution is to ensure everything is as open as possible from the outset, thereby reducing the impact of any one particular failure.

Opportunities and Possibilities

Over the two years, the concept of open bibliography has gone from requiring justification to being an expectation; the value of making this metadata openly available to the public is now obvious, and getting such access is no longer so difficult; where access is not yet available, many groups are now moving toward making it available. And of course, there are now plenty of tools to make good use of available metadata. Future opportunities now lie in the more general field of Open Scholarship, where a default of Open Bibliography can be leveraged to great effect. For example, recent Open Access mandates by many UK funding councils (eg Finch Report) could be backed up by investigative checks on the accessibility of research outputs, supporting provision of an open access corpus of scholarly material. We intend now to continue work in this wider context, and we will soon publicise our more specific ideas; we would appreciate contact with other groups interested in working further in this area.

Further information

For the original project overview, see http://openbiblio.net/p/jiscopenbib2; also, a full chronological listing of all our project posts is available at http://openbiblio.net/tag/jiscopenbib2/. The work package descriptions are available at http://openbiblio.net/p/jiscopenbib2/work-packages/, and links to posts relevant to each work package over the course of the project follow:
  • WP1 Participation with Discovery programme
  • WP2 Collaborate with partners to develop social and technical interoperability
  • WP3 Open Bibliography advocacy
  • WP4 Community support
  • WP5 Data acquisition
  • WP6 Software development
  • WP7 Beta deployment
  • WP8 Disruptive innovation
  • WP9 Project management (NB all posts about the project are relevant to this WP)
  • WP10 Preparation for service delivery
All software developed during this project is available under an open source licence. All the data that was released during this project fell under OKD compliant licences such as PDDL or CC0, depending on the publisher's choice. The content of our site is licensed under a Creative Commons Attribution 3.0 License (all jurisdictions). The project team would like to thank supporting staff at the Open Knowledge Foundation and Cambridge University Library, the OKF Open Bibliography working group and Open Access working group, Neil Wilson and the team at the British Library, and Andy McGregor and the rest of the team at JISC.

BiblioHack: Day 2, part 2

- June 14, 2012 in BibServer, Data, event, Events, JISC OpenBib, jiscopenbib2, minutes, News, OKFN Openbiblio, Talks, wp1, wp2, wp3, wp4, wp5, wp6, wp7, wp8, wp9

Pens down! Or, rather, key-strokes cease! BiblioHack has drawn to a close and the results of two days’ hard labour are in:

A Bibliographic Toolkit

Utilising BibServer

Peter Murray-Rust reported back on what was planned, what was done, and the overlap between the two! The priority was cleaning up the process for setting up BibServers and getting them running on different architectures. (PubCrawler was going to be run on BibServer but currently it’s not working.) Yesterday’s big news was that Nature has released 30 million references or thereabouts – this furthers the cause of scholarly literature whereby we, in principle, can index records rather than just corporate organisations being able / permitted to do so. National Bibliographies have been put on BibSoup – UK (‘BL’), Germany, Spain and Sweden – with the technical problem of character encodings raising its head (UTF-8 solves this where used). Also, BibSoup is useful for TEXTUS so the overall ‘toolkit’ approach is reinforced!

Open Access Index

Emanuil Tolev presented on ACat – Academic Catalogue. The first part of an index is having things to access – so gathering about 55,000 journals was a good start! Using Elastic Search within these journals will give a list of contents, which will then provide lists of articles (via FacetView); other services will then determine licensing / open access information (URL checks assisted in this process). The ongoing plan is to use this tool to ascertain licensing information for every single record in the world. (Link to ACat to follow).

Annotation Tools

Tom Oinn talked about the ideas that have come out of discussions and hacking around annotators and TEXTUS. Reading lists and citation management are a key part of what TEXTUS is intended to assist with, so the plan is for any annotation to be allowed to carry a citation – whether personal opinion or related record. Personalised lists will come out of this and TEXTUS should become a reference management tool in its own right. Keep your eye on TEXTUS for the practical applications of these ideas!

Note: more detailed write-ups will appear courtesy of others, do watch the OKFN blog for this and all things open…

Postscript: OKFN blog post here

Huge thanks to all those who participated in the event – your ideas and enthusiasm have made this so much fun to be involved with. Also thanks to those who helped run the event, visible or behind-the-scenes, particularly Sam Leon. Here’s to the next one :-)

BiblioHack: Day 2, part 1

- June 14, 2012 in BibServer, Data, event, Events, JISC OpenBib, jiscopenbib2, minutes, News, OKFN Openbiblio, Talks, wp1, wp2, wp3, wp4, wp5, wp6, wp7, wp8, wp9

After easing into the day with breakfast and coffee, each of the 3 sub-groups gave an overview of the mini-project’s aim and fed back on the evening’s progress:
  • Peter Murray-Rust revisited the overarching theme of ‘A Bibliographic Toolkit’ and the BibServer sub-group’s specific work on adding datasets and easily deploying BibServer; Adrian Pohl followed up to explain that he would be developing a National Libraries BibServer.
  • Tom Oinn explained the Annotation Tools sub-group’s work on developing annotation tools – ie TEXTUS – looking at adding fragments of text, with your own comments and metadata linked to them, which then form BibSoup collections. Collating personalised references is enhanced with existing search functionality, and reading lists with annotations can refer to other texts within TEXTUS.
  • Mark MacGillivray presented the 3rd group’s work on an Open Access Index. This began with listing all the journals that can be found in the whole world, with the aim of identifying the licence of each article. They have been scraping collections (eg PubMed) and gathering journals – at the time of speaking they had around 50,000+! The aim is to enable a crowd-sourced list of every journal in the world which, using PubCrawler, should provide every single article in the world.
With just 5 hours left before stopping to gather thoughts, write up and feed back to the rest of the group, it will be very interesting to see the result…
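
The Open Access Index idea described above (journals, then articles, then a licence for each article) could be summarised along the following lines. This is only a sketch under our own assumptions: the records, the `journal` and `license` keys, the set of licences treated as open, and the "fraction of openly licensed articles per journal" measure are all illustrative, not what the sub-group actually built.

```python
from collections import defaultdict

# Hypothetical article records; "journal" and "license" are illustrative keys.
articles = [
    {"title": "Article 1", "journal": "Journal X", "license": "CC-BY"},
    {"title": "Article 2", "journal": "Journal X", "license": None},
    {"title": "Article 3", "journal": "Journal Y", "license": "CC0"},
    {"title": "Article 4", "journal": "Journal Y", "license": "CC-BY"},
]

# Assumed set of licences counted as open for this sketch.
OPEN_LICENSES = {"CC-BY", "CC0", "PDDL"}


def open_access_index(articles):
    """Return, per journal, the fraction of articles with a known open licence."""
    totals = defaultdict(int)
    open_counts = defaultdict(int)
    for article in articles:
        totals[article["journal"]] += 1
        if article.get("license") in OPEN_LICENSES:
            open_counts[article["journal"]] += 1
    return {journal: open_counts[journal] / totals[journal] for journal in totals}


index = open_access_index(articles)
print(index)
```

An article with no known licence counts against its journal here, which is a deliberate (and debatable) choice: unknown is treated as not open until checked.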

BiblioHack: Day 1

- June 14, 2012 in BibServer, Data, event, Events, JISC OpenBib, jiscopenbib2, licensing, lod-lam, minutes, OKFN Openbiblio, Talks, wp1, wp2, wp3, wp4, wp5, wp6, wp7, wp8, wp9

The first day of BiblioHack was a day of combinations and sub-divisions! The event attendees started the day all together, both hackers and workshop / seminar attendees, and Sam introduced the purpose of the day as follows: coders – to build tools and share ideas about things that will make our shared cultural heritage and knowledge commons more accessible and useful; non-coders – to get a crash course in what openness means for galleries, libraries, archives and museums, why it’s important and how you can begin opening up your data; everyone – to get a better idea about what other people working in your domain do and engender a better understanding between librarians, academics, curators, artists and technologists, in order to foster the creation of better, cooler tools that respond to the needs of our communities. The hackers began the day with an overview of what a hackathon is for and how it can be run, as presented by Mahendra Mahey, and followed with lightning talks as follows:
  • Talk 1 Peter Murray-Rust & Ross Mounce – Content and Data Mining and a PDF extractor
  • Talk 2 Mike Jones – the m-biblio project
  • Talk 4 Ian Stuart – ORI/RJB (formerly OA-RJ)
  • Talk 5 Etienne Posthumus – Making a BibServer Parser
  • Talk 6 Emanuil Tolev – IDFind – identifying identifiers (“Feedback and real user needs won’t gather themselves”)
  • Talk 7 Mark MacGillivray – BibServer – what the project has been doing recently, how that ties into the open access index idea.
  • Talk 8 Tom Oinn – TEXTUS
  • Talk 9 Simone Fonda – Pundit – collaborative semantic annotations of texts (Semantic Web-related tool)
  • Talk 10 Ian Stuart – The basics of Linked Data
We decided we wanted to work as a community, using our different skills towards one overarching goal, rather than breaking into smaller groups with separate agendas. We formed the central idea of an ‘open bibliographic tool-kit’ and people identified three main areas to hack around, playing to their skills and interests:
  • Utilising BibServer – adding datasets and using PubCrawler
  • Creating an Open Access Index
  • Developing annotation tools
At this point we all broke for lunch, and the workshoppers and hackers mingled together. As hoped, conversations sprang up between people from the two different groups and it was great to see suggestions arising from shared ideas and applications of one group being explained to the theories of the other. We re-grouped and the workshop continued until 16.00 – see here for Tim Hodson’s excellent write-up of the event and talks given – when the hackers were joined by some who attended the workshop. Each group gave a quick update on status, to try to persuade the new additions to the group to join their particular work-flow, and each group grew in number. After more hushed discussions and typing, the day finished with a talk from Tara Taubman about her background in the legalities of online security and IP, and we went for dinner. Hacking continued afterwards and we celebrated a hard day’s work down the pub, looking forward to what was to come. Day 2 to follow…

Open source development – how we are doing

- May 29, 2012 in BibServer, JISC OpenBib, jiscopenbib2, licensing, progress, progressPosts, projectMethodology, projectPlan, riskAnalysis, software, WIN, wp10, wp2, wp3, wp6, wp9

Whilst at Open Source Junction earlier this year, I talked to Sander van der Waal and Rowan Wilson about the problems of doing open source development. Sander and Rowan work at OSS Watch, and their aim is to make sure that open source software development delivers its potential to UK HEI and research; so, I thought it would be good to get their feedback on how our project is doing, and if there is anything we are getting wrong or could improve on.

It struck me that as other JISC projects such as ours are required to make their output similarly publicly available, this discussion may be of benefit to others; after all, not everyone knows what open source software is, let alone the complexities that can arise from trying to create such software. Whilst we cannot help avoid all such complexities, we can at least detail what we have found helpful to date, and how OSS Watch view our efforts.

I provided Sander and Rowan with a review of our project, and Rowan provided some feedback confirming that overall we are doing a good job, although we lack a listing of the other open source software our project relies on, and their licences. Whilst such data can be discerned from the dependencies of the project, this is not clear enough; I will add a written list of dependencies to the README. The response we received is provided below, followed by the review I initially provided, which gives a brief account of how we managed our open source development efforts:

====

Rowan Wilson, OSS Watch, responds:

Your work on this project is extremely impressive. You have the systems in place that we recommend for open development and creation of community around software, and you are using them. As an outsider I am able to quickly see that your project is active and the mailing list and roadmap present information about ways in which I could participate. One thing I could not find, although this may be my fault, is a list of third party software within the distribution. This may well be because there is none, but it’s something I would generally be keen to see for the purposes of auditing licence compatibility. Overall though I commend you on how tangible and visible the development work on this project is, and on the focus on user-base expansion that is evident on the mailing list.

====

Mark MacGillivray wrote:

Background – May 2011, OKF / AIM bibserver project

Open Knowledge Foundation contracted with American Institute of Mathematics under the direction of Jim Pitman in the dept. of Maths and Stats at UC Berkeley. The purpose of the project was to create an open source software repository named BibServer, and to develop a software tool that could be deployed by anyone requiring an easy way to put and share bibliographic records online. A repository was created at http://github.com/okfn/bibserver, and it performs the usual logging of commits and other activities expected of a modern DVCS. This work was completed in September 2011, and the repository has been available since the start of that project with a GNU Affero GPL v3 licence attached.

October 2011 – JISC Open Biblio 2 project

The JISC Open Biblio 2 project chose to build on the open source software tool named BibServer. As there was no support from AIM for maintaining the BibServer repository, the project took on maintenance of the repository and all further development work, with no change to previous licence conditions. We made this choice as we perceive open source licensing as a benefit rather than a threat; it fit very well with the requirements of JISC and with the desires of the developers involved in the project. At worst, an owner may change the licence attached to some software, but even in such a situation we could continue our work by forking from the last available open source version (presuming that licence conditions cannot be altered retrospectively).

The code continues to display the licence under which it is available, and remains publicly downloadable at http://github.com/okfn/bibserver. Should this hosting resource become publicly unavailable, an alternative public host would be sought. Development work and discussion has been managed publicly, via a combination of the project website at http://openbiblio.net/p/jiscopenbib2, the issue tracker at http://github.com/okfn/bibserver/issues, a project wiki at http://wiki.okfn.org/Projects/openbibliography, and via a mailing list at openbiblio-dev@lists.okfn.org.

February 2012 – JISC Open Biblio 2 offers bibsoup.net beta service

In February the JISC Open Biblio 2 project announced a beta service available online for free public use at http://bibsoup.net. The website runs an instance of BibServer, and highlights that the code is open source and available (linking to the repository) to anyone who wishes to use it.

Current status

We believe that we have made sensible decisions in choosing open source software for our project, and have made all efforts to promote the fact that the code is freely and publicly available. We have found the open source development paradigm to be highly beneficial – it has enabled us to publicly share all the work we have done on the project, increasing engagement with potential users and also with collaborators; we have also been able to take advantage of other open source software during the project, incorporating it into our work to enable faster development and improved outcomes. We continue to develop code for the benefit of people wishing to publicly put and share their bibliographies online, and all our outputs will continue to be publicly available beyond the end of the current project.

Planning for the next three months

- March 20, 2012 in BibServer, JISC OpenBib, jiscopenbib2, minutes, OKFN Openbiblio, wp10, wp2, wp3, wp4, wp5, wp6, wp7, wp8, wp9

We have developed BibJSON
We’ve improved BibServer
We’ve made BibSoup

…But what’s next? The nature of cutting-edge technology is that it is fast-paced and constantly adapting. We may think we’ve come up with a good idea, but if it turns out someone else has already had that idea and developed it – that’s great and means we incorporate it and go on to the next exciting thing. We may think that this next thing is important, but if it turns out it doesn’t quite do the helpful thing needed to make our users delighted or promote open bibliographic data – we change tack and try something else. We know what we want to do, ie make useful and smart tools for the people doing wonderful things in the public domain, but, as for what our end product looks like (if indeed there is the one product to play with) – well, that all depends on the emerging requirements, other technologies that come to light and how successful our ideas are along the way.

Taking all that into account, at the Sprint last week we attempted to plan for the next three months. Our work will be more successful the more focused we are, and having an end-result in mind is useful for that. So, here’s a rough guide to how we think our project will shape up between now and June:

[Images: To-Do; Timeline]

NB the images are a little fuzzy, but do click on them to follow the links to Flickr where these are stored and appear more clearly.

We have already published the CUL blog post and Mark has written about BibServer functionality that arose from ideas at the Sprint. We’ll develop these ideas into workable and worthwhile tools or processes, and before we know it we’ll be three months down the line and thinking ‘…but what’s next?’

Day 3 of the March Sprint

- March 14, 2012 in event, JISC OpenBib, jiscopenbib2, OKFN Openbiblio, wp2, wp3, wp4, wp5, wp6, wp9

This morning we were buzzing from the Meet-up, excited about the interesting people we met and the cool things they talked about. Graham Steel, who was in town for yesterday’s event, stopped by to see what the team was up to (largely coding / blogging and ignoring one another) and Mahendra headed home with our thanks for a really great event. Work on new functionality for BibServer continued today, as Mark wired up the new front end into the back end which should – after testing – add some smart and helpful options to your user experience. Etienne worked away on the back end, with a new asynchronous parser sub-system; he was also excited about his development of an example parser plug-in to query Wikipedia and parse results using BibJSON. The idea of this is that, when searching for something in Wikipedia, each page result for that word is parsed for citations; these citations are then put through BibJSON and dropped into BibSoup as a collection. So, you search for X and a moment later there is a BibSoup collection by that name displaying all related citations! This is still in the testing phase, and search phrases have to be precise as all of Wikipedia’s relevant results are returned, but we are ‘guardedly excited’, to borrow Etienne’s elegant phrasing. More on this, and the other cool coding Mark and Etienne have been doing, later. Meanwhile, I have been writing up last night’s Meet-up and following up with the lovely attendees, as well as thinking more about the Hackathon in June. The group discussed OpenGLAM and publicdomainworks, which are projects / areas we have had / will have a lot in common with, and looked at ongoing opportunities together. There will be more blog posts as coding is tested, events confirmed and collaborations agreed, so watch this space.
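
To make the Wikipedia idea above a little more concrete, here is a minimal sketch of the citation-parsing step only. It is not Etienne's parser plug-in: it does not query Wikipedia, it only handles simple `{{cite ...}}` templates without nested templates, and the function name `parse_citations`, the sample text and the output keys are our own assumptions about a BibJSON-style record.

```python
import re

# Matches simple {{cite <kind> |key=value |...}} templates (no nested templates).
CITE_RE = re.compile(r"\{\{\s*cite\s+(\w+)\s*\|([^{}]*)\}\}", re.IGNORECASE)


def parse_citations(wikitext):
    """Turn simple {{cite ...}} templates into BibJSON-style dicts."""
    records = []
    for kind, body in CITE_RE.findall(wikitext):
        # Split the template body into key=value fields.
        fields = {}
        for part in body.split("|"):
            if "=" in part:
                key, value = part.split("=", 1)
                fields[key.strip().lower()] = value.strip()

        # "type" is simply the template kind (e.g. "journal"), not a controlled value.
        record = {"type": kind.lower()}
        if "title" in fields:
            record["title"] = fields["title"]
        if "year" in fields:
            record["year"] = fields["year"]
        name = " ".join(p for p in (fields.get("first"), fields.get("last")) if p)
        if name:
            record["author"] = [{"name": name}]
        if "doi" in fields:
            record["identifier"] = [{"type": "doi", "id": fields["doi"]}]
        records.append(record)
    return records


# Invented sample wikitext; the citation does not refer to a real article.
sample = (
    "Some text.<ref>{{cite journal |last=Doe |first=Jane "
    "|title=An example |year=2010 |doi=10.1234/example}}</ref> More text."
)
citations = parse_citations(sample)
print(citations)
```

A regular expression is enough for a demonstration, but real Wikipedia markup nests templates and varies field names, so a proper wikitext parser would be the safer choice for anything beyond a sketch.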

Day 2 of the March Sprint

- March 13, 2012 in event, JISC OpenBib, jiscopenbib2, OKFN Openbiblio, wp2, wp3, wp4, wp5, wp6, wp9

Today started well: Berkeley and PubMed contacted us about running a BibServer, which is great! It was also a day of comings and goings: Etienne arrived to code with Mark, Richard Jones of Cottage Labs dropped by to play around with parsers, and Thomas headed off after exploring the benefits of JSON-LD and BibJSON. Etienne and Mark have been developing BibServer, merging facet view changes with existing software in order to present new functionality and provide a better user experience for the creation and indexing of data. This is expected to be completed tomorrow after some testing. Also available tomorrow will be an update from Thomas, who has taken data from BibSoup and put it into 3lib; he has also been polling AuthorClaim for author information and looking at linking the metadata with BibSoup records. Mahendra and I got Sam / OpenGLAM involved with the Hackathon we’re planning for 12th-14th June and got thinking of London-based venues… suggestions welcome! The schedule for tomorrow is to discuss interaction with other OKFN projects including TEXTUS, the Open Data Handbook and Public Domain Works, as well as testing BibServer’s new functionality. In other news, Sam and Laura were busy writing their Lightning Talks for tonight’s Meet-Up… More on that later!

Day 1 of the March Sprint

- March 12, 2012 in event, JISC OpenBib, jiscopenbib2, OKFN Openbiblio, wp2, wp3, wp4, wp5, wp6, wp9

Agendas are funny things; you have an idea of what you want to do, you write a few bullet-points to focus it a little and you presume things will naturally lead on from one thing to the next… Well, not this week. Barely had we settled down to the intros when the agenda was out the window! As well as the Usual Suspects (Mark, me) and the Collaborators (Sam, Laura) we welcomed Thomas Krichel, an expert in scholarly communication over from the States, and were joined by two additional OKFNers, Jilly Matthews and Will Waites, who are based in Edinburgh and popped by to catch up on the project’s recent developments. To begin with, all of us set our minds to the collaborative opportunities with CKAN as Jilly explained the project and the difference between thedatahub.org and CKAN (the former is a publicly available instance of the technology of the latter, which drives this and other instances). Then we split up into groups:
  • Thomas and Mark explored connecting BibSoup data with AuthorClaim and refined some ideas for the future of BibJSON, with Will (who was involved with previous iterations of the Open Biblio project/s) and Jilly contributing to discussions around simple / complex JSON following on from Mark’s post;
  • Mahendra and I finalised the details of tomorrow’s Meet-up;
  • Laura and Sam ducked in to various conversations, suggesting improvements in technology and running events as key phrases caught their ears… Sam was looking at Open Biblio’s overlap with OpenGLAM and Laura was advising on tomorrow’s event, having arranged several Cambridge Meet-ups before.
The plan for tomorrow is for Mahendra, Laura and me to plan the June Hackathon and for Mark to get some good coding done with Etienne… but we’ll see how the agenda shifts!

March Sprint and Meet-up

- March 1, 2012 in event, JISC OpenBib, jiscopenbib2, OKFN Openbiblio, wp2, wp4, wp6, wp9

There will be a coding and planning sprint for the project team in Edinburgh on Monday 12th and Tuesday 13th March, with tying-up of loose ends on Wednesday 14th for those still around. Following on from the productivity of January’s sprint, we aim to update project documentation, code and refine development, explore integration with other projects, plan for the remaining three months including demonstrations and user engagement, etc. We will be joined by representatives of other projects including Textus, the School of Open Data and DevCSI. If anyone is interested in seeing what we’re up to, or talking open data / knowledge in general, come along on the Tuesday evening as we have arranged a Meet-up with others from OKFN and DevCSI, and all are welcome – more details here. This promises to be a great opportunity for some Edinburgh-based folk (and anyone willing to travel!) to get together to discuss ideas, projects and generally set the world to rights over a brew. For more information contact naomi.lillie [@] okfn.org. Twitter: #OpenDataEDB