BOSC 2015 wrap up

Posted July 13th, 2015 by Konrad Förstner

I am just back from this year’s BOSC (Bioinformatics Open Source Conference) in Dublin. BOSC is definitely one of my favorite community gatherings. One reason for this is that the common value – the understanding that openness is an essential part of science – is omnipresent and the foundation of every project presented there. During the two days of the meeting numerous, mostly short talks were given, and I think the format is ideal for being exposed to many projects in a short time. The talks were recorded and will soon be publicly available. Anybody who is interested in the discussion that took place online (a tweet dump of #BOSC2015 had more than 3000 tweets!) can have a look at the storified collections (day 1, day 2). Some of the photos I took are on flickr. Despite the broad range of topics, certain patterns were clearly detectable:

1) Docker, docker, docker

There was barely a talk that did not mention docker. Either it was already part of e.g. the deployment process or planned to be integrated. As seen in other fields it is clear that docker (or other container implementations) is here to stay.

2) Common Workflow Language

The Common Workflow Language seems to be on a similar track as docker but in a much earlier phase. Still, considering that it is a rather new initiative, it has already managed to win quite a few supporters and has been adopted by several projects.

3) Funding of projects

The majority of the presented projects that could be considered the backbone, or at least very important toolboxes, of the bioinformatics community, like Biopython, BioGems, BioJS or bionode, are mostly run as side projects and don’t have dedicated funding. The same is the case for other very promising endeavors like OpenSNP. The development is done in the developers’ spare time and/or with a little bit of tolerance from their funding/hosting institutions. This is not a sustainable working mode and quite scary considering how much we rely on these projects. Funding bodies still seem to miss the importance of such projects: common grants are hard to get for them and/or their structure is not well suited for this open source, community-driven mode of development. It is an old but unsolved problem that gets bigger with the growing importance of bioinformatic data analysis.

URIs for scientific software (maybe using figshare)

Posted October 22nd, 2013 by Konrad Förstner

Update: Mark Hahnel just pointed out that no such test entities should be made public (so you don’t need to test it – I can tell you that it works) and that there will be a sandbox for such purposes soon. But in general the approach could be used. Figshare has been considering code submission for a while but wants to make sure to get things right and to do this without reinventing the wheel. He also pointed me to a recent tweet linking to an even better solution (which is so far a proof of concept) than the one presented below.

I am planning to submit a paper that covers a bioinformatics software tool. As usual I would like to provide a link to a web page where the software can be downloaded and which contains a reference to the software repository. So far I am not sure if this page will be hosted on the institute server, github or another location. Here a problem arises that many scientists face if they want to add a URL to a scientific publication. As long as an article is a static piece of text and is not easily editable, it is nearly impossible for the author to inform the reader about changes of a URL. Maybe I am missing something, but there seems to be no common recommendation by publishers about how to tackle this issue. The general solution would be to create a stable Uniform Resource Identifier (URI) for the content one would like to share. The URI can be a Uniform Resource Locator (URL), e.g. a web address – so in our case the troublemaker which might change – or a Uniform Resource Name (URN). Well-known examples of URN systems are the ISBN system and the DOI system. The latter works pretty well for scientific journal publications. Using it for smaller pieces of research like images, software and data was not very practical, as you have to be a paying member to get a name space and to generate DOIs. As far as I can see, figshare was a game changer as it is the first platform which offers the creation of DOIs for each item (image, data set, poster, etc.) that is shared on it. Back to the initially described problem – how can I generate a resilient URN for my software?

File types offered by figshare

A test entry

Unfortunately figshare does not offer a data type for this purpose – software or URL. But one can create a “data set” based on a plain text file and simply write the current URL of the software into that file. This is definitely not very clean and just a small hack, but it seems to work. It would be great if figshare saw the need for such a redirection to software sources and offered a new entity type for this. But maybe I am on the wrong track anyway and there are better solutions for my problem. I would be very happy if anybody could refer me to one.
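A minimal sketch of the hack, assuming the shared “data set” is nothing more than a plain text file whose first http(s) URL is the current location of the software. The file name, the example URL and both helper functions are made up for illustration, and nothing here talks to figshare itself.

```python
import re
import tempfile
from pathlib import Path

# Matches the first http(s) URL in the file; everything up to the next
# whitespace character is treated as part of the URL.
URL_PATTERN = re.compile(r"https?://\S+")


def write_pointer_file(path, software_url):
    """Write a tiny plain text file that only records where the software lives."""
    content = f"Current location of the software:\n{software_url}\n"
    Path(path).write_text(content, encoding="utf-8")


def read_pointer_file(path):
    """Return the first URL found in the pointer file, or None if there is none."""
    match = URL_PATTERN.search(Path(path).read_text(encoding="utf-8"))
    return match.group(0) if match else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pointer = Path(tmp) / "software_location.txt"  # hypothetical file name
        write_pointer_file(pointer, "https://example.org/my-tool")  # placeholder URL
        print(read_pointer_file(pointer))
```

The paper would then cite the DOI of the data set, and only the small text file has to be updated when the software moves.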

[1] Wren JD. URL decay in MEDLINE–a 4-year follow-up study. Bioinformatics. 2008 Jun 1;24(11):1381-5. doi:10.1093/bioinformatics/btn127. Epub 2008 Apr 15. PubMed PMID: 18413326

Love is in the air – about the influence of volatile molecules

Posted May 22nd, 2012 by Konrad Förstner

It’s time to resurrect this blog after a long time of silence. So let’s start with a quickie. I hope I will manage to write more frequently again.

Today Jean-Marc Ghigo – principal investigator at the Institut Pasteur – gave a talk at our institute. He presented research results which reveal that volatile molecules can mediate long-range chemical interactions between physically separated bacteria. For example, ammonia was shown to lead to an antibiotic-resistant phenotype in E. coli [1]. Such an influence was found for many other volatile agents. I think it is quite important to keep the interference of such chemicals in mind when doing wet-lab work (luckily I don’t do that). Just think about the situation in which you clean the bench with a detergent that contains ammonia just before you start your experiment. This sounds like a nightmare for the reproducibility of biological experiments to me.

[1] Bernier SP, Létoffé S, Delepierre M, Ghigo JM. Biogenic ammonia modifies antibiotic resistance at a distance in physically separated bacteria. Mol Microbiol. 2011 Aug;81(3):705-16. doi:10.1111/j.1365-2958.2011.07724.x. Epub 2011 Jun 23. PMID: 21651627

Flattr and science (blogging)?

Posted August 5th, 2010 by Konrad Förstner

It looks like Michael Kuhn kick-started the Flattr wave for the science blogging scene. As not everybody knows what this service is, here is a short intro: Flattr is a social micropayment system that makes it very easy to pay content creators like bloggers, podcasters, programmers or photographers small amounts of money with just a click (okay, after a previous registration and money transfer). Your donation (e.g. 5 Euro) is shared equally between all the items you flattred in a month. Especially in Germany Flattr took off in the blogging and podcasting scene – one of the reasons might be Peter Sunde’s talk at the re:publica 2010 in Berlin. So far the project is in an invitation-based closed beta stage. Recently Tim Pritlove – one of the most important podcasters in Germany – roughly estimated that there are currently more than 26,000 Flattr users.

Your piece of the cake?

What about Flattr and science / scientific communication? Until Michael’s tweet I didn’t consider it an option and embedded Flattr buttons only in my coding-related blog. Somehow I had a weird feeling about putting them on this science-related blog (although science is not its only topic). Maybe this was due to my personal understanding that science is not about money. Maybe it was due to the fact that scientists usually (or do I need to say ideally?) have a fixed income and don’t need extra money. Whatever it was, I guess we should give it a try in this field, too. As for the generation of other content, Flattr could give new impulses and opportunities for scientific activities and communication. The currency of science is usually attribution, not necessarily money. Flattr is a mixture of both. People pay attribution to your work but are also willing to give you a little piece of their resources.

The main application of Flattr in the scholarly field might be to motivate more scientists to communicate their research to the public. And this hopefully without locking content away but by making the knowledge a common good through open licenses. It might be a long way, but maybe this service could enable some scientists to become semi-professional communicators of science. I am not sure if it will be a driving force for the communication from scientists to scientists. I also doubt that it can be used to fund (semi-)professional research in the near future. We will see – let’s give it a chance.

Photo by David Goehring.

Augmented Reality in the wet lab

Posted June 14th, 2010 by Konrad Förstner

Eppis with augmented information

It has been quite a while since I last worked in a wet lab, but I still have a lot of respect for the busy bees generating data there. It’s hard to imagine such a lab without personal computers which get data automatically from the detection instruments or manually from the experimenter. But as far as I know the stream of information in an experiment mostly goes in one direction – from the experimenter to the computer. Using the channel from the data pool back to the experimenter could offer interesting opportunities to improve research. There are devices like electronic lab journals, but I am thinking about a further step: augmented reality (AR) for the wet lab.

Petri dish with augmented information

I prepared two quick and dirty mock-ups (upper part = raw image, lower part = image + data layer) and hope they are sufficient to give a rough impression of the idea. In both cases QR codes would be used to identify the object. A smartphone with a camera would detect these items (e.g. microtubes or petri dishes) and overlay the given image with information about the experiment. In the case of the petri dish the size of the colony could additionally be measured (assuming that the petri dish can be used as a reference) and stored as part of the experiment. The applications of such systems are endless and all the technologies are available at low cost. It’s just about connecting everything. Is any project bringing this to reality already? It’s somehow hard to believe that this is not the case, but I couldn’t find anything like it. This could be a cool community project.
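The idea of using the petri dish as a size reference boils down to a simple proportion. Here is a rough sketch, assuming that some image analysis step (not shown) already delivers the diameters of the dish and the colony in pixels and that the real diameter of the dish is known; the function names and the 90 mm dish in the example are my own assumptions.

```python
def mm_per_pixel(dish_diameter_mm, dish_diameter_px):
    """Scale factor derived from the dish, whose real size is known."""
    if dish_diameter_px <= 0:
        raise ValueError("dish diameter in pixels must be positive")
    return dish_diameter_mm / dish_diameter_px


def colony_diameter_mm(colony_diameter_px, dish_diameter_mm, dish_diameter_px):
    """Convert a colony diameter from pixels to millimetres."""
    return colony_diameter_px * mm_per_pixel(dish_diameter_mm, dish_diameter_px)


if __name__ == "__main__":
    # Assumed example values: a 90 mm dish that spans 900 pixels in the photo
    # and a colony that spans 45 pixels.
    print(colony_diameter_mm(45, dish_diameter_mm=90, dish_diameter_px=900))
```

This only holds if the photo is taken roughly from above; a tilted camera would distort the proportion.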

PS: I have to admit there are some small technical hurdles in my examples. The QR codes are pretty small and my smartphone was unable to interpret the codes correctly if they were bent around something.

Attribution: Original images “32-day-wristband-culture” by Flickr user justinbaeder and Bacteria solution by Flickr user kaibara.

Tree - Help us gardening

Some weeks ago my brother, who works in the field of neuroscience, made me aware of a genealogical tree of his scientific community called NeuroTree. This inspired me to think about such a tree for science in general, which might give interesting insights into the scientific community and reveal connections that usually only insiders know (or would like to know). Michael Kuhn pointed out that such an endeavor already exists. In fact the NeuroTree is a part of this general Academic Family Tree. Until recently it comprised 15 subtrees of different scientific communities. As a tree for Computational Biology/Bioinformatics was missing at that point, I asked Stephen V David, who is responsible for this project, if it would be possible for him to set up a tree for the Computational Biology and Bioinformatics crowd. Additionally I proposed to use a Creative Commons license for its data (we discussed different options – Stephen decided to take CC-BY). Soon after that the freshly planted Computational Biology Tree (suggested abbreviation: CompBioTree) got its first branches and leaves. Michael happily joined me in gardening. We both hope that we are not the only ones who want to see the tree grow, so please feel free to contribute.

Additionally we are open for suggestions and feedback. I have also some points in mind to improve the tree and the underlying system. To mention some:

  • A new logo (the one I took is just an initial solution)
  • RSS for changes
  • SSL login
  • Show image source
  • Pubmed links
  • Show history of an entry
  • ORCID (if this ever comes true it would have many implications for this project, e.g. the IDs could be used as base IDs)
  • (RESTful) API
  • SVG generation
  • Frequent data dumps in different formats
  • Publish the software under an open source license

All of these points are nothing more than my wishes and have to be discussed with Stephen. I cannot promise anything at this point, only that any feedback is welcome. Apropos feedback and communication: I first thought about creating a blog and a mailing list for updates about the tree, but Michael convinced me that this is not necessary at the moment. So far I have created a Friendfeed group. If there is a need at some point, maybe OpenWetWare could be a place to set up a blog and a page for collecting ideas.

Image by Jim Linwood

A first taste of the distributed social web

Posted April 13th, 2010 by Konrad Förstner

The days of social data silos are finally numbered … yesterday OneSocialWeb made a plug-in for OpenFire and a console client available for download. These two programs are the first bricks for building a distributed social network.

A screen shot of the web client

A web client and a client for Android will follow soon (I was lucky – Laurent Eschenauer provided me with an account for an installation of the rudimentary web client). If you play around with the programs you will notice that they have many rough edges and lack many functions so far – they are at a very early stage. Despite this, OneSocialWeb is in my opinion one of the most promising projects in this field, and a look at the protocol specification made me quite excited. Everything is built on XMPP (=> real-time) and other open standards. On the list of functions are Activity Stream over XMPP (which currently works), VCard4 over XMPP, Social Relationships and a Personal Eventing Protocol (PEP) Inbox. If you like to play around with early-stage (console) tools, have a look at the github repositories of the project and help to improve them.

Update: If you need an account for a running server – ping Laurent.

Last week I registered with Cliqset in the hope of finally finding the last über-aggregator I need to combine all my live stream services. As my expectations were not fulfilled, I tried to explore my disappointment and to find out what I was missing in all the aggregators I have tested so far. In my opinion the problem is that platforms like friendfeed, cliqset, lifestream.ff etc. are built to spread my content to other platforms and to bundle my various output streams into one. The function that is missing is the insertion of other people’s streams from outside the platform. An example might make this clearer: I would like to read the postings of all the people I follow on e.g. twitter and friendfeed in one service. This service could be friendfeed. What I see there in my personal view are the streams of all the people who have a friendfeed account and whom I follow. If all the people I follow on twitter had a friendfeed account and had added their twitter account to their friendfeed stream, I could theoretically unfollow them on twitter and just read their posts on friendfeed. Unfortunately this is not reality, and only a fraction of the people I follow on twitter also own a friendfeed account. So I would like to tell friendfeed to take all the status messages of the people I follow on twitter and integrate them, like postings of friendfeed users, into the stream that I see. At this point redundant postings (same person – same text – different platforms) could be removed.

Let’s talk tech for the case that I still could not make myself understood: I want a combined, real-time, non-redundant stream of

  • and

If you step back now and have a look at what I have just described, you see that this is kind of a real-time RSS reader (time for PubSub, isn’t it?) including all the other goodies that friendfeed offers (e.g. sending to multiple platforms, commenting). Another description would be a multi-service client for status services.
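The merging and de-duplication described above can be sketched in a few lines. This is only a batch version under strong assumptions: the posts have already been fetched somehow, the same person uses the same name on every platform, and two posts count as redundant if author and text match after ignoring case and extra whitespace. The `Post` class and `merge_streams` are made-up names, not part of any real service API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    author: str
    text: str
    platform: str
    posted_at: datetime


def merge_streams(*streams):
    """Merge several streams newest-first, keeping one copy per (author, text)."""
    seen = set()
    merged = []
    # Walk through all posts from oldest to newest so that the earliest copy
    # of a cross-posted message is the one that survives.
    all_posts = sorted(
        (post for stream in streams for post in stream),
        key=lambda post: post.posted_at,
    )
    for post in all_posts:
        key = (post.author.lower(), " ".join(post.text.split()).lower())
        if key in seen:
            continue  # same person, same text, different platform
        seen.add(key)
        merged.append(post)
    merged.reverse()  # newest first, like a typical stream view
    return merged
```

A real-time version would apply the same key check to every incoming post instead of sorting a finished list.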

My question: Is it already possible to do this with friendfeed or one of its clones and I just missed it? Or am I the only one interested in that? Is there something like this around? Even Regine, who is usually quite well informed about the options of such tools, could not help me (but inspired me to do a little visualization). … Maybe the solution will be Google buzz, which just arrived (not yet available for me).

Update: Google buzz seems to be more a part of the problem than a part of the solution. Maybe Cliqset was not such a bad hint. It offers a FeedProxy for many other services and makes it very easy to harvest the streams nicely wrapped using Activity Streams. This could easily be used to write something from scratch. But I still hope to find a ready-made solution.