This is the final post of “the dao of flow”. So Long, and Thanks for All the Fish.
TL;DR: Please endorse my application for the Open Science Policy Platform of the European Commission in the comments of this blog post. Please also consider applying yourself or supporting others in becoming members of that platform.
The European Commission currently has an open call for applicants who will represent the community at the Open Science Policy Platform (OSPP). It will be an expert group that advises the European Commission. This is a great opportunity for the community to push for Open Science top-down in addition to the commonly practiced bottom-up approaches.
There are two modes of applying to become a member of this platform: either as a representative of an organization or as an individual representing the community. I am happy to see that some of the rock stars of Open Science – Björn Brembs and Cameron Neylon (Update: also Daniel Mietchen) – are going to apply and are currently asking for endorsement. Without any doubt they MUST be members of the OSPP. Furthermore, I have suggested that the OKF Open Science Group send one representative, and I think Jenny Molloy is the ideal candidate for this.
The call states that the OSPP is going to consist of 20 to 30 members, and I think it is very important that more people from the community try to join this platform. For this reason I am planning to apply to become a member and would like to state why I think I bring the required competence.
As a bioinformatician at the University of Würzburg, Germany, I am an active researcher who is involved in numerous projects (see my list of publications). I develop open source software (e.g. READemption), try to make my analyses reproducible (e.g. this repo), use pre-print servers (READemption manuscript) and generate open educational resources (e.g. a Unix Shell course for Biologists). Since the start of my PhD I have been contributing to the discussion of how the exchange of information in science can be improved. I started a “Science 2.0” talk series about this in 2006 at the EMBL (the page is unfortunately offline by now) and organized the First Online EMBL PhD Symposium in the same year. Since then I have given several talks and co-authored papers (e.g. An Open Science Peer Review Oath) about Open Science. I am a founding member of the German-speaking OKFN Open Science Working Group, in which I try to connect members of the community, e.g. by organizing calls and workshops. Furthermore, I am co-host of Open Science Radio – a podcast about Open Science founded by Matthias Fromm. Due to my activities I know the struggle one faces when trying to make science open and transparent. Openness is a cornerstone of science, and I personally would like to work much more openly but am restricted as I do not have the required permissions. Especially as someone who mostly works in collaborations, I see that many researchers would like to work more openly but are afraid to do so because the current evaluation system does not address or reward this adequately.
I would be very happy if I could be one of the representatives of the open science community in the OSPP, and I would like to ask you to endorse me by posting a comment on this blog post.
I am just back from this year’s BOSC (Bioinformatics Open Source Conference) 2015 in Dublin. BOSC is definitely one of my favorite community gatherings. One reason for this is that the common value – the understanding that openness is an essential part of science – is omnipresent and the foundation of every project presented there. During the two days of the meeting numerous, mostly short, talks were given, and I think the format is ideal for being exposed to many projects in a short time. The talks are recorded and will soon be publicly available. Anybody who is interested in the discussion that took place online (a tweet dump of #BOSC2015 had more than 3000 tweets!) can have a look at the storified collections (day 1, day 2). Some of the photos I took are on flickr. Despite the broad range of topics, certain patterns were clearly detectable:
1) Docker, docker, docker
There was barely a talk that did not mention docker. Either it was already part of, e.g., the deployment process or its integration was planned. As seen in other fields, it is clear that docker (or other container implementations) is here to stay.
2) Common Workflow Language
The Common Workflow Language seems to be on a similar track as docker but in a much earlier phase. Still, considering that it is a rather new initiative, it has managed to win quite a few supporters and adoption by several projects.
3) Funding of projects
The majority of the presented projects that could be considered the backbone, or at least very important toolboxes, of the bioinformatics community, like Biopython, BioGems, BioJS or bionode, are mostly run as side projects and don’t have dedicated funding. The same is the case for other very promising endeavors like OpenSNP. The development is done in spare time and/or with a little bit of tolerance from the developers’ funding/hosting institutions. This is not a sustainable working mode and quite scary considering how much we rely on these projects. Funding bodies still seem to miss the importance of such projects: common grants are hard to get for them and/or their structure is not well suited to this open source community mode of development. It is an old but unsolved problem that gets bigger with the growing importance of bioinformatics data analysis.
Update: Mark Hahnel just pointed out that such test entities should not be made public (so you don’t need to test it – I can tell you that it works) and that there will be a sandbox for such purposes soon. But in general the approach could be used. Figshare has been considering code submission for a while but wants to make sure to get things right and to do this without reinventing the wheel. He also pointed me to a recent tweet linking to an even better solution (which is so far a proof of concept) than the one presented below.
I am planning to submit a paper which covers a bioinformatics software tool. As usual I would like to provide a link to a web page where the software can be downloaded and which contains a reference to the software repository. So far I am not sure if this page will be hosted on the institute server, GitHub or another location. Here a problem arises that many scientists face if they want to add a URL to a scientific publication. As long as an article is a static piece of text and is not easily editable, it is nearly impossible for the author to inform the reader about changes of a URL. Maybe I am missing something, but there seems to be no common recommendation by publishers about how to tackle this issue. The general solution would be to create a stable Uniform Resource Identifier (URI) for the content one would like to share. The URI can be a Uniform Resource Locator (URL), e.g. a web address – so in our case the troublemaker which might change – or a Uniform Resource Name (URN). Well-known examples of such persistent naming systems are the ISBN system and the DOI system. The latter works pretty well for scientific journal publications. Using it for smaller pieces of research like images, software and data was not very practical, as you have to be a paying member to get a name space and to generate DOIs. As far as I can see, figshare was a game changer as it is the first platform which offers the creation of DOIs for each item (image, data set, poster, etc.) that is shared on it.
Back to the initially described problem – how can I generate a resilient URN for my software? Unfortunately figshare does not offer a data type for this purpose – software or URL. But one can create a “data set” based on a plain text file and simply write the current URL of the software into that file. This is definitely not very clean and just a small hack, but it seems to work. It would be great if figshare sees the need for such a redirection to software sources and offers a new entity type for this.
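To make the hack a bit more concrete, here is a minimal Python sketch of how a reader (or a script) could get from the DOI to the current software location. It only assumes that the deposited item is a plain text file containing the URL on a line of its own. The DOI `10.1234/example.5678` and the address `https://example.org/my-tool` are made up for illustration, and the function names are my own invention, not part of any figshare API. Downloading the file itself is left out on purpose.

```python
from urllib.parse import urlparse


def doi_to_resolver_url(doi: str) -> str:
    """Build the public resolver link (https://doi.org/<DOI>) for a DOI."""
    return "https://doi.org/" + doi.strip()


def extract_software_url(pointer_text: str) -> str:
    """Return the first line of the pointer file that is an http(s) URL.

    `pointer_text` is the content of the plain text "data set" that
    holds the current location of the software.
    """
    for line in pointer_text.splitlines():
        candidate = line.strip()
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return candidate
    raise ValueError("no http(s) URL found in pointer text")


if __name__ == "__main__":
    # Hypothetical DOI and hypothetical file content, for illustration only.
    print(doi_to_resolver_url("10.1234/example.5678"))
    pointer = "Current location of the software:\nhttps://example.org/my-tool\n"
    print(extract_software_url(pointer))
```

The paper would then cite only the DOI, and whenever the software moves, only the text file behind it has to be updated.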
But maybe I am on the wrong track anyway and there are better solutions for my problem. I would be very happy if anybody could refer me to one.
It’s time to resurrect this blog after a long time of silence. So let’s start with a quick one. I hope I will manage to write more frequently again.
Today Jean-Marc Ghigo – principal investigator at the Institut Pasteur – gave a talk at our institute. He presented his research results, which reveal that volatile molecules can mediate long-range chemical interactions between physically separated bacteria. For example, ammonia was shown to lead to an antibiotic-resistant phenotype in E. coli. Such an influence was found for many other volatile agents. I think it is quite important to keep the effect of such chemical interference in mind when doing wet-lab work (luckily I don’t do that). Just think about the situation that just before you start your experiment you clean the desk with a detergent that contains ammonia. This sounds like a nightmare for the reproducibility of biological experiments to me.
 Bernier SP, Létoffé S, Delepierre M, Ghigo JM. Biogenic ammonia modifies antibiotic resistance at a distance in physically separated bacteria. Mol Microbiol. 2011 Aug;81(3):705-16. doi:10.1111/j.1365-2958.2011.07724.x. Epub 2011 Jun 23. PMID: 21651627
It looks like Michael Kuhn kick-started the Flattr wave for the science blogging scene. As not everybody knows what this service is, here is a short intro: Flattr is a social micropayment system that makes it very easy to pay content creators like bloggers, podcasters, programmers or photographers small amounts of money with just a click (okay, after a previous registration and money transfer). Your donation (e.g. 5 Euro) is equally shared between all the items you flattred in a month. Especially in Germany Flattr took off in the blogging and podcasting scene – one of the reasons might be Peter Sunde’s talk at the re:publica 2010 in Berlin. So far the project is in an invitation-based closed beta stage. Recently Tim Pritlove – one of the most important podcasters in Germany – roughly estimated that there are currently around 26,000+ Flattr users.
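The equal split is simple arithmetic, and a tiny Python sketch may make it tangible. It assumes nothing beyond what is described above: one monthly budget, divided evenly by the number of flattred items. Any service fee is ignored here, and the function name is made up for this example.

```python
from decimal import Decimal


def flattr_share(monthly_budget: Decimal, items_flattred: int) -> Decimal:
    """Amount each flattred item receives when the budget is split equally."""
    if items_flattred <= 0:
        raise ValueError("at least one item must be flattred")
    return monthly_budget / items_flattred


if __name__ == "__main__":
    # A budget of 5 Euro spread over 10 flattred items.
    print(flattr_share(Decimal("5"), 10))
```

So the more items you flattr in a month, the smaller the piece each creator gets, while your own spending stays fixed.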
What about Flattr and science / scientific communication? Until Michael’s tweet I didn’t consider it as an option and embedded Flattr buttons only in my coding-related blog. Somehow I had a weird feeling about putting them on this science-related blog (although science is not its only topic). Maybe this was due to my personal understanding that science is not about money. Maybe it was due to the fact that scientists usually (or do I need to say ideally?) have a fixed income and don’t need extra money. Whatever it was, I guess we should give it a try in this field as well. As for the generation of other content, Flattr could give new impulses and opportunities for scientific activities and communication. The currency of science is usually attribution, not necessarily money. Flattr is a mixture of both. People pay attribution to your work but are also willing to give you a little piece of their resources.
The main application of Flattr in the scholarly field might be to motivate more scientists to communicate their research to the public. And this hopefully without locking content away but by making the knowledge a common good through open licenses. It might be a long way, but maybe this service could enable some scientists to become semi-professional communicators of science. I am not sure if it will be a driving force for the communication of scientists to scientists. I also doubt that it can be used to fund (semi)-professional research in the near future. We will see – let’s give it a chance.
It has been quite a while since I last worked in a wet lab, but I still have a lot of respect for the busy bees generating data there. It’s hard to imagine such a lab without personal computers which get data automatically from the detecting apparatus or manually from the experimenter. But as far as I know, the stream of information of an experiment mostly goes in one direction – from the experimenter to the computer. Using the channel from the data pool back to the experimenter could offer interesting opportunities to improve research. There are devices like electronic lab journals, but I am thinking of a further step: Augmented reality (AR) for the wet lab.
I prepared two quick and dirty mock-ups (upper part = raw image, lower part = image + data layer) and hope they are sufficient to give a rough impression of the idea. In both cases QR codes would be used to identify the object. A smartphone with a camera would detect these items (e.g. microtubes or petri dishes) and overlay the given image with information about the experiment. In the case of the petri dish, the size of the colony could additionally be measured (assuming that the petri dish can be used as a reference) and stored as part of the experiment. The applications of such systems are endless and all the technologies are available at low cost. It’s just about connecting everything. Is any project bringing this to reality already? It’s somehow hard to believe that this is not the case, but I couldn’t find anything like it. This could be a cool community project.
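The colony measurement boils down to a simple scale conversion, which can be sketched in a few lines of Python. This is only an illustration under assumptions: the diameters of the dish and the colony have already been measured in pixels (the image analysis itself is not shown), and the real dish diameter is known. The default of 90 mm is my assumption for a typical dish and would have to be adapted, and the function name is hypothetical.

```python
def colony_diameter_mm(colony_px: float, dish_px: float, dish_mm: float = 90.0) -> float:
    """Convert a colony diameter from pixels to millimetres.

    The petri dish serves as the reference object: its known real
    diameter (`dish_mm`, assumed to be 90 mm by default) and its
    diameter in the image (`dish_px`) define the scale.
    """
    if dish_px <= 0 or dish_mm <= 0:
        raise ValueError("dish diameters must be positive")
    if colony_px < 0:
        raise ValueError("colony diameter must not be negative")
    mm_per_px = dish_mm / dish_px  # scale of the image
    return colony_px * mm_per_px


if __name__ == "__main__":
    # Dish spans 900 px in the photo, colony spans 50 px.
    print(round(colony_diameter_mm(50, 900), 2))
```

This only works well if the photo is taken roughly from straight above, as perspective distortion would otherwise skew the scale.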
PS: I have to admit there are some small technical hurdles in my examples. The QR codes are pretty small and my smartphone was unable to interpret the codes correctly if they were bent around something.
Some weeks ago my brother, who works in the field of Neuroscience, made me aware of a genealogical tree of his scientific community called NeuroTree. This inspired me to think about such a tree for science in general, which might give interesting insights into the scientific community and reveal connections that usually only insiders have (or would like to have). Michael Kuhn pointed out that such an endeavor exists already. In fact the NeuroTree is a part of this general Academic Family Tree. Until recently it comprised 15 subtrees of different scientific communities. As a tree for Computational Biology/Bioinformatics was missing at that point, I asked Stephen V David, who is responsible for this project, if it would be possible for him to set up a tree for the Computational Biology and Bioinformatics crowd. Additionally I proposed to use a Creative Commons license for its data (we discussed different options – Stephen decided to take the CC-BY). Soon after that the freshly planted Computational Biology Tree (suggested abbreviation: CompBioTree) got its first branches and leaves. Michael happily joined me in gardening. We both hope that we are not the only ones who want to see the tree grow, so please feel free to contribute.
Additionally we are open to suggestions and feedback. I also have some points in mind to improve the tree and the underlying system. To mention some:
- A new logo (the one I took is just an initial solution)
- RSS for changes
- SSL login
- Show image source
- Pubmed links
- Show history of an entry
- ORCID (if this ever comes true it would have many implications for this project, e.g. the identifiers could be used as a base ID)
- (RESTful) API
- SVG generation
- Frequent data dumps in different formats
- Publish the software under an open source license
All of these points are nothing more than my wishes and have to be discussed with Stephen. I cannot promise anything at this point, only that any feedback is welcome. Apropos feedback and communication: I first thought about creating a blog and a mailing list for updates about the tree, but Michael convinced me that this is not necessary currently. So far I have created a Friendfeed group. If there is the need at some point, maybe OpenWetWare could be a place to set up a blog and a page for collecting ideas.