-
Excel messes up your data analysis :)
Well, no wonder: Excel is meant for processing money flows. Anyway, greyarea pointed me to this nice blog item from March 2006. It discusses a 2004 article in BMC Bioinformatics, "Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics", by Barry Zeeberg et al. (DOI:10.1186/1471-2105-5-80). Hence the importance of semantics and proper markup languages. The quotes are illustrative: -
Optical Chemical Structure Recognition
Days after the release of OSRA last week, I saw optical chemical structure recognition on the front page of my favorite Dutch /. equivalent, Tweakers.net, in the article "Duitsers leren computer chemische structuren herkennen" ("Germans teach computers to recognize chemical structures"), written by René Gerritsen. The article discusses the Fraunhofer Institute’s ChemoCR, which was, IIRC, presented as a poster at last year’s German Conference on Chemoinformatics (to be held again this year). Meanwhile, the CCL.net mailing list had a discussion on the alternatives too; I think it is fair to say that the chemical community realizes the importance of these tools. Below is a short overview of the available tools, including some important information regarding integration into workflows. -
RDF-ing molecular space
RDF might be the solution we are looking for to get a grip on the huge amount of information we are facing. Microformats and RDFa are just solutions along the way, and Gleaning Resource Descriptions from Dialects of Languages (GRDDL) might be an important tool to get the web RDF-ied. -
Screencasts for life science informatics
Deepak blogged about screencasting for bio topics, centered on bioscreencast.com, of which he is co-owner. I guess it is like a YouTube for bioinformatics thingies. Jean-Claude picked this up very quickly (seen on Cb? At least I did.), and has already uploaded a screencast, demoing JSpecView written by Robert. I wonder if he will upload the screencasts he made for Bioclipse too? (hint, hint … :) -
The CDK data model #1
The Chemistry Development Kit has a rich set of data classes, each of which is defined by an interface. While the classes for atoms, bonds and a connectivity table are fairly straightforward, things beyond that are sometimes not entirely clear. I will now discuss all interfaces in a series of blog items, starting with the IChemFile. Christoph, please correct me if I move too far away from our Notre Dame board sketch.