<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/taverna.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-04-19T09:50:36+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/taverna.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">Oscar text mining in Taverna</title><link href="https://chem-bla-ics.linkedchemistry.info/2010/10/21/oscar-text-mining-in-taverna.html" rel="alternate" type="text/html" title="Oscar text mining in Taverna" /><published>2010-10-21T00:00:00+00:00</published><updated>2010-10-21T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2010/10/21/oscar-text-mining-in-taverna</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2010/10/21/oscar-text-mining-in-taverna.html"><![CDATA[<p>One of the goals of my <a href="https://chem-bla-ics.linkedchemistry.info/2010/10/15/working-on-oscar-for-three-months.html">project in Cambridge <i class="fa-solid fa-recycle fa-xs"></i></a>
is to make <a href="http://oscar3-chem.sourceforge.net/">Oscar</a> available as a <a href="http://taverna.sf.net/">Taverna</a> plugin
(<a href="https://bitbucket.org/egonw/oscar4-taverna">source code</a>, <a href="https://hudson.ch.cam.ac.uk/job/oscar4-taverna/">Hudson build</a>).
I have made some progress, but I am still struggling to get the update site working. The plugin actually installs into
<a href="http://www.mygrid.org.uk/2010/07/taverna-220-workbench-and-command-line-tool-are-released/">Taverna 2.2.0</a>, but the
activities do not show up. While this is work in progress, and the other project goal is refactoring, a current demo
workflow looks like:</p>

<p><img src="/assets/images/oscarTaverna.png" alt="" /></p>

<p>Example input would be: <em>This is a list of ethanol, methanol, and 2,4,6-trinitrotoluene.</em></p>

<p>The plain text input can be linked to the pdf2text <a href="http://www.slideshare.net/markmoby/sadi-in-taverna-tutorial">SADI service</a>,
and the CML is suitable for the <a href="http://chem-bla-ics.blogspot.com/2010/03/cdk-taverna-paper-published.html">CDK-Taverna plugin</a>,
which is currently being updated by Andreas, Achim, and <a href="http://www.steinbeck-molecular.de/steinblog/">Christoph</a> for
Taverna 2.2. As soon as the update site is properly working, I will upload a demo workflow to
<a href="http://www.myexperiment.org/">MyExperiment.org</a>.</p>

<p>I guess the next activity (node in the workflow) will be around the dictionaries, as the
<a href="http://opsin.ch.cam.ac.uk/">OPSIN</a> activity converts only IUPAC names into connection tables. I was told OPSIN parses 97%
of the IUPAC names it finds, and when it does, the result is almost 100% correct. Want to challenge the code?
Use <a href="http://opsin.ch.cam.ac.uk/">this web service</a>.</p>]]></content><author><name>Egon Willighagen</name></author><category term="oscar" /><category term="taverna" /><category term="inchikey:OKKJLVBELUTLKV-UHFFFAOYSA-N" /><category term="inchikey:LFQSCWFLJHTTHZ-UHFFFAOYSA-N" /><category term="inchikey:SPSSULHKWOKEEL-UHFFFAOYSA-N" /><summary type="html"><![CDATA[One of the goals of my project in Cambridge is to make Oscar available as Taverna plugin (source code, Hudson build). I have progressed somewhat, but still struggling with getting the update site working. The plugin actually installs into Taverna 2.2.0, but the activities do not show up. While this is work in progress, and the other project goal is refactoring, a current demo workflow looks like:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/oscarTaverna.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/oscarTaverna.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Installation HOWTO for CDK-Taverna 0.5.1.1 in Taverna 1.7.2</title><link href="https://chem-bla-ics.linkedchemistry.info/2010/01/17/installation-howto-for-cdk-taverna-0511.html" rel="alternate" type="text/html" title="Installation HOWTO for CDK-Taverna 0.5.1.1 in Taverna 1.7.2" /><published>2010-01-17T00:00:00+00:00</published><updated>2010-01-17T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2010/01/17/installation-howto-for-cdk-taverna-0511</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2010/01/17/installation-howto-for-cdk-taverna-0511.html"><![CDATA[<p><a href="http://cdktaverna.wordpress.com/">Thomas</a> made a <a href="http://cdktaverna.wordpress.com/2010/01/17/cdk-taverna-version-0-5-1-1-released/">new release of CDK-Taverna</a>
for the <a href="http://www.taverna.org.uk/">Taverna</a> 1.7.2 release, which is great news as the previous release was for Taverna 1.7.1.</p>

<p>He asked me to test it, so I set up a fresh Taverna installation and installed the new plugin. After that, I used the <a href="http://myexperiment.org/">MyExperiment</a>
plugin to download one of the <a href="http://www.myexperiment.org/search?query=cdk-taverna&amp;type=workflows">CDK-Taverna workflows Thomas has on MyExperiment</a>,
and tuned it a bit to use some local input instead of the database. I took some screenshots while at it, and will use those now to talk you through the
installation of Taverna and the <a href="http://cdk-taverna.de/">CDK-Taverna</a> plugin.</p>

<h3 id="download-taverna">Download Taverna</h3>

<p>Taverna 1.7.2 can be downloaded from <a href="http://www.mygrid.org.uk/tools/taverna/taverna-1/taverna-download/">this download page</a>, but I took the
Linux version from the <a href="http://sourceforge.net/projects/taverna/files/taverna/1.7.2/">SourceForge download site</a>. I cannot detail the OS/X or
Windows installation, but on Linux you simply unzip the downloaded file, and you’re ready to go:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cd </span>taverna-1.7.2/
<span class="nv">$ </span>sh runme.sh
</code></pre></div></div>

<h3 id="plugin-installation">Plugin Installation</h3>

<p>Plugins can be installed using the <em>Plugin manager</em>, which can be accessed via the <em>Tools</em> menu:</p>

<p><img src="/assets/images/cdktav4.png" alt="" /></p>

<p>Clicking the <em>Find New Plugins</em> button takes you to a second dialog listing known plugin sites, and the default download has several already:</p>

<p><img src="/assets/images/cdktav1.png" alt="" /></p>

<p>The CDK-Taverna update site is available at <em>http://cdk-taverna.de/plugin/</em>, and we can make Taverna aware of this update site by clicking the
<em>Add Plugin Site</em> button:</p>

<p><img src="/assets/images/cdktav.png" alt="" /></p>

<p>After filling in these values and confirming with the <em>OK</em> button, the site will show up in the dialog listing all available plugins,
where you need to tick the check box in front of the CDK-Taverna plugin name, as done in this screenshot:</p>

<p><img src="/assets/images/cdktav2.png" alt="" /></p>

<p>You can then hit the <em>Install</em> button after which the plugin will be downloaded:</p>

<p><img src="/assets/images/cdktav3.png" alt="" /></p>

<p>After it is done downloading the plugin, you can close the <em>Plugin Sites</em> and <em>Plugin Manager</em> dialogs. I shut down and restarted Taverna with
<code class="language-plaintext highlighter-rouge">sh runme.sh</code>, but I am not entirely sure this is needed. After that, the CDK nodes showed up in the list of Taverna processors:</p>

<p><img src="/assets/images/cdktav5.png" alt="" /></p>

<h3 id="myexperiment-plugin">MyExperiment Plugin</h3>

<p>Using the same Taverna <em>Plugin Manager</em> you can also install the MyExperiment plugin that allows you to search, browse, preview and download
Taverna workflows from the MyExperiment website from within Taverna itself. I installed the plugin, and then used it to search for CDK workflows
(and downloaded a QSAR workflow):</p>

<p><img src="/assets/images/cdktav6.png" alt="" /></p>

<p>This is about everything you need to get going. It’s not particularly rocket science, but I guess this howto is useful as you get to see what
you should expect when setting up a CDK-Taverna environment. If you have further questions, please leave those in the comments section,
and I’ll try to merge in answers where possible, or otherwise in the reactions too.</p>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="taverna" /><summary type="html"><![CDATA[Thomas made a new release of CDK-Taverna for the Taverna 1.7.2 release, which is great news as the previous release was for Taverna 1.7.1.]]></summary></entry><entry><title type="html">Details behind the “Calling XMPP cloud services from Taverna2”</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/01/21/details-behind-calling-xmpp-cloud.html" rel="alternate" type="text/html" title="Details behind the “Calling XMPP cloud services from Taverna2”" /><published>2009-01-21T00:00:00+00:00</published><updated>2009-01-21T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/01/21/details-behind-calling-xmpp-cloud</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/01/21/details-behind-calling-xmpp-cloud.html"><![CDATA[<p>On Monday I showed <a href="https://chem-bla-ics.linkedchemistry.info/2009/01/19/calling-xmpp-cloud-services-from.html">two screenshot <i class="fa-solid fa-recycle fa-xs"></i></a> showing our
<a href="https://chem-bla-ics.linkedchemistry.info/2009/01/19/calling-xmpp-cloud-services-from.html">new XMPP-based web/cloud services <i class="fa-solid fa-recycle fa-xs"></i></a> in action
inside <a href="http://taverna.sf.net/">Taverna</a>.</p>

<p>I promised details, but realize I have actually already posted a lot of them <a href="https://chem-bla-ics.linkedchemistry.info/2008/10/31/next-generation-asynchronous.html">in October <i class="fa-solid fa-recycle fa-xs"></i></a>:</p>

<blockquote>
  <p>Johannes ideas led to the <a href="http://xmpp.org/extensions/xep-0244.html">IO-DATA proposal</a> (XEP-0244), which is currently
marked experimental and being discussed on the ws-xmpp mailing list. He gathered a few people around him to get it going,
resulting in working stuff! Yeah!</p>
</blockquote>

<p><a href="http://miningdrugs.blogspot.com/">Joerg</a> <a href="http://friendfeed.com/e/a15e79ac-92ce-4b16-81d9-8f7b6ec1ea24/chem-bla-ics-Calling-XMPP-cloud-services-from/">asked</a>
<em>Could you post more results, what is it, why do we need it, e.g. why are you mentioning SOAP and cloud? Do not know enough to see the bonus right now.</em></p>

<p><strong>What is it?</strong> IO-DATA is a protocol on top of the XMPP protocol to allow machine-to-machine communication,
much like SOAP, RPC, and other platforms. How IO-DATA differs lies to some extent in the transport layer: instead of
using HTTP, it uses the XMPP transport protocol, also used for Jabber chat clients. It basically allows clients like
Taverna to chat with services running elsewhere.</p>

<p><strong>Why do we need it?</strong> Most services run over HTTP, making them web services. This is convenient, because there is
much infrastructure around, like web browsers. REST services also take advantage of this. However, for heavy
computing this sometimes leads to problems. For example, routers are known to have timeouts on HTTP connections.
To solve this, SOAP services often introduce a polling mechanism. IO-DATA takes a different approach. Instead of
having to ask all the time how a calculation is doing, you can just wait for the service to send you a message
when it is done. Instead of working around the lack of asynchronous aspects, IO-DATA introduces these in the protocol.</p>
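
<p>To make the difference concrete, here is a minimal sketch in Python. It does not use XMPP or IO-DATA at all; it only
contrasts the two calling styles, and the names <code>slow_service</code>, <code>call_with_push</code> and
<code>call_with_polling</code> are made up for this illustration.</p>

```python
import asyncio


async def slow_service(smiles: str, on_done) -> None:
    """Hypothetical long-running service that pushes its result when ready."""
    await asyncio.sleep(0.05)  # stands in for a heavy calculation
    on_done(f"result for {smiles}")


async def call_with_push(smiles: str) -> str:
    """Push style: wait once for the message instead of asking repeatedly."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    task = asyncio.create_task(slow_service(smiles, done.set_result))
    result = await done  # no polling loop here
    await task
    return result


async def call_with_polling(smiles: str, interval: float = 0.01):
    """Polling style: keep asking 'are you done yet?' until the answer is yes."""
    box = []
    task = asyncio.create_task(slow_service(smiles, box.append))
    polls = 0
    while not box:
        polls += 1
        await asyncio.sleep(interval)
    await task
    return box[0], polls


if __name__ == "__main__":
    print(asyncio.run(call_with_push("CCC")))
    print(asyncio.run(call_with_polling("CCC")))
```

<p>The push variant has no loop at all: the client simply waits for the one message that matters, which is the behaviour
the asynchronous design is after.</p>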

<p>Other interesting features are that IO-DATA integrates the interface formats for services into the service
itself (SOAP needs WSDL for this), and that it features service discovery via DISCO. The latter is done with SOAP
too, for example with UDDI and BioMoby. BioMoby also adds strong data typing for input and output of services.</p>

<p>IO-DATA addresses the data typing by letting you ask the service what XML Schema it uses for input and output.
While XML Schema has alternatives, which may be preferred in some situations, it does allow strong data typing
and supports <a href="http://friendfeed.com/e/2d322ac5-a5b9-4336-b421-fede0eb8e192/Hi-Guys-I-m-looking-for-an-exhaustive-resource-of/">a lot of formats in life sciences</a>
(which I’ll summarise soon).</p>

<p>Moreover, if there happens to be no suitable schema around, you can just define one yourself, which can
be as simple as a single element wrapper around some custom text-based format. You worry about supporting many
formats? Well, no need. Johannes’ xws4j library, which I used for the Taverna plugin too, allows compiling Java
binding code. Bioclipse’s script environment allows you to do this on the fly: you find a service, ask for the
schema, compile bindings for input and output, set up the input with the input binding, send it off to the service,
and use the output binding for convenient access to the computation results. Without having to reboot Bioclipse.
Isn’t that <strong>cool</strong>? Can your software do that? (See <a href="http://gist.github.com/22185">this example Gist</a>: the io
factory creates the binding).</p>
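
<p>As a rough illustration of the "single element wrapper" idea only, here is a small Python sketch. It does not use
xws4j, Bioclipse or IO-DATA, and the element name <code>customData</code> is an assumption made up for the example; a real
service would define its own name in its schema.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical element name; IO-DATA itself does not prescribe it.
WRAPPER = "customData"


def wrap(text: str) -> str:
    """Put a custom text-based payload inside a single XML element."""
    element = ET.Element(WRAPPER)
    element.text = text
    return ET.tostring(element, encoding="unicode")


def unwrap(xml: str) -> str:
    """Recover the original text from the wrapper element."""
    element = ET.fromstring(xml)
    if element.tag != WRAPPER:
        raise ValueError(f"unexpected element: {element.tag}")
    return element.text or ""


if __name__ == "__main__":
    message = wrap("CCC")
    print(message)
    print(unwrap(message))
```

<p>The XML library takes care of escaping, so the wrapped format may contain characters that are special in XML.</p>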

<p><strong>Why do I mention SOAP and the cloud?</strong> It should be clear from the above why I mention SOAP: IO-DATA offers the same
functionality, but more conveniently, we think. I mention the cloud to refer to cloud computing, which is doing
computation in the cloud, a synonym for the internet (see
<a href="http://en.wikipedia.org/wiki/Cloud_computing">Cloud Computing @ Wikipedia</a>). Because it does
not use HTTP, we do not feel we can call it a web service. Instead, cloud computing is a more general term, not
tied to any particular architecture. IO-DATA is just one possible architecture, one we think is promising for
life science applications.</p>]]></content><author><name>Egon Willighagen</name></author><category term="xmpp" /><category term="taverna" /><summary type="html"><![CDATA[On Monday I showed two screenshot showing our new XMPP-based web/cloud services in action inside Taverna.]]></summary></entry><entry><title type="html">Calling XMPP cloud services from Taverna2</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/01/19/calling-xmpp-cloud-services-from.html" rel="alternate" type="text/html" title="Calling XMPP cloud services from Taverna2" /><published>2009-01-19T00:10:00+00:00</published><updated>2009-01-19T00:10:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/01/19/calling-xmpp-cloud-services-from</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/01/19/calling-xmpp-cloud-services-from.html"><![CDATA[<p>SMILES (<em>CCC</em>) in, mass out. Yes, we can now call XMPP/IO-DATA cloud services with Taverna2 :)</p>

<p><img src="/assets/images/t1.png" alt="" /></p>

<p><img src="/assets/images/t2.png" alt="" /></p>

<p>Details will follow, but here’s the <a href="http://github.com/egonw/xws-taverna/tree/master">source code</a>.</p>]]></content><author><name>Egon Willighagen</name></author><category term="taverna" /><category term="xmpp" /><summary type="html"><![CDATA[SMILES (CCC) in, mass out. Yes, we can now call XMPP/IO-DATA cloud services with Taverna2 :)]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/t1.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/t1.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Metabolomics workflows in Taverna</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/11/26/metabolomics-workflows-in-taverna.html" rel="alternate" type="text/html" title="Metabolomics workflows in Taverna" /><published>2007-11-26T00:00:00+00:00</published><updated>2007-11-26T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/11/26/metabolomics-workflows-in-taverna</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/11/26/metabolomics-workflows-in-taverna.html"><![CDATA[<p>My current jobs description is to speed up metabolomics data analysis, and finally got around to making a first
relevant workflow for <a href="http://taverna.sf.net/">Taverna</a>, using the
<a href="http://www.chemspider.com/blog/?p=260">webservices just posted over at ChemSpider</a>:</p>

<p><img src="/assets/images/chemspiderWorkflow.png" alt="" /></p>

<p>I uploaded the <a href="http://myexperiment.org/workflows/97">source to MyExperiment</a>, so anyone can play with it.
There is much to improve, such as using <a href="http://cdk-taverna.de/">CDK-Taverna</a> for further analysis of the results.</p>

<p>I am not sure if opening the workflow in your Taverna installation will automatically set up the WSDL scavenger
for the <a href="http://www.chemspider.com/MassSpecAPI.asmx">ChemSpider services</a>, which are available in an HTTP version too,
btw. If not, right click on the <em>Available Processors</em> folder, pick <em>Add new WSDL scavenger</em>…, and point it to the
URL <em>http://www.chemspider.com/MassSpecAPI.asmx?WSDL</em>. The result should look like:</p>

<p><img src="/assets/images/chemspiderWorkflow1.png" alt="" /></p>

<p>Oh, and please note this comment:</p>

<blockquote>
  <p>These services are offered free of charge to our users during this period of testing, validation and feedback. Some of
these services will be made available commercially in the future and we are proactively informing you of our intention to
do this. It is likely that these services will remain available to academia at no charge. Please contact us at
feedbackATchemspiderDOTcom with feedback and questions.</p>
</blockquote>

<p>So, I do not know when my workflow will stop working.</p>]]></content><author><name>Egon Willighagen</name></author><category term="taverna" /><category term="chemspider" /><category term="metabolomics" /><summary type="html"><![CDATA[My current jobs description is to speed up metabolomics data analysis, and finally got around to making a first relevant workflow for Taverna, using the webservices just posted over at ChemSpider:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/chemspiderWorkflow.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/chemspiderWorkflow.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Taverna Workshop, Day 1 Update</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/10/08/taverna-workshop-day-1-update.html" rel="alternate" type="text/html" title="Taverna Workshop, Day 1 Update" /><published>2007-10-08T00:10:00+00:00</published><updated>2007-10-08T00:10:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/10/08/taverna-workshop-day-1-update</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/10/08/taverna-workshop-day-1-update.html"><![CDATA[<p>The second part of the morning session featured a presentation by Sirisha Gollapudi which spoke about mining
biological graphs, such as protein-protein interaction networks and metabolic pathways. She uses Taverna to detect
patterns such as nodes with only one edge, cycles, etc. An example data set she worked on is the Palsson human
metabolism (doi:<a href="https://doi.org/10.1073/pnas.0610772104">10.1073/pnas.0610772104</a>); she mentioned that this
metabolite data set contains <a href="http://en.wikipedia.org/wiki/Cocaine">cocaine</a> :) Neil Chue Hong finished with
an introduction to <a href="http://www.omii.ac.uk/">OMII-UK</a>, which is co-host of this meeting.</p>

<p>After lunch Mark Wilkinson introduced <a href="http://biomoby.org/">BioMoby</a>, which we actually use in Wageningen already.
I have tried to use <a href="http://biomoby.open-bio.org/CVS_CONTENT/moby-live/Java/docs/">jMoby</a> to set up services
based on the <a href="http://cdk.sf.net/">CDK</a>, but have failed so far. I will talk with Mark about that. Next was my presentation,
and I spoke about <a href="http://www.cdk-taverna.de/">CDK-Taverna</a>, <a href="http://www.bioclipse.net/">Bioclipse</a> and some
peculiarities of chemoinformatics workflows, like the importance of intermediate interaction, the need to
visualize the data, and complex, information-rich data. Bioclipse is seeing
<a href="http://wiki.bioclipse.net/index.php?title=Bioclipse2">an integration of BioMoby and of Taverna</a>.</p>

<p>After the coffee break Marco Roos spoke about <a href="http://myexperiment.org/">myExperiment</a> and his work on text
mining. I unfortunately missed this presentation, as I was meeting with people from the EBI who work on the
<a href="http://www.ebi.ac.uk/thornton-srv/databases/MACiE/">MACiE database</a> (see
<a href="https://chem-bla-ics.linkedchemistry.info/2006/02/17/chemical-reactions-in-cml.html">this blog item <i class="fa-solid fa-recycle fa-xs"></i></a>).</p>

<p>A discussion session afterwards covered a few more Taverna uses, and the technical problems people encountered.
Taverna2 is actually going to be quite interesting, with a data caching system between work processors, and a
powerful scheme of annotation of processors, which will allow rating, finding local services, etc. More on
that tomorrow. Dinner time now :)</p>]]></content><author><name>Egon Willighagen</name></author><category term="taverna" /><category term="ebi" /><category term="cdk" /><category term="bioclipse" /><category term="myexperiment" /><category term="justdoi:10.1073/pnas.0610772104" /><category term="inchikey:ZPUCINDJVBIVPJ-LJISPDSOSA-N" /><summary type="html"><![CDATA[The second part of the morning session featured a presentation by Sirisha Gollapudi which spoke about mining biological graphs, such as protein-protein interaction networks and metabolic pathways. Patterns detection for nodes with only one edge, and cycles etc, using Taverna. An example data she worked on is the Palsson human metabolism (doi:10.1073/pnas.0610772104); she mentioned that this metabolite data set contains cocaine :) Neil Chue Hong finished with an introduction on the OMII-UK which is co-host of this meeting.]]></summary></entry><entry><title type="html">Taverna Workshop, Hinxton, UK</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/10/08/taverna-workshop-hinxton-uk.html" rel="alternate" type="text/html" title="Taverna Workshop, Hinxton, UK" /><published>2007-10-08T00:00:00+00:00</published><updated>2007-10-08T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/10/08/taverna-workshop-hinxton-uk</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/10/08/taverna-workshop-hinxton-uk.html"><![CDATA[<p>I arrived at the <a href="http://www.ebi.ac.uk/">EBI</a> last night for the <a href="http://taverna.sf.net/">Taverna</a> workshop, during which the design
of Taverna2 is presented and workflow examples are discussed. Several ‘colleagues’ from Wageningen and the SARA computing center
in Amsterdam are present, along with many other interesting people. This afternoon is my presentation.</p>

<p>Paul Fisher just presented his PhD work on using workflows to improve the throughput of QTL matching against pathway information and
phenotype. One interesting note was the role of workflows in making biological information studies more reproducible. He retrieves the
versions of online databases explicitly in the workflow, so that it gets stored in workflow output.</p>]]></content><author><name>Egon Willighagen</name></author><category term="taverna" /><category term="java" /><summary type="html"><![CDATA[I arrived at the EBI last night for the Taverna workshop, during which the design of Taverna2 is presented and workflow examples are discussed. Several ‘colleagues’ from Wageningen and the SARA computing center in Amsterdam are present, along with many other interesting people. This afternoon is my presentation.]]></summary></entry><entry><title type="html">CompLife2007, Utrecht/NL; Taverna, EBI/Hinxton/UK</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/09/30/complife2007-utrechtnl-taverna.html" rel="alternate" type="text/html" title="CompLife2007, Utrecht/NL; Taverna, EBI/Hinxton/UK" /><published>2007-09-30T00:00:00+00:00</published><updated>2007-09-30T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/09/30/complife2007-utrechtnl-taverna</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/09/30/complife2007-utrechtnl-taverna.html"><![CDATA[<p>Two working days left before I’m off to two conferences. First, next Thursday/Friday, the two day <a href="http://www.inf.uni-konstanz.de/complife07/">CompLife2007</a>
in Utrecht/NL, with sessions on genomics, systems biology, medical information and data analysis. And, on the second day tutorials on
<a href="http://knime.org/">KNIME</a> and <a href="http://cdk.sf.net/">CDK</a>/<a href="http://www.bioclipse.net/">Bioclipse</a>. I will try to orient as much as possible around
MS-based metabolomics, and metabolite identity in particular. <a href="https://chem-bla-ics.linkedchemistry.info/2006/09/28/complife06-day-1.html">Last year <i class="fa-solid fa-recycle fa-xs"></i></a>
the conference was very interesting.</p>

<p>The Monday/Tuesday after that, I will present the CDK-<a href="http://taverna.sourceforge.net/">Taverna</a> integration I worked on in 2005 (see e.g.
<a href="https://chem-bla-ics.linkedchemistry.info/2006/05/18/taverna-runs-with-classpath-091.html">Taverna on Classpath <i class="fa-solid fa-recycle fa-xs"></i></a> and
<a href="https://chem-bla-ics.linkedchemistry.info/2005/10/18/cdk-taverna-fully-recognized.html">CDK-Taverna fully recognized <i class="fa-solid fa-recycle fa-xs"></i></a>) at the
<a href="http://taverna.sourceforge.net/index.php?doc=workshop.html">Taverna meeting</a>, before Thomas continued on that,
leading to the <a href="http://cdk-taverna.de/">cdk-taverna.de</a> plugin website. If time permits, I will prepare an example
workflow from metabolomics. Unlike previous times I went to Cambridgeshire, I won’t fly into Stansted, but take the
<a href="http://www.eurostar.com/">EuroStar</a> instead. I am very much looking forward to that. Unfortunately, I will not have time
to visit Cambridge itself, this time :(</p>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="taverna" /><category term="knime" /><category term="bioclipse" /><summary type="html"><![CDATA[Two working days left before I’m off to two conferences. First, next Thursday/Friday, the two day CompLife2007 in Utrecht/NL, with sessions on genomics, systems biology, medical information and data analysis. And, on the second day tutorials on KNIME and CDK/Bioclipse. I will try to orient as much as possible around MS-based metabolomics, and metabolite identity in particular. Last year the conference was very interesting.]]></summary></entry><entry><title type="html">CDK Workshop - Day #2</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/01/30/cdk-workshop-day-2.html" rel="alternate" type="text/html" title="CDK Workshop - Day #2" /><published>2007-01-30T00:00:00+00:00</published><updated>2007-01-30T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/01/30/cdk-workshop-day-2</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/01/30/cdk-workshop-day-2.html"><![CDATA[<p>Because of other obligations, I was unable to attend the first day of the <a href="http://wiki.cubic.uni-koeln.de/cdkwiki/doku.php?id=spring2007workshop">CDK Workshop</a>,
though Christoph had set up Skype so that at least I could hear the talks from <a href="http://www.inf.uni-konstanz.de/bioml/staff/berthold/">Prof. Berthold</a>
(Konstanz, Germany) about <a href="http://www.knime.org/">KNIME</a> and <a href="http://almost.cubic.uni-koeln.de/cosi/curriculumVitae_zielesny.htm">Prof. Zielesny</a>
about <a href="http://cdk-taverna.de/">CDK-Taverna</a>.</p>

<p>Today, Miguel Rojas and Stefan Kuhn discussed their research. Miguel showed the state of mass spectrum prediction using the <a href="http://cdk.sf.net/">CDK</a>
and the MEDEA plugin for <a href="http://www.bioclipse.net/">Bioclipse</a>. Stefan demonstrated the <a href="http://www.nmrshiftdb.org/">NMRShiftDB</a>
and a new lab system for NMR experiment scheduling and management based on that. <a href="http://www2.cmbi.ru.nl/who-and-where/staff/27/">Dr. Ott</a>
(Nijmegen, Netherlands) showed the <a href="http://biometa.cmbi.ru.nl/">BioMeta Database</a> which contains metabolite and reaction information derived from the
<a href="http://www.genome.jp/kegg/ligand.html">KEGG</a>, but which fixes a set of chemical problems in the latter (see also the article,
DOI:<a href="https://doi.org/10.1186/1471-2105-7-517">10.1186/1471-2105-7-517</a>).</p>

<p>The afternoons of CDK workshops traditionally have discussion sessions and hackathons. Two groups were formed: one consisted of the KNIME guys who,
together with Miguel and Federico, focused on QSAR descriptor calculations in KNIME, while Stefan, Martin and I looked at the fingerprinter
peculiarities that Martin found (see also this <a href="http://almost.cubic.uni-koeln.de/cdk/cdk_top/cdk_news/archive/cdknews2.2.article22.pdf">CDK News article</a>),
and came up with a possible further performance improvement of the AllRingsFinder. Because one class of molecules that is causing trouble consists of two
ring systems connected by a long linker, like Choloyl-CoA (below), we anticipate that splitting the molecule up into ring systems prior to using the
SSSR algorithm should speed up the complete all-ring finding process.</p>

<p><img src="/assets/images/choloyl-coa.png" alt="" /></p>

<p>Currently, the spanning tree is calculated before deciding on using the SSSR finder. We think this spanning tree can be used to partition the molecule
into separate ring systems, so that the further steps of the ring search can then be applied to each of them.</p>
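
<p>The partitioning idea can be sketched in a few lines of Python. This is not the CDK code and it does not use the
spanning tree directly: as a stand-in technique it finds the bridge bonds (bonds whose removal disconnects the graph,
so they cannot be part of any ring) and drops them, which leaves one connected piece per ring system. The function name
<code>ring_systems</code> and the plain list-of-pairs input are assumptions for this example.</p>

```python
from collections import defaultdict


def ring_systems(bonds):
    """Partition a molecular graph into ring systems.

    `bonds` is a list of (atom, atom) pairs. Bridge bonds cannot be part
    of a ring, so dropping them leaves one connected piece per ring system.
    """
    neighbours = defaultdict(list)
    for index, (a, b) in enumerate(bonds):
        neighbours[a].append((b, index))
        neighbours[b].append((a, index))

    order, low, bridges = {}, {}, set()

    def visit(atom, via):
        # Depth-first search; low[atom] is the earliest atom reachable
        # from atom's subtree using at most one back edge.
        order[atom] = low[atom] = len(order)
        for other, index in neighbours[atom]:
            if index == via:
                continue
            if other in order:
                low[atom] = min(low[atom], order[other])
            else:
                visit(other, index)
                low[atom] = min(low[atom], low[other])
                if low[other] > order[atom]:
                    bridges.add(index)

    for atom in list(neighbours):
        if atom not in order:
            visit(atom, None)

    # Collect connected components using ring bonds only.
    ring = defaultdict(set)
    for index, (a, b) in enumerate(bonds):
        if index not in bridges:
            ring[a].add(b)
            ring[b].add(a)
    systems, seen = [], set()
    for start in ring:
        if start in seen:
            continue
        stack, system = [start], set()
        while stack:
            atom = stack.pop()
            if atom not in system:
                system.add(atom)
                stack.extend(ring[atom] - system)
        seen |= system
        systems.append(system)
    return systems


if __name__ == "__main__":
    # Two three-membered rings joined by a linker through atom 4.
    bonds = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6), (6, 7), (7, 5)]
    print(ring_systems(bonds))
```

<p>Each returned atom set could then be handed to the ring search separately, so the long linker never enters the
expensive part of the calculation. The recursive search is fine for a sketch, but very large molecules would need an
iterative version.</p>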

<p>After dinner (pasta/pizza), during the Spanish-German handball game, we continued the hacking and discussions, now focusing as a whole group
on QSAR descriptors in KNIME. We looked at each descriptor and decided if it should go into a QSAR calculator node, or even in a node of its own.</p>

<h2 id="bugs-found">Bugs found</h2>
<p>I won’t close this blog entry without giving a list of problems we found in the current CDK; some minor and small, some more troublesome.
Here goes: typos all over the place; the OrderQueryBond lacks a return statement in an else clause; the Mol2Reader does not mark atom and
bond aromaticity properly and reads a single bond as aromatic, and an aromatic bond as single; the Renderer2D does not always highlight
both atoms when hovering over a bond; SmilesGenerator.parseBond() should output bond orders correctly; the SSSR finder seems to have a
messed up if-else statement for the ringBondCount limit of 37; the BondCount descriptor should count all bonds by default, not just the
single bonds; <code class="language-plaintext highlighter-rouge">IDescriptor.getParameters()</code> should return null instead of <code class="language-plaintext highlighter-rouge">Object[0]</code>; several programs use the SYBYL atomtype S.o2, while
the specification and the CDK config defines S.O2; the IP descriptor now returns a variable length descriptor.</p>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="kegg" /><category term="knime" /><category term="smiles" /><category term="taverna" /><category term="justdoi:10.1186/1471-2105-7-517" /><category term="inchikey:ZKWNOTQHFKYUNU-JGCIYWTLSA-N" /><category term="nmrshiftdb" /><summary type="html"><![CDATA[Because of other obligations, I was unable to attend the first day of the CDK Workshop, though Christoph had set up Skype so that at least I could hear the talks from Prof. Berthold (Konstanz, Germany) about KNIME and Prof. Zielesny about CDK-Taverna.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/choloyl-coa.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/choloyl-coa.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">German Conference on Chemoinformatics 2006: Day 1 and 2</title><link href="https://chem-bla-ics.linkedchemistry.info/2006/11/13/german-conference-on-chemoinformatics.html" rel="alternate" type="text/html" title="German Conference on Chemoinformatics 2006: Day 1 and 2" /><published>2006-11-13T00:00:00+00:00</published><updated>2006-11-13T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2006/11/13/german-conference-on-chemoinformatics</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2006/11/13/german-conference-on-chemoinformatics.html"><![CDATA[<p>The <a href="http://scholle.oc.uni-kiel.de/users/cic/tagungen/workshop06/index.html">2nd German Conference on Chemoinformatics</a>
started yesterday, with two chemoinformatics tutorials: one on industrial chemoinformatics (I saw this presentation
before… not sure when), with a good overview on integrating different information sources; the second one was about
opensource chemoinformatics by <a href="http://wiki.cubic.uni-koeln.de/blog/index.php">Christoph Steinbeck</a> (being involved
in opensource chemoinformatics for almost 10 years now!), which included a <a href="http://www.bioclipse.net/">Bioclipse</a>
demo (by me) and a demo by Thomas Kuhn on the <a href="http://cdk.sf.net/">CDK</a>-based chemoinformatics plugin to
<a href="http://taverna.sf.net/">Taverna</a>. Other opensource projects of the <a href="http://www.blueobelisk.org/">Blue Obelisk</a>
movement were mentioned and a few outside it too.</p>

<p>The conference is in honor of the life work by <a href="http://www2.chemie.uni-erlangen.de/">Prof. Gasteiger</a>, who gave an
overview of chemoinformatics in his group, Germany and Europe. He stressed the need for education in chemoinformatics,
like in <a href="http://wiki.cubic.uni-koeln.de/blog/pivot/entry.php?id=12">Obernai</a>. He also highlighted that we, today,
are still solving the same problem as 30 years ago. Which is true, which is why this channel is called
<a href="https://chem-bla-ics.linkedchemistry.info/">Chem-bla-ics <i class="fa-solid fa-recycle fa-xs"></i></a>, trying to solve that problem. When asked if opensource chemoinformatics
from the start would have addressed this, he replied that he requires people to cooperatively do research with his
group; opensource clearly cannot enforce that.</p>

<h1 id="day-2">Day 2</h1>

<p>Today's program had a number of interesting presentations (I, unfortunately, missed the first presentation, so I
have to visit that group soon to make up for that). <a href="http://www.dq.fct.unl.pt/staff/jas/introduction.htm">Prof. Aires-de-Sousa</a>
showed his work on MOLMAP for mapping metabolic networks (<a href="http://www.genome.jp/kegg/">KEGG</a> really, see my
<a href="https://chem-bla-ics.linkedchemistry.info/2006/04/04/mining-kegg-pathway-database-with-self.html">earlier blog <i class="fa-solid fa-recycle fa-xs"></i></a>), and showed,
just as proof of principle, classification of organisms based on this.</p>

<p>J. Weisser talked about docking, still an obligatory topic. This work really showed two new approaches: the use
of QM partial charges (the example showed an improvement in RMSD by a factor of 10, not very statistical, but
promising indeed); the second was the fact that water does not like to be in tight spots, because of reduced
possibilities for hydrogen bonding. This concept is common in understanding supramolecular phenomena, but I have not
seen it applied to docking before. But I am no expert in that field. M. Wagner showed work on using KEGG
data to estimate likely metabolites, and the use in reducing effects of metabolic degradation. T. Schroeter
introduced me to <a href="http://www.gaussianprocess.org/">Gaussian processes</a>, a new data modeling method. Quite
embarrassing to be introduced to these, being specialized in modeling methods for chemical problems.</p>

<p>The poster session was, as usual, really exhausting, talking to a lot of people. Having a booth at the exhibition
on opensource chemoinformatics added a nice twist to this. I therefore skipped the FIZ-award winner lectures, so I
hope someone else will blog about those.</p>

<p>One last note: <a href="http://www.sun.com/software/opensource/java/">Sun started releasing their Java platform under the GPL license</a>.
<a href="http://wwmm.ch.cam.ac.uk/blogs/downing/">Jim</a>, it seems that they <a href="https://chem-bla-ics.linkedchemistry.info/2006/10/25/being-good-opensource-user.html">proved me wrong <i class="fa-solid fa-recycle fa-xs"></i></a>.
The class library is still not GPL, but is expected to become licensed such somewhere in the first half of next year.</p>]]></content><author><name>Egon Willighagen</name></author><category term="cheminf" /><category term="conference" /><category term="openscience" /><category term="bioclipse" /><category term="cdk" /><category term="taverna" /><category term="java" /><summary type="html"><![CDATA[The 2nd German Conference on Chemoinformatics started yesterday, with two chemoinformatics tutorials: one on industrial chemoinformatics (I saw this presentation before… not sure when), with a good overview on integrating different information sources; the second one was about opensource chemoinformatics by Christoph Steinbeck (being involved in opensource chemoinformatics for almost 10 years now!), which included a Bioclipse demo (by me) and a demo by Thomas Kuhn on the CDK based chemoinformatics plugin to Taverna. Other opensource projects of the Blue Obelisk movement were mentioned and a few outside it too.]]></summary></entry></feed>