<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/defense.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-06-15T12:00:19+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/defense.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">The CDK/Metabolomics/Chemometrics Unconference results</title><link href="https://chem-bla-ics.linkedchemistry.info/2008/04/07/cdkmetabolomicschemometrics.html" rel="alternate" type="text/html" title="The CDK/Metabolomics/Chemometrics Unconference results" /><published>2008-04-07T00:10:00+00:00</published><updated>2008-04-07T00:10:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2008/04/07/cdkmetabolomicschemometrics</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2008/04/07/cdkmetabolomicschemometrics.html"><![CDATA[<p>As <a href="https://chem-bla-ics.linkedchemistry.info/2008/04/03/t-plus-18-hours-dr-and-preparing-for.html">announced earlier <i class="fa-solid fa-recycle fa-xs"></i></a>, Miguel, Velitchka,
<a href="http://www.steinbeck-molecular.de/steinblog/">Christoph</a> and I held a small <a href="http://cdk.sf.net/">CDK</a>/Metabolomics/Chemometrics
unconference. We started late and did not have an evening program, so we did not get all that many results. However, we did do
<em><a href="http://chem-bla-ics.blogspot.com/search?q=molecular+chemometrics">molecular chemometrics</a></em>. <!-- keep link --></p>

<p>We used the <a href="http://www.r-project.org/">R statistics software</a> together with Rajarshi’s <a href="http://cran.r-project.org/web/packages/rcdk/index.html">rcdk</a>
package (an R wrapper around the CDK library) and the <a href="http://cran.r-project.org/web/packages/pls/index.html">PLS</a>
package by Ron (my PhD supervisor; see <a href="http://www.jstatsoft.org/v18/i02/">this paper</a>), to predict retention indices for a number of metabolites.</p>

<p>We ended up with this R script:</p>

<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="s2">"rJava"</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="s2">"rcdk"</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="s2">"pls"</span><span class="p">)</span><span class="w">
</span><span class="n">mols</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">load.molecules</span><span class="p">(</span><span class="s2">"data_cdk.sdf"</span><span class="p">)</span><span class="w">
</span><span class="n">selection</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get.desc.names</span><span class="p">()</span><span class="w">
</span><span class="n">selection</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">selection</span><span class="p">[</span><span class="o">-</span><span class="n">which</span><span class="p">(</span><span class="n">selection</span><span class="o">==</span><span class="s2">"org.openscience.cdk.qsar.descriptors.molecular.AminoAcidCountDescriptor"</span><span class="p">)]</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">eval.desc</span><span class="p">(</span><span class="n">mols</span><span class="p">,</span><span class="w"> </span><span class="n">selection</span><span class="p">,</span><span class="w"> </span><span class="n">verbose</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span><span class="n">x2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">[,</span><span class="n">apply</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">a</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="nf">all</span><span class="p">(</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">a</span><span class="p">))})]</span><span class="w">
</span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">read.table</span><span class="p">(</span><span class="s2">"data_cdk_RI"</span><span class="p">)</span><span class="w">
</span><span class="n">input</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">)</span><span class="w">
</span><span class="n">pls.model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">plsr</span><span class="p">(</span><span class="n">V1</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">.</span><span class="p">,</span><span class="w"> </span><span class="m">50</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="o">=</span><span class="n">input</span><span class="p">,</span><span class="w"> </span><span class="n">validation</span><span class="o">=</span><span class="s2">"CV"</span><span class="p">)</span><span class="w">
</span><span class="n">summary</span><span class="p">(</span><span class="n">pls.model</span><span class="p">)</span><span class="w">
</span><span class="n">plot</span><span class="p">(</span><span class="n">RMSEP</span><span class="p">(</span><span class="n">pls.model</span><span class="p">))</span><span class="w">
</span><span class="n">plot</span><span class="p">(</span><span class="n">pls.model</span><span class="p">,</span><span class="w"> </span><span class="n">ncomp</span><span class="o">=</span><span class="m">20</span><span class="p">)</span><span class="w">
</span><span class="n">abline</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="n">col</span><span class="o">=</span><span class="s2">"red"</span><span class="p">)</span><span class="w">
</span><span class="n">plot</span><span class="p">(</span><span class="n">pls.model</span><span class="p">,</span><span class="w"> </span><span class="s2">"loadings"</span><span class="p">,</span><span class="w"> </span><span class="n">comps</span><span class="o">=</span><span class="m">1</span><span class="o">:</span><span class="m">2</span><span class="p">)</span><span class="w">
</span><span class="n">savehistory</span><span class="p">(</span><span class="s2">"finalHistory.R"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
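
<p>The two clean-up lines in the script (dropping one descriptor by name, then keeping only NA-free columns) can be tried
out without rcdk. The following is only a minimal sketch on a made-up table: the column names <code class="language-plaintext highlighter-rouge">desc.a</code>,
<code class="language-plaintext highlighter-rouge">desc.b</code> and <code class="language-plaintext highlighter-rouge">desc.c</code> and all values are hypothetical, and <code class="language-plaintext highlighter-rouge">setdiff()</code>
is a swapped-in alternative to the <code class="language-plaintext highlighter-rouge">-which()</code> idiom, not what we ran:</p>

<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy stand-in for the descriptor table returned by eval.desc(); values are made up.
x = data.frame(
  desc.a = c(1.2, 3.4, 5.6),
  desc.b = c(NA, 2.0, 1.0),   # pretend one calculation failed here
  desc.c = c(0.1, 0.2, 0.3)
)

# setdiff() leaves the vector untouched when the name is absent, whereas
# selection[-which(selection == name)] returns an empty vector in that case.
selection = c("desc.a", "desc.b", "desc.c")
selection = setdiff(selection, "desc.missing")

# Keep only the columns without any NA, like the apply() call in the script.
keep = colSums(is.na(x)) == 0
x2 = x[, keep, drop = FALSE]

print(selection)
print(colnames(x2))
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">drop = FALSE</code> keeps <code class="language-plaintext highlighter-rouge">x2</code> a data frame even if only one column survives.</p>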

<p>The <code class="language-plaintext highlighter-rouge">AminoAcidCountDescriptor</code> threw us a <code class="language-plaintext highlighter-rouge">NullPointerException</code>, which is why the script excludes it, and there were a few NAs in the resulting matrix, so we kept only the columns without them. The cross-validation (CV) results were
not as good as Velitchka’s best models, but still a good start:</p>

<p><img src="/assets/images/riPred.png" alt="" /></p>

<p>No variable selection; 200 objects, 190 variables.</p>
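
<p>The script fits 50 latent variables and then plots the predictions for 20 of them. One way to pick that number from the
cross-validation curve rather than by eye is sketched below. This is an assumption-laden example, not what we did: it uses the
<code class="language-plaintext highlighter-rouge">gasoline</code> data set that ships with the pls package as a stand-in for our retention index data,
and it simply takes the minimum of the CV error:</p>

<div class="language-R highlighter-rouge"><div class="highlight"><pre class="highlight"><code>library("pls")

# Stand-in data: octane numbers and NIR spectra from the pls package.
data(gasoline)

model = plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "CV")

# RMSEP() returns an array indexed by estimate, response and model size;
# the first model is the intercept-only one (0 components), hence the -1.
cv = RMSEP(model, estimate = "CV")$val[1, 1, ]
best = max(1, unname(which.min(cv)) - 1)

print(best)
plot(model, ncomp = best)
abline(0, 1, col = "red")
</code></pre></div></div>

<p>The CV segments are random, so the chosen number of components can differ between runs; a flat minimum in the curve usually
argues for the smaller model.</p>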

<p>Questions:</p>

<ul>
  <li>Can we do this in <a href="http://www.bioclipse.net/">Bioclipse2</a> too?</li>
  <li>Can we improve the default CDK descriptor parameters to maximize the column count?</li>
  <li>Rajarshi, what would be involved in writing some wrapper code for atomic descriptors for rcdk?</li>
</ul>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="defense" /><category term="phd" /><category term="metabolomics" /><category term="cheminf" /><category term="chemometrics" /><category term="justdoi:10.18637/jss.v018.i02" /><summary type="html"><![CDATA[As announced earlier, Miguel, Velitchka, Christoph and I held a small CDK/Metabolomics/Chemometrics unconference. We started late and did not have an evening program, so we did not get all that many results. However, we did do molecular chemometrics.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/riPred.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/riPred.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">T plus 51 hours: a short photo impression</title><link href="https://chem-bla-ics.linkedchemistry.info/2008/04/04/t-plus-51-hours-short-photo-impression.html" rel="alternate" type="text/html" title="T plus 51 hours: a short photo impression" /><published>2008-04-04T00:00:00+00:00</published><updated>2008-04-04T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2008/04/04/t-plus-51-hours-short-photo-impression</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2008/04/04/t-plus-51-hours-short-photo-impression.html"><![CDATA[<p>I normally do not do these kinds of blog items, but, in reply to <a href="http://www.steinbeck-molecular.de/steinblog/index.php/2008/04/03/congratulations-egon/#comment-327">Christoph’s blog</a>,
here’s an overview of the ceremony (see also <a href="https://chem-bla-ics.linkedchemistry.info/2008/04/01/t-minus-26-hours-defending-open-source.html">T-26 <i class="fa-solid fa-recycle fa-xs"></i></a> and
<a href="https://chem-bla-ics.linkedchemistry.info/2008/04/03/t-plus-18-hours-dr-and-preparing-for.html">T+18 <i class="fa-solid fa-recycle fa-xs"></i></a>):</p>

<p><img src="/assets/images/vga_E112.JPG" alt="" /></p>

<p>This is the doctorate certificate Christoph mentioned, together with Karin and our kids:</p>

<p><img src="/assets/images/vga_E179.JPG" alt="" /></p>

<p>And, <a href="http://www.oortjeshekken.nl/">here</a> (<a href="http://maps.google.com/maps?f=q&amp;hl=en&amp;geocode=&amp;q=Erlecomsedam+4,+ooij,+netherlands&amp;sll=51.857623,5.93914&amp;sspn=0.046967,0.146942&amp;ie=UTF8&amp;ll=51.864169,5.933647&amp;spn=0.01174,0.036736&amp;t=h&amp;z=15">map</a>)
was the dinner in the evening:</p>

<p><img src="/assets/images/vga_E227.JPG" alt="" /></p>]]></content><author><name>Egon Willighagen</name></author><category term="defense" /><category term="phd" /><summary type="html"><![CDATA[I normally do not do these kinds of blog items, but, in reply to Christoph’s blog, here’s an overview of the ceremony (see also T-26 and T+18 ):]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/vga_E112.JPG" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/vga_E112.JPG" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">T plus 18 hours: dr and preparing for the afterparty, umm ^w^w^w, CDK/Metabolomics/Chemometrics unconference</title><link href="https://chem-bla-ics.linkedchemistry.info/2008/04/03/t-plus-18-hours-dr-and-preparing-for.html" rel="alternate" type="text/html" title="T plus 18 hours: dr and preparing for the afterparty, umm ^w^w^w, CDK/Metabolomics/Chemometrics unconference" /><published>2008-04-03T00:00:00+00:00</published><updated>2008-04-03T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2008/04/03/t-plus-18-hours-dr-and-preparing-for</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2008/04/03/t-plus-18-hours-dr-and-preparing-for.html"><![CDATA[<p>I am doctor now; I shall now be <a href="http://taaladvies.net/taal/advies/tekst/21#6">addressed as</a> <em>weledelzeergeleerde</em> Egon;
translating to something like <em>quite-noble-very-knowledgeable</em>, hahahaha. I’ll put up a few photos of the ceremony, which
is actually quite formal at the <a href="http://www.ru.nl/">Radboud University</a>, later.</p>

<p>With this blog item, I would like to thank everyone who left a message, sent email, or otherwise wished me good luck. Very much
appreciated! I’d also like to thank my supervisors, promotores <a href="http://www.cac.science.ru.nl/people/lbuydens/index.html">Lutgarde Buydens</a> and
<a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/">Peter Murray-Rust</a> (he mentions the event <a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1019">here</a>),
and <a href="http://www.cac.science.ru.nl/people/rwehrens/index.html">Ron Wehrens</a> for their confidence in me and their guidance
on the path towards the post-doc life. I also thank all those who attended my defense; I had a brilliant day, and actually
enjoyed talking to those who sat on my promotion committee and who asked me the not-really-nasty questions about
my work.</p>

<h2 id="cdk-chemometrics-in-metabolomics-unconference">CDK-Chemometrics in Metabolomics Unconference</h2>

<p>For today, I organized a small, informal <a href="http://en.wikipedia.org/wiki/Unconference">unconference</a>, oriented around the
<a href="http://cdk.sf.net/">CDK</a>, chemometrics and metabolomics. I’m certain we will be online much of the day, as we typically
are. The meeting will start around 10:00 <a href="http://en.wikipedia.org/wiki/Central_European_Summer_Time">CEST</a>, but we’ll
attend a seminar by <a href="http://www.ki.si/index.php?id=844">Marjana Novič</a> at 11:00 CEST. If you happen to be in
<a href="http://en.wikipedia.org/wiki/Nijmegen">Nijmegen</a>, just drop in on the Analytical Chemistry department.
Otherwise, join the #cdk chat channel in the irc.freenode.net network.</p>

<p>What will we do?? Hey, it’s an unconference; we have no idea yet :)</p>]]></content><author><name>Egon Willighagen</name></author><category term="defense" /><category term="cheminf" /><category term="chemometrics" /><category term="phd" /><summary type="html"><![CDATA[I am doctor now; I shall now be addressed as weledelzeergeleerde Egon; translating to something like quite-noble-very-knowledgeable, hahahaha. I’ll put up a few photos of the ceremony, which is actually quite formal at the Radboud University, later.]]></summary></entry></feed>