<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/pubchem.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-05-17T12:12:40+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/pubchem.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">The TDCC NES Col-Lab Retreat</title><link href="https://chem-bla-ics.linkedchemistry.info/2026/02/21/the-tdcc-nes-col-lab-retreat.html" rel="alternate" type="text/html" title="The TDCC NES Col-Lab Retreat" /><published>2026-02-21T00:00:00+00:00</published><updated>2026-02-21T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2026/02/21/the-tdcc-nes-col-lab-retreat</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2026/02/21/the-tdcc-nes-col-lab-retreat.html"><![CDATA[<p>Last autumn two TDCC projects started, <em>FAIR4ChemNL</em> (<a href="https://chem-bla-ics.linkedchemistry.info/2026/02/08/open-infrastructures.html">with the PeerTube channel</a>
and doi:<a href="https://doi.org/10.61686/XVYQV45374">10.61686/XVYQV45374</a>) and <em>FAIRify for metabolomics data</em>
(doi:<a href="https://doi.org/10.61686/CSGIP04334">10.61686/CSGIP04334</a>). But I haven’t written much about either yet, nor about the role of our research group in these projects.</p>

<p>Let’s start with what the TDCC actually are: they are <a href="https://tdcc.nl/">Thematic Digital Competence Centres</a>:</p>

<blockquote>
  <p>The Thematic Digital Competence Centres (TDCCs) are network-based initiatives set up by NWO and the Dutch academic
community to broker investments into research data management projects. The three TDCCs are national and discipline
based, with one pillar each for the Social Sciences &amp; Humanities (SSH), Natural and Engineering Sciences (NES) and
Life Sciences &amp; Health (LSH). The networks will help formulate and facilitate projects designed to promote the adoption
of open data, software and research practices, alongside the development of the necessary expertise.</p>
</blockquote>

<p>So, where initiatives like <a href="https://www.go-fair.org/">GO FAIR</a> had centers of competencies (the implementation networks),
they did not have funding for them. This was a main reason why the <em>Chemistry Implementation Network</em> (ChIN,
doi:<a href="https://doi.org/10.1162/dint_a_00035">10.1162/dint_a_00035</a>) did not take off.
The TDCCs do not provide a lot of money, but enough to support disseminating expertise and promote some key ideas.</p>

<p>The idea is that combined with other efforts, it strengthens the level of FAIR in the Dutch research community.
I have to say, this is much needed, as the level of FAIR data in journal publications leaves much to be desired:
such data is still mostly absent.</p>

<p>The FAIR4ChemNL project already had a networking activity during the writing of the proposal: the workshop
back in 2024 that I <a href="https://chem-bla-ics.linkedchemistry.info/2024/06/10/two-meetings.html">blogged about earlier</a>
(see also <a href="https://doi.org/10.5281/zenodo.15050550">this report</a>).
The FAIRify project is coordinated by the group that was key in the <em>Netherlands Metabolomics Center</em> (NMC), now the
<a href="https://metabolomicscentre.nl/">BeneLux Metabolomics Center</a>. During a postdoc at the NMC during my Wageningen
days, we already did a lot of FAIR competency building with <a href="https://chem-bla-ics.linkedchemistry.info/tag/metware">the MetWare project</a>.</p>

<h2 id="the-col-lab-retreat">The Col-Lab Retreat</h2>

<p>The <a href="https://tdcc.nl/about-tddc/nes/">TDCC-NES</a> organized a networking event in August last year,
the 2025 <a href="https://nescollab.nl/">TDCC-NES Col-Lab Retreat</a>. I am late with
reporting on it, but there simply was too much project management that took priority. The meeting was in the
wonderful Dutch town Schoorl, and the location is great for collaborative meetings. I had been there a year
earlier for an Open Science Retreat and was happy to go back.</p>

<p>During the unconference-style meeting <a href="https://tdcc.nl/creating-space-for-our-community-the-story-of-our-nes-col-lab-retreat/">various topics were discussed</a>
in breakout groups, and because of the two TDCC projects, I was particularly interested in the <em>Metadata and interoperability</em>
topic. Partly because this is how we can make electronic lab notebooks (ELNs) automatically push metadata to
registries (<a href="https://www.linkedin.com/in/rory-macneil-68a80011/">Rory Macneil</a>
of <a href="https://www.researchspace.com/">RSpace/ResearchSpace</a>, which is already integrated with various open
platforms, was also in Schoorl), and partly because I wanted to continue exploring <a href="https://chem-bla-ics.linkedchemistry.info/tag/nanopub">nanopublications</a>
with <a href="https://fediscience.org/@rupdecat">Christian Meesters</a>, which could be the envelope to distribute
the metadata. For the latter, I was looking at the Java library for nanopublications
(see <a href="https://github.com/Nanopublication/nanopub-java/pull/52">this PR</a>).</p>

<p>The idea that ELNs automatically share metadata about experiments is something that is attractive.
It would require no involvement from the researcher, would be fully automatic, and drive interest
(users, peer reviewers) to experiments and experimental data. Something that is still absurdly hard
is searching for experiments that measured the melting point of some chemical. How
awesome would it be if ELNs automatically registered chemicals from the experiment in,
for example, <a href="https://pubchem.ncbi.nlm.nih.gov/">PubChem</a>?</p>

<p>We had the idea of applying for a Lorentz Workshop, but the earliest deadline came too soon. Maybe
it is time to pick up that idea again. Interoperability standards already exist, like
the aforementioned nanopubs, but also <a href="https://www.researchobject.org/ro-crate/">RO-Crates</a> that are also studied by Jente Houweling
in the VHP4Safety project (see <a href="https://platform.vhp4safety.nl/data">this Data tab</a> for a preview).</p>]]></content><author><name>Egon Willighagen</name></author><category term="fair" /><category term="doi:10.1162/DINT_A_00035" /><category term="chemistry" /><category term="metabolomics" /><category term="fair4chemnl" /><category term="fairify" /><category term="cito:citesAsEvidence:10.5281/ZENODO.15050550" /><category term="nanopub" /><category term="crate" /><category term="pubchem" /><summary type="html"><![CDATA[Last autumn two TDCC projects started, FAIR4ChemNL (with the PeerTube channel and doi:10.61686/XVYQV45374) and FAIRify for metabolomics data (doi:10.61686/CSGIP04334). But I haven’t written much on either yet and what the role is our research group in these projects.]]></summary></entry><entry><title type="html">Compound (class) identifiers in Wikidata</title><link href="https://chem-bla-ics.linkedchemistry.info/2018/08/18/compound-class-identifiers-in-wikidata.html" rel="alternate" type="text/html" title="Compound (class) identifiers in Wikidata" /><published>2018-08-18T00:00:00+00:00</published><updated>2018-08-18T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2018/08/18/compound-class-identifiers-in-wikidata</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2018/08/18/compound-class-identifiers-in-wikidata.html"><![CDATA[<p><span style="width: 40%; display: block; margin-left: auto; margin-right: auto; float: right">
<img src="/assets/images/extid-wikidata-histogram.png" /> <br />
<a href="https://edu.nl/h6kg3">Bar chart</a> showing the number of compounds with a particular chemical identifier.
</span>
I think <a href="http://wikidata.org/">Wikidata</a> is a groundbreaking project, which will have a major impact on science. One of the
reasons is the open license (CCZero), the very basic approach (<a href="http://wikiba.se/">Wikibase</a>), and the superb community around
it. For example, setting up your own Wikibase including a cool SPARQL endpoint, is
<a href="https://github.com/wmde/wikibase-docker">easily done with Docker</a>.</p>

<p>Wikidata has many sub projects, such as <a href="http://wikicite.org/">WikiCite</a>, which captures the collective of primary literature.
Another one is the <a href="https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry">WikiProject Chemistry</a>. The two nicely match
up, I think, making a public database linking chemicals to literature (though much still needs to be done here), see my recent
ICCS 2018 poster (doi:<a href="https://doi.org/10.6084/m9.figshare.6356027.v1">10.6084/m9.figshare.6356027.v1</a>, paper pending).</p>

<p>But Wikidata is also a great resource for identifier mappings between chemical databases, something we need for
<a href="https://chem-bla-ics.blogspot.com/2017/11/new-paper-wikipathways-multifaceted.html">our metabolism pathway research</a>.
The mappings, as you may know, are <a href="https://chem-bla-ics.blogspot.com/2016/09/metabolite-identifier-mapping-databases.html">used in the latter</a>
via <a href="https://www.bridgedb.org/">BridgeDb</a> and we have been using Wikidata as one of three sources for some time now (the others being
<a href="http://www.hmdb.ca/">HMDB</a> and <a href="https://www.ebi.ac.uk/chebi/">ChEBI</a>). WikiProject Chemistry has a related
<a href="https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry/ChemID">ChemID</a> effort, and while the wiki page does not show
much recent activity, there is actually a lot of ongoing effort (see <a href="https://edu.nl/h6kg3">plot</a>).
And I’ve been <a href="https://chem-bla-ics.blogspot.com/2018/07/lipid-map-identifiers-and.html">adding my bits</a>.</p>

<h2 id="limitations-of-the-links">Limitations of the links</h2>
<p>But not every identifier in Wikidata has the same meaning. While they are all classified as ‘external-id’, the actual link may
have different meaning. This, of course, is the essence of scientific lenses, see <a href="https://chem-bla-ics.blogspot.com/2013/05/linking-wikipathways-to-binding.html">this post</a>
and the papers cited therein. One reason here is the difference in what entries in the various databases mean.</p>

<p>Wikidata has an extensive model, defined by the aforementioned WikiProject Chemistry. For example, it has different concepts
for chemical compounds (in fact, the hierarchy is pretty rich) and compound classes. And these are modeled differently. Furthermore,
it has a model that formalizes that things with a different InChI are different, but even allows things with the same InChI to be
different, if need arises. It tries to accurately and precisely capture the certainty and uncertainty of the chemistry. As such,
it is a powerful system to handle identifier mappings, because databases are not clear, and the chemistry and biology in the data are
even less so: we experimentally measure a characterization of chemicals, but what we put in databases and give names to are specific
models (often chemical graphs).</p>

<p>That model differs from what other (chemical) databases use, or seem to use, because databases do not always indicate what they
actually have in a record. But I think this is a fair guess.</p>

<h2 id="chebi">ChEBI</h2>
<p>ChEBI (and the matching <a href="https://www.wikidata.org/wiki/Property:P683">ChEBI ID</a>) has entries for chemical classes (e.g.
<a href="https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:35366">fatty acid</a>) and specific compounds (e.g.
<a href="https://www.ebi.ac.uk/chebi/searchId.do?chebiId=30089">acetate</a>).</p>

<h2 id="pubchem-chemspider-unichem">PubChem, ChemSpider, UniChem</h2>
<p>These three resources use the InChI as a central asset. While they do not really have the concept of compound classes so much
(though increasingly they have classifications), they do have entries where stereochemistry is undefined or unknown. Each
one has its own way to link to other databases, which normally includes tons of structure normalization (see
e.g. doi:<a href="https://doi.org/10.1186/s13321-018-0293-8">10.1186/s13321-018-0293-8</a> and
doi:<a href="https://doi.org/10.1186/s13321-015-0072-8">10.1186/s13321-015-0072-8</a>).</p>

<h2 id="hmdb">HMDB</h2>
<p>HMDB (and the matching <a href="https://www.wikidata.org/wiki/Property:P2057">P2057</a>) has a biological perspective; the entries
reflect the biology of a chemical. Therefore, for most compounds, they focus on the neutral forms of compounds. This makes
linking to/from other databases where the compound is not neutral chemically less precise.</p>

<h2 id="cas-registry-numbers">CAS registry numbers</h2>
<p>CAS (and the matching <a href="https://www.wikidata.org/wiki/Property:P231">P231</a>) is pretty unique itself, and has identifiers
for substances (see <a href="https://www.wikidata.org/wiki/Q79529">Q79529</a>), much more than chemical compounds, and comes with its
own set of unique features. For example, solutions of some compound, by design, have the same identifier. Previously,
formaldehyde and formalin had different Wikipedia/Wikidata pages, both with the same CAS registry number.</p>

<h2 id="limitations-of-the-links-2">Limitations of the links #2</h2>
<p>Now, returning to our starting point: limitations in linking databases. If we want FAIR mappings, we need to be as precise
as possible. Of course, that may mean we need more steps. We can always simplify at will, but we can never have a
computer make the links more complex (well, not without making assumptions, etc).</p>
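
<p>To make this a bit more concrete, here is a minimal sketch in plain JavaScript of a SPARQL query that lists Wikidata
items carrying two of the identifiers discussed above, the ChEBI ID (P683) and the HMDB ID (P2057). The function name
<code>buildMappingQuery</code> is made up for this example, and including P31 (instance of) is just one assumed way
to see what kind of thing each item is. The query relies on the <code>wdt:</code> prefix that the Wikidata query service predefines.</p>

```javascript
// Hypothetical helper (not part of any library): builds a SPARQL query that
// lists Wikidata items having both external identifiers, plus their type.
function buildMappingQuery(sourceProperty, targetProperty, limit) {
  return [
    "SELECT ?item ?sourceId ?targetId ?type WHERE {",
    "  ?item wdt:" + sourceProperty + " ?sourceId ;",
    "        wdt:" + targetProperty + " ?targetId .",
    // Assumption: P31 (instance of) is used here only to inspect the item type.
    "  OPTIONAL { ?item wdt:P31 ?type . }",
    "}",
    "LIMIT " + limit,
  ].join("\n");
}

// ChEBI ID (P683) to HMDB ID (P2057), first ten hits only.
console.log(buildMappingQuery("P683", "P2057", 10));
```

<p>Keeping the item type next to the identifier pair is a small reminder that a matching identifier alone does not
say whether two records describe the same thing.</p>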

<p>And that is why Wikidata is so suitable to link all these chemical databases: it can distinguish differences when needed,
and make that explicit. It makes mappings between the databases more <a href="https://www.nature.com/articles/sdata201618">FAIR</a>.</p>]]></content><author><name>Egon Willighagen</name></author><category term="wikidata" /><category term="scholia" /><category term="chemistry" /><category term="bridgedb" /><category term="cas" /><category term="chebi" /><category term="chemspider" /><category term="fair" /><category term="hmdb" /><category term="pubchem" /><category term="rdf" /><category term="wikicite" /><category term="justdoi:10.6084/m9.figshare.6356027.v1" /><category term="justdoi:10.1186/s13321-018-0293-8" /><category term="justdoi:10.1186/s13321-015-0072-8" /><category term="justdoi:10.1038/sdata.2016.18" /><summary type="html"><![CDATA[Bar chart showing the number of compounds with a particular chemical identifier. I think Wikidata is a groundbreaking project, which will have a major impact on science. One of the reasons is the open license (CCZero), the very basic approach (Wikibase), and the superb community around it. 
For example, setting up your own Wikibase including a cool SPARQL endpoint, is easily done with Docker.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/extid-wikidata-histogram.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/extid-wikidata-histogram.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Adding disclosures to Wikidata with Bioclipse</title><link href="https://chem-bla-ics.linkedchemistry.info/2016/03/20/adding-disclosures-to-wikidata-with.html" rel="alternate" type="text/html" title="Adding disclosures to Wikidata with Bioclipse" /><published>2016-03-20T00:00:00+00:00</published><updated>2016-03-20T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2016/03/20/adding-disclosures-to-wikidata-with</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2016/03/20/adding-disclosures-to-wikidata-with.html"><![CDATA[<p>Last week the huge, bi-annual ACS meeting took place (<a href="https://twitter.com/search?q=%23ACSSanDiego">#ACSSanDiego</a>),
during which new drug (leads) are commonly disclosed. This time too, like this one tweeted by
<a href="https://twitter.com/beth_halford">Bethany Halford</a>:</p>

<iframe id="twitter-widget-3" scrolling="no" frameborder="0" allowtransparency="true" allowfullscreen="true" class="" title="X Post" src="https://platform.twitter.com/embed/Tweet.html?dnt=false&amp;embedId=twitter-widget-3&amp;features=eyJ0ZndfdGltZWxpbmVfbGlzdCI6eyJidWNrZXQiOltdLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X2ZvbGxvd2VyX2NvdW50X3N1bnNldCI6eyJidWNrZXQiOnRydWUsInZlcnNpb24iOm51bGx9LCJ0ZndfdHdlZXRfZWRpdF9iYWNrZW5kIjp7ImJ1Y2tldCI6Im9uIiwidmVyc2lvbiI6bnVsbH0sInRmd19yZWZzcmNfc2Vzc2lvbiI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfZm9zbnJfc29mdF9pbnRlcnZlbnRpb25zX2VuYWJsZWQiOnsiYnVja2V0Ijoib24iLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X21peGVkX21lZGlhXzE1ODk3Ijp7ImJ1Y2tldCI6InRyZWF0bWVudCIsInZlcnNpb24iOm51bGx9LCJ0ZndfZXhwZXJpbWVudHNfY29va2llX2V4cGlyYXRpb24iOnsiYnVja2V0IjoxMjA5NjAwLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X3Nob3dfYmlyZHdhdGNoX3Bpdm90c19lbmFibGVkIjp7ImJ1Y2tldCI6Im9uIiwidmVyc2lvbiI6bnVsbH0sInRmd19kdXBsaWNhdGVfc2NyaWJlc190b19zZXR0aW5ncyI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfdXNlX3Byb2ZpbGVfaW1hZ2Vfc2hhcGVfZW5hYmxlZCI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfdmlkZW9faGxzX2R5bmFtaWNfbWFuaWZlc3RzXzE1MDgyIjp7ImJ1Y2tldCI6InRydWVfYml0cmF0ZSIsInZlcnNpb24iOm51bGx9LCJ0ZndfbGVnYWN5X3RpbWVsaW5lX3N1bnNldCI6eyJidWNrZXQiOnRydWUsInZlcnNpb24iOm51bGx9LCJ0ZndfdHdlZXRfZWRpdF9mcm9udGVuZCI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9fQ%3D%3D&amp;frame=false&amp;hideCard=false&amp;hideThread=false&amp;id=710543705812426752&amp;lang=en&amp;origin=https%3A%2F%2Fchem-bla-ics.blogspot.com%2F2016%2F03%2Fadding-disclosures-to-wikidata-with.html&amp;sessionId=ba8a9ed10d55387ac0f656bfaf73f3a579e1e77a&amp;theme=light&amp;widgetsVersion=2615f7e52b7e0%3A1702314776716&amp;width=550px" style="position: static; visibility: visible; width: 550px; height: 1311px; display: block; flex-grow: 1;" data-tweet-id="710543705812426752"></iframe>
<p><br /></p>

<p>Because getting this information out in the open is important, I think it’s a good idea to add these disclosures to
<a href="http://wikidata.org/">Wikidata</a> (see doi:<a href="http://dx.doi.org/10.3897/rio.1.e7573">10.3897/rio.1.e7573</a>).
So, with <a href="http://www.bioclipse.net/">Bioclipse</a> (doi:<a href="http://dx.doi.org/10.1186/1471-2105-8-59">10.1186/1471-2105-8-59</a>)
I redrew the structure:</p>

<p><img src="/assets/images/strucutre.png" alt="" /></p>

<p>I previously blogged about how to <a href="https://chem-bla-ics.linkedchemistry.info/2016/01/27/adding-chemical-compound-to-wikidata.html">add chemicals to Wikidata <i class="fa-solid fa-recycle fa-xs"></i></a>,
but I realized that I wanted to also use Bioclipse to automate this process a bit. So, I wrote this script to generate the SMILES, InChI,
and InChIKey, double check that the compound is not already in Wikidata (using the <a href="https://query.wikidata.org/">Wikidata SPARQL endpoint</a>),
and look up the <a href="https://pubchem.ncbi.nlm.nih.gov/">PubChem</a> compound identifier (the script uses an example SMILES).</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">smiles</span> <span class="o">=</span> <span class="s2">"CCCC"</span>

<span class="n">mol</span> <span class="o">=</span> <span class="n">cdk</span><span class="o">.</span><span class="na">fromSMILES</span><span class="o">(</span><span class="n">smiles</span><span class="o">)</span>
<span class="n">ui</span><span class="o">.</span><span class="na">open</span><span class="o">(</span><span class="n">mol</span><span class="o">)</span>

<span class="n">inchiObj</span> <span class="o">=</span> <span class="n">inchi</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span><span class="n">mol</span><span class="o">)</span>
<span class="n">inchiShort</span> <span class="o">=</span> <span class="n">inchiObj</span><span class="o">.</span><span class="na">value</span><span class="o">.</span><span class="na">substring</span><span class="o">(</span><span class="mi">6</span><span class="o">)</span>
<span class="n">key</span> <span class="o">=</span> <span class="n">inchiObj</span><span class="o">.</span><span class="na">key</span> <span class="c1">// key = "GDGXJFJBRMKYDL-FYWRMAATSA-N"</span>

<span class="n">sparql</span> <span class="o">=</span> <span class="s2">"""
PREFIX wdt: &lt;http://www.wikidata.org/prop/direct/&gt;
SELECT ?compound WHERE {
  ?compound wdt:P235 "$key" .
}
"""</span>

<span class="k">if</span> <span class="o">(</span><span class="n">bioclipse</span><span class="o">.</span><span class="na">isOnline</span><span class="o">())</span> <span class="o">{</span>
  <span class="n">results</span> <span class="o">=</span> <span class="n">rdf</span><span class="o">.</span><span class="na">sparqlRemote</span><span class="o">(</span>
    <span class="s2">"https://query.wikidata.org/sparql"</span><span class="o">,</span> <span class="n">sparql</span>
  <span class="o">)</span>
  <span class="n">missing</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="na">rowCount</span> <span class="o">==</span> <span class="mi">0</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
  <span class="n">missing</span> <span class="o">=</span> <span class="kc">true</span>
<span class="o">}</span>

<span class="n">formula</span> <span class="o">=</span> <span class="n">cdk</span><span class="o">.</span><span class="na">molecularFormula</span><span class="o">(</span><span class="n">mol</span><span class="o">)</span>

<span class="c1">// Create the Wikidata QuickStatement,</span>
<span class="c1">// see https://tools.wmflabs.org/wikidata-todo/quick_statements.php</span>

<span class="n">item</span> <span class="o">=</span> <span class="s2">"LAST"</span> <span class="c1">// set to Qxxxx if you need to append info,</span>
              <span class="c1">// e.g. item = "Q22579236"</span>

<span class="n">pubchemLine</span> <span class="o">=</span> <span class="s2">""</span>
<span class="k">if</span> <span class="o">(</span><span class="n">bioclipse</span><span class="o">.</span><span class="na">isOnline</span><span class="o">())</span> <span class="o">{</span>
  <span class="n">pcResults</span> <span class="o">=</span> <span class="n">pubchem</span><span class="o">.</span><span class="na">search</span><span class="o">(</span><span class="n">key</span><span class="o">)</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">pcResults</span><span class="o">.</span><span class="na">size</span> <span class="o">==</span> <span class="mi">1</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">cid</span> <span class="o">=</span> <span class="n">pcResults</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span>
    <span class="n">pubchemLine</span> <span class="o">=</span> <span class="s2">"$item\tP662\t\"$cid\""</span>
  <span class="o">}</span>
<span class="o">}</span>

<span class="k">if</span> <span class="o">(!</span><span class="n">missing</span><span class="o">)</span> <span class="o">{</span>
  <span class="n">println</span> <span class="s2">"===================="</span>
  <span class="n">println</span> <span class="s2">"Already in Wikidata as "</span> <span class="o">+</span> <span class="n">results</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="s2">"compound"</span><span class="o">)</span>
  <span class="n">println</span> <span class="s2">"===================="</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
  <span class="n">statement</span> <span class="o">=</span> <span class="s2">"""
    CREATE
    
    $item\tDen\t\"chemical compound\"
    $item\tP233\t\"$smiles\"
    $item\tP274\t\"$formula\"
    $item\tP234\t\"$inchiShort\"
    $item\tP235\t\"$key\"
    $pubchemLine
  """</span>

  <span class="n">println</span> <span class="s2">"===================="</span>
  <span class="n">println</span> <span class="n">statement</span>
  <span class="n">println</span> <span class="s2">"===================="</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The output of this script is a <a href="https://tools.wmflabs.org/wikidata-todo/quick_statements.php">QuickStatement</a> for
<a href="http://twitter.org/MagnusManske">Magnus Manske</a>’s tool, which is a list of commands to create and edit entries in Wikidata.
IMPORTANT: it’s not meant to automate editing Wikidata! I only automate creating the input, which I carefully check
(e.g. checking that all stereochemistry is defined). Note how Bioclipse opens up the structure in a viewer with ui.open().
You need to enable the tool first, but if you have an account, this is not too hard. Of course, the advantage is that it is a lot quicker. I have a similar
script to create QuickStatements starting with only a <a href="https://www.ebi.ac.uk/chembl/">ChEMBL</a> identifier.</p>
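
<p>For readers without Bioclipse, the two text-generating steps of the script above can be sketched in plain JavaScript.
This is only an illustration under assumptions: the helper names (<code>buildInchiKeyQuery</code>, <code>wikidataQueryUrl</code>,
<code>buildQuickStatements</code>) are made up here, nothing is sent to Wikidata, and the methane values in the usage part
merely serve as a small example.</p>

```javascript
// Sketch only: plain JavaScript, no Bioclipse managers involved.
// All function names are hypothetical helpers, not part of any library.

// SPARQL query that looks for a Wikidata item with the given InChIKey (P235).
// The wdt: prefix is predefined by the Wikidata query service.
function buildInchiKeyQuery(inchiKey) {
  return 'SELECT ?compound WHERE { ?compound wdt:P235 "' + inchiKey + '" . }';
}

// URL for the query service; URLSearchParams takes care of the encoding.
function wikidataQueryUrl(sparql) {
  const params = new URLSearchParams({ query: sparql, format: "json" });
  return "https://query.wikidata.org/sparql?" + params.toString();
}

// Tab-separated QuickStatements commands, in the same order as the script above.
function buildQuickStatements(item, compound) {
  const quoted = function (value) { return '"' + value + '"'; };
  const lines = [
    "CREATE",
    [item, "Den", quoted("chemical compound")].join("\t"),
    [item, "P233", quoted(compound.smiles)].join("\t"),
    [item, "P274", quoted(compound.formula)].join("\t"),
    [item, "P234", quoted(compound.inchi)].join("\t"),
    [item, "P235", quoted(compound.inchiKey)].join("\t"),
  ];
  // The PubChem CID (P662) is optional, as in the original script.
  if (compound.pubchemCid) {
    lines.push([item, "P662", quoted(compound.pubchemCid)].join("\t"));
  }
  return lines.join("\n");
}

// Usage with methane as a small example; the output still needs a manual check.
const methane = {
  smiles: "C",
  formula: "CH4",
  inchi: "1S/CH4/h1H4",
  inchiKey: "VNWKTOKETHGBQD-UHFFFAOYSA-N",
};
console.log(wikidataQueryUrl(buildInchiKeyQuery(methane.inchiKey)));
console.log(buildQuickStatements("LAST", methane));
```

<p>Keeping the query and the statements as plain text makes it easy to inspect both before anything is submitted,
which matches the point above about checking the input by hand.</p>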

<p>The QuickStatement for GDC-0853 looks like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    CREATE
    
    LAST Den "chemical compound"
    LAST P233 "O=C1C(=CC(=CN1C)c2ccnc(c2CO)N4C(=O)c3cc5c(n3CC4)CC(C)(C)C5)Nc6ncc(cc6)N7CCN(C[C@@H]7C)C8COC8"
    LAST P274 "C37H44N8O4"
    LAST P234 "1S/C37H44N8O4/c1-23-18-42(27-21-49-22-27)9-10-43(23)26-5-6-33(39-17-26)40-30-13-25(19-41(4)35(30)47)28-7-8-38-34(29(28)20-46)45-12-11-44-31(36(45)48)14-24-15-37(2,3)16-32(24)44/h5-8,13-14,17,19,23,27,46H,9-12,15-16,18,20-22H2,1-4H3,(H,39,40)/t23-/m0/s1"
    LAST P235 "WNEODWDFDXWOLU-QHCPKHFHSA-N"
    LAST P662 "86567195"
</code></pre></div></div>

<p>The first line creates a new Wikidata item, while the next ones add information about this compound. GDC-0853 is now also
<a href="https://www.wikidata.org/wiki/Q23304817">Q23304817</a>. The label I added manually afterwards. Note how the Bioclipse script found
the PubChem identifier, using the InChIKey. I also use this approach to add compounds to Wikidata that we have in
<a href="http://wikipathways.org/">WikiPathways</a>.</p>]]></content><author><name>Egon Willighagen</name></author><category term="acs" /><category term="bioclipse" /><category term="chembl" /><category term="inchi" /><category term="pubchem" /><category term="wikidata" /><category term="acssandiego" /><category term="doi:10.1186/1471-2105-8-59" /><category term="doi:10.3897/RIO.1.E7573" /><summary type="html"><![CDATA[Last week the huge, bi-annual ACS meeting took place (#ACSSanDiego), during which commonly new drug (leads) are disclosed. This time too, like this one tweeted by Bethany Halford:]]></summary></entry><entry><title type="html">PubChemRDF: semantic web access to PubChem data</title><link href="https://chem-bla-ics.linkedchemistry.info/2015/07/15/pubchemrdf-semantic-web-access-to.html" rel="alternate" type="text/html" title="PubChemRDF: semantic web access to PubChem data" /><published>2015-07-15T00:00:00+00:00</published><updated>2015-07-15T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2015/07/15/pubchemrdf-semantic-web-access-to</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2015/07/15/pubchemrdf-semantic-web-access-to.html"><![CDATA[<p><img src="/assets/images/s13321-015-0084-4-graphical-abstract.gif" style="width: 30%; display: block; margin-left: auto; margin-right: auto; float: right" />
Gang Fu and Evan Bolton have <a href="https://pubchem.ncbi.nlm.nih.gov/rdf/">blogged</a> about it previously, but their PubChemRDF paper is out now
(doi:<a href="https://doi.org/10.1186/s13321-015-0084-4">10.1186/s13321-015-0084-4</a>). It very likely defines the largest collection of RDF triples
using the <a href="http://chem-bla-ics.blogspot.nl/search?q=CHEMINF&amp;max-results=20&amp;by-date=true">CHEMINF ontology</a> and I congratulate the
authors on an increasingly powerful <a href="http://pubchem.ncbi.nlm.nih.gov/">PubChem</a> database.</p>

<p>With this major provider of Linked Open Data for chemistry now published, I should soon see where
<a href="http://chem-bla-ics.blogspot.nl/2012/07/isbjrn-4-added-cheminf-support.html">my Isbjørn stands</a>. The release of this publication is
also very timely with respect to the CHEMINF ontology, as I last week finished a transition from Google to GitHub, by moving the important
wiki pages, including one about “<a href="https://github.com/semanticchemistry/semanticchemistry/wiki/Where-is-the-CHEMINF-ontology-used%3F">Where is the CHEMINF ontology used?</a>”.
I already added Gang’s paper. A big thanks and congratulations to the PubChem team and my sincere thanks to have been able to contribute to this paper.</p>]]></content><author><name>Egon Willighagen</name></author><category term="pubchem" /><category term="rdf" /><category term="cheminf" /><category term="ontology" /><category term="doi:10.1186/S13321-015-0084-4" /><summary type="html"><![CDATA[Gang Fu and Evan Bolton have blogged about it previously, but their PubChemRDF paper is out now (doi:10.1186/s13321-015-0084-4). It very likely defines the largest collection of RDF triples using the CHEMINF ontology and I congratulate the authors with a increasingly powerful PubChem database.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/s13321-015-0084-4-graphical-abstract.gif" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/s13321-015-0084-4-graphical-abstract.gif" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Searching PubChem from within Bioclipse</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/08/07/searching-pubchem-from-within-bioclipse.html" rel="alternate" type="text/html" title="Searching PubChem from within Bioclipse" /><published>2009-08-07T00:00:00+00:00</published><updated>2009-08-07T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/08/07/searching-pubchem-from-within-bioclipse</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/08/07/searching-pubchem-from-within-bioclipse.html"><![CDATA[<p>For the application note which we are about to submit, I was working on improving the <a href="http://pubchem.ncbi.nlm.nih.gov/">PubChem</a>
<a href="http://www.bioclipse.net/">Bioclipse</a> API a bit, resulting in new <code class="language-plaintext highlighter-rouge">download</code> methods:</p>

<script src="https://gist.github.com/163281.js"></script>

<p>The search allows using <a href="http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_index">PubChem Filters</a>, which provide
many simple means to restrict the search results. For example, we can search molecules and restrict on the molecular
weight:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">lists</span> <span class="o">=</span> <span class="nx">pubchem</span><span class="p">.</span><span class="nf">download</span><span class="p">(</span><span class="nx">pubchem</span><span class="p">.</span><span class="nf">search</span><span class="p">(</span><span class="dl">"</span><span class="s2">malaria 300:500[MW]</span><span class="dl">"</span><span class="p">))</span>
</code></pre></div></div>

<p>Other filters you can use in pubchem.search (provided by PubChem itself) include (with examples):</p>

<ul>
  <li><strong>[el]</strong>: <code class="language-plaintext highlighter-rouge">pubchem.search("Au[el]")</code></li>
  <li><strong>[inchi]</strong>: <code class="language-plaintext highlighter-rouge">pubchem.search("\"InChI=1S/CH4/h1H4\"[inchi]")</code></li>
  <li><strong>[inchikey]</strong>: <code class="language-plaintext highlighter-rouge">pubchem.search("VNWKTOKETHGBQD-UHFFFAOYSA-N[inchikey]")</code></li>
  <li><strong>[mimass]</strong>: <code class="language-plaintext highlighter-rouge">pubchem.search("375.9785:375.9786[mimass]")</code></li>
</ul>

<p>And many, many more… see the linked Filters page.</p>
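
<p>Since all these filters share the same <code class="language-plaintext highlighter-rouge">value[tag]</code> and <code class="language-plaintext highlighter-rouge">low:high[tag]</code> shapes, the query strings can be assembled with a few small helpers. This is only a minimal sketch: <code class="language-plaintext highlighter-rouge">filterTerm</code>, <code class="language-plaintext highlighter-rouge">rangeTerm</code> and <code class="language-plaintext highlighter-rouge">buildQuery</code> are hypothetical names, not part of the Bioclipse or PubChem API, and they only build strings.</p>

```javascript
// Hypothetical helpers (not part of Bioclipse or PubChem); they only build query strings.

// Single-value filter term: filterTerm("Au", "el") gives "Au[el]".
function filterTerm(value, tag) {
  return String(value) + "[" + tag + "]";
}

// Range filter term: rangeTerm(300, 500, "MW") gives "300:500[MW]".
function rangeTerm(low, high, tag) {
  if (low > high) {
    throw new Error("lower bound must not exceed upper bound");
  }
  return low + ":" + high + "[" + tag + "]";
}

// Join free text and filter terms with single spaces.
function buildQuery() {
  return Array.prototype.slice.call(arguments).join(" ");
}
```

<p>The resulting string, for example <code class="language-plaintext highlighter-rouge">buildQuery("malaria", rangeTerm(300, 500, "MW"))</code>, could then be passed to <code class="language-plaintext highlighter-rouge">pubchem.search</code> as in the example above.</p>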

<p>Now, you surely want to look at the hits, for which we use the molecular table editor:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">list</span> <span class="o">=</span> <span class="nx">pubchem</span><span class="p">.</span><span class="nf">download</span><span class="p">(</span><span class="nx">pubchem</span><span class="p">.</span><span class="nf">search</span><span class="p">(</span><span class="dl">"</span><span class="s2">375.9785:375.9786[mimass]</span><span class="dl">"</span><span class="p">))</span>
<span class="nx">cdk</span><span class="p">.</span><span class="nf">saveSDFile</span><span class="p">(</span><span class="dl">"</span><span class="s2">/Virtual/hits.sdf</span><span class="dl">"</span><span class="p">,</span> <span class="nx">list</span><span class="p">)</span>
<span class="nx">ui</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="dl">"</span><span class="s2">/Virtual/hits.sdf</span><span class="dl">"</span><span class="p">)</span>
</code></pre></div></div>

<p>Resulting in:</p>

<p><img src="/assets/images/pubchemSearchResults.png" alt="" /></p>]]></content><author><name>Egon Willighagen</name></author><category term="bioclipse" /><category term="pubchem" /><category term="inchikey:VNWKTOKETHGBQD-UHFFFAOYSA-N" /><summary type="html"><![CDATA[For the application note which we are about to submit, I was working on improving the PubChem Bioclipse API a bit, resulting in new download methods:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/pubchemSearchResults.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/pubchemSearchResults.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">PubChem-CDK</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/05/11/pubchem-cdk.html" rel="alternate" type="text/html" title="PubChem-CDK" /><published>2009-05-11T00:00:00+00:00</published><updated>2009-05-11T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/05/11/pubchem-cdk</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/05/11/pubchem-cdk.html"><![CDATA[<p><a href="http://pele.farmbio.uu.se/pubchem/">PubChem-CDK</a> is a project that runs <a href="http://cdk.sf.net/">CDK</a> code on the <a href="http://pubchem.ncbi.nlm.nih.gov/">PubChem</a> data.
As we speak, a Groovy script reads about 100 PubChem Compounds XML entries per second into the database. Mind you, it reads the XML, not the SDF they distribute, which uses
custom extensions to overcome the limits of the real MDL SDF format.</p>

<p>Right now, it has run the atom type perception algorithm on about 1M compounds, and has a pretty good coverage of the <em>organic chemistry</em> domain. I will
analyze the <a href="http://pele.farmbio.uu.se/pubchem/atomtyping/">results</a> statistically soon, but will likely use this data first to add some missing atom types
to CDK 1.2.x. BTW, did you know only <strong><em>three</em></strong> <a href="http://pele.farmbio.uu.se/pubchem/atomtyping/?element=C">carbon atoms failed</a>?
A C<sup>4-</sup> (CID:<a href="http://pele.farmbio.uu.se/pubchem/?cid=156031">156031</a>), a C<sup>3+</sup> (CID:<a href="http://pele.farmbio.uu.se/pubchem/?cid=161072">161072</a>),
and a C<sup>2+</sup> (CID:<a href="http://pele.farmbio.uu.se/pubchem/?cid=161073">161073</a>). Would your cheminformatics library know what their properties are?</p>

<p>It is a really nice way of browsing PubChem, BTW. For example, did you know there are several boron compounds which have a substructure [N+]-[B+]-[N+]? Yes,
three positive charges, <em>next</em> to each other. For example (CID:<a href="http://pele.farmbio.uu.se/pubchem/?cid=3612285">3612285</a>):</p>

<p><img src="/assets/images/CID3612285.png" alt="" /></p>

<p>Well, neither did I. How was it synthesised? What are the spectral properties? How do they stabilise it? What magic counter ion? PubChem, unfortunately,
does not have links to primary literature, and there is no free source for that available. A failure in chemistry. The source points to
<a href="http://cdb.ics.uci.edu/index.htm">ChemDB</a>, but the <a href="http://cdb.ics.uci.edu/cgibin/ChemicalDetailWeb.psp?chemical_id=5257702">entry in that database</a> does
not shed light on this either.</p>

<p>Anyway, more on this later. Much more, as I plan to run many CDK algorithms on this code.</p>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="pubchem" /><category term="inchikey:QBHDGWNWWIEZOM-UHFFFAOYSA-N" /><summary type="html"><![CDATA[PubChem-CDK is a project that runs CDK code on the PubChem data. As we speak, a groovy script reads about 100 PubChem Compounds XML entries per second into the database. Mind you, not the SDF they distribute which uses a custom extension to overcome the limits of the real MDL SDF format.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/CID3612285.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/CID3612285.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Downloading Domoic Acid from PubChem</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/04/17/downloading-domoic-acid-from-pubchem.html" rel="alternate" type="text/html" title="Downloading Domoic Acid from PubChem" /><published>2009-04-17T00:10:00+00:00</published><updated>2009-04-17T00:10:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/04/17/downloading-domoic-acid-from-pubchem</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/04/17/downloading-domoic-acid-from-pubchem.html"><![CDATA[<p>The identity of <a href="http://en.wikipedia.org/wiki/Domoic_acid">domoic acid</a> has been under discussion (see
<a href="http://www.chemspider.com/blog/the-plot-thickens-on-domoic-acid.html">here</a>, <a href="http://www.chemspider.com/blog/where-does-ce-news-source-its-chemical-structures.html">here</a>
and <a href="http://www.chemspider.com/blog/providing-some-structured-support-with-chemspiders-wikipedia-services.html">here</a>).
(And I very much like how the <a href="http://www.chemspider.com/">ChemSpider</a> service makes it easy to
<a href="http://www.chemspider.com/blog/providing-some-structured-support-with-chemspiders-wikipedia-services.html">copy data from ChemSpider into WikiPedia ChemBoxes</a>;
cheers!)</p>

<p>Now, my practical in next week's <a href="https://apps.sourceforge.net/mediawiki/cdk/index.php?title=CDK_Workshop_2009">CDK Workshop</a> will use
<a href="http://groovy.codehaus.org/">Groovy</a> (please install it on your laptop!). I am hacking up example scripts for the course material,
and came up with this script to download the structure of <a href="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=5282253">domoic acid</a>
from <a href="http://pubchem.ncbi.nlm.nih.gov/">PubChem</a> (CID:5282253):</p>

<script src="https://gist.github.com/97067.js"></script>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="pubchem" /><category term="chemspider" /><category term="wikipedia" /><category term="inchikey:VZFRNCSOCOPNDB-AOKDLOFSSA-N" /><summary type="html"><![CDATA[The identity of domoic acid has been under discussion (see here, here and here). (And I very much like the ChemSpider service to make it easy to copy data from ChemSpider into WikiPedia ChemBoxes; cheers!)]]></summary></entry><entry><title type="html">Rednael, CDK Git for Rajarshi’s patches, PubChem SDF</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/04/13/rednael-cdk-git-for-rajarshis-patches.html" rel="alternate" type="text/html" title="Rednael, CDK Git for Rajarshi’s patches, PubChem SDF" /><published>2009-04-13T00:00:00+00:00</published><updated>2009-04-13T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/04/13/rednael-cdk-git-for-rajarshis-patches</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/04/13/rednael-cdk-git-for-rajarshis-patches.html"><![CDATA[<p>Short blog item about some <a href="http://cdk.git.sourceforge.net/">CDK Git</a> updates. Could not get sleep, so might as well spend that time on
<a href="http://cdk.sf.net/">CDK</a> hacking, no? The reason I could not actually get to sleep was the news that <a href="http://pubchem.ncbi.nlm.nih.gov/">PubChem</a>
SD files are not regular MDL SD files, but use custom extensions, for example, for dative bonds (see
<a href="ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_sdtags.pdf">this PDF</a>). This <em>surely</em> explains the weird things I have seen,
but, unfortunately, the big SDF button on PubChem does not warn about that. Anyway, thanks to Wolfgang for informing me about that
customization!</p>

<p>So, instead I hacked a bit on the CDK, which was about time. The last two weeks have been really busy with finding a new house (which we did),
and writing two big grant applications (about done). Finally, time to clean up my TOREPLY list on Gmail. I picked the request of
<a href="http://blog.rguha.net/">Rajarshi</a> to put online some of his patches, which are now available from
<a href="http://pele.farmbio.uu.se/git/rajarshi.git/">pele.farmbio.uu.se/git/rajarshi.git</a>, where you will
<a href="https://sourceforge.net/tracker/?func=browse&amp;group_id=20024&amp;atid=320024">find four of his patches ready for review</a>:
<em>fp2d</em>, <em>pcore</em>, <em>pubchemfp</em> and <em>cleanpt</em>. These are <strong>really</strong> interesting patches!</p>

<p>That brings me to the last thing for today: <a href="http://github.com/egonw/rednael/tree/master">Rednael</a>.
<a href="http://en.wikipedia.org/wiki/Zarah_Leander">Leander</a> (that nickname was already reserved, so I used the reverse) is an
<a href="http://en.wikipedia.org/wiki/Internet_Relay_Chat">IRC</a> bot for the #cdk channel that reports commits to our main Git repository.
Back in the old SVN days (time goes so fast :), we had the <a href="http://cia.vc/">CIA</a> (Langley?) use their equipment to monitor SVN commits,
and <a href="http://cia.vc/stats/project/cdk">report those online</a> and on IRC, but Git is too advanced for them, apparently. So, I wrote
my own little bot to do it (see earlier link to GitHub). It can monitor multiple channels and report about multiple RSS feeds per
channel. Thus, it is actually not restricted to Git commits alone.</p>]]></content><author><name>Egon Willighagen</name></author><category term="cdk" /><category term="git" /><category term="pubchem" /><summary type="html"><![CDATA[Short blog item about some CDK Git updates. Could not get sleep, so might as well spend that time on CDK hacking, not? Reason why I actually could not catch sleep was the news that PubChem SD files are not regular MDL SD files, but use custom extensions, for example, for dative bonds (see this PDF). This surely explains the weird things I have seen, but, unfortunately, the big SDF button on PubChem does not warn about that. Anyway, thanx for Wolfgang for informing about that customization!]]></summary></entry><entry><title type="html">Bioclipse2 Scripting #2: searching PubChem</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/02/21/bioclipse2-scripting-2-searching.html" rel="alternate" type="text/html" title="Bioclipse2 Scripting #2: searching PubChem" /><published>2009-02-21T00:00:00+00:00</published><updated>2009-02-21T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/02/21/bioclipse2-scripting-2-searching</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/02/21/bioclipse2-scripting-2-searching.html"><![CDATA[<p>This week I have been porting the PubChem plugin for <a href="http://www.bioclipse.net/">Bioclipse</a> 1.2 to the new manager-based architecture. While still working on the Wizards,
you can run the following JavaScript in Bioclipse2 from SVN and from the next beta (*):</p>

<script src="https://gist.github.com/67462.js"></script>

<p>*) There was some confusion on the <em>two</em> beta Bioclipse2 releases so far. Some people expected a release without any bugs left. That release is what we planned to call a
<em>Release Candidate</em>. We agree that the first two betas at least turned out to be more alpha than we actually hoped, and we thank everyone who has given these releases a
go. Those who tried several development releases of Bioclipse2 saw a lot of ongoing development, and we are fixing
<a href="http://bugs.bioclipse.net/">any bug reported</a> on these releases. So, do not hesitate to report bugs!</p>

<p>Earlier in this series:</p>

<ul>
  <li><a href="https://chem-bla-ics.linkedchemistry.info/2008/10/25/bioclipse2-scripting-1-from-smiles-to.html">Bioclipse2 Scripting #1: from SMILES to a UFF optimized structure in Jmol <i class="fa-solid fa-recycle fa-xs"></i></a></li>
  <li><a href="https://chem-bla-ics.linkedchemistry.info/2008/11/04/next-generation-asynchronous.html">Next generation asynchronous webservices #2 <i class="fa-solid fa-recycle fa-xs"></i></a></li>
  <li><a href="https://chem-bla-ics.linkedchemistry.info/2008/11/20/scripting-jchempaint.html">Scripting JChemPaint <i class="fa-solid fa-recycle fa-xs"></i></a></li>
  <li><a href="https://chem-bla-ics.linkedchemistry.info/2009/02/15/bioclipse-for-cdk-developers-1.html">Bioclipse for CDK Developers #1 <i class="fa-solid fa-recycle fa-xs"></i></a></li>
</ul>]]></content><author><name>Egon Willighagen</name></author><category term="bioclipse" /><category term="pubchem" /><category term="javascript" /><summary type="html"><![CDATA[This week I have been porting the PubChem plugin for Bioclipse 1.2 to the new manager-based architecture. While still working on the Wizards, you can run the following JavaScript in Bioclipse2 from SVN and from the next beta (*):]]></summary></entry><entry><title type="html">Editing and Validation of PubChem XML documents</title><link href="https://chem-bla-ics.linkedchemistry.info/2009/01/15/editing-and-validation-of-pubchem-xml.html" rel="alternate" type="text/html" title="Editing and Validation of PubChem XML documents" /><published>2009-01-15T00:00:00+00:00</published><updated>2009-01-15T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2009/01/15/editing-and-validation-of-pubchem-xml</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2009/01/15/editing-and-validation-of-pubchem-xml.html"><![CDATA[<p>With the general framework set up for <a href="https://chem-bla-ics.linkedchemistry.info/2008/12/30/editing-and-validation-of-cml-documents.html">editing and validation of CML documents <i class="fa-solid fa-recycle fa-xs"></i></a>,
it was fairly easy to support the <a href="ftp://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem.xsd">PubChem XML file format schema too</a>.</p>

<p>With the upcoming Bioclipse2 beta (scheduled next Friday), all you need to install on top of the Bioclipse2 core is the new XML feature.</p>]]></content><author><name>Egon Willighagen</name></author><category term="pubchem" /><category term="xml" /><category term="bioclipse" /><summary type="html"><![CDATA[With the general framework set up for editing and validation of CML documents , it was fairly easy to support the PubChem XML file format schema too.]]></summary></entry></feed>