<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/acssandiego.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-04-11T11:30:50+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/acssandiego.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">Adding disclosures to Wikidata with Bioclipse</title><link href="https://chem-bla-ics.linkedchemistry.info/2016/03/20/adding-disclosures-to-wikidata-with.html" rel="alternate" type="text/html" title="Adding disclosures to Wikidata with Bioclipse" /><published>2016-03-20T00:00:00+00:00</published><updated>2016-03-20T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2016/03/20/adding-disclosures-to-wikidata-with</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2016/03/20/adding-disclosures-to-wikidata-with.html"><![CDATA[<p>Last week the huge, bi-annual ACS meeting took place (<a href="https://twitter.com/search?q=%23ACSSanDiego">#ACSSanDiego</a>),
during which new drugs (or drug leads) are commonly disclosed. This time too, like the one tweeted by
<a href="https://twitter.com/beth_halford">Bethany Halford</a>:</p>

<iframe id="twitter-widget-3" scrolling="no" frameborder="0" allowtransparency="true" allowfullscreen="true" class="" title="X Post" src="https://platform.twitter.com/embed/Tweet.html?dnt=false&amp;embedId=twitter-widget-3&amp;features=eyJ0ZndfdGltZWxpbmVfbGlzdCI6eyJidWNrZXQiOltdLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X2ZvbGxvd2VyX2NvdW50X3N1bnNldCI6eyJidWNrZXQiOnRydWUsInZlcnNpb24iOm51bGx9LCJ0ZndfdHdlZXRfZWRpdF9iYWNrZW5kIjp7ImJ1Y2tldCI6Im9uIiwidmVyc2lvbiI6bnVsbH0sInRmd19yZWZzcmNfc2Vzc2lvbiI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfZm9zbnJfc29mdF9pbnRlcnZlbnRpb25zX2VuYWJsZWQiOnsiYnVja2V0Ijoib24iLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X21peGVkX21lZGlhXzE1ODk3Ijp7ImJ1Y2tldCI6InRyZWF0bWVudCIsInZlcnNpb24iOm51bGx9LCJ0ZndfZXhwZXJpbWVudHNfY29va2llX2V4cGlyYXRpb24iOnsiYnVja2V0IjoxMjA5NjAwLCJ2ZXJzaW9uIjpudWxsfSwidGZ3X3Nob3dfYmlyZHdhdGNoX3Bpdm90c19lbmFibGVkIjp7ImJ1Y2tldCI6Im9uIiwidmVyc2lvbiI6bnVsbH0sInRmd19kdXBsaWNhdGVfc2NyaWJlc190b19zZXR0aW5ncyI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfdXNlX3Byb2ZpbGVfaW1hZ2Vfc2hhcGVfZW5hYmxlZCI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9LCJ0ZndfdmlkZW9faGxzX2R5bmFtaWNfbWFuaWZlc3RzXzE1MDgyIjp7ImJ1Y2tldCI6InRydWVfYml0cmF0ZSIsInZlcnNpb24iOm51bGx9LCJ0ZndfbGVnYWN5X3RpbWVsaW5lX3N1bnNldCI6eyJidWNrZXQiOnRydWUsInZlcnNpb24iOm51bGx9LCJ0ZndfdHdlZXRfZWRpdF9mcm9udGVuZCI6eyJidWNrZXQiOiJvbiIsInZlcnNpb24iOm51bGx9fQ%3D%3D&amp;frame=false&amp;hideCard=false&amp;hideThread=false&amp;id=710543705812426752&amp;lang=en&amp;origin=https%3A%2F%2Fchem-bla-ics.blogspot.com%2F2016%2F03%2Fadding-disclosures-to-wikidata-with.html&amp;sessionId=ba8a9ed10d55387ac0f656bfaf73f3a579e1e77a&amp;theme=light&amp;widgetsVersion=2615f7e52b7e0%3A1702314776716&amp;width=550px" style="position: static; visibility: visible; width: 550px; height: 1311px; display: block; flex-grow: 1;" data-tweet-id="710543705812426752"></iframe>

<p>Because getting this information out in the open is important, I think it’s a good idea to add these compounds to
<a href="http://wikidata.org/">Wikidata</a> (see doi:<a href="http://dx.doi.org/10.3897/rio.1.e7573">10.3897/rio.1.e7573</a>).
So, with <a href="http://www.bioclipse.net/">Bioclipse</a> (doi:<a href="http://dx.doi.org/10.1186/1471-2105-8-59">10.1186/1471-2105-8-59</a>)
I redrew the structure:</p>

<p><img src="/assets/images/strucutre.png" alt="" /></p>

<p>I previously blogged about how to <a href="https://chem-bla-ics.linkedchemistry.info/2016/01/27/adding-chemical-compound-to-wikidata.html">add chemicals to Wikidata <i class="fa-solid fa-recycle fa-xs"></i></a>,
but I realized that I also wanted to use Bioclipse to automate this process a bit. So, I wrote this script to generate the SMILES, InChI, and
InChIKey, double check that the compound is not already in Wikidata (using the <a href="https://query.wikidata.org/">Wikidata SPARQL endpoint</a>),
and look up the <a href="https://pubchem.ncbi.nlm.nih.gov/">PubChem</a> compound identifier (the SMILES in the script is just an example).</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">smiles</span> <span class="o">=</span> <span class="s2">"CCCC"</span>

<span class="n">mol</span> <span class="o">=</span> <span class="n">cdk</span><span class="o">.</span><span class="na">fromSMILES</span><span class="o">(</span><span class="n">smiles</span><span class="o">)</span>
<span class="n">ui</span><span class="o">.</span><span class="na">open</span><span class="o">(</span><span class="n">mol</span><span class="o">)</span>

<span class="n">inchiObj</span> <span class="o">=</span> <span class="n">inchi</span><span class="o">.</span><span class="na">generate</span><span class="o">(</span><span class="n">mol</span><span class="o">)</span>
<span class="n">inchiShort</span> <span class="o">=</span> <span class="n">inchiObj</span><span class="o">.</span><span class="na">value</span><span class="o">.</span><span class="na">substring</span><span class="o">(</span><span class="mi">6</span><span class="o">)</span>
<span class="n">key</span> <span class="o">=</span> <span class="n">inchiObj</span><span class="o">.</span><span class="na">key</span> <span class="c1">// key = "GDGXJFJBRMKYDL-FYWRMAATSA-N"</span>

<span class="n">sparql</span> <span class="o">=</span> <span class="s2">"""
PREFIX wdt: &lt;http://www.wikidata.org/prop/direct/&gt;
SELECT ?compound WHERE {
  ?compound wdt:P235 "$key" .
}
"""</span>

<span class="k">if</span> <span class="o">(</span><span class="n">bioclipse</span><span class="o">.</span><span class="na">isOnline</span><span class="o">())</span> <span class="o">{</span>
  <span class="n">results</span> <span class="o">=</span> <span class="n">rdf</span><span class="o">.</span><span class="na">sparqlRemote</span><span class="o">(</span>
    <span class="s2">"https://query.wikidata.org/sparql"</span><span class="o">,</span> <span class="n">sparql</span>
  <span class="o">)</span>
  <span class="n">missing</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="na">rowCount</span> <span class="o">==</span> <span class="mi">0</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
  <span class="n">missing</span> <span class="o">=</span> <span class="kc">true</span>
<span class="o">}</span>

<span class="n">formula</span> <span class="o">=</span> <span class="n">cdk</span><span class="o">.</span><span class="na">molecularFormula</span><span class="o">(</span><span class="n">mol</span><span class="o">)</span>

<span class="c1">// Create the Wikidata QuickStatement,</span>
<span class="c1">// see https://tools.wmflabs.org/wikidata-todo/quick_statements.php</span>

<span class="n">item</span> <span class="o">=</span> <span class="s2">"LAST"</span> <span class="c1">// set to Qxxxx if you need to append info,</span>
              <span class="c1">// e.g. item = "Q22579236"</span>

<span class="n">pubchemLine</span> <span class="o">=</span> <span class="s2">""</span>
<span class="k">if</span> <span class="o">(</span><span class="n">bioclipse</span><span class="o">.</span><span class="na">isOnline</span><span class="o">())</span> <span class="o">{</span>
  <span class="n">pcResults</span> <span class="o">=</span> <span class="n">pubchem</span><span class="o">.</span><span class="na">search</span><span class="o">(</span><span class="n">key</span><span class="o">)</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">pcResults</span><span class="o">.</span><span class="na">size</span> <span class="o">==</span> <span class="mi">1</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">cid</span> <span class="o">=</span> <span class="n">pcResults</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span>
    <span class="n">pubchemLine</span> <span class="o">=</span> <span class="s2">"$item\tP662\t\"$cid\""</span>
  <span class="o">}</span>
<span class="o">}</span>

<span class="k">if</span> <span class="o">(!</span><span class="n">missing</span><span class="o">)</span> <span class="o">{</span>
  <span class="n">println</span> <span class="s2">"===================="</span>
  <span class="n">println</span> <span class="s2">"Already in Wikidata as "</span> <span class="o">+</span> <span class="n">results</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="mi">1</span><span class="o">,</span><span class="s2">"compound"</span><span class="o">)</span>
  <span class="n">println</span> <span class="s2">"===================="</span>
<span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
  <span class="n">statement</span> <span class="o">=</span> <span class="s2">"""
    CREATE
    
    $item\tDen\t\"chemical compound\"
    $item\tP233\t\"$smiles\"
    $item\tP274\t\"$formula\"
    $item\tP234\t\"$inchiShort\"
    $item\tP235\t\"$key\"
    $pubchemLine
  """</span>

  <span class="n">println</span> <span class="s2">"===================="</span>
  <span class="n">println</span> <span class="n">statement</span>
  <span class="n">println</span> <span class="s2">"===================="</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The output of this script is a <a href="https://tools.wmflabs.org/wikidata-todo/quick_statements.php">QuickStatement</a> for
<a href="http://twitter.org/MagnusManske">Magnus Manske</a>’s tool: a list of commands to create and edit entries in Wikidata.
IMPORTANT: the script is not meant to automate editing Wikidata! I only automate creating the input, which I carefully check
(e.g. checking that all stereochemistry is defined). Note how Bioclipse opens up the structure in a viewer with ui.open().
You need to enable the tool first, but if you have an account, this is not too hard. Of course, the advantage is that it is a lot quicker. I have a similar
script to create QuickStatements starting with only a <a href="https://www.ebi.ac.uk/chembl/">ChEMBL</a> identifier.</p>
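
<p>Part of that checking can be scripted. The following is a minimal sketch in plain Groovy, assuming nothing beyond the identifier formats: checkIdentifiers is a hypothetical helper, not a Bioclipse manager method. It tests the InChIKey pattern, the 1S/ prefix of a standard InChI, whether the formula equals the InChI formula layer (which should hold for a neutral, single-component compound), and whether a /t stereo layer is present at all. It does not replace looking at the structure in the viewer.</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helper (not part of Bioclipse): cheap format checks on the
// values that go into the QuickStatement. Returns a list of warnings.
def checkIdentifiers(String inchiShort, String key, String formula) {
  def problems = []
  // An InChIKey has blocks of 14, 10, and 1 uppercase letters.
  if (!(key ==~ /[A-Z]{14}-[A-Z]{10}-[A-Z]/)) {
    problems.add("unexpected InChIKey format: " + key)
  }
  // With "InChI=" stripped, a standard InChI starts with "1S/".
  if (!inchiShort.startsWith("1S/")) {
    problems.add("InChI does not start with 1S/")
  }
  // The layer after the version is the molecular formula.
  def layers = inchiShort.split("/")
  if (layers.length == 1 || layers[1] != formula) {
    problems.add("formula differs from the InChI formula layer")
  }
  // Without a /t layer no tetrahedral stereocenter is defined.
  if (!layers.any { it.startsWith("t") }) {
    problems.add("no /t layer: no tetrahedral stereochemistry defined")
  }
  return problems
}

// Example values: butane, the example SMILES of the script above
warnings = checkIdentifiers(
  "1S/C4H10/c1-3-4-2/h3-4H2,1-2H3", "IJDNQMDRQITEOD-UHFFFAOYSA-N", "C4H10"
)
println warnings
</code></pre></div></div>

<p>For an achiral compound like butane the missing /t layer is expected, so the last warning is only a prompt to look at the drawing again, not an error.</p>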

<p>The QuickStatement for GDC-0853 looks like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    CREATE
    
    LAST Den "chemical compound"
    LAST P233 "O=C1C(=CC(=CN1C)c2ccnc(c2CO)N4C(=O)c3cc5c(n3CC4)CC(C)(C)C5)Nc6ncc(cc6)N7CCN(C[C@@H]7C)C8COC8"
    LAST P274 "C37H44N8O4"
    LAST P234 "1S/C37H44N8O4/c1-23-18-42(27-21-49-22-27)9-10-43(23)26-5-6-33(39-17-26)40-30-13-25(19-41(4)35(30)47)28-7-8-38-34(29(28)20-46)45-12-11-44-31(36(45)48)14-24-15-37(2,3)16-32(24)44/h5-8,13-14,17,19,23,27,46H,9-12,15-16,18,20-22H2,1-4H3,(H,39,40)/t23-/m0/s1"
    LAST P235 "WNEODWDFDXWOLU-QHCPKHFHSA-N"
    LAST P662 "86567195"
</code></pre></div></div>
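
<p>In the listing above the columns appear separated by spaces, but the script writes them with tabs (\t). As a rough sketch of that layout, the commands can also be assembled from a map of property IDs to values. The helper quickStatement is hypothetical and not part of Bioclipse or the QuickStatements tool. Unlike the script, it only emits CREATE and the description when a new item is made, and it skips empty values such as a missing PubChem identifier; both are choices of this sketch.</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helper: one tab-separated command per line.
// values maps Wikidata property IDs (e.g. P233) to string values.
def quickStatement(Map values, String item = "LAST") {
  def lines = []
  if (item == "LAST") {
    // New item: create it and give it an English description.
    lines.add("CREATE")
    lines.add(item + "\tDen\t\"chemical compound\"")
  }
  for (entry in values) {
    // Skip properties without a value, e.g. no PubChem hit.
    if (entry.value) {
      lines.add(item + "\t" + entry.key + "\t\"" + entry.value + "\"")
    }
  }
  return lines.join("\n")
}

// Values taken from the GDC-0853 example above
println quickStatement([
  P274: "C37H44N8O4",
  P235: "WNEODWDFDXWOLU-QHCPKHFHSA-N",
  P662: "86567195"
])

// Appending to an existing item instead of creating one
println quickStatement([P662: "86567195"], "Q23304817")
</code></pre></div></div>

<p>Keeping the values in a map makes it easy to review them one by one before anything is pasted into the tool.</p>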

<p>In the QuickStatement for GDC-0853, the first line creates a new Wikidata item, while the next ones add information about this compound. GDC-0853 is now also
<a href="https://www.wikidata.org/wiki/Q23304817">Q23304817</a>. I added the label manually afterwards. Note how the Bioclipse script found
the PubChem identifier, using the InChIKey. I also use this approach to add compounds to Wikidata that we have in
<a href="http://wikipathways.org/">WikiPathways</a>.</p>]]></content><author><name>Egon Willighagen</name></author><category term="acs" /><category term="bioclipse" /><category term="chembl" /><category term="inchi" /><category term="pubchem" /><category term="wikidata" /><category term="acssandiego" /><category term="doi:10.1186/1471-2105-8-59" /><category term="doi:10.3897/RIO.1.E7573" /><summary type="html"><![CDATA[Last week the huge, bi-annual ACS meeting took place (#ACSSanDiego), during which commonly new drug (leads) are disclosed. This time too, like this one tweeted by Bethany Halford:]]></summary></entry></feed>