<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/rdf.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-04-19T09:50:36+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/rdf.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">SWAT4HCLS 2026 Amsterdam this week</title><link href="https://chem-bla-ics.linkedchemistry.info/2026/03/22/swat4hcls-2026-amsterdam-this-week.html" rel="alternate" type="text/html" title="SWAT4HCLS 2026 Amsterdam this week" /><published>2026-03-22T00:00:00+00:00</published><updated>2026-03-22T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2026/03/22/swat4hcls-2026-amsterdam-this-week</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2026/03/22/swat4hcls-2026-amsterdam-this-week.html"><![CDATA[<p>Tomorrow, <a href="https://www.swat4ls.org/workshops/amsterdam2026/">SWAT4HCLS 2026</a> will start, again in Amsterdam.
The first SWAT4LS I attended <a href="https://chem-bla-ics.linkedchemistry.info/2009/11/21/swat4ls-linking-open-drug-data-to.html">was also in Amsterdam</a>, and I was <a href="https://chem-bla-ics.linkedchemistry.info/2016/12/18/my-swat4ls-poster-about-enanomapper.html">also at the second meeting</a> in Amsterdam. I was also in <a href="https://www.swat4ls.org/workshops/cambridge2015/index.php">Cambridge</a> (see
<a href="https://chem-bla-ics.blogspot.com/2015/12/swat4ls-in-cambridge.html">this post</a>) and
<a href="https://www.swat4ls.org/workshops/antwerp2018/">Antwerp</a> (no post), and attended at least one of the two
<a href="https://www.swat4ls.org/workshops/leiden2024/">Leiden</a> meetings (also no posts, it seems).</p>

<p>I am looking forward to meeting old friends, new friends (some of whom I have never met in person), and
recent collaborators (whom I have never met in person either).
For those who will not be in Amsterdam, you can follow the meeting on social media with
the <a href="https://hashtags-hub.toolforge.org/swat4hcls">hashtag #swat4hcls</a>. And there is also
<a href="https://fediwall.biohackrxiv.org/">this BioHackrXiv Fediwall</a>, for those in the
<a href="https://en.wikipedia.org/wiki/Fediverse">fediverse</a>.</p>

<h3 id="scholia-demo">Scholia demo</h3>

<p>I will give a demo to update people on the work in the <a href="https://github.com/wdscholia/scholia">Scholia</a> project with
Daniel Mietchen, Peter Patel-Schneider, Konrad Linden, Johannes Kalmbach,
Lars Willighagen, Wolfgang Fahl, and Hannah Bast (also keynote in Amsterdam)
to <a href="https://chem-bla-ics.linkedchemistry.info/2026/02/28/rescuing-scholia-3-we-did-it.html">update the SPARQL queries</a>
we use to visualize data in <a href="https://www.wikidata.org/">Wikidata</a> to SPARQL 1.1 so that they can run on
<a href="https://qlever.dev/">QLever</a>.
The abstract can be <a href="https://commons.wikimedia.org/wiki/File:Scholia_2026_Compliance_with_SPARQL_1.1.pdf">found in Wikimedia Commons</a>.</p>

<p>This was the outcome of many years of figuring out how to ensure Scholia could keep working. The
<a href="https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2026-02-17/Technology_report">Wikidata RDF graph split</a>
has given us many headaches. So many that when, just before Christmas, it became clear that it could
be possible to survive the split, I was so happy that I realized I wanted to share this news. So, we teamed
up and wrote this demonstration contribution abstract. Thanks to everyone who made this happen!
Just to be clear, we are not done yet. The system is not running outside the Wikimedia Foundation
platforms.</p>

<p>One of the reviewer comments requested <a href="https://qlever.scholia.wiki/event/Q138033585">a Scholia page for the meeting</a>.
It has not been updated for the accepted speakers, but you can look at <a href="https://qlever.scholia.wiki/event-series/Q56846035">pages for past meetings</a>
to get an idea of what you will find.</p>

<h3 id="swat4hcls-biohackathon-2026">SWAT4HCLS Biohackathon 2026</h3>

<p>There will also be <a href="https://www.swat4ls.org/workshops/amsterdam2026/swat4hcls-biohackathon-2026/">a biohackathon again</a>,
of course, with the <a href="https://index.biohackrxiv.org/tag/SWAT4HCLS26">option for BioHackRxiv reports</a>.
There are already <a href="https://www.swat4ls.org/workshops/amsterdam2026/swat4hcls-biohackathon-2026/">several pitches</a>,
including one that I submitted about Scholia.</p>]]></content><author><name>Egon Willighagen</name></author><category term="rdf" /><category term="sparql" /><category term="swat4ls" /><category term="wikidata" /><summary type="html"><![CDATA[Tomorrow, SWAT4HCLS 2026 will start, again in Amsterdam. The first SWAT4LS I attended was also in Amsterdam, and the second meeting in Amsterdam I was also there. And I was in Cambridge (see this post), Antwerp (no post), and at least to one of the two Leiden meetings (also no posts, it seems).]]></summary></entry><entry><title type="html">Where do the WikiPathways come from?</title><link href="https://chem-bla-ics.linkedchemistry.info/2026/02/22/where-do-the-wikipathways-come-from.html" rel="alternate" type="text/html" title="Where do the WikiPathways come from?" /><published>2026-02-22T00:00:00+00:00</published><updated>2026-02-22T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2026/02/22/where-do-the-wikipathways-come-from</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2026/02/22/where-do-the-wikipathways-come-from.html"><![CDATA[<p><a href="https://en.wikipedia.org/wiki/WikiPathways">WikiPathways</a> was <a href="https://qlever.scholia.wiki/topic/Q7999828#earliest-published-works">founded in 2008</a>,
in the year I left Wageningen (and we left Nijmegen) and moved to Uppsala, Sweden. When we decided to move back to The Netherlands in 2012, I got the opportunity
to join the Department of Bioinformatics (BiGCaT) and work on Open PHACTS. I had visited the group in March 2011 because I had a COST action
workshop near Maastricht (about nanoQSAR) and the bioinformatics group did <a href="https://wikipathways.org/">WikiPathways</a>.</p>

<p>When I joined, there were already hundreds of pathways, originating from various collaborations (see below).
Around the winter break, the question came up of who the people are who have drawn all these pathways. And on the new website
this is not actually that easy to see. You can <a href="https://www.wikipathways.org/browse/table.html">browse all pathways</a>, or look up
<a href="https://www.wikipathways.org/browse/authors.html">author profiles</a>, but not all authors have done the same amount of work.
Moreover, at various points in time, batches of pathways from those collaborators were added. Often, these were added
by the <code class="language-plaintext highlighter-rouge">MaintBot</code> account, which is routinely hidden, and then the author who shows up as first author is not even
the original author. And then we still have a lot of homology-converted pathways. These are pathways translated to
some species from a model species. You can find them in <a href="https://github.com/wikipathways/wikipathways-homology">this repository</a>.</p>

<p>But nowadays I do a lot in the WikiPathways project, among other things generating the RDF and maintaining the code that does so.
And I realized that we have author information in the RDF too (created by <a href="https://orcid.org/0000-0001-5706-2163">Alex Pico</a>).
So, the idea came up to see who the “first authors” are of the WikiPathways (mind the <em>MaintBot</em> issue), and what we know
about them. Many already had their ORCID profiles linked from their profile pages, making it easy to look up their
expertise.</p>

<p>Now, that was in January. But it turned out that the author information in the RDF worked fine in the <code class="language-plaintext highlighter-rouge">.ttl</code> file
of a single pathway, but that the <em>series ordinal</em> (e.g. 1 for being first author) was bound to the author, and
a SPARQL query would not be able to figure out on which pathways someone was first author. I fixed this somewhere
in January, so in the <a href="https://github.com/wikipathways/wikipathways-help/discussions/221">February 10 release</a> the
improved data model was available.</p>

<p>Allow me to show what is now possible, with a few SPARQL queries. First, to list the authors of a pathway, use
<a href="https://edu.nl/q9txc">this template</a> for <code class="language-plaintext highlighter-rouge">WP10</code>:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">PREFIX</span><span class="w"> </span><span class="nn">dc</span><span class="o">:</span><span class="w">    </span><span class="nn">&lt;http://purl.org/dc/elements/1.1/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="w">  </span><span class="nn">&lt;http://xmlns.com/foaf/0.1/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">wpq</span><span class="o">:</span><span class="w">   </span><span class="nn">&lt;http://www.wikidata.org/prop/qualifier/&gt;</span><span class="w">
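</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">pav</span><span class="o">:</span><span class="w">   </span><span class="nn">&lt;http://purl.org/pav/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">wp</span><span class="o">:</span><span class="w">    </span><span class="nn">&lt;http://vocabularies.wikipathways.org/wp#&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">gpml</span><span class="o">:</span><span class="w">  </span><span class="nn">&lt;http://vocabularies.wikipathways.org/gpml#&gt;</span><span class="w">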

</span><span class="k">SELECT</span><span class="w"> </span><span class="nv">?pathway</span><span class="w"> </span><span class="nv">?version</span><span class="w"> </span><span class="nv">?ordinal</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nv">?name</span><span class="w"> </span><span class="nv">?orcid</span><span class="w"> </span><span class="nv">?page</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="k">VALUES</span><span class="w"> </span><span class="nv">?pathway</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nn">&lt;https://identifiers.org/wikipathways/WP10&gt;</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="nv">?author_</span><span class="w"> </span><span class="k">a</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">Person</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">wp</span><span class="o">:</span><span class="ss">hasAuthorship</span><span class="w"> </span><span class="nv">?authorship</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?authorship</span><span class="w"> </span><span class="err">^</span><span class="nn">wp</span><span class="o">:</span><span class="ss">hasAuthorship</span><span class="w"> </span><span class="nv">?pathway</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">wpq</span><span class="o">:</span><span class="ss">series_ordinal</span><span class="w"> </span><span class="nv">?ordinal</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?pathway</span><span class="w"> </span><span class="nn">pav</span><span class="o">:</span><span class="ss">hasVersion</span><span class="w"> </span><span class="nv">?pathway_</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?pathway_</span><span class="w"> </span><span class="k">a</span><span class="w"> </span><span class="nn">wp</span><span class="o">:</span><span class="ss">Pathway</span><span class="w"> </span><span class="p">;</span><span class="w"> </span><span class="nn">wp</span><span class="o">:</span><span class="ss">isAbout</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="nn">gpml</span><span class="o">:</span><span class="ss">version</span><span class="w"> </span><span class="nv">?version</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="k">OPTIONAL</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">homepage</span><span class="w"> </span><span class="nv">?page</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="k">OPTIONAL</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">name</span><span class="w"> </span><span class="nv">?name</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="k">OPTIONAL</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nn">dc</span><span class="o">:</span><span class="ss">identifier</span><span class="w"> </span><span class="nv">?orcid</span><span class="w"> </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="k">ASC</span><span class="p">(</span><span class="nv">?pathway</span><span class="p">)</span><span class="w"> </span><span class="k">ASC</span><span class="p">(</span><span class="nv">?ordinal</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>

<p>We can see who the 8 people are who contributed to this pathway (we cannot actually see here what they contributed), and many
authors are members of the WikiPathways review team, who focus more on technical quality than on the biology. The first author,
however, often is the person who contributed most of the biological knowledge in the pathway, in this case
<a href="https://www.wikipathways.org/authors/A.Pandey">Akhilesh Pandey</a> from the NetSlim collaboration
(see doi:<a href="https://doi.org/10.1093/database/bar032">10.1093/database/bar032</a>):</p>

<p><img src="/assets/images/wikipathways_authorList.png" alt="" /></p>

<h2 id="collaborations">Collaborations</h2>

<p>Over time, multiple collaborations have taken place, like the one with NetSlim from the above query. In these collaborations,
the knowledge may not be digitized in WikiPathways as GPML by the biological experts. That encoding is regularly done
by others, but with those experts ensuring the quality. The following collaborations are examples, and
<a href="https://www.wikipathways.org/browse/communities.html">a fuller list is found online</a>:</p>

<ul>
  <li><a href="https://www.wikipathways.org/communities/wormbase_approved.html">WormBase</a> (doi:<a href="https://doi.org/10.1093/nar/gkt1063">10.1093/nar/gkt1063</a>)</li>
  <li><a href="https://www.wikipathways.org/communities/lipids.html">LIPID MAPS</a> (doi:<a href="https://doi.org/10.1093/nar/gkad896">10.1093/nar/gkad896</a>)</li>
  <li><a href="https://www.wikipathways.org/communities/imd.html">Inherited Metabolic Disorders</a> (doi:<a href="https://doi.org/10.1007/978-3-030-67727-5_73">10.1007/978-3-030-67727-5_73</a>)</li>
  <li><a href="https://www.wikipathways.org/communities/micronutrients.html">Micronutrients</a> (doi:<a href="https://doi.org/10.1007/s12263-010-0192-8">10.1007/s12263-010-0192-8</a>)</li>
</ul>

<p>We have collaborated with Reactome on various occasions (e.g. see doi:<a href="https://doi.org/10.1371/journal.pcbi.1004941">10.1371/journal.pcbi.1004941</a> and
doi:<a href="https://doi.org/10.1007/s12263-010-0192-8">10.1007/s12263-010-0192-8</a>), around plants (e.g. see doi:<a href="https://doi.org/10.1186/1939-8433-6-14">10.1186/1939-8433-6-14</a>),
around rare diseases in projects like <a href="https://www.ejprarediseases.org/">EJP-RD</a> and <a href="https://erdera.org/">ERDERA</a>, and around SARS-CoV-2.
For that, see these communities:</p>

<ul>
  <li><a href="https://www.wikipathways.org/communities/reactome.html">Reactome</a></li>
  <li><a href="https://www.wikipathways.org/communities/plants.html">Plants</a> (see also <a href="https://doi.org/10.37044/osf.io/m37f2_v1">this DBCLS BioHackathon 2025 paper</a>)</li>
  <li><a href="https://www.wikipathways.org/communities/rarediseases.html">Rare Diseases</a></li>
  <li><a href="https://www.wikipathways.org/communities/covid19.html">COVID-19</a></li>
</ul>

<p>And then there are pathways in WikiPathways supported by a full paper, but I will leave that for a later moment.</p>

<h2 id="author-statistics">Author statistics</h2>

<p>Back to the authors, because the new RDF model allows a few more nice queries. For example, we can check the number
of pathways with a certain number of authors, and then we find with the following query that there are two pathways
with up to 18 authors (<a href="https://edu.nl/mhjbw">try here</a>):</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">PREFIX</span><span class="w"> </span><span class="nn">dc</span><span class="o">:</span><span class="w">    </span><span class="nn">&lt;http://purl.org/dc/elements/1.1/&gt;</span><span class="w">
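</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">wpq</span><span class="o">:</span><span class="w">   </span><span class="nn">&lt;http://www.wikidata.org/prop/qualifier/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="w">  </span><span class="nn">&lt;http://xmlns.com/foaf/0.1/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">wp</span><span class="o">:</span><span class="w">    </span><span class="nn">&lt;http://vocabularies.wikipathways.org/wp#&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">xsd</span><span class="o">:</span><span class="w">   </span><span class="nn">&lt;http://www.w3.org/2001/XMLSchema#&gt;</span><span class="w">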

</span><span class="k">SELECT</span><span class="w"> </span><span class="nv">?atLeast</span><span class="w"> </span><span class="p">(</span><span class="nb">COUNT</span><span class="p">(</span><span class="k">DISTINCT</span><span class="w"> </span><span class="nv">?pathway</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="nv">?count</span><span class="p">)</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="nv">?author_</span><span class="w"> </span><span class="k">a</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">Person</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">wp</span><span class="o">:</span><span class="ss">hasAuthorship</span><span class="w"> </span><span class="nv">?authorship</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?authorship</span><span class="w"> </span><span class="err">^</span><span class="nn">wp</span><span class="o">:</span><span class="ss">hasAuthorship</span><span class="w"> </span><span class="nv">?pathway</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">wpq</span><span class="o">:</span><span class="ss">series_ordinal</span><span class="w"> </span><span class="nv">?atLeast</span><span class="w"> </span><span class="p">.</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="nv">?atLeast</span><span class="w">
  </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="k">ASC</span><span class="p">(</span><span class="nn">xsd</span><span class="o">:</span><span class="ss">integer</span><span class="p">(</span><span class="nv">?atLeast</span><span class="p">))</span><span class="w">
</span></code></pre></div></div>
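
<p>Strictly speaking, the query above counts, for each ordinal, the pathways that have an author at that position.
As a sketch only, and not part of the analysis above, a subquery can first count the authorships per pathway and then
group the pathways by that exact number. It reuses only predicates from the queries in this post; the
<code class="language-plaintext highlighter-rouge">wp:</code> namespace IRI is an assumption about what the endpoint predefines:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PREFIX pav:  &lt;http://purl.org/pav/&gt;
PREFIX wp:   &lt;http://vocabularies.wikipathways.org/wp#&gt;
PREFIX wpq:  &lt;http://www.wikidata.org/prop/qualifier/&gt;

# Outer query: how many pathways have exactly ?authors authorships
SELECT ?authors (COUNT(?pathway) AS ?count) WHERE {
  {
    # Inner query: number of distinct authorships per pathway
    SELECT ?pathway (COUNT(DISTINCT ?authorship) AS ?authors) WHERE {
      ?pathway wp:hasAuthorship ?authorship ;
        pav:hasVersion ?pathway_ .
      # make sure ?pathway is a pathway and not an author
      ?pathway_ a wp:Pathway .
      ?authorship wpq:series_ordinal ?ordinal .
    } GROUP BY ?pathway
  }
} GROUP BY ?authors
  ORDER BY ASC(?authors)
</code></pre></div></div>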

<p>We can also look at the <a href="https://edu.nl/fkwy9">list of authors</a>, sorted by the number of pathways on which they are listed as first author,
along with their profile page and ORCID number:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">PREFIX</span><span class="w"> </span><span class="nn">dc</span><span class="o">:</span><span class="w">    </span><span class="nn">&lt;http://purl.org/dc/elements/1.1/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="w">  </span><span class="nn">&lt;http://xmlns.com/foaf/0.1/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">wpq</span><span class="o">:</span><span class="w">   </span><span class="nn">&lt;http://www.wikidata.org/prop/qualifier/&gt;</span><span class="w">
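</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">pav</span><span class="o">:</span><span class="w">   </span><span class="nn">&lt;http://purl.org/pav/&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">wp</span><span class="o">:</span><span class="w">    </span><span class="nn">&lt;http://vocabularies.wikipathways.org/wp#&gt;</span><span class="w">
</span><span class="k">PREFIX</span><span class="w"> </span><span class="nn">dcterms</span><span class="o">:</span><span class="w"> </span><span class="nn">&lt;http://purl.org/dc/terms/&gt;</span><span class="w">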

</span><span class="k">SELECT</span><span class="w"> </span><span class="p">(</span><span class="nb">COUNT</span><span class="p">(</span><span class="k">DISTINCT</span><span class="w"> </span><span class="nv">?pathway</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="nv">?count</span><span class="p">)</span><span class="w"> </span><span class="nv">?name</span><span class="w"> </span><span class="nv">?orcid</span><span class="w"> </span><span class="nv">?page</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="k">VALUES</span><span class="w"> </span><span class="nv">?ordinal</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="s2">"1"</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="nv">?author_</span><span class="w"> </span><span class="k">a</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">Person</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">wp</span><span class="o">:</span><span class="ss">hasAuthorship</span><span class="w"> </span><span class="nv">?authorship</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?authorship</span><span class="w"> </span><span class="err">^</span><span class="nn">wp</span><span class="o">:</span><span class="ss">hasAuthorship</span><span class="w"> </span><span class="nv">?pathway</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">wpq</span><span class="o">:</span><span class="ss">series_ordinal</span><span class="w"> </span><span class="nv">?ordinal</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?pathway</span><span class="w"> </span><span class="nn">pav</span><span class="o">:</span><span class="ss">hasVersion</span><span class="w"> </span><span class="nv">?pathway_</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?pathway_</span><span class="w"> </span><span class="k">a</span><span class="w"> </span><span class="nn">wp</span><span class="o">:</span><span class="ss">Pathway</span><span class="w"> </span><span class="p">;</span><span class="w"> </span><span class="nn">dcterms</span><span class="o">:</span><span class="ss">identifier</span><span class="w"> </span><span class="nv">?version</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="k">OPTIONAL</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">homepage</span><span class="w"> </span><span class="nv">?page</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="k">OPTIONAL</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nn">foaf</span><span class="o">:</span><span class="ss">name</span><span class="w"> </span><span class="nv">?name</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="k">OPTIONAL</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?author_</span><span class="w"> </span><span class="nn">dc</span><span class="o">:</span><span class="ss">identifier</span><span class="w"> </span><span class="nv">?orcid</span><span class="w"> </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="nv">?ordinal</span><span class="w"> </span><span class="nv">?name</span><span class="w"> </span><span class="nv">?orcid</span><span class="w"> </span><span class="nv">?page</span><span class="w">
  </span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="k">DESC</span><span class="p">(</span><span class="nv">?count</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>
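
<p>A variation on this, as a sketch under assumptions: to list the pathways on which one particular person is first author,
fix the name instead of counting. The name below is the first author of <code class="language-plaintext highlighter-rouge">WP10</code>
mentioned earlier. Whether <code class="language-plaintext highlighter-rouge">foaf:name</code> holds exactly this string is an assumption,
and so is the <code class="language-plaintext highlighter-rouge">wp:</code> namespace IRI:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt;
PREFIX pav:  &lt;http://purl.org/pav/&gt;
PREFIX wp:   &lt;http://vocabularies.wikipathways.org/wp#&gt;
PREFIX wpq:  &lt;http://www.wikidata.org/prop/qualifier/&gt;

SELECT DISTINCT ?pathway WHERE {
  # first-author authorships only, as in the query above
  VALUES ?ordinal { "1" }
  ?author_ a foaf:Person ;
    foaf:name ?name ;
    wp:hasAuthorship ?authorship .
  ?authorship ^wp:hasAuthorship ?pathway ;
    wpq:series_ordinal ?ordinal .
  # restrict ?pathway to actual pathways
  ?pathway pav:hasVersion ?pathway_ .
  ?pathway_ a wp:Pathway .
  # STR() ignores a possible language tag on the name (assumed example name)
  FILTER (STR(?name) = "Akhilesh Pandey")
} ORDER BY ASC(?pathway)
</code></pre></div></div>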

<p>Is this the full story? No, of course not. There are so many details still to be uncovered, but it gives a bit more
insight into where the biological knowledge in WikiPathways is coming from.</p>

<p>Want more peer review of the content? Then why not help set up a new community? Just ping me or
<a href="https://www.wikipathways.org/authors/Mkutmon">Martina</a>.</p>]]></content><author><name>Egon Willighagen</name></author><category term="wikipathways" /><category term="rdf" /><category term="justdoi:10.1093/database/bar032" /><category term="sparql" /><category term="justdoi:10.1093/nar/gkt1063" /><category term="justdoi:10.1093/nar/gkad896" /><category term="doi:10.1007/978-3-030-67727-5_73" /><category term="justdoi:10.1007/s12263-010-0192-8" /><category term="justdoi:10.1371/journal.pcbi.1004941" /><category term="justdoi:10.1007/s12263-010-0192-8" /><category term="justdoi:10.37044/osf.io/m37f2_v1" /><summary type="html"><![CDATA[WikiPathways was founded in 2008, in the year I left Wageningen (and we Nijmegen) and moved to Uppsala, Sweden. When we dediced to move back to The Netherlands in 2012, I got to opportunity to join the Department of Bioinformatics (BiGCaT) and work on Open PHACTS. I had visited the group in March 2011 because I had a COST action workshop near Maastricht (about nanoQSAR) and the bioinformatics group did WikiPathways.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/wikipathways_authorList.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/wikipathways_authorList.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">PlantMetWiki: a linked open data service for querying and analyzing plant pathway knowledge</title><link href="https://chem-bla-ics.linkedchemistry.info/2026/01/05/plantmetwiki.html" rel="alternate" type="text/html" title="PlantMetWiki: a linked open data service for querying and analyzing plant pathway knowledge" /><published>2026-01-05T00:00:00+00:00</published><updated>2026-01-05T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2026/01/05/plantmetwiki</id><content type="html" 
xml:base="https://chem-bla-ics.linkedchemistry.info/2026/01/05/plantmetwiki.html"><![CDATA[<p>Back in October I presented <em>Everything you always wanted to know: plant pathway modelling in WikiPathways</em> (doi:<a href="https://doi.org/10.5281/zenodo.18149988">10.5281/zenodo.18149988</a>)
at the <em>Knowledge Graphs for Plant and Microbiome Multiomics</em> symposium (see <a href="https://web.archive.org/web/20260105060309/https://www.linkedin.com/posts/elena-del-pup-840805164_knowledgegraphs-plantbiology-fairdata-activity-7351538387108978689-bgqh/">this archived LinkedIn post</a>)
on 14th October 2025 (<a href="https://www.youtube.com/watch?v=NgYRHiuBvpc">youtube recording</a>).
I had not yet found time to post about this meeting, but it had an awesome list of speakers, despite the regrettable absence of some others, and resulted
in new contacts and some slowly evolving collaborations.</p>

<p>Previously, plant pathways were somewhat negatively prioritized at our BiGCaT research group. Something with Dutch academic politics. But that
was 10 years ago, and with the notion that human health very much involves the exposome, which includes life around humans, I think the
plant pathway science is important to human health. Even just the human health impacts of drops in biodiversity. Or the impact on our
nutrition supply chain of climate change.</p>

<p>Anyway, I am happy that <a href="https://github.com/elenadelpup">Elena</a> and <a href="https://github.com/DeniseSl22">Denise</a>
pulled me into a <a href="https://github.com/pathway-lod">collaboration</a> to create an RDF-based knowledge graph about plant pathways.
Their idea was to take <a href="https://plantcyc.org/">PlantCyc</a> pathways (their license seems to allow that; doi:<a href="https://doi.org/10.1093/nar/gkae991">10.1093/nar/gkae991</a>),
convert that to GPML (<a href="https://github.com/pathway-lod/Cyc_to_wiki">by Max</a>) and then to RDF. That last step is where I come in. The details will follow later, but Elena announced
the project on LinkedIn (<a href="https://web.archive.org/web/20260105060958/https://www.linkedin.com/feed/update/urn:li:activity:7407756920041713664/">archived link</a>),
so time to blog about it myself too.</p>

<p>I am happy with this effort, not just because we now have pathways in RDF form for more than 500 species, but also
because it requires continued development of the WikiPathways solutions, like GPML and
<a href="https://github.com/PathVisio/libGPML">libGPML</a> and the RDF generation
code, but also BridgeDb (doi:<a href="https://doi.org/10.1186/1471-2105-11-5">10.1186/1471-2105-11-5</a>).
The latter provides the identifier mapping infrastructure, but needed to be extended for
the new species (something I had to do earlier this year for several <a href="https://www.wikipathways.org/search.html?query=caffeine+synthesis">caffeine synthesis pathways</a>
developed at the <a href="https://2025.biohackathon.org/">DBCLS BioHackathon 2025</a>).</p>

<p>Lars gave me a tip on how to scale this up (after <a href="https://github.com/bridgedb/datasources/commit/be64e5ac120d21fc70f742a090353fb801279b38">a manual addition</a>):
<a href="https://verifier.globalnames.org/">verifier.globalnames.org</a> (doi:<a href="https://doi.org/10.5281/zenodo.17245658">10.5281/zenodo.17245658</a>),
which greatly helped me out. It translates species names
into identifiers, and their JSON output is very rich as well as easy to process. So,
<a href="">a custom script</a> allowed me to update BridgeDb more efficiently. Highly recommended!</p>

<p>So, the resulting knowledge base is available at <a href="https://plantmetwiki.bioinformatics.nl/">plantmetwiki.bioinformatics.nl</a>
and looks like this (also big thanks to Marvin for support in setting this up!):</p>

<p><img src="/assets/images/plantmetwiki.png" alt="" /></p>]]></content><author><name>Egon Willighagen</name></author><category term="wikipathways" /><category term="gpml" /><category term="rdf" /><category term="justdoi:10.5281/zenodo.18149988" /><category term="justdoi:10.1093/nar/gkae991" /><category term="justdoi:10.1186/1471-2105-11-5" /><category term="justdoi:10.5281/zenodo.17245658" /><summary type="html"><![CDATA[Back on October I presented Everything you always wanted to know: plant pathway modelling in WikiPathways (doi:10.5281/zenodo.18149988) at the Knowledge Graphs for Plant and Microbiome Multiomics symposium (see this archived LinkedIn post) on 14th October 2025 (youtube recording). I had not found time yet to post about this meeting, but it was an awesome list of speakers, regrettable absense of some others, but resulting in new contacts and some slowly evolving collaborations.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/plantmetwiki.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/plantmetwiki.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Rescuing Scholia #2: getting closer</title><link href="https://chem-bla-ics.linkedchemistry.info/2025/12/31/rescuing-scholia-2-getting-close.html" rel="alternate" type="text/html" title="Rescuing Scholia #2: getting closer" /><published>2025-12-31T00:00:00+00:00</published><updated>2025-12-31T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2025/12/31/rescuing-scholia-2-getting-close</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2025/12/31/rescuing-scholia-2-getting-close.html"><![CDATA[<p>Three weeks ago, I wrote a the post <a href="https://chem-bla-ics.linkedchemistry.info/2025/12/08/rescuing-scholia.html">Rescuing Scholia: will we make it in time?</a>,
where I sketched a future without <a href="https://scholia.toolforge.org/">Scholia</a>. Scholia was started
<a href="https://chem-bla-ics.linkedchemistry.info/2023/01/27/scholia-timeline.html">almost 10 years ago</a>,
and I think it is worth keeping around longer.</p>

<p>Fortunately, it looks like we will have a working replacement in time before the
<a href="https://www.mediawiki.org/wiki/Wikidata_Query_Service">WDQS</a> instance with all the
<a href="https://wikidata.org/">Wikidata</a> triples in a single SPARQL endpoint goes down,
likely in a week or so (even though we may be behind <a href="https://openalex.org/works?page=1&amp;filter=cites:w2767995756">the citation peak</a>).</p>

<p>The work of the past year helped, for example by making it easier to configure Scholia for a different
endpoint and by loading panels asynchronously (reducing the stress on the SPARQL endpoint).
Already in September, Prof. <a href="https://github.com/hannahbast">Hannah Bast</a> started
<a href="https://github.com/WDscholia/scholia/pull/2715">a branch</a> for the transition, followed by various
hackathons this autumn and the work by <a href="https://github.com/KonradLinden">Konrad Linden</a>,
who explored and addressed some of the hurdles to take. The tips and suggestions from
Hannah and <a href="https://github.com/RobinTF">RobinTF</a> really made a difference. And also a huge thanks
to <a href="https://orcid.org/0000-0001-9488-1870">Daniel</a> who kept relentlessly pushing this forward.</p>

<p>When I posted my <a href="https://chem-bla-ics.linkedchemistry.info/2025/12/08/rescuing-scholia.html">will we make it</a> post,
there was a demo instance and a spreadsheet showing the state of each query. The instance
showed no human-readable labels. This was because the WDQS <code class="language-plaintext highlighter-rouge">wikibase:label</code> service 
was used a lot, and there is no replacement for that. Getting labels for all relevant
items is possible, but it makes the queries a lot heavier and made even more of them
run out of memory. Various solutions were <a href="https://github.com/ad-freiburg/scholia/issues/17">discussed</a>.
Finn indicated he <a href="https://github.com/ad-freiburg/scholia/issues/17#issuecomment-3605952951">preferred a macro solution</a>,
which <a href="https://github.com/ad-freiburg/scholia/pull/20/changes">Lars implemented</a> and which
saw some tweaks after that. Then followed a long series of patches, particularly by
<a href="https://github.com/pfps">Peter</a> to update all the SPARQL queries to have them use
the new labels macro. But plenty of other things were fixed or newly implemented,
such as <a href="https://github.com/WolfgangFahl">Wolfgang</a>’s <a href="https://qlever.scholia.wiki/backend">/backend</a>
page.</p>
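
<p>To illustrate the label problem, here is a minimal sketch (not the macro that Scholia ended up with) of asking for
<code class="language-plaintext highlighter-rouge">rdfs:label</code> explicitly instead of relying on the WDQS-only label service.
It assumes Bacting and its <code class="language-plaintext highlighter-rouge">sparqlRemote</code> method; the endpoint URL is taken from
the first command line argument, and the author item (<code class="language-plaintext highlighter-rouge">wd:Q42</code>) and the
<code class="language-plaintext highlighter-rouge">LIMIT</code> are just placeholders for this example:</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@Grab(group='io.github.egonw.bacting', module='managers-rdf', version='1.0.4')

rdf = new net.bioclipse.managers.RDFManager(".");

// assumption of this sketch: the SPARQL endpoint URL is given on the command line
endpoint = args[0]

// The WDQS-only pattern that needs replacing looks like this:
//   SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
// The portable alternative below asks for the English rdfs:label directly.
// The prefixes are declared explicitly, because not every endpoint predefines them.
sparql = """
PREFIX wd: &lt;http://www.wikidata.org/entity/&gt;
PREFIX wdt: &lt;http://www.wikidata.org/prop/direct/&gt;
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
SELECT ?work ?workLabel WHERE {
  ?work wdt:P50 wd:Q42 .
  OPTIONAL { ?work rdfs:label ?workLabel . FILTER(LANG(?workLabel) = "en") }
}
LIMIT 10
"""

results = rdf.sparqlRemote(endpoint, sparql)

// rows are counted from 1, as in the other Bacting scripts
for (i=1;i&lt;=results.rowCount;i++) {
  println "${results.get(i, "work")}\t${results.get(i, "workLabel")}"
}
</code></pre></div></div>

<p>Every item that needs a label gets its own <code class="language-plaintext highlighter-rouge">OPTIONAL</code> pattern this way,
which is why the queries get heavier, and why hiding that pattern behind a macro is attractive.</p>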

<p>So, with one week to go, we need your help: as the weekly
<a href="https://www.wikidata.org/wiki/Wikidata:Status_updates/2025_12_29">Wikidata Status Update</a>
already indicated:</p>

<blockquote>
  <p>this month’s Scholia hackathon has moved Scholia closer to its planned switch to a
QLever backend. Beta testers can assist by exploring the
<a href="https://qlever.scholia.wiki/">interim QLever-backed Scholia instance</a>
and <a href="https://github.com/WDscholia/scholia/issues">reporting any issues</a>.</p>
</blockquote>

<p>And thanks to <a href="https://github.com/Adafede">Adriano</a> and others who already have!</p>

<p>Now, we are not done yet. The real instance at <a href="https://scholia.toolforge.org/">scholia.toolforge.org</a>
has seen ridiculous abuse by scrapers (and the main instance is regularly unusable, to be honest),
and we have no idea whether the new setup is powerful enough. And we need to point to the new servers anyway.
So, plenty of work is left to be done in the next few days.</p>

<p>But we are getting close. So, please give <a href="https://qlever.scholia.wiki/">qlever.scholia.wiki</a>
a go, and let us know your observations. As <a href="https://en.wikipedia.org/wiki/Linus%27s_law">Linus’s law</a> states:</p>

<blockquote>
  <p>Given enough eyeballs, all bugs are shallow.</p>
</blockquote>]]></content><author><name>Egon Willighagen</name></author><category term="wikidata" /><category term="scholia" /><category term="sparql" /><category term="rdf" /><summary type="html"><![CDATA[Three weeks ago, I wrote a the post Rescuing Scholia: will we make it in time?, where I sketched a future without Scholia. Scholia, started almost 10 years ago and I think it is worth keeping around longer.]]></summary></entry><entry><title type="html">Rescuing Scholia: will we make it in time?</title><link href="https://chem-bla-ics.linkedchemistry.info/2025/12/08/rescuing-scholia.html" rel="alternate" type="text/html" title="Rescuing Scholia: will we make it in time?" /><published>2025-12-08T00:00:00+00:00</published><updated>2025-12-08T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2025/12/08/rescuing-scholia</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2025/12/08/rescuing-scholia.html"><![CDATA[<p>What <a href="https://chem-bla-ics.linkedchemistry.info/2023/01/27/scholia-timeline.html">started out in 2016 on Twitter</a> became a
<a href="https://meta.wikimedia.org/wiki/Coolest_Tool_Award/Full_history">(small) award winning</a>
<a href="https://chem-bla-ics.linkedchemistry.info/tag/scholia">decade long collaborative project</a>.
Unfortunately, the future is not clear. It is uncertain whether it will survive the growth of Wikidata
and in particular the <a href="https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split">SPARQL graph split</a>.
To be clear, the choice for Blazegraph initially worked great, but after it was bought by a big
company, development halted. Very unfortunate for Wikidata. Unlike earlier, we no longer have funding, and rewriting Scholia
at this scale takes a good bit of effort. We already
<a href="https://chem-bla-ics.linkedchemistry.info/2025/04/20/the-april-2025-scholia-hackathon.html">held a few hackathons</a>.</p>

<p>So far, we have been able to continue to use a <em>legacy</em> SPARQL endpoint with all the data, but in exactly one month
that endpoint will be sunset. And we are <strong>not</strong> ready.</p>

<h2 id="rescuing-scholia">Rescuing Scholia</h2>
<p>Daniel and Lane have been leading an effort to rescue Scholia. The hackathons were part of this effort. It seems
that <a href="https://en.wikipedia.org/wiki/QLever">QLever</a> is the only route left. Earlier efforts to rewrite the more
than 350 Scholia SPARQL queries to support the graph split have basically failed. The complexity is far too high.
QLever, however, provides the full graph and, since recently, full SPARQL 1.1 support. That is also not enough to
reproduce the full Scholia functionality, but it seems to get us far.
Importantly, the data may not update as frequently as the <a href="https://www.mediawiki.org/wiki/Wikidata_Query_Service">WDQS</a>,
and that is another complexity to take into account. Particularly, all the 404 pages.</p>

<p>So, in the next weeks, we have to complete rewriting all those queries as queries that QLever can handle. A team
of people have done great work already, <a href="https://github.com/ad-freiburg/scholia/issues?q=is%3Aissue%20author%3AKonradLinden">including Konrad</a>.</p>

<p>I hope we make it in time.</p>]]></content><author><name>Egon Willighagen</name></author><category term="scholia" /><category term="wikidata" /><category term="rdf" /><category term="sparql" /><summary type="html"><![CDATA[What started out in 2016 on Twitter became a (small) award winning decade long collaborative project. Unfortunately, the future is not clear. We are at odds if it will survice the growth of Wikidata and in particularly the SPARQL graph split. To be clear, the choice for Blazegraph initially worked great, but after it was bought by a big company, developed halted. Very unfortunate for Wikidata. Unlike earlier, we no longer have funding, and rewriting Scholia at this scale takes a good bit of effort. We already held a few hackathons.]]></summary></entry><entry><title type="html">Beilstein journals contain Bioschemas</title><link href="https://chem-bla-ics.linkedchemistry.info/2025/02/13/beiltein-journal-has-bioschemas.html" rel="alternate" type="text/html" title="Beilstein journals contain Bioschemas" /><published>2025-02-13T00:00:00+00:00</published><updated>2025-02-13T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2025/02/13/beiltein-journal-has-bioschemas</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2025/02/13/beiltein-journal-has-bioschemas.html"><![CDATA[<p>Two weeks ago, the <a href="https://www.beilstein-journals.org/bjoc/news/LAFGBV6PT5ASC5R7JOKSEXOQYM">Beilstein Institute announced Bioschemas support in their journals</a>:</p>

<blockquote>
  <p>We streamline the discoverability of your research by incorporating machine-readable chemical information into many of our published articles.
This includes the conversion of chemical structures from submitted ChemDraw files to InChI strings and validating them using open-source tools.</p>
</blockquote>

<p>The idea is far from new and has been around for two decades. But the <a href="https://scholia.toolforge.org/publisher/Q4881267">two Beilstein journals</a>
(both <a href="https://en.wikipedia.org/wiki/Diamond_open_access">diamond Open Access</a>) actually integrated it into their active publishing model.
That has been trialed and put into action before. For example, there was (is?) <a href="https://doi.org/10.59350/ne4rf-wey66">Project Prospect</a>
(2007), <a href="https://chem-bla-ics.linkedchemistry.info/2009/03/19/nature-chemistry-improves-publishing.html">chemical structure annotation in Nature Chemistry</a>
(2009), <a href="https://chem-bla-ics.linkedchemistry.info/2014/02/21/slow-publishing-innovation.html">SMILES in the ACS Journal of Medicinal Chemistry</a>
(2014) (doi:<a href="https://doi.org/10.1021/jm5002056">10.1021/jm5002056</a>),
and <em>FAIR chemical structures in the Journal of Cheminformatics</em> (2021) (doi:<a href="https://doi.org/10.1186/s13321-021-00520-4">10.1186/s13321-021-00520-4</a>).</p>

<p>But this announcement is a new step. I like how validation of the chemical structures is part of the approach, and I like
how they use the <a href="https://bioschemas.org/">Bioschemas</a> extension of <a href="https://schema.org/">schema.org</a>. The last because
they use two Bioschemas types/profiles that I contributed to or initiated, respectively: <a href="https://bioschemas.org/profiles/MolecularEntity/0.5-RELEASE">MolecularEntity</a>
and <a href="https://bioschemas.org/profiles/ChemicalSubstance/0.4-RELEASE">ChemicalSubstance</a>.</p>

<p>First stop for me is to check the schema.org annotation with a validation tool, like <a href="https://search.google.com/test/rich-results">Google’s Rich Results Test</a>.
That gives an idea of how their search engine may pick it up. The test article I was given on LinkedIn is
Xiao <em>et al.</em>’s <em>Molecular diversity of the reactions of MBH carbonates of isatins and various nucleophiles</em>
(doi:<a href="https://doi.org/10.3762/bjoc.21.21">10.3762/bjoc.21.21</a>) in the <a href="https://scholia.toolforge.org/venue/Q2894008">Beilstein Journal of Organic Chemistry</a>,
and we indeed <a href="https://search.google.com/test/rich-results/result?id=FRW9wBOpXtsMp9TLUV6SfQ">see the schema.org annotation show up</a>:</p>

<p><img src="/assets/images/bjoc_bioschemas.png" alt="" /></p>

<p>And because of the use of open standards, extracting the information is not so hard with, for example here,
Bacting (doi:<a href="https://doi.org/10.21105/joss.02558">10.21105/joss.02558</a>), based on a 2022 script from the NanoSafety Cluster
projects NanoCommons and SbD4Nano:</p>

<div class="language-groovy highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@Grab</span><span class="o">(</span><span class="n">group</span><span class="o">=</span><span class="s1">'io.github.egonw.bacting'</span><span class="o">,</span> <span class="n">module</span><span class="o">=</span><span class="s1">'managers-rdf'</span><span class="o">,</span> <span class="n">version</span><span class="o">=</span><span class="s1">'1.0.4'</span><span class="o">)</span>
<span class="nd">@Grab</span><span class="o">(</span><span class="n">group</span><span class="o">=</span><span class="s1">'io.github.egonw.bacting'</span><span class="o">,</span> <span class="n">module</span><span class="o">=</span><span class="s1">'managers-ui'</span><span class="o">,</span> <span class="n">version</span><span class="o">=</span><span class="s1">'1.0.4'</span><span class="o">)</span>
<span class="nd">@Grab</span><span class="o">(</span><span class="n">group</span><span class="o">=</span><span class="s1">'io.github.egonw.bacting'</span><span class="o">,</span> <span class="n">module</span><span class="o">=</span><span class="s1">'net.bioclipse.managers.jsoup'</span><span class="o">,</span> <span class="n">version</span><span class="o">=</span><span class="s1">'1.0.4'</span><span class="o">)</span>

<span class="n">bioclipse</span> <span class="o">=</span> <span class="k">new</span> <span class="n">net</span><span class="o">.</span><span class="na">bioclipse</span><span class="o">.</span><span class="na">managers</span><span class="o">.</span><span class="na">BioclipseManager</span><span class="o">(</span><span class="s2">"."</span><span class="o">);</span>
<span class="n">rdf</span> <span class="o">=</span> <span class="k">new</span> <span class="n">net</span><span class="o">.</span><span class="na">bioclipse</span><span class="o">.</span><span class="na">managers</span><span class="o">.</span><span class="na">RDFManager</span><span class="o">(</span><span class="s2">"."</span><span class="o">);</span>
<span class="n">jsoup</span> <span class="o">=</span> <span class="k">new</span> <span class="n">net</span><span class="o">.</span><span class="na">bioclipse</span><span class="o">.</span><span class="na">managers</span><span class="o">.</span><span class="na">JSoupManager</span><span class="o">(</span><span class="s2">"."</span><span class="o">);</span>

<span class="n">articles</span> <span class="o">=</span> <span class="o">[</span>
   <span class="n">args</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span>
<span class="o">]</span>

<span class="n">kg</span> <span class="o">=</span> <span class="n">rdf</span><span class="o">.</span><span class="na">createInMemoryStore</span><span class="o">()</span>

<span class="k">for</span> <span class="o">(</span><span class="n">article</span> <span class="k">in</span> <span class="n">articles</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">htmlContent</span> <span class="o">=</span> <span class="n">bioclipse</span><span class="o">.</span><span class="na">download</span><span class="o">(</span><span class="n">article</span><span class="o">)</span>

    <span class="n">htmlDom</span> <span class="o">=</span> <span class="n">jsoup</span><span class="o">.</span><span class="na">parseString</span><span class="o">(</span><span class="n">htmlContent</span><span class="o">)</span>

    <span class="c1">// application/ld+json</span>

    <span class="n">bioschemasSections</span> <span class="o">=</span> <span class="n">jsoup</span><span class="o">.</span><span class="na">select</span><span class="o">(</span><span class="n">htmlDom</span><span class="o">,</span> <span class="s2">"script[type='application/ld+json']"</span><span class="o">);</span>

    <span class="k">for</span> <span class="o">(</span><span class="n">section</span> <span class="k">in</span> <span class="n">bioschemasSections</span><span class="o">)</span> <span class="o">{</span>
        <span class="n">bioschemasJSON</span> <span class="o">=</span> <span class="n">section</span><span class="o">.</span><span class="na">html</span><span class="o">()</span>
        <span class="n">rdf</span><span class="o">.</span><span class="na">importFromString</span><span class="o">(</span><span class="n">kg</span><span class="o">,</span> <span class="n">bioschemasJSON</span><span class="o">,</span> <span class="s2">"JSON-LD"</span><span class="o">)</span>
    <span class="o">}</span>
<span class="o">}</span>

<span class="n">turtle</span> <span class="o">=</span> <span class="n">rdf</span><span class="o">.</span><span class="na">asTurtle</span><span class="o">(</span><span class="n">kg</span><span class="o">);</span>

<span class="n">println</span> <span class="s2">"#"</span> <span class="o">+</span> <span class="n">rdf</span><span class="o">.</span><span class="na">size</span><span class="o">(</span><span class="n">kg</span><span class="o">)</span> <span class="o">+</span> <span class="s2">" triples detected in the JSON-LD"</span>
<span class="c1">// println turtle</span>


<span class="n">sparql</span> <span class="o">=</span> <span class="s2">"""
PREFIX schema: &lt;http://schema.org/&gt;
SELECT ?entity ?inchikey ?smiles WHERE {
  ?entity a schema:MolecularEntity .
  OPTIONAL { ?entity schema:inChIKey ?inchikey }
  OPTIONAL { ?entity schema:smiles ?smiles }
}
"""</span>

<span class="n">results</span> <span class="o">=</span> <span class="n">rdf</span><span class="o">.</span><span class="na">sparql</span><span class="o">(</span><span class="n">kg</span><span class="o">,</span> <span class="n">sparql</span><span class="o">)</span>

<span class="k">for</span> <span class="o">(</span><span class="n">i</span><span class="o">=</span><span class="mi">1</span><span class="o">;</span><span class="n">i</span><span class="o">&lt;=</span><span class="n">results</span><span class="o">.</span><span class="na">rowCount</span><span class="o">;</span><span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
  <span class="n">println</span> <span class="s2">"${results.get(i, "</span><span class="n">inchikey</span><span class="s2">")}\t${results.get(i, "</span><span class="n">smiles</span><span class="s2">")}"</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The output is a simple table:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MGAPJMNPGGTFHJ-JEIPZWNWSA-N     CN1C(=O)/C(=C/2\C3=CC(=CC=C3N(CC4=CC=CC=C4)C2=O)Cl)/C(=P(C5=CC=CC=C5)(C6=CC=CC=C6)C7=CC=CC=C7)C1=O
XEWMQVUVGAHESA-UHFFFAOYSA-N     CC1=CC=C(C=C1)NC2=C(C3C4=CC(=CC=C4N(CC5=CC=CC=C5)C3=O)C)C(=O)N(C)C2=O
UVTJORFYHPGJDZ-PYCFMQQDSA-N     CCCCN1C2=CC=C(C)C=C2/C(=C(\C#N)/CNC3=CC=C(C)C=C3)/C1=O
ILWGDUYVQRAMMG-PGMHBOJBSA-N     CCCCN1C2=CC=C(C)C=C2/C(=C(\C#N)/CNC3=CC=C(C=C3)Cl)/C1=O
CAFIBKBZWJFZCW-FXBPSFAMSA-N     CCCCN1C2=CC=C(C)C=C2/C(=C(\C#N)/CNC3=CC=CC=C3)/C1=O
UOJSFLANMVIMBV-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NC4=CC=C(C=C4)Cl)C1=O
VNJBTGZXAGHCSO-OAPYJULQSA-N     COC(=O)/C(=C\1/C2=C(C=CC=C2)N(CC3=CC=CC=C3)C1=O)/C=P(C4=CC=CC=C4)(C5=CC=CC=C5)C6=CC=CC=C6
KJXQRAKSOANQTJ-GFMRDNFCSA-N     CC1=CC=C(C=C1)NC/C(=C\2/C3=C(C=CC=C3)N(CC4=CC=CC=C4)C2=O)/C#N
IGEBJMZDOPBFGF-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NC4=CC=CC=C4)C1=O
SSANVPNESOMKOM-AWQADKOQSA-N     C1=CC=C(C=C1)CN2C3=CC=C(C=C3/C(=C(/C#N)\C=P(C4=CC=CC=C4)(C5=CC=CC=C5)C6=CC=CC=C6)/C2=O)Cl
GEHWHSHQSIOZKL-NVQSTNCTSA-N     CCCCN1C2=CC=C(C=C2/C(=C\3/C(=P(C4=CC=CC=C4)(C5=CC=CC=C5)C6=CC=CC=C6)C(=O)N(C)C3=O)/C1=O)Cl
PALRSQOHFLRWDH-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NC4=CC=C(C=C4)OC)C1=O
KBFODZMDSAFLFR-UHFFFAOYSA-N     CN1C(=O)C(=C(C1=O)NC2=CC(=CC=C2)Cl)C3C4=CC(=CC=C4N(CC5=CC=CC=C5)C3=O)Cl
JCGAVVZYXDJPBU-GFMRDNFCSA-N     CC1=C(C=CC=C1)NC/C(=C\2/C3=C(C=CC=C3)N(CC4=CC=CC=C4)C2=O)/C#N
DZFPCPDEQGLPLY-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NC4=CC=C(C)C=C4)C1=O
XMRNJCJUOXYXJU-DAFNUICNSA-N     CC1=CC=C(C=C1)NC/C(=C\2/C3=CC(=CC=C3N(CC4=CC=CC=C4)C2=O)C)/C#N
SSDSNBBHEUUKGI-UHFFFAOYSA-N     CC1=CC=C2C(=C1)C(C3=C(C(=O)N(C)C3=O)N(C)C4=CC=CC=C4)C(=O)N2CC5=CC=CC=C5
USFYPRDMNXMWPO-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NC4=CC=C(C=C4)Br)C1=O
XYHTWFULRHTEAG-MUGXBBEHSA-N     CCCCN1C2=CC=C(C)C=C2/C(=C(/C#N)\C=P(C3=CC=CC=C3)(C4=CC=CC=C4)C5=CC=CC=C5)/C1=O
XALDZIBHNNIVAM-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NC4=C(C=CC=C4)O)C1=O
TUTWQHBRQPMLME-OAPYJULQSA-N     COC(=O)/C(=C\1/C2=CC(=CC=C2N(CC3=CC=CC=C3)C1=O)Cl)/C=P(C4=CC=CC=C4)(C5=CC=CC=C5)C6=CC=CC=C6
IYEHFTMZZMIPRU-UHFFFAOYSA-N     CC1=CC=C(C=C1)NC2=C(C3C4=CC(=CC=C4N(CC5=CC=CC=C5)C3=O)Cl)C(=O)N(C)C2=O
KBSDGNPLIPXCEX-UHFFFAOYSA-N     CCCCN1C2=CC=C(C)C=C2C(C3=C(C(=O)N(C)C3=O)NCC4=CC=CC=C4)C1=O
BQGIUMITIGHBSD-UHFFFAOYSA-N     CCCCNC1=C(C2C3=CC(=CC=C3N(CC4=CC=CC=C4)C2=O)C)C(=O)N(C)C1=O
PNSOLOPHIVUPOZ-MNDPQUGUSA-N     CCCCNC/C(=C\1/C2=CC(=CC=C2N(CCCC)C1=O)C)/C#N
HLTBKJRJOIZCMJ-PYCFMQQDSA-N     CCCCN1C2=CC=C(C)C=C2/C(=C(\C#N)/CN(C)C3=CC=CC=C3)/C1=O
FFLHFLUBMRBQTB-UHFFFAOYSA-N     CCCCN1C2=CC=C(C=C2C(C3=C(C(=O)N(C)C3=O)NC4=CC=C(C)C=C4)C1=O)F
FOQOVOLYYARWPA-NKFKGCMQSA-N     C1=CC=C(C=C1)CN2C3=C(C=CC=C3)/C(=C(\C#N)/CNC4=CC(=CC=C4)Cl)/C2=O
KLEPCAQFOXJLNV-UHFFFAOYSA-N     CC1=C(C=CC=C1)NC2=C(C3C4=CC(=CC=C4N(CC5=CC=CC=C5)C3=O)Cl)C(=O)N(C)C2=O
</code></pre></div></div>

<p>That also made me realize that there are no chemical names in the annotation. That would be really useful to move things
forward. Then again, PubChem will likely just generate the IUPAC name, since they have access to such software anyway.
They have teamed up with PubChem which will index it, but I will be interested in seeing how to use this for
<code class="language-plaintext highlighter-rouge">main subject</code> annotation in <a href="https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry">Wikidata</a>.</p>

<p>A final note for now: the model they use is to annotate the article with chemical substances (<code class="language-plaintext highlighter-rouge">ChemicalSubstance</code>) with
(one or more?) molecular entities (<code class="language-plaintext highlighter-rouge">MolecularEntity</code>). That is a model that scales well to their other journal,
the <a href="https://scholia.toolforge.org/venue/Q814756">Beilstein Journal of Nanotechnology</a>. But scraping that is for another post.</p>]]></content><author><name>Egon Willighagen</name></author><category term="bioschemas" /><category term="rdf" /><category term="chemistry" /><category term="cito:citesForInformation:10.59350/ne4rf-wey66" /><category term="cito:citesForInformation:10.1186/s13321-021-00520-4" /><category term="cito:citesForInformation:10.1021/jm5002056" /><category term="cito:usesDataFrom:10.3762/bjoc.21.21" /><category term="cito:usesMethodIn:10.21105/joss.02558" /><category term="cito:citesForInformation:10.59350/40377-hz881" /><category term="beilstein" /><summary type="html"><![CDATA[Two weeks ago, the Beilstein Institute announced Bioschemas support in their journals:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/bjoc_bioschemas.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/bjoc_bioschemas.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Two meetings: ELIXIR Toxicology and FAIR4ChemNL</title><link href="https://chem-bla-ics.linkedchemistry.info/2024/06/10/two-meetings.html" rel="alternate" type="text/html" title="Two meetings: ELIXIR Toxicology and FAIR4ChemNL" /><published>2024-06-10T00:00:00+00:00</published><updated>2024-06-10T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2024/06/10/two-meetings</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2024/06/10/two-meetings.html"><![CDATA[<p>Noting that in the coming week I am not attending the <a href="https://elixir-europe.org/events/elixir-all-hands-2024">ELIXIR All Hands in Uppsala</a>.
Having lived in (and around) Uppsala for more than three years, I am disappointed and with the first stories from colleagues coming
in even more. But it has been a way too busy year, I have much to finish up, and I need to take care of myself too. I am not 32 anymore.</p>

<p>But in the past two weeks I did attend two workshops. The first was a <a href="https://www.aanmelder.nl/intoxicom2024firstworkshop">workshop</a> by the
<a href="https://elixir-europe.org/communities/toxicology">ELIXIR Toxicology Community</a>, which was held in Utrecht/NL. The programme was around
FAIR and included two really nice hands-on sessions where we developed drafts for <a href="https://faircookbook.elixir-europe.org/">FAIR Cookbook</a>
recipes (see also doi:<a href="https://doi.org/10.1038/s41597-023-02166-3">10.1038/s41597-023-02166-3</a>) and for
<a href="https://www.go-fair.org/how-to-go-fair/fair-implementation-profile/">FAIR Implementation Profiles</a>
(doi:<a href="https://doi.org/10.1007/978-3-030-65847-2_13">10.1007/978-3-030-65847-2_13</a>). We will write up a
<a href="https://biohackrxiv.org/discover">BioHackrXiv</a> report.</p>

<p>The second workshop was last week, the <a href="https://tdcc.nl/evenementen/fair4chemnl-workshop/">FAIR4ChemNL workshop</a>, which was also held
in Utrecht/NL. The topic was FAIR in chemistry, and we discussed various aspects. There was a significant participant group from the
German NFDI4Cat project (“Cat” is short for (chemical) catalysis), which recently published a nice analysis of several ontologies
(doi:<a href="https://doi.org/10.1186/s13321-024-00807-2">10.1186/s13321-024-00807-2</a>). And there was also a lot of mention of RDF and SPARQL.</p>

<p>I think it is time for a new special issue around semantic web technologies.</p>]]></content><author><name>Egon Willighagen</name></author><category term="elixir" /><category term="fair" /><category term="chemistry" /><category term="doi:10.1038/S41597-023-02166-3" /><category term="justdoi:10.1007/978-3-030-65847-2_13" /><category term="justdoi:10.1186/S13321-024-00807-2" /><category term="rdf" /><category term="sparql" /><category term="fair4chemnl" /><summary type="html"><![CDATA[Noting that in the coming week I am not attending the ELIXIR All Hands in Uppsala. Having lived in (and around) Uppsala for more than three years, I am disappointed and with the first stories from colleagues coming in even more. But it has been a way too busy year, I have much to finish up, and I need to take care of myself too. I am not 32 anymore.]]></summary></entry><entry><title type="html">New paper: A template wizard for the cocreation of machine-readable data-reporting to harmonize the evaluation of (nano)materials</title><link href="https://chem-bla-ics.linkedchemistry.info/2024/05/27/from-spreadsheets-to-rdf.html" rel="alternate" type="text/html" title="New paper: A template wizard for the cocreation of machine-readable data-reporting to harmonize the evaluation of (nano)materials" /><published>2024-05-27T00:00:00+00:00</published><updated>2024-05-27T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2024/05/27/from-spreadsheets-to-rdf</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2024/05/27/from-spreadsheets-to-rdf.html"><![CDATA[<p>I was about to call this blog post <em>From spreadsheets to RDF</em>, after <a href="https://chem-bla-ics.linkedchemistry.info/2024/05/20/from-papers-to-rdf.html">the post last week</a>.
But then I decided to just use the pattern I typically use. Why I wanted to use that shorter term in the first
place was that one of the things I like about the <a href="https://sourceforge.net/projects/ambit/">AMBIT software</a>
(of OpenTox and eNanoMapper fame) is its
RDF support (see doi:<a href="https://doi.org/10.1186/1756-0500-4-487">10.1186/1756-0500-4-487</a>). But
<a href="https://chem-bla-ics.linkedchemistry.info/tag/rdf">RDF</a>, ontologies,
those are hard things. And unlike mathematics, we do not have simple objects like integer numbers or simple
operators. Well, I think we do, and we talk about them. But there is no obligatory education. Just like
any biologist needs to know what <em>1 + 2</em> means, I think any biologist needs basic knowledge about how
knowledge graphs work. But sometimes that feels like a taboo, like cursing in the life sciences church.</p>

<p>So, there we are. This is where spreadsheets come in. If done well, they combine aspects of knowledge graphs
with usability and can even cover a good bit of the learnability. This is what is described in this new
paper about templates in the <a href="https://www.nanosafetycluster.eu/">EU NanoSafety Cluster</a>: <em>A template wizard
for the cocreation of machine-readable data-reporting to harmonize the evaluation of (nano)materials</em>
(doi:<a href="https://doi.org/10.1038/s41596-024-00993-1">10.1038/s41596-024-00993-1</a>).</p>

<p>The learnability comes in with the spreadsheet templates (“this is how we did it”) and a “wizard” around
it guides the user with the selection of a template but also can provide feedback on the template. The
technical term for that is “validator”, but it can be thought of as a spelling checker. Computers are good at
finding contradictions (the lack of a pattern), though less good at ranking the alternatives (which is
the cause of hallucinations in AI approaches).</p>

<p>And to return to the RDF, software like AMBIT can read these templates, use the semantics linked to the
template, and make the FAIR static spreadsheets (good for archiving on Zenodo!) available as FAIR interactive
data (good for exploration and machine learning), and as RDF (good for data integration).</p>

<p>Congrats to <a href="http://orcid.org/0000-0002-4322-6179">Nina</a> and the various EU NanoSafety Cluster projects!</p>]]></content><author><name>Egon Willighagen</name></author><category term="rdf" /><category term="opentox" /><category term="fair" /><category term="doi:10.1186/1756-0500-4-487" /><category term="doi:10.1038/S41596-024-00993-1" /><summary type="html"><![CDATA[I was about to call this blog post From spreadsheets to RDF, after the post last week. But then I decided to just use the pattern I typically use. Why I wanted to use that shorter term in the first place was that one of the thing I like about the AMBIT software (of OpenTox and eNanoMapper fame) is its RDF support (see doi:10.1186/1756-0500-4-487). But RDF, ontologies, those are hard things. And unlike mathematics, we do not have simple objects like integer numbers or simple operators. Well, I think we do, and we talk about them. But there is no obligatory education. Just like any biologist needs to know what 1 + 2 means, I think any biologist needs basic knowledge about how knowledge graphs work. But sometimes feels like a taboo, like cursing in the life sciences church.]]></summary></entry><entry><title type="html">New paper: From papers to RDF-based integration of physicochemical data and adverse outcome pathways for nanomaterials</title><link href="https://chem-bla-ics.linkedchemistry.info/2024/05/20/from-papers-to-rdf.html" rel="alternate" type="text/html" title="New paper: From papers to RDF-based integration of physicochemical data and adverse outcome pathways for nanomaterials" /><published>2024-05-20T00:00:00+00:00</published><updated>2024-05-20T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2024/05/20/from-papers-to-rdf</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2024/05/20/from-papers-to-rdf.html"><![CDATA[<p>Making something FAIR is hard, particularly when you do more than making something findable. We’ve seen before that
making something usefully findable <a href="https://chem-bla-ics.blogspot.com/2020/10/new-paper-semi-automated-workflow-for.html">requires deep indexing</a>,
and already that continues to be difficult, because we are not seeing it enough.
So, when I thought to convert a <a href="https://chem-bla-ics.blogspot.com/2021/05/new-strategy-towards-generation-of.html">paper led by Hoet’s lab in Leuven</a>
into machine-actionable RDF to make it FAIR, I gravely underestimated the amount of work.
<a href="https://scholia.toolforge.org/author/Q99306396">Jeaphianne</a> et al. did an awesome job on this work
(doi:<a href="https://doi.org/10.1186/s13321-024-00833-0">10.1186/s13321-024-00833-0</a>).</p>

<p>The idea was simple: write up which nanomaterial (type) activates which molecular initiating event.
It would simply annotate each material with a unique identifier to link it to databases like
<a href="https://enanomapper.adma.ai/">eNanoMapper</a> and <a href="https://doi.org/10.3389/fphy.2023.1271842">NanoCommons</a>
and it would use unique identifiers for the
<a href="https://chem-bla-ics.blogspot.com/2022/05/new-providing-adverse-outcome-pathways.html">Adverse Outcome Pathway</a> (AOP) key events.
As such, it would make a direct link in the growing linked open data cloud between the AOPs
and the nanomaterial databases.</p>

<p>Unfortunately, it was quickly discovered that actually reusing this new dataset requires rich annotation (metadata!)
of the materials, and the materials from the source paper were not yet in material databases.
And so the cumbersome work started, resulting in a very rich data model describing the
key events, the materials, the assays used, and the original papers themselves:</p>

<p><img src="/assets/images/13321_2024_833_Fig1_HTML.png" alt="" /></p>

<p>But the work is not finished yet. The paper assigned <a href="https://chem-bla-ics.blogspot.com/2022/09/nanomaterial-identifiers-erm-identifier.html">ERM identifiers</a>
to all included materials, and now these need to be added to the new <a href="https://nanocommons.github.io/erm-database/">ERM Identifier Database</a>
under development.</p>]]></content><author><name>Egon Willighagen</name></author><category term="fair" /><category term="rdf" /><category term="doi:10.1186/S13321-024-00833-0" /><category term="doi:10.14573/ALTEX.2102191" /><category term="doi:10.3390/NANO10102068" /><category term="erm" /><category term="doi:10.1186/S13321-022-00614-7" /><category term="doi:10.3389/FPHY.2023.1271842" /><category term="doi:10.3762/BJNANO.6.165" /><category term="doi:10.1089/AIVT.2021.0010" /><summary type="html"><![CDATA[Making something FAIR is hard, particularly when you do more than making something findable. We’ve seen before that making something usefully findable requires deep indexing, and already that continues to be difficult, because we are not seeing it enough. So, when I thought convert a paper led by Hoet’s lab in Leuven into machine-actionable RDF to make it FAIR, I gravely underestimated the amount of work. Jeaphianne et al. did an awesome job on this work (doi:10.1186/s13321-024-00833-0).]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/13321_2024_833_Fig1_HTML.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/13321_2024_833_Fig1_HTML.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Boiling points in Wikidata</title><link href="https://chem-bla-ics.linkedchemistry.info/2023/08/12/boiling-points-in-wikidata.html" rel="alternate" type="text/html" title="Boiling points in Wikidata" /><published>2023-08-12T00:00:00+00:00</published><updated>2023-08-12T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2023/08/12/boiling-points-in-wikidata</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2023/08/12/boiling-points-in-wikidata.html"><![CDATA[<p>Some days ago, I started added boiling points to <a href="https://wikidata.org/">Wikidata</a>, referenced from
<a href="https://scholia.toolforge.org/work/Q22236188">Basic Laboratory and Industrial Chemicals</a> (wikidata:Q22236188),
<a href="https://scholia.toolforge.org/author/Q18609741">David R. Lide</a>’s
‘a CRC quick reference handbook’ from 1993 (well, the edition I have). But Wikidata
<a href="https://www.wikidata.org/wiki/User_talk:Egon_Willighagen#Basic_laboratory_and_industrial_chemicals:_a_CRC_quick_reference_handbook_(Q22236188)">wants</a>
pressure (wikidata:P2077) info at which the boiling point (wikidata:P2102) was measured. Rightfully so. But I had not added those yet,
because it slows me down and can be automated with <a href="https://quickstatements.toolforge.org/">QuickStatements</a>.</p>

<p>I just need a few SPARQL queries to list the statements to which the qualifier needs to be added. Basically, all boiling points that have the
book as a reference and that do not have the pressure info. First, there are values with ‘unknown value’, which results in blank nodes
(by the time you read this, they likely are already fixed):</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span><span class="w"> </span><span class="nv">?cmp</span><span class="w"> </span><span class="nv">?bp</span><span class="w"> </span><span class="nv">?pressure</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="nv">?cmp</span><span class="w"> </span><span class="nn">p</span><span class="o">:</span><span class="ss">P2102</span><span class="w"> </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="nn">prov</span><span class="o">:</span><span class="ss">wasDerivedFrom</span><span class="o">/</span><span class="nn">pr</span><span class="o">:</span><span class="ss">P248</span><span class="w"> </span><span class="nn">wd</span><span class="o">:</span><span class="ss">Q22236188</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">ps</span><span class="o">:</span><span class="ss">P2102</span><span class="w"> </span><span class="nv">?bp</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="nn">pq</span><span class="o">:</span><span class="ss">P2077</span><span class="w"> </span><span class="nv">?pressure</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="k">FILTER</span><span class="w"> </span><span class="p">(</span><span class="nb">contains</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="nv">?pressure</span><span class="p">),</span><span class="w"> </span><span class="s2">"http://"</span><span class="p">))</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
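
<p>The string match on “http://” depends on how the query service happens to serialize these unknown values. A possibly more explicit
variant, as a minimal sketch and assuming the WDQS-specific function wikibase:isSomeValue() is available on the endpoint (it is not
part of standard SPARQL), could look like this:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch only: wikibase:isSomeValue() is a Wikidata Query Service extension
SELECT ?cmp ?bp ?pressure WHERE {
  ?cmp p:P2102 ?bpStatement .
  ?bpStatement prov:wasDerivedFrom/pr:P248 wd:Q22236188 ;
    ps:P2102 ?bp ;
    pq:P2077 ?pressure .
  # keep only pressure qualifiers set to 'unknown value'
  FILTER (wikibase:isSomeValue(?pressure))
}
</code></pre></div></div>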

<p>So, to get the list of compounds for which I want to write the QuickStatements, those whose boiling point statement does not have any P2077 qualifier yet, I use
<a href="https://query.wikidata.org/#SELECT%20%3Fcmp%20WHERE%20%7B%0A%20%20%3Fcmp%20p%3AP2102%20%3FbpStatement%20.%0A%20%20%3FbpStatement%20prov%3AwasDerivedFrom%2Fpr%3AP248%20wd%3AQ22236188%20%3B%0A%20%20%20%20ps%3AP2102%20%3Fbp%20.%0A%20%20MINUS%20%7B%20%3FbpStatement%20pq%3AP2077%20%3Fpressure%20%7D%0A%7D">this query</a>:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span><span class="w"> </span><span class="nv">?cmp</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="nv">?cmp</span><span class="w"> </span><span class="nn">p</span><span class="o">:</span><span class="ss">P2102</span><span class="w"> </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="nn">prov</span><span class="o">:</span><span class="ss">wasDerivedFrom</span><span class="o">/</span><span class="nn">pr</span><span class="o">:</span><span class="ss">P248</span><span class="w"> </span><span class="nn">wd</span><span class="o">:</span><span class="ss">Q22236188</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">ps</span><span class="o">:</span><span class="ss">P2102</span><span class="w"> </span><span class="nv">?bp</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="k">MINUS</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="nn">pq</span><span class="o">:</span><span class="ss">P2077</span><span class="w"> </span><span class="nv">?pressure</span><span class="w"> </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>At the time of writing, this lists 54 boiling points.</p>
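
<p>The query service can also report that number directly. A minimal sketch, reusing the same pattern and counting distinct statements
(the variable name ?missing is only chosen here for illustration, and the count may differ from the number of rows listed above
if a compound has more than one matching statement):</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch only: counts boiling point statements from the book that lack a pressure qualifier
SELECT (COUNT(DISTINCT ?bpStatement) AS ?missing) WHERE {
  ?cmp p:P2102 ?bpStatement .
  ?bpStatement prov:wasDerivedFrom/pr:P248 wd:Q22236188 .
  MINUS { ?bpStatement pq:P2077 ?pressure }
}
</code></pre></div></div>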

<p>I can let the WDQS create CSV-styled QuickStatements with:</p>

<div class="language-sparql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span><span class="w"> </span><span class="p">(</span><span class="nb">SUBSTR</span><span class="p">(</span><span class="nb">STR</span><span class="p">(</span><span class="nv">?cmp</span><span class="p">),</span><span class="mi">32</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="nv">?qid</span><span class="p">)</span><span class="w"> </span><span class="nv">?P2102</span><span class="w"> </span><span class="nv">?qal2077</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="nv">?cmp</span><span class="w"> </span><span class="nn">p</span><span class="o">:</span><span class="ss">P2102</span><span class="w"> </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="nn">prov</span><span class="o">:</span><span class="ss">wasDerivedFrom</span><span class="o">/</span><span class="nn">pr</span><span class="o">:</span><span class="ss">P248</span><span class="w"> </span><span class="nn">wd</span><span class="o">:</span><span class="ss">Q22236188</span><span class="w"> </span><span class="p">;</span><span class="w">
    </span><span class="nn">ps</span><span class="o">:</span><span class="ss">P2102</span><span class="w"> </span><span class="nv">?P2102</span><span class="w"> </span><span class="p">.</span><span class="w">
  </span><span class="k">MINUS</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nv">?bpStatement</span><span class="w"> </span><span class="nn">pq</span><span class="o">:</span><span class="ss">P2077</span><span class="w"> </span><span class="nv">?pressure</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="k">BIND</span><span class="w"> </span><span class="p">(</span><span class="s2">"101.325U21064807"</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="nv">?qal2077</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Here, the SPARQL variable names double as QuickStatements instructions. Finally, note the use of “U21064807”, which refers to the Wikidata item for
kilopascal (wikidata:Q21064807).</p>

<p>I also need to “add” the boiling point again, to make sure QuickStatements knows which statement to add the qualifier to. I think this
can be done better, but I am not sure how to target statements directly. This is not foolproof: I noted that this approach ignores the
situation where there are two statements with the (exact) same boiling point, but different error margins. But that I will monitor
and where needed correct manually.</p>]]></content><author><name>Egon Willighagen</name></author><category term="rdf" /><category term="wikidata" /><category term="chemistry" /><summary type="html"><![CDATA[Some days ago, I started added boiling points to Wikidata, referenced from Basic Laboratory and Industrial Chemicals (wikidata:Q22236188), David R. Lide’s ‘a CRC quick reference handbook’ from 1993 (well, the edition I have). But Wikidata wants pressure (wikidata:P2077) info at which the boiling point (wikidata:P2102) was measured. Rightfully so. But I had not added those yet, because it slows me and can be automated with QuickStatements.]]></summary></entry></feed>