<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/openbabel.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-06-07T16:43:55+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/openbabel.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">OSRA: GPL-ed molecule drawing to SMILES convertor</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/07/20/osra-gpl-ed-molecule-drawing-to-smiles.html" rel="alternate" type="text/html" title="OSRA: GPL-ed molecule drawing to SMILES convertor" /><published>2007-07-20T00:10:00+00:00</published><updated>2007-07-20T00:10:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/07/20/osra-gpl-ed-molecule-drawing-to-smiles</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/07/20/osra-gpl-ed-molecule-drawing-to-smiles.html"><![CDATA[<p>Igor wrote a message to the <a href="http://www.ccl.net/chemistry/sub_unsub.shtml">CCL mailing list</a> about
<a href="http://cactus.nci.nih.gov/osra/">OSRA</a>:</p>

<blockquote>
  <p>We would like to announce a new addition to the set of chemoinformatics tools available from the Computer-Aided Drug Design Group
at the NCI-Frederick. OSRA is a utility designed to convert graphical representations of chemical structures, such as they appear
in journal articles, patent documents, textbooks, trade magazines etc., into SMILES.<br /><br /></p>

  <p>OSRA can read a document in any of the over 90 graphical formats parseable by ImageMagick (GIF, JPEG, PNG, TIFF, PDF, PS etc.) and
generate the SMILES representation of the molecular structure images encountered within that document.</p>
</blockquote>

<p>The email does not give any information on the failure rate, but the demo they provide via the
<a href="http://cactus.nci.nih.gov/cgi-bin/osra/index.cgi">web interface</a> does show some minor glitches (the bromine is not recognized):</p>

<p><img src="/assets/images/osra.png" alt="" /></p>

<p>The source code reuses <a href="http://openbabel.sf.net/">OpenBabel</a> and is released under the GPL license. Its value is equal to that of text mining tools like
<a href="https://chem-bla-ics.linkedchemistry.info/2006/06/22/text-mining-for-chemistry-using-oscar3.html">OSCAR3 <i class="fa-solid fa-recycle fa-xs"></i></a>,
and together they sound like the Jordan and Pippen of mining chemical literature.</p>]]></content><author><name>Egon Willighagen</name></author><category term="cheminf" /><category term="openbabel" /><summary type="html"><![CDATA[Igor wrote a message to the CCL mailing list about OSRA:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://chem-bla-ics.linkedchemistry.info/assets/images/osra.png" /><media:content medium="image" url="https://chem-bla-ics.linkedchemistry.info/assets/images/osra.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">InChI’s in LaTex and CDK News</title><link href="https://chem-bla-ics.linkedchemistry.info/2006/03/31/inchis-in-latex-and-cdk-news.html" rel="alternate" type="text/html" title="InChI’s in LaTex and CDK News" /><published>2006-03-31T00:00:00+00:00</published><updated>2006-03-31T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2006/03/31/inchis-in-latex-and-cdk-news</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2006/03/31/inchis-in-latex-and-cdk-news.html"><![CDATA[<p>An <a href="http://www.iupac.org/inchi/">InChI</a> (or see the <a href="http://www.iupac.org/inchi/">FAQ</a>) is a line notation
for a molecular structure that was recently developed by <a href="http://www.nist.gov/">NIST</a> and
<a href="http://www.iupac.org/">IUPAC</a>. In principle, they can be applied to proteins too (see below), but because
proteins would give lengthy InChI’s and are quite well defined in terms of connectivity anyway, they are
better described by their amino acid sequence.</p>

<p>The March 2006 issue of <a href="http://almost.cubic.uni-koeln.de/cdk/cdk_top/cdk_news/">CDK News</a>, the
<a href="http://cdk.sf.net/">Chemistry Development Kit</a> project newsletter, will be
<a href="http://sourceforge.net/project/showfiles.php?group_id=20024&amp;package_id=124796">released</a> later today,
and has, for the second time, the requirement that authors provide InChI’s for molecular structures mentioned in the articles.
What differs from the previous issue is how InChI’s are marked up in LaTeX. I’ve set up an <code class="language-plaintext highlighter-rouge">\inchi{}</code> command
for this that automatically creates a <a href="http://www.google.com/">Google</a> search query as a link behind the InChI:</p>

<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">\newcommand</span><span class="p">{</span><span class="k">\inchi</span><span class="p">}</span>[1]<span class="p">{</span>%
  <span class="k">\href</span><span class="p">{</span>http://www.google.com/search?q=#1<span class="p">}</span>%
       <span class="p">{</span><span class="k">\normalfont\texttt</span><span class="p">{</span>InChI=#1<span class="p">}}</span>%
<span class="p">}</span>
</code></pre></div></div>

<p>Now, googling for InChI’s only works if one removes the <code class="language-plaintext highlighter-rouge">InChI=</code> part of the InChI. As an example I will show how it works
for methane. The InChI for this compound is <code class="language-plaintext highlighter-rouge">InChI=1/CH4/h1H4</code>, so in LaTeX one enters <code class="language-plaintext highlighter-rouge">\inchi{1/CH4/h1H4}</code>.
This will create a link like: <a href="http://www.google.com/search?q=1/CH4/h1H4">InChI=1/CH4/h1H4</a>.</p>
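
<p>For readers who want to try this, here is a minimal sketch of a complete document. It assumes the <code class="language-plaintext highlighter-rouge">hyperref</code> package, which provides <code class="language-plaintext highlighter-rouge">\href</code>; the <code class="language-plaintext highlighter-rouge">article</code> class is only a placeholder and not what CDK News necessarily uses. The macro is the one shown above, written with <code class="language-plaintext highlighter-rouge">%</code> at the line ends so that no stray spaces end up in the link text:</p>

<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>\documentclass{article}  % placeholder class, an assumption
\usepackage{hyperref}    % provides \href

% Takes the InChI without its "InChI=" prefix, prints it with the
% prefix, and links it to a Google search for the bare string.
\newcommand{\inchi}[1]{%
  \href{http://www.google.com/search?q=#1}%
       {\normalfont\texttt{InChI=#1}}%
}

\begin{document}
Methane: \inchi{1/CH4/h1H4}
\end{document}
</code></pre></div></div>

<p>Passing the InChI without the prefix keeps the search query and the printed text in sync: the macro adds <code class="language-plaintext highlighter-rouge">InChI=</code> only to the visible part.</p>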

<p>BTW, if you are interested in InChI’s for proteins, here is the InChI for <a href="http://www.pdb.org/pdb/explore.do?structureId=1CRN">1CRN</a>,
created with <a href="http://openbabel.sourceforge.net/">OpenBabel</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>InChI=1/C202H439N55O64S6/c1-28-92(12)149-188(308)237-127-84-323-324-
85-128(176(296)225-114(46-37-63-212-202(209)210)165(285)232-122(69-89(6)7)195(315)253-64-38-
47-132(253)179(299)215-80-143(274)241-158(107(27)265)199(319)257-68-42-51-136(257)182(302)226-
115(60-61-144(275)276)164(284)218-100(20)162(282)244-149)236-187(307)148(91(10)11)242-172(292)
120(74-138(204)269)229-168(288)117(70-108-43-34-33-35-44-108)228-169(289)119(73-137(203)268)
230-173(293)124(81-258)234-166(286)113(45-36-62-211-201(207)208)224-159(279)99(19)221-186(306)
147(90(8)9)243-189(309)150(93(13)29-2)245-174(294)125(82-259)235-183(303)135-50-41-66-255(135)
196(316)130-87-326-322-83-126(223-142(273)79-216-185(305)154(103(23)261)251-171(291)118(72-
110-54-58-112(267)59-55-110)231-192(312)155(104(24)262)250-163(283)101(21)220-175(127)295)178
(298)246-151(94(14)30-3)190(310)247-152(95(15)31-4)191(311)248-153(96(16)32-5)198(318)256-67-
40-49-134(256)181(301)213-77-140(271)217-97(17)161(281)249-156(105(25)263)194(314)240-131
(88-327-325-86-129(177(297)239-130)238-193(313)157(106(26)264)252-184(304)146(206)102(22)260)197
(317)254-65-39-48-133(254)180(300)214-78-141(272)222-121(76-145(277)278)170(290)227-116(71-
109-52-56-111(266)57-53-109)167(287)219-98(18)160(280)233-123(200(320)321)75-139(205)270/h89-
202,211-252,258-321H,28-88,203-210H2,1-27H3/t92-,93-,94-,95-,96-,97-,98-,99-,100-,101-,102+,
103+,104+,105+,106+,107+,109-,110-,111+,112+,113-,114-,115-,116-,117-,118-,119-,120-,121-,122-,
123-,124-,125-,126-,127-,128-,129-,130-,131-,132-,133-,134-,135-,136-,137?,138-,139-,140-,141+,
142-,143+,146-,147-,148-,149-,150-,151-,152-,153-,154-,155-,156-,157-,158-,159+,160?,161-,162?,
163-,164-,165?,166+,167?,168+,169+,170+,171-,172+,173+,174+,175?,176-,177?,178+,179+,180-,
181?,182-,183+,184?,185+,186+,187-,188-,189-,190+,191?,192-,193?,194-,195-,196-,197-,198-,199-/m0/s1
</code></pre></div></div>]]></content><author><name>Egon Willighagen</name></author><category term="inchi" /><category term="cdk" /><category term="cdknews" /><category term="iupac" /><category term="nist" /><category term="google" /><category term="protein" /><category term="openbabel" /><category term="inchikey:VNWKTOKETHGBQD-UHFFFAOYSA-N" /><summary type="html"><![CDATA[An InChI (or see the FAQ) is a line notation for a molecular structure that was recently developed by NIST and IUPAC. In principle, they can be applied to proteins too (see below), but because proteins would give lengthy InChI’s and are quite well defined in terms of connectivity anyway, they are better described by their amino acid sequence.]]></summary></entry></feed>