<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/copyright.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-06-15T12:00:19+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/copyright.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">Does ChemSpider really violate Open Data with CC SA?</title><link href="https://chem-bla-ics.linkedchemistry.info/2008/05/10/does-chemspider-really-violate-open.html" rel="alternate" type="text/html" title="Does ChemSpider really violate Open Data with CC SA?" /><published>2008-05-10T00:00:00+00:00</published><updated>2008-05-10T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2008/05/10/does-chemspider-really-violate-open</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2008/05/10/does-chemspider-really-violate-open.html"><![CDATA[<p><a href="http://www.chemspider.com/">ChemSpider</a> <a href="http://www.chemspider.com/blog/it-appears-chemspider-does-bad-by-using-creative-commons-licenses.html">is afraid</a>
they are doing something bad because they release their data as <a href="http://creativecommons.org/licenses/by-sa/3.0/">CC-BY-SA</a>.
Because, John Wilbanks says in Peter’s blog:</p>

<blockquote>
  <p>I would add to it that I’d like to see a meaningful discussion of the
risks of Share Alike and Attribution on <strong>data integration</strong>. Chemspider’s
move to CC-BY-SA fits into this discussion nicely - it’s a total
violation of the open data protocol we laid out at SC, which says “Don’t
Use CC Licenses on Data” - <strong>but it does conform inside the broader OKD.</strong></p>
</blockquote>

<p>Now, let’s take this into pieces.</p>

<ol>
  <li>John notes that ChemSpider is in compliance with the <a href="http://www.opendefinition.org/1.0/">OKD</a>. This means that ChemSpider thinks
about Open Data just like the <a href="http://en.wikipedia.org/wiki/Open_Knowledge_Foundation">Open Knowledge Foundation</a> does. I’ve scanned
through the OKD, and it indeed seems to support the BY and SA clauses of the CC licenses. So, ChemSpider did not do a bad thing.</li>
  <li>Data integration is tricky: you have to keep track of license information on an entry-by-entry level. For each fact, you have to keep track of the
source, and associate the source with its original license. For example, the <a href="http://www.nmrshiftdb.org/">NMRShiftDB</a>
information in ChemSpider should be <a href="http://www.gnu.org/copyleft/fdl.html">GNU FDL</a>.</li>
  <li>OpenX licenses may be viral. This holds for the <a href="http://www.gnu.org/licenses/gpl.html">GNU GPL</a> as well as for the CC-BY-SA.
Nothing new there. It just requires that when you would like to incorporate the ChemSpider data into a larger database, that database
has to be CC-BY-SA too, or likely at least CC-SA.</li>
</ol>

<p>Summarizing, I think ChemSpider did a good thing, and that ChemSpider does <strong>not</strong> violate the OpenData idea, but instead, that the CC-BY-SA and
the OKD violate John’s requirements for integrating data resources (apparently based on a two-year legal study). That has nothing to do with ChemSpider.</p>

<p>Now, people will always have different opinions on Openness. The original BSD clause had a
<a href="http://en.wikipedia.org/wiki/BSD_License#UC_Berkeley_advertising_clause">restrictive ‘advertisement’ clause</a>, not Open enough for at least the
<a href="http://www.debian.org/social_contract#guidelines">Debian Free Software Guidelines</a> (DFSG), while still open source. The clause was
later removed from the BSD license.</p>

<p>Another <a href="http://www.debian.org/">Debian</a> example is Firefox, which is named <a href="http://packages.debian.org/iceweasel">IceWeasel</a> in Debian,
because the ‘license’ on the Firefox name is not open enough.</p>

<p>Another problem with the definition of Openness is the viral aspect of some licenses (see earlier). For some, the GPL is not open enough,
because it does not give people the freedom to license their own software as they like, something the BSD and MIT licenses do allow.
There is ongoing debate (and that should be ongoing) on how much <em>freedom</em> a license must provide to be called Open. The whole OpenAccess
discussion is similar (see e.g. <a href="http://www.google.com/search?q=strong+weak+open+access+site%3Awwmm.ch.cam.ac.uk&amp;btnG=Search">Peter’s story on this</a>),
where the discussion on the minimal amount of freedom is even worse.</p>

<p>Should we worry about ChemSpider being ‘only’ CC-BY-SA? Maybe. Data is not software, but I disagree that a viral license would be OK for software, but NOT for data. That’s just BSD-versus-GPL all over again. I am happy about OpenBabel being GPL, and I am happy about ChemSpider being CC-BY-SA too.</p>

<p>All that said, these discussions are important. And creating good definitions of what freedoms are required is crucial in deciding whether something is Open. The Blue Obelisk does not have/use such definitions yet, and we should start discussing this, and define Blue Obelisk ODOSOS Guidelines. Please no funny jokes about how we can boogy then :)</p>

<p>Now, looking forward to hearing what you think about these issues… Looking forward to the other blog items!</p>]]></content><author><name>Egon Willighagen</name></author><category term="chemspider" /><category term="copyright" /><category term="nmrshiftdb" /><summary type="html"><![CDATA[ChemSpider is afraid they are doing something bad because they release their data as CC-BY-SA. Because, John Wilbanks says in Peter’s blog:]]></summary></entry><entry><title type="html">John Wilbanks replies to the ChemSpider/OpenData discussion</title><link href="https://chem-bla-ics.linkedchemistry.info/2008/05/10/john-wilbanks-replies-to.html" rel="alternate" type="text/html" title="John Wilbanks replies to the ChemSpider/OpenData discussion" /><published>2008-05-10T00:00:00+00:00</published><updated>2008-05-10T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2008/05/10/john-wilbanks-replies-to</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2008/05/10/john-wilbanks-replies-to.html"><![CDATA[<p>Not long after I posted my view on things, <a href="http://network.nature.com/blogs/user/wilbanks/2008/05/10/chemspider-good-intentions-and-the-fog-of-licensing">John posted his reply</a>
on the ChemSpider/OpenData discussion. His comment was merely meant to illustrate internal advice to some organization, which got accidentally leaked. Anyway, a must read,
<a href="http://sciencecommons.org/projects/publishing/open-access-data-protocol/">with two</a>
<a href="http://www.opendatacommons.org/odc-public-domain-dedication-and-licence/">good links</a> to further reading on open data licensing.</p>

<p>His blog mentions the concept of <em>public domain</em>, where data might be dumped, but I always understood that the US public domain concept is different from that of
mainland-EU, German law in particular. This second ‘good link’ points to a license which formalizes this ‘public domain’ idea. And reading it, I realize that I
have read it before. But I had completely forgotten about it.</p>

<p>A quick reread of these two links, tells me that it indeed is BSD-versus-GPL all over again; with the <a href="http://sciencecommons.org/">Science Commons</a>
license on the BSD side, and CC-BY-SA on the GPL side. The first surely makes life easier for aggregators who wish to combine licenses. Can’t argue with that.</p>

<p>Then again… what’s wrong with a bit of viral character in the license? What’s wrong with the statement that ‘you may use my data, if I may use your aggregated
data with the same license’? That limits what you can practically do, but does not limit your freedoms.</p>]]></content><author><name>Egon Willighagen</name></author><category term="chemspider" /><category term="opendata" /><category term="copyright" /><summary type="html"><![CDATA[Not long after I posted my view on things, John posted his reply on the ChemSpider/OpenData discussion. His comment was merely to illustrate an internal advice to some organization, which got accidentally leaked. Anyway, a must read, with two good links to further reading on open data licensing.]]></summary></entry><entry><title type="html">Legal Advice Needed: the NIH restricting access to our CC-licensed research results</title><link href="https://chem-bla-ics.linkedchemistry.info/2008/04/07/legal-advice-needed-nih-restricting.html" rel="alternate" type="text/html" title="Legal Advice Needed: the NIH restricting access to our CC-licensed research results" /><published>2008-04-07T00:00:00+00:00</published><updated>2008-04-07T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2008/04/07/legal-advice-needed-nih-restricting</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2008/04/07/legal-advice-needed-nih-restricting.html"><![CDATA[<p>In reply to <a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1026">Peter’s</a> <a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1025">news</a>
that the NIH’s <a href="http://www.pubmedcentral.nih.gov/">PubMed Central</a> (PMC) does not allow machine retrieval of content, I was wondering
about this section in the CC license of much of the PMC content, such as <a href="https://doi.org/10.1186/1471-2105-8-487">our paper on userscripts</a>
(section 4a of the <a href="http://creativecommons.org/licenses/by/2.0/legalcode">CC-BY 2.0</a>):</p>

<blockquote>
  <p>You may not distribute, publicly display, publicly perform, or publicly digitally perform the Work with any technological measures
that control access or use of the Work in a manner inconsistent with the terms of this License Agreement.</p>
</blockquote>

<p>CC-BY 3.0 reads differently, but has similar aims.</p>

<p>Let me make clear that I value machine-readable publications much more than free (gratis, as-in-free-beer) publications. The
NIH initiative is, for now, just ‘Free Access’. An interesting step, but not one I care much about; not in relation to science anyway.</p>

<p>Now, Peter indicates that the NIH has put in place ‘technological measures to control access’ to the distribution of
<a href="https://chem-bla-ics.linkedchemistry.info/2007/12/21/christmas-presents.html">our work on userscripts <i class="fa-solid fa-recycle fa-xs"></i></a>
(<a href="http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&amp;pubmedid=18154664">the PMC entry</a>). That is in clear violation
of the CC license.</p>

<p>I know that other NIH initiatives do allow this, such as PMC OAI, but that’s just an ‘auxiliary service’. It may come down
to technical details, but some text on the PMC website is at least inaccurate:</p>

<blockquote>
  <p>Crawlers and other automated processes may NOT be used to systematically retrieve batches of articles from the PMC web site.
Bulk downloading of articles from the main PMC web site, in any way, is prohibited because of copyright restrictions.</p>
</blockquote>

<p>The way it is described right now, it is like: <em>You may not drive a car</em>. Next paragraph. <em>But, if you have a driver’s license,
we will approve</em>. Or, translated to this example: <em>You may only use this and that article, but only a few of them</em>.
Next paragraph. <em>Unless you use the following technical hole in the measure we took to disallow you access</em>.</p>

<p>What the PMC website should indicate, instead, is that text mining is allowed for the PMC OAI subset, but that they would highly
prefer that people use the PMC OAI or PMC FTP routes. This is the least they have to do.</p>

<p>No matter what, I still have the feeling that any technical obstacles are disallowed by the CC-license. Any legal expert here,
that can explain to me if the CC license allows controlling how people have access to my material?</p>]]></content><author><name>Egon Willighagen</name></author><category term="pubmed" /><category term="pmc" /><category term="copyright" /><category term="doi:10.1186/1471-2105-8-487" /><summary type="html"><![CDATA[In reply to Peter’s news that the NIH’s PubMed Central (PMC) does not allow machine retrieval of content, I was wondering about this section in the CC license of much of the PMC content, such as our paper on userscripts (section 4a of the CC-BY 2.0):]]></summary></entry><entry><title type="html">Numbers are copyrighted?</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/05/25/numbers-are-copyrighted.html" rel="alternate" type="text/html" title="Numbers are copyrighted?" /><published>2007-05-25T00:00:00+00:00</published><updated>2007-05-25T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/05/25/numbers-are-copyrighted</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/05/25/numbers-are-copyrighted.html"><![CDATA[<p>I just read on <a href="http://www.blueobelisk.org/planetbo/">Planet Blue Obelisk</a> <a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/">Peter</a>’s
disturbing news (via <a href="https://doi.org/10.63485/mppz2-19243">Suber <i class="fa-solid fa-recycle fa-xs"></i></a>) that
<a href="https://blogs.ch.cam.ac.uk/pmr/2007/05/24/sued-for-10-data-points/">Wiley thinks it can copyright a set of numbers <i class="fa-solid fa-recycle fa-xs"></i></a> (also known as data).
That is a sad milestone in scientific publishing. It reminds me of the recent hype about a long number
flooding the internet (and notably <a href="http://www.del.icio.us/">del.icio.us</a>) related to watching DVDs you legally bought.
Some details can be found in this <a href="http://www.lwn.net/">Linux Weekly News</a> article on
<a href="http://lwn.net/Articles/233660/">How Debian packages a number</a>.</p>

<p>Interestingly, this is not a problem of commercial publishers or closed access publishing alone. Yesterday, while
<a href="http://wiki.cubic.uni-koeln.de/blog/">Christoph</a> and I were working on getting <a href="https://chem-bla-ics.linkedchemistry.info/2006/09/08/chemical-archeology-oscar3-to.html">the NMR spectrum text mining <i class="fa-solid fa-recycle fa-xs"></i></a>
going in <a href="http://www.bioclipse.net/">Bioclipse</a> again for the <a href="http://teacher.bmc.uu.se/BioclipseWS07/">workshop</a>,
we noticed that the open access <a href="http://bjoc.beilstein-journals.org/">Beilstein Journal of Organic Chemistry</a>
does not make <a href="http://en.wikipedia.org/wiki/Open_Data">Open Data</a> a reality either: the experimental sections are
generally (all?) excluded from the main text in HTML and obscured in .doc files in the supplementary information.</p>

<p>BTW, this makes me wonder if organic chemists still consider the experimental properties of molecules novel science.</p>]]></content><author><name>Egon Willighagen</name></author><category term="copyright" /><category term="justdoi:10.63485/mppz2-19243" /><summary type="html"><![CDATA[I just read on Planet Blue Obelisk Peter’s disturbing news (via Suber ) that Wiley thinks it can copyright a set of numbers (also known as data). That is a sad milestone in scientific publishing. It reminds me of the recent internet hype about a long number recently flooding the internet (and notably del.icio.us) related to watching DVDs you legally bought. Some details can be found in this Linux Weekly News article on How Debian packages a number.]]></summary></entry></feed>