<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator><link href="https://chem-bla-ics.linkedchemistry.info/feed/by_tag/excel.xml" rel="self" type="application/atom+xml" /><link href="https://chem-bla-ics.linkedchemistry.info/" rel="alternate" type="text/html" /><updated>2026-06-15T12:00:19+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/feed/by_tag/excel.xml</id><title type="html">chem-bla-ics</title><subtitle>Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.</subtitle><author><name>Egon Willighagen</name></author><entry><title type="html">Excel messes up your data analysis :)</title><link href="https://chem-bla-ics.linkedchemistry.info/2007/08/01/excel-messes-up-your-data-analysis.html" rel="alternate" type="text/html" title="Excel messes up your data analysis :)" /><published>2007-08-01T00:00:00+00:00</published><updated>2007-08-01T00:00:00+00:00</updated><id>https://chem-bla-ics.linkedchemistry.info/2007/08/01/excel-messes-up-your-data-analysis</id><content type="html" xml:base="https://chem-bla-ics.linkedchemistry.info/2007/08/01/excel-messes-up-your-data-analysis.html"><![CDATA[<p>Well, no wonder: Excel is meant to be used to process money flows. Anyway, <a href="http://del.icio.us/greyarea">greyarea</a> pointed me to
<a href="http://itre.cis.upenn.edu/~myl/languagelog/archives/002912.html">this nice blog item</a> from March 2006. It discusses a 2004 article in
<a href="http://www.biomedcentral.com/bmcbioinformatics">BMC Bioinformatics</a>, <em>Mistaken Identifiers: Gene name errors can be introduced
inadvertently when using Excel in bioinformatics</em>, by Barry Zeeberg et al. (DOI:<a href="https://doi.org/10.1186/1471-2105-5-80">10.1186/1471-2105-5-80</a>).
Hence the importance of explicit semantics and proper markup languages, which record what kind of value a field holds instead of leaving software to guess. The quotes are illustrative:</p>

<blockquote>
  <p>When we were beta-testing [two new bioinformatics programs] on microarray data, a frustrating problem occurred repeatedly: Some
gene names kept bouncing back as “unknown.” A little detective work revealed the reason: … A default date conversion feature in
Excel … was altering gene names that it considered to look like dates. For example, the tumor suppressor DEC1 [Deleted in
Esophageal Cancer 1] was being converted to ‘1-DEC.’ Figure 1 lists 30 gene names that suffer an analogous fate.<br /><br /></p>

  <p>…<br /><br /></p>

  <p>There is another default conversion problem for RIKEN clone identifiers of the form nnnnnnnEnn, where n denotes a
digit. These identifiers are comprised of the serial number of the plate that contains the library, information on plate status,
and the address of the clone. A search … identified more than 2,000 such identifiers out of a total set of 60,770. For example,
the RIKEN identifier “2310009E13” was converted irreversibly to the floating-point number “2.31E+13.” A non-expert user might
well fail to notice that approximately 3% of the identifiers on a microarray with tens of thousands of genes had been converted
to an incorrect form, yet the potential for 2,000 identifiers to be transmogrified without notice is a considerable concern. Most
important, these conversions to an internal date representation or floating-point number format are irreversible; the original
gene name cannot be recovered.</p>
</blockquote>
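
<p>To make the second problem concrete, here is a minimal Python sketch of the same kind of type guessing. It is not what Excel does internally: the <code>guess_type</code> helper is hypothetical and only mimics an importer that tries "number" before "text", and the two sample identifiers are the ones from the quote. Python's <code>float</code> does not guess dates, so DEC1 survives here; only the RIKEN-style identifier is mangled.</p>

```python
import csv
import io


def guess_type(cell):
    """Hypothetical type-guessing importer: try a float first, else keep the text."""
    try:
        return float(cell)
    except ValueError:
        return cell


# A tiny one-column file holding the two identifiers mentioned in the quote.
raw = "identifier\n2310009E13\nDEC1\n"

# csv.reader always yields strings, so the identifiers arrive untouched.
rows = list(csv.reader(io.StringIO(raw)))[1:]  # skip the header row
as_text = [row[0] for row in rows]

# Guessing types reads the 'E' as an exponent and turns the ID into a float.
guessed = [guess_type(cell) for cell in as_text]

for original, value in zip(as_text, guessed):
    print(original, "becomes", repr(value), "of type", type(value).__name__)
```

<p>Once the identifier has become a float, writing it back out no longer gives the original string, which is the irreversibility the authors complain about. Keeping identifiers as text from the moment they are read avoids the problem.</p>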

<p>Is this the article that made all bioinformaticians turn to R?</p>]]></content><author><name>Egon Willighagen</name></author><category term="bioinfo" /><category term="excel" /><category term="justdoi:10.1186/1471-2105-5-80" /><summary type="html"><![CDATA[Well, no wonder: Excel is meant to be used to process money flows. Anyway, greyarea pointed me to this nice blog item from March 2006. It discusses a 2004 article in BMC Bioinformatics Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics by Barry Zeeberg et al. (DOI:10.1186/1471-2105-5-80). Hence, the importance of semantics and proper markup languages. The quotes are illustrative:]]></summary></entry></feed>