<?xml version="1.0" encoding="UTF-8"?>
<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:thr="http://purl.org/syndication/thread/1.0"
  xml:lang="en"
   >
  <title type="text">mmqc</title>
  <subtitle type="text">Perambulations of a physicist</subtitle>

  <updated>2011-11-03T11:07:24Z</updated>
  <generator uri="http://blogofile.com/">Blogofile</generator>

  <link rel="alternate" type="text/html" href="http://mmqc.org/notes" />
  <id>http://mmqc.org/notes/feed/atom/</id>
  <link rel="self" type="application/atom+xml" href="http://mmqc.org/notes/feed/atom/" />
  <entry>
    <author>
      <name>Karol M. Langner</name>
      <uri>http://mmqc.org/notes</uri>
    </author>
    <title type="html"><![CDATA[Zip on glusterfs]]></title>
    <link rel="alternate" type="text/html" href="http://mmqc.org/notes/2011/zip-on-glusterfs" />
    <id>http://mmqc.org/notes/2011/zip-on-glusterfs</id>
    <updated>2011-10-06T13:42:02Z</updated>
    <published>2011-10-06T13:42:02Z</published>
    <category scheme="http://mmqc.org/notes" term="glusterfs" />
    <summary type="html"><![CDATA[Zip on glusterfs]]></summary>
    <content type="html" xml:base="http://mmqc.org/notes/2011/zip-on-glusterfs"><![CDATA[<p>I haven't found anything about this on the web yet, and it caused me to lose some data and time, so I'm issuing a warning here. Using <a href="http://en.wikipedia.org/wiki/ZIP_(file_format)">zip</a> to pack and compress files on a <a href="http://www.gluster.org/">glusterfs</a> partition can corrupt the resulting archive.</p>
<div class="pygments_murphy"><pre><span class="gp">me@machine: ~/glusterfs/ziptest$</span> dd <span class="k">if</span><span class="o">=</span>/dev/zero <span class="nv">count</span><span class="o">=</span>1 <span class="nv">of</span><span class="o">=</span>test.dat
<span class="go">1+0 records in</span>
<span class="go">1+0 records out</span>
<span class="go">512 bytes (512 B) copied, 0.000357 seconds, 1.4 MB/s</span>
<span class="gp">me@machine: ~/glusterfs/ziptest$</span> zip test.zip test.dat
<span class="go">  adding: test.dat (deflated 98%)</span>
<span class="gp">me@machine: ~/glusterfs/ziptest$</span> ls -lh
<span class="go">total 16K</span>
<span class="go">-rw-r--r-- 1 kml kml 512 Oct  6 13:47 test.dat</span>
<span class="go">-rw-r--r-- 1 kml kml  67 Oct  6 13:58 test.zip</span>
<span class="gp">me@machine: ~/glusterfs/ziptest$</span> unzip test.zip 
<span class="go">Archive:  test.zip</span>
<span class="go">  End-of-central-directory signature not found.  Either this file is not</span>
<span class="go">  a zipfile, or it constitutes one disk of a multi-part archive.  In the</span>
<span class="go">  latter case the central directory and zipfile comment will be found on</span>
<span class="go">  the last disk(s) of this archive.</span>
<span class="go">  unzip:  cannot find zipfile directory in one of test.zip or</span>
<span class="go">        test.zip.zip, and cannot find test.zip.ZIP, period.</span>
</pre></div>

<p>Although the error pops up while unzipping, the archive itself is corrupted. This can be seen by copying the archive to another, non-glusterfs partition, where the error still occurs. A file zipped on a different partition and copied to glusterfs, however, will unzip nicely.</p>
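<p>To check an archive without waiting for <code>unzip</code> to fail, something like the following may help. It is only a sketch that uses the <code>zipfile</code> module from the Python standard library; the helper name <code>archive_is_intact</code> and the temporary directory are my own choices for illustration, and the demo runs on an ordinary local partition, so it does not reproduce the glusterfs problem itself.</p>

```python
import os
import tempfile
import zipfile


def archive_is_intact(path):
    """Return True if path is a readable zip archive whose members pass their CRC checks."""
    # In CPython, is_zipfile looks for the end-of-central-directory record,
    # which is what unzip could not find in the transcript above.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # testzip returns the name of the first bad member, or None if all are fine.
        return archive.testzip() is None


# Hypothetical demo in a temporary directory (assumed not to be on glusterfs).
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "test.dat")
zip_path = os.path.join(workdir, "test.zip")

with open(data_path, "wb") as handle:
    handle.write(b"\0" * 512)  # same payload as the dd command above

with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.write(data_path, arcname="test.dat")

print(archive_is_intact(zip_path))
```

<p>Pointing <code>archive_is_intact</code> at an archive written on the suspect partition should then tell you whether it is safe to delete the original files.</p>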
<p>I haven't studied the cause of this corruption, but I presume it is connected with <a href="http://en.wikipedia.org/wiki/ZIP_%28file_format%29#File_headers">the central directory file header</a>. The glusterfs setup in this case uses a simple distributed configuration, so it is not an issue with striping, although I haven't looked into any other configuration options. My personal solution was to abandon the application that required zip, and to use <a href="http://www.gnu.org/s/tar/">tar</a> with <a href="http://www.gzip.org/">gzip</a> or <a href="http://bzip.org/">bzip2</a> instead.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name>Karol M. Langner</name>
      <uri>http://mmqc.org/notes</uri>
    </author>
    <title type="html"><![CDATA[Is this a blog?]]></title>
    <link rel="alternate" type="text/html" href="http://mmqc.org/notes/2011/is-this-a-blog-" />
    <id>http://mmqc.org/notes/2011/is-this-a-blog-</id>
    <updated>2011-10-05T20:49:05Z</updated>
    <published>2011-09-11T22:15:16Z</published>
    <category scheme="http://mmqc.org/notes" term="website" />
    <summary type="html"><![CDATA[Is this a blog?]]></summary>
    <content type="html" xml:base="http://mmqc.org/notes/2011/is-this-a-blog-"><![CDATA[<p>Yes and no. For some time I've had an urge to blog. I follow quite a few blogs, and most are concerned with science and research, or with computing and programming. Regularly I stumble upon technical information on blogs that I find helpful in my work. In a way, this website is about me giving back to that community. It's also about organizing the various technical notes I've put down in many places.</p>
<p>Strictly speaking I will retain the core features of a blog, namely reverse-chronological order and some form of commenting. The infrastructure, however, is based on <a href="http://www.blogofile.com">Blogofile</a>, which is a static website compiler written in Python, and the ubiquitous <a href="http://git-scm.com">git revision control system</a>. In fact, I've decided to version control the whole source code for the website and <a href="https://github.com/langner/mmqc">to publish it openly on github</a>.</p>
<p>I will refer to the things I write here as <em>notes</em>, and not posts as in typical blogs, because most will be technical and short, and they will probably be published irregularly. Also, I might go back and update, change, correct and extend notes in the future. But since the source is versioned, it is also possible to go back in time if needed.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name>Karol M. Langner</name>
      <uri>http://mmqc.org/notes</uri>
    </author>
    <title type="html"><![CDATA[A NumPy exercise in optimization]]></title>
    <link rel="alternate" type="text/html" href="http://mmqc.org/notes/2011/a-numpy-exercise-in-optimization" />
    <id>http://mmqc.org/notes/2011/a-numpy-exercise-in-optimization</id>
    <updated>2011-09-13T10:14:46Z</updated>
    <published>2011-09-01T13:40:32Z</published>
    <category scheme="http://mmqc.org/notes" term="python" />
    <category scheme="http://mmqc.org/notes" term="optimization" />
    <category scheme="http://mmqc.org/notes" term="numpy" />
    <category scheme="http://mmqc.org/notes" term="algebra" />
    <summary type="html"><![CDATA[A NumPy exercise in optimization]]></summary>
    <content type="html" xml:base="http://mmqc.org/notes/2011/a-numpy-exercise-in-optimization"><![CDATA[<p>This will be a simple exercise I did a long time ago in speeding up a single operation repeated many times, namely the Euclidean <a href="http://en.wikipedia.org/wiki/Norm_%28mathematics%29" title="Wikipedia: Norm (mathematics)">vector norm</a>. It can become quite a bottleneck if done wrong, especially if you deal with millions of vectors in a Python script written quick and dirty.</p>
<p>There are tools for speeding up Python with C, such as <a href="http://wiki.python.org/moin/weave" title="weave website">weave</a> and <a href="http://cython.org/" title="Cython website">Cython</a>, or with Fortran (see <a href="http://www.scipy.org/F2py" title="f2py website">f2py</a>). But the whole point in quick and dirty solutions is to stay away from them, since writing pure Python is simpler. The first thing to do is to use <a href="http://numpy.scipy.org/" title="NumPy website">NumPy</a> effectively, and that is what this note is about.</p>
<h2>A single vector</h2>
<p>We start with a random vector, a one-dimensional <code>numpy.ndarray</code> object with three random elements:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">v</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">random</span><span class="p">((</span><span class="mi">3</span><span class="p">,))</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">print</span> <span class="n">v</span>
<span class="p">[</span> <span class="mf">0.21683143</span>  <span class="mf">0.47678871</span>  <span class="mf">0.48953654</span><span class="p">]</span>
</pre></div>

<p>We will need to compare different ways of computing the norm. I will define <a href="http://docs.python.org/reference/expressions.html#lambda" title="lambdas in the docs">lambda functions</a>, which can be inspected and timed like this:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="kn">import</span> <span class="nn">inspect</span>
<span class="o">&gt;&gt;&gt;</span> <span class="kn">import</span> <span class="nn">timeit</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">def</span> <span class="nf">timenorm</span><span class="p">(</span><span class="n">norm</span><span class="p">,</span> <span class="n">number</span><span class="p">):</span>
<span class="o">...</span>     <span class="n">name</span> <span class="o">=</span> <span class="n">inspect</span><span class="o">.</span><span class="n">getsource</span><span class="p">(</span><span class="n">norm</span><span class="p">)</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&quot;=&quot;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
<span class="o">...</span>     <span class="n">code</span> <span class="o">=</span> <span class="n">inspect</span><span class="o">.</span><span class="n">getsource</span><span class="p">(</span><span class="n">norm</span><span class="p">)</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&quot;:&quot;</span><span class="p">)[</span><span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
<span class="o">...</span>     <span class="n">setup</span> <span class="o">=</span> <span class="s">&quot;from __main__ import np,v&quot;</span>
<span class="o">...</span>     <span class="n">tim</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="n">setup</span><span class="p">,</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span>
<span class="o">...</span>     <span class="n">value</span> <span class="o">=</span> <span class="nb">eval</span><span class="p">(</span><span class="s">&quot;</span><span class="si">%s</span><span class="s">(v)&quot;</span> <span class="o">%</span><span class="n">name</span><span class="p">)</span>
<span class="o">...</span>     <span class="k">print</span> <span class="s">&quot;</span><span class="si">%s</span><span class="s">: </span><span class="si">%.6f</span><span class="s"> time: </span><span class="si">%.3f</span><span class="s"> code: </span><span class="si">%s</span><span class="s">&quot;</span> <span class="o">%</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">value</span><span class="p">,</span> <span class="n">tim</span><span class="p">,</span> <span class="n">code</span><span class="p">)</span>
</pre></div>

<p>This timing function will print the value of the norm (a sanity check), the time it took for <em>number</em> of repetitions, and the code actually executed.</p>
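<p>The helper above relies on Python 2 <code>print</code> statements and on parsing the source text of a lambda. For readers on Python 3, here is a rough sketch of the same idea that times a callable directly. The name <code>time_callable</code> and the example vector are hypothetical and are not taken from the attached script.</p>

```python
import timeit

import numpy as np


def time_callable(func, arg, number):
    """Call func(arg) `number` times; return (value, total seconds)."""
    value = func(arg)  # sanity check, as in timenorm
    seconds = timeit.timeit(lambda: func(arg), number=number)
    return value, seconds


v = np.array([3.0, 4.0, 12.0])  # illustrative vector with norm 13
value, seconds = time_callable(np.linalg.norm, v, number=1000)
print("norm: %.6f time: %.3f" % (value, seconds))
```

<p>Wrapping the call in a lambda adds a little call overhead to every repetition, so absolute numbers will differ from those of the string-based version, but the ranking between variants should be comparable.</p>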
<p>The first thing to try is the norm provided with numpy.linalg:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">mynorm1</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">v</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">v</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">timenorm</span><span class="p">(</span><span class="n">mynorm1</span><span class="p">,</span> <span class="mi">5</span><span class="o">*</span><span class="mi">10</span><span class="o">**</span><span class="mi">6</span><span class="p">)</span>
<span class="n">mynorm1</span><span class="p">:</span> <span class="mf">1.144364</span> <span class="n">time</span><span class="p">:</span> <span class="mf">51.915</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">v</span><span class="p">)</span>
</pre></div>

<p>But <a href="https://github.com/numpy/numpy/blob/master/numpy/linalg/linalg.py#L1928" title="numpy.linalg.norm source code">numpy.linalg does a number of things we don't need</a>, so let's try a few other versions:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">mynorm2</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">v</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="n">v</span><span class="p">)))</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mynorm3</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">v</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">v</span><span class="p">))</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mynorm4</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">v</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">for</span> <span class="n">mynorm</span> <span class="ow">in</span> <span class="n">mynorm1</span><span class="p">,</span> <span class="n">mynorm2</span><span class="p">,</span> <span class="n">mynorm3</span><span class="p">,</span> <span class="n">mynorm4</span><span class="p">:</span>
<span class="o">...</span>     <span class="n">timenorm</span><span class="p">(</span><span class="n">mynorm</span><span class="p">,</span> <span class="mi">5</span><span class="o">*</span><span class="mi">10</span><span class="o">**</span><span class="mi">6</span><span class="p">)</span>
<span class="n">mynorm1</span><span class="p">:</span> <span class="mf">0.716931</span> <span class="n">time</span><span class="p">:</span> <span class="mf">52.390</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">v</span><span class="p">)</span>
<span class="n">mynorm2</span><span class="p">:</span> <span class="mf">0.716931</span> <span class="n">time</span><span class="p">:</span> <span class="mf">48.008</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="n">v</span><span class="p">)))</span>
<span class="n">mynorm3</span><span class="p">:</span> <span class="mf">0.716931</span> <span class="n">time</span><span class="p">:</span> <span class="mf">46.823</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">v</span><span class="p">))</span>
<span class="n">mynorm4</span><span class="p">:</span> <span class="mf">0.716931</span> <span class="n">time</span><span class="p">:</span> <span class="mf">21.482</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
</pre></div>

<p>It seems that the explicit version, without the additional function calls, is somewhat quicker if you have a single vector, although it will not support vectors of arbitrary length. Reasonable, but meaningless since in all cases we are dealing with microseconds per function call.</p>
<h2>Vectorize!</h2>
<p>The whole point of NumPy is to get rid of loops by vectorizing, and in practice one typically deals with large sets of different vectors. So it would make more sense to benchmark on an array of vectors:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">V</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">random</span><span class="p">((</span><span class="mi">1000</span><span class="p">,</span><span class="mi">3</span><span class="p">))</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">print</span> <span class="n">V</span>
<span class="p">[[</span> <span class="mf">0.73755195</span>  <span class="mf">0.78344111</span>  <span class="mf">0.02725284</span><span class="p">]</span>
 <span class="p">[</span> <span class="mf">0.49455093</span>  <span class="mf">0.08837641</span>  <span class="mf">0.78106238</span><span class="p">]</span>
 <span class="p">[</span> <span class="mf">0.97095203</span>  <span class="mf">0.64497806</span>  <span class="mf">0.53856876</span><span class="p">]</span>
 <span class="o">...</span><span class="p">,</span> 
 <span class="p">[</span> <span class="mf">0.67676871</span>  <span class="mf">0.41127143</span>  <span class="mf">0.89213647</span><span class="p">]</span>
 <span class="p">[</span> <span class="mf">0.50376334</span>  <span class="mf">0.01370871</span>  <span class="mf">0.35758737</span><span class="p">]</span>
 <span class="p">[</span> <span class="mf">0.05427026</span>  <span class="mf">0.42527007</span>  <span class="mf">0.88730196</span><span class="p">]]</span>
</pre></div>

<p>Here are <code>mynorm1</code> and <code>mynorm3</code> translated into list comprehensions:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">mynorms1</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">V</span><span class="p">:</span> <span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">]</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mynorms2</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">V</span><span class="p">:</span> <span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">v</span><span class="p">))</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">]</span>
</pre></div>

<p>In the second case it will be more efficient to take the square root for the whole resulting array at once. Here is that modification, and a similar one for <code>mynorm4</code>:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">mynorms3</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">V</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">([</span><span class="nb">sum</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mynorms4</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">V</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">([</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">])</span>
</pre></div>

<p>And finally, a compact version that sticks with arrays all the way:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">mynorms5</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">V</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">((</span><span class="n">V</span><span class="o">*</span><span class="n">V</span><span class="p">)</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
</pre></div>
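
<p>To convince yourself that the array-only expression computes the same thing as the loop over vectors, a small check along these lines can be run. It is a sketch only: the fixed seed and the variable names <code>norms_vectorized</code> and <code>norms_looped</code> are assumptions made for the example.</p>

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the check is repeatable
V = rng.random_sample((1000, 3))

# Row-wise norms without a Python loop: square, sum along each row, take the root.
norms_vectorized = np.sqrt((V * V).sum(axis=1))

# The same result computed one vector at a time, for comparison only.
norms_looped = np.array([np.linalg.norm(v) for v in V])

print(norms_vectorized.shape)
print(np.allclose(norms_vectorized, norms_looped))
```

<p>The key is <code>axis=1</code>, which sums across the three components of each row and leaves one value per vector.</p>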

<p>Now a comparison (<code>timenorms</code> is available in <a href="/scripts/mynorm.py" title="mynorm.py">the attached script</a>):</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="k">for</span> <span class="n">mynorms</span> <span class="ow">in</span> <span class="n">mynorms1</span><span class="p">,</span> <span class="n">mynorms2</span><span class="p">,</span> <span class="n">mynorms3</span><span class="p">,</span> <span class="n">mynorms4</span><span class="p">,</span> <span class="n">mynorms5</span><span class="p">:</span>
<span class="o">...</span>     <span class="n">timenorms</span><span class="p">(</span><span class="n">mynorms</span><span class="p">,</span> <span class="mi">5</span><span class="o">*</span><span class="mi">10</span><span class="o">**</span><span class="mi">6</span><span class="o">/</span><span class="nb">len</span><span class="p">(</span><span class="n">V</span><span class="p">))</span>
<span class="n">mynorms1</span><span class="p">:</span> <span class="mf">1.076339</span> <span class="n">time</span><span class="p">:</span> <span class="mf">52.706</span> <span class="n">code</span><span class="p">:</span> <span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">]</span>
<span class="n">mynorms2</span><span class="p">:</span> <span class="mf">1.076339</span> <span class="n">time</span><span class="p">:</span> <span class="mf">45.342</span> <span class="n">code</span><span class="p">:</span> <span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">v</span><span class="p">))</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">]</span>
<span class="n">mynorms3</span><span class="p">:</span> <span class="mf">1.076339</span> <span class="n">time</span><span class="p">:</span> <span class="mf">33.971</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">([</span><span class="nb">sum</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">v</span><span class="p">)</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">])</span>
<span class="n">mynorms4</span><span class="p">:</span> <span class="mf">1.076339</span> <span class="n">time</span><span class="p">:</span> <span class="mf">13.125</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">([</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">*</span><span class="n">v</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">V</span><span class="p">])</span>
<span class="n">mynorms5</span><span class="p">:</span> <span class="mf">1.076339</span> <span class="n">time</span><span class="p">:</span> <span class="mf">0.212</span> <span class="n">code</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">((</span><span class="n">V</span><span class="o">*</span><span class="n">V</span><span class="p">)</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
</pre></div>

<p>The speedup <code>mynorms5</code> provides here will be even larger if we put all five million vectors into a single array. Of course, compiled C will be even faster, but this is more than enough for most of my quick and dirty scripts.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name>Karol M. Langner</name>
      <uri>http://mmqc.org/notes</uri>
    </author>
    <title type="html"><![CDATA[cclib in Debian]]></title>
    <link rel="alternate" type="text/html" href="http://mmqc.org/notes/2011/cclib-in-debian" />
    <id>http://mmqc.org/notes/2011/cclib-in-debian</id>
    <updated>2011-09-09T17:55:47Z</updated>
    <published>2011-08-29T11:47:08Z</published>
    <category scheme="http://mmqc.org/notes" term="cclib" />
    <category scheme="http://mmqc.org/notes" term="debian" />
    <summary type="html"><![CDATA[cclib in Debian]]></summary>
    <content type="html" xml:base="http://mmqc.org/notes/2011/cclib-in-debian"><![CDATA[<p>I'm involved in the development of <a href="http://cclib.sourceforge.net" title="cclib website">cclib</a>, which is a Python library for parsing computational chemistry output files, and progress has been sporadic at best. Nonetheless, after a few years the version number is above 1.0, the interface is quite stable and most of the commits now are bugfixes. More importantly, it seems we have acquired quite a few users, especially via <a href="http://gausssum.sourceforge.net/" title="GaussSum">GaussSum</a> and <a href="http://qmforge.sourceforge.net/" title="QMForge">QMForge</a>, which are basically graphical user interfaces for cclib. This year I decided it is time to finally introduce cclib into <a href="http://www.debian.org" title="Debian">Debian</a>, my Linux distribution of choice.</p>
<p>Although I've used Debian for many years, I've never built packages or maintained them. My entry points were the <a href="http://debichem.alioth.debian.org/" title="Debichem">debichem packaging group</a>, the <a href="http://mentors.debian.net" title="mentors.debian.net">Debian mentors website</a>, and of course the <a href="http://www.debian.org/doc/manuals/maint-guide/" title="Debian new maintainer&apos;s guide">maintainer's guide</a>. All these were very helpful in getting the job done efficiently.</p>
<p>So, I'm happy to report that Debian users (of the current testing distribution, aka <em>wheezy</em>) can install cclib even more easily than before, by typing one command at the terminal:</p>
<div class="pygments_murphy"><pre>aptitude install cclib
</pre></div>

<p>or via their favorite software package manager. This actually installs two packages, <a href="http://packages.debian.org/wheezy/python-cclib" title="python-cclib in wheezy">python-cclib</a> containing the core Python module, and <a href="http://packages.debian.org/wheezy/cclib" title="cclib in wheezy">cclib</a> which carries the user scripts. Due to current and possible future conflicts in names, these user scripts have been prefixed with <em>cclib-</em>; that means that instead of <em>ccget</em> users run <em>cclib-ccget</em> and that <em>cda</em> has been changed to <em>cclib-cda</em>.</p>
<p>If you are also interested in the logfiles distributed with cclib and the accompanying unittests, you will need to install <a href="http://packages.debian.org/wheezy/cclib-data" title="cclib-data in wheezy">cclib-data</a> from the <em>non-free</em> repository. This is due to copyright issues, since the log files created by many computational chemistry programs are not free to use and distribute under all conditions (see <a href="http://lists.alioth.debian.org/pipermail/debichem-devel/2011-July/thread.html" title="debichem-devel July 2011">debichem-devel mailing list from July 2011</a> for the relevant discussion).</p>
<p>Using cclib within Python is the same as always. For example, with all packages installed you can type this in the interpreter:</p>
<div class="pygments_murphy"><pre><span class="o">&gt;&gt;&gt;</span> <span class="kn">import</span> <span class="nn">cclib</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">cclib</span><span class="o">.</span><span class="n">test</span><span class="o">.</span><span class="n">testall</span><span class="p">()</span>
</pre></div>

<p>which should run the whole cclib unittest suite.</p>]]></content>
  </entry>
</feed>

