Python Scientific Computing Quick Overview
==========================================
Antonino Ingargiola
April 2007
About this document
-------------------
When I first started using *Python* as a Matlab(R) substitute, I found a lot
of (maybe too much!) good, specific documentation on the Internet. However,
none of it gave me a bird's-eye view of the various packages needed and of
how they interact with each other. Often, when I needed a function, I didn't
know 'how' and 'where' to search for it.
This document tries to fill this gap by giving a brief overview of the
various packages' functionality and pointing to a few trusted resources for
the full documentation. Basically, it's the document I would have liked to
read when I began with numerical computing in Python ;-).
The document is written in link:NumericalPythonHowto.txt[plain ASCII text]
(yep!) and automatically converted to XHTML by
http://www.methods.co.nz/asciidoc/[AsciiDoc].
Number Crunching with Python
----------------------------
*Numpy* is the core package for numerical computation in Python.
It includes, for example, an array object, linear algebra functions, FFT, and
advanced random number generation capabilities. More advanced features are
listed below.
*SciPy* is a higher-level package built on top of Numpy that also provides a
set of general purpose functions for scientific computing.
Numpy
~~~~~
It's useful to understand which features are included in Numpy and which
should be searched for elsewhere.
.Numpy main features:
* An N-dimensional array object.
* A collection of fast math functions.
* Basic linear algebra support (module: `linalg`).
* Basic Fourier Transform support (module: `fft`).
* Random number generation capabilities (module: `random`).
.Other (maybe less interesting for beginners) features:
* Tools for integrating C/C++ and FORTRAN code.
* A data-type object used to interpret memory blocks that make up array
elements.
* Array scalars.
* A Matrix object.
* A character array object for string manipulation. This is equivalent to
NUMARRAY's "strings" module.
* A record array object.
* A memory-mapped object.
* Compatibility layers for Numeric and NUMARRAY code. This includes full
C-API support.
* Tools for converting Numeric and NUMARRAY code to NUMPY.
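The main features listed above can be tried in a few lines. The following is
only a minimal sketch, assuming Numpy is installed and imported under the
conventional alias `np`; the numbers and variable names are invented for
illustration.

```python
import numpy as np

# N-dimensional array object + linalg: solve the 2x2 system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Fast math functions work element-wise on whole arrays.
t = np.linspace(0.0, 1.0, 8, endpoint=False)
signal = np.sin(2 * np.pi * t)

# fft: one full sine period over the window puts the energy in bin 1.
spectrum = np.fft.fft(signal)
peak_bin = int(np.argmax(np.abs(spectrum[:4])))

# random: seeded generator, so the run is repeatable.
rng = np.random.RandomState(0)
samples = rng.normal(size=1000)

print(x, peak_bin, samples.mean())
```

Here `x` should come out close to `[2, 3]` and `peak_bin` should be `1`.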
SciPy
~~~~~
SciPy modules (and features) are listed below.
.Scipy Main Features
* *`interpolate`*: Interpolation Tools
* *`fftpack`*: Discrete Fourier Transform algorithms
* *`signal`*: Signal Processing Tools
* *`stats`*: Statistical Functions
* *`linalg`*: Linear algebra routines
* *`integrate`*: Integration routines
* *`optimize`*: Optimization Tools
* *`special`*: Special Functions (Bessel, Airy, gamma, erf, ...)
.Additional Features:
* *`cluster`*: Vector Quantization / Kmeans
* *`sparse`*: Sparse matrix
* *`ndimage`*: n-dimensional image package
* *`maxentropy`*: Routines for fitting maximum entropy models
* *`lib`*: Python wrappers to external libraries
- *`lib.lapack`*: Wrappers to LAPACK library
- *`lib.blas`*: Wrappers to BLAS library
* *`linsolve`*: Linear Solvers
- *`linsolve.umfpack`*: Interface to the UMFPACK library.
* *`io`*: Data input and output
* *`misc`*: Various utilities that don't have another home.
NOTE: Because they are used so often, some of the functions in these
subpackages are also made available in the scipy namespace to ease
their use in interactive sessions and programs.
NOTE: When a function is provided by both Numpy and SciPy, it is probably
better to use the SciPy version (SciPy is simply a higher-level layer built
around Numpy).
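As a rough illustration of how the SciPy modules listed above are used, here
is a minimal sketch touching `integrate`, `optimize` and `special`. It assumes
SciPy and Numpy are installed; the functions being integrated and solved are
arbitrary examples chosen here, not taken from the SciPy documentation.

```python
import numpy as np
from scipy import integrate, optimize, special

# integrate: numerically integrate sin(x) over [0, pi]; the exact value is 2.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# optimize: find the root of cos(x) - x inside the bracket [0, 1].
root = optimize.brentq(lambda x: np.cos(x) - x, 0.0, 1.0)

# special: the error function evaluated at 0.
erf0 = special.erf(0.0)

print(area, root, erf0)
```

Each subpackage is imported explicitly, so it is always clear which module a
function comes from.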
Data Visualization in Python: Matplotlib
----------------------------------------
The de-facto standard for 2D plots (including image and array visualization)
in Python is http://matplotlib.sourceforge.net/[Matplotlib].
*Matplotlib* is a rich 2D plotting library with publication-quality output
that also provides a compatibility layer for Matlab(R) and interactive users,
called *pylab*.
Once it is installed, put this line in your script (or type it in an interactive Python shell):
-------------------------
from pylab import *
-------------------------
to have access to all the Matlab-like functions. For a complete
list see http://matplotlib.sourceforge.net/matplotlib.pylab.html[here].
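The star import is handy in interactive sessions. In a script, the same
plotting functions can also be reached through explicit namespaces. The
following is only a sketch under some assumptions: a Matplotlib version that
provides `matplotlib.pyplot`, the non-interactive `Agg` backend (so it also
runs without a display), and a hypothetical output file name `sine.png`
placed in the temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: render to a file, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()

# Hypothetical output location, chosen only for this example.
out_path = os.path.join(tempfile.gettempdir(), "sine.png")
fig.savefig(out_path)
plt.close(fig)
print("saved", out_path)
```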
Since the *pylab* compatibility layer provides both plot functions and standard
numerical Matlab(R) functions, it needs Numpy as a dependency in order to
perform the numerical computations. 'Matplotlib' can also work with an older
numerical package called 'Numarray'. For this reason the Matplotlib
documentation refers to the numerical back-end with the single name "Numerix".
Interactive Use: IPython
~~~~~~~~~~~~~~~~~~~~~~~~
For interactive use the http://ipython.scipy.org/moin/[*IPython*] shell is
strongly recommended. *IPython* is an advanced *Interactive
Python Shell* built by the scientific Python community that offers a nice look
and lots of shortcuts for interactive use. Just type:
-------------------------
ipython -pylab
-------------------------
to launch the ipython shell with pylab imported and other nitty-gritty details
set up to facilitate interactive plotting. As in Matlab(R), a simple test plot
can be made with:
-------------------------
plot([1, 2, 3])
-------------------------
If you like video tutorials, Ian Ozsvald has collected a
http://showmedo.com/videos/series?name=PythonIPythonSeries[few of them]
about IPython.
Documentation
-------------
On-line Documentation
~~~~~~~~~~~~~~~~~~~~~
Python General Purpose Documentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The general purpose Python documentation is ample and very well written. I'll
list only some basic material.
http://docs.python.org/tut/[The Python Tutorial]::
Step-by-step tutorial, from complete beginner to advanced usage.
Absolutely a MUST read.
http://docs.python.org/lib/[Python Library Reference]::
Reference for all the standard library modules.
http://heather.cs.ucdavis.edu/~matloff/python.html[Norm Matloff's Quick Python Tutorials]::
In-depth tutorials about Python and good programming (the 'introductory
tutorial' and the 'thread programming' tutorial are noteworthy).
Numerical and Scientific Computing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On-line documentation for python scientific computing includes:
http://matplotlib.sourceforge.net/[Matplotlib User Guide]::
Official user guide for Matplotlib: for plotting, go there.
http://stsdas.stsci.edu/perry/pydatatut.pdf[Using Python for Interactive Data Analysis]::
This vast tutorial covers nearly every aspect of data analysis and modeling
in Python from a practical point of view. It is written by an astrophysicist;
just skip the astronomy-specific parts if you are not interested ;-).
Other SciPy and Numpy Documentation:
* http://www.scipy.org/Cookbook[Scipy Cookbook]:
A collection of recipes for many common tasks in numerical computing
in Python.
* http://www.scipy.org/Numpy_Example_List_With_Doc[Numpy Example List]:
A usage example for *each* Numpy function.
* http://www.scipy.org/NumPy_for_Matlab_Users[NumPy for Matlab Users]:
Overview of the differences between Matlab(R) and Numpy.
Local Documentation
~~~~~~~~~~~~~~~~~~~
Local documentation can be extracted on the fly from the docstrings in the
source code thanks to `*pydoc*`. The generated documentation can be read in a
web browser. This works automatically both for installed modules (for example
*Numpy*) and for local scripts!
To start the server that serves the documentation locally, simply type in
the directory that contains your Python scripts:
--------------------
pydoc -p 1234
--------------------
Now with a browser open the location:
* http://localhost:1234[http://localhost:1234]
and there you can read the documentation for *all installed modules* and for
*all your local scripts* located in the current directory (where you started
`pydoc`).
TIP: This is a very convenient way to read the full
http://localhost:1234/numpy.html[*Numpy*] or
http://localhost:1234/scipy.html[*Scipy*] documentation.
Alternatively, with:
---------------
pydoc -g
---------------
you can start a little application that lets you search through the
documentation and read the various pages in the browser (started
automatically).
Interactive help system
~~~~~~~~~~~~~~~~~~~~~~~
The standard Python shell can give you the 'docstring' for any module,
function or class; simply type:
--------------------
help(name)
--------------------
The *IPython* shell performs the same operation using a question mark, but it
displays a colored 'docstring' (which is much more readable):
--------------------
name?
--------------------
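Both `help` and `pydoc` read the 'docstring' written directly under a
definition, so your own functions are documented the same way. The sketch
below uses a hypothetical function `rms` (invented for this example) and the
standard library call `pydoc.render_doc`, assuming a Python version where it
accepts a `renderer` argument, to obtain as a string the same text that
`help(rms)` would display.

```python
import pydoc


def rms(values):
    """Return the root mean square of a sequence of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


# help(rms) would page this text in an interactive shell;
# render_doc returns it as a plain string instead.
text = pydoc.render_doc(rms, renderer=pydoc.plaintext)
print(text)
```

If this function lives in a script in the current directory, it also shows up
in the pages served by `pydoc -p`.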
Using the Python Debugger (`pdb`)
---------------------------------
.Personal digression
**************************
'Until the day you give up on finding a nasty bug in your application or
script, you will think that debuggers are very complicated things meant only
for software engineers and not for the casual numerical programmer. I thought
so, too.'
'But one day a nasty bug pushed me (as a last resort) to take a look at the
python debugger, and I found it quite simple but extremely useful for quick
debugging. So now I regret those old gray days spent inserting dumb `print`
statements for debugging.'
**************************
Following the 'Batteries Included' philosophy, Python includes a debugger too.
Its name is *pdb* and once you have used it, you'll wonder how you ever
lived without it (I know you are now thinking I'm a dumb-ass geek and that
print is more than enough, but who cares? One day you'll thank me).
*pdb* itself is a bit bare-bones at the user interface level; however, the
*ipython* shell comes in handy here too. To start a script with the debugger
enabled, just type in ipython:
---------
run -d myscript.py
---------
Now set a breakpoint (the line where the execution will stop):
---------
b 12
---------
and run the script up to the breakpoint with *c*. Now you can follow the
execution line-by-line with *n* (or with *s* to also step into function
calls), list the code with *l*, view the stack with *w* and inspect any
variable by typing its name (if the name clashes with a pdb command, just type
*p 'varname'*). And that's all.
No, that's not all. At each step you have a Python prompt to do anything
(loops, slices, assigning new variables...). For example, if you started
*ipython* with the `-pylab` flag, as previously suggested, you can plot any
'list' or 'array' as usual with:
---------
plot(x)
---------
Neat, isn't it?
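To practice the debugger commands described above, any short script will do.
The one below is a hypothetical example (the file name `myscript.py` and the
function `running_mean` are invented here): a breakpoint on the line that
appends to `means` lets you inspect `window` at every iteration.

```python
# myscript.py: a hypothetical script to practice the pdb commands on.
def running_mean(values, width):
    """Return the mean of each consecutive window of `width` values."""
    means = []
    for start in range(len(values) - width + 1):
        window = values[start:start + width]
        # A good place for a breakpoint: inspect `window` and `means` here.
        means.append(sum(window) / float(width))
    return means


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(running_mean(data, 3))
```

Outside IPython, the same script can be started under the debugger with
`python -m pdb myscript.py`.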
As a last thing, I usually put this in my `~/.pdbrc` (on Windows you have to
find where the file is located):
----------------
alias c c;;l
alias n n;;l
alias s s;;l
----------------
so the *c*, *s* and *n* commands are redefined to also list the source
(*l* command) at each invocation.
You can find the full list of pdb commands by typing `help` at the pdb prompt
or by looking at the
http://docs.python.org/lib/debugger-commands.html[official documentation].
Kudos to 'Prof. Norman Matloff' for having enlightened me about the use
of the Python debugger through his nice series of
http://heather.cs.ucdavis.edu/%7Ematloff/python.html[python tutorials]
(the one that talks about pdb is
http://heather.cs.ucdavis.edu/~matloff/Python/PythonIntro.pdf[PythonIntro.pdf]).
Installation
------------
Linux
~~~~~
On Debian/Ubuntu just type:
-------------
sudo aptitude install python-scipy python-matplotlib python-numpy-ext ipython
-------------
and you'll be up and running (all the other dependencies are automatically
installed).
On other distributions, use your package manager of choice to search for the
corresponding package names and install them.
Mac OS X
~~~~~~~~
'Looking for a volunteer to write this section. If you are a Mac user and want
to contribute, please contact me.'
Windows
~~~~~~~
Install:
* http://www.python.org/download/[Python 2.5 Windows Installer]
* Both the *Scipy* and *Numpy* binaries, http://www.scipy.org/Download[from here].
* http://sourceforge.net/project/showfiles.php?group_id=80706[Latest Matplotlib]: click on the win32 file for the Python version you have installed (2.5 recommended).
* *ipython*: follow this http://showmedo.com/videos/video?name=DownloadingIPythonForMSWindows&fromSeriesID=2[video tutorial] (it also shows how to install the required 'readline' and 'ctypes' packages).
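Whatever the platform, a quick way to check that the installation worked is
to import each package and print its version. This is only a minimal sketch:
the helper `installed_versions` is a hypothetical name introduced here, and
the import names (`numpy`, `scipy`, `matplotlib`, `IPython`) are assumed to be
the ones exposed by the packages installed above.

```python
import importlib


def installed_versions(names):
    """Map each module name to its version string, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to a label.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in installed_versions(
        ["numpy", "scipy", "matplotlib", "IPython"]).items():
    print(name, version if version is not None else "NOT installed")
```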
Links
-----
* http://matplotlib.sourceforge.net/[Matplotlib Homepage]
* http://www.scipy.org/[SciPy Homepage]
* http://numpy.scipy.org/[Numpy Homepage]
* http://ipython.scipy.org/moin/[IPython Homepage]
* http://www.python.org/[Python Homepage]
* http://pyplotsuite.sourceforge.net/[PyplotSuite Homepage] `;-)`
***********************
This article was generated by:
---------
$ asciidoc -a toc -a icons -a badges NumericalPythonHowto.txt
---------
/////////
The DocBook tool chain can be also used:
----------
a2x -f xhtml NumericalPythonHowto.txt
----------
This makes the TOC but I don't like the output style too much so I
use plain AsciiDoc xhtml output.
To use the a2x you must install (debian/ubuntu packages names) `docbook-xsl`, and `xsltproc`. Package `fop` is for optional PDF generation.
/////////
***********************