Python Scientific Computing Quick Overview
==========================================
Antonino Ingargiola
April 2007

About this document
-------------------

When I first started using *Python* as a Matlab(R) substitute, I found a lot
of (maybe too much!) good, specific documentation on the Internet. However,
none of it gave me a bird's-eye view of the various packages needed and of
how they interact with each other. Often, when I needed a function, I didn't
know 'how' and 'where' to search.

This document tries to fill this gap, giving a brief overview of the
functionality of the various packages and pointing to a few trusted
resources for the full documentation. Basically, it's the document I would
have liked to read when I began with numerical python computing ;-).

The document is written in link:NumericalPythonHowto.txt[plain ASCII text]
(yep!) and automatically converted to XHTML by
http://www.methods.co.nz/asciidoc/[AsciiDoc].

Number Crunching with Python
----------------------------

*Numpy* is the core python package for numerical computation. It includes,
for example, an array object, linear algebra functions, fft, and advanced
random number generation capabilities. More advanced features are listed
below.

*SciPy* is a higher level wrapper for Numpy that also provides a set of
general purpose functions for scientific computing.

Numpy
~~~~~

It's useful to understand which features are included in Numpy and which
should be searched for elsewhere.

.Numpy main features:
* An N-dimensional array object.
* A collection of fast math functions.
* Basic linear algebra support (module: `linalg`).
* Basic Fourier Transform support (module: `fft`).
* Random number generation capabilities (module: `random`).

.Other (maybe less interesting for beginners) features:
* Tools for integrating C/C++ and FORTRAN code.
* A data-type object used to interpret memory blocks that make up array
  elements.
* Array scalars.
* A Matrix object.
* A character array object for string manipulation.
This is equivalent to NUMARRAY's "strings" module.
* A record array object.
* A memory-mapped object.
* Compatibility layers for Numeric and NUMARRAY code. This includes full
  C-API support.
* Tools for converting Numeric and NUMARRAY code to NUMPY.

SciPy
~~~~~

SciPy modules (and features) are listed below.

.Scipy Main Features
* *`interpolate`*: Interpolation Tools
* *`fftpack`*: Discrete Fourier Transform algorithms
* *`signal`*: Signal Processing Tools
* *`stats`*: Statistical Functions
* *`linalg`*: Linear algebra routines
* *`integrate`*: Integration routines
* *`optimize`*: Optimization Tools
* *`special`*: Special Functions (Bessel, Airy, gamma, erf, ...)

.Additional Features:
* *`cluster`*: Vector Quantization / Kmeans
* *`sparse`*: Sparse matrix
* *`ndimage`*: n-dimensional image package
* *`maxentropy`*: Routines for fitting maximum entropy models
* *`lib`*: Python wrappers to external libraries
  - *`lib.lapack`*: Wrappers to LAPACK library
  - *`lib.blas`*: Wrappers to BLAS library
* *`linsolve`*: Linear Solvers
  - *`linsolve.umfpack`*: Interface to the UMFPACK library.
* *`io`*: Data input and output
* *`misc`*: Various utilities that don't have another home.

NOTE: Because of their ubiquity, some of the functions in these subpackages
are also made available in the scipy namespace to ease their use in
interactive sessions and programs.

NOTE: When a functionality is provided by both Numpy and SciPy, it is
probably better to use the SciPy version (SciPy is simply a higher level
wrapping around Numpy).

Data Visualization in Python: Matplotlib
----------------------------------------

The de-facto standard for 2D plots (including image and array
visualization) in python is http://matplotlib.sourceforge.net/[Matplotlib].

*Matplotlib* is a rich 2D plotting library with publication-quality output
that also provides a compatibility layer for Matlab(R) and interactive
users, called *pylab*.

Once it is installed, put this line in your script (or type it in an
interactive python shell):

-------------------------
from pylab import *
-------------------------

to have access to all the Matlab-like functions. For a complete list see
http://matplotlib.sourceforge.net/matplotlib.pylab.html[here].

Since the *pylab* compatibility layer provides both plot functions and
standard numerical Matlab(R) functions, it needs Numpy as a dependency in
order to perform the numerical computation. 'Matplotlib' can also work with
an older numerical package called 'Numarray'. For this reason the Matplotlib
documentation refers to the numerical back-end with the single name
"Numerix".

Interactive Use: IPython
~~~~~~~~~~~~~~~~~~~~~~~~~~

For interactive use the http://ipython.scipy.org/moin/[*IPython*] shell is
strongly recommended. *IPython* is an advanced *Interactive Python Shell*
built by the scientific python community that offers a nice look and lots of
shortcuts for interactive use. Just type:

-------------------------
ipython -pylab
-------------------------

to launch the ipython shell with pylab imported and with the other
nitty-gritty details needed for interactive plotting already set up.

As in Matlab(R), a simple test plot can be made with:

-------------------------
plot([1, 2, 3])
-------------------------

If you like video tutorials, Ian Ozsvald has collected a
http://showmedo.com/videos/series?name=PythonIPythonSeries[few of them] about
IPython.

Documentation
-------------

On-line Documentation
~~~~~~~~~~~~~~~~~~~~~

Python General Purpose Documentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The general purpose python documentation is ample and very well written.
I'll list only some basic material.

http://docs.python.org/tut/[The Python Tutorial]::
    Step by step tutorial that goes from total beginner to advanced usage.
    Absolutely a MUST read.

http://docs.python.org/lib/[Python Library Reference]::
    Reference for all the standard library modules.

http://heather.cs.ucdavis.edu/~matloff/python.html[Norm Matloff's Quick Python Tutorials]::
    In-depth tutorials about python and good programming (noteworthy are the
    'introductory tutorial' and the 'thread programming' tutorial).

Numerical and Scientific Computing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On-line documentation for python scientific computing includes:

http://matplotlib.sourceforge.net/[Matplotlib User Guide]::
    Official user guide for Matplotlib: for plotting, go there.

http://stsdas.stsci.edu/perry/pydatatut.pdf[Using Python for Interactive Data Analysis]::
    This vast tutorial covers nearly every aspect of data analysis and
    modeling in python from a practical point of view. It is written by an
    astrophysicist: just skip the astronomy-specific parts if you are not
    interested ;-).

Other SciPy and Numpy Documentation:

* http://www.scipy.org/Cookbook[Scipy Cookbook]: A collection of recipes
  for many common tasks in numerical computing in python.
* http://www.scipy.org/Numpy_Example_List_With_Doc[Numpy Example List]:
  A usage example for *each* Numpy function.
* http://www.scipy.org/NumPy_for_Matlab_Users[NumPy for Matlab Users]:
  Overview of the differences between Matlab(R) and Numpy.

Local Documentation
~~~~~~~~~~~~~~~~~~~

Local documentation can be extracted on the fly from the docstrings in the
source code thanks to `*pydoc*`. The generated documentation can be read in
a web browser. This works automatically both for installed modules (for
example *Numpy*) and for local scripts!

To start the server that serves the documentation locally, simply type, in
the directory that contains your python scripts:

--------------------
pydoc -p 1234
--------------------

Now open this location in a browser:

* http://localhost:1234[http://localhost:1234]

and there you can read the documentation for *all installed modules* and
for *all your local scripts* located in the current directory (where you
started `pydoc`).

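What `pydoc` displays for a local script is simply its docstrings. As an illustration, here is a minimal sketch of a documented module. The file name `mystats.py` and the function `running_mean` are hypothetical names invented for this example and are not part of any package mentioned here.

```python
"""Small statistics helpers (hypothetical example module, e.g. mystats.py).

pydoc shows this module docstring at the top of the generated page.
"""


def running_mean(values, window):
    """Return the means of each run of `window` consecutive values.

    pydoc lists this docstring below the function name and signature.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # One mean per position where a full window fits in the sequence.
    return [sum(values[i:i + window]) / float(window)
            for i in range(len(values) - window + 1)]


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    print(running_mean([1, 2, 3, 4], 2))
```

Saved in the directory where `pydoc -p 1234` was started, such a file should appear in the index under its module name. Since `pydoc` imports a module in order to document it, keeping the top-level code behind the `__main__` guard avoids running it while the page is generated.
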
TIP: This is a very convenient way to read the full
http://localhost:1234/numpy.html[*Numpy*] or
http://localhost:1234/scipy.html[*Scipy*] documentation.

Alternatively, with:

---------------
pydoc -g
---------------

you can start a little application that lets you search through the
documentation and read the various pages in the browser (started
automatically).

Interactive help system
~~~~~~~~~~~~~~~~~~~~~~~

The standard python shell can give you the 'docstring' for every module,
function or class, simply by typing:

--------------------
help(name)
--------------------

The *IPython* shell performs the same operation using a question mark, but
it displays the colored 'docstring' (which is much more readable):

--------------------
name?
--------------------

Using the Python Debugger (`pdb`)
---------------------------------

.Personal digression
**************************
'Until the day you give up on finding a nasty bug in your application or
script, you will think that debuggers are very complicated things, meant only
for software engineers and not for the casual numerical programmer. I thought
so, too.'

'But one day a nasty bug pushed me (as a last resort) to take a look at the
python debugger, and I found it quite simple but extremely useful for quick
debugging. So now I regret those old gray days spent inserting dumb `print`
statements for debugging.'
**************************

Following the 'Batteries Included' philosophy, python includes a debugger
too. Its name is *pdb* and, once you have used it, you'll begin to wonder how
you ever lived without it (I know you are now thinking I'm a dumb-ass geek
and that print is more than enough, but who cares? One day you'll thank me).

*pdb* itself is a bit bare-bones at the user interface level, however the
*ipython* shell comes in handy here too.

To start a script with the debugger enabled just type in ipython:

---------
run -d myscript.py
---------

Now set a breakpoint (the line where the execution will stop):

---------
b 12
---------

and run the script up to the breakpoint with *c*. Now you can follow the
execution line-by-line with *n* (or with *s* to also step into function
calls), list the code with *l*, view the stack with *w* and inspect any
variable by typing its name (if the name clashes with a pdb command just type
*p 'varname'*).

And that's all. No, that's not all. At each step you have a python prompt
where you can do anything (loops, slices, assigning new variables...). For
example, if you started *ipython* with the `-pylab` flag, as previously
suggested, you can plot any 'list' or 'array' as usual with:

---------
plot(x)
---------

Neat, isn't it?

As a last thing, I usually put this in my `~/.pdbrc` (on Windows you have to
find where the file is located):

----------------
alias c c;;l
alias n n;;l
alias s s;;l
----------------

so the *c*, *s* and *n* commands are redefined to also list (*l* command) the
source at each invocation.

You can find the full list of pdb commands by typing `help` at the pdb prompt
or by looking at the
http://docs.python.org/lib/debugger-commands.html[official documentation].

Kudos to 'Prof. Norman Matloff' for having enlightened me about the use of
the python debugger through his nice series of
http://heather.cs.ucdavis.edu/%7Ematloff/python.html[python tutorials] (the
one that talks about pdb is
http://heather.cs.ucdavis.edu/~matloff/Python/PythonIntro.pdf[PythonIntro.pdf]).

Installation
------------

Linux
~~~~~

On Debian/Ubuntu just type:

-------------
sudo aptitude install python-scipy python-matplotlib python-numpy-ext ipython
-------------

and you'll be up and running (all the other dependencies are automatically
installed).

On other distros, use the package manager of your choice to search for the
corresponding package names and install them.

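One way to confirm that the installation worked is to try importing each package and print its version. The following is only a sketch: the helper name `installed_versions` is invented for this example, and it assumes a Python recent enough to ship the standard `importlib` module.

```python
import importlib

# Top-level module names of the packages discussed in this document.
PACKAGES = ["numpy", "scipy", "matplotlib", "IPython"]


def installed_versions(names):
    """Map each module name to its version string, or None if the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing (or one of its own imports is).
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back gracefully.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions(PACKAGES).items()):
        print("%-12s %s" % (name, version or "NOT INSTALLED"))
```

Any package reported as not installed can then be added with the package manager as described above.
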
Mac OS X
~~~~~~~~

'Looking for a volunteer to write this section. If you are a Mac user and
want to contribute, please contact me.'

Windows
~~~~~~~

Install:

* http://www.python.org/download/[Python 2.5 Windows Installer]
* Both the *Scipy* and *Numpy* binaries
  http://www.scipy.org/Download[from here.]
* http://sourceforge.net/project/showfiles.php?group_id=80706[Latest Matplotlib]:
  click on the win32 file for the python version you have installed (2.5
  recommended)
* *ipython*: to install it follow this
  http://showmedo.com/videos/video?name=DownloadingIPythonForMSWindows&fromSeriesID=2[video tutorial]
  (it also shows how to install the required 'readline' and 'ctypes'
  packages).

Links
-----

* http://matplotlib.sourceforge.net/[Matplotlib Homepage]
* http://www.scipy.org/[SciPy Homepage]
* http://numpy.scipy.org/[Numpy Homepage]
* http://ipython.scipy.org/moin/[IPython Homepage]
* http://www.python.org/[Python Homepage]
* http://pyplotsuite.sourceforge.net/[PyplotSuite Homepage] `;-)`

***********************
This article was generated by:

---------
$ asciidoc -a toc -a icons -a badges NumericalPythonHowto.txt
---------

/////////
The DocBook tool chain can also be used:

----------
a2x -f xhtml NumericalPythonHowto.txt
----------

This makes the TOC but I don't like the output style too much so I use
plain AsciiDoc xhtml output. To use a2x you must install (debian/ubuntu
package names) `docbook-xsl` and `xsltproc`. Package `fop` is for optional
PDF generation.
/////////
***********************