Python vs. C++ for an application that does sparse linear algebra

I'm writing an application where quite a bit of the computational time will be devoted to performing basic linear algebra operations (add, multiply, multiply by vector, multiply by scalar, etc.) on sparse matrices and vectors. Up to this point, we've built a prototype using C++ and the Boost matrix library.

I'm considering switching to Python to ease coding of the application itself, since it seems the Boost library (the easy C++ linear algebra library) isn't particularly fast anyway. This is a research/proof-of-concept application, so some reduction in run-time speed is acceptable (I assume C++ will almost always outperform Python) so long as coding time is also significantly decreased.

Basically, I'm looking for general advice from people who have used these libraries before. But specifically:

1) I've found scipy.sparse and pySparse. Are these (or other libraries) recommended?

2) What libraries beyond Boost are recommended for C++? I've seen a variety of libraries with C interfaces, but again I'm looking to do something with low complexity, if I can get relatively good performance.

3) Ultimately, will Python be somewhat comparable to C++ in terms of run time speed for the linear algebra operations? I will need to do many, many linear algebra operations and if the slowdown is significant then I probably shouldn't even try to make this switch.

Thank you in advance for any help and previous experience you can relate.


My advice is to fully test the algorithm in Python before translating it into any other language (otherwise you run the risk of prematurely optimizing a bad algorithm). Once you have clearly defined the best interface for your problem, you can factor the heavy computation out to external code.

Let me explain.

Suppose your final algorithm consists of taking a bunch of numbers in (row, column, value) format and, say, computing the SVD of the corresponding sparse matrix. Then you can leave the entire interface to Python:

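class Problem(object):
    """Holds the sparse matrix as (row, column, value) triplets."""

    def __init__(self, values):
        self.values = values

    def solve(self):
        # external_svd is a placeholder for a wrapped Fortran/C/C++ routine.
        return external_svd(self.values)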

where external_svd is a Python wrapper around a Fortran/C/C++ subroutine that efficiently computes the SVD of a matrix given in (row, column, value) format, or whatever floats your boat.

Again, first try to use numpy, scipy, and any other standard Python tool. Only then, after you've profiled your code, should you write the actual wrapper external_svd.
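
As a rough sketch of that first, scipy-only step, something like the following could stand in for external_svd until profiling shows it is too slow. The name scipy_svd, the shape argument, and the choice of k are my own assumptions for illustration, and I am assuming that a truncated SVD (the k largest singular values) is what you need:

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import svds


def scipy_svd(values, shape, k=2):
    """Hypothetical stand-in for external_svd that uses only scipy.

    values: iterable of (row, column, value) triplets
    shape:  (n_rows, n_cols) of the sparse matrix
    k:      number of singular values to compute (must be < min(shape))
    """
    rows, cols, data = zip(*values)
    matrix = coo_matrix((data, (rows, cols)), shape=shape, dtype=float).tocsc()
    u, s, vt = svds(matrix, k=k)
    # Sort explicitly so the largest singular value comes first.
    order = np.argsort(s)[::-1]
    return u[:, order], s[order], vt[order, :]


# Made-up example data in (row, column, value) format.
triplets = [(0, 0, 4.0), (1, 1, 3.0), (2, 2, 2.0), (3, 3, 1.0), (0, 3, 0.5)]
u, s, vt = scipy_svd(triplets, shape=(4, 4), k=2)
```

If this turns out to be fast enough for your matrices, you may never need to write the external wrapper at all.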

If you go this route, you will have a module which is user friendly (the user interacts with Python, not with Fortran/C/C++) and, most importantly, you will be able to use different back-ends: external_svd_lapack, external_svd_pardiso, external_svd_gsl, etc. (one for each back-end you choose).
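
To make the back-end idea concrete, here is a minimal sketch in which the back-end is just a function picked by name. The two back-ends shown (a dense numpy one and a sparse scipy one) are placeholders I chose because they need nothing beyond numpy and scipy; the names BACKENDS, svd_dense_numpy, and svd_sparse_scipy, and the shape and k parameters, are hypothetical. A wrapped LAPACK, PARDISO, or GSL routine would be registered the same way:

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import svds


def _to_coo(values, shape):
    """Build a sparse matrix from (row, column, value) triplets."""
    rows, cols, data = zip(*values)
    return coo_matrix((data, (rows, cols)), shape=shape, dtype=float)


def svd_dense_numpy(values, shape, k):
    """Back-end 1 (assumed): densify, then use numpy's dense SVD."""
    dense = _to_coo(values, shape).toarray()
    return np.linalg.svd(dense, compute_uv=False)[:k]


def svd_sparse_scipy(values, shape, k):
    """Back-end 2 (assumed): scipy's iterative sparse SVD."""
    _, s, _ = svds(_to_coo(values, shape).tocsc(), k=k)
    return np.sort(s)[::-1]  # largest singular value first


# Hypothetical registry; a wrapped external routine would be added here.
BACKENDS = {
    "dense_numpy": svd_dense_numpy,
    "sparse_scipy": svd_sparse_scipy,
}


class Problem(object):
    def __init__(self, values, shape, backend="sparse_scipy"):
        self.values = values
        self.shape = shape
        self._svd = BACKENDS[backend]

    def solve(self, k=2):
        """Return the k largest singular values using the chosen back-end."""
        return self._svd(self.values, self.shape, k)


triplets = [(0, 0, 4.0), (1, 1, 3.0), (2, 2, 2.0), (3, 3, 1.0), (0, 3, 0.5)]
dense_result = Problem(triplets, (4, 4), backend="dense_numpy").solve(k=2)
sparse_result = Problem(triplets, (4, 4), backend="sparse_scipy").solve(k=2)
```

Because every back-end takes the same arguments, you can also run two of them on the same input and compare the results, which is a cheap sanity check when you bring in a new external solver.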

As for sparse linear algebra libraries, check the Intel Math Kernel Library, the PARDISO sparse solver, and the MA27 routine from the Harwell Subroutine Library (HSL). I've used them successfully to solve very sparse, very large problems (check the page of the nonlinear optimization solver IPOPT to see what I mean).


As llasram says, many Python libraries are written in C/C++, so Python should run at an acceptable speed.

In C++ you can also try GSL (the GNU Scientific Library), but I believe its linear algebra routines will perform about the same as Boost's (both libraries use BLAS for that). For sparse linear algebra, you should take a look at SBLAS, but I have never used it. Here's a short general list of pros and cons as I see them:

  • C++:
      • Will force you to keep a well-structured program
      • Can be wrapped quite easily for high-level languages (like Python) to allow fast testing (look at the Python C API or at SWIG)
  • Python:
      • Easy to debug, but can easily lead to badly structured programs
      • Can very easily import data for tests
      • Has some very reliable libraries like scipy/numpy (by the way, scipy also uses BLAS for linear algebra)
      • Managed code

Personally, I use GSL for matrix manipulation and wrap my C++ libraries into Python libraries so I can test easily with data. In my opinion, it's a way of combining the pros of the two languages.
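
For a feel of how little code the basic operations from the question take on the Python side, here is a small sketch with scipy.sparse. The matrices and the vector are made-up data, and CSR is just one of several storage formats scipy offers; I picked it as an assumption, not as a recommendation for your particular matrices:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small made-up matrices stored in compressed sparse row (CSR) format.
A = csr_matrix(np.array([[1.0, 0.0, 2.0],
                         [0.0, 0.0, 3.0],
                         [4.0, 0.0, 0.0]]))
B = csr_matrix(np.array([[0.0, 1.0, 0.0],
                         [0.0, 2.0, 0.0],
                         [5.0, 0.0, 0.0]]))
x = np.array([1.0, 2.0, 3.0])

total = A + B      # sparse + sparse -> sparse
product = A @ B    # sparse matrix product -> sparse
matvec = A @ x     # sparse matrix times dense vector -> dense 1-D array
scaled = 2.5 * A   # scalar times sparse -> sparse
```

Each of these lines dispatches to compiled code inside scipy, so the Python overhead is paid per operation rather than per matrix element.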


    2) Looks like you are looking for Eigen.

    3) I would guess that if you are doing sparse linear algebra, sooner rather than later you will want every bit of speed-up you can get, so I'd just stick with C++. I don't see a point in using Python for this except for quickly testing a prototype, which you have already done in C++ anyway.
