performance - How to speed up a C++ sparse matrix manipulation?
I have a small script that manipulates a sparse matrix in C++. It works fine except that it takes too much time. Since I'm doing this manipulation over and over, it is critical to speed it up. I'd appreciate any ideas. Thanks.
    #include <stdio.h>      /* printf, scanf, puts, NULL */
    #include <stdlib.h>     /* srand, rand */
    #include <time.h>       /* time */
    #include <iostream>     /* cout, fixed, scientific */
    #include <string>
    #include <cmath>
    #include <vector>
    #include <list>
    #include <sstream>      /* sjw 08/09/2010 */
    #include <fstream>
    #include <Eigen/Dense>
    #include <Eigen/Sparse>

    using namespace Eigen;
    using namespace std;

    // Build an n1 x n2 sparse matrix: entries above prob are zeroed out.
    SparseMatrix<double> matmaker(int n1, int n2, double prob)
    {
        MatrixXd a = (MatrixXd::Random(n1, n2) + MatrixXd::Ones(n1, n2)) / 2;
        a = (a.array() > prob).select(0, a);
        return a.sparseView();
    }

    //////////////// this needs to be optimized /////////////////////
    int sd_func(SparseMatrix<double> &w, VectorXd &stvec, SparseMatrix<double> &wo, int taur, int taud)
    {
        // note: taur is an int, so 1/taur is integer division (0 for taur > 1)
        w = w + 1/taur*(wo - w);
        for (int k = 0; k < w.outerSize(); ++k)
            for (SparseMatrix<double>::InnerIterator it(w, k); it; ++it)
                w.coeffRef(it.row(), it.col()) = it.value() * (1 - stvec(it.col())/taud);
        return 1;
    }

    int main()
    {
        SparseMatrix<double> wo = matmaker(5000, 5000, 0.1);
        SparseMatrix<double> w = matmaker(5000, 5000, 0.1);
        VectorXd stvec = VectorXd::Random(5000);
        clock_t tsd1, tsd2;
        float timesd = 0.0;
        tsd1 = clock();
        ///////////////////////////////// any way to speed up this function???????
        sd_func(w, stvec, wo, 8000, 50);
        //////////////////////////////// ??????????
        tsd2 = clock();
        timesd += (tsd2 - tsd1);
        cout << "sd time: " << timesd / CLOCKS_PER_SEC << " s" << endl;
        return 0;
    }
The most critical performance improvement (IMO) you can make is to not use w.coeffRef(it.row(), it.col()). It performs a binary search in w for the element each time. Since you are already using SparseMatrix<double>::InnerIterator it(w, k); it is a simple change to your function to skip the binary search:
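    int sd_func_2(SparseMatrix<double> &w, VectorXd &stvec, SparseMatrix<double> &wo, int taur, int taud)
    {
        // note: taur is an int, so 1/taur is integer division (0 for taur > 1)
        w = w + 1/taur*(wo - w);
        double taudinv = 1./taud;   // multiply by the inverse instead of dividing
        for (int k = 0; k < w.outerSize(); ++k)
            for (SparseMatrix<double>::InnerIterator it(w, k); it; ++it)
                it.valueRef() *= (1 - stvec(it.col())*taudinv);   // write through the iterator, no lookup
        return 1;
    }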
This results in a 3x speedup. Note that I've incorporated @dshin's comment that multiplying is faster than division; however, the performance improvement is 90% removing the binary search and 10% multiplication vs. division.
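To see why the lookup costs so much, here is a minimal sketch of compressed-column storage in plain C++. It is not Eigen's actual implementation: `CscMatrix`, `find`, `scaleWithLookup`, `scaleInPlace` and `makeExample` are hypothetical names made up for illustration. The assumption is only that the row indices of each column are stored sorted, so a lookup by (row, col) needs a binary search, while a loop that already walks the stored entries can write to the value array directly.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal compressed-column matrix (illustration only, not Eigen's class).
struct CscMatrix {
    int rows = 0, cols = 0;
    std::vector<int> colStart;    // size cols + 1; entries of column c are [colStart[c], colStart[c+1])
    std::vector<int> rowIndex;    // row of each stored entry, sorted within a column
    std::vector<double> values;   // value of each stored entry

    // Lookup by (row, col): binary search over the rows stored in that column.
    // Returns nullptr if the entry is not stored.
    double* find(int row, int col) {
        auto first = rowIndex.begin() + colStart[col];
        auto last  = rowIndex.begin() + colStart[col + 1];
        auto it = std::lower_bound(first, last, row);
        if (it == last || *it != row) return nullptr;
        return &values[it - rowIndex.begin()];
    }
};

// Slow pattern: we already know the position p, but look the entry up again.
void scaleWithLookup(CscMatrix& m, const std::vector<double>& s, double tauInv) {
    for (int c = 0; c < m.cols; ++c)
        for (int p = m.colStart[c]; p < m.colStart[c + 1]; ++p) {
            double* v = m.find(m.rowIndex[p], c);   // O(log nnz_in_column) per entry
            *v = *v * (1.0 - s[c] * tauInv);
        }
}

// Fast pattern: write straight into the value array at position p.
void scaleInPlace(CscMatrix& m, const std::vector<double>& s, double tauInv) {
    for (int c = 0; c < m.cols; ++c) {
        const double factor = 1.0 - s[c] * tauInv;  // depends only on the column
        for (int p = m.colStart[c]; p < m.colStart[c + 1]; ++p)
            m.values[p] *= factor;
    }
}

// Small 3x3 example:
//   [1 0 4]
//   [0 3 0]
//   [2 0 5]
CscMatrix makeExample() {
    CscMatrix m;
    m.rows = 3;
    m.cols = 3;
    m.colStart = {0, 2, 3, 5};
    m.rowIndex = {0, 2, 1, 0, 2};
    m.values   = {1.0, 2.0, 3.0, 4.0, 5.0};
    return m;
}
```

Both functions produce the same values; the second one just skips the search. The sketch also hoists the per-column factor out of the inner loop, which is possible here because the factor depends only on the column index.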