Juliette Salexa
2010-04-05 02:14:06 UTC
Hello,
I've had a problem that's been bothering me for a couple of weeks now and wanted to see other people's opinions,
Apart from cases like this: http://www.mathworks.com/matlabcentral/newsreader/view_thread/255653#664380
where the for loop is actually more efficient than all vectorized alternatives,
as a rule of thumb I think most people would agree that MATLAB's matrix multiplication is much faster than doing the summation in a for loop.
The problem I have is that when the matrices become extremely large, even the sparse representation takes up >100GB of memory, so its multiplication by a vector still ends up being slow.
If I were to do the summation in a for loop instead of using matrix-vector multiplication,
I could delete elements of the summation as they are added to the total sum ...
thereby saving on the storage costs of storing ALL elements of the sum in a big matrix.
In other words, the matrix-vector multiplication way stores ALL elements of the sum, even after some of them might not be needed anymore - while doing the summation in a for loop allows one to generate the elements on the fly, and delete them after they've been added to the total sum.
-----
In matlab, is the matrix-vector multiplication not itself just a for loop that is implemented efficiently since it was compiled to machine code ?
If that's true, then if I code a for loop summation and compile it to machine code (using FORTRAN or C) would it be just as fast as matrix-vector multiplication ? or is there something else that makes the matrix-vector multiplication more efficient ?
In each iteration of the for loop I might have to be calling an external function, or reading something from a file in order to figure out what is going to be added to the total sum, would this slow down my code considerably ?
I'm hoping there's a way to code a for loop that's just as fast as matrix-vector multiplication but doesn't store unnecessary elements
I've had a problem that's been bothering me for a couple of weeks now and wanted to see other people's opinions,
Apart from cases like this: http://www.mathworks.com/matlabcentral/newsreader/view_thread/255653#664380
where the for loop is actually more efficient than all vectorized alternatives,
as a rule of thumb I think most people would agree that MATLAB's matrix multiplication is much faster than doing the summation in a for loop.
The problem I have is that when the matrices become extremely large, even the sparse representation takes up >100GB of memory, so its multiplication by a vector still ends up being slow.
If I were to do the summation in a for loop instead of using matrix-vector multiplication,
I could delete elements of the summation as they are added to the total sum ...
thereby saving on the storage costs of storing ALL elements of the sum in a big matrix.
In other words, the matrix-vector multiplication way stores ALL elements of the sum, even after some of them might not be needed anymore - while doing the summation in a for loop allows one to generate the elements on the fly, and delete them after they've been added to the total sum.
-----
In matlab, is the matrix-vector multiplication not itself just a for loop that is implemented efficiently since it was compiled to machine code ?
If that's true, then if I code a for loop summation and compile it to machine code (using FORTRAN or C) would it be just as fast as matrix-vector multiplication ? or is there something else that makes the matrix-vector multiplication more efficient ?
In each iteration of the for loop I might have to be calling an external function, or reading something from a file in order to figure out what is going to be added to the total sum, would this slow down my code considerably ?
I'm hoping there's a way to code a for loop that's just as fast as matrix-vector multiplication but doesn't store unnecessary elements