Post by Bruno Luong
Post by Daniel Vasilaky
Thank you all for your comments!
Is my algorithm a continuous function of the data for a fixed perturbation parameter?
In theory, yes; in practice one has to deal with round-off error.
Let A be an m-by-n matrix with m >= n (for m = n, A is ill-conditioned; for m > n, A'A is ill-conditioned).
Consider the equation Ax = b, let x_0 be some initial prior solution, and let c > 0 be a perturbation parameter in the penalized functional ||Ax - b||^2 + c||x - x_0||^2.
This can be solved by backslash:
[A; eye(size(x0,1))] \ [b; c*x0]
This penalization is well known; among other fields, it is used in climate and weather forecasting.
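For reference, here is a minimal sketch of that stacked formulation with placeholder data (A, b, x0, c below are illustrative, not from this thread); scaling both penalty rows by sqrt(c) makes the solve minimize ||Ax - b||^2 + c*||x - x0||^2:

% Minimal sketch: minimize ||A*x - b||^2 + c*||x - x0||^2 by stacking
% penalty rows under A and solving with backslash.
m = 60; n = 40;            % illustrative sizes, not from the thread
A  = randn(m, n);          % example matrix
b  = randn(m, 1);          % example right-hand side
x0 = zeros(n, 1);          % prior / initial solution
c  = 1e-6;                 % perturbation (regularization) parameter
xhat = [A; sqrt(c)*eye(n)] \ [b; sqrt(c)*x0];   % penalized least-squares solution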
Post by Daniel Vasilaky
The extended algorithm produces a sequence x_0, x_1, x_2, ... which converges exponentially fast to the solution of Ax = b for m = n, and for m > n to the solution of A'Ax = A'b (the normal equations), independent of the chosen initial values x_0 and c > 0.
Simple Tikhonov regularization is the special case in which x_0 is the zero vector, but zero is usually not the best choice in most applications.
Of course, a well-chosen x_0 and c > 0 will reduce the number of successive approximations needed, but convergence is always guaranteed. Each successive approximation takes mn flops. A reasonable stopping rule for the successive approximations is: stop as soon as the relative residual error exceeds the previous relative residual error, i.e., stop if
||Ax_i - b||/||b|| > ||Ax_(i-1) - b||/||b||.
This basically says: stop when round-off error starts to make things worse. It may be overkill, because experiments show that one can stop much sooner.
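A minimal sketch of that iteration with the stated stopping rule (this reuses the stacked-backslash solve for each step purely as a stand-in solver, not the QR-based algorithm described below; A, b, x0, c are assumed to be defined as in the sketch above, and maxit is an illustrative cap):

% Successive approximations x_i = argmin ||A*x - b||^2 + c*||x - x_(i-1)||^2,
% stopping as soon as the relative residual gets worse (round-off taking over).
n      = size(A, 2);
maxit  = 5000;
x_prev = x0;
r_prev = norm(A*x_prev - b) / norm(b);                    % relative residual of x_0
for i = 1:maxit
    x_new = [A; sqrt(c)*eye(n)] \ [b; sqrt(c)*x_prev];    % one penalized solve
    r_new = norm(A*x_new - b) / norm(b);
    if r_new > r_prev                                      % STOP rule from the post
        break
    end
    x_prev = x_new;
    r_prev = r_new;
end
x_est = x_prev;                                            % last iterate before the residual worsened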
Using conjugate gradients, it would probably converge even faster.
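For example, the same augmented system could be handed to an iterative least-squares solver such as MATLAB's lsqr; a sketch only (the tolerance and iteration count are illustrative, and A, b, x0, c, n are assumed defined as above):

% Sketch: solve the augmented least-squares system iteratively with lsqr.
x_iter = lsqr([A; sqrt(c)*eye(n)], [b; sqrt(c)*x0], 1e-10, 1000);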
Post by Daniel Vasilaky
Loosely speaking, the proof and derivation of the above result reduce Richard Bellman's functional recurrence equation to a recurrence involving what appear to be approximate Householder-type projections, i.e., (I - vv'/(c + v'v)), tempered by the parameter c > 0. The recurrence relation turns out to be a generalization of the Gram-Schmidt process, similar to but not the same as Householder orthogonalization. This in turn produces an approximate QR factorization of A, which solves the extended perturbed problem. The orthogonalization is done once and takes approximately 2mn^2 flops.
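Purely as an illustration of the projector mentioned above (the derivation itself is not given in this thread), here is the tempered projector for a single vector v; as c -> 0 it reduces to the ordinary orthogonal projector onto the complement of v:

% Illustration only: the tempered projector P_c = I - v*v'/(c + v'*v).
v  = randn(5, 1);                        % example vector, not from the thread
c  = 1e-3;
Pc = eye(5) - (v*v') / (c + v'*v);       % tempered projector, c > 0
P0 = eye(5) - (v*v') / (v'*v);           % ordinary projector (c = 0 limit)
norm(Pc*v)                               % small but nonzero for c > 0
norm(P0*v)                               % essentially zero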
This is, again, well-known stuff.
Post by Daniel Vasilaky
It is possible to speed things up with Björck's modified Gram-Schmidt (MGS) trick.
The extended algorithm works well in my anomaly-detection and anatomical-image-part-reconstruction algorithms and has been accepted for publication in a biomedical engineering journal, but the extended algorithm itself is only referenced there, not discussed. The derivation of the algorithm is quite tricky and the linear algebra is laborious, but the convergence proof is not difficult. Not being from the math community, I don't know which journal might possibly accept it. Any suggestions would be greatly appreciated.
Please make sure there is something really novel in your algorithm before working on the publication.
Bruno
Bruno, thanks for the tip on [A; eye(size(x,1))] \ [b; c*x_0]! Is there more information on this?
I ran extensive tests which show that for severely ill-conditioned matrices my algorithm significantly outperforms both backslash \ and extended backslash
[A; eye(size(x,1))] \ [b; c*x_0];
However, for a sparse matrix, such as the Harwell-Boeing test matrix "west0479" with cond(A) = 1.4244e+12, the simple backslash \ gives by far the best results.
But even in the "west0479" case my algorithm converges to the true solution; there is simply no reason to use it when backslash \ is much faster.
Here is a sample of test results: x is the true solution, x_0 the initial solution, and x1 the computed solution.
In this example x_0 is the zero vector, and I tried parameter values from c = 1e-3 to c = 1e-12.
Test case: hilb(25), cond(A) = 1.9684e+18, x = vector of ones, right-hand side b = A*x.
I also tried x_0 = x and got similar results.
Successive approximations for the extended backslash were looped as:
x_n = [A; eye(size(x,1))] \ [b; c*x_(n-1)]
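For concreteness, here is a sketch of how the hilb(25) test and the extended-backslash loop above can be set up (the numbers reported below come from the original test runs, not from this sketch):

% Sketch of the hilb(25) test setup and the extended-backslash loop above.
n  = 25;
A  = hilb(n);                     % severely ill-conditioned (cond(A) reported above)
x  = ones(n, 1);                  % true solution: vector of ones
b  = A*x;                         % right-hand side
c  = 1e-9;                        % one of the parameter values tried
x1 = zeros(n, 1);                 % x_0 = zero vector
for k = 1:500                     % iteration count, as in the runs below
    x1 = [A; eye(n)] \ [b; c*x1]; % extended-backslash iteration
end
err = max(norm(x - x1))           % error measure reported in the post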
For hilb(25):
Simple backslash (\) :
max(norm(x-x1)) = 110.7277
Extended backslash [A; eye(size(x,1))] \ [b; c*x_0] 0 iterations:
max(norm(x-x1)) = 2.6627
Extended backslash [A; eye(size(x,1))] \ [b; c*x_0] 500 iterations:
max(norm(x-x1)) = 2.6890
Extended backslash [A; eye(size(x,1))] \ [b; c*x_0] 5,000 iterations:
max(norm(x-x1)) = 2.6890 (does not improve)
My algorithm running 500 successive approximations:
max(norm(x-x1)) = 0.0811
My algorithm running 1,000 successive approximations:
max(norm(x-x1)) = 0.0573
My algorithm running 5,000 successive approximations:
max(norm(x-x1)) = 0.0488
In the above example we can see that the extended backslash
[A; eye(size(x,1))] \ [b; c*x_0]
does not converge to the exact solution; it stays put after a few iterations.
My algorithm, on the other hand, keeps converging. This turned out to be the case for other hilb(n) matrices as well, and the higher the dimension n, the greater my algorithm's advantage.
I also compared the performance on the discretization of a first-kind Fredholm integral equation with kernel K and right-hand side g given by
K(s,t) = exp(s*cos(t)), g(s) = 2*sinh(s)/s, with integration intervals s in [0, pi/2], t in [0, pi], and n = 200.
The solution is given by x(t) = sin(t).
Reference: M. L. Baart, "The use of auto-correlation for pseudo-rank determination in noisy ill-conditioned linear least-squares problems", IMA J. Numer. Anal. 2 (1982), 241-247.
Discretized by the Galerkin method with orthonormal box functions;
one integration is exact, the other is done by Simpson's rule.
(Test problem implementation by Per Christian Hansen, IMM, 09/16/92.)
cond(A) = 1.6196e+18, x_0 = zero vector, c = 1e-09, x = true solution, x1 = computed solution.
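For reference, this test problem is the one generated by baart(n) in Per Christian Hansen's Regularization Tools; a sketch of the setup, assuming that toolbox is on the MATLAB path:

% Sketch: Baart test problem via Regularization Tools (assumed installed).
n  = 200;
[A, b, x] = baart(n);             % discretized first-kind Fredholm equation
c  = 1e-9;
x0 = zeros(n, 1);
x1 = [A; eye(n)] \ [b; c*x0];     % extended backslash, 0 iterations
err = max(norm(x - x1))           % error measure reported below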
Simple backslash (\):
max(norm(x-x1)) = 199.0953
Extended backslash [A; eye(size(x,1))] \ [b; c*x_0] 0 iterations:
max(norm(x-x1)) = 0.7065
Extended backslash [A; eye(size(x,1))] \ [b; c*x_0] 5,000 iterations:
max(norm(x-x1)) = 0.7065 (not a typo)
My algorithm running 5,000 iterations:
max(norm(x-x1)) = 0.0369 (a 19-fold improvement)
Would these results be considered "novel" by the math community?
Thanks for your help,
Dan