# HG changeset patch # User jwe # Date 1193413978 0 # Node ID b48d486f641de961050a11bfe741177426bfbd30 # Parent c3b479e753dd21d1f8585aa530be5f9077a7c181 [project @ 2007-10-26 15:52:57 by jwe] diff -r c3b479e753dd -r b48d486f641d PROJECTS --- a/PROJECTS Fri Oct 26 15:14:35 2007 +0000 +++ b/PROJECTS Fri Oct 26 15:52:58 2007 +0000 @@ -66,7 +66,7 @@ * Consider making the behavior of the / and \ operators for non-square systems compatible with Matlab. Currently, they return - the minimum norm solution from DGELSS, which behaves differently. + the minimum norm solution from DGELSD, which behaves differently. --------------- Sparse Matrices: diff -r c3b479e753dd -r b48d486f641d doc/interpreter/linalg.txi --- a/doc/interpreter/linalg.txi Fri Oct 26 15:14:35 2007 +0000 +++ b/doc/interpreter/linalg.txi Fri Oct 26 15:52:58 2007 +0000 @@ -63,7 +63,7 @@ @item If the matrix is not square, or any of the previous solvers flags a singular or near singular matrix, find a least squares solution using -the @sc{Lapack} xGELSS function. +the @sc{Lapack} xGELSD function. @end enumerate The user can force the type of the matrix with the @code{matrix_type} diff -r c3b479e753dd -r b48d486f641d libcruft/ChangeLog --- a/libcruft/ChangeLog Fri Oct 26 15:14:35 2007 +0000 +++ b/libcruft/ChangeLog Fri Oct 26 15:52:58 2007 +0000 @@ -1,3 +1,19 @@ +2007-10-26 John W. Eaton + + * lapack/dlals0.f: New file. + * lapack/Makefile.in (FSRC): Add it to the list. + +2007-10-26 David Bateman + + * lapack/dgelsd.f, lapack/dlalsd.f, lapack/dlalsa.f, + lapack/dlasda.f, lapack/dlasdt.f, lapack/dlasdq.f + lapack/dlamrg.f, lapack/dlasd0.f, lapack/dlasd1.f, + lapack/dlasd2.f, lapack/dlasd3.f, lapack/dlasd4.f, + lapack/dlasd5.f, lapack/dlasd6.f, lapack/dlasd7.f, + lapack/dlasd8.f, lapack/dlaed6.f, lapack/zgelsd.f, + lapack/zlalsd.f , lapack/zlalsa.f, lapack/zlals0.f: New files. + * lapack/Makefile.in (FSRC): Include them here. + 2007-10-23 John W. Eaton * lapack/dgtts2.f, lapack/zgtts2.f: New files. diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/Makefile.in --- a/libcruft/lapack/Makefile.in Fri Oct 26 15:14:35 2007 +0000 +++ b/libcruft/lapack/Makefile.in Fri Oct 26 15:52:58 2007 +0000 @@ -28,44 +28,47 @@ FSRC = dbdsqr.f dgbcon.f dgbtf2.f dgbtrf.f dgbtrs.f dgebak.f dgebal.f \ dgebd2.f dgebrd.f dgecon.f dgeesx.f dgeev.f dgehd2.f dgehrd.f \ - dgelq2.f dgelqf.f dgelss.f dgelsy.f dgeqp3.f dgeqpf.f dgeqr2.f \ - dgeqrf.f dgesv.f dgesvd.f dgetf2.f dgetrf.f dgetri.f dgetrs.f \ - dggbak.f dggbal.f dgghrd.f dgtsv.f dgttrf.f dgttrs.f dgtts2.f dhgeqz.f \ - dhseqr.f dlabad.f dlabrd.f dlacn2.f dlacon.f dlacpy.f dladiv.f \ - dlae2.f dlaev2.f dlaexc.f dlag2.f dlahqr.f dlahr2.f dlahrd.f \ - dlaic1.f dlaln2.f dlamc1.f dlamc2.f dlamc3.f dlamc4.f dlamc5.f \ - dlamch.f dlange.f dlanhs.f dlanst.f dlansy.f dlantr.f dlanv2.f \ + dgelq2.f dgelqf.f dgelsd.f dgelss.f dgelsy.f dgeqp3.f dgeqpf.f \ + dgeqr2.f dgeqrf.f dgesv.f dgesvd.f dgetf2.f dgetrf.f dgetri.f \ + dgetrs.f dggbak.f dggbal.f dgghrd.f dgtsv.f dgttrf.f dgttrs.f \ + dgtts2.f dhgeqz.f dhseqr.f dlabad.f dlabrd.f dlacn2.f dlacon.f \ + dlacpy.f dladiv.f dlae2.f dlaed6.f dlaev2.f dlaexc.f dlag2.f \ + dlahqr.f dlahr2.f dlahrd.f dlaic1.f dlaln2.f dlals0.f dlalsa.f \ + dlalsd.f dlamc1.f dlamc2.f dlamc3.f dlamc4.f dlamc5.f dlamch.f \ + dlamrg.f dlange.f dlanhs.f dlanst.f dlansy.f dlantr.f dlanv2.f \ dlapy2.f dlapy3.f dlaqp2.f dlaqps.f dlaqr0.f dlaqr1.f dlaqr2.f \ dlaqr3.f dlaqr4.f dlaqr5.f dlarf.f dlarfb.f dlarfg.f dlarft.f \ dlarfx.f dlartg.f dlarz.f dlarzb.f dlarzt.f dlas2.f dlascl.f \ - dlaset.f dlasq1.f dlasq2.f dlasq3.f dlasq4.f dlasq5.f dlasq6.f \ - dlasr.f dlasrt.f dlassq.f dlasv2.f dlaswp.f dlasy2.f dlatbs.f \ - dlatrd.f dlatrs.f dlatrz.f dlauu2.f dlauum.f dlazq3.f dlazq4.f \ - dorg2l.f dorg2r.f dorgbr.f dorghr.f dorgl2.f dorglq.f dorgql.f \ - dorgqr.f dorgtr.f dorm2r.f dormbr.f dorml2.f dormlq.f dormqr.f \ - dormr3.f dormrz.f dpbcon.f dpbtf2.f dpbtrf.f dpbtrs.f dpocon.f \ - dpotf2.f dpotrf.f dpotri.f dpotrs.f dptsv.f dpttrf.f dpttrs.f \ - dptts2.f drscl.f dsteqr.f dsterf.f dsyev.f dsytd2.f dsytrd.f \ - dtgevc.f dtrcon.f dtrevc.f dtrexc.f dtrsen.f dtrsyl.f dtrti2.f \ - dtrtri.f dtrtrs.f dtzrzf.f dzsum1.f ieeeck.f ilaenv.f iparmq.f \ - izmax1.f spotf2.f spotrf.f zbdsqr.f zdrscl.f zgbcon.f zgbtf2.f \ - zgbtrf.f zgbtrs.f zgebak.f zgebal.f zgebd2.f zgebrd.f zgecon.f \ - zgeesx.f zgeev.f zgehd2.f zgehrd.f zgelq2.f zgelqf.f zgelss.f \ - zgelsy.f zgeqp3.f zgeqpf.f zgeqr2.f zgeqrf.f zgesv.f zgesvd.f \ - zgetf2.f zgetrf.f zgetri.f zgetrs.f zggbal.f zgtsv.f zgttrf.f \ - zgttrs.f zgtts2.f zheev.f zhetd2.f zhetrd.f zhseqr.f zlabrd.f zlacgv.f \ + dlasd0.f dlasd1.f dlasd2.f dlasd3.f dlasd4.f dlasd5.f dlasd6.f \ + dlasd7.f dlasd8.f dlasda.f dlasdq.f dlasdt.f dlaset.f dlasq1.f \ + dlasq2.f dlasq3.f dlasq4.f dlasq5.f dlasq6.f dlasr.f dlasrt.f \ + dlassq.f dlasv2.f dlaswp.f dlasy2.f dlatbs.f dlatrd.f dlatrs.f \ + dlatrz.f dlauu2.f dlauum.f dlazq3.f dlazq4.f dorg2l.f dorg2r.f \ + dorgbr.f dorghr.f dorgl2.f dorglq.f dorgql.f dorgqr.f dorgtr.f \ + dorm2r.f dormbr.f dorml2.f dormlq.f dormqr.f dormr3.f dormrz.f \ + dpbcon.f dpbtf2.f dpbtrf.f dpbtrs.f dpocon.f dpotf2.f dpotrf.f \ + dpotri.f dpotrs.f dptsv.f dpttrf.f dpttrs.f dptts2.f drscl.f \ + dsteqr.f dsterf.f dsyev.f dsytd2.f dsytrd.f dtgevc.f dtrcon.f \ + dtrevc.f dtrexc.f dtrsen.f dtrsyl.f dtrti2.f dtrtri.f dtrtrs.f \ + dtzrzf.f dzsum1.f ieeeck.f ilaenv.f iparmq.f izmax1.f spotf2.f \ + spotrf.f zbdsqr.f zdrscl.f zgbcon.f zgbtf2.f zgbtrf.f zgbtrs.f \ + zgebak.f zgebal.f zgebd2.f zgebrd.f zgecon.f zgeesx.f zgeev.f \ + zgehd2.f zgehrd.f zgelq2.f zgelqf.f zgelsd.f zgelss.f zgelsy.f \ + zgeqp3.f zgeqpf.f zgeqr2.f zgeqrf.f zgesv.f zgesvd.f zgetf2.f \ + zgetrf.f zgetri.f zgetrs.f zggbal.f zgtsv.f zgttrf.f zgttrs.f \ + zgtts2.f zheev.f zhetd2.f zhetrd.f zhseqr.f zlabrd.f zlacgv.f \ zlacn2.f zlacon.f zlacpy.f zladiv.f zlahqr.f zlahr2.f zlahrd.f \ - zlaic1.f zlange.f zlanhe.f zlanhs.f zlantr.f zlaqp2.f zlaqps.f \ - zlaqr0.f zlaqr1.f zlaqr2.f zlaqr3.f zlaqr4.f zlaqr5.f zlarf.f \ - zlarfb.f zlarfg.f zlarft.f zlarfx.f zlartg.f zlarz.f zlarzb.f \ - zlarzt.f zlascl.f zlaset.f zlasr.f zlassq.f zlaswp.f zlatbs.f \ - zlatrd.f zlatrs.f zlatrz.f zlauu2.f zlauum.f zpbcon.f zpbtf2.f \ - zpbtrf.f zpbtrs.f zpocon.f zpotf2.f zpotrf.f zpotri.f zpotrs.f \ - zptsv.f zpttrf.f zpttrs.f zptts2.f zrot.f zsteqr.f ztrcon.f ztrevc.f \ - ztrexc.f ztrsen.f ztrsyl.f ztrti2.f ztrtri.f ztrtrs.f ztzrzf.f \ - zung2l.f zung2r.f zungbr.f zunghr.f zungl2.f zunglq.f zungql.f \ - zungqr.f zungtr.f zunm2r.f zunmbr.f zunml2.f zunmlq.f zunmqr.f \ - zunmr3.f zunmrz.f + zlaic1.f zlals0.f zlalsa.f zlalsd.f zlange.f zlanhe.f zlanhs.f \ + zlantr.f zlaqp2.f zlaqps.f zlaqr0.f zlaqr1.f zlaqr2.f zlaqr3.f \ + zlaqr4.f zlaqr5.f zlarf.f zlarfb.f zlarfg.f zlarft.f zlarfx.f \ + zlartg.f zlarz.f zlarzb.f zlarzt.f zlascl.f zlaset.f zlasr.f \ + zlassq.f zlaswp.f zlatbs.f zlatrd.f zlatrs.f zlatrz.f zlauu2.f \ + zlauum.f zpbcon.f zpbtf2.f zpbtrf.f zpbtrs.f zpocon.f zpotf2.f \ + zpotrf.f zpotri.f zpotrs.f zptsv.f zpttrf.f zpttrs.f zptts2.f zrot.f \ + zsteqr.f ztrcon.f ztrevc.f ztrexc.f ztrsen.f ztrsyl.f ztrti2.f \ + ztrtri.f ztrtrs.f ztzrzf.f zung2l.f zung2r.f zungbr.f zunghr.f \ + zungl2.f zunglq.f zungql.f zungqr.f zungtr.f zunm2r.f zunmbr.f \ + zunml2.f zunmlq.f zunmqr.f zunmr3.f zunmrz.f include $(TOPDIR)/Makeconf diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dgelsd.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dgelsd.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,528 @@ + SUBROUTINE DGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND, RANK, + $ WORK, LWORK, IWORK, INFO ) +* +* -- LAPACK driver routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK + DOUBLE PRECISION RCOND +* .. +* .. Array Arguments .. + INTEGER IWORK( * ) + DOUBLE PRECISION A( LDA, * ), B( LDB, * ), S( * ), WORK( * ) +* .. +* +* Purpose +* ======= +* +* DGELSD computes the minimum-norm solution to a real linear least +* squares problem: +* minimize 2-norm(| b - A*x |) +* using the singular value decomposition (SVD) of A. A is an M-by-N +* matrix which may be rank-deficient. +* +* Several right hand side vectors b and solution vectors x can be +* handled in a single call; they are stored as the columns of the +* M-by-NRHS right hand side matrix B and the N-by-NRHS solution +* matrix X. +* +* The problem is solved in three steps: +* (1) Reduce the coefficient matrix A to bidiagonal form with +* Householder transformations, reducing the original problem +* into a "bidiagonal least squares problem" (BLS) +* (2) Solve the BLS using a divide and conquer approach. +* (3) Apply back all the Householder tranformations to solve +* the original least squares problem. +* +* The effective rank of A is determined by treating as zero those +* singular values which are less than RCOND times the largest singular +* value. +* +* The divide and conquer algorithm makes very mild assumptions about +* floating point arithmetic. It will work on machines with a guard +* digit in add/subtract, or on those binary machines without guard +* digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or +* Cray-2. It could conceivably fail on hexadecimal or decimal machines +* without guard digits, but we know of none. +* +* Arguments +* ========= +* +* M (input) INTEGER +* The number of rows of A. M >= 0. +* +* N (input) INTEGER +* The number of columns of A. N >= 0. +* +* NRHS (input) INTEGER +* The number of right hand sides, i.e., the number of columns +* of the matrices B and X. NRHS >= 0. +* +* A (input) DOUBLE PRECISION array, dimension (LDA,N) +* On entry, the M-by-N matrix A. +* On exit, A has been destroyed. +* +* LDA (input) INTEGER +* The leading dimension of the array A. LDA >= max(1,M). +* +* B (input/output) DOUBLE PRECISION array, dimension (LDB,NRHS) +* On entry, the M-by-NRHS right hand side matrix B. +* On exit, B is overwritten by the N-by-NRHS solution +* matrix X. If m >= n and RANK = n, the residual +* sum-of-squares for the solution in the i-th column is given +* by the sum of squares of elements n+1:m in that column. +* +* LDB (input) INTEGER +* The leading dimension of the array B. LDB >= max(1,max(M,N)). +* +* S (output) DOUBLE PRECISION array, dimension (min(M,N)) +* The singular values of A in decreasing order. +* The condition number of A in the 2-norm = S(1)/S(min(m,n)). +* +* RCOND (input) DOUBLE PRECISION +* RCOND is used to determine the effective rank of A. +* Singular values S(i) <= RCOND*S(1) are treated as zero. +* If RCOND < 0, machine precision is used instead. +* +* RANK (output) INTEGER +* The effective rank of A, i.e., the number of singular values +* which are greater than RCOND*S(1). +* +* WORK (workspace/output) DOUBLE PRECISION array, dimension (MAX(1,LWORK)) +* On exit, if INFO = 0, WORK(1) returns the optimal LWORK. +* +* LWORK (input) INTEGER +* The dimension of the array WORK. LWORK must be at least 1. +* The exact minimum amount of workspace needed depends on M, +* N and NRHS. As long as LWORK is at least +* 12*N + 2*N*SMLSIZ + 8*N*NLVL + N*NRHS + (SMLSIZ+1)**2, +* if M is greater than or equal to N or +* 12*M + 2*M*SMLSIZ + 8*M*NLVL + M*NRHS + (SMLSIZ+1)**2, +* if M is less than N, the code will execute correctly. +* SMLSIZ is returned by ILAENV and is equal to the maximum +* size of the subproblems at the bottom of the computation +* tree (usually about 25), and +* NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 ) +* For good performance, LWORK should generally be larger. +* +* If LWORK = -1, then a workspace query is assumed; the routine +* only calculates the optimal size of the WORK array, returns +* this value as the first entry of the WORK array, and no error +* message related to LWORK is issued by XERBLA. +* +* IWORK (workspace) INTEGER array, dimension (MAX(1,LIWORK)) +* LIWORK >= 3 * MINMN * NLVL + 11 * MINMN, +* where MINMN = MIN( M,N ). +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: the algorithm for computing the SVD failed to converge; +* if INFO = i, i off-diagonal elements of an intermediate +* bidiagonal form did not converge to zero. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0 ) +* .. +* .. Local Scalars .. + LOGICAL LQUERY + INTEGER IASCL, IBSCL, IE, IL, ITAU, ITAUP, ITAUQ, + $ LDWORK, MAXMN, MAXWRK, MINMN, MINWRK, MM, + $ MNTHR, NLVL, NWORK, SMLSIZ, WLALSD + DOUBLE PRECISION ANRM, BIGNUM, BNRM, EPS, SFMIN, SMLNUM +* .. +* .. External Subroutines .. + EXTERNAL DGEBRD, DGELQF, DGEQRF, DLABAD, DLACPY, DLALSD, + $ DLASCL, DLASET, DORMBR, DORMLQ, DORMQR, XERBLA +* .. +* .. External Functions .. + INTEGER ILAENV + DOUBLE PRECISION DLAMCH, DLANGE + EXTERNAL ILAENV, DLAMCH, DLANGE +* .. +* .. Intrinsic Functions .. + INTRINSIC DBLE, INT, LOG, MAX, MIN +* .. +* .. Executable Statements .. +* +* Test the input arguments. +* + INFO = 0 + MINMN = MIN( M, N ) + MAXMN = MAX( M, N ) + MNTHR = ILAENV( 6, 'DGELSD', ' ', M, N, NRHS, -1 ) + LQUERY = ( LWORK.EQ.-1 ) + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( NRHS.LT.0 ) THEN + INFO = -3 + ELSE IF( LDA.LT.MAX( 1, M ) ) THEN + INFO = -5 + ELSE IF( LDB.LT.MAX( 1, MAXMN ) ) THEN + INFO = -7 + END IF +* + SMLSIZ = ILAENV( 9, 'DGELSD', ' ', 0, 0, 0, 0 ) +* +* Compute workspace. +* (Note: Comments in the code beginning "Workspace:" describe the +* minimal amount of workspace needed at that point in the code, +* as well as the preferred amount for good performance. +* NB refers to the optimal block size for the immediately +* following subroutine, as returned by ILAENV.) +* + MINWRK = 1 + MINMN = MAX( 1, MINMN ) + NLVL = MAX( INT( LOG( DBLE( MINMN ) / DBLE( SMLSIZ+1 ) ) / + $ LOG( TWO ) ) + 1, 0 ) +* + IF( INFO.EQ.0 ) THEN + MAXWRK = 0 + MM = M + IF( M.GE.N .AND. M.GE.MNTHR ) THEN +* +* Path 1a - overdetermined, with many more rows than columns. +* + MM = N + MAXWRK = MAX( MAXWRK, N+N*ILAENV( 1, 'DGEQRF', ' ', M, N, + $ -1, -1 ) ) + MAXWRK = MAX( MAXWRK, N+NRHS* + $ ILAENV( 1, 'DORMQR', 'LT', M, NRHS, N, -1 ) ) + END IF + IF( M.GE.N ) THEN +* +* Path 1 - overdetermined or exactly determined. +* + MAXWRK = MAX( MAXWRK, 3*N+( MM+N )* + $ ILAENV( 1, 'DGEBRD', ' ', MM, N, -1, -1 ) ) + MAXWRK = MAX( MAXWRK, 3*N+NRHS* + $ ILAENV( 1, 'DORMBR', 'QLT', MM, NRHS, N, -1 ) ) + MAXWRK = MAX( MAXWRK, 3*N+( N-1 )* + $ ILAENV( 1, 'DORMBR', 'PLN', N, NRHS, N, -1 ) ) + WLALSD = 9*N+2*N*SMLSIZ+8*N*NLVL+N*NRHS+(SMLSIZ+1)**2 + MAXWRK = MAX( MAXWRK, 3*N+WLALSD ) + MINWRK = MAX( 3*N+MM, 3*N+NRHS, 3*N+WLALSD ) + END IF + IF( N.GT.M ) THEN + WLALSD = 9*M+2*M*SMLSIZ+8*M*NLVL+M*NRHS+(SMLSIZ+1)**2 + IF( N.GE.MNTHR ) THEN +* +* Path 2a - underdetermined, with many more columns +* than rows. +* + MAXWRK = M + M*ILAENV( 1, 'DGELQF', ' ', M, N, -1, -1 ) + MAXWRK = MAX( MAXWRK, M*M+4*M+2*M* + $ ILAENV( 1, 'DGEBRD', ' ', M, M, -1, -1 ) ) + MAXWRK = MAX( MAXWRK, M*M+4*M+NRHS* + $ ILAENV( 1, 'DORMBR', 'QLT', M, NRHS, M, -1 ) ) + MAXWRK = MAX( MAXWRK, M*M+4*M+( M-1 )* + $ ILAENV( 1, 'DORMBR', 'PLN', M, NRHS, M, -1 ) ) + IF( NRHS.GT.1 ) THEN + MAXWRK = MAX( MAXWRK, M*M+M+M*NRHS ) + ELSE + MAXWRK = MAX( MAXWRK, M*M+2*M ) + END IF + MAXWRK = MAX( MAXWRK, M+NRHS* + $ ILAENV( 1, 'DORMLQ', 'LT', N, NRHS, M, -1 ) ) + MAXWRK = MAX( MAXWRK, M*M+4*M+WLALSD ) + ELSE +* +* Path 2 - remaining underdetermined cases. +* + MAXWRK = 3*M + ( N+M )*ILAENV( 1, 'DGEBRD', ' ', M, N, + $ -1, -1 ) + MAXWRK = MAX( MAXWRK, 3*M+NRHS* + $ ILAENV( 1, 'DORMBR', 'QLT', M, NRHS, N, -1 ) ) + MAXWRK = MAX( MAXWRK, 3*M+M* + $ ILAENV( 1, 'DORMBR', 'PLN', N, NRHS, M, -1 ) ) + MAXWRK = MAX( MAXWRK, 3*M+WLALSD ) + END IF + MINWRK = MAX( 3*M+NRHS, 3*M+M, 3*M+WLALSD ) + END IF + MINWRK = MIN( MINWRK, MAXWRK ) + WORK( 1 ) = MAXWRK + IF( LWORK.LT.MINWRK .AND. .NOT.LQUERY ) THEN + INFO = -12 + END IF + END IF +* + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DGELSD', -INFO ) + RETURN + ELSE IF( LQUERY ) THEN + GO TO 10 + END IF +* +* Quick return if possible. +* + IF( M.EQ.0 .OR. N.EQ.0 ) THEN + RANK = 0 + RETURN + END IF +* +* Get machine parameters. +* + EPS = DLAMCH( 'P' ) + SFMIN = DLAMCH( 'S' ) + SMLNUM = SFMIN / EPS + BIGNUM = ONE / SMLNUM + CALL DLABAD( SMLNUM, BIGNUM ) +* +* Scale A if max entry outside range [SMLNUM,BIGNUM]. +* + ANRM = DLANGE( 'M', M, N, A, LDA, WORK ) + IASCL = 0 + IF( ANRM.GT.ZERO .AND. ANRM.LT.SMLNUM ) THEN +* +* Scale matrix norm up to SMLNUM. +* + CALL DLASCL( 'G', 0, 0, ANRM, SMLNUM, M, N, A, LDA, INFO ) + IASCL = 1 + ELSE IF( ANRM.GT.BIGNUM ) THEN +* +* Scale matrix norm down to BIGNUM. +* + CALL DLASCL( 'G', 0, 0, ANRM, BIGNUM, M, N, A, LDA, INFO ) + IASCL = 2 + ELSE IF( ANRM.EQ.ZERO ) THEN +* +* Matrix all zero. Return zero solution. +* + CALL DLASET( 'F', MAX( M, N ), NRHS, ZERO, ZERO, B, LDB ) + CALL DLASET( 'F', MINMN, 1, ZERO, ZERO, S, 1 ) + RANK = 0 + GO TO 10 + END IF +* +* Scale B if max entry outside range [SMLNUM,BIGNUM]. +* + BNRM = DLANGE( 'M', M, NRHS, B, LDB, WORK ) + IBSCL = 0 + IF( BNRM.GT.ZERO .AND. BNRM.LT.SMLNUM ) THEN +* +* Scale matrix norm up to SMLNUM. +* + CALL DLASCL( 'G', 0, 0, BNRM, SMLNUM, M, NRHS, B, LDB, INFO ) + IBSCL = 1 + ELSE IF( BNRM.GT.BIGNUM ) THEN +* +* Scale matrix norm down to BIGNUM. +* + CALL DLASCL( 'G', 0, 0, BNRM, BIGNUM, M, NRHS, B, LDB, INFO ) + IBSCL = 2 + END IF +* +* If M < N make sure certain entries of B are zero. +* + IF( M.LT.N ) + $ CALL DLASET( 'F', N-M, NRHS, ZERO, ZERO, B( M+1, 1 ), LDB ) +* +* Overdetermined case. +* + IF( M.GE.N ) THEN +* +* Path 1 - overdetermined or exactly determined. +* + MM = M + IF( M.GE.MNTHR ) THEN +* +* Path 1a - overdetermined, with many more rows than columns. +* + MM = N + ITAU = 1 + NWORK = ITAU + N +* +* Compute A=Q*R. +* (Workspace: need 2*N, prefer N+N*NB) +* + CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Multiply B by transpose(Q). +* (Workspace: need N+NRHS, prefer N+NRHS*NB) +* + CALL DORMQR( 'L', 'T', M, NRHS, N, A, LDA, WORK( ITAU ), B, + $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* +* Zero out below R. +* + IF( N.GT.1 ) THEN + CALL DLASET( 'L', N-1, N-1, ZERO, ZERO, A( 2, 1 ), LDA ) + END IF + END IF +* + IE = 1 + ITAUQ = IE + N + ITAUP = ITAUQ + N + NWORK = ITAUP + N +* +* Bidiagonalize R in A. +* (Workspace: need 3*N+MM, prefer 3*N+(MM+N)*NB) +* + CALL DGEBRD( MM, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), + $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, + $ INFO ) +* +* Multiply B by transpose of left bidiagonalizing vectors of R. +* (Workspace: need 3*N+NRHS, prefer 3*N+NRHS*NB) +* + CALL DORMBR( 'Q', 'L', 'T', MM, NRHS, N, A, LDA, WORK( ITAUQ ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* +* Solve the bidiagonal least squares problem. +* + CALL DLALSD( 'U', SMLSIZ, N, NRHS, S, WORK( IE ), B, LDB, + $ RCOND, RANK, WORK( NWORK ), IWORK, INFO ) + IF( INFO.NE.0 ) THEN + GO TO 10 + END IF +* +* Multiply B by right bidiagonalizing vectors of R. +* + CALL DORMBR( 'P', 'L', 'N', N, NRHS, N, A, LDA, WORK( ITAUP ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* + ELSE IF( N.GE.MNTHR .AND. LWORK.GE.4*M+M*M+ + $ MAX( M, 2*M-4, NRHS, N-3*M, WLALSD ) ) THEN +* +* Path 2a - underdetermined, with many more columns than rows +* and sufficient workspace for an efficient algorithm. +* + LDWORK = M + IF( LWORK.GE.MAX( 4*M+M*LDA+MAX( M, 2*M-4, NRHS, N-3*M ), + $ M*LDA+M+M*NRHS, 4*M+M*LDA+WLALSD ) )LDWORK = LDA + ITAU = 1 + NWORK = M + 1 +* +* Compute A=L*Q. +* (Workspace: need 2*M, prefer M+M*NB) +* + CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) + IL = NWORK +* +* Copy L to WORK(IL), zeroing out above its diagonal. +* + CALL DLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWORK ) + CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, WORK( IL+LDWORK ), + $ LDWORK ) + IE = IL + LDWORK*M + ITAUQ = IE + M + ITAUP = ITAUQ + M + NWORK = ITAUP + M +* +* Bidiagonalize L in WORK(IL). +* (Workspace: need M*M+5*M, prefer M*M+4*M+2*M*NB) +* + CALL DGEBRD( M, M, WORK( IL ), LDWORK, S, WORK( IE ), + $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Multiply B by transpose of left bidiagonalizing vectors of L. +* (Workspace: need M*M+4*M+NRHS, prefer M*M+4*M+NRHS*NB) +* + CALL DORMBR( 'Q', 'L', 'T', M, NRHS, M, WORK( IL ), LDWORK, + $ WORK( ITAUQ ), B, LDB, WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Solve the bidiagonal least squares problem. +* + CALL DLALSD( 'U', SMLSIZ, M, NRHS, S, WORK( IE ), B, LDB, + $ RCOND, RANK, WORK( NWORK ), IWORK, INFO ) + IF( INFO.NE.0 ) THEN + GO TO 10 + END IF +* +* Multiply B by right bidiagonalizing vectors of L. +* + CALL DORMBR( 'P', 'L', 'N', M, NRHS, M, WORK( IL ), LDWORK, + $ WORK( ITAUP ), B, LDB, WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Zero out below first M rows of B. +* + CALL DLASET( 'F', N-M, NRHS, ZERO, ZERO, B( M+1, 1 ), LDB ) + NWORK = ITAU + M +* +* Multiply transpose(Q) by B. +* (Workspace: need M+NRHS, prefer M+NRHS*NB) +* + CALL DORMLQ( 'L', 'T', N, NRHS, M, A, LDA, WORK( ITAU ), B, + $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* + ELSE +* +* Path 2 - remaining underdetermined cases. +* + IE = 1 + ITAUQ = IE + M + ITAUP = ITAUQ + M + NWORK = ITAUP + M +* +* Bidiagonalize A. +* (Workspace: need 3*M+N, prefer 3*M+(M+N)*NB) +* + CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), + $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, + $ INFO ) +* +* Multiply B by transpose of left bidiagonalizing vectors. +* (Workspace: need 3*M+NRHS, prefer 3*M+NRHS*NB) +* + CALL DORMBR( 'Q', 'L', 'T', M, NRHS, N, A, LDA, WORK( ITAUQ ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* +* Solve the bidiagonal least squares problem. +* + CALL DLALSD( 'L', SMLSIZ, M, NRHS, S, WORK( IE ), B, LDB, + $ RCOND, RANK, WORK( NWORK ), IWORK, INFO ) + IF( INFO.NE.0 ) THEN + GO TO 10 + END IF +* +* Multiply B by right bidiagonalizing vectors of A. +* + CALL DORMBR( 'P', 'L', 'N', N, NRHS, M, A, LDA, WORK( ITAUP ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* + END IF +* +* Undo scaling. +* + IF( IASCL.EQ.1 ) THEN + CALL DLASCL( 'G', 0, 0, ANRM, SMLNUM, N, NRHS, B, LDB, INFO ) + CALL DLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN, 1, S, MINMN, + $ INFO ) + ELSE IF( IASCL.EQ.2 ) THEN + CALL DLASCL( 'G', 0, 0, ANRM, BIGNUM, N, NRHS, B, LDB, INFO ) + CALL DLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN, 1, S, MINMN, + $ INFO ) + END IF + IF( IBSCL.EQ.1 ) THEN + CALL DLASCL( 'G', 0, 0, SMLNUM, BNRM, N, NRHS, B, LDB, INFO ) + ELSE IF( IBSCL.EQ.2 ) THEN + CALL DLASCL( 'G', 0, 0, BIGNUM, BNRM, N, NRHS, B, LDB, INFO ) + END IF +* + 10 CONTINUE + WORK( 1 ) = MAXWRK + RETURN +* +* End of DGELSD +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlaed6.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlaed6.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,327 @@ + SUBROUTINE DLAED6( KNITER, ORGATI, RHO, D, Z, FINIT, TAU, INFO ) +* +* -- LAPACK routine (version 3.1.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* February 2007 +* +* .. Scalar Arguments .. + LOGICAL ORGATI + INTEGER INFO, KNITER + DOUBLE PRECISION FINIT, RHO, TAU +* .. +* .. Array Arguments .. + DOUBLE PRECISION D( 3 ), Z( 3 ) +* .. +* +* Purpose +* ======= +* +* DLAED6 computes the positive or negative root (closest to the origin) +* of +* z(1) z(2) z(3) +* f(x) = rho + --------- + ---------- + --------- +* d(1)-x d(2)-x d(3)-x +* +* It is assumed that +* +* if ORGATI = .true. the root is between d(2) and d(3); +* otherwise it is between d(1) and d(2) +* +* This routine will be called by DLAED4 when necessary. In most cases, +* the root sought is the smallest in magnitude, though it might not be +* in some extremely rare situations. +* +* Arguments +* ========= +* +* KNITER (input) INTEGER +* Refer to DLAED4 for its significance. +* +* ORGATI (input) LOGICAL +* If ORGATI is true, the needed root is between d(2) and +* d(3); otherwise it is between d(1) and d(2). See +* DLAED4 for further details. +* +* RHO (input) DOUBLE PRECISION +* Refer to the equation f(x) above. +* +* D (input) DOUBLE PRECISION array, dimension (3) +* D satisfies d(1) < d(2) < d(3). +* +* Z (input) DOUBLE PRECISION array, dimension (3) +* Each of the elements in z must be positive. +* +* FINIT (input) DOUBLE PRECISION +* The value of f at 0. It is more accurate than the one +* evaluated inside this routine (if someone wants to do +* so). +* +* TAU (output) DOUBLE PRECISION +* The root of the equation f(x). +* +* INFO (output) INTEGER +* = 0: successful exit +* > 0: if INFO = 1, failure to converge +* +* Further Details +* =============== +* +* 30/06/99: Based on contributions by +* Ren-Cang Li, Computer Science Division, University of California +* at Berkeley, USA +* +* 10/02/03: This version has a few statements commented out for thread +* safety (machine parameters are computed on each entry). SJH. +* +* 05/10/06: Modified from a new version of Ren-Cang Li, use +* Gragg-Thornton-Warner cubic convergent scheme for better stability. +* +* ===================================================================== +* +* .. Parameters .. + INTEGER MAXIT + PARAMETER ( MAXIT = 40 ) + DOUBLE PRECISION ZERO, ONE, TWO, THREE, FOUR, EIGHT + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0, + $ THREE = 3.0D0, FOUR = 4.0D0, EIGHT = 8.0D0 ) +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMCH + EXTERNAL DLAMCH +* .. +* .. Local Arrays .. + DOUBLE PRECISION DSCALE( 3 ), ZSCALE( 3 ) +* .. +* .. Local Scalars .. + LOGICAL SCALE + INTEGER I, ITER, NITER + DOUBLE PRECISION A, B, BASE, C, DDF, DF, EPS, ERRETM, ETA, F, + $ FC, SCLFAC, SCLINV, SMALL1, SMALL2, SMINV1, + $ SMINV2, TEMP, TEMP1, TEMP2, TEMP3, TEMP4, + $ LBD, UBD +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, INT, LOG, MAX, MIN, SQRT +* .. +* .. Executable Statements .. +* + INFO = 0 +* + IF( ORGATI ) THEN + LBD = D(2) + UBD = D(3) + ELSE + LBD = D(1) + UBD = D(2) + END IF + IF( FINIT .LT. ZERO )THEN + LBD = ZERO + ELSE + UBD = ZERO + END IF +* + NITER = 1 + TAU = ZERO + IF( KNITER.EQ.2 ) THEN + IF( ORGATI ) THEN + TEMP = ( D( 3 )-D( 2 ) ) / TWO + C = RHO + Z( 1 ) / ( ( D( 1 )-D( 2 ) )-TEMP ) + A = C*( D( 2 )+D( 3 ) ) + Z( 2 ) + Z( 3 ) + B = C*D( 2 )*D( 3 ) + Z( 2 )*D( 3 ) + Z( 3 )*D( 2 ) + ELSE + TEMP = ( D( 1 )-D( 2 ) ) / TWO + C = RHO + Z( 3 ) / ( ( D( 3 )-D( 2 ) )-TEMP ) + A = C*( D( 1 )+D( 2 ) ) + Z( 1 ) + Z( 2 ) + B = C*D( 1 )*D( 2 ) + Z( 1 )*D( 2 ) + Z( 2 )*D( 1 ) + END IF + TEMP = MAX( ABS( A ), ABS( B ), ABS( C ) ) + A = A / TEMP + B = B / TEMP + C = C / TEMP + IF( C.EQ.ZERO ) THEN + TAU = B / A + ELSE IF( A.LE.ZERO ) THEN + TAU = ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + ELSE + TAU = TWO*B / ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) + END IF + IF( TAU .LT. LBD .OR. TAU .GT. UBD ) + $ TAU = ( LBD+UBD )/TWO + IF( D(1).EQ.TAU .OR. D(2).EQ.TAU .OR. D(3).EQ.TAU ) THEN + TAU = ZERO + ELSE + TEMP = FINIT + TAU*Z(1)/( D(1)*( D( 1 )-TAU ) ) + + $ TAU*Z(2)/( D(2)*( D( 2 )-TAU ) ) + + $ TAU*Z(3)/( D(3)*( D( 3 )-TAU ) ) + IF( TEMP .LE. ZERO )THEN + LBD = TAU + ELSE + UBD = TAU + END IF + IF( ABS( FINIT ).LE.ABS( TEMP ) ) + $ TAU = ZERO + END IF + END IF +* +* get machine parameters for possible scaling to avoid overflow +* +* modified by Sven: parameters SMALL1, SMINV1, SMALL2, +* SMINV2, EPS are not SAVEd anymore between one call to the +* others but recomputed at each call +* + EPS = DLAMCH( 'Epsilon' ) + BASE = DLAMCH( 'Base' ) + SMALL1 = BASE**( INT( LOG( DLAMCH( 'SafMin' ) ) / LOG( BASE ) / + $ THREE ) ) + SMINV1 = ONE / SMALL1 + SMALL2 = SMALL1*SMALL1 + SMINV2 = SMINV1*SMINV1 +* +* Determine if scaling of inputs necessary to avoid overflow +* when computing 1/TEMP**3 +* + IF( ORGATI ) THEN + TEMP = MIN( ABS( D( 2 )-TAU ), ABS( D( 3 )-TAU ) ) + ELSE + TEMP = MIN( ABS( D( 1 )-TAU ), ABS( D( 2 )-TAU ) ) + END IF + SCALE = .FALSE. + IF( TEMP.LE.SMALL1 ) THEN + SCALE = .TRUE. + IF( TEMP.LE.SMALL2 ) THEN +* +* Scale up by power of radix nearest 1/SAFMIN**(2/3) +* + SCLFAC = SMINV2 + SCLINV = SMALL2 + ELSE +* +* Scale up by power of radix nearest 1/SAFMIN**(1/3) +* + SCLFAC = SMINV1 + SCLINV = SMALL1 + END IF +* +* Scaling up safe because D, Z, TAU scaled elsewhere to be O(1) +* + DO 10 I = 1, 3 + DSCALE( I ) = D( I )*SCLFAC + ZSCALE( I ) = Z( I )*SCLFAC + 10 CONTINUE + TAU = TAU*SCLFAC + LBD = LBD*SCLFAC + UBD = UBD*SCLFAC + ELSE +* +* Copy D and Z to DSCALE and ZSCALE +* + DO 20 I = 1, 3 + DSCALE( I ) = D( I ) + ZSCALE( I ) = Z( I ) + 20 CONTINUE + END IF +* + FC = ZERO + DF = ZERO + DDF = ZERO + DO 30 I = 1, 3 + TEMP = ONE / ( DSCALE( I )-TAU ) + TEMP1 = ZSCALE( I )*TEMP + TEMP2 = TEMP1*TEMP + TEMP3 = TEMP2*TEMP + FC = FC + TEMP1 / DSCALE( I ) + DF = DF + TEMP2 + DDF = DDF + TEMP3 + 30 CONTINUE + F = FINIT + TAU*FC +* + IF( ABS( F ).LE.ZERO ) + $ GO TO 60 + IF( F .LE. ZERO )THEN + LBD = TAU + ELSE + UBD = TAU + END IF +* +* Iteration begins -- Use Gragg-Thornton-Warner cubic convergent +* scheme +* +* It is not hard to see that +* +* 1) Iterations will go up monotonically +* if FINIT < 0; +* +* 2) Iterations will go down monotonically +* if FINIT > 0. +* + ITER = NITER + 1 +* + DO 50 NITER = ITER, MAXIT +* + IF( ORGATI ) THEN + TEMP1 = DSCALE( 2 ) - TAU + TEMP2 = DSCALE( 3 ) - TAU + ELSE + TEMP1 = DSCALE( 1 ) - TAU + TEMP2 = DSCALE( 2 ) - TAU + END IF + A = ( TEMP1+TEMP2 )*F - TEMP1*TEMP2*DF + B = TEMP1*TEMP2*F + C = F - ( TEMP1+TEMP2 )*DF + TEMP1*TEMP2*DDF + TEMP = MAX( ABS( A ), ABS( B ), ABS( C ) ) + A = A / TEMP + B = B / TEMP + C = C / TEMP + IF( C.EQ.ZERO ) THEN + ETA = B / A + ELSE IF( A.LE.ZERO ) THEN + ETA = ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + ELSE + ETA = TWO*B / ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) + END IF + IF( F*ETA.GE.ZERO ) THEN + ETA = -F / DF + END IF +* + TAU = TAU + ETA + IF( TAU .LT. LBD .OR. TAU .GT. UBD ) + $ TAU = ( LBD + UBD )/TWO +* + FC = ZERO + ERRETM = ZERO + DF = ZERO + DDF = ZERO + DO 40 I = 1, 3 + TEMP = ONE / ( DSCALE( I )-TAU ) + TEMP1 = ZSCALE( I )*TEMP + TEMP2 = TEMP1*TEMP + TEMP3 = TEMP2*TEMP + TEMP4 = TEMP1 / DSCALE( I ) + FC = FC + TEMP4 + ERRETM = ERRETM + ABS( TEMP4 ) + DF = DF + TEMP2 + DDF = DDF + TEMP3 + 40 CONTINUE + F = FINIT + TAU*FC + ERRETM = EIGHT*( ABS( FINIT )+ABS( TAU )*ERRETM ) + + $ ABS( TAU )*DF + IF( ABS( F ).LE.EPS*ERRETM ) + $ GO TO 60 + IF( F .LE. ZERO )THEN + LBD = TAU + ELSE + UBD = TAU + END IF + 50 CONTINUE + INFO = 1 + 60 CONTINUE +* +* Undo scaling +* + IF( SCALE ) + $ TAU = TAU*SCLINV + RETURN +* +* End of DLAED6 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlals0.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlals0.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,377 @@ + SUBROUTINE DLALS0( ICOMPQ, NL, NR, SQRE, NRHS, B, LDB, BX, LDBX, + $ PERM, GIVPTR, GIVCOL, LDGCOL, GIVNUM, LDGNUM, + $ POLES, DIFL, DIFR, Z, K, C, S, WORK, INFO ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER GIVPTR, ICOMPQ, INFO, K, LDB, LDBX, LDGCOL, + $ LDGNUM, NL, NR, NRHS, SQRE + DOUBLE PRECISION C, S +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), PERM( * ) + DOUBLE PRECISION B( LDB, * ), BX( LDBX, * ), DIFL( * ), + $ DIFR( LDGNUM, * ), GIVNUM( LDGNUM, * ), + $ POLES( LDGNUM, * ), WORK( * ), Z( * ) +* .. +* +* Purpose +* ======= +* +* DLALS0 applies back the multiplying factors of either the left or the +* right singular vector matrix of a diagonal matrix appended by a row +* to the right hand side matrix B in solving the least squares problem +* using the divide-and-conquer SVD approach. +* +* For the left singular vector matrix, three types of orthogonal +* matrices are involved: +* +* (1L) Givens rotations: the number of such rotations is GIVPTR; the +* pairs of columns/rows they were applied to are stored in GIVCOL; +* and the C- and S-values of these rotations are stored in GIVNUM. +* +* (2L) Permutation. The (NL+1)-st row of B is to be moved to the first +* row, and for J=2:N, PERM(J)-th row of B is to be moved to the +* J-th row. +* +* (3L) The left singular vector matrix of the remaining matrix. +* +* For the right singular vector matrix, four types of orthogonal +* matrices are involved: +* +* (1R) The right singular vector matrix of the remaining matrix. +* +* (2R) If SQRE = 1, one extra Givens rotation to generate the right +* null space. +* +* (3R) The inverse transformation of (2L). +* +* (4R) The inverse transformation of (1L). +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether singular vectors are to be computed in +* factored form: +* = 0: Left singular vector matrix. +* = 1: Right singular vector matrix. +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has row dimension N = NL + NR + 1, +* and column dimension M = N + SQRE. +* +* NRHS (input) INTEGER +* The number of columns of B and BX. NRHS must be at least 1. +* +* B (input/output) DOUBLE PRECISION array, dimension ( LDB, NRHS ) +* On input, B contains the right hand sides of the least +* squares problem in rows 1 through M. On output, B contains +* the solution X in rows 1 through N. +* +* LDB (input) INTEGER +* The leading dimension of B. LDB must be at least +* max(1,MAX( M, N ) ). +* +* BX (workspace) DOUBLE PRECISION array, dimension ( LDBX, NRHS ) +* +* LDBX (input) INTEGER +* The leading dimension of BX. +* +* PERM (input) INTEGER array, dimension ( N ) +* The permutations (from deflation and sorting) applied +* to the two blocks. +* +* GIVPTR (input) INTEGER +* The number of Givens rotations which took place in this +* subproblem. +* +* GIVCOL (input) INTEGER array, dimension ( LDGCOL, 2 ) +* Each pair of numbers indicates a pair of rows/columns +* involved in a Givens rotation. +* +* LDGCOL (input) INTEGER +* The leading dimension of GIVCOL, must be at least N. +* +* GIVNUM (input) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* Each number indicates the C or S value used in the +* corresponding Givens rotation. +* +* LDGNUM (input) INTEGER +* The leading dimension of arrays DIFR, POLES and +* GIVNUM, must be at least K. +* +* POLES (input) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* On entry, POLES(1:K, 1) contains the new singular +* values obtained from solving the secular equation, and +* POLES(1:K, 2) is an array containing the poles in the secular +* equation. +* +* DIFL (input) DOUBLE PRECISION array, dimension ( K ). +* On entry, DIFL(I) is the distance between I-th updated +* (undeflated) singular value and the I-th (undeflated) old +* singular value. +* +* DIFR (input) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ). +* On entry, DIFR(I, 1) contains the distances between I-th +* updated (undeflated) singular value and the I+1-th +* (undeflated) old singular value. And DIFR(I, 2) is the +* normalizing factor for the I-th right singular vector. +* +* Z (input) DOUBLE PRECISION array, dimension ( K ) +* Contain the components of the deflation-adjusted updating row +* vector. +* +* K (input) INTEGER +* Contains the dimension of the non-deflated matrix, +* This is the order of the related secular equation. 1 <= K <=N. +* +* C (input) DOUBLE PRECISION +* C contains garbage if SQRE =0 and the C-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* S (input) DOUBLE PRECISION +* S contains garbage if SQRE =0 and the S-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* WORK (workspace) DOUBLE PRECISION array, dimension ( K ) +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ONE, ZERO, NEGONE + PARAMETER ( ONE = 1.0D0, ZERO = 0.0D0, NEGONE = -1.0D0 ) +* .. +* .. Local Scalars .. + INTEGER I, J, M, N, NLP1 + DOUBLE PRECISION DIFLJ, DIFRJ, DJ, DSIGJ, DSIGJP, TEMP +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DGEMV, DLACPY, DLASCL, DROT, DSCAL, + $ XERBLA +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMC3, DNRM2 + EXTERNAL DLAMC3, DNRM2 +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( NL.LT.1 ) THEN + INFO = -2 + ELSE IF( NR.LT.1 ) THEN + INFO = -3 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -4 + END IF +* + N = NL + NR + 1 +* + IF( NRHS.LT.1 ) THEN + INFO = -5 + ELSE IF( LDB.LT.N ) THEN + INFO = -7 + ELSE IF( LDBX.LT.N ) THEN + INFO = -9 + ELSE IF( GIVPTR.LT.0 ) THEN + INFO = -11 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -13 + ELSE IF( LDGNUM.LT.N ) THEN + INFO = -15 + ELSE IF( K.LT.1 ) THEN + INFO = -20 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLALS0', -INFO ) + RETURN + END IF +* + M = N + SQRE + NLP1 = NL + 1 +* + IF( ICOMPQ.EQ.0 ) THEN +* +* Apply back orthogonal transformations from the left. +* +* Step (1L): apply back the Givens rotations performed. +* + DO 10 I = 1, GIVPTR + CALL DROT( NRHS, B( GIVCOL( I, 2 ), 1 ), LDB, + $ B( GIVCOL( I, 1 ), 1 ), LDB, GIVNUM( I, 2 ), + $ GIVNUM( I, 1 ) ) + 10 CONTINUE +* +* Step (2L): permute rows of B. +* + CALL DCOPY( NRHS, B( NLP1, 1 ), LDB, BX( 1, 1 ), LDBX ) + DO 20 I = 2, N + CALL DCOPY( NRHS, B( PERM( I ), 1 ), LDB, BX( I, 1 ), LDBX ) + 20 CONTINUE +* +* Step (3L): apply the inverse of the left singular vector +* matrix to BX. +* + IF( K.EQ.1 ) THEN + CALL DCOPY( NRHS, BX, LDBX, B, LDB ) + IF( Z( 1 ).LT.ZERO ) THEN + CALL DSCAL( NRHS, NEGONE, B, LDB ) + END IF + ELSE + DO 50 J = 1, K + DIFLJ = DIFL( J ) + DJ = POLES( J, 1 ) + DSIGJ = -POLES( J, 2 ) + IF( J.LT.K ) THEN + DIFRJ = -DIFR( J, 1 ) + DSIGJP = -POLES( J+1, 2 ) + END IF + IF( ( Z( J ).EQ.ZERO ) .OR. ( POLES( J, 2 ).EQ.ZERO ) ) + $ THEN + WORK( J ) = ZERO + ELSE + WORK( J ) = -POLES( J, 2 )*Z( J ) / DIFLJ / + $ ( POLES( J, 2 )+DJ ) + END IF + DO 30 I = 1, J - 1 + IF( ( Z( I ).EQ.ZERO ) .OR. + $ ( POLES( I, 2 ).EQ.ZERO ) ) THEN + WORK( I ) = ZERO + ELSE + WORK( I ) = POLES( I, 2 )*Z( I ) / + $ ( DLAMC3( POLES( I, 2 ), DSIGJ )- + $ DIFLJ ) / ( POLES( I, 2 )+DJ ) + END IF + 30 CONTINUE + DO 40 I = J + 1, K + IF( ( Z( I ).EQ.ZERO ) .OR. + $ ( POLES( I, 2 ).EQ.ZERO ) ) THEN + WORK( I ) = ZERO + ELSE + WORK( I ) = POLES( I, 2 )*Z( I ) / + $ ( DLAMC3( POLES( I, 2 ), DSIGJP )+ + $ DIFRJ ) / ( POLES( I, 2 )+DJ ) + END IF + 40 CONTINUE + WORK( 1 ) = NEGONE + TEMP = DNRM2( K, WORK, 1 ) + CALL DGEMV( 'T', K, NRHS, ONE, BX, LDBX, WORK, 1, ZERO, + $ B( J, 1 ), LDB ) + CALL DLASCL( 'G', 0, 0, TEMP, ONE, 1, NRHS, B( J, 1 ), + $ LDB, INFO ) + 50 CONTINUE + END IF +* +* Move the deflated rows of BX to B also. +* + IF( K.LT.MAX( M, N ) ) + $ CALL DLACPY( 'A', N-K, NRHS, BX( K+1, 1 ), LDBX, + $ B( K+1, 1 ), LDB ) + ELSE +* +* Apply back the right orthogonal transformations. +* +* Step (1R): apply back the new right singular vector matrix +* to B. +* + IF( K.EQ.1 ) THEN + CALL DCOPY( NRHS, B, LDB, BX, LDBX ) + ELSE + DO 80 J = 1, K + DSIGJ = POLES( J, 2 ) + IF( Z( J ).EQ.ZERO ) THEN + WORK( J ) = ZERO + ELSE + WORK( J ) = -Z( J ) / DIFL( J ) / + $ ( DSIGJ+POLES( J, 1 ) ) / DIFR( J, 2 ) + END IF + DO 60 I = 1, J - 1 + IF( Z( J ).EQ.ZERO ) THEN + WORK( I ) = ZERO + ELSE + WORK( I ) = Z( J ) / ( DLAMC3( DSIGJ, -POLES( I+1, + $ 2 ) )-DIFR( I, 1 ) ) / + $ ( DSIGJ+POLES( I, 1 ) ) / DIFR( I, 2 ) + END IF + 60 CONTINUE + DO 70 I = J + 1, K + IF( Z( J ).EQ.ZERO ) THEN + WORK( I ) = ZERO + ELSE + WORK( I ) = Z( J ) / ( DLAMC3( DSIGJ, -POLES( I, + $ 2 ) )-DIFL( I ) ) / + $ ( DSIGJ+POLES( I, 1 ) ) / DIFR( I, 2 ) + END IF + 70 CONTINUE + CALL DGEMV( 'T', K, NRHS, ONE, B, LDB, WORK, 1, ZERO, + $ BX( J, 1 ), LDBX ) + 80 CONTINUE + END IF +* +* Step (2R): if SQRE = 1, apply back the rotation that is +* related to the right null space of the subproblem. +* + IF( SQRE.EQ.1 ) THEN + CALL DCOPY( NRHS, B( M, 1 ), LDB, BX( M, 1 ), LDBX ) + CALL DROT( NRHS, BX( 1, 1 ), LDBX, BX( M, 1 ), LDBX, C, S ) + END IF + IF( K.LT.MAX( M, N ) ) + $ CALL DLACPY( 'A', N-K, NRHS, B( K+1, 1 ), LDB, BX( K+1, 1 ), + $ LDBX ) +* +* Step (3R): permute rows of B. +* + CALL DCOPY( NRHS, BX( 1, 1 ), LDBX, B( NLP1, 1 ), LDB ) + IF( SQRE.EQ.1 ) THEN + CALL DCOPY( NRHS, BX( M, 1 ), LDBX, B( M, 1 ), LDB ) + END IF + DO 90 I = 2, N + CALL DCOPY( NRHS, BX( I, 1 ), LDBX, B( PERM( I ), 1 ), LDB ) + 90 CONTINUE +* +* Step (4R): apply back the Givens rotations performed. +* + DO 100 I = GIVPTR, 1, -1 + CALL DROT( NRHS, B( GIVCOL( I, 2 ), 1 ), LDB, + $ B( GIVCOL( I, 1 ), 1 ), LDB, GIVNUM( I, 2 ), + $ -GIVNUM( I, 1 ) ) + 100 CONTINUE + END IF +* + RETURN +* +* End of DLALS0 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlalsa.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlalsa.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,362 @@ + SUBROUTINE DLALSA( ICOMPQ, SMLSIZ, N, NRHS, B, LDB, BX, LDBX, U, + $ LDU, VT, K, DIFL, DIFR, Z, POLES, GIVPTR, + $ GIVCOL, LDGCOL, PERM, GIVNUM, C, S, WORK, + $ IWORK, INFO ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER ICOMPQ, INFO, LDB, LDBX, LDGCOL, LDU, N, NRHS, + $ SMLSIZ +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), GIVPTR( * ), IWORK( * ), + $ K( * ), PERM( LDGCOL, * ) + DOUBLE PRECISION B( LDB, * ), BX( LDBX, * ), C( * ), + $ DIFL( LDU, * ), DIFR( LDU, * ), + $ GIVNUM( LDU, * ), POLES( LDU, * ), S( * ), + $ U( LDU, * ), VT( LDU, * ), WORK( * ), + $ Z( LDU, * ) +* .. +* +* Purpose +* ======= +* +* DLALSA is an itermediate step in solving the least squares problem +* by computing the SVD of the coefficient matrix in compact form (The +* singular vectors are computed as products of simple orthorgonal +* matrices.). +* +* If ICOMPQ = 0, DLALSA applies the inverse of the left singular vector +* matrix of an upper bidiagonal matrix to the right hand side; and if +* ICOMPQ = 1, DLALSA applies the right singular vector matrix to the +* right hand side. The singular vector matrices were generated in +* compact form by DLALSA. +* +* Arguments +* ========= +* +* +* ICOMPQ (input) INTEGER +* Specifies whether the left or the right singular vector +* matrix is involved. +* = 0: Left singular vector matrix +* = 1: Right singular vector matrix +* +* SMLSIZ (input) INTEGER +* The maximum size of the subproblems at the bottom of the +* computation tree. +* +* N (input) INTEGER +* The row and column dimensions of the upper bidiagonal matrix. +* +* NRHS (input) INTEGER +* The number of columns of B and BX. NRHS must be at least 1. +* +* B (input/output) DOUBLE PRECISION array, dimension ( LDB, NRHS ) +* On input, B contains the right hand sides of the least +* squares problem in rows 1 through M. +* On output, B contains the solution X in rows 1 through N. +* +* LDB (input) INTEGER +* The leading dimension of B in the calling subprogram. +* LDB must be at least max(1,MAX( M, N ) ). +* +* BX (output) DOUBLE PRECISION array, dimension ( LDBX, NRHS ) +* On exit, the result of applying the left or right singular +* vector matrix to B. +* +* LDBX (input) INTEGER +* The leading dimension of BX. +* +* U (input) DOUBLE PRECISION array, dimension ( LDU, SMLSIZ ). +* On entry, U contains the left singular vector matrices of all +* subproblems at the bottom level. +* +* LDU (input) INTEGER, LDU = > N. +* The leading dimension of arrays U, VT, DIFL, DIFR, +* POLES, GIVNUM, and Z. +* +* VT (input) DOUBLE PRECISION array, dimension ( LDU, SMLSIZ+1 ). +* On entry, VT' contains the right singular vector matrices of +* all subproblems at the bottom level. +* +* K (input) INTEGER array, dimension ( N ). +* +* DIFL (input) DOUBLE PRECISION array, dimension ( LDU, NLVL ). +* where NLVL = INT(log_2 (N/(SMLSIZ+1))) + 1. +* +* DIFR (input) DOUBLE PRECISION array, dimension ( LDU, 2 * NLVL ). +* On entry, DIFL(*, I) and DIFR(*, 2 * I -1) record +* distances between singular values on the I-th level and +* singular values on the (I -1)-th level, and DIFR(*, 2 * I) +* record the normalizing factors of the right singular vectors +* matrices of subproblems on I-th level. +* +* Z (input) DOUBLE PRECISION array, dimension ( LDU, NLVL ). +* On entry, Z(1, I) contains the components of the deflation- +* adjusted updating row vector for subproblems on the I-th +* level. +* +* POLES (input) DOUBLE PRECISION array, dimension ( LDU, 2 * NLVL ). +* On entry, POLES(*, 2 * I -1: 2 * I) contains the new and old +* singular values involved in the secular equations on the I-th +* level. +* +* GIVPTR (input) INTEGER array, dimension ( N ). +* On entry, GIVPTR( I ) records the number of Givens +* rotations performed on the I-th problem on the computation +* tree. +* +* GIVCOL (input) INTEGER array, dimension ( LDGCOL, 2 * NLVL ). +* On entry, for each I, GIVCOL(*, 2 * I - 1: 2 * I) records the +* locations of Givens rotations performed on the I-th level on +* the computation tree. +* +* LDGCOL (input) INTEGER, LDGCOL = > N. +* The leading dimension of arrays GIVCOL and PERM. +* +* PERM (input) INTEGER array, dimension ( LDGCOL, NLVL ). +* On entry, PERM(*, I) records permutations done on the I-th +* level of the computation tree. +* +* GIVNUM (input) DOUBLE PRECISION array, dimension ( LDU, 2 * NLVL ). +* On entry, GIVNUM(*, 2 *I -1 : 2 * I) records the C- and S- +* values of Givens rotations performed on the I-th level on the +* computation tree. +* +* C (input) DOUBLE PRECISION array, dimension ( N ). +* On entry, if the I-th subproblem is not square, +* C( I ) contains the C-value of a Givens rotation related to +* the right null space of the I-th subproblem. +* +* S (input) DOUBLE PRECISION array, dimension ( N ). +* On entry, if the I-th subproblem is not square, +* S( I ) contains the S-value of a Givens rotation related to +* the right null space of the I-th subproblem. +* +* WORK (workspace) DOUBLE PRECISION array. +* The dimension must be at least N. +* +* IWORK (workspace) INTEGER array. +* The dimension must be at least 3 * N +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) +* .. +* .. Local Scalars .. + INTEGER I, I1, IC, IM1, INODE, J, LF, LL, LVL, LVL2, + $ ND, NDB1, NDIML, NDIMR, NL, NLF, NLP1, NLVL, + $ NR, NRF, NRP1, SQRE +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DGEMM, DLALS0, DLASDT, XERBLA +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( SMLSIZ.LT.3 ) THEN + INFO = -2 + ELSE IF( N.LT.SMLSIZ ) THEN + INFO = -3 + ELSE IF( NRHS.LT.1 ) THEN + INFO = -4 + ELSE IF( LDB.LT.N ) THEN + INFO = -6 + ELSE IF( LDBX.LT.N ) THEN + INFO = -8 + ELSE IF( LDU.LT.N ) THEN + INFO = -10 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -19 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLALSA', -INFO ) + RETURN + END IF +* +* Book-keeping and setting up the computation tree. +* + INODE = 1 + NDIML = INODE + N + NDIMR = NDIML + N +* + CALL DLASDT( N, NLVL, ND, IWORK( INODE ), IWORK( NDIML ), + $ IWORK( NDIMR ), SMLSIZ ) +* +* The following code applies back the left singular vector factors. +* For applying back the right singular vector factors, go to 50. +* + IF( ICOMPQ.EQ.1 ) THEN + GO TO 50 + END IF +* +* The nodes on the bottom level of the tree were solved +* by DLASDQ. The corresponding left and right singular vector +* matrices are in explicit form. First apply back the left +* singular vector matrices. +* + NDB1 = ( ND+1 ) / 2 + DO 10 I = NDB1, ND +* +* IC : center row of each node +* NL : number of rows of left subproblem +* NR : number of rows of right subproblem +* NLF: starting row of the left subproblem +* NRF: starting row of the right subproblem +* + I1 = I - 1 + IC = IWORK( INODE+I1 ) + NL = IWORK( NDIML+I1 ) + NR = IWORK( NDIMR+I1 ) + NLF = IC - NL + NRF = IC + 1 + CALL DGEMM( 'T', 'N', NL, NRHS, NL, ONE, U( NLF, 1 ), LDU, + $ B( NLF, 1 ), LDB, ZERO, BX( NLF, 1 ), LDBX ) + CALL DGEMM( 'T', 'N', NR, NRHS, NR, ONE, U( NRF, 1 ), LDU, + $ B( NRF, 1 ), LDB, ZERO, BX( NRF, 1 ), LDBX ) + 10 CONTINUE +* +* Next copy the rows of B that correspond to unchanged rows +* in the bidiagonal matrix to BX. +* + DO 20 I = 1, ND + IC = IWORK( INODE+I-1 ) + CALL DCOPY( NRHS, B( IC, 1 ), LDB, BX( IC, 1 ), LDBX ) + 20 CONTINUE +* +* Finally go through the left singular vector matrices of all +* the other subproblems bottom-up on the tree. +* + J = 2**NLVL + SQRE = 0 +* + DO 40 LVL = NLVL, 1, -1 + LVL2 = 2*LVL - 1 +* +* find the first node LF and last node LL on +* the current level LVL +* + IF( LVL.EQ.1 ) THEN + LF = 1 + LL = 1 + ELSE + LF = 2**( LVL-1 ) + LL = 2*LF - 1 + END IF + DO 30 I = LF, LL + IM1 = I - 1 + IC = IWORK( INODE+IM1 ) + NL = IWORK( NDIML+IM1 ) + NR = IWORK( NDIMR+IM1 ) + NLF = IC - NL + NRF = IC + 1 + J = J - 1 + CALL DLALS0( ICOMPQ, NL, NR, SQRE, NRHS, BX( NLF, 1 ), LDBX, + $ B( NLF, 1 ), LDB, PERM( NLF, LVL ), + $ GIVPTR( J ), GIVCOL( NLF, LVL2 ), LDGCOL, + $ GIVNUM( NLF, LVL2 ), LDU, POLES( NLF, LVL2 ), + $ DIFL( NLF, LVL ), DIFR( NLF, LVL2 ), + $ Z( NLF, LVL ), K( J ), C( J ), S( J ), WORK, + $ INFO ) + 30 CONTINUE + 40 CONTINUE + GO TO 90 +* +* ICOMPQ = 1: applying back the right singular vector factors. +* + 50 CONTINUE +* +* First now go through the right singular vector matrices of all +* the tree nodes top-down. +* + J = 0 + DO 70 LVL = 1, NLVL + LVL2 = 2*LVL - 1 +* +* Find the first node LF and last node LL on +* the current level LVL. +* + IF( LVL.EQ.1 ) THEN + LF = 1 + LL = 1 + ELSE + LF = 2**( LVL-1 ) + LL = 2*LF - 1 + END IF + DO 60 I = LL, LF, -1 + IM1 = I - 1 + IC = IWORK( INODE+IM1 ) + NL = IWORK( NDIML+IM1 ) + NR = IWORK( NDIMR+IM1 ) + NLF = IC - NL + NRF = IC + 1 + IF( I.EQ.LL ) THEN + SQRE = 0 + ELSE + SQRE = 1 + END IF + J = J + 1 + CALL DLALS0( ICOMPQ, NL, NR, SQRE, NRHS, B( NLF, 1 ), LDB, + $ BX( NLF, 1 ), LDBX, PERM( NLF, LVL ), + $ GIVPTR( J ), GIVCOL( NLF, LVL2 ), LDGCOL, + $ GIVNUM( NLF, LVL2 ), LDU, POLES( NLF, LVL2 ), + $ DIFL( NLF, LVL ), DIFR( NLF, LVL2 ), + $ Z( NLF, LVL ), K( J ), C( J ), S( J ), WORK, + $ INFO ) + 60 CONTINUE + 70 CONTINUE +* +* The nodes on the bottom level of the tree were solved +* by DLASDQ. The corresponding right singular vector +* matrices are in explicit form. Apply them back. +* + NDB1 = ( ND+1 ) / 2 + DO 80 I = NDB1, ND + I1 = I - 1 + IC = IWORK( INODE+I1 ) + NL = IWORK( NDIML+I1 ) + NR = IWORK( NDIMR+I1 ) + NLP1 = NL + 1 + IF( I.EQ.ND ) THEN + NRP1 = NR + ELSE + NRP1 = NR + 1 + END IF + NLF = IC - NL + NRF = IC + 1 + CALL DGEMM( 'T', 'N', NLP1, NRHS, NLP1, ONE, VT( NLF, 1 ), LDU, + $ B( NLF, 1 ), LDB, ZERO, BX( NLF, 1 ), LDBX ) + CALL DGEMM( 'T', 'N', NRP1, NRHS, NRP1, ONE, VT( NRF, 1 ), LDU, + $ B( NRF, 1 ), LDB, ZERO, BX( NRF, 1 ), LDBX ) + 80 CONTINUE +* + 90 CONTINUE +* + RETURN +* +* End of DLALSA +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlalsd.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlalsd.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,434 @@ + SUBROUTINE DLALSD( UPLO, SMLSIZ, N, NRHS, D, E, B, LDB, RCOND, + $ RANK, WORK, IWORK, INFO ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + CHARACTER UPLO + INTEGER INFO, LDB, N, NRHS, RANK, SMLSIZ + DOUBLE PRECISION RCOND +* .. +* .. Array Arguments .. + INTEGER IWORK( * ) + DOUBLE PRECISION B( LDB, * ), D( * ), E( * ), WORK( * ) +* .. +* +* Purpose +* ======= +* +* DLALSD uses the singular value decomposition of A to solve the least +* squares problem of finding X to minimize the Euclidean norm of each +* column of A*X-B, where A is N-by-N upper bidiagonal, and X and B +* are N-by-NRHS. The solution X overwrites B. +* +* The singular values of A smaller than RCOND times the largest +* singular value are treated as zero in solving the least squares +* problem; in this case a minimum norm solution is returned. +* The actual singular values are returned in D in ascending order. +* +* This code makes very mild assumptions about floating point +* arithmetic. It will work on machines with a guard digit in +* add/subtract, or on those binary machines without guard digits +* which subtract like the Cray XMP, Cray YMP, Cray C 90, or Cray 2. +* It could conceivably fail on hexadecimal or decimal machines +* without guard digits, but we know of none. +* +* Arguments +* ========= +* +* UPLO (input) CHARACTER*1 +* = 'U': D and E define an upper bidiagonal matrix. +* = 'L': D and E define a lower bidiagonal matrix. +* +* SMLSIZ (input) INTEGER +* The maximum size of the subproblems at the bottom of the +* computation tree. +* +* N (input) INTEGER +* The dimension of the bidiagonal matrix. N >= 0. +* +* NRHS (input) INTEGER +* The number of columns of B. NRHS must be at least 1. +* +* D (input/output) DOUBLE PRECISION array, dimension (N) +* On entry D contains the main diagonal of the bidiagonal +* matrix. On exit, if INFO = 0, D contains its singular values. +* +* E (input/output) DOUBLE PRECISION array, dimension (N-1) +* Contains the super-diagonal entries of the bidiagonal matrix. +* On exit, E has been destroyed. +* +* B (input/output) DOUBLE PRECISION array, dimension (LDB,NRHS) +* On input, B contains the right hand sides of the least +* squares problem. On output, B contains the solution X. +* +* LDB (input) INTEGER +* The leading dimension of B in the calling subprogram. +* LDB must be at least max(1,N). +* +* RCOND (input) DOUBLE PRECISION +* The singular values of A less than or equal to RCOND times +* the largest singular value are treated as zero in solving +* the least squares problem. If RCOND is negative, +* machine precision is used instead. +* For example, if diag(S)*X=B were the least squares problem, +* where diag(S) is a diagonal matrix of singular values, the +* solution would be X(i) = B(i) / S(i) if S(i) is greater than +* RCOND*max(S), and X(i) = 0 if S(i) is less than or equal to +* RCOND*max(S). +* +* RANK (output) INTEGER +* The number of singular values of A greater than RCOND times +* the largest singular value. +* +* WORK (workspace) DOUBLE PRECISION array, dimension at least +* (9*N + 2*N*SMLSIZ + 8*N*NLVL + N*NRHS + (SMLSIZ+1)**2), +* where NLVL = max(0, INT(log_2 (N/(SMLSIZ+1))) + 1). +* +* IWORK (workspace) INTEGER array, dimension at least +* (3*N*NLVL + 11*N) +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: The algorithm failed to compute an singular value while +* working on the submatrix lying in rows and columns +* INFO/(N+1) through MOD(INFO,N+1). +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0 ) +* .. +* .. Local Scalars .. + INTEGER BX, BXST, C, DIFL, DIFR, GIVCOL, GIVNUM, + $ GIVPTR, I, ICMPQ1, ICMPQ2, IWK, J, K, NLVL, + $ NM1, NSIZE, NSUB, NWORK, PERM, POLES, S, SIZEI, + $ SMLSZP, SQRE, ST, ST1, U, VT, Z + DOUBLE PRECISION CS, EPS, ORGNRM, R, RCND, SN, TOL +* .. +* .. External Functions .. + INTEGER IDAMAX + DOUBLE PRECISION DLAMCH, DLANST + EXTERNAL IDAMAX, DLAMCH, DLANST +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DGEMM, DLACPY, DLALSA, DLARTG, DLASCL, + $ DLASDA, DLASDQ, DLASET, DLASRT, DROT, XERBLA +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, DBLE, INT, LOG, SIGN +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( N.LT.0 ) THEN + INFO = -3 + ELSE IF( NRHS.LT.1 ) THEN + INFO = -4 + ELSE IF( ( LDB.LT.1 ) .OR. ( LDB.LT.N ) ) THEN + INFO = -8 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLALSD', -INFO ) + RETURN + END IF +* + EPS = DLAMCH( 'Epsilon' ) +* +* Set up the tolerance. +* + IF( ( RCOND.LE.ZERO ) .OR. ( RCOND.GE.ONE ) ) THEN + RCND = EPS + ELSE + RCND = RCOND + END IF +* + RANK = 0 +* +* Quick return if possible. +* + IF( N.EQ.0 ) THEN + RETURN + ELSE IF( N.EQ.1 ) THEN + IF( D( 1 ).EQ.ZERO ) THEN + CALL DLASET( 'A', 1, NRHS, ZERO, ZERO, B, LDB ) + ELSE + RANK = 1 + CALL DLASCL( 'G', 0, 0, D( 1 ), ONE, 1, NRHS, B, LDB, INFO ) + D( 1 ) = ABS( D( 1 ) ) + END IF + RETURN + END IF +* +* Rotate the matrix if it is lower bidiagonal. +* + IF( UPLO.EQ.'L' ) THEN + DO 10 I = 1, N - 1 + CALL DLARTG( D( I ), E( I ), CS, SN, R ) + D( I ) = R + E( I ) = SN*D( I+1 ) + D( I+1 ) = CS*D( I+1 ) + IF( NRHS.EQ.1 ) THEN + CALL DROT( 1, B( I, 1 ), 1, B( I+1, 1 ), 1, CS, SN ) + ELSE + WORK( I*2-1 ) = CS + WORK( I*2 ) = SN + END IF + 10 CONTINUE + IF( NRHS.GT.1 ) THEN + DO 30 I = 1, NRHS + DO 20 J = 1, N - 1 + CS = WORK( J*2-1 ) + SN = WORK( J*2 ) + CALL DROT( 1, B( J, I ), 1, B( J+1, I ), 1, CS, SN ) + 20 CONTINUE + 30 CONTINUE + END IF + END IF +* +* Scale. +* + NM1 = N - 1 + ORGNRM = DLANST( 'M', N, D, E ) + IF( ORGNRM.EQ.ZERO ) THEN + CALL DLASET( 'A', N, NRHS, ZERO, ZERO, B, LDB ) + RETURN + END IF +* + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, N, 1, D, N, INFO ) + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, NM1, 1, E, NM1, INFO ) +* +* If N is smaller than the minimum divide size SMLSIZ, then solve +* the problem with another solver. +* + IF( N.LE.SMLSIZ ) THEN + NWORK = 1 + N*N + CALL DLASET( 'A', N, N, ZERO, ONE, WORK, N ) + CALL DLASDQ( 'U', 0, N, N, 0, NRHS, D, E, WORK, N, WORK, N, B, + $ LDB, WORK( NWORK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + TOL = RCND*ABS( D( IDAMAX( N, D, 1 ) ) ) + DO 40 I = 1, N + IF( D( I ).LE.TOL ) THEN + CALL DLASET( 'A', 1, NRHS, ZERO, ZERO, B( I, 1 ), LDB ) + ELSE + CALL DLASCL( 'G', 0, 0, D( I ), ONE, 1, NRHS, B( I, 1 ), + $ LDB, INFO ) + RANK = RANK + 1 + END IF + 40 CONTINUE + CALL DGEMM( 'T', 'N', N, NRHS, N, ONE, WORK, N, B, LDB, ZERO, + $ WORK( NWORK ), N ) + CALL DLACPY( 'A', N, NRHS, WORK( NWORK ), N, B, LDB ) +* +* Unscale. +* + CALL DLASCL( 'G', 0, 0, ONE, ORGNRM, N, 1, D, N, INFO ) + CALL DLASRT( 'D', N, D, INFO ) + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, N, NRHS, B, LDB, INFO ) +* + RETURN + END IF +* +* Book-keeping and setting up some constants. +* + NLVL = INT( LOG( DBLE( N ) / DBLE( SMLSIZ+1 ) ) / LOG( TWO ) ) + 1 +* + SMLSZP = SMLSIZ + 1 +* + U = 1 + VT = 1 + SMLSIZ*N + DIFL = VT + SMLSZP*N + DIFR = DIFL + NLVL*N + Z = DIFR + NLVL*N*2 + C = Z + NLVL*N + S = C + N + POLES = S + N + GIVNUM = POLES + 2*NLVL*N + BX = GIVNUM + 2*NLVL*N + NWORK = BX + N*NRHS +* + SIZEI = 1 + N + K = SIZEI + N + GIVPTR = K + N + PERM = GIVPTR + N + GIVCOL = PERM + NLVL*N + IWK = GIVCOL + NLVL*N*2 +* + ST = 1 + SQRE = 0 + ICMPQ1 = 1 + ICMPQ2 = 0 + NSUB = 0 +* + DO 50 I = 1, N + IF( ABS( D( I ) ).LT.EPS ) THEN + D( I ) = SIGN( EPS, D( I ) ) + END IF + 50 CONTINUE +* + DO 60 I = 1, NM1 + IF( ( ABS( E( I ) ).LT.EPS ) .OR. ( I.EQ.NM1 ) ) THEN + NSUB = NSUB + 1 + IWORK( NSUB ) = ST +* +* Subproblem found. First determine its size and then +* apply divide and conquer on it. +* + IF( I.LT.NM1 ) THEN +* +* A subproblem with E(I) small for I < NM1. +* + NSIZE = I - ST + 1 + IWORK( SIZEI+NSUB-1 ) = NSIZE + ELSE IF( ABS( E( I ) ).GE.EPS ) THEN +* +* A subproblem with E(NM1) not too small but I = NM1. +* + NSIZE = N - ST + 1 + IWORK( SIZEI+NSUB-1 ) = NSIZE + ELSE +* +* A subproblem with E(NM1) small. This implies an +* 1-by-1 subproblem at D(N), which is not solved +* explicitly. +* + NSIZE = I - ST + 1 + IWORK( SIZEI+NSUB-1 ) = NSIZE + NSUB = NSUB + 1 + IWORK( NSUB ) = N + IWORK( SIZEI+NSUB-1 ) = 1 + CALL DCOPY( NRHS, B( N, 1 ), LDB, WORK( BX+NM1 ), N ) + END IF + ST1 = ST - 1 + IF( NSIZE.EQ.1 ) THEN +* +* This is a 1-by-1 subproblem and is not solved +* explicitly. +* + CALL DCOPY( NRHS, B( ST, 1 ), LDB, WORK( BX+ST1 ), N ) + ELSE IF( NSIZE.LE.SMLSIZ ) THEN +* +* This is a small subproblem and is solved by DLASDQ. +* + CALL DLASET( 'A', NSIZE, NSIZE, ZERO, ONE, + $ WORK( VT+ST1 ), N ) + CALL DLASDQ( 'U', 0, NSIZE, NSIZE, 0, NRHS, D( ST ), + $ E( ST ), WORK( VT+ST1 ), N, WORK( NWORK ), + $ N, B( ST, 1 ), LDB, WORK( NWORK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + CALL DLACPY( 'A', NSIZE, NRHS, B( ST, 1 ), LDB, + $ WORK( BX+ST1 ), N ) + ELSE +* +* A large problem. Solve it using divide and conquer. +* + CALL DLASDA( ICMPQ1, SMLSIZ, NSIZE, SQRE, D( ST ), + $ E( ST ), WORK( U+ST1 ), N, WORK( VT+ST1 ), + $ IWORK( K+ST1 ), WORK( DIFL+ST1 ), + $ WORK( DIFR+ST1 ), WORK( Z+ST1 ), + $ WORK( POLES+ST1 ), IWORK( GIVPTR+ST1 ), + $ IWORK( GIVCOL+ST1 ), N, IWORK( PERM+ST1 ), + $ WORK( GIVNUM+ST1 ), WORK( C+ST1 ), + $ WORK( S+ST1 ), WORK( NWORK ), IWORK( IWK ), + $ INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + BXST = BX + ST1 + CALL DLALSA( ICMPQ2, SMLSIZ, NSIZE, NRHS, B( ST, 1 ), + $ LDB, WORK( BXST ), N, WORK( U+ST1 ), N, + $ WORK( VT+ST1 ), IWORK( K+ST1 ), + $ WORK( DIFL+ST1 ), WORK( DIFR+ST1 ), + $ WORK( Z+ST1 ), WORK( POLES+ST1 ), + $ IWORK( GIVPTR+ST1 ), IWORK( GIVCOL+ST1 ), N, + $ IWORK( PERM+ST1 ), WORK( GIVNUM+ST1 ), + $ WORK( C+ST1 ), WORK( S+ST1 ), WORK( NWORK ), + $ IWORK( IWK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + END IF + ST = I + 1 + END IF + 60 CONTINUE +* +* Apply the singular values and treat the tiny ones as zero. +* + TOL = RCND*ABS( D( IDAMAX( N, D, 1 ) ) ) +* + DO 70 I = 1, N +* +* Some of the elements in D can be negative because 1-by-1 +* subproblems were not solved explicitly. +* + IF( ABS( D( I ) ).LE.TOL ) THEN + CALL DLASET( 'A', 1, NRHS, ZERO, ZERO, WORK( BX+I-1 ), N ) + ELSE + RANK = RANK + 1 + CALL DLASCL( 'G', 0, 0, D( I ), ONE, 1, NRHS, + $ WORK( BX+I-1 ), N, INFO ) + END IF + D( I ) = ABS( D( I ) ) + 70 CONTINUE +* +* Now apply back the right singular vectors. +* + ICMPQ2 = 1 + DO 80 I = 1, NSUB + ST = IWORK( I ) + ST1 = ST - 1 + NSIZE = IWORK( SIZEI+I-1 ) + BXST = BX + ST1 + IF( NSIZE.EQ.1 ) THEN + CALL DCOPY( NRHS, WORK( BXST ), N, B( ST, 1 ), LDB ) + ELSE IF( NSIZE.LE.SMLSIZ ) THEN + CALL DGEMM( 'T', 'N', NSIZE, NRHS, NSIZE, ONE, + $ WORK( VT+ST1 ), N, WORK( BXST ), N, ZERO, + $ B( ST, 1 ), LDB ) + ELSE + CALL DLALSA( ICMPQ2, SMLSIZ, NSIZE, NRHS, WORK( BXST ), N, + $ B( ST, 1 ), LDB, WORK( U+ST1 ), N, + $ WORK( VT+ST1 ), IWORK( K+ST1 ), + $ WORK( DIFL+ST1 ), WORK( DIFR+ST1 ), + $ WORK( Z+ST1 ), WORK( POLES+ST1 ), + $ IWORK( GIVPTR+ST1 ), IWORK( GIVCOL+ST1 ), N, + $ IWORK( PERM+ST1 ), WORK( GIVNUM+ST1 ), + $ WORK( C+ST1 ), WORK( S+ST1 ), WORK( NWORK ), + $ IWORK( IWK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + END IF + 80 CONTINUE +* +* Unscale and sort the singular values. +* + CALL DLASCL( 'G', 0, 0, ONE, ORGNRM, N, 1, D, N, INFO ) + CALL DLASRT( 'D', N, D, INFO ) + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, N, NRHS, B, LDB, INFO ) +* + RETURN +* +* End of DLALSD +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlamrg.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlamrg.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,103 @@ + SUBROUTINE DLAMRG( N1, N2, A, DTRD1, DTRD2, INDEX ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER DTRD1, DTRD2, N1, N2 +* .. +* .. Array Arguments .. + INTEGER INDEX( * ) + DOUBLE PRECISION A( * ) +* .. +* +* Purpose +* ======= +* +* DLAMRG will create a permutation list which will merge the elements +* of A (which is composed of two independently sorted sets) into a +* single set which is sorted in ascending order. +* +* Arguments +* ========= +* +* N1 (input) INTEGER +* N2 (input) INTEGER +* These arguements contain the respective lengths of the two +* sorted lists to be merged. +* +* A (input) DOUBLE PRECISION array, dimension (N1+N2) +* The first N1 elements of A contain a list of numbers which +* are sorted in either ascending or descending order. Likewise +* for the final N2 elements. +* +* DTRD1 (input) INTEGER +* DTRD2 (input) INTEGER +* These are the strides to be taken through the array A. +* Allowable strides are 1 and -1. They indicate whether a +* subset of A is sorted in ascending (DTRDx = 1) or descending +* (DTRDx = -1) order. +* +* INDEX (output) INTEGER array, dimension (N1+N2) +* On exit this array will contain a permutation such that +* if B( I ) = A( INDEX( I ) ) for I=1,N1+N2, then B will be +* sorted in ascending order. +* +* ===================================================================== +* +* .. Local Scalars .. + INTEGER I, IND1, IND2, N1SV, N2SV +* .. +* .. Executable Statements .. +* + N1SV = N1 + N2SV = N2 + IF( DTRD1.GT.0 ) THEN + IND1 = 1 + ELSE + IND1 = N1 + END IF + IF( DTRD2.GT.0 ) THEN + IND2 = 1 + N1 + ELSE + IND2 = N1 + N2 + END IF + I = 1 +* while ( (N1SV > 0) & (N2SV > 0) ) + 10 CONTINUE + IF( N1SV.GT.0 .AND. N2SV.GT.0 ) THEN + IF( A( IND1 ).LE.A( IND2 ) ) THEN + INDEX( I ) = IND1 + I = I + 1 + IND1 = IND1 + DTRD1 + N1SV = N1SV - 1 + ELSE + INDEX( I ) = IND2 + I = I + 1 + IND2 = IND2 + DTRD2 + N2SV = N2SV - 1 + END IF + GO TO 10 + END IF +* end while + IF( N1SV.EQ.0 ) THEN + DO 20 N1SV = 1, N2SV + INDEX( I ) = IND2 + I = I + 1 + IND2 = IND2 + DTRD2 + 20 CONTINUE + ELSE +* N2SV .EQ. 0 + DO 30 N2SV = 1, N1SV + INDEX( I ) = IND1 + I = I + 1 + IND1 = IND1 + DTRD1 + 30 CONTINUE + END IF +* + RETURN +* +* End of DLAMRG +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd0.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd0.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,230 @@ + SUBROUTINE DLASD0( N, SQRE, D, E, U, LDU, VT, LDVT, SMLSIZ, IWORK, + $ WORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER INFO, LDU, LDVT, N, SMLSIZ, SQRE +* .. +* .. Array Arguments .. + INTEGER IWORK( * ) + DOUBLE PRECISION D( * ), E( * ), U( LDU, * ), VT( LDVT, * ), + $ WORK( * ) +* .. +* +* Purpose +* ======= +* +* Using a divide and conquer approach, DLASD0 computes the singular +* value decomposition (SVD) of a real upper bidiagonal N-by-M +* matrix B with diagonal D and offdiagonal E, where M = N + SQRE. +* The algorithm computes orthogonal matrices U and VT such that +* B = U * S * VT. The singular values S are overwritten on D. +* +* A related subroutine, DLASDA, computes only the singular values, +* and optionally, the singular vectors in compact form. +* +* Arguments +* ========= +* +* N (input) INTEGER +* On entry, the row dimension of the upper bidiagonal matrix. +* This is also the dimension of the main diagonal array D. +* +* SQRE (input) INTEGER +* Specifies the column dimension of the bidiagonal matrix. +* = 0: The bidiagonal matrix has column dimension M = N; +* = 1: The bidiagonal matrix has column dimension M = N+1; +* +* D (input/output) DOUBLE PRECISION array, dimension (N) +* On entry D contains the main diagonal of the bidiagonal +* matrix. +* On exit D, if INFO = 0, contains its singular values. +* +* E (input) DOUBLE PRECISION array, dimension (M-1) +* Contains the subdiagonal entries of the bidiagonal matrix. +* On exit, E has been destroyed. +* +* U (output) DOUBLE PRECISION array, dimension at least (LDQ, N) +* On exit, U contains the left singular vectors. +* +* LDU (input) INTEGER +* On entry, leading dimension of U. +* +* VT (output) DOUBLE PRECISION array, dimension at least (LDVT, M) +* On exit, VT' contains the right singular vectors. +* +* LDVT (input) INTEGER +* On entry, leading dimension of VT. +* +* SMLSIZ (input) INTEGER +* On entry, maximum size of the subproblems at the +* bottom of the computation tree. +* +* IWORK (workspace) INTEGER work array. +* Dimension must be at least (8 * N) +* +* WORK (workspace) DOUBLE PRECISION work array. +* Dimension must be at least (3 * M**2 + 2 * M) +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: if INFO = 1, an singular value did not converge +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Local Scalars .. + INTEGER I, I1, IC, IDXQ, IDXQC, IM1, INODE, ITEMP, IWK, + $ J, LF, LL, LVL, M, NCC, ND, NDB1, NDIML, NDIMR, + $ NL, NLF, NLP1, NLVL, NR, NRF, NRP1, SQREI + DOUBLE PRECISION ALPHA, BETA +* .. +* .. External Subroutines .. + EXTERNAL DLASD1, DLASDQ, DLASDT, XERBLA +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( N.LT.0 ) THEN + INFO = -1 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -2 + END IF +* + M = N + SQRE +* + IF( LDU.LT.N ) THEN + INFO = -6 + ELSE IF( LDVT.LT.M ) THEN + INFO = -8 + ELSE IF( SMLSIZ.LT.3 ) THEN + INFO = -9 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD0', -INFO ) + RETURN + END IF +* +* If the input matrix is too small, call DLASDQ to find the SVD. +* + IF( N.LE.SMLSIZ ) THEN + CALL DLASDQ( 'U', SQRE, N, M, N, 0, D, E, VT, LDVT, U, LDU, U, + $ LDU, WORK, INFO ) + RETURN + END IF +* +* Set up the computation tree. +* + INODE = 1 + NDIML = INODE + N + NDIMR = NDIML + N + IDXQ = NDIMR + N + IWK = IDXQ + N + CALL DLASDT( N, NLVL, ND, IWORK( INODE ), IWORK( NDIML ), + $ IWORK( NDIMR ), SMLSIZ ) +* +* For the nodes on bottom level of the tree, solve +* their subproblems by DLASDQ. +* + NDB1 = ( ND+1 ) / 2 + NCC = 0 + DO 30 I = NDB1, ND +* +* IC : center row of each node +* NL : number of rows of left subproblem +* NR : number of rows of right subproblem +* NLF: starting row of the left subproblem +* NRF: starting row of the right subproblem +* + I1 = I - 1 + IC = IWORK( INODE+I1 ) + NL = IWORK( NDIML+I1 ) + NLP1 = NL + 1 + NR = IWORK( NDIMR+I1 ) + NRP1 = NR + 1 + NLF = IC - NL + NRF = IC + 1 + SQREI = 1 + CALL DLASDQ( 'U', SQREI, NL, NLP1, NL, NCC, D( NLF ), E( NLF ), + $ VT( NLF, NLF ), LDVT, U( NLF, NLF ), LDU, + $ U( NLF, NLF ), LDU, WORK, INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + ITEMP = IDXQ + NLF - 2 + DO 10 J = 1, NL + IWORK( ITEMP+J ) = J + 10 CONTINUE + IF( I.EQ.ND ) THEN + SQREI = SQRE + ELSE + SQREI = 1 + END IF + NRP1 = NR + SQREI + CALL DLASDQ( 'U', SQREI, NR, NRP1, NR, NCC, D( NRF ), E( NRF ), + $ VT( NRF, NRF ), LDVT, U( NRF, NRF ), LDU, + $ U( NRF, NRF ), LDU, WORK, INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + ITEMP = IDXQ + IC + DO 20 J = 1, NR + IWORK( ITEMP+J-1 ) = J + 20 CONTINUE + 30 CONTINUE +* +* Now conquer each subproblem bottom-up. +* + DO 50 LVL = NLVL, 1, -1 +* +* Find the first node LF and last node LL on the +* current level LVL. +* + IF( LVL.EQ.1 ) THEN + LF = 1 + LL = 1 + ELSE + LF = 2**( LVL-1 ) + LL = 2*LF - 1 + END IF + DO 40 I = LF, LL + IM1 = I - 1 + IC = IWORK( INODE+IM1 ) + NL = IWORK( NDIML+IM1 ) + NR = IWORK( NDIMR+IM1 ) + NLF = IC - NL + IF( ( SQRE.EQ.0 ) .AND. ( I.EQ.LL ) ) THEN + SQREI = SQRE + ELSE + SQREI = 1 + END IF + IDXQC = IDXQ + NLF - 1 + ALPHA = D( IC ) + BETA = E( IC ) + CALL DLASD1( NL, NR, SQREI, D( NLF ), ALPHA, BETA, + $ U( NLF, NLF ), LDU, VT( NLF, NLF ), LDVT, + $ IWORK( IDXQC ), IWORK( IWK ), WORK, INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + 40 CONTINUE + 50 CONTINUE +* + RETURN +* +* End of DLASD0 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd1.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd1.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,232 @@ + SUBROUTINE DLASD1( NL, NR, SQRE, D, ALPHA, BETA, U, LDU, VT, LDVT, + $ IDXQ, IWORK, WORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER INFO, LDU, LDVT, NL, NR, SQRE + DOUBLE PRECISION ALPHA, BETA +* .. +* .. Array Arguments .. + INTEGER IDXQ( * ), IWORK( * ) + DOUBLE PRECISION D( * ), U( LDU, * ), VT( LDVT, * ), WORK( * ) +* .. +* +* Purpose +* ======= +* +* DLASD1 computes the SVD of an upper bidiagonal N-by-M matrix B, +* where N = NL + NR + 1 and M = N + SQRE. DLASD1 is called from DLASD0. +* +* A related subroutine DLASD7 handles the case in which the singular +* values (and the singular vectors in factored form) are desired. +* +* DLASD1 computes the SVD as follows: +* +* ( D1(in) 0 0 0 ) +* B = U(in) * ( Z1' a Z2' b ) * VT(in) +* ( 0 0 D2(in) 0 ) +* +* = U(out) * ( D(out) 0) * VT(out) +* +* where Z' = (Z1' a Z2' b) = u' VT', and u is a vector of dimension M +* with ALPHA and BETA in the NL+1 and NL+2 th entries and zeros +* elsewhere; and the entry b is empty if SQRE = 0. +* +* The left singular vectors of the original matrix are stored in U, and +* the transpose of the right singular vectors are stored in VT, and the +* singular values are in D. The algorithm consists of three stages: +* +* The first stage consists of deflating the size of the problem +* when there are multiple singular values or when there are zeros in +* the Z vector. For each such occurence the dimension of the +* secular equation problem is reduced by one. This stage is +* performed by the routine DLASD2. +* +* The second stage consists of calculating the updated +* singular values. This is done by finding the square roots of the +* roots of the secular equation via the routine DLASD4 (as called +* by DLASD3). This routine also calculates the singular vectors of +* the current problem. +* +* The final stage consists of computing the updated singular vectors +* directly using the updated singular values. The singular vectors +* for the current problem are multiplied with the singular vectors +* from the overall problem. +* +* Arguments +* ========= +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has row dimension N = NL + NR + 1, +* and column dimension M = N + SQRE. +* +* D (input/output) DOUBLE PRECISION array, +* dimension (N = NL+NR+1). +* On entry D(1:NL,1:NL) contains the singular values of the +* upper block; and D(NL+2:N) contains the singular values of +* the lower block. On exit D(1:N) contains the singular values +* of the modified matrix. +* +* ALPHA (input/output) DOUBLE PRECISION +* Contains the diagonal element associated with the added row. +* +* BETA (input/output) DOUBLE PRECISION +* Contains the off-diagonal element associated with the added +* row. +* +* U (input/output) DOUBLE PRECISION array, dimension(LDU,N) +* On entry U(1:NL, 1:NL) contains the left singular vectors of +* the upper block; U(NL+2:N, NL+2:N) contains the left singular +* vectors of the lower block. On exit U contains the left +* singular vectors of the bidiagonal matrix. +* +* LDU (input) INTEGER +* The leading dimension of the array U. LDU >= max( 1, N ). +* +* VT (input/output) DOUBLE PRECISION array, dimension(LDVT,M) +* where M = N + SQRE. +* On entry VT(1:NL+1, 1:NL+1)' contains the right singular +* vectors of the upper block; VT(NL+2:M, NL+2:M)' contains +* the right singular vectors of the lower block. On exit +* VT' contains the right singular vectors of the +* bidiagonal matrix. +* +* LDVT (input) INTEGER +* The leading dimension of the array VT. LDVT >= max( 1, M ). +* +* IDXQ (output) INTEGER array, dimension(N) +* This contains the permutation which will reintegrate the +* subproblem just solved back into sorted order, i.e. +* D( IDXQ( I = 1, N ) ) will be in ascending order. +* +* IWORK (workspace) INTEGER array, dimension( 4 * N ) +* +* WORK (workspace) DOUBLE PRECISION array, dimension( 3*M**2 + 2*M ) +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: if INFO = 1, an singular value did not converge +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. +* + DOUBLE PRECISION ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER COLTYP, I, IDX, IDXC, IDXP, IQ, ISIGMA, IU2, + $ IVT2, IZ, K, LDQ, LDU2, LDVT2, M, N, N1, N2 + DOUBLE PRECISION ORGNRM +* .. +* .. External Subroutines .. + EXTERNAL DLAMRG, DLASCL, DLASD2, DLASD3, XERBLA +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( NL.LT.1 ) THEN + INFO = -1 + ELSE IF( NR.LT.1 ) THEN + INFO = -2 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -3 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD1', -INFO ) + RETURN + END IF +* + N = NL + NR + 1 + M = N + SQRE +* +* The following values are for bookkeeping purposes only. They are +* integer pointers which indicate the portion of the workspace +* used by a particular array in DLASD2 and DLASD3. +* + LDU2 = N + LDVT2 = M +* + IZ = 1 + ISIGMA = IZ + M + IU2 = ISIGMA + N + IVT2 = IU2 + LDU2*N + IQ = IVT2 + LDVT2*M +* + IDX = 1 + IDXC = IDX + N + COLTYP = IDXC + N + IDXP = COLTYP + N +* +* Scale. +* + ORGNRM = MAX( ABS( ALPHA ), ABS( BETA ) ) + D( NL+1 ) = ZERO + DO 10 I = 1, N + IF( ABS( D( I ) ).GT.ORGNRM ) THEN + ORGNRM = ABS( D( I ) ) + END IF + 10 CONTINUE + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, N, 1, D, N, INFO ) + ALPHA = ALPHA / ORGNRM + BETA = BETA / ORGNRM +* +* Deflate singular values. +* + CALL DLASD2( NL, NR, SQRE, K, D, WORK( IZ ), ALPHA, BETA, U, LDU, + $ VT, LDVT, WORK( ISIGMA ), WORK( IU2 ), LDU2, + $ WORK( IVT2 ), LDVT2, IWORK( IDXP ), IWORK( IDX ), + $ IWORK( IDXC ), IDXQ, IWORK( COLTYP ), INFO ) +* +* Solve Secular Equation and update singular vectors. +* + LDQ = K + CALL DLASD3( NL, NR, SQRE, K, D, WORK( IQ ), LDQ, WORK( ISIGMA ), + $ U, LDU, WORK( IU2 ), LDU2, VT, LDVT, WORK( IVT2 ), + $ LDVT2, IWORK( IDXC ), IWORK( COLTYP ), WORK( IZ ), + $ INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF +* +* Unscale. +* + CALL DLASCL( 'G', 0, 0, ONE, ORGNRM, N, 1, D, N, INFO ) +* +* Prepare the IDXQ sorting permutation. +* + N1 = K + N2 = N - K + CALL DLAMRG( N1, N2, D, 1, -1, IDXQ ) +* + RETURN +* +* End of DLASD1 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd2.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd2.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,512 @@ + SUBROUTINE DLASD2( NL, NR, SQRE, K, D, Z, ALPHA, BETA, U, LDU, VT, + $ LDVT, DSIGMA, U2, LDU2, VT2, LDVT2, IDXP, IDX, + $ IDXC, IDXQ, COLTYP, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER INFO, K, LDU, LDU2, LDVT, LDVT2, NL, NR, SQRE + DOUBLE PRECISION ALPHA, BETA +* .. +* .. Array Arguments .. + INTEGER COLTYP( * ), IDX( * ), IDXC( * ), IDXP( * ), + $ IDXQ( * ) + DOUBLE PRECISION D( * ), DSIGMA( * ), U( LDU, * ), + $ U2( LDU2, * ), VT( LDVT, * ), VT2( LDVT2, * ), + $ Z( * ) +* .. +* +* Purpose +* ======= +* +* DLASD2 merges the two sets of singular values together into a single +* sorted set. Then it tries to deflate the size of the problem. +* There are two ways in which deflation can occur: when two or more +* singular values are close together or if there is a tiny entry in the +* Z vector. For each such occurrence the order of the related secular +* equation problem is reduced by one. +* +* DLASD2 is called from DLASD1. +* +* Arguments +* ========= +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has N = NL + NR + 1 rows and +* M = N + SQRE >= N columns. +* +* K (output) INTEGER +* Contains the dimension of the non-deflated matrix, +* This is the order of the related secular equation. 1 <= K <=N. +* +* D (input/output) DOUBLE PRECISION array, dimension(N) +* On entry D contains the singular values of the two submatrices +* to be combined. On exit D contains the trailing (N-K) updated +* singular values (those which were deflated) sorted into +* increasing order. +* +* Z (output) DOUBLE PRECISION array, dimension(N) +* On exit Z contains the updating row vector in the secular +* equation. +* +* ALPHA (input) DOUBLE PRECISION +* Contains the diagonal element associated with the added row. +* +* BETA (input) DOUBLE PRECISION +* Contains the off-diagonal element associated with the added +* row. +* +* U (input/output) DOUBLE PRECISION array, dimension(LDU,N) +* On entry U contains the left singular vectors of two +* submatrices in the two square blocks with corners at (1,1), +* (NL, NL), and (NL+2, NL+2), (N,N). +* On exit U contains the trailing (N-K) updated left singular +* vectors (those which were deflated) in its last N-K columns. +* +* LDU (input) INTEGER +* The leading dimension of the array U. LDU >= N. +* +* VT (input/output) DOUBLE PRECISION array, dimension(LDVT,M) +* On entry VT' contains the right singular vectors of two +* submatrices in the two square blocks with corners at (1,1), +* (NL+1, NL+1), and (NL+2, NL+2), (M,M). +* On exit VT' contains the trailing (N-K) updated right singular +* vectors (those which were deflated) in its last N-K columns. +* In case SQRE =1, the last row of VT spans the right null +* space. +* +* LDVT (input) INTEGER +* The leading dimension of the array VT. LDVT >= M. +* +* DSIGMA (output) DOUBLE PRECISION array, dimension (N) +* Contains a copy of the diagonal elements (K-1 singular values +* and one zero) in the secular equation. +* +* U2 (output) DOUBLE PRECISION array, dimension(LDU2,N) +* Contains a copy of the first K-1 left singular vectors which +* will be used by DLASD3 in a matrix multiply (DGEMM) to solve +* for the new left singular vectors. U2 is arranged into four +* blocks. The first block contains a column with 1 at NL+1 and +* zero everywhere else; the second block contains non-zero +* entries only at and above NL; the third contains non-zero +* entries only below NL+1; and the fourth is dense. +* +* LDU2 (input) INTEGER +* The leading dimension of the array U2. LDU2 >= N. +* +* VT2 (output) DOUBLE PRECISION array, dimension(LDVT2,N) +* VT2' contains a copy of the first K right singular vectors +* which will be used by DLASD3 in a matrix multiply (DGEMM) to +* solve for the new right singular vectors. VT2 is arranged into +* three blocks. The first block contains a row that corresponds +* to the special 0 diagonal element in SIGMA; the second block +* contains non-zeros only at and before NL +1; the third block +* contains non-zeros only at and after NL +2. +* +* LDVT2 (input) INTEGER +* The leading dimension of the array VT2. LDVT2 >= M. +* +* IDXP (workspace) INTEGER array dimension(N) +* This will contain the permutation used to place deflated +* values of D at the end of the array. On output IDXP(2:K) +* points to the nondeflated D-values and IDXP(K+1:N) +* points to the deflated singular values. +* +* IDX (workspace) INTEGER array dimension(N) +* This will contain the permutation used to sort the contents of +* D into ascending order. +* +* IDXC (output) INTEGER array dimension(N) +* This will contain the permutation used to arrange the columns +* of the deflated U matrix into three groups: the first group +* contains non-zero entries only at and above NL, the second +* contains non-zero entries only below NL+2, and the third is +* dense. +* +* IDXQ (input/output) INTEGER array dimension(N) +* This contains the permutation which separately sorts the two +* sub-problems in D into ascending order. Note that entries in +* the first hlaf of this permutation must first be moved one +* position backward; and entries in the second half +* must first have NL+1 added to their values. +* +* COLTYP (workspace/output) INTEGER array dimension(N) +* As workspace, this will contain a label which will indicate +* which of the following types a column in the U2 matrix or a +* row in the VT2 matrix is: +* 1 : non-zero in the upper half only +* 2 : non-zero in the lower half only +* 3 : dense +* 4 : deflated +* +* On exit, it is an array of dimension 4, with COLTYP(I) being +* the dimension of the I-th type columns. +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO, EIGHT + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0, + $ EIGHT = 8.0D+0 ) +* .. +* .. Local Arrays .. + INTEGER CTOT( 4 ), PSM( 4 ) +* .. +* .. Local Scalars .. + INTEGER CT, I, IDXI, IDXJ, IDXJP, J, JP, JPREV, K2, M, + $ N, NLP1, NLP2 + DOUBLE PRECISION C, EPS, HLFTOL, S, TAU, TOL, Z1 +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMCH, DLAPY2 + EXTERNAL DLAMCH, DLAPY2 +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DLACPY, DLAMRG, DLASET, DROT, XERBLA +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( NL.LT.1 ) THEN + INFO = -1 + ELSE IF( NR.LT.1 ) THEN + INFO = -2 + ELSE IF( ( SQRE.NE.1 ) .AND. ( SQRE.NE.0 ) ) THEN + INFO = -3 + END IF +* + N = NL + NR + 1 + M = N + SQRE +* + IF( LDU.LT.N ) THEN + INFO = -10 + ELSE IF( LDVT.LT.M ) THEN + INFO = -12 + ELSE IF( LDU2.LT.N ) THEN + INFO = -15 + ELSE IF( LDVT2.LT.M ) THEN + INFO = -17 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD2', -INFO ) + RETURN + END IF +* + NLP1 = NL + 1 + NLP2 = NL + 2 +* +* Generate the first part of the vector Z; and move the singular +* values in the first part of D one position backward. +* + Z1 = ALPHA*VT( NLP1, NLP1 ) + Z( 1 ) = Z1 + DO 10 I = NL, 1, -1 + Z( I+1 ) = ALPHA*VT( I, NLP1 ) + D( I+1 ) = D( I ) + IDXQ( I+1 ) = IDXQ( I ) + 1 + 10 CONTINUE +* +* Generate the second part of the vector Z. +* + DO 20 I = NLP2, M + Z( I ) = BETA*VT( I, NLP2 ) + 20 CONTINUE +* +* Initialize some reference arrays. +* + DO 30 I = 2, NLP1 + COLTYP( I ) = 1 + 30 CONTINUE + DO 40 I = NLP2, N + COLTYP( I ) = 2 + 40 CONTINUE +* +* Sort the singular values into increasing order +* + DO 50 I = NLP2, N + IDXQ( I ) = IDXQ( I ) + NLP1 + 50 CONTINUE +* +* DSIGMA, IDXC, IDXC, and the first column of U2 +* are used as storage space. +* + DO 60 I = 2, N + DSIGMA( I ) = D( IDXQ( I ) ) + U2( I, 1 ) = Z( IDXQ( I ) ) + IDXC( I ) = COLTYP( IDXQ( I ) ) + 60 CONTINUE +* + CALL DLAMRG( NL, NR, DSIGMA( 2 ), 1, 1, IDX( 2 ) ) +* + DO 70 I = 2, N + IDXI = 1 + IDX( I ) + D( I ) = DSIGMA( IDXI ) + Z( I ) = U2( IDXI, 1 ) + COLTYP( I ) = IDXC( IDXI ) + 70 CONTINUE +* +* Calculate the allowable deflation tolerance +* + EPS = DLAMCH( 'Epsilon' ) + TOL = MAX( ABS( ALPHA ), ABS( BETA ) ) + TOL = EIGHT*EPS*MAX( ABS( D( N ) ), TOL ) +* +* There are 2 kinds of deflation -- first a value in the z-vector +* is small, second two (or more) singular values are very close +* together (their difference is small). +* +* If the value in the z-vector is small, we simply permute the +* array so that the corresponding singular value is moved to the +* end. +* +* If two values in the D-vector are close, we perform a two-sided +* rotation designed to make one of the corresponding z-vector +* entries zero, and then permute the array so that the deflated +* singular value is moved to the end. +* +* If there are multiple singular values then the problem deflates. +* Here the number of equal singular values are found. As each equal +* singular value is found, an elementary reflector is computed to +* rotate the corresponding singular subspace so that the +* corresponding components of Z are zero in this new basis. +* + K = 1 + K2 = N + 1 + DO 80 J = 2, N + IF( ABS( Z( J ) ).LE.TOL ) THEN +* +* Deflate due to small z component. +* + K2 = K2 - 1 + IDXP( K2 ) = J + COLTYP( J ) = 4 + IF( J.EQ.N ) + $ GO TO 120 + ELSE + JPREV = J + GO TO 90 + END IF + 80 CONTINUE + 90 CONTINUE + J = JPREV + 100 CONTINUE + J = J + 1 + IF( J.GT.N ) + $ GO TO 110 + IF( ABS( Z( J ) ).LE.TOL ) THEN +* +* Deflate due to small z component. +* + K2 = K2 - 1 + IDXP( K2 ) = J + COLTYP( J ) = 4 + ELSE +* +* Check if singular values are close enough to allow deflation. +* + IF( ABS( D( J )-D( JPREV ) ).LE.TOL ) THEN +* +* Deflation is possible. +* + S = Z( JPREV ) + C = Z( J ) +* +* Find sqrt(a**2+b**2) without overflow or +* destructive underflow. +* + TAU = DLAPY2( C, S ) + C = C / TAU + S = -S / TAU + Z( J ) = TAU + Z( JPREV ) = ZERO +* +* Apply back the Givens rotation to the left and right +* singular vector matrices. +* + IDXJP = IDXQ( IDX( JPREV )+1 ) + IDXJ = IDXQ( IDX( J )+1 ) + IF( IDXJP.LE.NLP1 ) THEN + IDXJP = IDXJP - 1 + END IF + IF( IDXJ.LE.NLP1 ) THEN + IDXJ = IDXJ - 1 + END IF + CALL DROT( N, U( 1, IDXJP ), 1, U( 1, IDXJ ), 1, C, S ) + CALL DROT( M, VT( IDXJP, 1 ), LDVT, VT( IDXJ, 1 ), LDVT, C, + $ S ) + IF( COLTYP( J ).NE.COLTYP( JPREV ) ) THEN + COLTYP( J ) = 3 + END IF + COLTYP( JPREV ) = 4 + K2 = K2 - 1 + IDXP( K2 ) = JPREV + JPREV = J + ELSE + K = K + 1 + U2( K, 1 ) = Z( JPREV ) + DSIGMA( K ) = D( JPREV ) + IDXP( K ) = JPREV + JPREV = J + END IF + END IF + GO TO 100 + 110 CONTINUE +* +* Record the last singular value. +* + K = K + 1 + U2( K, 1 ) = Z( JPREV ) + DSIGMA( K ) = D( JPREV ) + IDXP( K ) = JPREV +* + 120 CONTINUE +* +* Count up the total number of the various types of columns, then +* form a permutation which positions the four column types into +* four groups of uniform structure (although one or more of these +* groups may be empty). +* + DO 130 J = 1, 4 + CTOT( J ) = 0 + 130 CONTINUE + DO 140 J = 2, N + CT = COLTYP( J ) + CTOT( CT ) = CTOT( CT ) + 1 + 140 CONTINUE +* +* PSM(*) = Position in SubMatrix (of types 1 through 4) +* + PSM( 1 ) = 2 + PSM( 2 ) = 2 + CTOT( 1 ) + PSM( 3 ) = PSM( 2 ) + CTOT( 2 ) + PSM( 4 ) = PSM( 3 ) + CTOT( 3 ) +* +* Fill out the IDXC array so that the permutation which it induces +* will place all type-1 columns first, all type-2 columns next, +* then all type-3's, and finally all type-4's, starting from the +* second column. This applies similarly to the rows of VT. +* + DO 150 J = 2, N + JP = IDXP( J ) + CT = COLTYP( JP ) + IDXC( PSM( CT ) ) = J + PSM( CT ) = PSM( CT ) + 1 + 150 CONTINUE +* +* Sort the singular values and corresponding singular vectors into +* DSIGMA, U2, and VT2 respectively. The singular values/vectors +* which were not deflated go into the first K slots of DSIGMA, U2, +* and VT2 respectively, while those which were deflated go into the +* last N - K slots, except that the first column/row will be treated +* separately. +* + DO 160 J = 2, N + JP = IDXP( J ) + DSIGMA( J ) = D( JP ) + IDXJ = IDXQ( IDX( IDXP( IDXC( J ) ) )+1 ) + IF( IDXJ.LE.NLP1 ) THEN + IDXJ = IDXJ - 1 + END IF + CALL DCOPY( N, U( 1, IDXJ ), 1, U2( 1, J ), 1 ) + CALL DCOPY( M, VT( IDXJ, 1 ), LDVT, VT2( J, 1 ), LDVT2 ) + 160 CONTINUE +* +* Determine DSIGMA(1), DSIGMA(2) and Z(1) +* + DSIGMA( 1 ) = ZERO + HLFTOL = TOL / TWO + IF( ABS( DSIGMA( 2 ) ).LE.HLFTOL ) + $ DSIGMA( 2 ) = HLFTOL + IF( M.GT.N ) THEN + Z( 1 ) = DLAPY2( Z1, Z( M ) ) + IF( Z( 1 ).LE.TOL ) THEN + C = ONE + S = ZERO + Z( 1 ) = TOL + ELSE + C = Z1 / Z( 1 ) + S = Z( M ) / Z( 1 ) + END IF + ELSE + IF( ABS( Z1 ).LE.TOL ) THEN + Z( 1 ) = TOL + ELSE + Z( 1 ) = Z1 + END IF + END IF +* +* Move the rest of the updating row to Z. +* + CALL DCOPY( K-1, U2( 2, 1 ), 1, Z( 2 ), 1 ) +* +* Determine the first column of U2, the first row of VT2 and the +* last row of VT. +* + CALL DLASET( 'A', N, 1, ZERO, ZERO, U2, LDU2 ) + U2( NLP1, 1 ) = ONE + IF( M.GT.N ) THEN + DO 170 I = 1, NLP1 + VT( M, I ) = -S*VT( NLP1, I ) + VT2( 1, I ) = C*VT( NLP1, I ) + 170 CONTINUE + DO 180 I = NLP2, M + VT2( 1, I ) = S*VT( M, I ) + VT( M, I ) = C*VT( M, I ) + 180 CONTINUE + ELSE + CALL DCOPY( M, VT( NLP1, 1 ), LDVT, VT2( 1, 1 ), LDVT2 ) + END IF + IF( M.GT.N ) THEN + CALL DCOPY( M, VT( M, 1 ), LDVT, VT2( M, 1 ), LDVT2 ) + END IF +* +* The deflated singular values and their corresponding vectors go +* into the back of D, U, and V respectively. +* + IF( N.GT.K ) THEN + CALL DCOPY( N-K, DSIGMA( K+1 ), 1, D( K+1 ), 1 ) + CALL DLACPY( 'A', N, N-K, U2( 1, K+1 ), LDU2, U( 1, K+1 ), + $ LDU ) + CALL DLACPY( 'A', N-K, M, VT2( K+1, 1 ), LDVT2, VT( K+1, 1 ), + $ LDVT ) + END IF +* +* Copy CTOT into COLTYP for referencing in DLASD3. +* + DO 190 J = 1, 4 + COLTYP( J ) = CTOT( J ) + 190 CONTINUE +* + RETURN +* +* End of DLASD2 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd3.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd3.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,358 @@ + SUBROUTINE DLASD3( NL, NR, SQRE, K, D, Q, LDQ, DSIGMA, U, LDU, U2, + $ LDU2, VT, LDVT, VT2, LDVT2, IDXC, CTOT, Z, + $ INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER INFO, K, LDQ, LDU, LDU2, LDVT, LDVT2, NL, NR, + $ SQRE +* .. +* .. Array Arguments .. + INTEGER CTOT( * ), IDXC( * ) + DOUBLE PRECISION D( * ), DSIGMA( * ), Q( LDQ, * ), U( LDU, * ), + $ U2( LDU2, * ), VT( LDVT, * ), VT2( LDVT2, * ), + $ Z( * ) +* .. +* +* Purpose +* ======= +* +* DLASD3 finds all the square roots of the roots of the secular +* equation, as defined by the values in D and Z. It makes the +* appropriate calls to DLASD4 and then updates the singular +* vectors by matrix multiplication. +* +* This code makes very mild assumptions about floating point +* arithmetic. It will work on machines with a guard digit in +* add/subtract, or on those binary machines without guard digits +* which subtract like the Cray XMP, Cray YMP, Cray C 90, or Cray 2. +* It could conceivably fail on hexadecimal or decimal machines +* without guard digits, but we know of none. +* +* DLASD3 is called from DLASD1. +* +* Arguments +* ========= +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has N = NL + NR + 1 rows and +* M = N + SQRE >= N columns. +* +* K (input) INTEGER +* The size of the secular equation, 1 =< K = < N. +* +* D (output) DOUBLE PRECISION array, dimension(K) +* On exit the square roots of the roots of the secular equation, +* in ascending order. +* +* Q (workspace) DOUBLE PRECISION array, +* dimension at least (LDQ,K). +* +* LDQ (input) INTEGER +* The leading dimension of the array Q. LDQ >= K. +* +* DSIGMA (input) DOUBLE PRECISION array, dimension(K) +* The first K elements of this array contain the old roots +* of the deflated updating problem. These are the poles +* of the secular equation. +* +* U (output) DOUBLE PRECISION array, dimension (LDU, N) +* The last N - K columns of this matrix contain the deflated +* left singular vectors. +* +* LDU (input) INTEGER +* The leading dimension of the array U. LDU >= N. +* +* U2 (input/output) DOUBLE PRECISION array, dimension (LDU2, N) +* The first K columns of this matrix contain the non-deflated +* left singular vectors for the split problem. +* +* LDU2 (input) INTEGER +* The leading dimension of the array U2. LDU2 >= N. +* +* VT (output) DOUBLE PRECISION array, dimension (LDVT, M) +* The last M - K columns of VT' contain the deflated +* right singular vectors. +* +* LDVT (input) INTEGER +* The leading dimension of the array VT. LDVT >= N. +* +* VT2 (input/output) DOUBLE PRECISION array, dimension (LDVT2, N) +* The first K columns of VT2' contain the non-deflated +* right singular vectors for the split problem. +* +* LDVT2 (input) INTEGER +* The leading dimension of the array VT2. LDVT2 >= N. +* +* IDXC (input) INTEGER array, dimension ( N ) +* The permutation used to arrange the columns of U (and rows of +* VT) into three groups: the first group contains non-zero +* entries only at and above (or before) NL +1; the second +* contains non-zero entries only at and below (or after) NL+2; +* and the third is dense. The first column of U and the row of +* VT are treated separately, however. +* +* The rows of the singular vectors found by DLASD4 +* must be likewise permuted before the matrix multiplies can +* take place. +* +* CTOT (input) INTEGER array, dimension ( 4 ) +* A count of the total number of the various types of columns +* in U (or rows in VT), as described in IDXC. The fourth column +* type is any column which has been deflated. +* +* Z (input) DOUBLE PRECISION array, dimension (K) +* The first K elements of this array contain the components +* of the deflation-adjusted updating row vector. +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: if INFO = 1, an singular value did not converge +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ONE, ZERO, NEGONE + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0, + $ NEGONE = -1.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER CTEMP, I, J, JC, KTEMP, M, N, NLP1, NLP2, NRP1 + DOUBLE PRECISION RHO, TEMP +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMC3, DNRM2 + EXTERNAL DLAMC3, DNRM2 +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DGEMM, DLACPY, DLASCL, DLASD4, XERBLA +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, SIGN, SQRT +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( NL.LT.1 ) THEN + INFO = -1 + ELSE IF( NR.LT.1 ) THEN + INFO = -2 + ELSE IF( ( SQRE.NE.1 ) .AND. ( SQRE.NE.0 ) ) THEN + INFO = -3 + END IF +* + N = NL + NR + 1 + M = N + SQRE + NLP1 = NL + 1 + NLP2 = NL + 2 +* + IF( ( K.LT.1 ) .OR. ( K.GT.N ) ) THEN + INFO = -4 + ELSE IF( LDQ.LT.K ) THEN + INFO = -7 + ELSE IF( LDU.LT.N ) THEN + INFO = -10 + ELSE IF( LDU2.LT.N ) THEN + INFO = -12 + ELSE IF( LDVT.LT.M ) THEN + INFO = -14 + ELSE IF( LDVT2.LT.M ) THEN + INFO = -16 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD3', -INFO ) + RETURN + END IF +* +* Quick return if possible +* + IF( K.EQ.1 ) THEN + D( 1 ) = ABS( Z( 1 ) ) + CALL DCOPY( M, VT2( 1, 1 ), LDVT2, VT( 1, 1 ), LDVT ) + IF( Z( 1 ).GT.ZERO ) THEN + CALL DCOPY( N, U2( 1, 1 ), 1, U( 1, 1 ), 1 ) + ELSE + DO 10 I = 1, N + U( I, 1 ) = -U2( I, 1 ) + 10 CONTINUE + END IF + RETURN + END IF +* +* Modify values DSIGMA(i) to make sure all DSIGMA(i)-DSIGMA(j) can +* be computed with high relative accuracy (barring over/underflow). +* This is a problem on machines without a guard digit in +* add/subtract (Cray XMP, Cray YMP, Cray C 90 and Cray 2). +* The following code replaces DSIGMA(I) by 2*DSIGMA(I)-DSIGMA(I), +* which on any of these machines zeros out the bottommost +* bit of DSIGMA(I) if it is 1; this makes the subsequent +* subtractions DSIGMA(I)-DSIGMA(J) unproblematic when cancellation +* occurs. On binary machines with a guard digit (almost all +* machines) it does not change DSIGMA(I) at all. On hexadecimal +* and decimal machines with a guard digit, it slightly +* changes the bottommost bits of DSIGMA(I). It does not account +* for hexadecimal or decimal machines without guard digits +* (we know of none). We use a subroutine call to compute +* 2*DSIGMA(I) to prevent optimizing compilers from eliminating +* this code. +* + DO 20 I = 1, K + DSIGMA( I ) = DLAMC3( DSIGMA( I ), DSIGMA( I ) ) - DSIGMA( I ) + 20 CONTINUE +* +* Keep a copy of Z. +* + CALL DCOPY( K, Z, 1, Q, 1 ) +* +* Normalize Z. +* + RHO = DNRM2( K, Z, 1 ) + CALL DLASCL( 'G', 0, 0, RHO, ONE, K, 1, Z, K, INFO ) + RHO = RHO*RHO +* +* Find the new singular values. +* + DO 30 J = 1, K + CALL DLASD4( K, J, DSIGMA, Z, U( 1, J ), RHO, D( J ), + $ VT( 1, J ), INFO ) +* +* If the zero finder fails, the computation is terminated. +* + IF( INFO.NE.0 ) THEN + RETURN + END IF + 30 CONTINUE +* +* Compute updated Z. +* + DO 60 I = 1, K + Z( I ) = U( I, K )*VT( I, K ) + DO 40 J = 1, I - 1 + Z( I ) = Z( I )*( U( I, J )*VT( I, J ) / + $ ( DSIGMA( I )-DSIGMA( J ) ) / + $ ( DSIGMA( I )+DSIGMA( J ) ) ) + 40 CONTINUE + DO 50 J = I, K - 1 + Z( I ) = Z( I )*( U( I, J )*VT( I, J ) / + $ ( DSIGMA( I )-DSIGMA( J+1 ) ) / + $ ( DSIGMA( I )+DSIGMA( J+1 ) ) ) + 50 CONTINUE + Z( I ) = SIGN( SQRT( ABS( Z( I ) ) ), Q( I, 1 ) ) + 60 CONTINUE +* +* Compute left singular vectors of the modified diagonal matrix, +* and store related information for the right singular vectors. +* + DO 90 I = 1, K + VT( 1, I ) = Z( 1 ) / U( 1, I ) / VT( 1, I ) + U( 1, I ) = NEGONE + DO 70 J = 2, K + VT( J, I ) = Z( J ) / U( J, I ) / VT( J, I ) + U( J, I ) = DSIGMA( J )*VT( J, I ) + 70 CONTINUE + TEMP = DNRM2( K, U( 1, I ), 1 ) + Q( 1, I ) = U( 1, I ) / TEMP + DO 80 J = 2, K + JC = IDXC( J ) + Q( J, I ) = U( JC, I ) / TEMP + 80 CONTINUE + 90 CONTINUE +* +* Update the left singular vector matrix. +* + IF( K.EQ.2 ) THEN + CALL DGEMM( 'N', 'N', N, K, K, ONE, U2, LDU2, Q, LDQ, ZERO, U, + $ LDU ) + GO TO 100 + END IF + IF( CTOT( 1 ).GT.0 ) THEN + CALL DGEMM( 'N', 'N', NL, K, CTOT( 1 ), ONE, U2( 1, 2 ), LDU2, + $ Q( 2, 1 ), LDQ, ZERO, U( 1, 1 ), LDU ) + IF( CTOT( 3 ).GT.0 ) THEN + KTEMP = 2 + CTOT( 1 ) + CTOT( 2 ) + CALL DGEMM( 'N', 'N', NL, K, CTOT( 3 ), ONE, U2( 1, KTEMP ), + $ LDU2, Q( KTEMP, 1 ), LDQ, ONE, U( 1, 1 ), LDU ) + END IF + ELSE IF( CTOT( 3 ).GT.0 ) THEN + KTEMP = 2 + CTOT( 1 ) + CTOT( 2 ) + CALL DGEMM( 'N', 'N', NL, K, CTOT( 3 ), ONE, U2( 1, KTEMP ), + $ LDU2, Q( KTEMP, 1 ), LDQ, ZERO, U( 1, 1 ), LDU ) + ELSE + CALL DLACPY( 'F', NL, K, U2, LDU2, U, LDU ) + END IF + CALL DCOPY( K, Q( 1, 1 ), LDQ, U( NLP1, 1 ), LDU ) + KTEMP = 2 + CTOT( 1 ) + CTEMP = CTOT( 2 ) + CTOT( 3 ) + CALL DGEMM( 'N', 'N', NR, K, CTEMP, ONE, U2( NLP2, KTEMP ), LDU2, + $ Q( KTEMP, 1 ), LDQ, ZERO, U( NLP2, 1 ), LDU ) +* +* Generate the right singular vectors. +* + 100 CONTINUE + DO 120 I = 1, K + TEMP = DNRM2( K, VT( 1, I ), 1 ) + Q( I, 1 ) = VT( 1, I ) / TEMP + DO 110 J = 2, K + JC = IDXC( J ) + Q( I, J ) = VT( JC, I ) / TEMP + 110 CONTINUE + 120 CONTINUE +* +* Update the right singular vector matrix. +* + IF( K.EQ.2 ) THEN + CALL DGEMM( 'N', 'N', K, M, K, ONE, Q, LDQ, VT2, LDVT2, ZERO, + $ VT, LDVT ) + RETURN + END IF + KTEMP = 1 + CTOT( 1 ) + CALL DGEMM( 'N', 'N', K, NLP1, KTEMP, ONE, Q( 1, 1 ), LDQ, + $ VT2( 1, 1 ), LDVT2, ZERO, VT( 1, 1 ), LDVT ) + KTEMP = 2 + CTOT( 1 ) + CTOT( 2 ) + IF( KTEMP.LE.LDVT2 ) + $ CALL DGEMM( 'N', 'N', K, NLP1, CTOT( 3 ), ONE, Q( 1, KTEMP ), + $ LDQ, VT2( KTEMP, 1 ), LDVT2, ONE, VT( 1, 1 ), + $ LDVT ) +* + KTEMP = CTOT( 1 ) + 1 + NRP1 = NR + SQRE + IF( KTEMP.GT.1 ) THEN + DO 130 I = 1, K + Q( I, KTEMP ) = Q( I, 1 ) + 130 CONTINUE + DO 140 I = NLP2, M + VT2( KTEMP, I ) = VT2( 1, I ) + 140 CONTINUE + END IF + CTEMP = 1 + CTOT( 2 ) + CTOT( 3 ) + CALL DGEMM( 'N', 'N', K, NRP1, CTEMP, ONE, Q( 1, KTEMP ), LDQ, + $ VT2( KTEMP, NLP2 ), LDVT2, ZERO, VT( 1, NLP2 ), LDVT ) +* + RETURN +* +* End of DLASD3 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd4.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd4.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,890 @@ + SUBROUTINE DLASD4( N, I, D, Z, DELTA, RHO, SIGMA, WORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER I, INFO, N + DOUBLE PRECISION RHO, SIGMA +* .. +* .. Array Arguments .. + DOUBLE PRECISION D( * ), DELTA( * ), WORK( * ), Z( * ) +* .. +* +* Purpose +* ======= +* +* This subroutine computes the square root of the I-th updated +* eigenvalue of a positive symmetric rank-one modification to +* a positive diagonal matrix whose entries are given as the squares +* of the corresponding entries in the array d, and that +* +* 0 <= D(i) < D(j) for i < j +* +* and that RHO > 0. This is arranged by the calling routine, and is +* no loss in generality. The rank-one modified system is thus +* +* diag( D ) * diag( D ) + RHO * Z * Z_transpose. +* +* where we assume the Euclidean norm of Z is 1. +* +* The method consists of approximating the rational functions in the +* secular equation by simpler interpolating rational functions. +* +* Arguments +* ========= +* +* N (input) INTEGER +* The length of all arrays. +* +* I (input) INTEGER +* The index of the eigenvalue to be computed. 1 <= I <= N. +* +* D (input) DOUBLE PRECISION array, dimension ( N ) +* The original eigenvalues. It is assumed that they are in +* order, 0 <= D(I) < D(J) for I < J. +* +* Z (input) DOUBLE PRECISION array, dimension ( N ) +* The components of the updating vector. +* +* DELTA (output) DOUBLE PRECISION array, dimension ( N ) +* If N .ne. 1, DELTA contains (D(j) - sigma_I) in its j-th +* component. If N = 1, then DELTA(1) = 1. The vector DELTA +* contains the information necessary to construct the +* (singular) eigenvectors. +* +* RHO (input) DOUBLE PRECISION +* The scalar in the symmetric updating formula. +* +* SIGMA (output) DOUBLE PRECISION +* The computed sigma_I, the I-th updated eigenvalue. +* +* WORK (workspace) DOUBLE PRECISION array, dimension ( N ) +* If N .ne. 1, WORK contains (D(j) + sigma_I) in its j-th +* component. If N = 1, then WORK( 1 ) = 1. +* +* INFO (output) INTEGER +* = 0: successful exit +* > 0: if INFO = 1, the updating process failed. +* +* Internal Parameters +* =================== +* +* Logical variable ORGATI (origin-at-i?) is used for distinguishing +* whether D(i) or D(i+1) is treated as the origin. +* +* ORGATI = .true. origin at i +* ORGATI = .false. origin at i+1 +* +* Logical variable SWTCH3 (switch-for-3-poles?) is for noting +* if we are working with THREE poles! +* +* MAXIT is the maximum number of iterations allowed for each +* eigenvalue. +* +* Further Details +* =============== +* +* Based on contributions by +* Ren-Cang Li, Computer Science Division, University of California +* at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + INTEGER MAXIT + PARAMETER ( MAXIT = 20 ) + DOUBLE PRECISION ZERO, ONE, TWO, THREE, FOUR, EIGHT, TEN + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0, + $ THREE = 3.0D+0, FOUR = 4.0D+0, EIGHT = 8.0D+0, + $ TEN = 10.0D+0 ) +* .. +* .. Local Scalars .. + LOGICAL ORGATI, SWTCH, SWTCH3 + INTEGER II, IIM1, IIP1, IP1, ITER, J, NITER + DOUBLE PRECISION A, B, C, DELSQ, DELSQ2, DPHI, DPSI, DTIIM, + $ DTIIP, DTIPSQ, DTISQ, DTNSQ, DTNSQ1, DW, EPS, + $ ERRETM, ETA, PHI, PREW, PSI, RHOINV, SG2LB, + $ SG2UB, TAU, TEMP, TEMP1, TEMP2, W +* .. +* .. Local Arrays .. + DOUBLE PRECISION DD( 3 ), ZZ( 3 ) +* .. +* .. External Subroutines .. + EXTERNAL DLAED6, DLASD5 +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMCH + EXTERNAL DLAMCH +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, MAX, MIN, SQRT +* .. +* .. Executable Statements .. +* +* Since this routine is called in an inner loop, we do no argument +* checking. +* +* Quick return for N=1 and 2. +* + INFO = 0 + IF( N.EQ.1 ) THEN +* +* Presumably, I=1 upon entry +* + SIGMA = SQRT( D( 1 )*D( 1 )+RHO*Z( 1 )*Z( 1 ) ) + DELTA( 1 ) = ONE + WORK( 1 ) = ONE + RETURN + END IF + IF( N.EQ.2 ) THEN + CALL DLASD5( I, D, Z, DELTA, RHO, SIGMA, WORK ) + RETURN + END IF +* +* Compute machine epsilon +* + EPS = DLAMCH( 'Epsilon' ) + RHOINV = ONE / RHO +* +* The case I = N +* + IF( I.EQ.N ) THEN +* +* Initialize some basic variables +* + II = N - 1 + NITER = 1 +* +* Calculate initial guess +* + TEMP = RHO / TWO +* +* If ||Z||_2 is not one, then TEMP should be set to +* RHO * ||Z||_2^2 / TWO +* + TEMP1 = TEMP / ( D( N )+SQRT( D( N )*D( N )+TEMP ) ) + DO 10 J = 1, N + WORK( J ) = D( J ) + D( N ) + TEMP1 + DELTA( J ) = ( D( J )-D( N ) ) - TEMP1 + 10 CONTINUE +* + PSI = ZERO + DO 20 J = 1, N - 2 + PSI = PSI + Z( J )*Z( J ) / ( DELTA( J )*WORK( J ) ) + 20 CONTINUE +* + C = RHOINV + PSI + W = C + Z( II )*Z( II ) / ( DELTA( II )*WORK( II ) ) + + $ Z( N )*Z( N ) / ( DELTA( N )*WORK( N ) ) +* + IF( W.LE.ZERO ) THEN + TEMP1 = SQRT( D( N )*D( N )+RHO ) + TEMP = Z( N-1 )*Z( N-1 ) / ( ( D( N-1 )+TEMP1 )* + $ ( D( N )-D( N-1 )+RHO / ( D( N )+TEMP1 ) ) ) + + $ Z( N )*Z( N ) / RHO +* +* The following TAU is to approximate +* SIGMA_n^2 - D( N )*D( N ) +* + IF( C.LE.TEMP ) THEN + TAU = RHO + ELSE + DELSQ = ( D( N )-D( N-1 ) )*( D( N )+D( N-1 ) ) + A = -C*DELSQ + Z( N-1 )*Z( N-1 ) + Z( N )*Z( N ) + B = Z( N )*Z( N )*DELSQ + IF( A.LT.ZERO ) THEN + TAU = TWO*B / ( SQRT( A*A+FOUR*B*C )-A ) + ELSE + TAU = ( A+SQRT( A*A+FOUR*B*C ) ) / ( TWO*C ) + END IF + END IF +* +* It can be proved that +* D(N)^2+RHO/2 <= SIGMA_n^2 < D(N)^2+TAU <= D(N)^2+RHO +* + ELSE + DELSQ = ( D( N )-D( N-1 ) )*( D( N )+D( N-1 ) ) + A = -C*DELSQ + Z( N-1 )*Z( N-1 ) + Z( N )*Z( N ) + B = Z( N )*Z( N )*DELSQ +* +* The following TAU is to approximate +* SIGMA_n^2 - D( N )*D( N ) +* + IF( A.LT.ZERO ) THEN + TAU = TWO*B / ( SQRT( A*A+FOUR*B*C )-A ) + ELSE + TAU = ( A+SQRT( A*A+FOUR*B*C ) ) / ( TWO*C ) + END IF +* +* It can be proved that +* D(N)^2 < D(N)^2+TAU < SIGMA(N)^2 < D(N)^2+RHO/2 +* + END IF +* +* The following ETA is to approximate SIGMA_n - D( N ) +* + ETA = TAU / ( D( N )+SQRT( D( N )*D( N )+TAU ) ) +* + SIGMA = D( N ) + ETA + DO 30 J = 1, N + DELTA( J ) = ( D( J )-D( I ) ) - ETA + WORK( J ) = D( J ) + D( I ) + ETA + 30 CONTINUE +* +* Evaluate PSI and the derivative DPSI +* + DPSI = ZERO + PSI = ZERO + ERRETM = ZERO + DO 40 J = 1, II + TEMP = Z( J ) / ( DELTA( J )*WORK( J ) ) + PSI = PSI + Z( J )*TEMP + DPSI = DPSI + TEMP*TEMP + ERRETM = ERRETM + PSI + 40 CONTINUE + ERRETM = ABS( ERRETM ) +* +* Evaluate PHI and the derivative DPHI +* + TEMP = Z( N ) / ( DELTA( N )*WORK( N ) ) + PHI = Z( N )*TEMP + DPHI = TEMP*TEMP + ERRETM = EIGHT*( -PHI-PSI ) + ERRETM - PHI + RHOINV + + $ ABS( TAU )*( DPSI+DPHI ) +* + W = RHOINV + PHI + PSI +* +* Test for convergence +* + IF( ABS( W ).LE.EPS*ERRETM ) THEN + GO TO 240 + END IF +* +* Calculate the new step +* + NITER = NITER + 1 + DTNSQ1 = WORK( N-1 )*DELTA( N-1 ) + DTNSQ = WORK( N )*DELTA( N ) + C = W - DTNSQ1*DPSI - DTNSQ*DPHI + A = ( DTNSQ+DTNSQ1 )*W - DTNSQ*DTNSQ1*( DPSI+DPHI ) + B = DTNSQ*DTNSQ1*W + IF( C.LT.ZERO ) + $ C = ABS( C ) + IF( C.EQ.ZERO ) THEN + ETA = RHO - SIGMA*SIGMA + ELSE IF( A.GE.ZERO ) THEN + ETA = ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + ELSE + ETA = TWO*B / ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) + END IF +* +* Note, eta should be positive if w is negative, and +* eta should be negative otherwise. However, +* if for some reason caused by roundoff, eta*w > 0, +* we simply use one Newton step instead. This way +* will guarantee eta*w < 0. +* + IF( W*ETA.GT.ZERO ) + $ ETA = -W / ( DPSI+DPHI ) + TEMP = ETA - DTNSQ + IF( TEMP.GT.RHO ) + $ ETA = RHO + DTNSQ +* + TAU = TAU + ETA + ETA = ETA / ( SIGMA+SQRT( ETA+SIGMA*SIGMA ) ) + DO 50 J = 1, N + DELTA( J ) = DELTA( J ) - ETA + WORK( J ) = WORK( J ) + ETA + 50 CONTINUE +* + SIGMA = SIGMA + ETA +* +* Evaluate PSI and the derivative DPSI +* + DPSI = ZERO + PSI = ZERO + ERRETM = ZERO + DO 60 J = 1, II + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PSI = PSI + Z( J )*TEMP + DPSI = DPSI + TEMP*TEMP + ERRETM = ERRETM + PSI + 60 CONTINUE + ERRETM = ABS( ERRETM ) +* +* Evaluate PHI and the derivative DPHI +* + TEMP = Z( N ) / ( WORK( N )*DELTA( N ) ) + PHI = Z( N )*TEMP + DPHI = TEMP*TEMP + ERRETM = EIGHT*( -PHI-PSI ) + ERRETM - PHI + RHOINV + + $ ABS( TAU )*( DPSI+DPHI ) +* + W = RHOINV + PHI + PSI +* +* Main loop to update the values of the array DELTA +* + ITER = NITER + 1 +* + DO 90 NITER = ITER, MAXIT +* +* Test for convergence +* + IF( ABS( W ).LE.EPS*ERRETM ) THEN + GO TO 240 + END IF +* +* Calculate the new step +* + DTNSQ1 = WORK( N-1 )*DELTA( N-1 ) + DTNSQ = WORK( N )*DELTA( N ) + C = W - DTNSQ1*DPSI - DTNSQ*DPHI + A = ( DTNSQ+DTNSQ1 )*W - DTNSQ1*DTNSQ*( DPSI+DPHI ) + B = DTNSQ1*DTNSQ*W + IF( A.GE.ZERO ) THEN + ETA = ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + ELSE + ETA = TWO*B / ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) + END IF +* +* Note, eta should be positive if w is negative, and +* eta should be negative otherwise. However, +* if for some reason caused by roundoff, eta*w > 0, +* we simply use one Newton step instead. This way +* will guarantee eta*w < 0. +* + IF( W*ETA.GT.ZERO ) + $ ETA = -W / ( DPSI+DPHI ) + TEMP = ETA - DTNSQ + IF( TEMP.LE.ZERO ) + $ ETA = ETA / TWO +* + TAU = TAU + ETA + ETA = ETA / ( SIGMA+SQRT( ETA+SIGMA*SIGMA ) ) + DO 70 J = 1, N + DELTA( J ) = DELTA( J ) - ETA + WORK( J ) = WORK( J ) + ETA + 70 CONTINUE +* + SIGMA = SIGMA + ETA +* +* Evaluate PSI and the derivative DPSI +* + DPSI = ZERO + PSI = ZERO + ERRETM = ZERO + DO 80 J = 1, II + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PSI = PSI + Z( J )*TEMP + DPSI = DPSI + TEMP*TEMP + ERRETM = ERRETM + PSI + 80 CONTINUE + ERRETM = ABS( ERRETM ) +* +* Evaluate PHI and the derivative DPHI +* + TEMP = Z( N ) / ( WORK( N )*DELTA( N ) ) + PHI = Z( N )*TEMP + DPHI = TEMP*TEMP + ERRETM = EIGHT*( -PHI-PSI ) + ERRETM - PHI + RHOINV + + $ ABS( TAU )*( DPSI+DPHI ) +* + W = RHOINV + PHI + PSI + 90 CONTINUE +* +* Return with INFO = 1, NITER = MAXIT and not converged +* + INFO = 1 + GO TO 240 +* +* End for the case I = N +* + ELSE +* +* The case for I < N +* + NITER = 1 + IP1 = I + 1 +* +* Calculate initial guess +* + DELSQ = ( D( IP1 )-D( I ) )*( D( IP1 )+D( I ) ) + DELSQ2 = DELSQ / TWO + TEMP = DELSQ2 / ( D( I )+SQRT( D( I )*D( I )+DELSQ2 ) ) + DO 100 J = 1, N + WORK( J ) = D( J ) + D( I ) + TEMP + DELTA( J ) = ( D( J )-D( I ) ) - TEMP + 100 CONTINUE +* + PSI = ZERO + DO 110 J = 1, I - 1 + PSI = PSI + Z( J )*Z( J ) / ( WORK( J )*DELTA( J ) ) + 110 CONTINUE +* + PHI = ZERO + DO 120 J = N, I + 2, -1 + PHI = PHI + Z( J )*Z( J ) / ( WORK( J )*DELTA( J ) ) + 120 CONTINUE + C = RHOINV + PSI + PHI + W = C + Z( I )*Z( I ) / ( WORK( I )*DELTA( I ) ) + + $ Z( IP1 )*Z( IP1 ) / ( WORK( IP1 )*DELTA( IP1 ) ) +* + IF( W.GT.ZERO ) THEN +* +* d(i)^2 < the ith sigma^2 < (d(i)^2+d(i+1)^2)/2 +* +* We choose d(i) as origin. +* + ORGATI = .TRUE. + SG2LB = ZERO + SG2UB = DELSQ2 + A = C*DELSQ + Z( I )*Z( I ) + Z( IP1 )*Z( IP1 ) + B = Z( I )*Z( I )*DELSQ + IF( A.GT.ZERO ) THEN + TAU = TWO*B / ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) + ELSE + TAU = ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + END IF +* +* TAU now is an estimation of SIGMA^2 - D( I )^2. The +* following, however, is the corresponding estimation of +* SIGMA - D( I ). +* + ETA = TAU / ( D( I )+SQRT( D( I )*D( I )+TAU ) ) + ELSE +* +* (d(i)^2+d(i+1)^2)/2 <= the ith sigma^2 < d(i+1)^2/2 +* +* We choose d(i+1) as origin. +* + ORGATI = .FALSE. + SG2LB = -DELSQ2 + SG2UB = ZERO + A = C*DELSQ - Z( I )*Z( I ) - Z( IP1 )*Z( IP1 ) + B = Z( IP1 )*Z( IP1 )*DELSQ + IF( A.LT.ZERO ) THEN + TAU = TWO*B / ( A-SQRT( ABS( A*A+FOUR*B*C ) ) ) + ELSE + TAU = -( A+SQRT( ABS( A*A+FOUR*B*C ) ) ) / ( TWO*C ) + END IF +* +* TAU now is an estimation of SIGMA^2 - D( IP1 )^2. The +* following, however, is the corresponding estimation of +* SIGMA - D( IP1 ). +* + ETA = TAU / ( D( IP1 )+SQRT( ABS( D( IP1 )*D( IP1 )+ + $ TAU ) ) ) + END IF +* + IF( ORGATI ) THEN + II = I + SIGMA = D( I ) + ETA + DO 130 J = 1, N + WORK( J ) = D( J ) + D( I ) + ETA + DELTA( J ) = ( D( J )-D( I ) ) - ETA + 130 CONTINUE + ELSE + II = I + 1 + SIGMA = D( IP1 ) + ETA + DO 140 J = 1, N + WORK( J ) = D( J ) + D( IP1 ) + ETA + DELTA( J ) = ( D( J )-D( IP1 ) ) - ETA + 140 CONTINUE + END IF + IIM1 = II - 1 + IIP1 = II + 1 +* +* Evaluate PSI and the derivative DPSI +* + DPSI = ZERO + PSI = ZERO + ERRETM = ZERO + DO 150 J = 1, IIM1 + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PSI = PSI + Z( J )*TEMP + DPSI = DPSI + TEMP*TEMP + ERRETM = ERRETM + PSI + 150 CONTINUE + ERRETM = ABS( ERRETM ) +* +* Evaluate PHI and the derivative DPHI +* + DPHI = ZERO + PHI = ZERO + DO 160 J = N, IIP1, -1 + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PHI = PHI + Z( J )*TEMP + DPHI = DPHI + TEMP*TEMP + ERRETM = ERRETM + PHI + 160 CONTINUE +* + W = RHOINV + PHI + PSI +* +* W is the value of the secular function with +* its ii-th element removed. +* + SWTCH3 = .FALSE. + IF( ORGATI ) THEN + IF( W.LT.ZERO ) + $ SWTCH3 = .TRUE. + ELSE + IF( W.GT.ZERO ) + $ SWTCH3 = .TRUE. + END IF + IF( II.EQ.1 .OR. II.EQ.N ) + $ SWTCH3 = .FALSE. +* + TEMP = Z( II ) / ( WORK( II )*DELTA( II ) ) + DW = DPSI + DPHI + TEMP*TEMP + TEMP = Z( II )*TEMP + W = W + TEMP + ERRETM = EIGHT*( PHI-PSI ) + ERRETM + TWO*RHOINV + + $ THREE*ABS( TEMP ) + ABS( TAU )*DW +* +* Test for convergence +* + IF( ABS( W ).LE.EPS*ERRETM ) THEN + GO TO 240 + END IF +* + IF( W.LE.ZERO ) THEN + SG2LB = MAX( SG2LB, TAU ) + ELSE + SG2UB = MIN( SG2UB, TAU ) + END IF +* +* Calculate the new step +* + NITER = NITER + 1 + IF( .NOT.SWTCH3 ) THEN + DTIPSQ = WORK( IP1 )*DELTA( IP1 ) + DTISQ = WORK( I )*DELTA( I ) + IF( ORGATI ) THEN + C = W - DTIPSQ*DW + DELSQ*( Z( I ) / DTISQ )**2 + ELSE + C = W - DTISQ*DW - DELSQ*( Z( IP1 ) / DTIPSQ )**2 + END IF + A = ( DTIPSQ+DTISQ )*W - DTIPSQ*DTISQ*DW + B = DTIPSQ*DTISQ*W + IF( C.EQ.ZERO ) THEN + IF( A.EQ.ZERO ) THEN + IF( ORGATI ) THEN + A = Z( I )*Z( I ) + DTIPSQ*DTIPSQ*( DPSI+DPHI ) + ELSE + A = Z( IP1 )*Z( IP1 ) + DTISQ*DTISQ*( DPSI+DPHI ) + END IF + END IF + ETA = B / A + ELSE IF( A.LE.ZERO ) THEN + ETA = ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + ELSE + ETA = TWO*B / ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) + END IF + ELSE +* +* Interpolation using THREE most relevant poles +* + DTIIM = WORK( IIM1 )*DELTA( IIM1 ) + DTIIP = WORK( IIP1 )*DELTA( IIP1 ) + TEMP = RHOINV + PSI + PHI + IF( ORGATI ) THEN + TEMP1 = Z( IIM1 ) / DTIIM + TEMP1 = TEMP1*TEMP1 + C = ( TEMP - DTIIP*( DPSI+DPHI ) ) - + $ ( D( IIM1 )-D( IIP1 ) )*( D( IIM1 )+D( IIP1 ) )*TEMP1 + ZZ( 1 ) = Z( IIM1 )*Z( IIM1 ) + IF( DPSI.LT.TEMP1 ) THEN + ZZ( 3 ) = DTIIP*DTIIP*DPHI + ELSE + ZZ( 3 ) = DTIIP*DTIIP*( ( DPSI-TEMP1 )+DPHI ) + END IF + ELSE + TEMP1 = Z( IIP1 ) / DTIIP + TEMP1 = TEMP1*TEMP1 + C = ( TEMP - DTIIM*( DPSI+DPHI ) ) - + $ ( D( IIP1 )-D( IIM1 ) )*( D( IIM1 )+D( IIP1 ) )*TEMP1 + IF( DPHI.LT.TEMP1 ) THEN + ZZ( 1 ) = DTIIM*DTIIM*DPSI + ELSE + ZZ( 1 ) = DTIIM*DTIIM*( DPSI+( DPHI-TEMP1 ) ) + END IF + ZZ( 3 ) = Z( IIP1 )*Z( IIP1 ) + END IF + ZZ( 2 ) = Z( II )*Z( II ) + DD( 1 ) = DTIIM + DD( 2 ) = DELTA( II )*WORK( II ) + DD( 3 ) = DTIIP + CALL DLAED6( NITER, ORGATI, C, DD, ZZ, W, ETA, INFO ) + IF( INFO.NE.0 ) + $ GO TO 240 + END IF +* +* Note, eta should be positive if w is negative, and +* eta should be negative otherwise. However, +* if for some reason caused by roundoff, eta*w > 0, +* we simply use one Newton step instead. This way +* will guarantee eta*w < 0. +* + IF( W*ETA.GE.ZERO ) + $ ETA = -W / DW + IF( ORGATI ) THEN + TEMP1 = WORK( I )*DELTA( I ) + TEMP = ETA - TEMP1 + ELSE + TEMP1 = WORK( IP1 )*DELTA( IP1 ) + TEMP = ETA - TEMP1 + END IF + IF( TEMP.GT.SG2UB .OR. TEMP.LT.SG2LB ) THEN + IF( W.LT.ZERO ) THEN + ETA = ( SG2UB-TAU ) / TWO + ELSE + ETA = ( SG2LB-TAU ) / TWO + END IF + END IF +* + TAU = TAU + ETA + ETA = ETA / ( SIGMA+SQRT( SIGMA*SIGMA+ETA ) ) +* + PREW = W +* + SIGMA = SIGMA + ETA + DO 170 J = 1, N + WORK( J ) = WORK( J ) + ETA + DELTA( J ) = DELTA( J ) - ETA + 170 CONTINUE +* +* Evaluate PSI and the derivative DPSI +* + DPSI = ZERO + PSI = ZERO + ERRETM = ZERO + DO 180 J = 1, IIM1 + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PSI = PSI + Z( J )*TEMP + DPSI = DPSI + TEMP*TEMP + ERRETM = ERRETM + PSI + 180 CONTINUE + ERRETM = ABS( ERRETM ) +* +* Evaluate PHI and the derivative DPHI +* + DPHI = ZERO + PHI = ZERO + DO 190 J = N, IIP1, -1 + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PHI = PHI + Z( J )*TEMP + DPHI = DPHI + TEMP*TEMP + ERRETM = ERRETM + PHI + 190 CONTINUE +* + TEMP = Z( II ) / ( WORK( II )*DELTA( II ) ) + DW = DPSI + DPHI + TEMP*TEMP + TEMP = Z( II )*TEMP + W = RHOINV + PHI + PSI + TEMP + ERRETM = EIGHT*( PHI-PSI ) + ERRETM + TWO*RHOINV + + $ THREE*ABS( TEMP ) + ABS( TAU )*DW +* + IF( W.LE.ZERO ) THEN + SG2LB = MAX( SG2LB, TAU ) + ELSE + SG2UB = MIN( SG2UB, TAU ) + END IF +* + SWTCH = .FALSE. + IF( ORGATI ) THEN + IF( -W.GT.ABS( PREW ) / TEN ) + $ SWTCH = .TRUE. + ELSE + IF( W.GT.ABS( PREW ) / TEN ) + $ SWTCH = .TRUE. + END IF +* +* Main loop to update the values of the array DELTA and WORK +* + ITER = NITER + 1 +* + DO 230 NITER = ITER, MAXIT +* +* Test for convergence +* + IF( ABS( W ).LE.EPS*ERRETM ) THEN + GO TO 240 + END IF +* +* Calculate the new step +* + IF( .NOT.SWTCH3 ) THEN + DTIPSQ = WORK( IP1 )*DELTA( IP1 ) + DTISQ = WORK( I )*DELTA( I ) + IF( .NOT.SWTCH ) THEN + IF( ORGATI ) THEN + C = W - DTIPSQ*DW + DELSQ*( Z( I ) / DTISQ )**2 + ELSE + C = W - DTISQ*DW - DELSQ*( Z( IP1 ) / DTIPSQ )**2 + END IF + ELSE + TEMP = Z( II ) / ( WORK( II )*DELTA( II ) ) + IF( ORGATI ) THEN + DPSI = DPSI + TEMP*TEMP + ELSE + DPHI = DPHI + TEMP*TEMP + END IF + C = W - DTISQ*DPSI - DTIPSQ*DPHI + END IF + A = ( DTIPSQ+DTISQ )*W - DTIPSQ*DTISQ*DW + B = DTIPSQ*DTISQ*W + IF( C.EQ.ZERO ) THEN + IF( A.EQ.ZERO ) THEN + IF( .NOT.SWTCH ) THEN + IF( ORGATI ) THEN + A = Z( I )*Z( I ) + DTIPSQ*DTIPSQ* + $ ( DPSI+DPHI ) + ELSE + A = Z( IP1 )*Z( IP1 ) + + $ DTISQ*DTISQ*( DPSI+DPHI ) + END IF + ELSE + A = DTISQ*DTISQ*DPSI + DTIPSQ*DTIPSQ*DPHI + END IF + END IF + ETA = B / A + ELSE IF( A.LE.ZERO ) THEN + ETA = ( A-SQRT( ABS( A*A-FOUR*B*C ) ) ) / ( TWO*C ) + ELSE + ETA = TWO*B / ( A+SQRT( ABS( A*A-FOUR*B*C ) ) ) + END IF + ELSE +* +* Interpolation using THREE most relevant poles +* + DTIIM = WORK( IIM1 )*DELTA( IIM1 ) + DTIIP = WORK( IIP1 )*DELTA( IIP1 ) + TEMP = RHOINV + PSI + PHI + IF( SWTCH ) THEN + C = TEMP - DTIIM*DPSI - DTIIP*DPHI + ZZ( 1 ) = DTIIM*DTIIM*DPSI + ZZ( 3 ) = DTIIP*DTIIP*DPHI + ELSE + IF( ORGATI ) THEN + TEMP1 = Z( IIM1 ) / DTIIM + TEMP1 = TEMP1*TEMP1 + TEMP2 = ( D( IIM1 )-D( IIP1 ) )* + $ ( D( IIM1 )+D( IIP1 ) )*TEMP1 + C = TEMP - DTIIP*( DPSI+DPHI ) - TEMP2 + ZZ( 1 ) = Z( IIM1 )*Z( IIM1 ) + IF( DPSI.LT.TEMP1 ) THEN + ZZ( 3 ) = DTIIP*DTIIP*DPHI + ELSE + ZZ( 3 ) = DTIIP*DTIIP*( ( DPSI-TEMP1 )+DPHI ) + END IF + ELSE + TEMP1 = Z( IIP1 ) / DTIIP + TEMP1 = TEMP1*TEMP1 + TEMP2 = ( D( IIP1 )-D( IIM1 ) )* + $ ( D( IIM1 )+D( IIP1 ) )*TEMP1 + C = TEMP - DTIIM*( DPSI+DPHI ) - TEMP2 + IF( DPHI.LT.TEMP1 ) THEN + ZZ( 1 ) = DTIIM*DTIIM*DPSI + ELSE + ZZ( 1 ) = DTIIM*DTIIM*( DPSI+( DPHI-TEMP1 ) ) + END IF + ZZ( 3 ) = Z( IIP1 )*Z( IIP1 ) + END IF + END IF + DD( 1 ) = DTIIM + DD( 2 ) = DELTA( II )*WORK( II ) + DD( 3 ) = DTIIP + CALL DLAED6( NITER, ORGATI, C, DD, ZZ, W, ETA, INFO ) + IF( INFO.NE.0 ) + $ GO TO 240 + END IF +* +* Note, eta should be positive if w is negative, and +* eta should be negative otherwise. However, +* if for some reason caused by roundoff, eta*w > 0, +* we simply use one Newton step instead. This way +* will guarantee eta*w < 0. +* + IF( W*ETA.GE.ZERO ) + $ ETA = -W / DW + IF( ORGATI ) THEN + TEMP1 = WORK( I )*DELTA( I ) + TEMP = ETA - TEMP1 + ELSE + TEMP1 = WORK( IP1 )*DELTA( IP1 ) + TEMP = ETA - TEMP1 + END IF + IF( TEMP.GT.SG2UB .OR. TEMP.LT.SG2LB ) THEN + IF( W.LT.ZERO ) THEN + ETA = ( SG2UB-TAU ) / TWO + ELSE + ETA = ( SG2LB-TAU ) / TWO + END IF + END IF +* + TAU = TAU + ETA + ETA = ETA / ( SIGMA+SQRT( SIGMA*SIGMA+ETA ) ) +* + SIGMA = SIGMA + ETA + DO 200 J = 1, N + WORK( J ) = WORK( J ) + ETA + DELTA( J ) = DELTA( J ) - ETA + 200 CONTINUE +* + PREW = W +* +* Evaluate PSI and the derivative DPSI +* + DPSI = ZERO + PSI = ZERO + ERRETM = ZERO + DO 210 J = 1, IIM1 + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PSI = PSI + Z( J )*TEMP + DPSI = DPSI + TEMP*TEMP + ERRETM = ERRETM + PSI + 210 CONTINUE + ERRETM = ABS( ERRETM ) +* +* Evaluate PHI and the derivative DPHI +* + DPHI = ZERO + PHI = ZERO + DO 220 J = N, IIP1, -1 + TEMP = Z( J ) / ( WORK( J )*DELTA( J ) ) + PHI = PHI + Z( J )*TEMP + DPHI = DPHI + TEMP*TEMP + ERRETM = ERRETM + PHI + 220 CONTINUE +* + TEMP = Z( II ) / ( WORK( II )*DELTA( II ) ) + DW = DPSI + DPHI + TEMP*TEMP + TEMP = Z( II )*TEMP + W = RHOINV + PHI + PSI + TEMP + ERRETM = EIGHT*( PHI-PSI ) + ERRETM + TWO*RHOINV + + $ THREE*ABS( TEMP ) + ABS( TAU )*DW + IF( W*PREW.GT.ZERO .AND. ABS( W ).GT.ABS( PREW ) / TEN ) + $ SWTCH = .NOT.SWTCH +* + IF( W.LE.ZERO ) THEN + SG2LB = MAX( SG2LB, TAU ) + ELSE + SG2UB = MIN( SG2UB, TAU ) + END IF +* + 230 CONTINUE +* +* Return with INFO = 1, NITER = MAXIT and not converged +* + INFO = 1 +* + END IF +* + 240 CONTINUE + RETURN +* +* End of DLASD4 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd5.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd5.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,163 @@ + SUBROUTINE DLASD5( I, D, Z, DELTA, RHO, DSIGMA, WORK ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER I + DOUBLE PRECISION DSIGMA, RHO +* .. +* .. Array Arguments .. + DOUBLE PRECISION D( 2 ), DELTA( 2 ), WORK( 2 ), Z( 2 ) +* .. +* +* Purpose +* ======= +* +* This subroutine computes the square root of the I-th eigenvalue +* of a positive symmetric rank-one modification of a 2-by-2 diagonal +* matrix +* +* diag( D ) * diag( D ) + RHO * Z * transpose(Z) . +* +* The diagonal entries in the array D are assumed to satisfy +* +* 0 <= D(i) < D(j) for i < j . +* +* We also assume RHO > 0 and that the Euclidean norm of the vector +* Z is one. +* +* Arguments +* ========= +* +* I (input) INTEGER +* The index of the eigenvalue to be computed. I = 1 or I = 2. +* +* D (input) DOUBLE PRECISION array, dimension ( 2 ) +* The original eigenvalues. We assume 0 <= D(1) < D(2). +* +* Z (input) DOUBLE PRECISION array, dimension ( 2 ) +* The components of the updating vector. +* +* DELTA (output) DOUBLE PRECISION array, dimension ( 2 ) +* Contains (D(j) - sigma_I) in its j-th component. +* The vector DELTA contains the information necessary +* to construct the eigenvectors. +* +* RHO (input) DOUBLE PRECISION +* The scalar in the symmetric updating formula. +* +* DSIGMA (output) DOUBLE PRECISION +* The computed sigma_I, the I-th updated eigenvalue. +* +* WORK (workspace) DOUBLE PRECISION array, dimension ( 2 ) +* WORK contains (D(j) + sigma_I) in its j-th component. +* +* Further Details +* =============== +* +* Based on contributions by +* Ren-Cang Li, Computer Science Division, University of California +* at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO, THREE, FOUR + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0, + $ THREE = 3.0D+0, FOUR = 4.0D+0 ) +* .. +* .. Local Scalars .. + DOUBLE PRECISION B, C, DEL, DELSQ, TAU, W +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, SQRT +* .. +* .. Executable Statements .. +* + DEL = D( 2 ) - D( 1 ) + DELSQ = DEL*( D( 2 )+D( 1 ) ) + IF( I.EQ.1 ) THEN + W = ONE + FOUR*RHO*( Z( 2 )*Z( 2 ) / ( D( 1 )+THREE*D( 2 ) )- + $ Z( 1 )*Z( 1 ) / ( THREE*D( 1 )+D( 2 ) ) ) / DEL + IF( W.GT.ZERO ) THEN + B = DELSQ + RHO*( Z( 1 )*Z( 1 )+Z( 2 )*Z( 2 ) ) + C = RHO*Z( 1 )*Z( 1 )*DELSQ +* +* B > ZERO, always +* +* The following TAU is DSIGMA * DSIGMA - D( 1 ) * D( 1 ) +* + TAU = TWO*C / ( B+SQRT( ABS( B*B-FOUR*C ) ) ) +* +* The following TAU is DSIGMA - D( 1 ) +* + TAU = TAU / ( D( 1 )+SQRT( D( 1 )*D( 1 )+TAU ) ) + DSIGMA = D( 1 ) + TAU + DELTA( 1 ) = -TAU + DELTA( 2 ) = DEL - TAU + WORK( 1 ) = TWO*D( 1 ) + TAU + WORK( 2 ) = ( D( 1 )+TAU ) + D( 2 ) +* DELTA( 1 ) = -Z( 1 ) / TAU +* DELTA( 2 ) = Z( 2 ) / ( DEL-TAU ) + ELSE + B = -DELSQ + RHO*( Z( 1 )*Z( 1 )+Z( 2 )*Z( 2 ) ) + C = RHO*Z( 2 )*Z( 2 )*DELSQ +* +* The following TAU is DSIGMA * DSIGMA - D( 2 ) * D( 2 ) +* + IF( B.GT.ZERO ) THEN + TAU = -TWO*C / ( B+SQRT( B*B+FOUR*C ) ) + ELSE + TAU = ( B-SQRT( B*B+FOUR*C ) ) / TWO + END IF +* +* The following TAU is DSIGMA - D( 2 ) +* + TAU = TAU / ( D( 2 )+SQRT( ABS( D( 2 )*D( 2 )+TAU ) ) ) + DSIGMA = D( 2 ) + TAU + DELTA( 1 ) = -( DEL+TAU ) + DELTA( 2 ) = -TAU + WORK( 1 ) = D( 1 ) + TAU + D( 2 ) + WORK( 2 ) = TWO*D( 2 ) + TAU +* DELTA( 1 ) = -Z( 1 ) / ( DEL+TAU ) +* DELTA( 2 ) = -Z( 2 ) / TAU + END IF +* TEMP = SQRT( DELTA( 1 )*DELTA( 1 )+DELTA( 2 )*DELTA( 2 ) ) +* DELTA( 1 ) = DELTA( 1 ) / TEMP +* DELTA( 2 ) = DELTA( 2 ) / TEMP + ELSE +* +* Now I=2 +* + B = -DELSQ + RHO*( Z( 1 )*Z( 1 )+Z( 2 )*Z( 2 ) ) + C = RHO*Z( 2 )*Z( 2 )*DELSQ +* +* The following TAU is DSIGMA * DSIGMA - D( 2 ) * D( 2 ) +* + IF( B.GT.ZERO ) THEN + TAU = ( B+SQRT( B*B+FOUR*C ) ) / TWO + ELSE + TAU = TWO*C / ( -B+SQRT( B*B+FOUR*C ) ) + END IF +* +* The following TAU is DSIGMA - D( 2 ) +* + TAU = TAU / ( D( 2 )+SQRT( D( 2 )*D( 2 )+TAU ) ) + DSIGMA = D( 2 ) + TAU + DELTA( 1 ) = -( DEL+TAU ) + DELTA( 2 ) = -TAU + WORK( 1 ) = D( 1 ) + TAU + D( 2 ) + WORK( 2 ) = TWO*D( 2 ) + TAU +* DELTA( 1 ) = -Z( 1 ) / ( DEL+TAU ) +* DELTA( 2 ) = -Z( 2 ) / TAU +* TEMP = SQRT( DELTA( 1 )*DELTA( 1 )+DELTA( 2 )*DELTA( 2 ) ) +* DELTA( 1 ) = DELTA( 1 ) / TEMP +* DELTA( 2 ) = DELTA( 2 ) / TEMP + END IF + RETURN +* +* End of DLASD5 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd6.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd6.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,305 @@ + SUBROUTINE DLASD6( ICOMPQ, NL, NR, SQRE, D, VF, VL, ALPHA, BETA, + $ IDXQ, PERM, GIVPTR, GIVCOL, LDGCOL, GIVNUM, + $ LDGNUM, POLES, DIFL, DIFR, Z, K, C, S, WORK, + $ IWORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER GIVPTR, ICOMPQ, INFO, K, LDGCOL, LDGNUM, NL, + $ NR, SQRE + DOUBLE PRECISION ALPHA, BETA, C, S +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), IDXQ( * ), IWORK( * ), + $ PERM( * ) + DOUBLE PRECISION D( * ), DIFL( * ), DIFR( * ), + $ GIVNUM( LDGNUM, * ), POLES( LDGNUM, * ), + $ VF( * ), VL( * ), WORK( * ), Z( * ) +* .. +* +* Purpose +* ======= +* +* DLASD6 computes the SVD of an updated upper bidiagonal matrix B +* obtained by merging two smaller ones by appending a row. This +* routine is used only for the problem which requires all singular +* values and optionally singular vector matrices in factored form. +* B is an N-by-M matrix with N = NL + NR + 1 and M = N + SQRE. +* A related subroutine, DLASD1, handles the case in which all singular +* values and singular vectors of the bidiagonal matrix are desired. +* +* DLASD6 computes the SVD as follows: +* +* ( D1(in) 0 0 0 ) +* B = U(in) * ( Z1' a Z2' b ) * VT(in) +* ( 0 0 D2(in) 0 ) +* +* = U(out) * ( D(out) 0) * VT(out) +* +* where Z' = (Z1' a Z2' b) = u' VT', and u is a vector of dimension M +* with ALPHA and BETA in the NL+1 and NL+2 th entries and zeros +* elsewhere; and the entry b is empty if SQRE = 0. +* +* The singular values of B can be computed using D1, D2, the first +* components of all the right singular vectors of the lower block, and +* the last components of all the right singular vectors of the upper +* block. These components are stored and updated in VF and VL, +* respectively, in DLASD6. Hence U and VT are not explicitly +* referenced. +* +* The singular values are stored in D. The algorithm consists of two +* stages: +* +* The first stage consists of deflating the size of the problem +* when there are multiple singular values or if there is a zero +* in the Z vector. For each such occurence the dimension of the +* secular equation problem is reduced by one. This stage is +* performed by the routine DLASD7. +* +* The second stage consists of calculating the updated +* singular values. This is done by finding the roots of the +* secular equation via the routine DLASD4 (as called by DLASD8). +* This routine also updates VF and VL and computes the distances +* between the updated singular values and the old singular +* values. +* +* DLASD6 is called from DLASDA. +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether singular vectors are to be computed in +* factored form: +* = 0: Compute singular values only. +* = 1: Compute singular vectors in factored form as well. +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has row dimension N = NL + NR + 1, +* and column dimension M = N + SQRE. +* +* D (input/output) DOUBLE PRECISION array, dimension ( NL+NR+1 ). +* On entry D(1:NL,1:NL) contains the singular values of the +* upper block, and D(NL+2:N) contains the singular values +* of the lower block. On exit D(1:N) contains the singular +* values of the modified matrix. +* +* VF (input/output) DOUBLE PRECISION array, dimension ( M ) +* On entry, VF(1:NL+1) contains the first components of all +* right singular vectors of the upper block; and VF(NL+2:M) +* contains the first components of all right singular vectors +* of the lower block. On exit, VF contains the first components +* of all right singular vectors of the bidiagonal matrix. +* +* VL (input/output) DOUBLE PRECISION array, dimension ( M ) +* On entry, VL(1:NL+1) contains the last components of all +* right singular vectors of the upper block; and VL(NL+2:M) +* contains the last components of all right singular vectors of +* the lower block. On exit, VL contains the last components of +* all right singular vectors of the bidiagonal matrix. +* +* ALPHA (input/output) DOUBLE PRECISION +* Contains the diagonal element associated with the added row. +* +* BETA (input/output) DOUBLE PRECISION +* Contains the off-diagonal element associated with the added +* row. +* +* IDXQ (output) INTEGER array, dimension ( N ) +* This contains the permutation which will reintegrate the +* subproblem just solved back into sorted order, i.e. +* D( IDXQ( I = 1, N ) ) will be in ascending order. +* +* PERM (output) INTEGER array, dimension ( N ) +* The permutations (from deflation and sorting) to be applied +* to each block. Not referenced if ICOMPQ = 0. +* +* GIVPTR (output) INTEGER +* The number of Givens rotations which took place in this +* subproblem. Not referenced if ICOMPQ = 0. +* +* GIVCOL (output) INTEGER array, dimension ( LDGCOL, 2 ) +* Each pair of numbers indicates a pair of columns to take place +* in a Givens rotation. Not referenced if ICOMPQ = 0. +* +* LDGCOL (input) INTEGER +* leading dimension of GIVCOL, must be at least N. +* +* GIVNUM (output) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* Each number indicates the C or S value to be used in the +* corresponding Givens rotation. Not referenced if ICOMPQ = 0. +* +* LDGNUM (input) INTEGER +* The leading dimension of GIVNUM and POLES, must be at least N. +* +* POLES (output) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* On exit, POLES(1,*) is an array containing the new singular +* values obtained from solving the secular equation, and +* POLES(2,*) is an array containing the poles in the secular +* equation. Not referenced if ICOMPQ = 0. +* +* DIFL (output) DOUBLE PRECISION array, dimension ( N ) +* On exit, DIFL(I) is the distance between I-th updated +* (undeflated) singular value and the I-th (undeflated) old +* singular value. +* +* DIFR (output) DOUBLE PRECISION array, +* dimension ( LDGNUM, 2 ) if ICOMPQ = 1 and +* dimension ( N ) if ICOMPQ = 0. +* On exit, DIFR(I, 1) is the distance between I-th updated +* (undeflated) singular value and the I+1-th (undeflated) old +* singular value. +* +* If ICOMPQ = 1, DIFR(1:K,2) is an array containing the +* normalizing factors for the right singular vector matrix. +* +* See DLASD8 for details on DIFL and DIFR. +* +* Z (output) DOUBLE PRECISION array, dimension ( M ) +* The first elements of this array contain the components +* of the deflation-adjusted updating row vector. +* +* K (output) INTEGER +* Contains the dimension of the non-deflated matrix, +* This is the order of the related secular equation. 1 <= K <=N. +* +* C (output) DOUBLE PRECISION +* C contains garbage if SQRE =0 and the C-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* S (output) DOUBLE PRECISION +* S contains garbage if SQRE =0 and the S-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* WORK (workspace) DOUBLE PRECISION array, dimension ( 4 * M ) +* +* IWORK (workspace) INTEGER array, dimension ( 3 * N ) +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: if INFO = 1, an singular value did not converge +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ONE, ZERO + PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER I, IDX, IDXC, IDXP, ISIGMA, IVFW, IVLW, IW, M, + $ N, N1, N2 + DOUBLE PRECISION ORGNRM +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DLAMRG, DLASCL, DLASD7, DLASD8, XERBLA +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 + N = NL + NR + 1 + M = N + SQRE +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( NL.LT.1 ) THEN + INFO = -2 + ELSE IF( NR.LT.1 ) THEN + INFO = -3 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -4 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -14 + ELSE IF( LDGNUM.LT.N ) THEN + INFO = -16 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD6', -INFO ) + RETURN + END IF +* +* The following values are for bookkeeping purposes only. They are +* integer pointers which indicate the portion of the workspace +* used by a particular array in DLASD7 and DLASD8. +* + ISIGMA = 1 + IW = ISIGMA + N + IVFW = IW + M + IVLW = IVFW + M +* + IDX = 1 + IDXC = IDX + N + IDXP = IDXC + N +* +* Scale. +* + ORGNRM = MAX( ABS( ALPHA ), ABS( BETA ) ) + D( NL+1 ) = ZERO + DO 10 I = 1, N + IF( ABS( D( I ) ).GT.ORGNRM ) THEN + ORGNRM = ABS( D( I ) ) + END IF + 10 CONTINUE + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, N, 1, D, N, INFO ) + ALPHA = ALPHA / ORGNRM + BETA = BETA / ORGNRM +* +* Sort and Deflate singular values. +* + CALL DLASD7( ICOMPQ, NL, NR, SQRE, K, D, Z, WORK( IW ), VF, + $ WORK( IVFW ), VL, WORK( IVLW ), ALPHA, BETA, + $ WORK( ISIGMA ), IWORK( IDX ), IWORK( IDXP ), IDXQ, + $ PERM, GIVPTR, GIVCOL, LDGCOL, GIVNUM, LDGNUM, C, S, + $ INFO ) +* +* Solve Secular Equation, compute DIFL, DIFR, and update VF, VL. +* + CALL DLASD8( ICOMPQ, K, D, Z, VF, VL, DIFL, DIFR, LDGNUM, + $ WORK( ISIGMA ), WORK( IW ), INFO ) +* +* Save the poles if ICOMPQ = 1. +* + IF( ICOMPQ.EQ.1 ) THEN + CALL DCOPY( K, D, 1, POLES( 1, 1 ), 1 ) + CALL DCOPY( K, WORK( ISIGMA ), 1, POLES( 1, 2 ), 1 ) + END IF +* +* Unscale. +* + CALL DLASCL( 'G', 0, 0, ONE, ORGNRM, N, 1, D, N, INFO ) +* +* Prepare the IDXQ sorting permutation. +* + N1 = K + N2 = N - K + CALL DLAMRG( N1, N2, D, 1, -1, IDXQ ) +* + RETURN +* +* End of DLASD6 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd7.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd7.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,444 @@ + SUBROUTINE DLASD7( ICOMPQ, NL, NR, SQRE, K, D, Z, ZW, VF, VFW, VL, + $ VLW, ALPHA, BETA, DSIGMA, IDX, IDXP, IDXQ, + $ PERM, GIVPTR, GIVCOL, LDGCOL, GIVNUM, LDGNUM, + $ C, S, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER GIVPTR, ICOMPQ, INFO, K, LDGCOL, LDGNUM, NL, + $ NR, SQRE + DOUBLE PRECISION ALPHA, BETA, C, S +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), IDX( * ), IDXP( * ), + $ IDXQ( * ), PERM( * ) + DOUBLE PRECISION D( * ), DSIGMA( * ), GIVNUM( LDGNUM, * ), + $ VF( * ), VFW( * ), VL( * ), VLW( * ), Z( * ), + $ ZW( * ) +* .. +* +* Purpose +* ======= +* +* DLASD7 merges the two sets of singular values together into a single +* sorted set. Then it tries to deflate the size of the problem. There +* are two ways in which deflation can occur: when two or more singular +* values are close together or if there is a tiny entry in the Z +* vector. For each such occurrence the order of the related +* secular equation problem is reduced by one. +* +* DLASD7 is called from DLASD6. +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether singular vectors are to be computed +* in compact form, as follows: +* = 0: Compute singular values only. +* = 1: Compute singular vectors of upper +* bidiagonal matrix in compact form. +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has +* N = NL + NR + 1 rows and +* M = N + SQRE >= N columns. +* +* K (output) INTEGER +* Contains the dimension of the non-deflated matrix, this is +* the order of the related secular equation. 1 <= K <=N. +* +* D (input/output) DOUBLE PRECISION array, dimension ( N ) +* On entry D contains the singular values of the two submatrices +* to be combined. On exit D contains the trailing (N-K) updated +* singular values (those which were deflated) sorted into +* increasing order. +* +* Z (output) DOUBLE PRECISION array, dimension ( M ) +* On exit Z contains the updating row vector in the secular +* equation. +* +* ZW (workspace) DOUBLE PRECISION array, dimension ( M ) +* Workspace for Z. +* +* VF (input/output) DOUBLE PRECISION array, dimension ( M ) +* On entry, VF(1:NL+1) contains the first components of all +* right singular vectors of the upper block; and VF(NL+2:M) +* contains the first components of all right singular vectors +* of the lower block. On exit, VF contains the first components +* of all right singular vectors of the bidiagonal matrix. +* +* VFW (workspace) DOUBLE PRECISION array, dimension ( M ) +* Workspace for VF. +* +* VL (input/output) DOUBLE PRECISION array, dimension ( M ) +* On entry, VL(1:NL+1) contains the last components of all +* right singular vectors of the upper block; and VL(NL+2:M) +* contains the last components of all right singular vectors +* of the lower block. On exit, VL contains the last components +* of all right singular vectors of the bidiagonal matrix. +* +* VLW (workspace) DOUBLE PRECISION array, dimension ( M ) +* Workspace for VL. +* +* ALPHA (input) DOUBLE PRECISION +* Contains the diagonal element associated with the added row. +* +* BETA (input) DOUBLE PRECISION +* Contains the off-diagonal element associated with the added +* row. +* +* DSIGMA (output) DOUBLE PRECISION array, dimension ( N ) +* Contains a copy of the diagonal elements (K-1 singular values +* and one zero) in the secular equation. +* +* IDX (workspace) INTEGER array, dimension ( N ) +* This will contain the permutation used to sort the contents of +* D into ascending order. +* +* IDXP (workspace) INTEGER array, dimension ( N ) +* This will contain the permutation used to place deflated +* values of D at the end of the array. On output IDXP(2:K) +* points to the nondeflated D-values and IDXP(K+1:N) +* points to the deflated singular values. +* +* IDXQ (input) INTEGER array, dimension ( N ) +* This contains the permutation which separately sorts the two +* sub-problems in D into ascending order. Note that entries in +* the first half of this permutation must first be moved one +* position backward; and entries in the second half +* must first have NL+1 added to their values. +* +* PERM (output) INTEGER array, dimension ( N ) +* The permutations (from deflation and sorting) to be applied +* to each singular block. Not referenced if ICOMPQ = 0. +* +* GIVPTR (output) INTEGER +* The number of Givens rotations which took place in this +* subproblem. Not referenced if ICOMPQ = 0. +* +* GIVCOL (output) INTEGER array, dimension ( LDGCOL, 2 ) +* Each pair of numbers indicates a pair of columns to take place +* in a Givens rotation. Not referenced if ICOMPQ = 0. +* +* LDGCOL (input) INTEGER +* The leading dimension of GIVCOL, must be at least N. +* +* GIVNUM (output) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* Each number indicates the C or S value to be used in the +* corresponding Givens rotation. Not referenced if ICOMPQ = 0. +* +* LDGNUM (input) INTEGER +* The leading dimension of GIVNUM, must be at least N. +* +* C (output) DOUBLE PRECISION +* C contains garbage if SQRE =0 and the C-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* S (output) DOUBLE PRECISION +* S contains garbage if SQRE =0 and the S-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO, EIGHT + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0, + $ EIGHT = 8.0D+0 ) +* .. +* .. Local Scalars .. +* + INTEGER I, IDXI, IDXJ, IDXJP, J, JP, JPREV, K2, M, N, + $ NLP1, NLP2 + DOUBLE PRECISION EPS, HLFTOL, TAU, TOL, Z1 +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DLAMRG, DROT, XERBLA +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMCH, DLAPY2 + EXTERNAL DLAMCH, DLAPY2 +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 + N = NL + NR + 1 + M = N + SQRE +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( NL.LT.1 ) THEN + INFO = -2 + ELSE IF( NR.LT.1 ) THEN + INFO = -3 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -4 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -22 + ELSE IF( LDGNUM.LT.N ) THEN + INFO = -24 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD7', -INFO ) + RETURN + END IF +* + NLP1 = NL + 1 + NLP2 = NL + 2 + IF( ICOMPQ.EQ.1 ) THEN + GIVPTR = 0 + END IF +* +* Generate the first part of the vector Z and move the singular +* values in the first part of D one position backward. +* + Z1 = ALPHA*VL( NLP1 ) + VL( NLP1 ) = ZERO + TAU = VF( NLP1 ) + DO 10 I = NL, 1, -1 + Z( I+1 ) = ALPHA*VL( I ) + VL( I ) = ZERO + VF( I+1 ) = VF( I ) + D( I+1 ) = D( I ) + IDXQ( I+1 ) = IDXQ( I ) + 1 + 10 CONTINUE + VF( 1 ) = TAU +* +* Generate the second part of the vector Z. +* + DO 20 I = NLP2, M + Z( I ) = BETA*VF( I ) + VF( I ) = ZERO + 20 CONTINUE +* +* Sort the singular values into increasing order +* + DO 30 I = NLP2, N + IDXQ( I ) = IDXQ( I ) + NLP1 + 30 CONTINUE +* +* DSIGMA, IDXC, IDXC, and ZW are used as storage space. +* + DO 40 I = 2, N + DSIGMA( I ) = D( IDXQ( I ) ) + ZW( I ) = Z( IDXQ( I ) ) + VFW( I ) = VF( IDXQ( I ) ) + VLW( I ) = VL( IDXQ( I ) ) + 40 CONTINUE +* + CALL DLAMRG( NL, NR, DSIGMA( 2 ), 1, 1, IDX( 2 ) ) +* + DO 50 I = 2, N + IDXI = 1 + IDX( I ) + D( I ) = DSIGMA( IDXI ) + Z( I ) = ZW( IDXI ) + VF( I ) = VFW( IDXI ) + VL( I ) = VLW( IDXI ) + 50 CONTINUE +* +* Calculate the allowable deflation tolerence +* + EPS = DLAMCH( 'Epsilon' ) + TOL = MAX( ABS( ALPHA ), ABS( BETA ) ) + TOL = EIGHT*EIGHT*EPS*MAX( ABS( D( N ) ), TOL ) +* +* There are 2 kinds of deflation -- first a value in the z-vector +* is small, second two (or more) singular values are very close +* together (their difference is small). +* +* If the value in the z-vector is small, we simply permute the +* array so that the corresponding singular value is moved to the +* end. +* +* If two values in the D-vector are close, we perform a two-sided +* rotation designed to make one of the corresponding z-vector +* entries zero, and then permute the array so that the deflated +* singular value is moved to the end. +* +* If there are multiple singular values then the problem deflates. +* Here the number of equal singular values are found. As each equal +* singular value is found, an elementary reflector is computed to +* rotate the corresponding singular subspace so that the +* corresponding components of Z are zero in this new basis. +* + K = 1 + K2 = N + 1 + DO 60 J = 2, N + IF( ABS( Z( J ) ).LE.TOL ) THEN +* +* Deflate due to small z component. +* + K2 = K2 - 1 + IDXP( K2 ) = J + IF( J.EQ.N ) + $ GO TO 100 + ELSE + JPREV = J + GO TO 70 + END IF + 60 CONTINUE + 70 CONTINUE + J = JPREV + 80 CONTINUE + J = J + 1 + IF( J.GT.N ) + $ GO TO 90 + IF( ABS( Z( J ) ).LE.TOL ) THEN +* +* Deflate due to small z component. +* + K2 = K2 - 1 + IDXP( K2 ) = J + ELSE +* +* Check if singular values are close enough to allow deflation. +* + IF( ABS( D( J )-D( JPREV ) ).LE.TOL ) THEN +* +* Deflation is possible. +* + S = Z( JPREV ) + C = Z( J ) +* +* Find sqrt(a**2+b**2) without overflow or +* destructive underflow. +* + TAU = DLAPY2( C, S ) + Z( J ) = TAU + Z( JPREV ) = ZERO + C = C / TAU + S = -S / TAU +* +* Record the appropriate Givens rotation +* + IF( ICOMPQ.EQ.1 ) THEN + GIVPTR = GIVPTR + 1 + IDXJP = IDXQ( IDX( JPREV )+1 ) + IDXJ = IDXQ( IDX( J )+1 ) + IF( IDXJP.LE.NLP1 ) THEN + IDXJP = IDXJP - 1 + END IF + IF( IDXJ.LE.NLP1 ) THEN + IDXJ = IDXJ - 1 + END IF + GIVCOL( GIVPTR, 2 ) = IDXJP + GIVCOL( GIVPTR, 1 ) = IDXJ + GIVNUM( GIVPTR, 2 ) = C + GIVNUM( GIVPTR, 1 ) = S + END IF + CALL DROT( 1, VF( JPREV ), 1, VF( J ), 1, C, S ) + CALL DROT( 1, VL( JPREV ), 1, VL( J ), 1, C, S ) + K2 = K2 - 1 + IDXP( K2 ) = JPREV + JPREV = J + ELSE + K = K + 1 + ZW( K ) = Z( JPREV ) + DSIGMA( K ) = D( JPREV ) + IDXP( K ) = JPREV + JPREV = J + END IF + END IF + GO TO 80 + 90 CONTINUE +* +* Record the last singular value. +* + K = K + 1 + ZW( K ) = Z( JPREV ) + DSIGMA( K ) = D( JPREV ) + IDXP( K ) = JPREV +* + 100 CONTINUE +* +* Sort the singular values into DSIGMA. The singular values which +* were not deflated go into the first K slots of DSIGMA, except +* that DSIGMA(1) is treated separately. +* + DO 110 J = 2, N + JP = IDXP( J ) + DSIGMA( J ) = D( JP ) + VFW( J ) = VF( JP ) + VLW( J ) = VL( JP ) + 110 CONTINUE + IF( ICOMPQ.EQ.1 ) THEN + DO 120 J = 2, N + JP = IDXP( J ) + PERM( J ) = IDXQ( IDX( JP )+1 ) + IF( PERM( J ).LE.NLP1 ) THEN + PERM( J ) = PERM( J ) - 1 + END IF + 120 CONTINUE + END IF +* +* The deflated singular values go back into the last N - K slots of +* D. +* + CALL DCOPY( N-K, DSIGMA( K+1 ), 1, D( K+1 ), 1 ) +* +* Determine DSIGMA(1), DSIGMA(2), Z(1), VF(1), VL(1), VF(M), and +* VL(M). +* + DSIGMA( 1 ) = ZERO + HLFTOL = TOL / TWO + IF( ABS( DSIGMA( 2 ) ).LE.HLFTOL ) + $ DSIGMA( 2 ) = HLFTOL + IF( M.GT.N ) THEN + Z( 1 ) = DLAPY2( Z1, Z( M ) ) + IF( Z( 1 ).LE.TOL ) THEN + C = ONE + S = ZERO + Z( 1 ) = TOL + ELSE + C = Z1 / Z( 1 ) + S = -Z( M ) / Z( 1 ) + END IF + CALL DROT( 1, VF( M ), 1, VF( 1 ), 1, C, S ) + CALL DROT( 1, VL( M ), 1, VL( 1 ), 1, C, S ) + ELSE + IF( ABS( Z1 ).LE.TOL ) THEN + Z( 1 ) = TOL + ELSE + Z( 1 ) = Z1 + END IF + END IF +* +* Restore Z, VF, and VL. +* + CALL DCOPY( K-1, ZW( 2 ), 1, Z( 2 ), 1 ) + CALL DCOPY( N-1, VFW( 2 ), 1, VF( 2 ), 1 ) + CALL DCOPY( N-1, VLW( 2 ), 1, VL( 2 ), 1 ) +* + RETURN +* +* End of DLASD7 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasd8.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasd8.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,253 @@ + SUBROUTINE DLASD8( ICOMPQ, K, D, Z, VF, VL, DIFL, DIFR, LDDIFR, + $ DSIGMA, WORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER ICOMPQ, INFO, K, LDDIFR +* .. +* .. Array Arguments .. + DOUBLE PRECISION D( * ), DIFL( * ), DIFR( LDDIFR, * ), + $ DSIGMA( * ), VF( * ), VL( * ), WORK( * ), + $ Z( * ) +* .. +* +* Purpose +* ======= +* +* DLASD8 finds the square roots of the roots of the secular equation, +* as defined by the values in DSIGMA and Z. It makes the appropriate +* calls to DLASD4, and stores, for each element in D, the distance +* to its two nearest poles (elements in DSIGMA). It also updates +* the arrays VF and VL, the first and last components of all the +* right singular vectors of the original bidiagonal matrix. +* +* DLASD8 is called from DLASD6. +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether singular vectors are to be computed in +* factored form in the calling routine: +* = 0: Compute singular values only. +* = 1: Compute singular vectors in factored form as well. +* +* K (input) INTEGER +* The number of terms in the rational function to be solved +* by DLASD4. K >= 1. +* +* D (output) DOUBLE PRECISION array, dimension ( K ) +* On output, D contains the updated singular values. +* +* Z (input) DOUBLE PRECISION array, dimension ( K ) +* The first K elements of this array contain the components +* of the deflation-adjusted updating row vector. +* +* VF (input/output) DOUBLE PRECISION array, dimension ( K ) +* On entry, VF contains information passed through DBEDE8. +* On exit, VF contains the first K components of the first +* components of all right singular vectors of the bidiagonal +* matrix. +* +* VL (input/output) DOUBLE PRECISION array, dimension ( K ) +* On entry, VL contains information passed through DBEDE8. +* On exit, VL contains the first K components of the last +* components of all right singular vectors of the bidiagonal +* matrix. +* +* DIFL (output) DOUBLE PRECISION array, dimension ( K ) +* On exit, DIFL(I) = D(I) - DSIGMA(I). +* +* DIFR (output) DOUBLE PRECISION array, +* dimension ( LDDIFR, 2 ) if ICOMPQ = 1 and +* dimension ( K ) if ICOMPQ = 0. +* On exit, DIFR(I,1) = D(I) - DSIGMA(I+1), DIFR(K,1) is not +* defined and will not be referenced. +* +* If ICOMPQ = 1, DIFR(1:K,2) is an array containing the +* normalizing factors for the right singular vector matrix. +* +* LDDIFR (input) INTEGER +* The leading dimension of DIFR, must be at least K. +* +* DSIGMA (input) DOUBLE PRECISION array, dimension ( K ) +* The first K elements of this array contain the old roots +* of the deflated updating problem. These are the poles +* of the secular equation. +* +* WORK (workspace) DOUBLE PRECISION array, dimension at least 3 * K +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: if INFO = 1, an singular value did not converge +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ONE + PARAMETER ( ONE = 1.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER I, IWK1, IWK2, IWK2I, IWK3, IWK3I, J + DOUBLE PRECISION DIFLJ, DIFRJ, DJ, DSIGJ, DSIGJP, RHO, TEMP +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DLASCL, DLASD4, DLASET, XERBLA +* .. +* .. External Functions .. + DOUBLE PRECISION DDOT, DLAMC3, DNRM2 + EXTERNAL DDOT, DLAMC3, DNRM2 +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, SIGN, SQRT +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( K.LT.1 ) THEN + INFO = -2 + ELSE IF( LDDIFR.LT.K ) THEN + INFO = -9 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASD8', -INFO ) + RETURN + END IF +* +* Quick return if possible +* + IF( K.EQ.1 ) THEN + D( 1 ) = ABS( Z( 1 ) ) + DIFL( 1 ) = D( 1 ) + IF( ICOMPQ.EQ.1 ) THEN + DIFL( 2 ) = ONE + DIFR( 1, 2 ) = ONE + END IF + RETURN + END IF +* +* Modify values DSIGMA(i) to make sure all DSIGMA(i)-DSIGMA(j) can +* be computed with high relative accuracy (barring over/underflow). +* This is a problem on machines without a guard digit in +* add/subtract (Cray XMP, Cray YMP, Cray C 90 and Cray 2). +* The following code replaces DSIGMA(I) by 2*DSIGMA(I)-DSIGMA(I), +* which on any of these machines zeros out the bottommost +* bit of DSIGMA(I) if it is 1; this makes the subsequent +* subtractions DSIGMA(I)-DSIGMA(J) unproblematic when cancellation +* occurs. On binary machines with a guard digit (almost all +* machines) it does not change DSIGMA(I) at all. On hexadecimal +* and decimal machines with a guard digit, it slightly +* changes the bottommost bits of DSIGMA(I). It does not account +* for hexadecimal or decimal machines without guard digits +* (we know of none). We use a subroutine call to compute +* 2*DSIGMA(I) to prevent optimizing compilers from eliminating +* this code. +* + DO 10 I = 1, K + DSIGMA( I ) = DLAMC3( DSIGMA( I ), DSIGMA( I ) ) - DSIGMA( I ) + 10 CONTINUE +* +* Book keeping. +* + IWK1 = 1 + IWK2 = IWK1 + K + IWK3 = IWK2 + K + IWK2I = IWK2 - 1 + IWK3I = IWK3 - 1 +* +* Normalize Z. +* + RHO = DNRM2( K, Z, 1 ) + CALL DLASCL( 'G', 0, 0, RHO, ONE, K, 1, Z, K, INFO ) + RHO = RHO*RHO +* +* Initialize WORK(IWK3). +* + CALL DLASET( 'A', K, 1, ONE, ONE, WORK( IWK3 ), K ) +* +* Compute the updated singular values, the arrays DIFL, DIFR, +* and the updated Z. +* + DO 40 J = 1, K + CALL DLASD4( K, J, DSIGMA, Z, WORK( IWK1 ), RHO, D( J ), + $ WORK( IWK2 ), INFO ) +* +* If the root finder fails, the computation is terminated. +* + IF( INFO.NE.0 ) THEN + RETURN + END IF + WORK( IWK3I+J ) = WORK( IWK3I+J )*WORK( J )*WORK( IWK2I+J ) + DIFL( J ) = -WORK( J ) + DIFR( J, 1 ) = -WORK( J+1 ) + DO 20 I = 1, J - 1 + WORK( IWK3I+I ) = WORK( IWK3I+I )*WORK( I )* + $ WORK( IWK2I+I ) / ( DSIGMA( I )- + $ DSIGMA( J ) ) / ( DSIGMA( I )+ + $ DSIGMA( J ) ) + 20 CONTINUE + DO 30 I = J + 1, K + WORK( IWK3I+I ) = WORK( IWK3I+I )*WORK( I )* + $ WORK( IWK2I+I ) / ( DSIGMA( I )- + $ DSIGMA( J ) ) / ( DSIGMA( I )+ + $ DSIGMA( J ) ) + 30 CONTINUE + 40 CONTINUE +* +* Compute updated Z. +* + DO 50 I = 1, K + Z( I ) = SIGN( SQRT( ABS( WORK( IWK3I+I ) ) ), Z( I ) ) + 50 CONTINUE +* +* Update VF and VL. +* + DO 80 J = 1, K + DIFLJ = DIFL( J ) + DJ = D( J ) + DSIGJ = -DSIGMA( J ) + IF( J.LT.K ) THEN + DIFRJ = -DIFR( J, 1 ) + DSIGJP = -DSIGMA( J+1 ) + END IF + WORK( J ) = -Z( J ) / DIFLJ / ( DSIGMA( J )+DJ ) + DO 60 I = 1, J - 1 + WORK( I ) = Z( I ) / ( DLAMC3( DSIGMA( I ), DSIGJ )-DIFLJ ) + $ / ( DSIGMA( I )+DJ ) + 60 CONTINUE + DO 70 I = J + 1, K + WORK( I ) = Z( I ) / ( DLAMC3( DSIGMA( I ), DSIGJP )+DIFRJ ) + $ / ( DSIGMA( I )+DJ ) + 70 CONTINUE + TEMP = DNRM2( K, WORK, 1 ) + WORK( IWK2I+J ) = DDOT( K, WORK, 1, VF, 1 ) / TEMP + WORK( IWK3I+J ) = DDOT( K, WORK, 1, VL, 1 ) / TEMP + IF( ICOMPQ.EQ.1 ) THEN + DIFR( J, 2 ) = TEMP + END IF + 80 CONTINUE +* + CALL DCOPY( K, WORK( IWK2 ), 1, VF, 1 ) + CALL DCOPY( K, WORK( IWK3 ), 1, VL, 1 ) +* + RETURN +* +* End of DLASD8 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasda.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasda.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,390 @@ + SUBROUTINE DLASDA( ICOMPQ, SMLSIZ, N, SQRE, D, E, U, LDU, VT, K, + $ DIFL, DIFR, Z, POLES, GIVPTR, GIVCOL, LDGCOL, + $ PERM, GIVNUM, C, S, WORK, IWORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER ICOMPQ, INFO, LDGCOL, LDU, N, SMLSIZ, SQRE +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), GIVPTR( * ), IWORK( * ), + $ K( * ), PERM( LDGCOL, * ) + DOUBLE PRECISION C( * ), D( * ), DIFL( LDU, * ), DIFR( LDU, * ), + $ E( * ), GIVNUM( LDU, * ), POLES( LDU, * ), + $ S( * ), U( LDU, * ), VT( LDU, * ), WORK( * ), + $ Z( LDU, * ) +* .. +* +* Purpose +* ======= +* +* Using a divide and conquer approach, DLASDA computes the singular +* value decomposition (SVD) of a real upper bidiagonal N-by-M matrix +* B with diagonal D and offdiagonal E, where M = N + SQRE. The +* algorithm computes the singular values in the SVD B = U * S * VT. +* The orthogonal matrices U and VT are optionally computed in +* compact form. +* +* A related subroutine, DLASD0, computes the singular values and +* the singular vectors in explicit form. +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether singular vectors are to be computed +* in compact form, as follows +* = 0: Compute singular values only. +* = 1: Compute singular vectors of upper bidiagonal +* matrix in compact form. +* +* SMLSIZ (input) INTEGER +* The maximum size of the subproblems at the bottom of the +* computation tree. +* +* N (input) INTEGER +* The row dimension of the upper bidiagonal matrix. This is +* also the dimension of the main diagonal array D. +* +* SQRE (input) INTEGER +* Specifies the column dimension of the bidiagonal matrix. +* = 0: The bidiagonal matrix has column dimension M = N; +* = 1: The bidiagonal matrix has column dimension M = N + 1. +* +* D (input/output) DOUBLE PRECISION array, dimension ( N ) +* On entry D contains the main diagonal of the bidiagonal +* matrix. On exit D, if INFO = 0, contains its singular values. +* +* E (input) DOUBLE PRECISION array, dimension ( M-1 ) +* Contains the subdiagonal entries of the bidiagonal matrix. +* On exit, E has been destroyed. +* +* U (output) DOUBLE PRECISION array, +* dimension ( LDU, SMLSIZ ) if ICOMPQ = 1, and not referenced +* if ICOMPQ = 0. If ICOMPQ = 1, on exit, U contains the left +* singular vector matrices of all subproblems at the bottom +* level. +* +* LDU (input) INTEGER, LDU = > N. +* The leading dimension of arrays U, VT, DIFL, DIFR, POLES, +* GIVNUM, and Z. +* +* VT (output) DOUBLE PRECISION array, +* dimension ( LDU, SMLSIZ+1 ) if ICOMPQ = 1, and not referenced +* if ICOMPQ = 0. If ICOMPQ = 1, on exit, VT' contains the right +* singular vector matrices of all subproblems at the bottom +* level. +* +* K (output) INTEGER array, +* dimension ( N ) if ICOMPQ = 1 and dimension 1 if ICOMPQ = 0. +* If ICOMPQ = 1, on exit, K(I) is the dimension of the I-th +* secular equation on the computation tree. +* +* DIFL (output) DOUBLE PRECISION array, dimension ( LDU, NLVL ), +* where NLVL = floor(log_2 (N/SMLSIZ))). +* +* DIFR (output) DOUBLE PRECISION array, +* dimension ( LDU, 2 * NLVL ) if ICOMPQ = 1 and +* dimension ( N ) if ICOMPQ = 0. +* If ICOMPQ = 1, on exit, DIFL(1:N, I) and DIFR(1:N, 2 * I - 1) +* record distances between singular values on the I-th +* level and singular values on the (I -1)-th level, and +* DIFR(1:N, 2 * I ) contains the normalizing factors for +* the right singular vector matrix. See DLASD8 for details. +* +* Z (output) DOUBLE PRECISION array, +* dimension ( LDU, NLVL ) if ICOMPQ = 1 and +* dimension ( N ) if ICOMPQ = 0. +* The first K elements of Z(1, I) contain the components of +* the deflation-adjusted updating row vector for subproblems +* on the I-th level. +* +* POLES (output) DOUBLE PRECISION array, +* dimension ( LDU, 2 * NLVL ) if ICOMPQ = 1, and not referenced +* if ICOMPQ = 0. If ICOMPQ = 1, on exit, POLES(1, 2*I - 1) and +* POLES(1, 2*I) contain the new and old singular values +* involved in the secular equations on the I-th level. +* +* GIVPTR (output) INTEGER array, +* dimension ( N ) if ICOMPQ = 1, and not referenced if +* ICOMPQ = 0. If ICOMPQ = 1, on exit, GIVPTR( I ) records +* the number of Givens rotations performed on the I-th +* problem on the computation tree. +* +* GIVCOL (output) INTEGER array, +* dimension ( LDGCOL, 2 * NLVL ) if ICOMPQ = 1, and not +* referenced if ICOMPQ = 0. If ICOMPQ = 1, on exit, for each I, +* GIVCOL(1, 2 *I - 1) and GIVCOL(1, 2 *I) record the locations +* of Givens rotations performed on the I-th level on the +* computation tree. +* +* LDGCOL (input) INTEGER, LDGCOL = > N. +* The leading dimension of arrays GIVCOL and PERM. +* +* PERM (output) INTEGER array, +* dimension ( LDGCOL, NLVL ) if ICOMPQ = 1, and not referenced +* if ICOMPQ = 0. If ICOMPQ = 1, on exit, PERM(1, I) records +* permutations done on the I-th level of the computation tree. +* +* GIVNUM (output) DOUBLE PRECISION array, +* dimension ( LDU, 2 * NLVL ) if ICOMPQ = 1, and not +* referenced if ICOMPQ = 0. If ICOMPQ = 1, on exit, for each I, +* GIVNUM(1, 2 *I - 1) and GIVNUM(1, 2 *I) record the C- and S- +* values of Givens rotations performed on the I-th level on +* the computation tree. +* +* C (output) DOUBLE PRECISION array, +* dimension ( N ) if ICOMPQ = 1, and dimension 1 if ICOMPQ = 0. +* If ICOMPQ = 1 and the I-th subproblem is not square, on exit, +* C( I ) contains the C-value of a Givens rotation related to +* the right null space of the I-th subproblem. +* +* S (output) DOUBLE PRECISION array, dimension ( N ) if +* ICOMPQ = 1, and dimension 1 if ICOMPQ = 0. If ICOMPQ = 1 +* and the I-th subproblem is not square, on exit, S( I ) +* contains the S-value of a Givens rotation related to +* the right null space of the I-th subproblem. +* +* WORK (workspace) DOUBLE PRECISION array, dimension +* (6 * N + (SMLSIZ + 1)*(SMLSIZ + 1)). +* +* IWORK (workspace) INTEGER array. +* Dimension must be at least (7 * N). +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: if INFO = 1, an singular value did not converge +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER I, I1, IC, IDXQ, IDXQI, IM1, INODE, ITEMP, IWK, + $ J, LF, LL, LVL, LVL2, M, NCC, ND, NDB1, NDIML, + $ NDIMR, NL, NLF, NLP1, NLVL, NR, NRF, NRP1, NRU, + $ NWORK1, NWORK2, SMLSZP, SQREI, VF, VFI, VL, VLI + DOUBLE PRECISION ALPHA, BETA +* .. +* .. External Subroutines .. + EXTERNAL DCOPY, DLASD6, DLASDQ, DLASDT, DLASET, XERBLA +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( SMLSIZ.LT.3 ) THEN + INFO = -2 + ELSE IF( N.LT.0 ) THEN + INFO = -3 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -4 + ELSE IF( LDU.LT.( N+SQRE ) ) THEN + INFO = -8 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -17 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASDA', -INFO ) + RETURN + END IF +* + M = N + SQRE +* +* If the input matrix is too small, call DLASDQ to find the SVD. +* + IF( N.LE.SMLSIZ ) THEN + IF( ICOMPQ.EQ.0 ) THEN + CALL DLASDQ( 'U', SQRE, N, 0, 0, 0, D, E, VT, LDU, U, LDU, + $ U, LDU, WORK, INFO ) + ELSE + CALL DLASDQ( 'U', SQRE, N, M, N, 0, D, E, VT, LDU, U, LDU, + $ U, LDU, WORK, INFO ) + END IF + RETURN + END IF +* +* Book-keeping and set up the computation tree. +* + INODE = 1 + NDIML = INODE + N + NDIMR = NDIML + N + IDXQ = NDIMR + N + IWK = IDXQ + N +* + NCC = 0 + NRU = 0 +* + SMLSZP = SMLSIZ + 1 + VF = 1 + VL = VF + M + NWORK1 = VL + M + NWORK2 = NWORK1 + SMLSZP*SMLSZP +* + CALL DLASDT( N, NLVL, ND, IWORK( INODE ), IWORK( NDIML ), + $ IWORK( NDIMR ), SMLSIZ ) +* +* for the nodes on bottom level of the tree, solve +* their subproblems by DLASDQ. +* + NDB1 = ( ND+1 ) / 2 + DO 30 I = NDB1, ND +* +* IC : center row of each node +* NL : number of rows of left subproblem +* NR : number of rows of right subproblem +* NLF: starting row of the left subproblem +* NRF: starting row of the right subproblem +* + I1 = I - 1 + IC = IWORK( INODE+I1 ) + NL = IWORK( NDIML+I1 ) + NLP1 = NL + 1 + NR = IWORK( NDIMR+I1 ) + NLF = IC - NL + NRF = IC + 1 + IDXQI = IDXQ + NLF - 2 + VFI = VF + NLF - 1 + VLI = VL + NLF - 1 + SQREI = 1 + IF( ICOMPQ.EQ.0 ) THEN + CALL DLASET( 'A', NLP1, NLP1, ZERO, ONE, WORK( NWORK1 ), + $ SMLSZP ) + CALL DLASDQ( 'U', SQREI, NL, NLP1, NRU, NCC, D( NLF ), + $ E( NLF ), WORK( NWORK1 ), SMLSZP, + $ WORK( NWORK2 ), NL, WORK( NWORK2 ), NL, + $ WORK( NWORK2 ), INFO ) + ITEMP = NWORK1 + NL*SMLSZP + CALL DCOPY( NLP1, WORK( NWORK1 ), 1, WORK( VFI ), 1 ) + CALL DCOPY( NLP1, WORK( ITEMP ), 1, WORK( VLI ), 1 ) + ELSE + CALL DLASET( 'A', NL, NL, ZERO, ONE, U( NLF, 1 ), LDU ) + CALL DLASET( 'A', NLP1, NLP1, ZERO, ONE, VT( NLF, 1 ), LDU ) + CALL DLASDQ( 'U', SQREI, NL, NLP1, NL, NCC, D( NLF ), + $ E( NLF ), VT( NLF, 1 ), LDU, U( NLF, 1 ), LDU, + $ U( NLF, 1 ), LDU, WORK( NWORK1 ), INFO ) + CALL DCOPY( NLP1, VT( NLF, 1 ), 1, WORK( VFI ), 1 ) + CALL DCOPY( NLP1, VT( NLF, NLP1 ), 1, WORK( VLI ), 1 ) + END IF + IF( INFO.NE.0 ) THEN + RETURN + END IF + DO 10 J = 1, NL + IWORK( IDXQI+J ) = J + 10 CONTINUE + IF( ( I.EQ.ND ) .AND. ( SQRE.EQ.0 ) ) THEN + SQREI = 0 + ELSE + SQREI = 1 + END IF + IDXQI = IDXQI + NLP1 + VFI = VFI + NLP1 + VLI = VLI + NLP1 + NRP1 = NR + SQREI + IF( ICOMPQ.EQ.0 ) THEN + CALL DLASET( 'A', NRP1, NRP1, ZERO, ONE, WORK( NWORK1 ), + $ SMLSZP ) + CALL DLASDQ( 'U', SQREI, NR, NRP1, NRU, NCC, D( NRF ), + $ E( NRF ), WORK( NWORK1 ), SMLSZP, + $ WORK( NWORK2 ), NR, WORK( NWORK2 ), NR, + $ WORK( NWORK2 ), INFO ) + ITEMP = NWORK1 + ( NRP1-1 )*SMLSZP + CALL DCOPY( NRP1, WORK( NWORK1 ), 1, WORK( VFI ), 1 ) + CALL DCOPY( NRP1, WORK( ITEMP ), 1, WORK( VLI ), 1 ) + ELSE + CALL DLASET( 'A', NR, NR, ZERO, ONE, U( NRF, 1 ), LDU ) + CALL DLASET( 'A', NRP1, NRP1, ZERO, ONE, VT( NRF, 1 ), LDU ) + CALL DLASDQ( 'U', SQREI, NR, NRP1, NR, NCC, D( NRF ), + $ E( NRF ), VT( NRF, 1 ), LDU, U( NRF, 1 ), LDU, + $ U( NRF, 1 ), LDU, WORK( NWORK1 ), INFO ) + CALL DCOPY( NRP1, VT( NRF, 1 ), 1, WORK( VFI ), 1 ) + CALL DCOPY( NRP1, VT( NRF, NRP1 ), 1, WORK( VLI ), 1 ) + END IF + IF( INFO.NE.0 ) THEN + RETURN + END IF + DO 20 J = 1, NR + IWORK( IDXQI+J ) = J + 20 CONTINUE + 30 CONTINUE +* +* Now conquer each subproblem bottom-up. +* + J = 2**NLVL + DO 50 LVL = NLVL, 1, -1 + LVL2 = LVL*2 - 1 +* +* Find the first node LF and last node LL on +* the current level LVL. +* + IF( LVL.EQ.1 ) THEN + LF = 1 + LL = 1 + ELSE + LF = 2**( LVL-1 ) + LL = 2*LF - 1 + END IF + DO 40 I = LF, LL + IM1 = I - 1 + IC = IWORK( INODE+IM1 ) + NL = IWORK( NDIML+IM1 ) + NR = IWORK( NDIMR+IM1 ) + NLF = IC - NL + NRF = IC + 1 + IF( I.EQ.LL ) THEN + SQREI = SQRE + ELSE + SQREI = 1 + END IF + VFI = VF + NLF - 1 + VLI = VL + NLF - 1 + IDXQI = IDXQ + NLF - 1 + ALPHA = D( IC ) + BETA = E( IC ) + IF( ICOMPQ.EQ.0 ) THEN + CALL DLASD6( ICOMPQ, NL, NR, SQREI, D( NLF ), + $ WORK( VFI ), WORK( VLI ), ALPHA, BETA, + $ IWORK( IDXQI ), PERM, GIVPTR( 1 ), GIVCOL, + $ LDGCOL, GIVNUM, LDU, POLES, DIFL, DIFR, Z, + $ K( 1 ), C( 1 ), S( 1 ), WORK( NWORK1 ), + $ IWORK( IWK ), INFO ) + ELSE + J = J - 1 + CALL DLASD6( ICOMPQ, NL, NR, SQREI, D( NLF ), + $ WORK( VFI ), WORK( VLI ), ALPHA, BETA, + $ IWORK( IDXQI ), PERM( NLF, LVL ), + $ GIVPTR( J ), GIVCOL( NLF, LVL2 ), LDGCOL, + $ GIVNUM( NLF, LVL2 ), LDU, + $ POLES( NLF, LVL2 ), DIFL( NLF, LVL ), + $ DIFR( NLF, LVL2 ), Z( NLF, LVL ), K( J ), + $ C( J ), S( J ), WORK( NWORK1 ), + $ IWORK( IWK ), INFO ) + END IF + IF( INFO.NE.0 ) THEN + RETURN + END IF + 40 CONTINUE + 50 CONTINUE +* + RETURN +* +* End of DLASDA +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasdq.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasdq.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,316 @@ + SUBROUTINE DLASDQ( UPLO, SQRE, N, NCVT, NRU, NCC, D, E, VT, LDVT, + $ U, LDU, C, LDC, WORK, INFO ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + CHARACTER UPLO + INTEGER INFO, LDC, LDU, LDVT, N, NCC, NCVT, NRU, SQRE +* .. +* .. Array Arguments .. + DOUBLE PRECISION C( LDC, * ), D( * ), E( * ), U( LDU, * ), + $ VT( LDVT, * ), WORK( * ) +* .. +* +* Purpose +* ======= +* +* DLASDQ computes the singular value decomposition (SVD) of a real +* (upper or lower) bidiagonal matrix with diagonal D and offdiagonal +* E, accumulating the transformations if desired. Letting B denote +* the input bidiagonal matrix, the algorithm computes orthogonal +* matrices Q and P such that B = Q * S * P' (P' denotes the transpose +* of P). The singular values S are overwritten on D. +* +* The input matrix U is changed to U * Q if desired. +* The input matrix VT is changed to P' * VT if desired. +* The input matrix C is changed to Q' * C if desired. +* +* See "Computing Small Singular Values of Bidiagonal Matrices With +* Guaranteed High Relative Accuracy," by J. Demmel and W. Kahan, +* LAPACK Working Note #3, for a detailed description of the algorithm. +* +* Arguments +* ========= +* +* UPLO (input) CHARACTER*1 +* On entry, UPLO specifies whether the input bidiagonal matrix +* is upper or lower bidiagonal, and wether it is square are +* not. +* UPLO = 'U' or 'u' B is upper bidiagonal. +* UPLO = 'L' or 'l' B is lower bidiagonal. +* +* SQRE (input) INTEGER +* = 0: then the input matrix is N-by-N. +* = 1: then the input matrix is N-by-(N+1) if UPLU = 'U' and +* (N+1)-by-N if UPLU = 'L'. +* +* The bidiagonal matrix has +* N = NL + NR + 1 rows and +* M = N + SQRE >= N columns. +* +* N (input) INTEGER +* On entry, N specifies the number of rows and columns +* in the matrix. N must be at least 0. +* +* NCVT (input) INTEGER +* On entry, NCVT specifies the number of columns of +* the matrix VT. NCVT must be at least 0. +* +* NRU (input) INTEGER +* On entry, NRU specifies the number of rows of +* the matrix U. NRU must be at least 0. +* +* NCC (input) INTEGER +* On entry, NCC specifies the number of columns of +* the matrix C. NCC must be at least 0. +* +* D (input/output) DOUBLE PRECISION array, dimension (N) +* On entry, D contains the diagonal entries of the +* bidiagonal matrix whose SVD is desired. On normal exit, +* D contains the singular values in ascending order. +* +* E (input/output) DOUBLE PRECISION array. +* dimension is (N-1) if SQRE = 0 and N if SQRE = 1. +* On entry, the entries of E contain the offdiagonal entries +* of the bidiagonal matrix whose SVD is desired. On normal +* exit, E will contain 0. If the algorithm does not converge, +* D and E will contain the diagonal and superdiagonal entries +* of a bidiagonal matrix orthogonally equivalent to the one +* given as input. +* +* VT (input/output) DOUBLE PRECISION array, dimension (LDVT, NCVT) +* On entry, contains a matrix which on exit has been +* premultiplied by P', dimension N-by-NCVT if SQRE = 0 +* and (N+1)-by-NCVT if SQRE = 1 (not referenced if NCVT=0). +* +* LDVT (input) INTEGER +* On entry, LDVT specifies the leading dimension of VT as +* declared in the calling (sub) program. LDVT must be at +* least 1. If NCVT is nonzero LDVT must also be at least N. +* +* U (input/output) DOUBLE PRECISION array, dimension (LDU, N) +* On entry, contains a matrix which on exit has been +* postmultiplied by Q, dimension NRU-by-N if SQRE = 0 +* and NRU-by-(N+1) if SQRE = 1 (not referenced if NRU=0). +* +* LDU (input) INTEGER +* On entry, LDU specifies the leading dimension of U as +* declared in the calling (sub) program. LDU must be at +* least max( 1, NRU ) . +* +* C (input/output) DOUBLE PRECISION array, dimension (LDC, NCC) +* On entry, contains an N-by-NCC matrix which on exit +* has been premultiplied by Q' dimension N-by-NCC if SQRE = 0 +* and (N+1)-by-NCC if SQRE = 1 (not referenced if NCC=0). +* +* LDC (input) INTEGER +* On entry, LDC specifies the leading dimension of C as +* declared in the calling (sub) program. LDC must be at +* least 1. If NCC is nonzero, LDC must also be at least N. +* +* WORK (workspace) DOUBLE PRECISION array, dimension (4*N) +* Workspace. Only referenced if one of NCVT, NRU, or NCC is +* nonzero, and if N is at least 2. +* +* INFO (output) INTEGER +* On exit, a value of 0 indicates a successful exit. +* If INFO < 0, argument number -INFO is illegal. +* If INFO > 0, the algorithm did not converge, and INFO +* specifies how many superdiagonals did not converge. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO + PARAMETER ( ZERO = 0.0D+0 ) +* .. +* .. Local Scalars .. + LOGICAL ROTATE + INTEGER I, ISUB, IUPLO, J, NP1, SQRE1 + DOUBLE PRECISION CS, R, SMIN, SN +* .. +* .. External Subroutines .. + EXTERNAL DBDSQR, DLARTG, DLASR, DSWAP, XERBLA +* .. +* .. External Functions .. + LOGICAL LSAME + EXTERNAL LSAME +* .. +* .. Intrinsic Functions .. + INTRINSIC MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 + IUPLO = 0 + IF( LSAME( UPLO, 'U' ) ) + $ IUPLO = 1 + IF( LSAME( UPLO, 'L' ) ) + $ IUPLO = 2 + IF( IUPLO.EQ.0 ) THEN + INFO = -1 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -2 + ELSE IF( N.LT.0 ) THEN + INFO = -3 + ELSE IF( NCVT.LT.0 ) THEN + INFO = -4 + ELSE IF( NRU.LT.0 ) THEN + INFO = -5 + ELSE IF( NCC.LT.0 ) THEN + INFO = -6 + ELSE IF( ( NCVT.EQ.0 .AND. LDVT.LT.1 ) .OR. + $ ( NCVT.GT.0 .AND. LDVT.LT.MAX( 1, N ) ) ) THEN + INFO = -10 + ELSE IF( LDU.LT.MAX( 1, NRU ) ) THEN + INFO = -12 + ELSE IF( ( NCC.EQ.0 .AND. LDC.LT.1 ) .OR. + $ ( NCC.GT.0 .AND. LDC.LT.MAX( 1, N ) ) ) THEN + INFO = -14 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'DLASDQ', -INFO ) + RETURN + END IF + IF( N.EQ.0 ) + $ RETURN +* +* ROTATE is true if any singular vectors desired, false otherwise +* + ROTATE = ( NCVT.GT.0 ) .OR. ( NRU.GT.0 ) .OR. ( NCC.GT.0 ) + NP1 = N + 1 + SQRE1 = SQRE +* +* If matrix non-square upper bidiagonal, rotate to be lower +* bidiagonal. The rotations are on the right. +* + IF( ( IUPLO.EQ.1 ) .AND. ( SQRE1.EQ.1 ) ) THEN + DO 10 I = 1, N - 1 + CALL DLARTG( D( I ), E( I ), CS, SN, R ) + D( I ) = R + E( I ) = SN*D( I+1 ) + D( I+1 ) = CS*D( I+1 ) + IF( ROTATE ) THEN + WORK( I ) = CS + WORK( N+I ) = SN + END IF + 10 CONTINUE + CALL DLARTG( D( N ), E( N ), CS, SN, R ) + D( N ) = R + E( N ) = ZERO + IF( ROTATE ) THEN + WORK( N ) = CS + WORK( N+N ) = SN + END IF + IUPLO = 2 + SQRE1 = 0 +* +* Update singular vectors if desired. +* + IF( NCVT.GT.0 ) + $ CALL DLASR( 'L', 'V', 'F', NP1, NCVT, WORK( 1 ), + $ WORK( NP1 ), VT, LDVT ) + END IF +* +* If matrix lower bidiagonal, rotate to be upper bidiagonal +* by applying Givens rotations on the left. +* + IF( IUPLO.EQ.2 ) THEN + DO 20 I = 1, N - 1 + CALL DLARTG( D( I ), E( I ), CS, SN, R ) + D( I ) = R + E( I ) = SN*D( I+1 ) + D( I+1 ) = CS*D( I+1 ) + IF( ROTATE ) THEN + WORK( I ) = CS + WORK( N+I ) = SN + END IF + 20 CONTINUE +* +* If matrix (N+1)-by-N lower bidiagonal, one additional +* rotation is needed. +* + IF( SQRE1.EQ.1 ) THEN + CALL DLARTG( D( N ), E( N ), CS, SN, R ) + D( N ) = R + IF( ROTATE ) THEN + WORK( N ) = CS + WORK( N+N ) = SN + END IF + END IF +* +* Update singular vectors if desired. +* + IF( NRU.GT.0 ) THEN + IF( SQRE1.EQ.0 ) THEN + CALL DLASR( 'R', 'V', 'F', NRU, N, WORK( 1 ), + $ WORK( NP1 ), U, LDU ) + ELSE + CALL DLASR( 'R', 'V', 'F', NRU, NP1, WORK( 1 ), + $ WORK( NP1 ), U, LDU ) + END IF + END IF + IF( NCC.GT.0 ) THEN + IF( SQRE1.EQ.0 ) THEN + CALL DLASR( 'L', 'V', 'F', N, NCC, WORK( 1 ), + $ WORK( NP1 ), C, LDC ) + ELSE + CALL DLASR( 'L', 'V', 'F', NP1, NCC, WORK( 1 ), + $ WORK( NP1 ), C, LDC ) + END IF + END IF + END IF +* +* Call DBDSQR to compute the SVD of the reduced real +* N-by-N upper bidiagonal matrix. +* + CALL DBDSQR( 'U', N, NCVT, NRU, NCC, D, E, VT, LDVT, U, LDU, C, + $ LDC, WORK, INFO ) +* +* Sort the singular values into ascending order (insertion sort on +* singular values, but only one transposition per singular vector) +* + DO 40 I = 1, N +* +* Scan for smallest D(I). +* + ISUB = I + SMIN = D( I ) + DO 30 J = I + 1, N + IF( D( J ).LT.SMIN ) THEN + ISUB = J + SMIN = D( J ) + END IF + 30 CONTINUE + IF( ISUB.NE.I ) THEN +* +* Swap singular values and vectors. +* + D( ISUB ) = D( I ) + D( I ) = SMIN + IF( NCVT.GT.0 ) + $ CALL DSWAP( NCVT, VT( ISUB, 1 ), LDVT, VT( I, 1 ), LDVT ) + IF( NRU.GT.0 ) + $ CALL DSWAP( NRU, U( 1, ISUB ), 1, U( 1, I ), 1 ) + IF( NCC.GT.0 ) + $ CALL DSWAP( NCC, C( ISUB, 1 ), LDC, C( I, 1 ), LDC ) + END IF + 40 CONTINUE +* + RETURN +* +* End of DLASDQ +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/dlasdt.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/dlasdt.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,105 @@ + SUBROUTINE DLASDT( N, LVL, ND, INODE, NDIML, NDIMR, MSUB ) +* +* -- LAPACK auxiliary routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER LVL, MSUB, N, ND +* .. +* .. Array Arguments .. + INTEGER INODE( * ), NDIML( * ), NDIMR( * ) +* .. +* +* Purpose +* ======= +* +* DLASDT creates a tree of subproblems for bidiagonal divide and +* conquer. +* +* Arguments +* ========= +* +* N (input) INTEGER +* On entry, the number of diagonal elements of the +* bidiagonal matrix. +* +* LVL (output) INTEGER +* On exit, the number of levels on the computation tree. +* +* ND (output) INTEGER +* On exit, the number of nodes on the tree. +* +* INODE (output) INTEGER array, dimension ( N ) +* On exit, centers of subproblems. +* +* NDIML (output) INTEGER array, dimension ( N ) +* On exit, row dimensions of left children. +* +* NDIMR (output) INTEGER array, dimension ( N ) +* On exit, row dimensions of right children. +* +* MSUB (input) INTEGER. +* On entry, the maximum row dimension each subproblem at the +* bottom of the tree can be of. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Huan Ren, Computer Science Division, University of +* California at Berkeley, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION TWO + PARAMETER ( TWO = 2.0D+0 ) +* .. +* .. Local Scalars .. + INTEGER I, IL, IR, LLST, MAXN, NCRNT, NLVL + DOUBLE PRECISION TEMP +* .. +* .. Intrinsic Functions .. + INTRINSIC DBLE, INT, LOG, MAX +* .. +* .. Executable Statements .. +* +* Find the number of levels on the tree. +* + MAXN = MAX( 1, N ) + TEMP = LOG( DBLE( MAXN ) / DBLE( MSUB+1 ) ) / LOG( TWO ) + LVL = INT( TEMP ) + 1 +* + I = N / 2 + INODE( 1 ) = I + 1 + NDIML( 1 ) = I + NDIMR( 1 ) = N - I - 1 + IL = 0 + IR = 1 + LLST = 1 + DO 20 NLVL = 1, LVL - 1 +* +* Constructing the tree at (NLVL+1)-st level. The number of +* nodes created on this level is LLST * 2. +* + DO 10 I = 0, LLST - 1 + IL = IL + 2 + IR = IR + 2 + NCRNT = LLST + I + NDIML( IL ) = NDIML( NCRNT ) / 2 + NDIMR( IL ) = NDIML( NCRNT ) - NDIML( IL ) - 1 + INODE( IL ) = INODE( NCRNT ) - NDIMR( IL ) - 1 + NDIML( IR ) = NDIMR( NCRNT ) / 2 + NDIMR( IR ) = NDIMR( NCRNT ) - NDIML( IR ) - 1 + INODE( IR ) = INODE( NCRNT ) + NDIML( IR ) + 1 + 10 CONTINUE + LLST = LLST*2 + 20 CONTINUE + ND = LLST*2 - 1 +* + RETURN +* +* End of DLASDT +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/zgelsd.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/zgelsd.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,566 @@ + SUBROUTINE ZGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND, RANK, + $ WORK, LWORK, RWORK, IWORK, INFO ) +* +* -- LAPACK driver routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK + DOUBLE PRECISION RCOND +* .. +* .. Array Arguments .. + INTEGER IWORK( * ) + DOUBLE PRECISION RWORK( * ), S( * ) + COMPLEX*16 A( LDA, * ), B( LDB, * ), WORK( * ) +* .. +* +* Purpose +* ======= +* +* ZGELSD computes the minimum-norm solution to a real linear least +* squares problem: +* minimize 2-norm(| b - A*x |) +* using the singular value decomposition (SVD) of A. A is an M-by-N +* matrix which may be rank-deficient. +* +* Several right hand side vectors b and solution vectors x can be +* handled in a single call; they are stored as the columns of the +* M-by-NRHS right hand side matrix B and the N-by-NRHS solution +* matrix X. +* +* The problem is solved in three steps: +* (1) Reduce the coefficient matrix A to bidiagonal form with +* Householder tranformations, reducing the original problem +* into a "bidiagonal least squares problem" (BLS) +* (2) Solve the BLS using a divide and conquer approach. +* (3) Apply back all the Householder tranformations to solve +* the original least squares problem. +* +* The effective rank of A is determined by treating as zero those +* singular values which are less than RCOND times the largest singular +* value. +* +* The divide and conquer algorithm makes very mild assumptions about +* floating point arithmetic. It will work on machines with a guard +* digit in add/subtract, or on those binary machines without guard +* digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or +* Cray-2. It could conceivably fail on hexadecimal or decimal machines +* without guard digits, but we know of none. +* +* Arguments +* ========= +* +* M (input) INTEGER +* The number of rows of the matrix A. M >= 0. +* +* N (input) INTEGER +* The number of columns of the matrix A. N >= 0. +* +* NRHS (input) INTEGER +* The number of right hand sides, i.e., the number of columns +* of the matrices B and X. NRHS >= 0. +* +* A (input) COMPLEX*16 array, dimension (LDA,N) +* On entry, the M-by-N matrix A. +* On exit, A has been destroyed. +* +* LDA (input) INTEGER +* The leading dimension of the array A. LDA >= max(1,M). +* +* B (input/output) COMPLEX*16 array, dimension (LDB,NRHS) +* On entry, the M-by-NRHS right hand side matrix B. +* On exit, B is overwritten by the N-by-NRHS solution matrix X. +* If m >= n and RANK = n, the residual sum-of-squares for +* the solution in the i-th column is given by the sum of +* squares of the modulus of elements n+1:m in that column. +* +* LDB (input) INTEGER +* The leading dimension of the array B. LDB >= max(1,M,N). +* +* S (output) DOUBLE PRECISION array, dimension (min(M,N)) +* The singular values of A in decreasing order. +* The condition number of A in the 2-norm = S(1)/S(min(m,n)). +* +* RCOND (input) DOUBLE PRECISION +* RCOND is used to determine the effective rank of A. +* Singular values S(i) <= RCOND*S(1) are treated as zero. +* If RCOND < 0, machine precision is used instead. +* +* RANK (output) INTEGER +* The effective rank of A, i.e., the number of singular values +* which are greater than RCOND*S(1). +* +* WORK (workspace/output) COMPLEX*16 array, dimension (MAX(1,LWORK)) +* On exit, if INFO = 0, WORK(1) returns the optimal LWORK. +* +* LWORK (input) INTEGER +* The dimension of the array WORK. LWORK must be at least 1. +* The exact minimum amount of workspace needed depends on M, +* N and NRHS. As long as LWORK is at least +* 2*N + N*NRHS +* if M is greater than or equal to N or +* 2*M + M*NRHS +* if M is less than N, the code will execute correctly. +* For good performance, LWORK should generally be larger. +* +* If LWORK = -1, then a workspace query is assumed; the routine +* only calculates the optimal size of the array WORK and the +* minimum sizes of the arrays RWORK and IWORK, and returns +* these values as the first entries of the WORK, RWORK and +* IWORK arrays, and no error message related to LWORK is issued +* by XERBLA. +* +* RWORK (workspace) DOUBLE PRECISION array, dimension (MAX(1,LRWORK)) +* LRWORK >= +* 10*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS + +* (SMLSIZ+1)**2 +* if M is greater than or equal to N or +* 10*M + 2*M*SMLSIZ + 8*M*NLVL + 3*SMLSIZ*NRHS + +* (SMLSIZ+1)**2 +* if M is less than N, the code will execute correctly. +* SMLSIZ is returned by ILAENV and is equal to the maximum +* size of the subproblems at the bottom of the computation +* tree (usually about 25), and +* NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 ) +* On exit, if INFO = 0, RWORK(1) returns the minimum LRWORK. +* +* IWORK (workspace) INTEGER array, dimension (MAX(1,LIWORK)) +* LIWORK >= max(1, 3*MINMN*NLVL + 11*MINMN), +* where MINMN = MIN( M,N ). +* On exit, if INFO = 0, IWORK(1) returns the minimum LIWORK. +* +* INFO (output) INTEGER +* = 0: successful exit +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: the algorithm for computing the SVD failed to converge; +* if INFO = i, i off-diagonal elements of an intermediate +* bidiagonal form did not converge to zero. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO + PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0 ) + COMPLEX*16 CZERO + PARAMETER ( CZERO = ( 0.0D+0, 0.0D+0 ) ) +* .. +* .. Local Scalars .. + LOGICAL LQUERY + INTEGER IASCL, IBSCL, IE, IL, ITAU, ITAUP, ITAUQ, + $ LDWORK, LIWORK, LRWORK, MAXMN, MAXWRK, MINMN, + $ MINWRK, MM, MNTHR, NLVL, NRWORK, NWORK, SMLSIZ + DOUBLE PRECISION ANRM, BIGNUM, BNRM, EPS, SFMIN, SMLNUM +* .. +* .. External Subroutines .. + EXTERNAL DLABAD, DLASCL, DLASET, XERBLA, ZGEBRD, ZGELQF, + $ ZGEQRF, ZLACPY, ZLALSD, ZLASCL, ZLASET, ZUNMBR, + $ ZUNMLQ, ZUNMQR +* .. +* .. External Functions .. + INTEGER ILAENV + DOUBLE PRECISION DLAMCH, ZLANGE + EXTERNAL ILAENV, DLAMCH, ZLANGE +* .. +* .. Intrinsic Functions .. + INTRINSIC INT, LOG, MAX, MIN, DBLE +* .. +* .. Executable Statements .. +* +* Test the input arguments. +* + INFO = 0 + MINMN = MIN( M, N ) + MAXMN = MAX( M, N ) + LQUERY = ( LWORK.EQ.-1 ) + IF( M.LT.0 ) THEN + INFO = -1 + ELSE IF( N.LT.0 ) THEN + INFO = -2 + ELSE IF( NRHS.LT.0 ) THEN + INFO = -3 + ELSE IF( LDA.LT.MAX( 1, M ) ) THEN + INFO = -5 + ELSE IF( LDB.LT.MAX( 1, MAXMN ) ) THEN + INFO = -7 + END IF +* +* Compute workspace. +* (Note: Comments in the code beginning "Workspace:" describe the +* minimal amount of workspace needed at that point in the code, +* as well as the preferred amount for good performance. +* NB refers to the optimal block size for the immediately +* following subroutine, as returned by ILAENV.) +* + IF( INFO.EQ.0 ) THEN + MINWRK = 1 + MAXWRK = 1 + LIWORK = 1 + LRWORK = 1 + IF( MINMN.GT.0 ) THEN + SMLSIZ = ILAENV( 9, 'ZGELSD', ' ', 0, 0, 0, 0 ) + MNTHR = ILAENV( 6, 'ZGELSD', ' ', M, N, NRHS, -1 ) + NLVL = MAX( INT( LOG( DBLE( MINMN ) / DBLE( SMLSIZ + 1 ) ) / + $ LOG( TWO ) ) + 1, 0 ) + LIWORK = 3*MINMN*NLVL + 11*MINMN + MM = M + IF( M.GE.N .AND. M.GE.MNTHR ) THEN +* +* Path 1a - overdetermined, with many more rows than +* columns. +* + MM = N + MAXWRK = MAX( MAXWRK, N*ILAENV( 1, 'ZGEQRF', ' ', M, N, + $ -1, -1 ) ) + MAXWRK = MAX( MAXWRK, NRHS*ILAENV( 1, 'ZUNMQR', 'LC', M, + $ NRHS, N, -1 ) ) + END IF + IF( M.GE.N ) THEN +* +* Path 1 - overdetermined or exactly determined. +* + LRWORK = 10*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS + + $ ( SMLSIZ + 1 )**2 + MAXWRK = MAX( MAXWRK, 2*N + ( MM + N )*ILAENV( 1, + $ 'ZGEBRD', ' ', MM, N, -1, -1 ) ) + MAXWRK = MAX( MAXWRK, 2*N + NRHS*ILAENV( 1, 'ZUNMBR', + $ 'QLC', MM, NRHS, N, -1 ) ) + MAXWRK = MAX( MAXWRK, 2*N + ( N - 1 )*ILAENV( 1, + $ 'ZUNMBR', 'PLN', N, NRHS, N, -1 ) ) + MAXWRK = MAX( MAXWRK, 2*N + N*NRHS ) + MINWRK = MAX( 2*N + MM, 2*N + N*NRHS ) + END IF + IF( N.GT.M ) THEN + LRWORK = 10*M + 2*M*SMLSIZ + 8*M*NLVL + 3*SMLSIZ*NRHS + + $ ( SMLSIZ + 1 )**2 + IF( N.GE.MNTHR ) THEN +* +* Path 2a - underdetermined, with many more columns +* than rows. +* + MAXWRK = M + M*ILAENV( 1, 'ZGELQF', ' ', M, N, -1, + $ -1 ) + MAXWRK = MAX( MAXWRK, M*M + 4*M + 2*M*ILAENV( 1, + $ 'ZGEBRD', ' ', M, M, -1, -1 ) ) + MAXWRK = MAX( MAXWRK, M*M + 4*M + NRHS*ILAENV( 1, + $ 'ZUNMBR', 'QLC', M, NRHS, M, -1 ) ) + MAXWRK = MAX( MAXWRK, M*M + 4*M + ( M - 1 )*ILAENV( 1, + $ 'ZUNMLQ', 'LC', N, NRHS, M, -1 ) ) + IF( NRHS.GT.1 ) THEN + MAXWRK = MAX( MAXWRK, M*M + M + M*NRHS ) + ELSE + MAXWRK = MAX( MAXWRK, M*M + 2*M ) + END IF + MAXWRK = MAX( MAXWRK, M*M + 4*M + M*NRHS ) + ELSE +* +* Path 2 - underdetermined. +* + MAXWRK = 2*M + ( N + M )*ILAENV( 1, 'ZGEBRD', ' ', M, + $ N, -1, -1 ) + MAXWRK = MAX( MAXWRK, 2*M + NRHS*ILAENV( 1, 'ZUNMBR', + $ 'QLC', M, NRHS, M, -1 ) ) + MAXWRK = MAX( MAXWRK, 2*M + M*ILAENV( 1, 'ZUNMBR', + $ 'PLN', N, NRHS, M, -1 ) ) + MAXWRK = MAX( MAXWRK, 2*M + M*NRHS ) + END IF + MINWRK = MAX( 2*M + N, 2*M + M*NRHS ) + END IF + END IF + MINWRK = MIN( MINWRK, MAXWRK ) + WORK( 1 ) = MAXWRK + IWORK( 1 ) = LIWORK + RWORK( 1 ) = LRWORK +* + IF( LWORK.LT.MINWRK .AND. .NOT.LQUERY ) THEN + INFO = -12 + END IF + END IF +* + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'ZGELSD', -INFO ) + RETURN + ELSE IF( LQUERY ) THEN + RETURN + END IF +* +* Quick return if possible. +* + IF( M.EQ.0 .OR. N.EQ.0 ) THEN + RANK = 0 + RETURN + END IF +* +* Get machine parameters. +* + EPS = DLAMCH( 'P' ) + SFMIN = DLAMCH( 'S' ) + SMLNUM = SFMIN / EPS + BIGNUM = ONE / SMLNUM + CALL DLABAD( SMLNUM, BIGNUM ) +* +* Scale A if max entry outside range [SMLNUM,BIGNUM]. +* + ANRM = ZLANGE( 'M', M, N, A, LDA, RWORK ) + IASCL = 0 + IF( ANRM.GT.ZERO .AND. ANRM.LT.SMLNUM ) THEN +* +* Scale matrix norm up to SMLNUM +* + CALL ZLASCL( 'G', 0, 0, ANRM, SMLNUM, M, N, A, LDA, INFO ) + IASCL = 1 + ELSE IF( ANRM.GT.BIGNUM ) THEN +* +* Scale matrix norm down to BIGNUM. +* + CALL ZLASCL( 'G', 0, 0, ANRM, BIGNUM, M, N, A, LDA, INFO ) + IASCL = 2 + ELSE IF( ANRM.EQ.ZERO ) THEN +* +* Matrix all zero. Return zero solution. +* + CALL ZLASET( 'F', MAX( M, N ), NRHS, CZERO, CZERO, B, LDB ) + CALL DLASET( 'F', MINMN, 1, ZERO, ZERO, S, 1 ) + RANK = 0 + GO TO 10 + END IF +* +* Scale B if max entry outside range [SMLNUM,BIGNUM]. +* + BNRM = ZLANGE( 'M', M, NRHS, B, LDB, RWORK ) + IBSCL = 0 + IF( BNRM.GT.ZERO .AND. BNRM.LT.SMLNUM ) THEN +* +* Scale matrix norm up to SMLNUM. +* + CALL ZLASCL( 'G', 0, 0, BNRM, SMLNUM, M, NRHS, B, LDB, INFO ) + IBSCL = 1 + ELSE IF( BNRM.GT.BIGNUM ) THEN +* +* Scale matrix norm down to BIGNUM. +* + CALL ZLASCL( 'G', 0, 0, BNRM, BIGNUM, M, NRHS, B, LDB, INFO ) + IBSCL = 2 + END IF +* +* If M < N make sure B(M+1:N,:) = 0 +* + IF( M.LT.N ) + $ CALL ZLASET( 'F', N-M, NRHS, CZERO, CZERO, B( M+1, 1 ), LDB ) +* +* Overdetermined case. +* + IF( M.GE.N ) THEN +* +* Path 1 - overdetermined or exactly determined. +* + MM = M + IF( M.GE.MNTHR ) THEN +* +* Path 1a - overdetermined, with many more rows than columns +* + MM = N + ITAU = 1 + NWORK = ITAU + N +* +* Compute A=Q*R. +* (RWorkspace: need N) +* (CWorkspace: need N, prefer N*NB) +* + CALL ZGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Multiply B by transpose(Q). +* (RWorkspace: need N) +* (CWorkspace: need NRHS, prefer NRHS*NB) +* + CALL ZUNMQR( 'L', 'C', M, NRHS, N, A, LDA, WORK( ITAU ), B, + $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* +* Zero out below R. +* + IF( N.GT.1 ) THEN + CALL ZLASET( 'L', N-1, N-1, CZERO, CZERO, A( 2, 1 ), + $ LDA ) + END IF + END IF +* + ITAUQ = 1 + ITAUP = ITAUQ + N + NWORK = ITAUP + N + IE = 1 + NRWORK = IE + N +* +* Bidiagonalize R in A. +* (RWorkspace: need N) +* (CWorkspace: need 2*N+MM, prefer 2*N+(MM+N)*NB) +* + CALL ZGEBRD( MM, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), + $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, + $ INFO ) +* +* Multiply B by transpose of left bidiagonalizing vectors of R. +* (CWorkspace: need 2*N+NRHS, prefer 2*N+NRHS*NB) +* + CALL ZUNMBR( 'Q', 'L', 'C', MM, NRHS, N, A, LDA, WORK( ITAUQ ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* +* Solve the bidiagonal least squares problem. +* + CALL ZLALSD( 'U', SMLSIZ, N, NRHS, S, RWORK( IE ), B, LDB, + $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ), + $ IWORK, INFO ) + IF( INFO.NE.0 ) THEN + GO TO 10 + END IF +* +* Multiply B by right bidiagonalizing vectors of R. +* + CALL ZUNMBR( 'P', 'L', 'N', N, NRHS, N, A, LDA, WORK( ITAUP ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* + ELSE IF( N.GE.MNTHR .AND. LWORK.GE.4*M+M*M+ + $ MAX( M, 2*M-4, NRHS, N-3*M ) ) THEN +* +* Path 2a - underdetermined, with many more columns than rows +* and sufficient workspace for an efficient algorithm. +* + LDWORK = M + IF( LWORK.GE.MAX( 4*M+M*LDA+MAX( M, 2*M-4, NRHS, N-3*M ), + $ M*LDA+M+M*NRHS ) )LDWORK = LDA + ITAU = 1 + NWORK = M + 1 +* +* Compute A=L*Q. +* (CWorkspace: need 2*M, prefer M+M*NB) +* + CALL ZGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) + IL = NWORK +* +* Copy L to WORK(IL), zeroing out above its diagonal. +* + CALL ZLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWORK ) + CALL ZLASET( 'U', M-1, M-1, CZERO, CZERO, WORK( IL+LDWORK ), + $ LDWORK ) + ITAUQ = IL + LDWORK*M + ITAUP = ITAUQ + M + NWORK = ITAUP + M + IE = 1 + NRWORK = IE + M +* +* Bidiagonalize L in WORK(IL). +* (RWorkspace: need M) +* (CWorkspace: need M*M+4*M, prefer M*M+4*M+2*M*NB) +* + CALL ZGEBRD( M, M, WORK( IL ), LDWORK, S, RWORK( IE ), + $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Multiply B by transpose of left bidiagonalizing vectors of L. +* (CWorkspace: need M*M+4*M+NRHS, prefer M*M+4*M+NRHS*NB) +* + CALL ZUNMBR( 'Q', 'L', 'C', M, NRHS, M, WORK( IL ), LDWORK, + $ WORK( ITAUQ ), B, LDB, WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Solve the bidiagonal least squares problem. +* + CALL ZLALSD( 'U', SMLSIZ, M, NRHS, S, RWORK( IE ), B, LDB, + $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ), + $ IWORK, INFO ) + IF( INFO.NE.0 ) THEN + GO TO 10 + END IF +* +* Multiply B by right bidiagonalizing vectors of L. +* + CALL ZUNMBR( 'P', 'L', 'N', M, NRHS, M, WORK( IL ), LDWORK, + $ WORK( ITAUP ), B, LDB, WORK( NWORK ), + $ LWORK-NWORK+1, INFO ) +* +* Zero out below first M rows of B. +* + CALL ZLASET( 'F', N-M, NRHS, CZERO, CZERO, B( M+1, 1 ), LDB ) + NWORK = ITAU + M +* +* Multiply transpose(Q) by B. +* (CWorkspace: need NRHS, prefer NRHS*NB) +* + CALL ZUNMLQ( 'L', 'C', N, NRHS, M, A, LDA, WORK( ITAU ), B, + $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* + ELSE +* +* Path 2 - remaining underdetermined cases. +* + ITAUQ = 1 + ITAUP = ITAUQ + M + NWORK = ITAUP + M + IE = 1 + NRWORK = IE + M +* +* Bidiagonalize A. +* (RWorkspace: need M) +* (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB) +* + CALL ZGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), + $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, + $ INFO ) +* +* Multiply B by transpose of left bidiagonalizing vectors. +* (CWorkspace: need 2*M+NRHS, prefer 2*M+NRHS*NB) +* + CALL ZUNMBR( 'Q', 'L', 'C', M, NRHS, N, A, LDA, WORK( ITAUQ ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* +* Solve the bidiagonal least squares problem. +* + CALL ZLALSD( 'L', SMLSIZ, M, NRHS, S, RWORK( IE ), B, LDB, + $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ), + $ IWORK, INFO ) + IF( INFO.NE.0 ) THEN + GO TO 10 + END IF +* +* Multiply B by right bidiagonalizing vectors of A. +* + CALL ZUNMBR( 'P', 'L', 'N', N, NRHS, M, A, LDA, WORK( ITAUP ), + $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO ) +* + END IF +* +* Undo scaling. +* + IF( IASCL.EQ.1 ) THEN + CALL ZLASCL( 'G', 0, 0, ANRM, SMLNUM, N, NRHS, B, LDB, INFO ) + CALL DLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN, 1, S, MINMN, + $ INFO ) + ELSE IF( IASCL.EQ.2 ) THEN + CALL ZLASCL( 'G', 0, 0, ANRM, BIGNUM, N, NRHS, B, LDB, INFO ) + CALL DLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN, 1, S, MINMN, + $ INFO ) + END IF + IF( IBSCL.EQ.1 ) THEN + CALL ZLASCL( 'G', 0, 0, SMLNUM, BNRM, N, NRHS, B, LDB, INFO ) + ELSE IF( IBSCL.EQ.2 ) THEN + CALL ZLASCL( 'G', 0, 0, BIGNUM, BNRM, N, NRHS, B, LDB, INFO ) + END IF +* + 10 CONTINUE + WORK( 1 ) = MAXWRK + IWORK( 1 ) = LIWORK + RWORK( 1 ) = LRWORK + RETURN +* +* End of ZGELSD +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/zlals0.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/zlals0.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,433 @@ + SUBROUTINE ZLALS0( ICOMPQ, NL, NR, SQRE, NRHS, B, LDB, BX, LDBX, + $ PERM, GIVPTR, GIVCOL, LDGCOL, GIVNUM, LDGNUM, + $ POLES, DIFL, DIFR, Z, K, C, S, RWORK, INFO ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER GIVPTR, ICOMPQ, INFO, K, LDB, LDBX, LDGCOL, + $ LDGNUM, NL, NR, NRHS, SQRE + DOUBLE PRECISION C, S +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), PERM( * ) + DOUBLE PRECISION DIFL( * ), DIFR( LDGNUM, * ), + $ GIVNUM( LDGNUM, * ), POLES( LDGNUM, * ), + $ RWORK( * ), Z( * ) + COMPLEX*16 B( LDB, * ), BX( LDBX, * ) +* .. +* +* Purpose +* ======= +* +* ZLALS0 applies back the multiplying factors of either the left or the +* right singular vector matrix of a diagonal matrix appended by a row +* to the right hand side matrix B in solving the least squares problem +* using the divide-and-conquer SVD approach. +* +* For the left singular vector matrix, three types of orthogonal +* matrices are involved: +* +* (1L) Givens rotations: the number of such rotations is GIVPTR; the +* pairs of columns/rows they were applied to are stored in GIVCOL; +* and the C- and S-values of these rotations are stored in GIVNUM. +* +* (2L) Permutation. The (NL+1)-st row of B is to be moved to the first +* row, and for J=2:N, PERM(J)-th row of B is to be moved to the +* J-th row. +* +* (3L) The left singular vector matrix of the remaining matrix. +* +* For the right singular vector matrix, four types of orthogonal +* matrices are involved: +* +* (1R) The right singular vector matrix of the remaining matrix. +* +* (2R) If SQRE = 1, one extra Givens rotation to generate the right +* null space. +* +* (3R) The inverse transformation of (2L). +* +* (4R) The inverse transformation of (1L). +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether singular vectors are to be computed in +* factored form: +* = 0: Left singular vector matrix. +* = 1: Right singular vector matrix. +* +* NL (input) INTEGER +* The row dimension of the upper block. NL >= 1. +* +* NR (input) INTEGER +* The row dimension of the lower block. NR >= 1. +* +* SQRE (input) INTEGER +* = 0: the lower block is an NR-by-NR square matrix. +* = 1: the lower block is an NR-by-(NR+1) rectangular matrix. +* +* The bidiagonal matrix has row dimension N = NL + NR + 1, +* and column dimension M = N + SQRE. +* +* NRHS (input) INTEGER +* The number of columns of B and BX. NRHS must be at least 1. +* +* B (input/output) COMPLEX*16 array, dimension ( LDB, NRHS ) +* On input, B contains the right hand sides of the least +* squares problem in rows 1 through M. On output, B contains +* the solution X in rows 1 through N. +* +* LDB (input) INTEGER +* The leading dimension of B. LDB must be at least +* max(1,MAX( M, N ) ). +* +* BX (workspace) COMPLEX*16 array, dimension ( LDBX, NRHS ) +* +* LDBX (input) INTEGER +* The leading dimension of BX. +* +* PERM (input) INTEGER array, dimension ( N ) +* The permutations (from deflation and sorting) applied +* to the two blocks. +* +* GIVPTR (input) INTEGER +* The number of Givens rotations which took place in this +* subproblem. +* +* GIVCOL (input) INTEGER array, dimension ( LDGCOL, 2 ) +* Each pair of numbers indicates a pair of rows/columns +* involved in a Givens rotation. +* +* LDGCOL (input) INTEGER +* The leading dimension of GIVCOL, must be at least N. +* +* GIVNUM (input) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* Each number indicates the C or S value used in the +* corresponding Givens rotation. +* +* LDGNUM (input) INTEGER +* The leading dimension of arrays DIFR, POLES and +* GIVNUM, must be at least K. +* +* POLES (input) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ) +* On entry, POLES(1:K, 1) contains the new singular +* values obtained from solving the secular equation, and +* POLES(1:K, 2) is an array containing the poles in the secular +* equation. +* +* DIFL (input) DOUBLE PRECISION array, dimension ( K ). +* On entry, DIFL(I) is the distance between I-th updated +* (undeflated) singular value and the I-th (undeflated) old +* singular value. +* +* DIFR (input) DOUBLE PRECISION array, dimension ( LDGNUM, 2 ). +* On entry, DIFR(I, 1) contains the distances between I-th +* updated (undeflated) singular value and the I+1-th +* (undeflated) old singular value. And DIFR(I, 2) is the +* normalizing factor for the I-th right singular vector. +* +* Z (input) DOUBLE PRECISION array, dimension ( K ) +* Contain the components of the deflation-adjusted updating row +* vector. +* +* K (input) INTEGER +* Contains the dimension of the non-deflated matrix, +* This is the order of the related secular equation. 1 <= K <=N. +* +* C (input) DOUBLE PRECISION +* C contains garbage if SQRE =0 and the C-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* S (input) DOUBLE PRECISION +* S contains garbage if SQRE =0 and the S-value of a Givens +* rotation related to the right null space if SQRE = 1. +* +* RWORK (workspace) DOUBLE PRECISION array, dimension +* ( K*(1+NRHS) + 2*NRHS ) +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ONE, ZERO, NEGONE + PARAMETER ( ONE = 1.0D0, ZERO = 0.0D0, NEGONE = -1.0D0 ) +* .. +* .. Local Scalars .. + INTEGER I, J, JCOL, JROW, M, N, NLP1 + DOUBLE PRECISION DIFLJ, DIFRJ, DJ, DSIGJ, DSIGJP, TEMP +* .. +* .. External Subroutines .. + EXTERNAL DGEMV, XERBLA, ZCOPY, ZDROT, ZDSCAL, ZLACPY, + $ ZLASCL +* .. +* .. External Functions .. + DOUBLE PRECISION DLAMC3, DNRM2 + EXTERNAL DLAMC3, DNRM2 +* .. +* .. Intrinsic Functions .. + INTRINSIC DBLE, DCMPLX, DIMAG, MAX +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( NL.LT.1 ) THEN + INFO = -2 + ELSE IF( NR.LT.1 ) THEN + INFO = -3 + ELSE IF( ( SQRE.LT.0 ) .OR. ( SQRE.GT.1 ) ) THEN + INFO = -4 + END IF +* + N = NL + NR + 1 +* + IF( NRHS.LT.1 ) THEN + INFO = -5 + ELSE IF( LDB.LT.N ) THEN + INFO = -7 + ELSE IF( LDBX.LT.N ) THEN + INFO = -9 + ELSE IF( GIVPTR.LT.0 ) THEN + INFO = -11 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -13 + ELSE IF( LDGNUM.LT.N ) THEN + INFO = -15 + ELSE IF( K.LT.1 ) THEN + INFO = -20 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'ZLALS0', -INFO ) + RETURN + END IF +* + M = N + SQRE + NLP1 = NL + 1 +* + IF( ICOMPQ.EQ.0 ) THEN +* +* Apply back orthogonal transformations from the left. +* +* Step (1L): apply back the Givens rotations performed. +* + DO 10 I = 1, GIVPTR + CALL ZDROT( NRHS, B( GIVCOL( I, 2 ), 1 ), LDB, + $ B( GIVCOL( I, 1 ), 1 ), LDB, GIVNUM( I, 2 ), + $ GIVNUM( I, 1 ) ) + 10 CONTINUE +* +* Step (2L): permute rows of B. +* + CALL ZCOPY( NRHS, B( NLP1, 1 ), LDB, BX( 1, 1 ), LDBX ) + DO 20 I = 2, N + CALL ZCOPY( NRHS, B( PERM( I ), 1 ), LDB, BX( I, 1 ), LDBX ) + 20 CONTINUE +* +* Step (3L): apply the inverse of the left singular vector +* matrix to BX. +* + IF( K.EQ.1 ) THEN + CALL ZCOPY( NRHS, BX, LDBX, B, LDB ) + IF( Z( 1 ).LT.ZERO ) THEN + CALL ZDSCAL( NRHS, NEGONE, B, LDB ) + END IF + ELSE + DO 100 J = 1, K + DIFLJ = DIFL( J ) + DJ = POLES( J, 1 ) + DSIGJ = -POLES( J, 2 ) + IF( J.LT.K ) THEN + DIFRJ = -DIFR( J, 1 ) + DSIGJP = -POLES( J+1, 2 ) + END IF + IF( ( Z( J ).EQ.ZERO ) .OR. ( POLES( J, 2 ).EQ.ZERO ) ) + $ THEN + RWORK( J ) = ZERO + ELSE + RWORK( J ) = -POLES( J, 2 )*Z( J ) / DIFLJ / + $ ( POLES( J, 2 )+DJ ) + END IF + DO 30 I = 1, J - 1 + IF( ( Z( I ).EQ.ZERO ) .OR. + $ ( POLES( I, 2 ).EQ.ZERO ) ) THEN + RWORK( I ) = ZERO + ELSE + RWORK( I ) = POLES( I, 2 )*Z( I ) / + $ ( DLAMC3( POLES( I, 2 ), DSIGJ )- + $ DIFLJ ) / ( POLES( I, 2 )+DJ ) + END IF + 30 CONTINUE + DO 40 I = J + 1, K + IF( ( Z( I ).EQ.ZERO ) .OR. + $ ( POLES( I, 2 ).EQ.ZERO ) ) THEN + RWORK( I ) = ZERO + ELSE + RWORK( I ) = POLES( I, 2 )*Z( I ) / + $ ( DLAMC3( POLES( I, 2 ), DSIGJP )+ + $ DIFRJ ) / ( POLES( I, 2 )+DJ ) + END IF + 40 CONTINUE + RWORK( 1 ) = NEGONE + TEMP = DNRM2( K, RWORK, 1 ) +* +* Since B and BX are complex, the following call to DGEMV +* is performed in two steps (real and imaginary parts). +* +* CALL DGEMV( 'T', K, NRHS, ONE, BX, LDBX, WORK, 1, ZERO, +* $ B( J, 1 ), LDB ) +* + I = K + NRHS*2 + DO 60 JCOL = 1, NRHS + DO 50 JROW = 1, K + I = I + 1 + RWORK( I ) = DBLE( BX( JROW, JCOL ) ) + 50 CONTINUE + 60 CONTINUE + CALL DGEMV( 'T', K, NRHS, ONE, RWORK( 1+K+NRHS*2 ), K, + $ RWORK( 1 ), 1, ZERO, RWORK( 1+K ), 1 ) + I = K + NRHS*2 + DO 80 JCOL = 1, NRHS + DO 70 JROW = 1, K + I = I + 1 + RWORK( I ) = DIMAG( BX( JROW, JCOL ) ) + 70 CONTINUE + 80 CONTINUE + CALL DGEMV( 'T', K, NRHS, ONE, RWORK( 1+K+NRHS*2 ), K, + $ RWORK( 1 ), 1, ZERO, RWORK( 1+K+NRHS ), 1 ) + DO 90 JCOL = 1, NRHS + B( J, JCOL ) = DCMPLX( RWORK( JCOL+K ), + $ RWORK( JCOL+K+NRHS ) ) + 90 CONTINUE + CALL ZLASCL( 'G', 0, 0, TEMP, ONE, 1, NRHS, B( J, 1 ), + $ LDB, INFO ) + 100 CONTINUE + END IF +* +* Move the deflated rows of BX to B also. +* + IF( K.LT.MAX( M, N ) ) + $ CALL ZLACPY( 'A', N-K, NRHS, BX( K+1, 1 ), LDBX, + $ B( K+1, 1 ), LDB ) + ELSE +* +* Apply back the right orthogonal transformations. +* +* Step (1R): apply back the new right singular vector matrix +* to B. +* + IF( K.EQ.1 ) THEN + CALL ZCOPY( NRHS, B, LDB, BX, LDBX ) + ELSE + DO 180 J = 1, K + DSIGJ = POLES( J, 2 ) + IF( Z( J ).EQ.ZERO ) THEN + RWORK( J ) = ZERO + ELSE + RWORK( J ) = -Z( J ) / DIFL( J ) / + $ ( DSIGJ+POLES( J, 1 ) ) / DIFR( J, 2 ) + END IF + DO 110 I = 1, J - 1 + IF( Z( J ).EQ.ZERO ) THEN + RWORK( I ) = ZERO + ELSE + RWORK( I ) = Z( J ) / ( DLAMC3( DSIGJ, -POLES( I+1, + $ 2 ) )-DIFR( I, 1 ) ) / + $ ( DSIGJ+POLES( I, 1 ) ) / DIFR( I, 2 ) + END IF + 110 CONTINUE + DO 120 I = J + 1, K + IF( Z( J ).EQ.ZERO ) THEN + RWORK( I ) = ZERO + ELSE + RWORK( I ) = Z( J ) / ( DLAMC3( DSIGJ, -POLES( I, + $ 2 ) )-DIFL( I ) ) / + $ ( DSIGJ+POLES( I, 1 ) ) / DIFR( I, 2 ) + END IF + 120 CONTINUE +* +* Since B and BX are complex, the following call to DGEMV +* is performed in two steps (real and imaginary parts). +* +* CALL DGEMV( 'T', K, NRHS, ONE, B, LDB, WORK, 1, ZERO, +* $ BX( J, 1 ), LDBX ) +* + I = K + NRHS*2 + DO 140 JCOL = 1, NRHS + DO 130 JROW = 1, K + I = I + 1 + RWORK( I ) = DBLE( B( JROW, JCOL ) ) + 130 CONTINUE + 140 CONTINUE + CALL DGEMV( 'T', K, NRHS, ONE, RWORK( 1+K+NRHS*2 ), K, + $ RWORK( 1 ), 1, ZERO, RWORK( 1+K ), 1 ) + I = K + NRHS*2 + DO 160 JCOL = 1, NRHS + DO 150 JROW = 1, K + I = I + 1 + RWORK( I ) = DIMAG( B( JROW, JCOL ) ) + 150 CONTINUE + 160 CONTINUE + CALL DGEMV( 'T', K, NRHS, ONE, RWORK( 1+K+NRHS*2 ), K, + $ RWORK( 1 ), 1, ZERO, RWORK( 1+K+NRHS ), 1 ) + DO 170 JCOL = 1, NRHS + BX( J, JCOL ) = DCMPLX( RWORK( JCOL+K ), + $ RWORK( JCOL+K+NRHS ) ) + 170 CONTINUE + 180 CONTINUE + END IF +* +* Step (2R): if SQRE = 1, apply back the rotation that is +* related to the right null space of the subproblem. +* + IF( SQRE.EQ.1 ) THEN + CALL ZCOPY( NRHS, B( M, 1 ), LDB, BX( M, 1 ), LDBX ) + CALL ZDROT( NRHS, BX( 1, 1 ), LDBX, BX( M, 1 ), LDBX, C, S ) + END IF + IF( K.LT.MAX( M, N ) ) + $ CALL ZLACPY( 'A', N-K, NRHS, B( K+1, 1 ), LDB, BX( K+1, 1 ), + $ LDBX ) +* +* Step (3R): permute rows of B. +* + CALL ZCOPY( NRHS, BX( 1, 1 ), LDBX, B( NLP1, 1 ), LDB ) + IF( SQRE.EQ.1 ) THEN + CALL ZCOPY( NRHS, BX( M, 1 ), LDBX, B( M, 1 ), LDB ) + END IF + DO 190 I = 2, N + CALL ZCOPY( NRHS, BX( I, 1 ), LDBX, B( PERM( I ), 1 ), LDB ) + 190 CONTINUE +* +* Step (4R): apply back the Givens rotations performed. +* + DO 200 I = GIVPTR, 1, -1 + CALL ZDROT( NRHS, B( GIVCOL( I, 2 ), 1 ), LDB, + $ B( GIVCOL( I, 1 ), 1 ), LDB, GIVNUM( I, 2 ), + $ -GIVNUM( I, 1 ) ) + 200 CONTINUE + END IF +* + RETURN +* +* End of ZLALS0 +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/zlalsa.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/zlalsa.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,503 @@ + SUBROUTINE ZLALSA( ICOMPQ, SMLSIZ, N, NRHS, B, LDB, BX, LDBX, U, + $ LDU, VT, K, DIFL, DIFR, Z, POLES, GIVPTR, + $ GIVCOL, LDGCOL, PERM, GIVNUM, C, S, RWORK, + $ IWORK, INFO ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + INTEGER ICOMPQ, INFO, LDB, LDBX, LDGCOL, LDU, N, NRHS, + $ SMLSIZ +* .. +* .. Array Arguments .. + INTEGER GIVCOL( LDGCOL, * ), GIVPTR( * ), IWORK( * ), + $ K( * ), PERM( LDGCOL, * ) + DOUBLE PRECISION C( * ), DIFL( LDU, * ), DIFR( LDU, * ), + $ GIVNUM( LDU, * ), POLES( LDU, * ), RWORK( * ), + $ S( * ), U( LDU, * ), VT( LDU, * ), Z( LDU, * ) + COMPLEX*16 B( LDB, * ), BX( LDBX, * ) +* .. +* +* Purpose +* ======= +* +* ZLALSA is an itermediate step in solving the least squares problem +* by computing the SVD of the coefficient matrix in compact form (The +* singular vectors are computed as products of simple orthorgonal +* matrices.). +* +* If ICOMPQ = 0, ZLALSA applies the inverse of the left singular vector +* matrix of an upper bidiagonal matrix to the right hand side; and if +* ICOMPQ = 1, ZLALSA applies the right singular vector matrix to the +* right hand side. The singular vector matrices were generated in +* compact form by ZLALSA. +* +* Arguments +* ========= +* +* ICOMPQ (input) INTEGER +* Specifies whether the left or the right singular vector +* matrix is involved. +* = 0: Left singular vector matrix +* = 1: Right singular vector matrix +* +* SMLSIZ (input) INTEGER +* The maximum size of the subproblems at the bottom of the +* computation tree. +* +* N (input) INTEGER +* The row and column dimensions of the upper bidiagonal matrix. +* +* NRHS (input) INTEGER +* The number of columns of B and BX. NRHS must be at least 1. +* +* B (input/output) COMPLEX*16 array, dimension ( LDB, NRHS ) +* On input, B contains the right hand sides of the least +* squares problem in rows 1 through M. +* On output, B contains the solution X in rows 1 through N. +* +* LDB (input) INTEGER +* The leading dimension of B in the calling subprogram. +* LDB must be at least max(1,MAX( M, N ) ). +* +* BX (output) COMPLEX*16 array, dimension ( LDBX, NRHS ) +* On exit, the result of applying the left or right singular +* vector matrix to B. +* +* LDBX (input) INTEGER +* The leading dimension of BX. +* +* U (input) DOUBLE PRECISION array, dimension ( LDU, SMLSIZ ). +* On entry, U contains the left singular vector matrices of all +* subproblems at the bottom level. +* +* LDU (input) INTEGER, LDU = > N. +* The leading dimension of arrays U, VT, DIFL, DIFR, +* POLES, GIVNUM, and Z. +* +* VT (input) DOUBLE PRECISION array, dimension ( LDU, SMLSIZ+1 ). +* On entry, VT' contains the right singular vector matrices of +* all subproblems at the bottom level. +* +* K (input) INTEGER array, dimension ( N ). +* +* DIFL (input) DOUBLE PRECISION array, dimension ( LDU, NLVL ). +* where NLVL = INT(log_2 (N/(SMLSIZ+1))) + 1. +* +* DIFR (input) DOUBLE PRECISION array, dimension ( LDU, 2 * NLVL ). +* On entry, DIFL(*, I) and DIFR(*, 2 * I -1) record +* distances between singular values on the I-th level and +* singular values on the (I -1)-th level, and DIFR(*, 2 * I) +* record the normalizing factors of the right singular vectors +* matrices of subproblems on I-th level. +* +* Z (input) DOUBLE PRECISION array, dimension ( LDU, NLVL ). +* On entry, Z(1, I) contains the components of the deflation- +* adjusted updating row vector for subproblems on the I-th +* level. +* +* POLES (input) DOUBLE PRECISION array, dimension ( LDU, 2 * NLVL ). +* On entry, POLES(*, 2 * I -1: 2 * I) contains the new and old +* singular values involved in the secular equations on the I-th +* level. +* +* GIVPTR (input) INTEGER array, dimension ( N ). +* On entry, GIVPTR( I ) records the number of Givens +* rotations performed on the I-th problem on the computation +* tree. +* +* GIVCOL (input) INTEGER array, dimension ( LDGCOL, 2 * NLVL ). +* On entry, for each I, GIVCOL(*, 2 * I - 1: 2 * I) records the +* locations of Givens rotations performed on the I-th level on +* the computation tree. +* +* LDGCOL (input) INTEGER, LDGCOL = > N. +* The leading dimension of arrays GIVCOL and PERM. +* +* PERM (input) INTEGER array, dimension ( LDGCOL, NLVL ). +* On entry, PERM(*, I) records permutations done on the I-th +* level of the computation tree. +* +* GIVNUM (input) DOUBLE PRECISION array, dimension ( LDU, 2 * NLVL ). +* On entry, GIVNUM(*, 2 *I -1 : 2 * I) records the C- and S- +* values of Givens rotations performed on the I-th level on the +* computation tree. +* +* C (input) DOUBLE PRECISION array, dimension ( N ). +* On entry, if the I-th subproblem is not square, +* C( I ) contains the C-value of a Givens rotation related to +* the right null space of the I-th subproblem. +* +* S (input) DOUBLE PRECISION array, dimension ( N ). +* On entry, if the I-th subproblem is not square, +* S( I ) contains the S-value of a Givens rotation related to +* the right null space of the I-th subproblem. +* +* RWORK (workspace) DOUBLE PRECISION array, dimension at least +* max ( N, (SMLSZ+1)*NRHS*3 ). +* +* IWORK (workspace) INTEGER array. +* The dimension must be at least 3 * N +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) +* .. +* .. Local Scalars .. + INTEGER I, I1, IC, IM1, INODE, J, JCOL, JIMAG, JREAL, + $ JROW, LF, LL, LVL, LVL2, ND, NDB1, NDIML, + $ NDIMR, NL, NLF, NLP1, NLVL, NR, NRF, NRP1, SQRE +* .. +* .. External Subroutines .. + EXTERNAL DGEMM, DLASDT, XERBLA, ZCOPY, ZLALS0 +* .. +* .. Intrinsic Functions .. + INTRINSIC DBLE, DCMPLX, DIMAG +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( ( ICOMPQ.LT.0 ) .OR. ( ICOMPQ.GT.1 ) ) THEN + INFO = -1 + ELSE IF( SMLSIZ.LT.3 ) THEN + INFO = -2 + ELSE IF( N.LT.SMLSIZ ) THEN + INFO = -3 + ELSE IF( NRHS.LT.1 ) THEN + INFO = -4 + ELSE IF( LDB.LT.N ) THEN + INFO = -6 + ELSE IF( LDBX.LT.N ) THEN + INFO = -8 + ELSE IF( LDU.LT.N ) THEN + INFO = -10 + ELSE IF( LDGCOL.LT.N ) THEN + INFO = -19 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'ZLALSA', -INFO ) + RETURN + END IF +* +* Book-keeping and setting up the computation tree. +* + INODE = 1 + NDIML = INODE + N + NDIMR = NDIML + N +* + CALL DLASDT( N, NLVL, ND, IWORK( INODE ), IWORK( NDIML ), + $ IWORK( NDIMR ), SMLSIZ ) +* +* The following code applies back the left singular vector factors. +* For applying back the right singular vector factors, go to 170. +* + IF( ICOMPQ.EQ.1 ) THEN + GO TO 170 + END IF +* +* The nodes on the bottom level of the tree were solved +* by DLASDQ. The corresponding left and right singular vector +* matrices are in explicit form. First apply back the left +* singular vector matrices. +* + NDB1 = ( ND+1 ) / 2 + DO 130 I = NDB1, ND +* +* IC : center row of each node +* NL : number of rows of left subproblem +* NR : number of rows of right subproblem +* NLF: starting row of the left subproblem +* NRF: starting row of the right subproblem +* + I1 = I - 1 + IC = IWORK( INODE+I1 ) + NL = IWORK( NDIML+I1 ) + NR = IWORK( NDIMR+I1 ) + NLF = IC - NL + NRF = IC + 1 +* +* Since B and BX are complex, the following call to DGEMM +* is performed in two steps (real and imaginary parts). +* +* CALL DGEMM( 'T', 'N', NL, NRHS, NL, ONE, U( NLF, 1 ), LDU, +* $ B( NLF, 1 ), LDB, ZERO, BX( NLF, 1 ), LDBX ) +* + J = NL*NRHS*2 + DO 20 JCOL = 1, NRHS + DO 10 JROW = NLF, NLF + NL - 1 + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 10 CONTINUE + 20 CONTINUE + CALL DGEMM( 'T', 'N', NL, NRHS, NL, ONE, U( NLF, 1 ), LDU, + $ RWORK( 1+NL*NRHS*2 ), NL, ZERO, RWORK( 1 ), NL ) + J = NL*NRHS*2 + DO 40 JCOL = 1, NRHS + DO 30 JROW = NLF, NLF + NL - 1 + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 30 CONTINUE + 40 CONTINUE + CALL DGEMM( 'T', 'N', NL, NRHS, NL, ONE, U( NLF, 1 ), LDU, + $ RWORK( 1+NL*NRHS*2 ), NL, ZERO, RWORK( 1+NL*NRHS ), + $ NL ) + JREAL = 0 + JIMAG = NL*NRHS + DO 60 JCOL = 1, NRHS + DO 50 JROW = NLF, NLF + NL - 1 + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + BX( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 50 CONTINUE + 60 CONTINUE +* +* Since B and BX are complex, the following call to DGEMM +* is performed in two steps (real and imaginary parts). +* +* CALL DGEMM( 'T', 'N', NR, NRHS, NR, ONE, U( NRF, 1 ), LDU, +* $ B( NRF, 1 ), LDB, ZERO, BX( NRF, 1 ), LDBX ) +* + J = NR*NRHS*2 + DO 80 JCOL = 1, NRHS + DO 70 JROW = NRF, NRF + NR - 1 + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 70 CONTINUE + 80 CONTINUE + CALL DGEMM( 'T', 'N', NR, NRHS, NR, ONE, U( NRF, 1 ), LDU, + $ RWORK( 1+NR*NRHS*2 ), NR, ZERO, RWORK( 1 ), NR ) + J = NR*NRHS*2 + DO 100 JCOL = 1, NRHS + DO 90 JROW = NRF, NRF + NR - 1 + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 90 CONTINUE + 100 CONTINUE + CALL DGEMM( 'T', 'N', NR, NRHS, NR, ONE, U( NRF, 1 ), LDU, + $ RWORK( 1+NR*NRHS*2 ), NR, ZERO, RWORK( 1+NR*NRHS ), + $ NR ) + JREAL = 0 + JIMAG = NR*NRHS + DO 120 JCOL = 1, NRHS + DO 110 JROW = NRF, NRF + NR - 1 + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + BX( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 110 CONTINUE + 120 CONTINUE +* + 130 CONTINUE +* +* Next copy the rows of B that correspond to unchanged rows +* in the bidiagonal matrix to BX. +* + DO 140 I = 1, ND + IC = IWORK( INODE+I-1 ) + CALL ZCOPY( NRHS, B( IC, 1 ), LDB, BX( IC, 1 ), LDBX ) + 140 CONTINUE +* +* Finally go through the left singular vector matrices of all +* the other subproblems bottom-up on the tree. +* + J = 2**NLVL + SQRE = 0 +* + DO 160 LVL = NLVL, 1, -1 + LVL2 = 2*LVL - 1 +* +* find the first node LF and last node LL on +* the current level LVL +* + IF( LVL.EQ.1 ) THEN + LF = 1 + LL = 1 + ELSE + LF = 2**( LVL-1 ) + LL = 2*LF - 1 + END IF + DO 150 I = LF, LL + IM1 = I - 1 + IC = IWORK( INODE+IM1 ) + NL = IWORK( NDIML+IM1 ) + NR = IWORK( NDIMR+IM1 ) + NLF = IC - NL + NRF = IC + 1 + J = J - 1 + CALL ZLALS0( ICOMPQ, NL, NR, SQRE, NRHS, BX( NLF, 1 ), LDBX, + $ B( NLF, 1 ), LDB, PERM( NLF, LVL ), + $ GIVPTR( J ), GIVCOL( NLF, LVL2 ), LDGCOL, + $ GIVNUM( NLF, LVL2 ), LDU, POLES( NLF, LVL2 ), + $ DIFL( NLF, LVL ), DIFR( NLF, LVL2 ), + $ Z( NLF, LVL ), K( J ), C( J ), S( J ), RWORK, + $ INFO ) + 150 CONTINUE + 160 CONTINUE + GO TO 330 +* +* ICOMPQ = 1: applying back the right singular vector factors. +* + 170 CONTINUE +* +* First now go through the right singular vector matrices of all +* the tree nodes top-down. +* + J = 0 + DO 190 LVL = 1, NLVL + LVL2 = 2*LVL - 1 +* +* Find the first node LF and last node LL on +* the current level LVL. +* + IF( LVL.EQ.1 ) THEN + LF = 1 + LL = 1 + ELSE + LF = 2**( LVL-1 ) + LL = 2*LF - 1 + END IF + DO 180 I = LL, LF, -1 + IM1 = I - 1 + IC = IWORK( INODE+IM1 ) + NL = IWORK( NDIML+IM1 ) + NR = IWORK( NDIMR+IM1 ) + NLF = IC - NL + NRF = IC + 1 + IF( I.EQ.LL ) THEN + SQRE = 0 + ELSE + SQRE = 1 + END IF + J = J + 1 + CALL ZLALS0( ICOMPQ, NL, NR, SQRE, NRHS, B( NLF, 1 ), LDB, + $ BX( NLF, 1 ), LDBX, PERM( NLF, LVL ), + $ GIVPTR( J ), GIVCOL( NLF, LVL2 ), LDGCOL, + $ GIVNUM( NLF, LVL2 ), LDU, POLES( NLF, LVL2 ), + $ DIFL( NLF, LVL ), DIFR( NLF, LVL2 ), + $ Z( NLF, LVL ), K( J ), C( J ), S( J ), RWORK, + $ INFO ) + 180 CONTINUE + 190 CONTINUE +* +* The nodes on the bottom level of the tree were solved +* by DLASDQ. The corresponding right singular vector +* matrices are in explicit form. Apply them back. +* + NDB1 = ( ND+1 ) / 2 + DO 320 I = NDB1, ND + I1 = I - 1 + IC = IWORK( INODE+I1 ) + NL = IWORK( NDIML+I1 ) + NR = IWORK( NDIMR+I1 ) + NLP1 = NL + 1 + IF( I.EQ.ND ) THEN + NRP1 = NR + ELSE + NRP1 = NR + 1 + END IF + NLF = IC - NL + NRF = IC + 1 +* +* Since B and BX are complex, the following call to DGEMM is +* performed in two steps (real and imaginary parts). +* +* CALL DGEMM( 'T', 'N', NLP1, NRHS, NLP1, ONE, VT( NLF, 1 ), LDU, +* $ B( NLF, 1 ), LDB, ZERO, BX( NLF, 1 ), LDBX ) +* + J = NLP1*NRHS*2 + DO 210 JCOL = 1, NRHS + DO 200 JROW = NLF, NLF + NLP1 - 1 + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 200 CONTINUE + 210 CONTINUE + CALL DGEMM( 'T', 'N', NLP1, NRHS, NLP1, ONE, VT( NLF, 1 ), LDU, + $ RWORK( 1+NLP1*NRHS*2 ), NLP1, ZERO, RWORK( 1 ), + $ NLP1 ) + J = NLP1*NRHS*2 + DO 230 JCOL = 1, NRHS + DO 220 JROW = NLF, NLF + NLP1 - 1 + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 220 CONTINUE + 230 CONTINUE + CALL DGEMM( 'T', 'N', NLP1, NRHS, NLP1, ONE, VT( NLF, 1 ), LDU, + $ RWORK( 1+NLP1*NRHS*2 ), NLP1, ZERO, + $ RWORK( 1+NLP1*NRHS ), NLP1 ) + JREAL = 0 + JIMAG = NLP1*NRHS + DO 250 JCOL = 1, NRHS + DO 240 JROW = NLF, NLF + NLP1 - 1 + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + BX( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 240 CONTINUE + 250 CONTINUE +* +* Since B and BX are complex, the following call to DGEMM is +* performed in two steps (real and imaginary parts). +* +* CALL DGEMM( 'T', 'N', NRP1, NRHS, NRP1, ONE, VT( NRF, 1 ), LDU, +* $ B( NRF, 1 ), LDB, ZERO, BX( NRF, 1 ), LDBX ) +* + J = NRP1*NRHS*2 + DO 270 JCOL = 1, NRHS + DO 260 JROW = NRF, NRF + NRP1 - 1 + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 260 CONTINUE + 270 CONTINUE + CALL DGEMM( 'T', 'N', NRP1, NRHS, NRP1, ONE, VT( NRF, 1 ), LDU, + $ RWORK( 1+NRP1*NRHS*2 ), NRP1, ZERO, RWORK( 1 ), + $ NRP1 ) + J = NRP1*NRHS*2 + DO 290 JCOL = 1, NRHS + DO 280 JROW = NRF, NRF + NRP1 - 1 + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 280 CONTINUE + 290 CONTINUE + CALL DGEMM( 'T', 'N', NRP1, NRHS, NRP1, ONE, VT( NRF, 1 ), LDU, + $ RWORK( 1+NRP1*NRHS*2 ), NRP1, ZERO, + $ RWORK( 1+NRP1*NRHS ), NRP1 ) + JREAL = 0 + JIMAG = NRP1*NRHS + DO 310 JCOL = 1, NRHS + DO 300 JROW = NRF, NRF + NRP1 - 1 + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + BX( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 300 CONTINUE + 310 CONTINUE +* + 320 CONTINUE +* + 330 CONTINUE +* + RETURN +* +* End of ZLALSA +* + END diff -r c3b479e753dd -r b48d486f641d libcruft/lapack/zlalsd.f --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/libcruft/lapack/zlalsd.f Fri Oct 26 15:52:58 2007 +0000 @@ -0,0 +1,600 @@ + SUBROUTINE ZLALSD( UPLO, SMLSIZ, N, NRHS, D, E, B, LDB, RCOND, + $ RANK, WORK, RWORK, IWORK, INFO ) +* +* -- LAPACK routine (version 3.1) -- +* Univ. of Tennessee, Univ. of California Berkeley and NAG Ltd.. +* November 2006 +* +* .. Scalar Arguments .. + CHARACTER UPLO + INTEGER INFO, LDB, N, NRHS, RANK, SMLSIZ + DOUBLE PRECISION RCOND +* .. +* .. Array Arguments .. + INTEGER IWORK( * ) + DOUBLE PRECISION D( * ), E( * ), RWORK( * ) + COMPLEX*16 B( LDB, * ), WORK( * ) +* .. +* +* Purpose +* ======= +* +* ZLALSD uses the singular value decomposition of A to solve the least +* squares problem of finding X to minimize the Euclidean norm of each +* column of A*X-B, where A is N-by-N upper bidiagonal, and X and B +* are N-by-NRHS. The solution X overwrites B. +* +* The singular values of A smaller than RCOND times the largest +* singular value are treated as zero in solving the least squares +* problem; in this case a minimum norm solution is returned. +* The actual singular values are returned in D in ascending order. +* +* This code makes very mild assumptions about floating point +* arithmetic. It will work on machines with a guard digit in +* add/subtract, or on those binary machines without guard digits +* which subtract like the Cray XMP, Cray YMP, Cray C 90, or Cray 2. +* It could conceivably fail on hexadecimal or decimal machines +* without guard digits, but we know of none. +* +* Arguments +* ========= +* +* UPLO (input) CHARACTER*1 +* = 'U': D and E define an upper bidiagonal matrix. +* = 'L': D and E define a lower bidiagonal matrix. +* +* SMLSIZ (input) INTEGER +* The maximum size of the subproblems at the bottom of the +* computation tree. +* +* N (input) INTEGER +* The dimension of the bidiagonal matrix. N >= 0. +* +* NRHS (input) INTEGER +* The number of columns of B. NRHS must be at least 1. +* +* D (input/output) DOUBLE PRECISION array, dimension (N) +* On entry D contains the main diagonal of the bidiagonal +* matrix. On exit, if INFO = 0, D contains its singular values. +* +* E (input/output) DOUBLE PRECISION array, dimension (N-1) +* Contains the super-diagonal entries of the bidiagonal matrix. +* On exit, E has been destroyed. +* +* B (input/output) COMPLEX*16 array, dimension (LDB,NRHS) +* On input, B contains the right hand sides of the least +* squares problem. On output, B contains the solution X. +* +* LDB (input) INTEGER +* The leading dimension of B in the calling subprogram. +* LDB must be at least max(1,N). +* +* RCOND (input) DOUBLE PRECISION +* The singular values of A less than or equal to RCOND times +* the largest singular value are treated as zero in solving +* the least squares problem. If RCOND is negative, +* machine precision is used instead. +* For example, if diag(S)*X=B were the least squares problem, +* where diag(S) is a diagonal matrix of singular values, the +* solution would be X(i) = B(i) / S(i) if S(i) is greater than +* RCOND*max(S), and X(i) = 0 if S(i) is less than or equal to +* RCOND*max(S). +* +* RANK (output) INTEGER +* The number of singular values of A greater than RCOND times +* the largest singular value. +* +* WORK (workspace) COMPLEX*16 array, dimension at least +* (N * NRHS). +* +* RWORK (workspace) DOUBLE PRECISION array, dimension at least +* (9*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS + (SMLSIZ+1)**2), +* where +* NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 ) +* +* IWORK (workspace) INTEGER array, dimension at least +* (3*N*NLVL + 11*N). +* +* INFO (output) INTEGER +* = 0: successful exit. +* < 0: if INFO = -i, the i-th argument had an illegal value. +* > 0: The algorithm failed to compute an singular value while +* working on the submatrix lying in rows and columns +* INFO/(N+1) through MOD(INFO,N+1). +* +* Further Details +* =============== +* +* Based on contributions by +* Ming Gu and Ren-Cang Li, Computer Science Division, University of +* California at Berkeley, USA +* Osni Marques, LBNL/NERSC, USA +* +* ===================================================================== +* +* .. Parameters .. + DOUBLE PRECISION ZERO, ONE, TWO + PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0 ) + COMPLEX*16 CZERO + PARAMETER ( CZERO = ( 0.0D0, 0.0D0 ) ) +* .. +* .. Local Scalars .. + INTEGER BX, BXST, C, DIFL, DIFR, GIVCOL, GIVNUM, + $ GIVPTR, I, ICMPQ1, ICMPQ2, IRWB, IRWIB, IRWRB, + $ IRWU, IRWVT, IRWWRK, IWK, J, JCOL, JIMAG, + $ JREAL, JROW, K, NLVL, NM1, NRWORK, NSIZE, NSUB, + $ PERM, POLES, S, SIZEI, SMLSZP, SQRE, ST, ST1, + $ U, VT, Z + DOUBLE PRECISION CS, EPS, ORGNRM, RCND, R, SN, TOL +* .. +* .. External Functions .. + INTEGER IDAMAX + DOUBLE PRECISION DLAMCH, DLANST + EXTERNAL IDAMAX, DLAMCH, DLANST +* .. +* .. External Subroutines .. + EXTERNAL DGEMM, DLARTG, DLASCL, DLASDA, DLASDQ, DLASET, + $ DLASRT, XERBLA, ZCOPY, ZDROT, ZLACPY, ZLALSA, + $ ZLASCL, ZLASET +* .. +* .. Intrinsic Functions .. + INTRINSIC ABS, DBLE, DCMPLX, DIMAG, INT, LOG, SIGN +* .. +* .. Executable Statements .. +* +* Test the input parameters. +* + INFO = 0 +* + IF( N.LT.0 ) THEN + INFO = -3 + ELSE IF( NRHS.LT.1 ) THEN + INFO = -4 + ELSE IF( ( LDB.LT.1 ) .OR. ( LDB.LT.N ) ) THEN + INFO = -8 + END IF + IF( INFO.NE.0 ) THEN + CALL XERBLA( 'ZLALSD', -INFO ) + RETURN + END IF +* + EPS = DLAMCH( 'Epsilon' ) +* +* Set up the tolerance. +* + IF( ( RCOND.LE.ZERO ) .OR. ( RCOND.GE.ONE ) ) THEN + RCND = EPS + ELSE + RCND = RCOND + END IF +* + RANK = 0 +* +* Quick return if possible. +* + IF( N.EQ.0 ) THEN + RETURN + ELSE IF( N.EQ.1 ) THEN + IF( D( 1 ).EQ.ZERO ) THEN + CALL ZLASET( 'A', 1, NRHS, CZERO, CZERO, B, LDB ) + ELSE + RANK = 1 + CALL ZLASCL( 'G', 0, 0, D( 1 ), ONE, 1, NRHS, B, LDB, INFO ) + D( 1 ) = ABS( D( 1 ) ) + END IF + RETURN + END IF +* +* Rotate the matrix if it is lower bidiagonal. +* + IF( UPLO.EQ.'L' ) THEN + DO 10 I = 1, N - 1 + CALL DLARTG( D( I ), E( I ), CS, SN, R ) + D( I ) = R + E( I ) = SN*D( I+1 ) + D( I+1 ) = CS*D( I+1 ) + IF( NRHS.EQ.1 ) THEN + CALL ZDROT( 1, B( I, 1 ), 1, B( I+1, 1 ), 1, CS, SN ) + ELSE + RWORK( I*2-1 ) = CS + RWORK( I*2 ) = SN + END IF + 10 CONTINUE + IF( NRHS.GT.1 ) THEN + DO 30 I = 1, NRHS + DO 20 J = 1, N - 1 + CS = RWORK( J*2-1 ) + SN = RWORK( J*2 ) + CALL ZDROT( 1, B( J, I ), 1, B( J+1, I ), 1, CS, SN ) + 20 CONTINUE + 30 CONTINUE + END IF + END IF +* +* Scale. +* + NM1 = N - 1 + ORGNRM = DLANST( 'M', N, D, E ) + IF( ORGNRM.EQ.ZERO ) THEN + CALL ZLASET( 'A', N, NRHS, CZERO, CZERO, B, LDB ) + RETURN + END IF +* + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, N, 1, D, N, INFO ) + CALL DLASCL( 'G', 0, 0, ORGNRM, ONE, NM1, 1, E, NM1, INFO ) +* +* If N is smaller than the minimum divide size SMLSIZ, then solve +* the problem with another solver. +* + IF( N.LE.SMLSIZ ) THEN + IRWU = 1 + IRWVT = IRWU + N*N + IRWWRK = IRWVT + N*N + IRWRB = IRWWRK + IRWIB = IRWRB + N*NRHS + IRWB = IRWIB + N*NRHS + CALL DLASET( 'A', N, N, ZERO, ONE, RWORK( IRWU ), N ) + CALL DLASET( 'A', N, N, ZERO, ONE, RWORK( IRWVT ), N ) + CALL DLASDQ( 'U', 0, N, N, N, 0, D, E, RWORK( IRWVT ), N, + $ RWORK( IRWU ), N, RWORK( IRWWRK ), 1, + $ RWORK( IRWWRK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF +* +* In the real version, B is passed to DLASDQ and multiplied +* internally by Q'. Here B is complex and that product is +* computed below in two steps (real and imaginary parts). +* + J = IRWB - 1 + DO 50 JCOL = 1, NRHS + DO 40 JROW = 1, N + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 40 CONTINUE + 50 CONTINUE + CALL DGEMM( 'T', 'N', N, NRHS, N, ONE, RWORK( IRWU ), N, + $ RWORK( IRWB ), N, ZERO, RWORK( IRWRB ), N ) + J = IRWB - 1 + DO 70 JCOL = 1, NRHS + DO 60 JROW = 1, N + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 60 CONTINUE + 70 CONTINUE + CALL DGEMM( 'T', 'N', N, NRHS, N, ONE, RWORK( IRWU ), N, + $ RWORK( IRWB ), N, ZERO, RWORK( IRWIB ), N ) + JREAL = IRWRB - 1 + JIMAG = IRWIB - 1 + DO 90 JCOL = 1, NRHS + DO 80 JROW = 1, N + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + B( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 80 CONTINUE + 90 CONTINUE +* + TOL = RCND*ABS( D( IDAMAX( N, D, 1 ) ) ) + DO 100 I = 1, N + IF( D( I ).LE.TOL ) THEN + CALL ZLASET( 'A', 1, NRHS, CZERO, CZERO, B( I, 1 ), LDB ) + ELSE + CALL ZLASCL( 'G', 0, 0, D( I ), ONE, 1, NRHS, B( I, 1 ), + $ LDB, INFO ) + RANK = RANK + 1 + END IF + 100 CONTINUE +* +* Since B is complex, the following call to DGEMM is performed +* in two steps (real and imaginary parts). That is for V * B +* (in the real version of the code V' is stored in WORK). +* +* CALL DGEMM( 'T', 'N', N, NRHS, N, ONE, WORK, N, B, LDB, ZERO, +* $ WORK( NWORK ), N ) +* + J = IRWB - 1 + DO 120 JCOL = 1, NRHS + DO 110 JROW = 1, N + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 110 CONTINUE + 120 CONTINUE + CALL DGEMM( 'T', 'N', N, NRHS, N, ONE, RWORK( IRWVT ), N, + $ RWORK( IRWB ), N, ZERO, RWORK( IRWRB ), N ) + J = IRWB - 1 + DO 140 JCOL = 1, NRHS + DO 130 JROW = 1, N + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 130 CONTINUE + 140 CONTINUE + CALL DGEMM( 'T', 'N', N, NRHS, N, ONE, RWORK( IRWVT ), N, + $ RWORK( IRWB ), N, ZERO, RWORK( IRWIB ), N ) + JREAL = IRWRB - 1 + JIMAG = IRWIB - 1 + DO 160 JCOL = 1, NRHS + DO 150 JROW = 1, N + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + B( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 150 CONTINUE + 160 CONTINUE +* +* Unscale. +* + CALL DLASCL( 'G', 0, 0, ONE, ORGNRM, N, 1, D, N, INFO ) + CALL DLASRT( 'D', N, D, INFO ) + CALL ZLASCL( 'G', 0, 0, ORGNRM, ONE, N, NRHS, B, LDB, INFO ) +* + RETURN + END IF +* +* Book-keeping and setting up some constants. +* + NLVL = INT( LOG( DBLE( N ) / DBLE( SMLSIZ+1 ) ) / LOG( TWO ) ) + 1 +* + SMLSZP = SMLSIZ + 1 +* + U = 1 + VT = 1 + SMLSIZ*N + DIFL = VT + SMLSZP*N + DIFR = DIFL + NLVL*N + Z = DIFR + NLVL*N*2 + C = Z + NLVL*N + S = C + N + POLES = S + N + GIVNUM = POLES + 2*NLVL*N + NRWORK = GIVNUM + 2*NLVL*N + BX = 1 +* + IRWRB = NRWORK + IRWIB = IRWRB + SMLSIZ*NRHS + IRWB = IRWIB + SMLSIZ*NRHS +* + SIZEI = 1 + N + K = SIZEI + N + GIVPTR = K + N + PERM = GIVPTR + N + GIVCOL = PERM + NLVL*N + IWK = GIVCOL + NLVL*N*2 +* + ST = 1 + SQRE = 0 + ICMPQ1 = 1 + ICMPQ2 = 0 + NSUB = 0 +* + DO 170 I = 1, N + IF( ABS( D( I ) ).LT.EPS ) THEN + D( I ) = SIGN( EPS, D( I ) ) + END IF + 170 CONTINUE +* + DO 240 I = 1, NM1 + IF( ( ABS( E( I ) ).LT.EPS ) .OR. ( I.EQ.NM1 ) ) THEN + NSUB = NSUB + 1 + IWORK( NSUB ) = ST +* +* Subproblem found. First determine its size and then +* apply divide and conquer on it. +* + IF( I.LT.NM1 ) THEN +* +* A subproblem with E(I) small for I < NM1. +* + NSIZE = I - ST + 1 + IWORK( SIZEI+NSUB-1 ) = NSIZE + ELSE IF( ABS( E( I ) ).GE.EPS ) THEN +* +* A subproblem with E(NM1) not too small but I = NM1. +* + NSIZE = N - ST + 1 + IWORK( SIZEI+NSUB-1 ) = NSIZE + ELSE +* +* A subproblem with E(NM1) small. This implies an +* 1-by-1 subproblem at D(N), which is not solved +* explicitly. +* + NSIZE = I - ST + 1 + IWORK( SIZEI+NSUB-1 ) = NSIZE + NSUB = NSUB + 1 + IWORK( NSUB ) = N + IWORK( SIZEI+NSUB-1 ) = 1 + CALL ZCOPY( NRHS, B( N, 1 ), LDB, WORK( BX+NM1 ), N ) + END IF + ST1 = ST - 1 + IF( NSIZE.EQ.1 ) THEN +* +* This is a 1-by-1 subproblem and is not solved +* explicitly. +* + CALL ZCOPY( NRHS, B( ST, 1 ), LDB, WORK( BX+ST1 ), N ) + ELSE IF( NSIZE.LE.SMLSIZ ) THEN +* +* This is a small subproblem and is solved by DLASDQ. +* + CALL DLASET( 'A', NSIZE, NSIZE, ZERO, ONE, + $ RWORK( VT+ST1 ), N ) + CALL DLASET( 'A', NSIZE, NSIZE, ZERO, ONE, + $ RWORK( U+ST1 ), N ) + CALL DLASDQ( 'U', 0, NSIZE, NSIZE, NSIZE, 0, D( ST ), + $ E( ST ), RWORK( VT+ST1 ), N, RWORK( U+ST1 ), + $ N, RWORK( NRWORK ), 1, RWORK( NRWORK ), + $ INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF +* +* In the real version, B is passed to DLASDQ and multiplied +* internally by Q'. Here B is complex and that product is +* computed below in two steps (real and imaginary parts). +* + J = IRWB - 1 + DO 190 JCOL = 1, NRHS + DO 180 JROW = ST, ST + NSIZE - 1 + J = J + 1 + RWORK( J ) = DBLE( B( JROW, JCOL ) ) + 180 CONTINUE + 190 CONTINUE + CALL DGEMM( 'T', 'N', NSIZE, NRHS, NSIZE, ONE, + $ RWORK( U+ST1 ), N, RWORK( IRWB ), NSIZE, + $ ZERO, RWORK( IRWRB ), NSIZE ) + J = IRWB - 1 + DO 210 JCOL = 1, NRHS + DO 200 JROW = ST, ST + NSIZE - 1 + J = J + 1 + RWORK( J ) = DIMAG( B( JROW, JCOL ) ) + 200 CONTINUE + 210 CONTINUE + CALL DGEMM( 'T', 'N', NSIZE, NRHS, NSIZE, ONE, + $ RWORK( U+ST1 ), N, RWORK( IRWB ), NSIZE, + $ ZERO, RWORK( IRWIB ), NSIZE ) + JREAL = IRWRB - 1 + JIMAG = IRWIB - 1 + DO 230 JCOL = 1, NRHS + DO 220 JROW = ST, ST + NSIZE - 1 + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + B( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 220 CONTINUE + 230 CONTINUE +* + CALL ZLACPY( 'A', NSIZE, NRHS, B( ST, 1 ), LDB, + $ WORK( BX+ST1 ), N ) + ELSE +* +* A large problem. Solve it using divide and conquer. +* + CALL DLASDA( ICMPQ1, SMLSIZ, NSIZE, SQRE, D( ST ), + $ E( ST ), RWORK( U+ST1 ), N, RWORK( VT+ST1 ), + $ IWORK( K+ST1 ), RWORK( DIFL+ST1 ), + $ RWORK( DIFR+ST1 ), RWORK( Z+ST1 ), + $ RWORK( POLES+ST1 ), IWORK( GIVPTR+ST1 ), + $ IWORK( GIVCOL+ST1 ), N, IWORK( PERM+ST1 ), + $ RWORK( GIVNUM+ST1 ), RWORK( C+ST1 ), + $ RWORK( S+ST1 ), RWORK( NRWORK ), + $ IWORK( IWK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + BXST = BX + ST1 + CALL ZLALSA( ICMPQ2, SMLSIZ, NSIZE, NRHS, B( ST, 1 ), + $ LDB, WORK( BXST ), N, RWORK( U+ST1 ), N, + $ RWORK( VT+ST1 ), IWORK( K+ST1 ), + $ RWORK( DIFL+ST1 ), RWORK( DIFR+ST1 ), + $ RWORK( Z+ST1 ), RWORK( POLES+ST1 ), + $ IWORK( GIVPTR+ST1 ), IWORK( GIVCOL+ST1 ), N, + $ IWORK( PERM+ST1 ), RWORK( GIVNUM+ST1 ), + $ RWORK( C+ST1 ), RWORK( S+ST1 ), + $ RWORK( NRWORK ), IWORK( IWK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + END IF + ST = I + 1 + END IF + 240 CONTINUE +* +* Apply the singular values and treat the tiny ones as zero. +* + TOL = RCND*ABS( D( IDAMAX( N, D, 1 ) ) ) +* + DO 250 I = 1, N +* +* Some of the elements in D can be negative because 1-by-1 +* subproblems were not solved explicitly. +* + IF( ABS( D( I ) ).LE.TOL ) THEN + CALL ZLASET( 'A', 1, NRHS, CZERO, CZERO, WORK( BX+I-1 ), N ) + ELSE + RANK = RANK + 1 + CALL ZLASCL( 'G', 0, 0, D( I ), ONE, 1, NRHS, + $ WORK( BX+I-1 ), N, INFO ) + END IF + D( I ) = ABS( D( I ) ) + 250 CONTINUE +* +* Now apply back the right singular vectors. +* + ICMPQ2 = 1 + DO 320 I = 1, NSUB + ST = IWORK( I ) + ST1 = ST - 1 + NSIZE = IWORK( SIZEI+I-1 ) + BXST = BX + ST1 + IF( NSIZE.EQ.1 ) THEN + CALL ZCOPY( NRHS, WORK( BXST ), N, B( ST, 1 ), LDB ) + ELSE IF( NSIZE.LE.SMLSIZ ) THEN +* +* Since B and BX are complex, the following call to DGEMM +* is performed in two steps (real and imaginary parts). +* +* CALL DGEMM( 'T', 'N', NSIZE, NRHS, NSIZE, ONE, +* $ RWORK( VT+ST1 ), N, RWORK( BXST ), N, ZERO, +* $ B( ST, 1 ), LDB ) +* + J = BXST - N - 1 + JREAL = IRWB - 1 + DO 270 JCOL = 1, NRHS + J = J + N + DO 260 JROW = 1, NSIZE + JREAL = JREAL + 1 + RWORK( JREAL ) = DBLE( WORK( J+JROW ) ) + 260 CONTINUE + 270 CONTINUE + CALL DGEMM( 'T', 'N', NSIZE, NRHS, NSIZE, ONE, + $ RWORK( VT+ST1 ), N, RWORK( IRWB ), NSIZE, ZERO, + $ RWORK( IRWRB ), NSIZE ) + J = BXST - N - 1 + JIMAG = IRWB - 1 + DO 290 JCOL = 1, NRHS + J = J + N + DO 280 JROW = 1, NSIZE + JIMAG = JIMAG + 1 + RWORK( JIMAG ) = DIMAG( WORK( J+JROW ) ) + 280 CONTINUE + 290 CONTINUE + CALL DGEMM( 'T', 'N', NSIZE, NRHS, NSIZE, ONE, + $ RWORK( VT+ST1 ), N, RWORK( IRWB ), NSIZE, ZERO, + $ RWORK( IRWIB ), NSIZE ) + JREAL = IRWRB - 1 + JIMAG = IRWIB - 1 + DO 310 JCOL = 1, NRHS + DO 300 JROW = ST, ST + NSIZE - 1 + JREAL = JREAL + 1 + JIMAG = JIMAG + 1 + B( JROW, JCOL ) = DCMPLX( RWORK( JREAL ), + $ RWORK( JIMAG ) ) + 300 CONTINUE + 310 CONTINUE + ELSE + CALL ZLALSA( ICMPQ2, SMLSIZ, NSIZE, NRHS, WORK( BXST ), N, + $ B( ST, 1 ), LDB, RWORK( U+ST1 ), N, + $ RWORK( VT+ST1 ), IWORK( K+ST1 ), + $ RWORK( DIFL+ST1 ), RWORK( DIFR+ST1 ), + $ RWORK( Z+ST1 ), RWORK( POLES+ST1 ), + $ IWORK( GIVPTR+ST1 ), IWORK( GIVCOL+ST1 ), N, + $ IWORK( PERM+ST1 ), RWORK( GIVNUM+ST1 ), + $ RWORK( C+ST1 ), RWORK( S+ST1 ), + $ RWORK( NRWORK ), IWORK( IWK ), INFO ) + IF( INFO.NE.0 ) THEN + RETURN + END IF + END IF + 320 CONTINUE +* +* Unscale and sort the singular values. +* + CALL DLASCL( 'G', 0, 0, ONE, ORGNRM, N, 1, D, N, INFO ) + CALL DLASRT( 'D', N, D, INFO ) + CALL ZLASCL( 'G', 0, 0, ORGNRM, ONE, N, NRHS, B, LDB, INFO ) +* + RETURN +* +* End of ZLALSD +* + END diff -r c3b479e753dd -r b48d486f641d liboctave/CMatrix.cc --- a/liboctave/CMatrix.cc Fri Oct 26 15:14:35 2007 +0000 +++ b/liboctave/CMatrix.cc Fri Oct 26 15:52:58 2007 +0000 @@ -120,10 +120,17 @@ F77_CHAR_ARG_LEN_DECL); F77_RET_T - F77_FUNC (zgelss, ZGELSS) (const octave_idx_type&, const octave_idx_type&, const octave_idx_type&, + F77_FUNC (zgelsy, ZGELSY) (const octave_idx_type&, const octave_idx_type&, const octave_idx_type&, + Complex*, const octave_idx_type&, Complex*, + const octave_idx_type&, octave_idx_type*, double&, octave_idx_type&, + Complex*, const octave_idx_type&, double*, octave_idx_type&); + + F77_RET_T + F77_FUNC (zgelsd, ZGELSD) (const octave_idx_type&, const octave_idx_type&, const octave_idx_type&, Complex*, const octave_idx_type&, Complex*, const octave_idx_type&, double*, double&, octave_idx_type&, - Complex*, const octave_idx_type&, double*, octave_idx_type&); + Complex*, const octave_idx_type&, double*, + octave_idx_type*, octave_idx_type&); F77_RET_T F77_FUNC (zpotrf, ZPOTRF) (F77_CONST_CHAR_ARG_DECL, const octave_idx_type&, @@ -2436,62 +2443,63 @@ retval = ComplexMatrix (n, b.cols (), Complex (0.0, 0.0)); else { + volatile octave_idx_type minmn = (m < n ? m : n); + octave_idx_type maxmn = m > n ? m : n; + double rcond = -1.0; + + if (m != n) + { + retval = ComplexMatrix (maxmn, nrhs); + + for (octave_idx_type j = 0; j < nrhs; j++) + for (octave_idx_type i = 0; i < m; i++) + retval.elem (i, j) = b.elem (i, j); + } + else + retval = b; + ComplexMatrix atmp = *this; Complex *tmp_data = atmp.fortran_vec (); - octave_idx_type nrr = m > n ? m : n; - ComplexMatrix result (nrr, nrhs); - - for (octave_idx_type j = 0; j < nrhs; j++) - for (octave_idx_type i = 0; i < m; i++) - result.elem (i, j) = b.elem (i, j); - - Complex *presult = result.fortran_vec (); - - octave_idx_type len_s = m < n ? m : n; - Array s (len_s); + Complex *pretval = retval.fortran_vec (); + Array s (minmn); double *ps = s.fortran_vec (); - double rcond = -1.0; - - octave_idx_type lrwork = (5 * (m < n ? m : n)) - 4; - lrwork = lrwork > 1 ? lrwork : 1; - Array rwork (lrwork); - double *prwork = rwork.fortran_vec (); - - // Ask ZGELSS what the dimension of WORK should be. - + // Ask ZGELSD what the dimension of WORK should be. octave_idx_type lwork = -1; Array work (1); - - F77_XFCN (zgelss, ZGELSS, (m, n, nrhs, tmp_data, m, presult, - nrr, ps, rcond, rank, - work.fortran_vec (), lwork, prwork, - info)); + Array rwork (1); + Array iwork (1); + + F77_XFCN (zgelsd, ZGELSD, (m, n, nrhs, tmp_data, m, pretval, maxmn, + ps, rcond, rank, work.fortran_vec (), + lwork, rwork.fortran_vec (), + iwork.fortran_vec (), info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) ("unrecoverable error in zgelss"); + (*current_liboctave_error_handler) + ("unrecoverable error in zgelsd"); else { lwork = static_cast (std::real (work(0))); work.resize (lwork); - - F77_XFCN (zgelss, ZGELSS, (m, n, nrhs, tmp_data, m, presult, - nrr, ps, rcond, rank, - work.fortran_vec (), lwork, - prwork, info)); + rwork.resize (static_cast (rwork(0))); + iwork.resize (iwork(0)); + + F77_XFCN (zgelsd, ZGELSD, (m, n, nrhs, tmp_data, m, pretval, + maxmn, ps, rcond, rank, + work.fortran_vec (), lwork, + rwork.fortran_vec (), + iwork.fortran_vec (), info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) - ("unrecoverable error in zgelss"); - else - { - retval.resize (n, nrhs); - for (octave_idx_type j = 0; j < nrhs; j++) - for (octave_idx_type i = 0; i < n; i++) - retval.elem (i, j) = result.elem (i, j); - } + (*current_liboctave_error_handler) + ("unrecoverable error in zgelsd"); + else if (rank < minmn) + (*current_liboctave_warning_handler) + ("zgelsd: rank deficient %dx%d matrix, rank = %d, tol = %e", + m, n, rank, rcond); } } @@ -2552,60 +2560,62 @@ retval = ComplexColumnVector (n, Complex (0.0, 0.0)); else { + volatile octave_idx_type minmn = (m < n ? m : n); + octave_idx_type maxmn = m > n ? m : n; + double rcond = -1.0; + + if (m != n) + { + retval = ComplexColumnVector (maxmn); + + for (octave_idx_type i = 0; i < m; i++) + retval.elem (i) = b.elem (i); + } + else + retval = b; + ComplexMatrix atmp = *this; Complex *tmp_data = atmp.fortran_vec (); - octave_idx_type nrr = m > n ? m : n; - ComplexColumnVector result (nrr); - - for (octave_idx_type i = 0; i < m; i++) - result.elem (i) = b.elem (i); - - Complex *presult = result.fortran_vec (); - - octave_idx_type len_s = m < n ? m : n; - Array s (len_s); + Complex *pretval = retval.fortran_vec (); + Array s (minmn); double *ps = s.fortran_vec (); - double rcond = -1.0; - - octave_idx_type lrwork = (5 * (m < n ? m : n)) - 4; - lrwork = lrwork > 1 ? lrwork : 1; - Array rwork (lrwork); - double *prwork = rwork.fortran_vec (); - - // Ask ZGELSS what the dimension of WORK should be. - + // Ask ZGELSD what the dimension of WORK should be. octave_idx_type lwork = -1; Array work (1); - - F77_XFCN (zgelss, ZGELSS, (m, n, nrhs, tmp_data, m, presult, - nrr, ps, rcond, rank, - work.fortran_vec (), lwork, prwork, - info)); + Array rwork (1); + Array iwork (1); + + F77_XFCN (zgelsd, ZGELSD, (m, n, nrhs, tmp_data, m, pretval, maxmn, + ps, rcond, rank, work.fortran_vec (), + lwork, rwork.fortran_vec (), + iwork.fortran_vec (), info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) ("unrecoverable error in zgelss"); + (*current_liboctave_error_handler) + ("unrecoverable error in zgelsd"); else { - lwork = static_cast (std::real (work(0))); + lwork = static_cast (std::real (work(0))); work.resize (lwork); - - F77_XFCN (zgelss, ZGELSS, (m, n, nrhs, tmp_data, m, presult, - nrr, ps, rcond, rank, - work.fortran_vec (), lwork, - prwork, info)); + rwork.resize (static_cast (rwork(0))); + iwork.resize (iwork(0)); + + F77_XFCN (zgelsd, ZGELSD, (m, n, nrhs, tmp_data, m, pretval, + maxmn, ps, rcond, rank, + work.fortran_vec (), lwork, + rwork.fortran_vec (), + iwork.fortran_vec (), info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) - ("unrecoverable error in zgelss"); - else - { - retval.resize (n); - for (octave_idx_type i = 0; i < n; i++) - retval.elem (i) = result.elem (i); - } + (*current_liboctave_error_handler) + ("unrecoverable error in zgelsd"); + else if (rank < minmn) + (*current_liboctave_warning_handler) + ("zgelsd: rank deficient %dx%d matrix, rank = %d, tol = %e", + m, n, rank, rcond); } } diff -r c3b479e753dd -r b48d486f641d liboctave/ChangeLog --- a/liboctave/ChangeLog Fri Oct 26 15:14:35 2007 +0000 +++ b/liboctave/ChangeLog Fri Oct 26 15:52:58 2007 +0000 @@ -1,7 +1,8 @@ -2007-09-26 John W. Eaton - - * dMatrix.cc (lssolve): Revert change of 2007-09-26. - * CMatrix.cc (lssolve): Ditto. +2007-10-26 David Bateman + + * dMatrix.cc (Matrix::lssolve): Use xGELSD for rank deficient + matrices to avoid reliability issues with xGELSY. + * CMatrix.cc (ComplexMatrix::lssolve): Likewise. 2007-10-25 John W. Eaton diff -r c3b479e753dd -r b48d486f641d liboctave/dMatrix.cc --- a/liboctave/dMatrix.cc Fri Oct 26 15:14:35 2007 +0000 +++ b/liboctave/dMatrix.cc Fri Oct 26 15:52:58 2007 +0000 @@ -117,10 +117,17 @@ F77_CHAR_ARG_LEN_DECL); F77_RET_T - F77_FUNC (dgelss, DGELSS) (const octave_idx_type&, const octave_idx_type&, const octave_idx_type&, + F77_FUNC (dgelsy, DGELSY) (const octave_idx_type&, const octave_idx_type&, const octave_idx_type&, + double*, const octave_idx_type&, double*, + const octave_idx_type&, octave_idx_type*, double&, octave_idx_type&, + double*, const octave_idx_type&, octave_idx_type&); + + F77_RET_T + F77_FUNC (dgelsd, DGELSD) (const octave_idx_type&, const octave_idx_type&, const octave_idx_type&, double*, const octave_idx_type&, double*, const octave_idx_type&, double*, double&, octave_idx_type&, - double*, const octave_idx_type&, octave_idx_type&); + double*, const octave_idx_type&, octave_idx_type*, + octave_idx_type&); F77_RET_T F77_FUNC (dpotrf, DPOTRF) (F77_CONST_CHAR_ARG_DECL, const octave_idx_type&, @@ -2043,7 +2050,8 @@ } Matrix -Matrix::lssolve (const Matrix& b, octave_idx_type& info, octave_idx_type& rank) const +Matrix::lssolve (const Matrix& b, octave_idx_type& info, + octave_idx_type& rank) const { Matrix retval; @@ -2052,7 +2060,6 @@ octave_idx_type m = rows (); octave_idx_type n = cols (); - if (m != b.rows ()) (*current_liboctave_error_handler) ("matrix dimension mismatch solution of linear equations"); @@ -2060,55 +2067,60 @@ retval = Matrix (n, b.cols (), 0.0); else { + volatile octave_idx_type minmn = (m < n ? m : n); + octave_idx_type maxmn = m > n ? m : n; + double rcond = -1.0; + if (m != n) + { + retval = Matrix (maxmn, nrhs, 0.0); + + for (octave_idx_type j = 0; j < nrhs; j++) + for (octave_idx_type i = 0; i < m; i++) + retval.elem (i, j) = b.elem (i, j); + } + else + retval = b; + Matrix atmp = *this; double *tmp_data = atmp.fortran_vec (); - octave_idx_type nrr = m > n ? m : n; - Matrix result (nrr, nrhs, 0.0); - - for (octave_idx_type j = 0; j < nrhs; j++) - for (octave_idx_type i = 0; i < m; i++) - result.elem (i, j) = b.elem (i, j); - - double *presult = result.fortran_vec (); - - octave_idx_type len_s = m < n ? m : n; - Array s (len_s); + double *pretval = retval.fortran_vec (); + Array s (minmn); double *ps = s.fortran_vec (); - double rcond = -1.0; - - // Ask DGELSS what the dimension of WORK should be. - + // Ask DGELSD what the dimension of WORK should be. octave_idx_type lwork = -1; Array work (1); - F77_XFCN (dgelss, DGELSS, (m, n, nrhs, tmp_data, m, presult, nrr, ps, - rcond, rank, work.fortran_vec (), - lwork, info)); + // FIXME: Can SMLSIZ be other than 25? + octave_idx_type liwork = 3 * minmn * 25 + 11 * minmn; + Array iwork (liwork); + octave_idx_type* piwork = iwork.fortran_vec (); + + F77_XFCN (dgelsd, DGELSD, (m, n, nrhs, tmp_data, m, pretval, maxmn, + ps, rcond, rank, work.fortran_vec (), + lwork, piwork, info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) ("unrecoverable error in dgelss"); + (*current_liboctave_error_handler) + ("unrecoverable error in dgelsd"); else { lwork = static_cast (work(0)); work.resize (lwork); - F77_XFCN (dgelss, DGELSS, (m, n, nrhs, tmp_data, m, presult, - nrr, ps, rcond, rank, - work.fortran_vec (), lwork, info)); + F77_XFCN (dgelsd, DGELSD, (m, n, nrhs, tmp_data, m, pretval, + maxmn, ps, rcond, rank, + work.fortran_vec (), lwork, + piwork, info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) - ("unrecoverable error in dgelss"); - else - { - retval.resize (n, nrhs); - for (octave_idx_type j = 0; j < nrhs; j++) - for (octave_idx_type i = 0; i < n; i++) - retval.elem (i, j) = result.elem (i, j); - } + (*current_liboctave_error_handler) + ("unrecoverable error in dgelsd"); + else if (rank < minmn) + (*current_liboctave_warning_handler) + ("dgelsd: rank deficient %dx%d matrix, rank = %d", m, n, rank); } } @@ -2155,7 +2167,8 @@ } ColumnVector -Matrix::lssolve (const ColumnVector& b, octave_idx_type& info, octave_idx_type& rank) const +Matrix::lssolve (const ColumnVector& b, octave_idx_type& info, + octave_idx_type& rank) const { ColumnVector retval; @@ -2171,53 +2184,60 @@ retval = ColumnVector (n, 0.0); else { + volatile octave_idx_type minmn = (m < n ? m : n); + octave_idx_type maxmn = m > n ? m : n; + double rcond = -1.0; + + if (m != n) + { + retval = ColumnVector (maxmn, 0.0); + + for (octave_idx_type i = 0; i < m; i++) + retval.elem (i) = b.elem (i); + } + else + retval = b; + Matrix atmp = *this; double *tmp_data = atmp.fortran_vec (); - octave_idx_type nrr = m > n ? m : n; - ColumnVector result (nrr); - - for (octave_idx_type i = 0; i < m; i++) - result.elem (i) = b.elem (i); - - double *presult = result.fortran_vec (); - - octave_idx_type len_s = m < n ? m : n; - Array s (len_s); + double *pretval = retval.fortran_vec (); + Array s (minmn); double *ps = s.fortran_vec (); - double rcond = -1.0; - - // Ask DGELSS what the dimension of WORK should be. - + // Ask DGELSD what the dimension of WORK should be. octave_idx_type lwork = -1; Array work (1); - F77_XFCN (dgelss, DGELSS, (m, n, nrhs, tmp_data, m, presult, nrr, ps, - rcond, rank, work.fortran_vec (), - lwork, info)); + // FIXME: Can SMLSIZ be other than 25? + octave_idx_type liwork = 3 * minmn * 25 + 11 * minmn; + Array iwork (liwork); + octave_idx_type* piwork = iwork.fortran_vec (); + + F77_XFCN (dgelsd, DGELSD, (m, n, nrhs, tmp_data, m, pretval, maxmn, + ps, rcond, rank, work.fortran_vec (), + lwork, piwork, info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) ("unrecoverable error in dgelss"); + (*current_liboctave_error_handler) + ("unrecoverable error in dgelsd"); else { lwork = static_cast (work(0)); work.resize (lwork); - F77_XFCN (dgelss, DGELSS, (m, n, nrhs, tmp_data, m, presult, - nrr, ps, rcond, rank, - work.fortran_vec (), lwork, info)); + F77_XFCN (dgelsd, DGELSD, (m, n, nrhs, tmp_data, m, pretval, + maxmn, ps, rcond, rank, + work.fortran_vec (), lwork, + piwork, info)); if (f77_exception_encountered) - (*current_liboctave_error_handler) - ("unrecoverable error in dgelss"); - else - { - retval.resize (n); - for (octave_idx_type i = 0; i < n; i++) - retval.elem (i) = result.elem (i); - } + (*current_liboctave_error_handler) + ("unrecoverable error in dgelsd"); + else if (rank < minmn) + (*current_liboctave_warning_handler) + ("dgelsd: rank deficient %dx%d matrix, rank = %d", m, n, rank); } }