Mercurial > octave-nkf

%-------------------------------------------------------------------------------
% The UserGuide.stex file.  Processed into UserGuide.tex via sed.
%-------------------------------------------------------------------------------

\documentclass[11pt]{article}

\newcommand{\m}[1]{{\bf{#1}}}       % for matrices and vectors
\newcommand{\tr}{^{\sf T}}          % transpose
\newcommand{\he}{^{\sf H}}          % complex conjugate transpose
\newcommand{\implies}{\rightarrow}

\topmargin 0in
\textheight 9in
\oddsidemargin 0pt
\evensidemargin 0pt
\textwidth 6.5in

\begin{document}

\author{Timothy A. Davis \\
Dept. of Computer and Information Science and Engineering \\
Univ. of Florida, Gainesville, FL}
\title{UMFPACK Version 4.4 User Guide}
\date{Jan. 28, 2005}
\maketitle

%-------------------------------------------------------------------------------
\begin{abstract}
    UMFPACK is a set of routines for solving unsymmetric sparse linear
    systems, $\m{Ax}=\m{b}$, using the Unsymmetric MultiFrontal method
    and direct sparse LU factorization.  It is written in ANSI/ISO C, with a
    MATLAB interface.  UMFPACK relies on the Level-3 Basic
    Linear Algebra Subprograms (dense matrix multiply) for its performance.
    This code works on Windows and many versions of Unix (Sun Solaris,
    Red Hat Linux, IBM AIX, SGI IRIX, and Compaq Alpha).
\end{abstract}
%-------------------------------------------------------------------------------

Technical Report TR-04-003 (revised)

UMFPACK Version 4.4 (Jan. 28, 2005), Copyright\copyright 2005 by Timothy A.
Davis.  All Rights Reserved.

{\bf UMFPACK License:}
    Your use or distribution of UMFPACK or any modified version of
    UMFPACK implies that you agree to this License.

    THIS MATERIAL IS PROVIDED AS IS, WITH ABSOLUTELY NO WARRANTY
    EXPRESSED OR IMPLIED.  ANY USE IS AT YOUR OWN RISK.

    Permission is hereby granted to use or copy this program, provided
    that the Copyright, this License, and the Availability of the original
    version is retained on all copies.  User documentation of any code that
    uses UMFPACK or any modified version of UMFPACK code must cite the
    Copyright, this License, the Availability note, and ``Used by permission.''
    Permission to modify the code and to distribute modified code is granted,
    provided the Copyright, this License, and the Availability note are
    retained, and a notice that the code was modified is included.  This
    software was developed with support from the National Science Foundation,
    and is provided to you free of charge.

{\bf Availability:}
    http://www.cise.ufl.edu/research/sparse/umfpack

{\bf Acknowledgments:}

    This work was supported by the National Science Foundation, under
    grants DMS-9504974, DMS-9803599, and CCR-0203270.
    The upgrade to Version 4.1 and the inclusion of the
    symmetric and 2-by-2 pivoting strategies
    were done while the author was on sabbatical at
    Stanford University and Lawrence Berkeley National Laboratory.

%-------------------------------------------------------------------------------
\newpage
%-------------------------------------------------------------------------------

\tableofcontents

%-------------------------------------------------------------------------------
\newpage
\section{Overview}
%-------------------------------------------------------------------------------

UMFPACK\footnote{Pronounced with two syllables: umph-pack}
Version 4.4 is a set of routines for solving systems of linear
equations, $\m{Ax}=\m{b}$, when $\m{A}$ is sparse and unsymmetric.  It is based
on the Unsymmetric-pattern MultiFrontal method \cite{DavisDuff97,DavisDuff99}.
UMFPACK factorizes
$\m{PAQ}$, $\m{PRAQ}$, or $\m{PR}^{-1}\m{AQ}$ into the product $\m{LU}$,
where $\m{L}$ and $\m{U}$
are lower and upper triangular, respectively, $\m{P}$ and $\m{Q}$ are
permutation matrices, and $\m{R}$ is a diagonal matrix of row scaling factors
(or $\m{R}=\m{I}$ if row-scaling is not used).  Both $\m{P}$ and $\m{Q}$ are
chosen to reduce fill-in (new nonzeros in $\m{L}$ and $\m{U}$ that are not
present in $\m{A}$).  The permutation $\m{P}$ has the dual role of reducing
fill-in and maintaining numerical accuracy (via relaxed partial pivoting
and row interchanges).

The sparse matrix $\m{A}$ can be square or rectangular, singular
or non-singular, and real or complex (or any combination).  Only square
matrices $\m{A}$ can be used to solve $\m{Ax}=\m{b}$ or related systems.
Rectangular matrices can only be factorized.

UMFPACK first finds a column pre-ordering that reduces fill-in, without regard
to numerical values.  It scales and analyzes the matrix, and then automatically
selects one of three strategies for pre-ordering the rows and columns:
{\em unsymmetric},
{\em 2-by-2}, and
{\em symmetric}.  These strategies are described below.

First, all pivots with zero Markowitz cost are eliminated and placed in the
LU factors.  The remaining submatrix $\m{S}$ is then analyzed.
The following rules are applied, and the first one that matches defines
the strategy.

\begin{itemize}
\item Rule 1: $\m{A}$ rectangular $\implies$ unsymmetric.
\item Rule 2:
    If the zero-Markowitz elimination results in a rectangular $\m{S}$,
    or an $\m{S}$ whose diagonal has not been preserved, the
    unsymmetric strategy is used.
\item The symmetry $\sigma_1$ of $\m{S}$ is computed.  It is defined as
    the number of {\em matched} off-diagonal entries, divided by the
    total number of off-diagonal entries.  An entry $s_{ij}$ is matched
    if $s_{ji}$ is also an entry.  They need not be numerically equal.
    An {\em entry} is a value in $\m{A}$ which is present
    in the input data structure.  All nonzeros are entries, but some entries
    may be numerically zero.
    Rule 3: $\sigma_1 < 0.1 \implies$ unsymmetric.
    The matrix is very unsymmetric.
\item Let $d$ be the number of nonzero entries on the diagonal of $\m{S}$.
    Let $\m{S}$ be $\nu$-by-$\nu$.
    Rule 4: $(\sigma_1 \ge 0.7) \:\wedge\: (d = \nu) \implies$ symmetric.
    The matrix has a nearly symmetric nonzero pattern, and a zero-free
    diagonal.
\end{itemize}

If the strategy has not yet been determined,
the 2-by-2 strategy is attempted.  A row permutation $\m{P}_2$
is found which attempts to reduce the number of small
diagonal entries of $\m{P}_2 \m{S}$.
An entry $s_{ij}$ is determined to be small if
$|s_{ij}| < 0.01 \max |s_{*j}|$, or large otherwise.
If $s_{ii}$ is numerically small, the method attempts to swap
two rows $i$ and $j$, such that both $s_{ij}$ and $s_{ji}$ are large.
Once these rows are swapped,
they remain in place.  Let $\sigma_2$ be the symmetry of $\m{P}_2 \m{S}$,
and let $d_2$ be the number of nonzero entries (either small or large)
on the diagonal of $\m{P}_2 \m{S}$.

\begin{itemize}
\item Rule 5:
    ($\sigma_2 > 1.1 \sigma_1) \:\wedge\: (d_2 > 0.9 \nu) \implies$ 2-by-2.
    The 2-by-2 permutation has made the matrix significantly more symmetric.
\item Rule 6: $\sigma_2 < 0.7 \sigma_1 \implies$ unsymmetric.
    The 2-by-2 strategy has significantly deteriorated the symmetry,
\item Rule 7: $\sigma_2 < 0.25 \implies$ unsymmetric.
    The matrix is still very unsymmetric.
\item Rule 8: $\sigma_2 \ge 0.51 \implies$ 2-by-2.
    The matrix is roughly symmetric.
\item Rule 9: $\sigma_2 \ge 0.999 \sigma_1 \implies$ 2-by-2.
    The 2-by-2 permutation has preserved symmetry, or made it only
    slightly worse.
\item Rule 10: if no rule has yet triggered, use the unsymmetric strategy.
\end{itemize}

Each strategy is described below:
\begin{itemize}
\item {\em unsymmetric}:
The column pre-ordering of $\m{S}$ is computed by a modified version of COLAMD
\cite{DavisGilbertLarimoreNg00_algo,DavisGilbertLarimoreNg00,Larimore98}.
The method finds a symmetric permutation $\m{Q}$ of the matrix $\m{S}\tr\m{S}$
(without forming $\m{S}\tr\m{S}$ explicitly).  This is a good choice for
$\m{Q}$, since the Cholesky factors of $\m{(SQ)\tr(SQ)}$ are an upper bound (in
terms of nonzero pattern) of the factor $\m{U}$ for the unsymmetric LU
factorization ($\m{PSQ}=\m{LU}$) regardless of the choice of $\m{P}$
\cite{GeorgeNg85,GeorgeNg87,GilbertNg93}.  This modified version of
COLAMD also computes the column elimination tree and post-orders the
tree.  It finds the upper bound on the number of nonzeros in L and U.
It also has a different threshold for determining dense rows and columns.
During factorization, the column pre-ordering can be modified.
Columns within a single super-column can be reshuffled, to reduce fill-in.
Threshold partial pivoting is used with no preference given to the diagonal
entry.  Within a given pivot column $j$, an entry $a_{ij}$ can be chosen if
$|a_{ij}| \ge 0.1 \max |a_{*j}|$.  Among those numerically acceptable
entries, the sparsest row $i$ is chosen as the pivot row.

\item {\em 2-by-2}:
The symmetric strategy (see below) is applied to the matrix $\m{P}_2 \m{S}$,
rather than $\m{S}$.

\item {\em symmetric}:
The column ordering is computed from AMD
\cite{AmestoyDavisDuff96,AmestoyDavisDuff03},
applied to the pattern of $\m{S}+\m{S}\tr$
followed by a post-ordering of the supernodal elimination
tree of $\m{S}+\m{S}\tr$.
No modification of the column pre-ordering is made during numerical
factorization.  Threshold partial pivoting is used, with a strong
preference given to the diagonal entry.  The diagonal entry is chosen if
$a_{jj} \ge 0.001 \max |a_{*j}|$.  Otherwise, a sparse row is selected,
using the same method used by the unsymmetric strategy.

\end{itemize}

The symmetric and 2-by-2 strategies, and their automatic selection,
are new to Version 4.1.  Version 4.0 only used the unsymmetric strategy.

Once the strategy is selected,
the factorization of the matrix $\m{A}$ is broken down into the factorization
of a sequence of dense rectangular frontal matrices.  The frontal matrices are
related to each other by a supernodal column elimination tree, in which each
node in the tree represents one frontal matrix.  This analysis phase also
determines upper bounds on the memory usage, the floating-point operation count,
and the number of nonzeros in the LU factors.

UMFPACK factorizes each {\em chain} of frontal matrices in a single working
array, similar to how the unifrontal method \cite{dusc:96} factorizes the whole
matrix.  A chain of frontal matrices is a sequence of fronts where the parent
of front $i$ is $i$+1 in the supernodal column elimination tree.  For the
nonsingular matrices factorized with the unsymmetric strategy, there are
exactly the same number of chains as there are leaves in the supernodal
column elimination tree.  UMFPACK is an
outer-product based, right-looking method.  At the $k$-th step of Gaussian
elimination, it represents the updated submatrix $\m{A}_k$ as an implicit
summation of a set of dense sub-matrices (referred to as {\em elements},
borrowing a phrase from finite-element methods) that arise when the frontal
matrices are factorized and their pivot rows and columns eliminated.

Each frontal matrix represents the elimination of one or more columns;
each column of $\m{A}$ will be eliminated in a specific frontal matrix,
and which frontal matrix will be used for which column is determined by
the pre-analysis phase.  The pre-analysis phase also determines the worst-case
size of each frontal matrix so that they can hold any candidate pivot column
and any candidate pivot row.  From the perspective of the analysis phase, any
candidate pivot column in the frontal matrix is identical (in terms of nonzero
pattern), and so is any row.  However, the numeric factorization phase has
more information than the analysis phase.  It uses this information to reorder
the columns within each frontal matrix to reduce fill-in.  Similarly, since
the number of nonzeros in each row and column are maintained (more precisely,
COLMMD-style approximate degrees \cite{GilbertMolerSchreiber}), a pivot row can
be selected based on sparsity-preserving criteria (low degree) as well as
numerical considerations (relaxed threshold partial pivoting).

When the symmetric or 2-by-2 strategies are used,
the column preordering is not refined during numeric factorization.
Row pivoting for sparsity and numerical accuracy is performed if the
diagonal entry is too small.

More details of the method, including experimental results, are
described in \cite{Davis03,Davis03_algo}, available at
http://www.cise.ufl.edu/tech-reports.

%-------------------------------------------------------------------------------
\section{Availability}
%-------------------------------------------------------------------------------

In addition to appearing as a Collected Algorithm of the ACM,
UMFPACK Version 4.4 is available at http://www.cise.ufl.edu/research/sparse.
Version 4.3 is included as a built-in routine in MATLAB
7.1.  Version 4.0 (in MATLAB 6.5)
does not have the symmetric or 2-by-2 strategies and it takes
less advantage of the level-3
BLAS \cite{DaydeDuff99,ACM679a,ATLAS,GotoVandeGeijn02}.
Version 4.4 (and v4.3 through v4.1) tend to be much faster than Version 4.0,
particularly on unsymmetric matrices with mostly symmetric
nonzero pattern (such as finite element and circuit simulation matrices).
Version 3.0 and following make
use of a modified version of COLAMD V2.0 by Timothy A.~Davis, Stefan
Larimore, John Gilbert, and Esmond Ng.  The original COLAMD V2.1 is available in
as a built-in routine in MATLAB V6.0 (or later), and at
http://www.cise.ufl.edu/research/sparse.
These codes are also available in Netlib \cite{netlib} at
http://www.netlib.org.
UMFPACK Versions 2.2.1 and earlier, co-authored with Iain Duff,
are available at http://www.cise.ufl.edu/research/sparse and as
MA38 (functionally equivalent to Version 2.2.1) in the Harwell
Subroutine Library.

%-------------------------------------------------------------------------------
\section{Primary changes from prior versions}
%-------------------------------------------------------------------------------

A detailed list of changes is in the {\tt ChangeLog} file.

%-------------------------------------------------------------------------------
\subsection{Version 4.4}
%-------------------------------------------------------------------------------

Bug fix in strategy selection in {\tt umfpack\_*\_qsymbolic}.
Added packed complex case for all complex input/output arguments.
Added {\tt umfpack\_get\_determinant}.
Added minimal support for Microsoft Visual Studio
(the {\tt umf\_multicompile.c} file).

%-------------------------------------------------------------------------------
\subsection{Version 4.3.1}
%-------------------------------------------------------------------------------

Minor bug fix in the forward/backsolve.  This bug had the effect of turning
off iterative refinement when solving $\m{A}\tr\m{x}=\m{b}$ after factorizing
$\m{A}$.  UMFPACK mexFunction now factorizes $\m{A}\tr$ in its forward-slash
operation.

%-------------------------------------------------------------------------------
\subsection{Version 4.3}
%-------------------------------------------------------------------------------

No changes are visible to the C or MATLAB user, except the presence of
one new control parameter in the {\tt Control} array,
and three new statistics in the {\tt Info} array.
The primary change is the addition of an (optional) drop tolerance.

%-------------------------------------------------------------------------------
\subsection{Version 4.1}
%-------------------------------------------------------------------------------

The following is a summary of the main changes that are visible to the C
or MATLAB user:

\begin{enumerate}

\item New ordering strategies added.  No changes are required in user code
    (either C or MATLAB) to use the new default strategy, which is an automatic
    selection of the unsymmetric, symmetric, or 2-by-2 strategies.

\item Row scaling added.  This is only visible to the MATLAB caller when using
    the form {\tt [L,U,P,Q,R] = umfpack (A)}, to retrieve the LU factors.
    Likewise, it is only visible to the C caller when the LU factors are
    retrieved, or when solving systems with just $\m{L}$ or $\m{U}$.
    New C-callable and MATLAB-callable routines are included to get and to
    apply the scale factors computed by UMFPACK.  Row scaling is enabled by
    default, but can be disabled.  Row scaling usually leads to a better
    factorization, particularly when the symmetric strategy is used.

\item Error code {\tt UMFPACK\_ERROR\_problem\_to\_large} removed.
    Version 4.0 would generate this error when the upper bound memory usage
    exceeded 2GB (for the {\tt int} version), even when the actual memory
    usage was less than this.  The new version properly handles this case,
    and can successfully factorize the matrix if sufficient memory is
    available.

\item New control parameters and statistics provided.

\item The AMD symmetric approximate minimum degree ordering routine added
    \cite{AmestoyDavisDuff96,AmestoyDavisDuff03}.
    It is used by UMFPACK, and can also be called independently from C or
    MATLAB.

\item The {\tt umfpack} mexFunction now returns permutation matrices, not
    permutation vectors, when using the form {\tt [L,U,P,Q] = umfpack (A)}
    or the new form {\tt [L,U,P,Q,R] = umfpack (A)}.

\item New arguments added to the user-callable routines
    {\tt umfpack\_*\_symbolic},
    {\tt umfpack\_*\_qsymbolic},
    {\tt umfpack\_*\_get\_numeric}, and
    {\tt umfpack\_*\_get\_symbolic}.
    The symbolic analysis now makes use of the numerical values of the matrix
    $\m{A}$, to guide the 2-by-2 strategy.  The subsequent matrix passed to
    the numeric factorization step does not have to have the same numerical
    values.  All of the new arguments are optional.  If you do not wish to
    include them, simply pass {\tt NULL} pointers instead.  The 2-by-2 strategy
    will assume all entries are numerically large, for example.

\item New routines added to save and load the {\tt Numeric} and {\tt Symbolic}
    objects to and from a binary file.

\item A Fortran interface added.  It provides access to a subset of
    UMFPACK's features.

\item You can compute an incomplete LU factorization, by dropping small
    entries from $\m{L}$ and $\m{U}$.  By default, no nonzero entry is
    dropped, no matter how small in absolute value.  This feature is new
    to Version 4.3.

\end{enumerate}

%-------------------------------------------------------------------------------
\section{Using UMFPACK in MATLAB}
%-------------------------------------------------------------------------------

The easiest way to use UMFPACK is within MATLAB.  Version 4.3 is a built-in
routine in MATLAB 7.1, and is used in {\tt x = A}$\backslash${\tt b} when
{\tt A} is sparse, square, unsymmetric (or symmetric but not positive definite),
and with nonzero entries that are not confined in a narrow band.
It is also used for the {\tt [L,U,P,Q] = lu (A)} usage of {\tt lu}.
Type {\tt help lu} in MATLAB 6.5 or later for more details.

To use the UMFPACK mexFunction, you must download and compile it,
since the mexFunction itself is not part of MATLAB.
The following discussion assumes that
you have MATLAB Version 6.0 or later (which includes the BLAS, and the
{\tt colamd} ordering routine).  To compile both the UMFPACK and AMD
mexFunctions, just type {\tt make} in the Unix system shell,
while in the {\tt UMFPACK} directory.
You can also type {\tt umfpack\_make} in MATLAB, if you are in the
{\tt UMFPACK/MATLAB} directory, or if that directory is in your MATLAB path.
This works on any system with MATLAB, including Windows.
See Section~\ref{Install} for more details on how to install UMFPACK.
Once installed, the UMFPACK mexFunction can analyze, factor, and solve linear
systems.  Table~\ref{matlab} summarizes some of the more common uses
of the UMFPACK mexFunction within MATLAB.

An optional input argument can be used to modify the control parameters for
UMFPACK, and an optional output argument provides statistics on the
factorization.

Refer to the AMD User Guide for more details about the AMD mexFunction.

\begin{table}
\caption{Using UMFPACK's MATLAB interface}
\label{matlab}
\vspace{0.1in}
{\footnotesize
\begin{tabular}{l|l|l}
\hline
Function & Using UMFPACK & MATLAB 6.0 equivalent \\
\hline
 & & \\
\begin{minipage}[t]{1.5in}
Solve $\m{Ax}=\m{b}$.
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
x = umfpack (A,'\',b) ;
\end{verbatim}
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
x = A \ b ;
\end{verbatim}
\end{minipage}
 \\
 & & \\
\hline
 & & \\
\begin{minipage}[t]{1.5in}
Solve $\m{Ax}=\m{b}$ using a different row and column pre-ordering
(symmetric ordering).
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
S = spones (A) ;
Q = symamd (S+S') ;
Control = umfpack ;
Control (6) = 3 ;
x = umfpack (A,Q,'\',b,Control) ;
\end{verbatim}
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
spparms ('autommd',0) ;
S = spones (A) ;
Q = symamd (S+S') ;
x = A (Q,Q) \ b (Q) ;
x (Q) = x ;
spparms ('autommd',1) ;
\end{verbatim}
\end{minipage}
 \\
 & & \\
\hline
 & & \\
\begin{minipage}[t]{1.5in}
Solve $\m{A}\tr\m{x}\tr = \m{b}\tr$.
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
x = umfpack (b,'/',A) ;
\end{verbatim}
Note: $\m{A}$ is factorized.
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
x = b / A ;
\end{verbatim}
Note: $\m{A}\tr$ is factorized.
\end{minipage}
 \\
 & & \\
\hline
 & & \\
\begin{minipage}[t]{1.5in}
Scale and factorize $\m{A}$, then solve $\m{Ax}=\m{b}$.
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
[L,U,P,Q,R] = umfpack (A) ;
c = P * (R \ b) ;
x = Q * (U \ (L \ c)) ;
\end{verbatim}
\end{minipage}
&
\begin{minipage}[t]{2.2in}
\begin{verbatim}
[m n] = size (A) ;
r = full (sum (abs (A), 2)) ;
r (find (r == 0)) = 1 ;
R = spdiags (r, 0, m, m) ;
I = speye (n) ;
Q = I (:, colamd (A)) ;
[L,U,P] = lu ((R\A)*Q) ;
c = P * (R \ b) ;
x = Q * (U \ (L \ c)) ;
\end{verbatim}
\end{minipage}
 \\
 & & \\
\hline
\end{tabular}
}
\end{table}

Note: in MATLAB 6.5 or later, use {\tt spparms ('autoamd',0)} in addition to
{\tt spparms ('autommd',0)}, in Table~\ref{matlab}, to turn off MATLAB's
default reordering.

UMFPACK requires
{\tt b} to be a dense vector (real or complex) of the appropriate dimension.
This is more restrictive than what you can do with MATLAB's
backslash or forward slash.  See {\tt umfpack\_solve} for an M-file that
removes this restriction.
This restriction does not apply to the built-in backslash operator
in MATLAB 6.5 or later, which uses UMFPACK to factorize the matrix.
You can do this yourself in MATLAB:

{\footnotesize
\begin{verbatim}
    [L,U,P,Q,R] = umfpack (A) ;
    x = Q * (U \ (L \ (P * (R \ b)))) ;
\end{verbatim}
}

or, with no row scaling:

{\footnotesize
\begin{verbatim}
    [L,U,P,Q] = umfpack (A) ;
    x = Q * (U \ (L \ (P * b))) ;
\end{verbatim}
}

The above examples do not make use of the iterative refinement
that is built into
{\tt x = }{\tt umfpack (A,'}$\backslash${\tt ',b)}
however.

MATLAB's {\tt [L,U,P] = lu(A)} returns a lower triangular {\tt L}, an upper
triangular {\tt U}, and a permutation matrix {\tt P} such that {\tt P*A} is
equal to {\tt L*U}.  UMFPACK behaves differently.  By default, it scales
the rows of {\tt A} and reorders the columns of {\tt A} prior to
factorization, so that {\tt L*U} is equal to {\tt P*(R}$\backslash${\tt A)*Q},
where {\tt R} is a diagonal sparse matrix of scale factors for the rows
of {\tt A}.  The scale factors {\tt R} are applied to {\tt A} via the MATLAB
expression {\tt R}$\backslash${\tt A} to avoid multiplying by
the reciprocal, which can be numerically inaccurate.

There are more options; you can provide your own column pre-ordering (in which
case UMFPACK does not call COLAMD or AMD), you can modify other control settings
(similar to the {\tt spparms} in MATLAB), and you can get various statistics on
the analysis, factorization, and solution of the linear system.  Type
{\tt umfpack\_details} and {\tt umfpack\_report} in MATLAB for more
information.  Two demo M-files are provided.   Just type {\tt umfpack\_simple}
and {\tt umfpack\_demo} to run them.
The output of these two programs should be about the same
as the files {\tt umfpack\_simple.m.out} and {\tt umfpack\_demo.m.out}
that are provided.

Factorizing {\tt A'} (or {\tt A.'}) and using the transposed factors can
sometimes be faster than factorizing {\tt A}.  It can also be preferable to
factorize {\tt A'} if {\tt A} is rectangular.  UMFPACK pre-orders the columns
to maintain sparsity; the row ordering is not determined until the matrix
is factorized.  Thus, if {\tt A} is {\tt m} by {\tt n} with rank {\tt m}
and {\tt m} $<$ {\tt n}, then {\tt umfpack} might not find a factor
{\tt U} with a zero-free diagonal.  Unless the matrix ill-conditioned or
poorly scaled, factorizing {\tt A'} in this case will guarantee that both
factors will have zero-free diagonals.  Here's how you can factorize {\tt A'}
and get the factors of {\tt A} instead:

\begin{verbatim}
    [l,u,p,q] = umfpack (A') ;
    L = u' ;
    U = l' ;
    P = q ;
    Q = p ;
    clear l u p q
\end{verbatim}

This is an alternative to {\tt [L,U,P,Q]=umfpack(A)}.

A simple M-file ({\tt umfpack\_btf}) is provided that first permutes the matrix
to upper block triangular form, using MATLAB's {\tt dmperm} routine, and then
solves each block.  The LU factors are not returned.  Its usage is simple:
{\tt x = umfpack\_btf(A,b)}.  Type {\tt help umfpack\_btf} for more options.
An estimate of the 1-norm of {\tt L*U-P*A*Q} can be computed in MATLAB
as {\tt lu\_normest(P*A*Q,L,U)}, using the {\tt lu\_normest.m} M-file
by Hager and Davis \cite{DavisHager99} that is included with the
UMFPACK distribution.  With row scaling enabled, use
{\tt lu\_normest(P*(R}$\backslash${\tt A)*Q,L,U)} instead.

One issue you may encounter is how UMFPACK allocates its memory when being used
in a mexFunction.  One part of its working space is of variable size.   The
symbolic analysis phase determines an upper bound on the size of this memory,
but not all of this memory will typically be used in the numerical
factorization.  UMFPACK tries to allocate a decent amount of working space.
This is 70\% of the upper bound, by default, for the unsymmetric strategy.
For the symmetric strategy, the fraction of the upper bound is computed
automatically (assuming a best-case scenario with no numerical pivoting
required during numeric factorization).
If this initial allocation fails, it reduces its request
and uses less memory.   If the space is not large enough during factorization,
it is increased via {\tt mxRealloc}.

However, {\tt mxMalloc} and {\tt mxRealloc} abort the {\tt umfpack} mexFunction
if they fail, so this strategy does not work in MATLAB.  The strategy works fine
when {\tt malloc} or the internal memory allocator {\tt utMalloc} are used
instead, since those routines return {\tt NULL} on failure, and do not terminate
the mexFunction.  The {\tt umfpack} mexFunction can be compiled to use
{\tt utMalloc}, but this is an internal undocumented utility routine in MATLAB,
and thus using {\tt utMalloc} might not always be successful.
To use the documented {\tt mxMalloc} routine instead, compile the
mexFunction with the {\tt -DNUTIL} flag enabled.

To compute the determinant with UMFPACK:

\begin{verbatim}
    d = umfpack (A, 'det') ;
    [d e] = umfpack (A, 'det') ;
\end{verbatim}

The first case is identical to MATLAB's {\tt det}.
The second case returns the determinant in the form
$d \times 10^e$, which avoids overflow if $e$ is large.

%-------------------------------------------------------------------------------
\section{Using UMFPACK in a C program}
\label{C}
%-------------------------------------------------------------------------------

The C-callable UMFPACK library consists of 32 user-callable routines and one
include file.  All but three of the routines come in four versions, with
different sizes of integers and for real or complex floating-point numbers:
\begin{enumerate}
\item {\tt umfpack\_di\_*}: real double precision, {\tt int} integers.
\item {\tt umfpack\_dl\_*}: real double precision, {\tt long} integers.
\item {\tt umfpack\_zi\_*}: complex double precision, {\tt int} integers.
\item {\tt umfpack\_zl\_*}: complex double precision, {\tt long} integers.
\end{enumerate}
where {\tt *} denotes the specific name of one of the routines.
Routine names beginning with {\tt umf\_} are internal to the package,
and should not be called by the user.  The include file {\tt umfpack.h}
must be included in any C program that uses UMFPACK.
The other three routines are the same for all four versions.

In addition, the C-callable AMD library distributed with UMFPACK
includes 4 user-callable routines (in two versions with {\tt int} and
{\tt long} integers) and one include file.  Refer to the AMD documentation
for more details.

Use only one version for any one problem; do not attempt to use one version
to analyze the matrix and another version to factorize the matrix, for example.

The notation {\tt umfpack\_di\_*} refers to all user-callable routines
for the real double precision and {\tt int} integer case.  The notation
{\tt umfpack\_*\_numeric}, for example, refers all four versions
(real/complex, int/long) of a single operation
(in this case numeric factorization).

%-------------------------------------------------------------------------------
\subsection{The size of an integer}
%-------------------------------------------------------------------------------

The {\tt umfpack\_di\_*} and {\tt umfpack\_zi\_*} routines use {\tt int} integer
arguments; those starting with {\tt umfpack\_dl\_} or {\tt umfpack\_zl\_}
use {\tt long} integer arguments.  If you compile UMFPACK in the standard
ILP32 mode (32-bit {\tt int}'s, {\tt long}'s, and pointers) then the versions
are essentially identical.  You will be able to solve problems using up to 2GB
of memory.  If you compile UMFPACK in the standard LP64 mode, the size of an
{\tt int} remains 32-bits, but the size of a {\tt long} and a pointer both get
promoted to 64-bits.  In the LP64 mode, the {\tt umfpack\_dl\_*}
and {\tt umfpack\_zl\_*} routines can solve huge
problems (not limited to 2GB), limited of course by the amount of available
memory.  The only drawback to the 64-bit mode is that not all BLAS libraries
support 64-bit integers.  This limits the performance you will obtain.
Those that do support 64-bit integers are specific to particular
architectures, and are not portable.  UMFPACK and AMD should be compiled
in the same mode.
If you compile UMFPACK and AMD in the LP64 mode,
be sure to add {\tt -DLP64} to the compilation command.  See the examples in
{\tt Make.alpha}, {\tt Make.sgi}, and {\tt Make.solaris}.

%-------------------------------------------------------------------------------
\subsection{Real and complex floating-point}
%-------------------------------------------------------------------------------

The {\tt umfpack\_di\_*} and {\tt umfpack\_dl\_*} routines take (real) double
precision arguments, and return double precision arguments.  In the
{\tt umfpack\_zi\_*} and {\tt umfpack\_zl\_*} routines, these same arguments
hold the real part of the matrices; and second double precision arrays hold
the imaginary part of the input and output matrices.  Internally, complex
numbers are stored in arrays with their real and imaginary parts interleaved,
as required by the BLAS (``packed'' complex form).

New to Version 4.4 is the option of providing input/output arguments
in packed complex form.

%-------------------------------------------------------------------------------
\subsection{Primary routines, and a simple example}
%-------------------------------------------------------------------------------

Five primary UMFPACK routines are required to factorize $\m{A}$ or
solve $\m{Ax}=\m{b}$.  They are fully described in Section~\ref{Primary}:

\begin{itemize}
\item {\tt umfpack\_*\_symbolic}:

    Pre-orders the columns of $\m{A}$ to reduce fill-in.
    Returns an opaque {\tt Symbolic} object as a {\tt void *}
    pointer.  The object contains the symbolic analysis and is needed for the
    numeric factorization.  This routine requires only $O(|\m{A}|)$ space,
    where $|\m{A}|$ is the number of nonzero entries in the matrix.  It computes
    upper bounds on the nonzeros in $\m{L}$ and $\m{U}$, the floating-point
    operations required, and the memory usage of {\tt umfpack\_*\_numeric}.  The
    {\tt Symbolic} object is small; it contains just the column pre-ordering,
    the supernodal column elimination tree, and information about each frontal
    matrix. It is no larger than about $13n$ integers if $\m{A}$ is
    $n$-by-$n$.

\item {\tt umfpack\_*\_numeric}:

    Numerically scales and then factorizes a sparse matrix into
    $\m{PAQ}$, $\m{PRAQ}$, or $\m{PR}^{-1}\m{AQ}$ into the product $\m{LU}$,
    where
    $\m{P}$ and $\m{Q}$ are permutation matrices, $\m{R}$ is a diagonal
    matrix of scale factors, $\m{L}$ is lower triangular with unit diagonal,
    and $\m{U}$ is upper triangular.  Requires the
    symbolic ordering and analysis computed by {\tt umfpack\_*\_symbolic}
    or {\tt umfpack\_*\_qsymbolic}.
    Returns an opaque {\tt Numeric} object as a
    {\tt void *} pointer.  The object contains the numerical factorization and
    is used by {\tt umfpack\_*\_solve}.  You can factorize a new matrix with a
    different values (but identical pattern) as the matrix analyzed by
    {\tt umfpack\_*\_symbolic} or {\tt umfpack\_*\_qsymbolic} by re-using the
    {\tt Symbolic} object (this feature is available when using UMFPACK in a
    C or Fortran program, but not in MATLAB).
    The matrix
    $\m{U}$ will have zeros on the diagonal if $\m{A}$ is singular; this
    produces a warning, but the factorization is still valid.

\item {\tt umfpack\_*\_solve}:

    Solves a sparse linear system ($\m{Ax}=\m{b}$, $\m{A}\tr\m{x}=\m{b}$, or
    systems involving just $\m{L}$ or $\m{U}$), using the numeric factorization
    computed by {\tt umfpack\_*\_numeric}.  Iterative refinement with sparse
    backward error \cite{ardd:89} is used by default.  The matrix $\m{A}$ must
    be square.  If it is singular, then a divide-by-zero will occur, and your
    solution with contain IEEE Inf's or NaN's in the appropriate places.

\item {\tt umfpack\_*\_free\_symbolic}:

    Frees the {\tt Symbolic} object created by {\tt umfpack\_*\_symbolic}
    or {\tt umfpack\_*\_qsymbolic}.

\item {\tt umfpack\_*\_free\_numeric}:

    Frees the {\tt Numeric} object created by {\tt umfpack\_*\_numeric}.

\end{itemize}

Be careful not to free a {\tt Symbolic} object with
{\tt umfpack\_*\_free\_numeric}.  Nor should you attempt to free a {\tt Numeric}
object with {\tt umfpack\_*\_free\_symbolic}.
Failure to free these objects will lead to memory leaks.

The matrix $\m{A}$ is represented in compressed column form, which is
identical to the sparse matrix representation used by MATLAB.  It consists
of three or four arrays, where the matrix is {\tt m}-by-{\tt n},
with {\tt nz} entries.  For the {\tt int} version of UMFPACK:

{\footnotesize
\begin{verbatim}
     int Ap [n+1] ;
     int Ai [nz] ;
     double Ax [nz] ;
\end{verbatim}
}

For the {\tt long} version of UMFPACK:

{\footnotesize
\begin{verbatim}
     long Ap [n+1] ;
     long Ai [nz] ;
     double Ax [nz] ;
\end{verbatim}
}

The complex versions add another array for the imaginary part:

{\footnotesize
\begin{verbatim}
     double Az [nz] ;
\end{verbatim}
}

Alternatively, if {\tt Az} is {\tt NULL},
the real part of the $k$th entry is located in
{\tt Ax[2*k]} and the imaginary part is located in
{\tt Ax[2*k+1]}, and the {\tt Ax} array is of size {\tt 2*nz}.

All nonzeros are entries, but an entry may be numerically zero.  The row indices
of entries in column {\tt j} are stored in
    {\tt Ai[Ap[j]} \ldots {\tt Ap[j+1]-1]}.
The corresponding numerical values are stored in
    {\tt Ax[Ap[j]} \ldots {\tt Ap[j+1]-1]}.
The imaginary part, for the complex versions, is stored in
    {\tt Az[Ap[j]} \ldots {\tt Ap[j+1]-1]}
    (see above for the packed complex case).

No duplicate row indices may be present, and the row indices in any given
column must be sorted in ascending order.  The first entry {\tt Ap[0]} must be
zero.  The total number of entries in the matrix is thus {\tt nz = Ap[n]}.
Except for the fact that extra zero entries can be included, there is thus a
unique compressed column representation of any given matrix $\m{A}$.
For a more flexible method for providing an input matrix to UMFPACK,
see Section~\ref{triplet}.

Here is a simple main program, {\tt umfpack\_simple.c}, that illustrates the
basic usage of UMFPACK.  See Section~\ref{Synopsis} for a short description
of each calling sequence, including a list of options for the first
argument of {\tt umfpack\_di\_solve}.

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_simple.c via sed
\end{verbatim}
}

The {\tt Ap}, {\tt Ai}, and {\tt Ax} arrays represent the matrix
\[
\m{A} = \left[
\begin{array}{rrrrr}
 2 &  3 &  0 &  0 &  0 \\
 3 &  0 &  4 &  0 &  6 \\
 0 & -1 & -3 &  2 &  0 \\
 0 &  0 &  1 &  0 &  0 \\
 0 &  4 &  2 &  0 &  1 \\
\end{array}
\right].
\]
and the solution to $\m{Ax}=\m{b}$ is $\m{x} = [1 \, 2 \, 3 \, 4 \, 5]\tr$.
The program uses default control settings and does not return any statistics
about the ordering, factorization, or solution ({\tt Control} and {\tt Info}
are both {\tt (double *) NULL}).  It also ignores the status value returned by
most user-callable UMFPACK routines.

%-------------------------------------------------------------------------------
\subsection{A note about zero-sized arrays}
%-------------------------------------------------------------------------------

UMFPACK uses many user-provided arrays of
size {\tt m} or {\tt n} (the order of the matrix), and of size
{\tt nz} (the number of nonzeros in a matrix).  UMFPACK does not handle
zero-dimensioned arrays;
it returns an error code if {\tt m} or {\tt n}
are zero.  However, {\tt nz} can be zero, since all singular matrices are
handled correctly.  If you attempt to {\tt malloc} an array of size {\tt nz}
= 0, however, {\tt malloc} will return a null pointer which UMFPACK will report
as a missing argument.  If you {\tt malloc} an array of
size {\tt nz} to pass to UMFPACK, make sure that you handle the {\tt nz} = 0
case correctly (use a size equal to the maximum of {\tt nz} and 1, or use a
size of {\tt nz+1}).

%-------------------------------------------------------------------------------
\subsection{Alternative routines}
%-------------------------------------------------------------------------------

Three alternative routines are provided that modify UMFPACK's default
behavior.  They are fully described in Section~\ref{Alternative}:

\begin{itemize}
\item {\tt umfpack\_*\_defaults}:

    Sets the default control parameters in the {\tt Control} array.  These can
    then be modified as desired before passing the array to the other UMFPACK
    routines.  Control parameters are summarized in Section~\ref{control_param}.
    Three particular parameters deserve special notice.
    UMFPACK uses relaxed partial pivoting, where a candidate pivot entry is
    numerically acceptable if its magnitude is greater than or equal to a
    tolerance parameter times the magnitude of the largest entry in the same
    column.  The parameter {\tt Control [UMFPACK\_PIVOT\_TOLERANCE]} has a
    default value of 0.1, and is used for the unsymmetric strategy.
    For complex matrices, a cheap approximation of the absolute value is
    used for the threshold pivoting test
    ($|a| \approx |a_{\mbox{real}}|+|a_{\mbox{imag}}|$).

    For the symmetric strategy, a second tolerance is used for diagonal
    entries: \newline {\tt Control [UMFPACK\_SYM\_PIVOT\_TOLERANCE]}, with
    a default value of 0.001.  The first parameter (with a default of 0.1)
    is used for any off-diagonal candidate pivot entries.

    These two parameters may be too small for some matrices, particularly for
    ill-conditioned or poorly scaled ones.  With the default pivot tolerances
    and default iterative refinement,
        {\tt x = umfpack (A,'}$\backslash${\tt ',b)}
    is just as accurate as (or more accurate) than
        {\tt x = A}$\backslash${\tt b}
    in MATLAB 6.1 for nearly all matrices.

    If {\tt Control [UMFPACK\_PIVOT\_TOLERANCE]} is zero, than any
    nonzero entry is acceptable as a pivot (this is changed from Version 4.0,
    which treated a value of 0.0 the same as 1.0).  If the symmetric strategy is
    used, and {\tt Control [UMFPACK\_SYM\_PIVOT\_TOLERANCE]} is zero, then any
    nonzero entry on the diagonal is accepted as a pivot.  Off-diagonal pivoting
    will still occur if the diagonal entry is exactly zero.  The
    {\tt Control [UMFPACK\_SYM\_PIVOT\_TOLERANCE]} parameter is new to Version
    4.1.  It is similar in function to the pivot tolerance for left-looking
    methods (the MATLAB {\tt THRESH} option in {\tt [L,U,P] = lu (A, THRESH)},
    and the pivot tolerance parameter in SuperLU).

    The parameter {\tt Control [UMFPACK\_STRATEGY]} can be used to bypass
    UMFPACK's automatic strategy selection.  The automatic strategy nearly
    always selects the best method.  When it does not, the different methods
    nearly always give about the same quality of results.  There may be
    cases where the automatic strategy fails to pick a good strategy. Also,
    you can save some computing time if you know the right strategy for your
    set of matrix problems.

\item {\tt umfpack\_*\_qsymbolic}:

    An alternative to {\tt umfpack\_*\_symbolic}.  Allows the user to specify
    his or her own column pre-ordering, rather than using the default COLAMD
    or AMD pre-orderings.  For example, a graph partitioning-based order
    of $\m{A}\tr\m{A}$ would be suitable for UMFPACK's unsymmetric strategy.
    A partitioning of $\m{A}+\m{A}\tr$ would be suitable for UMFPACK's
    symmetric or 2-by-2 strategies.

\item {\tt umfpack\_*\_wsolve}:

    An alternative to {\tt umfpack\_*\_solve} which does not dynamically
    allocate any memory.  Requires the user to pass two additional work
    arrays.

\end{itemize}

%-------------------------------------------------------------------------------
\subsection{Matrix manipulation routines}
\label{triplet}
%-------------------------------------------------------------------------------

The compressed column data structure is compact, and simplifies the UMFPACK
routines that operate on the sparse matrix $\m{A}$.  However, it can be
inconvenient for the user to generate.  Section~\ref{Manipulate} presents the
details of routines for manipulating sparse matrices in {\em triplet} form,
compressed column form, and compressed row form (the transpose of the
compressed column form).  The triplet form of a matrix consists of three or
four arrays.  For the {\tt int} version of UMFPACK:

{\footnotesize
\begin{verbatim}
     int Ti [nz] ;
     int Tj [nz] ;
     double Tx [nz] ;
\end{verbatim}
}

For the {\tt long} version:

{\footnotesize
\begin{verbatim}
     long Ti [nz] ;
     long Tj [nz] ;
     double Tx [nz] ;
\end{verbatim}
}

The complex versions use another array to hold the imaginary part:

{\footnotesize
\begin{verbatim}
     double Tz [nz] ;
\end{verbatim}
}

The {\tt k}-th triplet is $(i,j,a_{ij})$, where $i =$ {\tt Ti[k]},
$j =$ {\tt Tj[k]}, and $a_{ij} =$ {\tt Tx[k]}.  For the complex versions,
{\tt Tx[k]} is the real part of $a_{ij}$ and
{\tt Tz[k]} is the imaginary part.
The triplets can be in any
order in the {\tt Ti}, {\tt Tj}, and {\tt Tx} arrays (and {\tt Tz} for
the complex versions), and duplicate entries may
exist.
If {\tt Tz} is NULL, then the array {\tt Tx} becomes of size {\tt 2*nz},
and the real and imaginary parts of the
{\tt k}-th triplet are located in {\tt Tx[2*k]} and {\tt Tx[2*k+1]},
respectively.
Any duplicate entries are summed when the triplet form is converted to
compressed column form.  This is a convenient way to create a matrix arising in
finite-element methods, for example.

Four routines are provided for manipulating sparse matrices:

\begin{itemize}
\item {\tt umfpack\_*\_triplet\_to\_col}:

    Converts a triplet form of a matrix to compressed column form (ready for
    input to \newline
    {\tt umfpack\_*\_symbolic}, {\tt umfpack\_*\_qsymbolic}, and
    {\tt umfpack\_*\_numeric}).  Identical to {\tt A = spconvert(i,j,x)} in
    MATLAB, except that zero entries are not removed, so that the pattern of
    entries in the compressed column form of $\m{A}$ are fully under user
    control.  This is important if you want to factorize a new matrix with the
    {\tt Symbolic} object from a prior matrix with the same pattern as the new
    one.

\item {\tt umfpack\_*\_col\_to\_triplet}:

    The opposite of {\tt umfpack\_*\_triplet\_to\_col}.  Identical to
    {\tt [i,j,x] = find(A)} in MATLAB, except that numerically zero entries
    may be included.

\item {\tt umfpack\_*\_transpose}:

    Transposes and optionally permutes a column form matrix \cite{Gustavson78}.
    Identical to
    {\tt R = A(P,Q)'} (linear algebraic transpose, using the complex conjugate)
    or {\tt R = A(P,Q).'} (the array transpose)
    in MATLAB, except for the presence of numerically zero entries.

    Factorizing $\m{A}\tr$ and then solving $\m{Ax}=\m{b}$ with the transposed
    factors can sometimes be much faster or much slower than factorizing
    $\m{A}$.  It is highly dependent on your particular matrix.

\item {\tt umfpack\_*\_scale}:

    Applies the row scale factors to a user-provided vector.  This is not
    required to solve the sparse linear system $\m{Ax}=\m{b}$ or
    $\m{A}\tr\m{x}=\m{b}$, since {\tt umfpack\_*\_solve} applies the scale
    factors for those systems.

\end{itemize}

It is quite easy to add matrices in triplet form, subtract them, transpose
them, permute them, construct a submatrix, and multiply a triplet-form matrix
times a vector.  UMFPACK does not provide code for these basic operations,
however.  Refer to the discussion of
{\tt umfpack\_*\_triplet\_to\_col} in Section~\ref{Manipulate} for more details
on how to compute these operations in your own code.
The only primary matrix operation not provided by UMFPACK is the
multiplication of two sparse matrices \cite{Gustavson78}.
A future package under development (as of Jan. 2005), CHOLMOD,
will provide many of these matrix operations, which
can then be used in conjunction with UMFPACK.
Watch my web page for details.

%-------------------------------------------------------------------------------
\subsection{Getting the contents of opaque objects}
%-------------------------------------------------------------------------------

There are cases where you may wish to do more with the LU factorization
of a matrix than solve a linear system.  The opaque {\tt Symbolic} and
{\tt Numeric} objects are just that - opaque.  You cannot do anything with them
except to pass them back to subsequent calls to UMFPACK.  Three routines
are provided for copying their contents into user-provided arrays using simpler
data structures.  Four routines are provided for saving and loading the
{\tt Numeric} and {\tt Symbolic} objects to/from binary files.
An additional routine is provided that computes the determinant.
They are fully described in Section~\ref{Get}:

\begin{itemize}
\item {\tt umfpack\_*\_get\_lunz}:

    Returns the number of nonzeros in $\m{L}$ and $\m{U}$.

\item {\tt umfpack\_*\_get\_numeric}:

    Copies $\m{L}$, $\m{U}$, $\m{P}$, $\m{Q}$, and $\m{R}$
    from the {\tt Numeric} object
    into arrays provided by the user.  The matrix $\m{L}$ is returned in
    compressed row form (with the column indices in each row sorted in ascending
    order).  The matrix $\m{U}$ is returned in compressed column form (with
    sorted columns).  There are no explicit zero entries in $\m{L}$ and $\m{U}$,
    but such entries may exist in the {\tt Numeric} object.  The permutations
    $\m{P}$ and $\m{Q}$ are represented as permutation vectors, where
    {\tt P[k] = i} means that row {\tt i} of the original matrix is the
    the {\tt k}-th row of $\m{PAQ}$, and where
    {\tt Q[k] = j} means that column {\tt j} of the original matrix is the
    {\tt k}-th column of $\m{PAQ}$.  This is identical to how MATLAB uses
    permutation vectors (type {\tt help colamd} in MATLAB 6.1 or later).

\item {\tt umfpack\_*\_get\_symbolic}:

    Copies the contents of the {\tt Symbolic} object (the initial row and column
    preordering, supernodal column elimination tree, and information
    about each frontal matrix) into arrays provided by the user.

\item {\tt umfpack\_*\_get\_determinant}:

    Computes the determinant from the diagonal of $\m{U}$ and the permutations
    $\m{P}$ and $\m{Q}$.  This is mostly of theoretical interest.
    It is not a good test to determine if your matrix is singular or not.

\item {\tt umfpack\_*\_save\_numeric}:

    Saves a copy of the {\tt Numeric} object to a file, in binary format.

\item {\tt umfpack\_*\_load\_numeric}:

    Creates a {\tt Numeric} object by loading it from a file created
    by {\tt umfpack\_*\_save\_numeric}.

\item {\tt umfpack\_*\_save\_symbolic}:

    Saves a copy of the {\tt Symbolic} object to a file, in binary format.

\item {\tt umfpack\_*\_load\_symbolic}:

    Creates a {\tt Symbolic} object by loading it from a file created
    by {\tt umfpack\_*\_save\_symbolic}.

\end{itemize}

UMFPACK itself does not make use of these routines;
they are provided solely for returning the contents of the opaque
{\tt Symbolic} and {\tt Numeric} objects to the user, and saving/loading
them to/from a binary file.  None of them do any computation, except for
{\tt umfpack\_*\_get\_determinant}.

%-------------------------------------------------------------------------------
\subsection{Reporting routines}
\label{Reporting}
%-------------------------------------------------------------------------------

None of the UMFPACK routines discussed so far prints anything, even when an
error occurs.  UMFPACK provides you with nine routines for printing the input
and output arguments (including the {\tt Control} settings and {\tt Info}
statistics) of UMFPACK routines discussed above.  They are fully described in
Section~\ref{Report}:

\begin{itemize}
\item {\tt umfpack\_*\_report\_status}:

    Prints the status (return value) of other {\tt umfpack\_*} routines.

\item {\tt umfpack\_*\_report\_info}:

    Prints the statistics returned in the {\tt Info} array by
    {\tt umfpack\_*\_*symbolic},
    {\tt umfpack\_*\_numeric}, and {\tt umfpack\_*\_*solve}.

\item {\tt umfpack\_*\_report\_control}:

    Prints the {\tt Control} settings.

\item {\tt umfpack\_*\_report\_matrix}:

    Verifies and prints a compressed column-form or compressed row-form sparse
    matrix.

\item {\tt umfpack\_*\_report\_triplet}:

    Verifies and prints a matrix in triplet form.

\item {\tt umfpack\_*\_report\_symbolic}:

    Verifies and prints a {\tt Symbolic} object.

\item {\tt umfpack\_*\_report\_numeric}:

    Verifies and prints a {\tt Numeric} object.

\item {\tt umfpack\_*\_report\_perm}:

    Verifies and prints a permutation vector.

\item {\tt umfpack\_*\_report\_vector}:

    Verifies and prints a real or complex vector.

\end{itemize}

The {\tt umfpack\_*\_report\_*} routines behave slightly differently when
compiled
into the C-callable UMFPACK library than when used in the MATLAB mexFunction.
MATLAB stores its sparse matrices using the same compressed column data
structure discussed above, where row and column indices of an $m$-by-$n$
matrix are in the range 0 to $m-1$ or $n-1$, respectively\footnote{Complex
matrices in MATLAB use the split array form, with one {\tt double} array
for the real part and another array for the imaginary part.  UMFPACK
supports that format, as well as the packed complex format (new to Version 4.4).}
It prints them as if they are in the range 1 to $m$ or $n$.
The UMFPACK mexFunction behaves the same way.

You can control how much the {\tt umfpack\_*\_report\_*} routines print by
modifying the {\tt Control [UMFPACK\_PRL]} parameter.  Its default value is 1.
Here is a summary of how the routines use this print level parameter:

\begin{itemize}
\item {\tt umfpack\_*\_report\_status}:

    No output if the print level is 0 or less, even when an error occurs.
    If 1, then error messages are printed, and nothing is printed if
    the status is {\tt UMFPACK\_OK}.  A warning message is printed if
    the matrix is singular.  If 2 or more, then the status is always
    printed.  If 4 or more, then the UMFPACK Copyright is printed.
    If 6 or more, then the UMFPACK License is printed.  See also the first page
    of this User Guide for the Copyright and License.

\item {\tt umfpack\_*\_report\_control}:

    No output if the print level is 1 or less.  If 2 or more, all of
    {\tt Control} is printed.

\item {\tt umfpack\_*\_report\_info}:

    No output if the print level is 1 or less.  If 2 or more, all of
    {\tt Info} is printed.

\item all other {\tt umfpack\_*\_report\_*} routines:

    If the print level is 2 or less, then these routines return silently without
    checking their inputs.  If 3 or more, the inputs are fully verified and a
    short status summary is printed.  If 4, then the first few entries of the
    input arguments are printed.  If 5, then all of the input arguments are
    printed.

\end{itemize}

This print level parameter has an additional effect on the MATLAB mexFunction.
If zero, then no warnings of singular or nearly singular matrices are
printed (similar to the MATLAB commands
{\tt warning off MATLAB:singularMatrix} and
{\tt warning off MATLAB:nearlySingularMatrix}).

%-------------------------------------------------------------------------------
\subsection{Utility routines}
%-------------------------------------------------------------------------------

UMFPACK v4.0 included a routine that returns the time used by the process,
{\tt umfpack\_timer}.  The routine uses either {\tt getrusage} (which is
preferred), or the ANSI C {\tt clock} routine if that is not available.
It is fully described in Section~\ref{Utility}.  It is still available in
UMFPACK v4.1 and following, but not used internally.
Two new timing routines are provided in UMFPACK Version 4.1 and following,
{\tt umfpack\_tic} and {\tt umfpack\_toc}.  They use POSIX-compliant
{\tt sysconf} and {\tt times} routines to find both the CPU time
and wallclock time.
These three routines are the only user-callable
routine that is identical in all four {\tt int}/{\tt long}, real/complex
versions (there is no {\tt umfpack\_di\_timer} routine, for example).

%-------------------------------------------------------------------------------
\subsection{Control parameters}
\label{control_param}
%-------------------------------------------------------------------------------

UMFPACK uses an optional {\tt double} array (currently of size 20)
to modify its control parameters.  If you pass {\tt (double *) NULL} instead
of a {\tt Control} array, then defaults are used.  These defaults provide
nearly optimal performance (both speed, memory usage, and numerical accuracy)
for a wide range of matrices from real applications.

This array will almost certainly grow in size in future releases,
so be sure to dimension your {\tt Control} array to be of size
{\tt UMFPACK\_CONTROL}.  That constant is currently defined to be 20,
but may increase in future versions, since all 20 entries are in use.

The contents of this array may be modified by the user
(see {\tt umfpack\_*\_defaults}).  Each
user-callable routine includes a complete description of how each control
setting modifies its behavior.  Table~\ref{control} summarizes the entire
contents of the {\tt Control} array.
Note that ANSI C uses 0-based indexing, while MATLAB uses 1-based
indexing.  Thus, {\tt Control(1)} in MATLAB is the same as
{\tt Control[0]} or {\tt Control[UMFPACK\_PRL]} in ANSI C.

\begin{table}
\caption{UMFPACK Control parameters}
\label{control}
{\footnotesize
\begin{tabular}{llll}
\hline

MATLAB & ANSI C & default & description \\
\hline
{\tt Control(1)}  & {\tt Control[UMFPACK\_PRL]} & 1 & printing level \\
{\tt Control(2)}  & {\tt Control[UMFPACK\_DENSE\_ROW]} & 0.2 & dense row parameter \\
{\tt Control(3)}  & {\tt Control[UMFPACK\_DENSE\_COL]} & 0.2 & dense column parameter \\
{\tt Control(4)}  & {\tt Control[UMFPACK\_PIVOT\_TOLERANCE]} & 0.1 & partial pivoting tolerance \\
{\tt Control(5)}  & {\tt Control[UMFPACK\_BLOCK\_SIZE]} & 32 & BLAS block size \\
{\tt Control(6)}  & {\tt Control[UMFPACK\_STRATEGY]} & 0 (auto) & select strategy \\
{\tt Control(7)}  & {\tt Control[UMFPACK\_ALLOC\_INIT]} & 0.7 & initial memory allocation  \\
{\tt Control(8)}  & {\tt Control[UMFPACK\_IRSTEP]} & 2 & max iter. refinement steps \\
{\tt Control(13)} & {\tt Control[UMFPACK\_2BY2\_TOLERANCE]} & 0.01 & defines ``large'' entries \\
{\tt Control(14)} & {\tt Control[UMFPACK\_FIXQ]} & 0 (auto) & fix or modify Q \\
{\tt Control(15)} & {\tt Control[UMFPACK\_AMD\_DENSE]} & 10 & AMD dense row/column parameter \\
{\tt Control(16)} & {\tt Control[UMFPACK\_SYM\_PIVOT\_TOLERANCE]} & 0.001 & for diagonal entries \\
{\tt Control(17)} & {\tt Control[UMFPACK\_SCALE]} & 1 (sum) & row scaling (none, sum, or max) \\
{\tt Control(18)} & {\tt Control[UMFPACK\_FRONT\_ALLOC\_INIT]} & 0.5 & frontal matrix allocation ratio \\
{\tt Control(19)} & {\tt Control[UMFPACK\_DROPTOL]} & 0 & drop tolerance \\
{\tt Control(20)} & {\tt Control[UMFPACK\_AGGRESSIVE]} & 1 (yes) & aggressive absorption \\
 & & & in AMD and COLAMD \\
%
\hline
\multicolumn{4}{l}{Can only be changed at compile time:} \\
{\tt Control(9)}  & {\tt Control[UMFPACK\_COMPILED\_WITH\_BLAS]} & - & true if BLAS is used \\
{\tt Control(10)} & {\tt Control[UMFPACK\_COMPILED\_FOR\_MATLAB]} & - & true for mexFunction \\
{\tt Control(11)} & {\tt Control[UMFPACK\_COMPILED\_WITH\_GETRUSAGE]} & - & 1 if {\tt getrusage} used \\
{\tt Control(12)} & {\tt Control[UMFPACK\_COMPILED\_IN\_DEBUG\_MODE]} & - & true if debug mode enabled \\
\hline
\end{tabular}
}
\end{table}

Let $\alpha_r = ${\tt Control [UMFPACK\_DENSE\_ROW]},
    $\alpha_c = ${\tt Control [UMFPACK\_DENSE\_COL]}, and
    $\alpha = ${\tt Control [UMFPACK\_AMD\_DENSE]}.
Suppose the submatrix $\m{S}$, obtained after eliminating pivots with
zero Markowitz cost, is $m$-by-$n$.
Then a row is considered ``dense'' if it has more than
$\max (16, 16 \alpha_r \sqrt{n})$ entries.
A column is considered ``dense'' if it has more than
$\max (16, 16 \alpha_c \sqrt{m})$ entries.
These rows and columns are treated different in COLAMD and during numerical
factorization.   In COLAMD, dense columns are placed last in their natural
order, and dense rows are ignored.  During numerical factorization, dense
rows are stored differently.
In AMD, a row/column of the square matrix $\m{S}+\m{S}\tr$ is
considered ``dense'' if it has more than $\max (16, \alpha \sqrt{n})$ entries.
These rows/columns are placed last in AMD's output ordering.
For more details on the control parameters, refer to the documentation of
{\tt umfpack\_*\_qsymbolic}, {\tt umfpack\_*\_numeric}, {\tt umfpack\_*\_solve},
and the {\tt umfpack\_*\_report\_*} routines,
in Sections~\ref{Primary}~through~\ref{Report}, below.

%-------------------------------------------------------------------------------
\subsection{Error codes}
\label{error_codes}
%-------------------------------------------------------------------------------

Many of the routines return a {\tt status} value.
This is also returned as the first entry in the {\tt Info} array, for
those routines with that argument.  The following list summarizes
all of the error codes in UMFPACK.  Each error code is given a
specific name in the {\tt umfpack.h} include file, so you can use
those constants instead of hard-coded values in your program.
Future versions may report additional error codes.

A value of zero means everything was successful, and the matrix is
non-singular.  A value greater than zero means the routine was successful,
but a warning occurred.
A negative value means the routine was not successful.
In this case, no {\tt Symbolic} or {\tt Numeric} object was created.

\begin{itemize}
\item {\tt UMFPACK\_OK},  (0):  UMFPACK was successful.

\item {\tt UMFPACK\_WARNING\_singular\_matrix},  (1):  Matrix is singular.
    There are exact zeros on the diagonal of $\m{U}$.

\item {\tt UMFPACK\_WARNING\_determinant\_underflow}, (2):
    The determinant is nonzero, but smaller in magnitude than
    the smallest positive floating-point number.

\item {\tt UMFPACK\_WARNING\_determinant\_overflow}, (3):
    The determinant is larger in magnitude than
    the largest positive floating-point number (IEEE Inf).

\item {\tt UMFPACK\_ERROR\_out\_of\_memory},  (-1):  Not enough memory.
    The ANSI C {\tt malloc} or {\tt realloc} routine failed.

\item {\tt UMFPACK\_ERROR\_invalid\_Numeric\_object},  (-3):
    Routines that take a {\tt Numeric} object as input (or load it
    from a file) check this object and return this error code if it is
    invalid.  This can be caused by a memory leak or overrun in your
    program, which can overwrite part of the Numeric object.  It can also
    be caused by passing a Symbolic object by mistake, or some other pointer.
    If you try to factorize a matrix using one version of UMFPACK and
    then use the factors in another version, this error code will trigger as
    well.  You cannot factor your matrix using
    version 4.0 and then solve with version 4.1, for example.\footnote{
    Exception: v4.3, v4.3.1, and v4.4 use identical data structures
    for the {\tt Numeric} and {\tt Symbolic} objects}.
    You cannot use different precisions of the same version
    (real and complex, for example).
    It is possible for the {\tt Numeric} object to be corrupted by your
    program in subtle ways that are not detectable by this quick check.
    In this case, you may see an
    {\tt UMFPACK\_ERROR\_different\_pattern} error code, or even an
    {\tt UMFPACK\_ERROR\_internal\_error}.

\item {\tt UMFPACK\_ERROR\_invalid\_Symbolic\_object},  (-4):
    Routines that take a {\tt Symbolic} object as input (or load it
    from a file) check this object and return this error code if it is
    invalid.  The causes of this error are analogous to the
    {\tt UMFPACK\_ERROR\_invalid\_Numeric\_object} error described above.

\item {\tt UMFPACK\_ERROR\_argument\_missing},  (-5):
    Some arguments of some are optional (you can pass a {\tt NULL} pointer
    instead of an array).  This error code occurs if you pass a {\tt NULL}
    pointer when that argument is required to be present.

\item {\tt UMFPACK\_ERROR\_n\_nonpositive}  (-6):
    The number of rows or columns of the matrix must be greater than zero.

\item {\tt UMFPACK\_ERROR\_invalid\_matrix}  (-8):
    The matrix is invalid.  For the column-oriented input, this error
    code will occur if the contents of {\tt Ap} and/or {\tt Ai} are invalid.

    {\tt Ap} is an integer array of size {\tt n\_col+1}.
    On input, it holds the
    ``pointers'' for the column form of the sparse matrix $\m{A}$.
    Column {\tt j} of
    the matrix A is held in {\tt Ai [(Ap [j])} \ldots {\tt (Ap [j+1]-1)]}.
    The first entry, {\tt Ap [0]}, must be zero,
    and {\tt Ap [j]} $\le$ {\tt Ap [j+1]} must hold for all
    {\tt j} in the range 0 to {\tt n\_col-1}.
    The value {\tt nz = Ap [n\_col]} is thus the
    total number of entries in the pattern of the matrix A.
    {\tt nz} must be greater than or equal to zero.

    The nonzero pattern (row indices) for column {\tt j} is stored in
    {\tt Ai [(Ap [j])} \ldots {\tt (Ap [j+1]-1)]}.  The row indices in a given
    column {\tt j}
    must be in ascending order, and no duplicate row indices may be present.
    Row indices must be in the range 0 to {\tt n\_row-1}
    (the matrix is 0-based).

    Some routines take a triplet-form input, with arguments
    {\tt nz}, {\tt Ti}, and {\tt Tj}.  This error code is returned
    if {\tt nz} is less than zero,
    if any row    index in {\tt Ti} is outside the range 0 to {\tt n\_col-1}, or
    if any column index in {\tt Tj} is outside the range 0 to {\tt n\_row-1}.

\item {\tt UMFPACK\_ERROR\_different\_pattern},  (-11):
    The most common cause of this error is that the pattern of the
    matrix has changed between the symbolic and numeric factorization.
    It can also occur if the {\tt Numeric} or {\tt Symbolic} object has
    been subtly corrupted by your program.

\item {\tt UMFPACK\_ERROR\_invalid\_system},  (-13):
    The {\tt sys} argument provided to one of the solve routines is invalid.

\item {\tt UMFPACK\_ERROR\_invalid\_permutation},  (-15):
    The permutation vector provided as input is invalid.

\item {\tt UMFPACK\_ERROR\_file\_IO},  (-17):
    This error code is returned by the routines that save and load
    the {\tt Numeric} or {\tt Symbolic} objects to/from a file, if a
    file I/O error has occurred.  The file may not exist or may not be readable,
    you may be trying to create a file that you don't have permission to create,
    or you may be out of disk space.  The file you are trying to read might
    be the wrong one, and an earlier end-of-file condition would then result
    in this error.

\item {\tt UMFPACK\_ERROR\_internal\_error},  (-911):
    An internal error has occurred, of unknown cause.  This is either a bug
    in UMFPACK, or the result of a memory overrun from your program.
    Try modifying the file {\tt AMD/Source/amd\_internal.h} and adding
    the statement {\tt \#undef NDEBUG}, to enable the debugging mode.
    Recompile UMFPACK and rerun your program.
    A failed assertion might occur which
    can give you a better indication as to what is going wrong.  Be aware that
    UMFPACK will be extraordinarily slow when running in debug mode.
    If all else fails, contact the developer (davis@cise.ufl.edu) with
    as many details as possible.

\end{itemize}

%-------------------------------------------------------------------------------
\subsection{Larger examples}
%-------------------------------------------------------------------------------

Full examples of all user-callable UMFPACK routines
are available in four stand-alone C main programs, {\tt umfpack\_*\_demo.c}.
Another example is
the UMFPACK mexFunction, {\tt umfpackmex.c}.  The mexFunction accesses only the
user-callable C interface to UMFPACK.  The only features that it does not use
are the support for the triplet form (MATLAB's sparse arrays are already in the
compressed column form) and the ability to reuse the {\tt Symbolic} object to
numerically factorize a matrix whose pattern is the same as a prior matrix
analyzed by {\tt umfpack\_*\_symbolic} or {\tt umfpack\_*\_qsymbolic}.  The
latter is an important feature, but the mexFunction does not return its opaque
{\tt Symbolic} and {\tt Numeric} objects to MATLAB.  Instead, it gets the
contents of these objects after extracting them via the {\tt umfpack\_*\_get\_*}
routines, and returns them as MATLAB sparse matrices.

The {\tt umf4.c} program for reading matrices in Harwell/Boeing format
\cite{DuffGrimesLewis87b} is provided.  It requires three Fortran 77 programs
({\tt readhb.f}, {\tt readhb\_nozeros.f}, and {\tt readhb\_size.f})
for reading in the sample Harwell/Boeing files in the {\tt UMFPACK/Demo/HB}
directory.  More matrices are available at
http://www.cise.ufl.edu/research/sparse/matrices.
Type {\tt make hb} in the {\tt UMFPACK/Demo/HB} directory
to compile and run this demo.  This program was used for the experimental
results in \cite{Davis03}.

%-------------------------------------------------------------------------------
\section{Synopsis of C-callable routines}
\label{Synopsis}
%-------------------------------------------------------------------------------

Each subsection, below, summarizes the input variables, output variables, return
values, and calling sequences of the routines in one category.  Variables with
the same name as those already listed in a prior category have the same size
and type.

The real, {\tt long} integer {\tt umfpack\_dl\_*} routines are
identical to the real, {\tt int} routines, except that {\tt \_di\_} is replaced
with {\tt \_dl\_} in the name, and all {\tt int} arguments become {\tt long}.
Similarly, the complex, {\tt long} integer {\tt umfpack\_zl\_*} routines are
identical to the complex, {\tt int} routines, except that {\tt \_zi\_} is
replaced
with {\tt \_zl\_} in the name, and all {\tt int} arguments become {\tt long}.
Only the real and complex {\tt int} versions are listed in the synopsis below.

The matrix $\m{A}$ is {\tt m}-by-{\tt n} with {\tt nz} entries.

The {\tt sys} argument of {\tt umfpack\_*\_solve}
is an integer in the range 0 to 14 which defines which linear system is
to be solved.
\footnote{Integer values for {\tt sys} are used instead of strings (as in LINPACK
and LAPACK) to avoid C-to-Fortran portability issues.}
Valid values are listed in Table~\ref{sys}.
The notation $\m{A}\he$ refers to the matrix transpose, which is the
complex conjugate transpose for complex matrices ({\tt A'} in MATLAB).
The array transpose is $\m{A}\tr$, which is {\tt A.'} in MATLAB.

\begin{table}
\begin{center}
\caption{UMFPACK {\tt sys} parameter}
\label{sys}
{\footnotesize
\begin{tabular}{ll|l}
\hline
Value & & system \\
\hline
& & \\
{\tt UMFPACK\_A}      &  (0) & $\m{Ax}=\m{b}$ \\
{\tt UMFPACK\_At}     &  (1) & $\m{A}\he\m{x}=\m{b}$ \\
{\tt UMFPACK\_Aat}    &  (2) & $\m{A}\tr\m{x}=\m{b}$ \\
& & \\
\hline
& & \\
{\tt UMFPACK\_Pt\_L}  &  (3) & $\m{P}\tr\m{Lx}=\m{b}$ \\
{\tt UMFPACK\_L}      &  (4) & $\m{Lx}=\m{b}$ \\
{\tt UMFPACK\_Lt\_P}  &  (5) & $\m{L}\he\m{Px}=\m{b}$ \\
{\tt UMFPACK\_Lat\_P} &  (6) & $\m{L}\tr\m{Px}=\m{b}$ \\
{\tt UMFPACK\_Lt}     &  (7) & $\m{L}\he\m{x}=\m{b}$ \\
{\tt UMFPACK\_Lat}    &  (8) & $\m{L}\tr\m{x}=\m{b}$ \\
& & \\
\hline
& & \\
{\tt UMFPACK\_U\_Qt}  &  (9) & $\m{UQ}\tr\m{x}=\m{b}$ \\
{\tt UMFPACK\_U}      & (10) & $\m{Ux}=\m{b}$ \\
{\tt UMFPACK\_Q\_Ut}  & (11) & $\m{QU}\he\m{x}=\m{b}$ \\
{\tt UMFPACK\_Q\_Uat} & (12) & $\m{QU}\tr\m{x}=\m{b}$ \\
{\tt UMFPACK\_Ut}     & (13) & $\m{U}\he\m{x}=\m{b}$ \\
{\tt UMFPACK\_Uat}    & (14) & $\m{U}\tr\m{x}=\m{b}$ \\
& & \\
\hline
\end{tabular}
}
\end{center}
\end{table}

%-------------------------------------------------------------------------------
\subsection{Primary routines: real/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
#include "umfpack.h"
int status, sys, n, m, nz, Ap [n+1], Ai [nz] ;
double Control [UMFPACK_CONTROL], Info [UMFPACK_INFO], Ax [nz], X [n], B [n] ;
void *Symbolic, *Numeric ;

status = umfpack_di_symbolic (m, n, Ap, Ai, Ax, &Symbolic, Control, Info) ;
status = umfpack_di_numeric (Ap, Ai, Ax, Symbolic, &Numeric, Control, Info) ;
status = umfpack_di_solve (sys, Ap, Ai, Ax, X, B, Numeric, Control, Info) ;
umfpack_di_free_symbolic (&Symbolic) ;
umfpack_di_free_numeric (&Numeric) ;
\end{verbatim}
}

%-------------------------------------------------------------------------------
\subsection{Alternative routines: real/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
int Qinit [n], Wi [n] ;
double W [5*n] ;

umfpack_di_defaults (Control) ;
status = umfpack_di_qsymbolic (m, n, Ap, Ai, Ax, Qinit, &Symbolic, Control, Info) ;
status = umfpack_di_wsolve (sys, Ap, Ai, Ax, X, B, Numeric, Control, Info, Wi, W) ;
\end{verbatim}
}

%-------------------------------------------------------------------------------
\subsection{Matrix manipulation routines: real/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
int Ti [nz], Tj [nz], P [m], Q [n], Rp [m+1], Ri [nz], Map [nz] ;
double Tx [nz], Rx [nz], Y [m], Z [m] ;

status = umfpack_di_col_to_triplet (n, Ap, Tj) ;
status = umfpack_di_triplet_to_col (m, n, nz, Ti, Tj, Tx, Ap, Ai, Ax, Map) ;
status = umfpack_di_transpose (m, n, Ap, Ai, Ax, P, Q, Rp, Ri, Rx) ;
status = umfpack_di_scale (Y, Z, Numeric) ;
\end{verbatim}
}

%-------------------------------------------------------------------------------
\subsection{Getting the contents of opaque objects: real/{\tt int}}
%-------------------------------------------------------------------------------

The {\tt filename} string should be large enough to hold the name of a file.

{\footnotesize
\begin{verbatim}
int lnz, unz, Lp [m+1], Lj [lnz], Up [n+1], Ui [unz], do_recip ;
double Lx [lnz], Ux [unz], D [min (m,n)], Rs [m], Mx [1], Ex [1] ;
int nfr, nchains, P1 [m], Q1 [n], Front_npivcol [n+1], Front_parent [n+1], Front_1strow [n+1],
    Front_leftmostdesc [n+1], Chain_start [n+1], Chain_maxrows [n+1], Chain_maxcols [n+1] ;
char filename [100] ;

status = umfpack_di_get_lunz (&lnz, &unz, &m, &n, &nz_udiag, Numeric) ;
status = umfpack_di_get_numeric (Lp, Lj, Lx, Up, Ui, Ux, P, Q, D,
    &do_recip, Rs, Numeric) ;
status = umfpack_di_get_symbolic (&m, &n, &n1, &nz, &nfr, &nchains, P1, Q1,
    Front_npivcol, Front_parent, Front_1strow, Front_leftmostdesc,
    Chain_start, Chain_maxrows, Chain_maxcols, Symbolic) ;
status = umfpack_di_load_numeric (&Numeric, filename) ;
status = umfpack_di_save_numeric (Numeric, filename) ;
status = umfpack_di_load_symbolic (&Symbolic, filename) ;
status = umfpack_di_save_symbolic (Symbolic, filename) ;
status = umfapck_di_get_determinant (Mx, Ex, Numeric, Info) ;
\end{verbatim}
}

%-------------------------------------------------------------------------------
\subsection{Reporting routines: real/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}

umfpack_di_report_status (Control, status) ;
umfpack_di_report_control (Control) ;
umfpack_di_report_info (Control, Info) ;
status = umfpack_di_report_matrix (m, n, Ap, Ai, Ax, 1, Control) ;
status = umfpack_di_report_matrix (m, n, Rp, Ri, Rx, 0, Control) ;
status = umfpack_di_report_numeric (Numeric, Control) ;
status = umfpack_di_report_perm (m, P, Control) ;
status = umfpack_di_report_perm (n, Q, Control) ;
status = umfpack_di_report_symbolic (Symbolic, Control) ;
status = umfpack_di_report_triplet (m, n, nz, Ti, Tj, Tx, Control) ;
status = umfpack_di_report_vector (n, X, Control) ;
\end{verbatim}
}


%-------------------------------------------------------------------------------
\subsection{Primary routines: complex/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
double Az [nz], Xx [n], Xz [n], Bx [n], Bz [n] ;

status = umfpack_zi_symbolic (m, n, Ap, Ai, Ax, Az, &Symbolic, Control, Info) ;
status = umfpack_zi_numeric (Ap, Ai, Ax, Az, Symbolic, &Numeric, Control, Info) ;
status = umfpack_zi_solve (sys, Ap, Ai, Ax, Az, Xx, Xz, Bx, Bz, Numeric, Control, Info) ;
umfpack_zi_free_symbolic (&Symbolic) ;
umfpack_zi_free_numeric (&Numeric) ;
\end{verbatim}
}

The arrays {\tt Ax}, {\tt Bx}, and {\tt Xx} double in size if
any imaginary argument ({\tt Az}, {\tt Xz}, or {\tt Bz}) is {\tt NULL}.

%-------------------------------------------------------------------------------
\subsection{Alternative routines: complex/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
double Wz [10*n] ;

umfpack_zi_defaults (Control) ;
status = umfpack_zi_qsymbolic (m, n, Ap, Ai, Ax, Az, Qinit, &Symbolic, Control, Info) ;
status = umfpack_zi_wsolve (sys, Ap, Ai, Ax, Az, Xx, Xz, Bx, Bz, Numeric, Control, Info, Wi, Wz) ;
\end{verbatim}
}

%-------------------------------------------------------------------------------
\subsection{Matrix manipulation routines: complex/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
double Tz [nz], Rz [nz], Yx [m], Yz [m], Zx [m], Zz [m] ;

status = umfpack_zi_col_to_triplet (n, Ap, Tj) ;
status = umfpack_zi_triplet_to_col (m, n, nz, Ti, Tj, Tx, Tz, Ap, Ai, Ax, Az, Map) ;
status = umfpack_zi_transpose (m, n, Ap, Ai, Ax, Az, P, Q, Rp, Ri, Rx, Rz, 1) ;
status = umfpack_zi_transpose (m, n, Ap, Ai, Ax, Az, P, Q, Rp, Ri, Rx, Rz, 0) ;
status = umfpack_zi_scale (Yx, Yz, Zx, Zz, Numeric) ;
\end{verbatim}
}

The arrays {\tt Tx}, {\tt Rx}, {\tt Yx}, and {\tt Zx} double in size if
any imaginary argument ({\tt Tz}, {\tt Rz}, {\tt Yz}, or {\tt Zz}) is {\tt NULL}.

%-------------------------------------------------------------------------------
\subsection{Getting the contents of opaque objects: complex/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}
double Lz [lnz], Uz [unz], Dx [min (m,n)], Dz [min (m,n)], Mz [1] ;

status = umfpack_zi_get_lunz (&lnz, &unz, &m, &n, &nz_udiag, Numeric) ;
status = umfpack_zi_get_numeric (Lp, Lj, Lx, Lz, Up, Ui, Ux, Uz, P, Q, Dx, Dz,
    &do_recip, Rs, Numeric) ;
status = umfpack_zi_get_symbolic (&m, &n, &n1, &nz, &nfr, &nchains, P1, Q1,
    Front_npivcol, Front_parent, Front_1strow, Front_leftmostdesc,
    Chain_start, Chain_maxrows, Chain_maxcols, Symbolic) ;
status = umfpack_zi_load_numeric (&Numeric, filename) ;
status = umfpack_zi_save_numeric (Numeric, filename) ;
status = umfpack_zi_load_symbolic (&Symbolic, filename) ;
status = umfpack_zi_save_symbolic (Symbolic, filename) ;
status = umfapck_zi_get_determinant (Mx, Mz, Ex, Numeric, Info) ;
\end{verbatim}
}

The arrays {\tt Lx}, {\tt Ux}, {\tt Dx}, and {\tt Mx} double in size if
any imaginary argument ({\tt Lz}, {\tt Uz}, {\tt Dz}, or {\tt Mz}) is {\tt NULL}.

%-------------------------------------------------------------------------------
\subsection{Reporting routines: complex/{\tt int}}
%-------------------------------------------------------------------------------

{\footnotesize
\begin{verbatim}

umfpack_zi_report_status (Control, status) ;
umfpack_zi_report_control (Control) ;
umfpack_zi_report_info (Control, Info) ;
status = umfpack_zi_report_matrix (m, n, Ap, Ai, Ax, Az, 1, Control) ;
status = umfpack_zi_report_matrix (m, n, Rp, Ri, Rx, Rz, 0, Control) ;
status = umfpack_zi_report_numeric (Numeric, Control) ;
status = umfpack_zi_report_perm (m, P, Control) ;
status = umfpack_zi_report_perm (n, Q, Control) ;
status = umfpack_zi_report_symbolic (Symbolic, Control) ;
status = umfpack_zi_report_triplet (m, n, nz, Ti, Tj, Tx, Tz, Control) ;
status = umfpack_zi_report_vector (n, Xx, Xz, Control) ;
\end{verbatim}
}

The arrays {\tt Ax}, {\tt Rx}, {\tt Tx}, and {\tt Xx} double in size if
any imaginary argument ({\tt Az}, {\tt Rz}, {\tt Tz}, or {\tt Xz}) is {\tt NULL}.


%-------------------------------------------------------------------------------
\subsection{Utility routines}
%-------------------------------------------------------------------------------

These routines are the same in all four versions of UMFPACK.

{\footnotesize
\begin{verbatim}
double t, s [2] ;

t = umfpack_timer ( ) ;
umfpack_tic (s) ;
umfpack_toc (s) ;

\end{verbatim}
}

%-------------------------------------------------------------------------------
\subsection{AMD ordering routines}
%-------------------------------------------------------------------------------

UMFPACK makes use of the AMD ordering package for its symmetric ordering
strategy.  You may also use these four user-callable routines in your own C
programs.  You need to include the {\tt amd.h} file only if you make direct
calls to the AMD routines themselves.  The {\tt int} versions are summarized
below; {\tt long} versions are also available.  Refer to the AMD User Guide
for more information, or to the file {\tt amd.h} which documents these routines.

{\footnotesize
\begin{verbatim}
#include "amd.h"
double amd_control [AMD_CONTROL], amd_info [AMD_INFO] ;

amd_defaults (amd_control) ;
status = amd_order (n, Ap, Ai, P, amd_control, amd_info) ;
amd_control (amd_control) ;
amd_info (amd_info) ;

\end{verbatim}
}

%-------------------------------------------------------------------------------
\section{Using UMFPACK in a Fortran program}
%-------------------------------------------------------------------------------

UMFPACK includes a basic Fortran 77 interface to some of the C-callable
UMFPACK routines.
Since interfacing C and Fortran programs is not portable, this interface might
not work with all C and Fortran compilers.  Refer to Section~\ref{Install} for
more details.  The following Fortran routines are provided.
The list includes the C-callable routines that the Fortran interface
routine calls.  Refer to the corresponding C routines in Section~\ref{C} for
more details on what the Fortran routine does.

\begin{itemize}
\item {\tt umf4def}: sets the default control parameters
    ({\tt umfpack\_di\_defaults}).

\item {\tt umf4sym}: pre-ordering and symbolic factorization
    ({\tt umfpack\_di\_symbolic}).

\item {\tt umf4num}: numeric factorization
    ({\tt umfpack\_di\_numeric}).

\item {\tt umf4solr}: solve a linear system with iterative refinement
    ({\tt umfpack\_di\_solve}).

\item {\tt umf4sol}: solve a linear system without iterative refinement
    ({\tt umfpack\_di\_solve}).  Sets {\tt Control [UMFPACK\_IRSTEP]}
    to zero, and does not require the matrix $\m{A}$.

\item {\tt umf4scal}: scales a vector using UMFPACK's scale factors
    ({\tt umfpack\_di\_scale}).

\item {\tt umf4fnum}: free the {\tt Numeric} object
    ({\tt umfpack\_di\_free\_numeric}).

\item {\tt umf4fsym}: free the {\tt Symbolic} object
    ({\tt umfpack\_di\_free\_symbolic}).

\item {\tt umf4pcon}: prints the control parameters
    ({\tt umfpack\_di\_report\_control}).

\item {\tt umf4pinf}: print statistics
    ({\tt umfpack\_di\_report\_info}).

\item {\tt umf4snum}: save the {\tt Numeric} object to a file
    ({\tt umfpack\_di\_save\_numeric}).

\item {\tt umf4ssym}: save the {\tt Symbolic} object to a file
    ({\tt umfpack\_di\_save\_symbolic}).

\item {\tt umf4lnum}: load the {\tt Numeric} object from a file
    ({\tt umfpack\_di\_load\_numeric}).

\item {\tt umf4lsym}: load the {\tt Symbolic} object from a file
    ({\tt umfpack\_di\_load\_symbolic}).
\end{itemize}

The matrix $\m{A}$ is passed to UMFPACK in compressed column form, with 0-based
indices.  In Fortran, for an {\tt m}-by-{\tt n} matrix $\m{A}$ with {\tt nz}
entries, the row indices of the first column (column 1) are in
{\tt Ai (Ap(1)+1} \ldots {\tt Ap(2))}, with values in
{\tt Ax (Ap(1)+1} \ldots {\tt Ap(2))}.  The last column (column {\tt n}) is in
{\tt Ai (Ap(n)+1} \ldots {\tt Ap(n+1))} and
{\tt Ax (Ap(n)+1} \ldots {\tt Ap(n+1))}.
The number of entries in the matrix is thus {\tt nz = Ap (n+1)}.
The row indices in {\tt Ai} are in the range 0 to {\tt m}-1.  They must be
sorted, with no duplicate entries allowed.  None of the UMFPACK routines
modify the input matrix $\m{A}$.
The following definitions apply for the Fortran routines:

{\footnotesize
\begin{verbatim}
    integer m, n, Ap (n+1), Ai (nz), symbolic, numeric, filenum, status
    double precision Ax (nz), control (20), info (90), x (n), b (n)
\end{verbatim}
}

UMFPACK's status is returned in either a {\tt status} argument, or in
{\tt info (1)}.
It is zero if UMFPACK was successful, 1 if the matrix is singular (this is a
warning, not an error), and negative if an error occurred.
Section~\ref{error_codes} summarizes the possible values of {\tt status}
and {\tt info (1)}.
See Table~\ref{sys} for a list of the values of the {\tt sys} argument.
See Table~\ref{control} for a list of the control parameters (the
Fortran usage is the same as the MATLAB usage for this array).

For the {\tt Numeric} and {\tt Symbolic} handles, it is probably safe to
assume that a Fortran {\tt integer} is sufficient to store a C pointer.  If
that does not work, try defining {\tt numeric} and {\tt symbolic} in your
Fortran program as integer arrays of size 2.  You will need to define them
as {\tt integer*8} if you compile UMFPACK in the 64-bit mode.

To avoid passing strings between C and Fortran in the load/save routines,
a file number is passed instead, and the C interface constructs a file name
(if {\tt filenum} is 42, the {\tt Numeric} file name is {\tt n42.umf}, and
the {\tt Symbolic} file name is {\tt s42.umf}).

The following is a summary of the calling sequence of each Fortran
interface routine.  An example of their use is in the {\tt Demo/umf4hb.f}
file.  That routine also includes an example of how to convert a 1-based
sparse matrix into 0-based form.  For more details on the arguments of each
routine, refer to the arguments of the same name in the corresponding
C-callable routine, in Sections~\ref{Primary}~through~\ref{Utility}.
The only exception is the {\tt control} argument of {\tt umf4sol},
which sets {\tt control (8)} to zero to disable iterative refinement.
Note that the solve routines do not overwrite {\tt b} with the solution,
but return their solution in a different array, {\tt x}.

{\footnotesize
\begin{verbatim}
    call umf4def (control)
    call umf4sym (m, n, Ap, Ai, Ax, symbolic, control, info)
    call umf4num (Ap, Ai, Ax, symbolic, numeric, control, info)
    call umf4solr (sys, Ap, Ai, Ax, x, b, numeric, control, info)
    call umf4sol (sys, x, b, numeric, control, info)
    call umf4scal (x, b, numeric, status)
    call umf4fnum (numeric)
    call umf4fsym (symbolic)
    call umf4pcon (control)
    call umf4pinf (control)
    call umf4snum (numeric, filenum, status)
    call umf4ssym (symbolic, filenum, status)
    call umf4lnum (numeric, filenum, status)
    call umf4lsym (symbolic, filenum, status)
\end{verbatim}
}

Access to the complex routines in UMFPACK is provided by the interface
routines in {\tt umf4\_f77zwrapper.c}.  The following is a synopsis
of each routine.  All the arguments are the same as the real versions,
except {\tt Az}, {\tt xz}, and {\tt bz} are the imaginary parts of
the matrix, solution, and right-hand-side, respectively.  The
{\tt Ax}, {\tt x}, and {\tt b} are the real parts.

{\footnotesize
\begin{verbatim}
    call umf4zdef (control)
    call umf4zsym (m, n, Ap, Ai, Ax, Az, symbolic, control, info)
    call umf4znum (Ap, Ai, Ax, Az, symbolic, numeric, control, info)
    call umf4zsolr (sys, Ap, Ai, Ax, Az, x, xz, b, bz, numeric, control, info)
    call umf4zsol (sys, x, xz, b, bz, numeric, control, info)
    call umf4zscal (x, xz, b, bz, numeric, status)
    call umf4zfnum (numeric)
    call umf4zfsym (symbolic)
    call umf4zpcon (control)
    call umf4zpinf (control)
    call umf4zsnum (numeric, filenum, status)
    call umf4zssym (symbolic, filenum, status)
    call umf4zlnum (numeric, filenum, status)
    call umf4zlsym (symbolic, filenum, status)
\end{verbatim}
}

The Fortran interface does not support the packed complex case.

%-------------------------------------------------------------------------------
\section{Installation}
\label{Install}
%-------------------------------------------------------------------------------

%-------------------------------------------------------------------------------
\subsection{Installing the C library}
%-------------------------------------------------------------------------------

The following discussion assumes you have the {\tt make} program, either in
Unix, or in Windows with Cygwin\footnote{www.cygwin.com}.
You can skip this section and go to next one if all you want to use is
the UMFPACK and AMD mexFunctions in MATLAB.

You will need to install both UMFPACK v4.4 and AMD v1.1 (or AMD v1.0) to use UMFPACK.
The {\tt UMFPACK} and {\tt AMD} subdirectories must be placed side-by-side
within the same directory.  AMD is a stand-alone package that
is required by UMFPACK.  UMFPACK can be compiled without the
BLAS \cite{DaydeDuff99,ACM679a,ATLAS,GotoVandeGeijn02},
but your performance will be much less than what it should be.

System-dependent configurations are in the {\tt AMD/Make}
and {\tt UMFPACK/Make} directories (the \newline
{\tt UMFPACK/Make} directory is actually
just a symbolic link to {\tt AMD/Make}\footnote{Windows might not extract
the symbolic link {\tt UMFPACK/Make} correctly.  If it doesn't, simply
create the {\tt UMFPACK/Make} folder by copying it from {\tt AMD/Make}.}).
You can edit the {\tt Make.include}
files in either of those directories to customize the compilation.  The default
settings will work on most systems, except that UMFPACK will be compiled so
that it does not use the BLAS.  Sample configuration files are provided
for Linux, Sun Solaris, SGI IRIX, IBM AIX, and the DEC/Compaq Alpha.

To compile and install both packages,
go to the {\tt UMFPACK} directory and type {\tt make}.  This will compile the
libraries ({\tt AMD/Lib/libamd.a} and {\tt UMFPACK/Lib/libumfpack.a}).
A demo of the AMD ordering routine will be compiled and tested in
the {\tt AMD/Demo} directory, and five demo programs will then be
compiled and tested in the {\tt UMFPACK/Demo} directory.
The outputs of these demo programs will then be compared with output
files in the distribution.  Expect to see a few differences, such as
residual norms, compile-time control settings, and perhaps memory usage
differences.  The AMD and UMFPACK mexFunctions for
use in MATLAB will also be compiled.  If you do not have MATLAB 6.0 or
later, type {\tt make lib} instead.

If you have the GNU version of {\tt make}, the {\tt Source/GNUmakefile} and
{\tt MATLAB/GNUmakefile} files are used.  These are much more concise than
what the ``old'' version of {\tt make} can handle.  If you do not have
GNU {\tt make}, the {\tt Source/Makefile} and {\tt MATLAB/Makefile} files
are used instead.  Each UMFPACK source file is compiled into four
versions ({\tt double} / complex, and {\tt int} / {\tt long}).  A proper
old-style {\tt Makefile} is cumbersome in this case, so these two
{\tt Makefile}'s have been constructed by brute force.  They ignore
dependencies, and simply compile everything.  I highly recommend using GNU
{\tt make} if you wish to modify UMFPACK.

If you compile UMFPACK and AMD and then later change the {\tt Make.include}
file or your system-specific configuration file such as {\tt Make.linux},
then you should type {\tt make purge} and then {\tt make} to recompile.

Here are the various parameters that you can control in your
{\tt Make.include} file:

\begin{itemize}
\item {\tt CC = } your C compiler, such as {\tt cc}.
\item {\tt RANLIB = } your system's {\tt ranlib} program, if needed.
\item {\tt CFLAGS = } optimization flags, such as {\tt -O}.
    Add {\tt -DLP64} if you are compiling in 64-bit mode
    (32 bit {\tt int}'s, 64 bit {\tt long}'s, and 64 bit pointers).
\item {\tt CONFIG = } configuration settings for the BLAS, memory allocation
    routines, and timing routines.
\item {\tt LIB = } your libraries, such as {\tt -lm} or {\tt -lblas}.
\item {\tt RM =} the command to delete a file.
\item {\tt MV =} the command to rename a file.
\item {\tt MEX =} the command to compile a MATLAB mexFunction.
    If you are using MATLAB 5, you need to add {\tt -DNBLAS} and
    {\tt -DNUTIL} to this command.  An example is provided in
    the {\tt Make/Make.include} file.
\item {\tt F77 =} the command to compile a Fortran program (optional).
\item {\tt F77FLAGS =} the Fortran compiler flags (optional).
\item {\tt F77LIB =} the Fortran libraries (optional).
\end{itemize}

The {\tt CONFIG} string can include combinations of the following;
most deal with how the BLAS are called:
\begin{itemize}
\item {\tt -DNBLAS} if you do not have any BLAS at all.
\item {\tt -DCBLAS} if you have the C-BLAS \cite{ATLAS}.
\item {\tt -DNSUNPERF} if you are on Solaris but do not have the Sun
    Performance Library (for the BLAS).
\item {\tt -DNSCSL} if you on SGI IRIX but do not have the SCSL BLAS library.
\item {\tt -DLONGBLAS} if your BLAS can take {\tt long} integer input
    arguments.  If not defined, then the {\tt umfpack\_*l\_*} versions of
    UMFPACK that use {\tt long} integers do not call the BLAS.
    This flag is set internally when using the Sun Performance BLAS
    or SGI's SCSL BLAS (both have 64-bit versions of the BLAS).
\item Options for controlling how C calls the Fortran BLAS:
    {\tt -DBLAS\_BY\_VALUE}, {\tt -DBLAS\_NO\_UNDERSCORE},
    and {\tt -DBLAS\_CHAR\_ARG}.  These are set automatically for Windows,
    Sun Solaris, SGI Irix, Red Hat Linux, Compaq Alpha, and
    AIX (the IBM RS 6000).  They are ignored if you are using
    the C-BLAS interface to the BLAS.
\item {\tt -DGETRUSAGE} if you have the {\tt getrusage} function.
\item {\tt -DNUTIL} if you wish to compile the MATLAB-callable
    UMFPACK mexFunction with the {\tt mxMalloc}, {\tt mxRealloc}
    and {\tt mxFree} routines, instead of the undocumented (but
    superior) {\tt utMalloc}, {\tt utRealloc}, and {\tt utFree}
    routines.  The default is to use the {\tt ut*} routines on
    Unix, and the {\tt mx*} routines on Windows.
\item {\tt -DNPOSIX} if you do not have the POSIX-compliant
    {\tt sysconf} and {\tt times} routines used by
    {\tt umfpack\_tic} and {\tt umfpack\_toc}.
\item {\tt -DNRECIPROCAL} controls a trade-off between speed and accuracy.
    If defined (or if the pivot value itself is less than $10^{-12}$),
    then the pivot column is divided by the pivot value during numeric
    factorization.  Otherwise, it is multiplied by the reciprocal of the
    pivot, which is faster but can be less accurate.  The default is
    to multiply by the reciprocal unless the pivot value is small.
    This option also modifies how the rows of the matrix $\m{A}$ are
    scaled.  If {\tt -DNRECIPROCAL} is defined (or if any scale factor is
    less than $10^{-12}$), entries in the rows of $\m{A}$ are divided
    by the scale factors.  Otherwise, they are multiplied by the reciprocal.
    When compiling the complex routines with the GNU {\tt gcc} compiler, the
    pivot column is always divided by the pivot entry, because of a
    numerical accuracy issue encountered with {\tt gcc} version 3.2 with a
    few complex matrices on a Pentium 4M (running Linux).  You can still
    use {\tt -DNRECIPROCAL} to control how the scale factors
    for the rows of $\m{A}$ are applied.
\item {\tt -DNO\_DIVIDE\_BY\_ZERO} controls how UMFPACK treats zeros
    on the diagonal of $\m{U}$, for a singular matrix $\m{A}$.
    If defined, then no division by
    zero is performed (a zero entry on the diagonal of $\m{U}$ is
    treated as if it were equal to one).  By default,
    UMFPACK will divide by zero.
\item {\tt -DNO\_TIMER} controls whether or not timing routines
    are to be called.  If defined, no timers are used.
    Timers are included by default.
\end{itemize}

If a Fortran BLAS package is used you may see compiler warnings.  The BLAS
routines
{\tt dgemm}, {\tt dgemv}, {\tt dger}, {\tt dtrsm}, {\tt dtrsv}, {\tt dscal}
and their corresponding complex versions are used.
Header files are not provided for the Fortran
BLAS.  You may safely ignore all of these warnings.

I highly recommend the recent BLAS by Goto and van de Geijn
\cite{GotoVandeGeijn02}.  Using this BLAS increased the performance
of UMFPACK by up to 50\% on a Dell Latitude C840 laptop (2GHz Pentium 4M,
512K L2 cache, 1GB main memory).  The peak performance of
{\tt umfpack\_di\_numeric} with Goto and van de Geijn's BLAS is 1.6 Gflops
on this computer.  In MATLAB, the peak performance of UMFPACK on
a dense matrix (stored in sparse format) is 900 Mflops, as compared to
1 Gflop for {\tt x = A}$\backslash${\tt b}
when {\tt A} is stored as a regular full matrix.

When you compile your program that uses the C-callable UMFPACK library,
you need to link your program with both libraries
({\tt UMFPACK/Lib/libumfpack.a} and {\tt AMD/Lib/libamd.a})
and you need to tell your compiler to look in the
directories {\tt UMFPACK/Include} and {\tt AMD/Include} for include
files.  See {\tt UMFPACK/Demo/Makefile} for an example.
You do not need to directly include any AMD include files in your
program, unless you directly call AMD routines.  You only need the
\begin{verbatim}
#include "umfpack.h"
\end{verbatim}
statement, as described in Section~\ref{Synopsis}.

If you would like to compile both 32-bit and 64-bit versions of the libraries,
you will need to do it in two steps.  Modify your {\tt Make/Make.<arch>}
file, and select the 32-bit option.  Type {\tt make} in the {\tt UMFPACK}
directory, which creates the {\tt UMFPACK/Lib/libumfpack.a} and
{\tt AMD/Lib/libamd.a} libraries.  Rename those two files.  Edit your
{\tt Make/Make.<arch>} and select the 64-bit option.   Type {\tt make purge},
and then {\tt make}, and you will create the 64-bit libraries.
You can use the same {\tt umfpack.h} include file for both 32-bit and
64-bit versions.  Simply link your program with the appropriate 32-bit
or 64-bit compiled version of the UMFPACK and AMD libraries.

Type {\tt make hb} in the {\tt UMFPACK/Demo/HB} directory
to compile and run a C program that reads in and factorizes
Harwell/Boeing matrices.  Note that this uses a stand-alone Fortran
program to read in the Fortran-formatted Harwell/Boeing matrices and
write them to a file which can be read by a C program.

The {\tt umf\_multicompile.c} file has been added to assist in the
compilation of UMFPACK in Microsoft Visual Studio, for Windows.

%-------------------------------------------------------------------------------
\subsection{Installing the MATLAB interface}
%-------------------------------------------------------------------------------

If all you want to do is use the UMFPACK mexFunction in MATLAB, you can skip
the use of the {\tt make} command described above.  Simply type
{\tt umfpack\_make} in MATLAB while in the {\tt UMFPACK/MATLAB} directory.
You can also type {\tt amd\_make} in the {\tt AMD/MATLAB} directory
to compile the stand-alone AMD mexFunction (this is not required to
compile the UMFPACK mexFunction).  This works on any computer with MATLAB,
including Windows.

You will be prompted to select several configuration options, including
whether or not to use the BLAS.
MATLAB 5.3 (or earlier) does not include the BLAS, so you either have to
compile UMFPACK without the BLAS (UMFPACK will be slow), or modify your
{\tt <matlab>/bin/mexopts.sh} by adding your BLAS library
to the {\tt CLIBS} string,
where {\tt <matlab>} is the directory in which MATLAB is installed.

If you are using Windows and the {\tt lcc} compiler bundled with
MATLAB 6.1, then you may need to copy the
{\tt UMFPACK}$\backslash${\tt MATLAB}$\backslash${\tt lcc\_lib}$\backslash${\tt libmwlapack.lib}
file into the
{\tt <matlab>}$\backslash${\tt extern}$\backslash${\tt lib}$\backslash${\tt win32}$\backslash${\tt lcc}$\backslash$
directory.
Next, type {\tt mex -setup}
at the MATLAB prompt, and ask MATLAB to select the {\tt lcc} compiler.
MATLAB 6.1 has built-in BLAS, but in that version of MATLAB the BLAS
cannot be accessed by a mexFunction compiled by {\tt lcc} without first copying
this file to the location listed above.
If you have MATLAB 6.5 or later, you can probably skip this step.

%-------------------------------------------------------------------------------
\subsection{Installing the Fortran interface}
%-------------------------------------------------------------------------------

Once the 32-bit C-callable UMFPACK library is compiled, you can also compile
the Fortran interface, by typing {\tt make fortran}.  This will create
the {\tt umf4hb} program, test it, and compare the output with the
file {\tt umf4hb.out} in the distribution.
If you compiled UMFPACK in 64-bit mode, you need to use {\tt make fortran64}
instead, which compiles the {\tt umf4hb64} program and compares its output
with the file {\tt umf4hb64.out}.
Refer to the comments in the {\tt Demo/umf4\_f77wrapper.c} file
for more details.

This interface is {\bf highly} non-portable, since it depends
on how C and Fortran are interfaced.
Because of this issue, the interface is included in the {\tt Demo} directory,
and not as a primary part of the UMFPACK library.  The interface routines are
not included in the compiled {\tt UMFPACK/Lib/libumfpack.a} library, but left
as stand-alone compiled files ({\tt umf4\_f77wrapper.o} and
{\tt umf4\_f77wrapper64.o} in the {\tt Demo} directory).
You may need to modify the interface routines in the file
{\tt umf4\_f77wrapper.c} if you are using compilers for which this interface
has not been tested.

%-------------------------------------------------------------------------------
\subsection{Known Issues}
%-------------------------------------------------------------------------------

The Microsoft C or C++ compilers on a Pentium badly break the IEEE 754 standard,
and do not treat NaN's properly.  According to IEEE 754, the expression
{\tt (x != x)} is supposed to be true if and only if {\tt x} is NaN.  For
non-compliant compilers in Windows that expression is always false, and another
test must be used: {\tt (x < x)} is true if and only if {\tt x}
is NaN.  For compliant compilers, {\tt (x < x)} is always false, for any
value of {\tt x} (including NaN).
To cover both cases, UMFPACK when running under Microsoft Windows
defines the following macro, which is true if and only if {\tt x} is NaN,
regardless of whether your compiler is compliant or not:

\begin{verbatim}
#define SCALAR_IS_NAN(x) (((x) != (x)) || ((x) < (x)))
\end{verbatim}

If your compiler breaks this test, then UMFPACK will fail catastrophically
if it encounters a NaN.  You will not just see NaN's in your output; UMFPACK
will probably crash with a segmentation fault.  In that case, you might try to
see if the common (but non-ANSI C) routine {\tt isnan} is available, and modify
the macro {\tt SCALAR\_IS\_NAN} in {\tt umf\_version.h} accordingly.  The
simpler (and IEEE 754-compliant) test {\tt (x != x)} is always true with Linux
on a PC, and on every Unix compiler I have tested.

Some compilers will complain about the Fortran BLAS being defined implicitly.
C prototypes for the BLAS are not used, except the C-BLAS.  Some compilers
will complain about unrecognized {\tt \#pragma}'s.  You may safely ignore
all of these warnings.

%-------------------------------------------------------------------------------
\section{Future work}
\label{Future}
%-------------------------------------------------------------------------------

Here are a few features that are not in UMFPACK Version 4.4, in no particular
order.  They may appear in a future release of UMFPACK.  If you are interested,
let me know and I could consider including them:

\begin{enumerate}

\item Future versions may have different default {\tt Control} parameters.
    Future versions may return more statistics in the {\tt Info} array, and
    they may use more entries in the {\tt Control} array.
    These two arrays will probably become larger, since there are very few
    unused entries.  If they change in size, the constants
    {\tt UMFPACK\_CONTROL} and {\tt UMFPACK\_INFO} defined in {\tt umfpack.h}
    will be changed to reflect their new size.  Your C program should use
    these constants when declaring the size of these two arrays.  Do not
    define them as {\tt Control [20]} and {\tt Info [90]}.

\item Forward/back solvers for the conventional row or column-form data
    structure for $\m{L}$ and $\m{U}$ (the output of
    {\tt umfpack\_*\_di\_get\_numeric}).  This would enable a separate
    solver that could be used to write a MATLAB mexFunction
    {\tt x = lu\_refine (A, b, L, U, P, Q, R)} that gives MATLAB access
    to the iterative refinement algorithm with sparse backward error
    analysis.  It would also be easier to handle sparse right-hand-sides
    in this data structure, and end up with good asymptotic run-time
    in this case
    (particularly for $\m{Lx}=\m{b}$; see \cite{GilbertPeierls88}).

\item Complex absolute value computations could be
    based on FDLIBM (see \newline
    http://www.netlib.org/fdlibm),
    using the {\tt hypot(x,y)} routine.

\item When using iterative refinement, the residual $\m{Ax}-\m{b}$ could be
    returned by {\tt umfpack\_solve}.

\item The solve routines could handle multiple right-hand sides, and sparse
    right-hand sides.  See {\tt umfpack\_solve} for the MATLAB version
    of this feature.

\item An option to redirect the error and diagnostic output.

\item Permutation to block-triangular-form \cite{Duff78a} for the C-callable
    interface.  There are two routines in the ACM Collected
    Algorithms (529 and 575) \cite{Duff81b,Duff78b}
    that could be translated from Fortran
    to C and included in UMFPACK.  This would result in better performance
    for matrices from circuit simulation and
    chemical process engineering.  See {\tt umfpack\_btf.m} for the MATLAB
    version of this feature.  An upcoming package (KLU) will include
    this feature.

\item The ability to use user-provided {\tt malloc}, {\tt free}, and
    {\tt realloc} memory allocation routines.  Note that UMFPACK makes very
    few calls to these routines.  You can do this at compile-time by
    modifying the definitions of {\tt ALLOCATE}, {\tt FREE}, and
    {\tt REALLOCATE} in the file {\tt umf\_internal.h}.  Be sure to document
    your changes carefully when you change UMFPACK source code.

\item The ability to use user-provided work arrays, so that {\tt malloc},
    {\tt free}, and {\tt realloc} realloc are not called.  The
    {\tt umfpack\_*\_wsolve} routine is one example.

\item A method that takes time proportional to the number of nonzeros in
    $\m{A}$ to compute the symbolic factorization \cite{GilbertNgPeyton94}.
    This would improve the performance of the symmetric and 2-by-2 strategies,
    and the unsymmetric strategy when dense rows are present.
    The current method takes
    time proportional to the number of nonzeros in the upper bound of $\m{U}$.
    The method used in UMFPACK exploits super-columns, however, so this
    bound is rarely reached.

\item Other basic sparse matrix operations, such as sparse matrix
    multiplication, could be included.

\item A more complete Fortran interface.

\item A C++ interface.

\item A parallel version using MPI.  This would require a large amount
    of effort.

\end{enumerate}


%-------------------------------------------------------------------------------
\newpage
\section{The primary UMFPACK routines}
\label{Primary}
%-------------------------------------------------------------------------------

The include files are the same for all four versions of
UMFPACK.  The generic integer type is {\tt Int}, which is an {\tt int} or
{\tt long}, depending on which version of UMFPACK you are using.

\subsection{umfpack\_*\_symbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_symbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_numeric}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_numeric.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_solve}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_solve.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_free\_symbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_free_symbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_free\_numeric}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_free_numeric.h via sed
\end{verbatim}
}

%-------------------------------------------------------------------------------
\newpage
\section{Alternative routines}
\label{Alternative}
%-------------------------------------------------------------------------------

\subsection{umfpack\_*\_defaults}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_defaults.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_qsymbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_qsymbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_wsolve}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_wsolve.h via sed
\end{verbatim}
}

%-------------------------------------------------------------------------------
\newpage
\section{Matrix manipulation routines}
\label{Manipulate}
%-------------------------------------------------------------------------------

\subsection{umfpack\_*\_col\_to\_triplet}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_col_to_triplet.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_triplet\_to\_col}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_triplet_to_col.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_transpose}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_transpose.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_scale}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_scale.h via sed
\end{verbatim}
}

%-------------------------------------------------------------------------------
\newpage
\section{Getting the contents of opaque objects}
\label{Get}
%-------------------------------------------------------------------------------

\subsection{umfpack\_*\_get\_lunz}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_get_lunz.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_get\_numeric}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_get_numeric.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_get\_symbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_get_symbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_save\_numeric}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_save_numeric.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_load\_numeric}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_load_numeric.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_save\_symbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_save_symbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_load\_symbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_load_symbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_get\_determinant}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_get_determinant.h via sed
\end{verbatim}
}

%-------------------------------------------------------------------------------
\newpage
\section{Reporting routines}
\label{Report}
%-------------------------------------------------------------------------------

\subsection{umfpack\_*\_report\_status}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_status.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_control}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_control.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_info}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_info.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_matrix}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_matrix.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_numeric}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_numeric.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_perm}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_perm.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_symbolic}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_symbolic.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_triplet}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_triplet.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_*\_report\_vector}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_report_vector.h via sed
\end{verbatim}
}

%-------------------------------------------------------------------------------
\newpage
\section{Utility routines}
\label{Utility}
%-------------------------------------------------------------------------------

\subsection{umfpack\_timer}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_timer.h via sed
\end{verbatim}
}

\newpage
\subsection{umfpack\_tic and umfpack\_toc}

{\footnotesize
\begin{verbatim}
INCLUDE umfpack_tictoc.h via sed
\end{verbatim}
}


%-------------------------------------------------------------------------------
\newpage
% References
%-------------------------------------------------------------------------------

\bibliographystyle{plain}
\bibliography{UserGuide}

\end{document}