view doc/interpreter/diagperm.txi @ 8917:d707aa3bbc36

manual improvements
author Jaroslav Hajek <highegg@gmail.com>
date Fri, 06 Mar 2009 13:42:24 +0100
parents 46fdf8714acf
children 349616d9c38e
line wrap: on
line source

@c Copyright (C) 2009 Jaroslav Hajek
@c
@c This file is part of Octave.
@c
@c Octave is free software; you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by the
@c Free Software Foundation; either version 3 of the License, or (at
@c your option) any later version.
@c 
@c Octave is distributed in the hope that it will be useful, but WITHOUT
@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
@c for more details.
@c 
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <http://www.gnu.org/licenses/>.

@node Diagonal and Permutation Matrices 
@chapter Diagonal and Permutation Matrices

@menu
* Basic Usage::          Creation and Manipulation of Diagonal and Permutation Matrices
* Matrix Algebra::   	 Linear Algebra with Diagonal and Permutation Matrices
* Function Support::     Functions That Are Aware of These Matrices
* Example Codes::        Some Examples of Usage
* Zeros Treatment::      The Differences in Treatment of Zero Elements
@end menu

@node Basic Usage
@section Creating and Manipulating Diagonal and Permutation Matrices

A diagonal matrix is defined as a matrix that has zero entries outside the main diagonal;
that is, 
@iftex
@tex
$D_{ij} = 0$ if $i \neq j$
@end tex
@end iftex
@ifnottex
@code{D(i,j) == 0} if @code{i != j}.
@end ifnottex
Most often, square diagonal matrices are considered; however, the definition can equally
be applied to nonsquare matrices, in which case we usually speak of a rectangular diagonal 
matrix.

A permutation matrix is defined as a square matrix that has a single element equal to unity
in each row and each column; all other elements are zero. That is, there exists a 
permutation (vector) 
@iftex
@tex
$p$ such that $P_{ij}=1$ if $j = p_i$ and
$P_{ij}=0$ otherwise.  
@end tex
@end iftex
@ifnottex
@code{p} such that @code{P(i,j) == 1} if @code{j == p(i)} and 
@code{P(i,j) == 0} otherwise.  
@end ifnottex

Octave provides special treatment of real and complex rectangular diagonal matrices,
as well as permutation matrices. They are stored as special objects, using efficient 
storage and algorithms, facilitating writing both readable and efficient matrix algebra
expressions in the Octave language.

@menu
* Creating Diagonal Matrices::
* Creating Permutation Matrices::
* Explicit and Implicit Conversions::
@end menu

@node Creating Diagonal Matrices
@subsection Creating Diagonal Matrices

The most common and easiest way to create a diagonal matrix is using the built-in
function @dfn{diag}. The expression @code{diag (v)}, with @var{v} a vector,
will create a square diagonal matrix with elements on the main diagonal given
by the elements of @var{v}, and size equal to the length of @var{v}.
@code{diag (v, m, n)} can be used to construct a rectangular diagonal matrix.
The result of these expressions will be a special diagonal matrix object, rather
than a general matrix object.

Diagonal matrix with unit elements can be created using @dfn{eye}.
Some other built-in functions can also return diagonal matrices. Examples include
@dfn{balance} or @dfn{inv}.

Example:
@example
  diag (1:4)
@result{}
Diagonal Matrix

   1   0   0   0
   0   2   0   0
   0   0   3   0
   0   0   0   4

  diag(1:3,5,3)

@result{}
Diagonal Matrix

   1   0   0
   0   2   0
   0   0   3
   0   0   0
   0   0   0
@end example  

@node Creating Permutation Matrices
@subsection Creating Permutation Matrices

For creating permutation matrices, Octave does not introduce a new function, but
rather overrides an existing syntax: permutation matrices can be conveniently
created by indexing an identity matrix by permutation vectors.
That is, if @var{q} is a permutation vector of length @var{n}, the expression
@example
  P = eye (n) (:, q);
@end example
will create a permutation matrix - a special matrix object.
@example
eye (n) (q, :) 
@end example
will also work (and create a row permutation matrix), as well as 
@example
eye (n) (q1, q2).
@end example

For example:
@example
  eye (4) ([1,3,2,4],:)
@result{}
Permutation Matrix

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1

  eye (4) (:,[1,3,2,4])
@result{}
Permutation Matrix

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1
@end example

Mathematically, an identity matrix is both diagonal and permutation matrix.
In Octave, @code{eye (n)} returns a diagonal matrix, because a matrix
can only have one class. You can convert this diagonal matrix to a permutation
matrix by indexing it by an identity permutation, as shown below.
This is a special property of the identity matrix; indexing other diagonal
matrices generally produces a full matrix.

@example
  eye (3)
@result{}
Diagonal Matrix

   1   0   0
   0   1   0
   0   0   1

  eye(3)(1:3,:)
@result{}
Permutation Matrix

   1   0   0
   0   1   0
   0   0   1
@end example

Some other built-in functions can also return permutation matrices. Examples include
@dfn{inv} or @dfn{lu}.

@node Explicit and Implicit Conversions
@subsection Explicit and Implicit Conversions

The diagonal and permutation matrices are special objects in their own right. A number
of operations and built-in functions are defined for these matrices to use special,
more efficient code than would be used for a full matrix in the same place. Examples
are given in further sections.

To facilitate smooth mixing with full matrices, backward compatibility, and
compatibility with Matlab, the diagonal and permutation matrices should allow
any operation that works on full matrices, and will either treat it specially,
or implicitly convert themselves to full matrices.

Instances include matrix indexing, except for extracting a single element or
a leading submatrix, indexed assignment, or applying most mapper functions,
such as @dfn{exp}.

An explicit conversion to a full matrix can be requested using the built-in
function @dfn{full}. It should also be noted that the diagonal and permutation
matrix objects will cache the result of the conversion after it is first
requested (explicitly or implicitly), so that subsequent conversions will
be very cheap.

@node Matrix Algebra
@section Linear Algebra with Diagonal and Permutation Matrices

As has been already said, diagonal and permutation matrices make it
possible to use efficient algorithms while preserving natural linear
algebra syntax. This section describes in detail the operations that
are treated specially when performed on these special matrix objects.

@menu
* Expressions Involving Diagonal Matrices::
* Expressions Involving Permutation Matrices::
@end menu

@node Expressions Involving Diagonal Matrices
@subsection Expressions Involving Diagonal Matrices

Assume @var{D} is a diagonal matrix. If @var{M} is a full matrix,
then @code{D*M} will scale the rows of @var{M}. That means,
if @code{S = D*M}, then for each pair of indices
i,j it holds 
@iftex
@tex
$$S_{ij} = D_{ii} M_{ij}$$
@end tex
@end iftex
@ifnottex
@example
S(i,j) = D(i,i) * M(i,j).
@end example
@end ifnottex
Similarly, @code{M*D} will do a column scaling.

The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
If @code{m < n}, then the expression @code{D*M} is equivalent to
@example
D(:,1:m) * M(1:m,:),
@end example
i.e. trailing @code{n-m} rows of @var{M} are ignored. If @code{m > n}, 
then @code{D*M} is equivalent to 
@example
[D(1:n,n) * M; zeros(m-n, columns (M))],
@end example
i.e. null rows are appended to the result.
The situation for right-multiplication @code{M*D} is analogous.

The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
They are equivalent to solving a diagonal (or rectangular diagonal)
in a least-squares minimum-norm sense. In exact arithmetics, this is
equivalent to multiplying by a pseudoinverse. The pseudoinverse of
a rectangular diagonal matrix is again a rectangular diagonal matrix
with swapped dimensions, where each nonzero diagonal element is replaced
by its reciprocal.
The matrix division algorithms do, in fact, use division rather than 
multiplication by reciprocals for better numerical accuracy; otherwise, they
honor the above definition.  Note that a diagonal matrix is never truncated due
to ill-conditioning; otherwise, it would not be much useful for scaling. This
is typically consistent with linear algebra needs. A full matrix that only
happens to be diagonal (an is thus not a special object) is of course treated
normally.

If @var{D1} and @var{D2} are both diagonal matrices, then the expressions
@example
D1 + D2
D1 - D2 
D1 * D2 
D1 / D2 
D1 \ D2
@end example
again produce diagonal matrices, provided that normal
dimension matching rules are obeyed. The relations used are same as described above.

Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or raised
to a scalar power if it is square, producing diagonal matrix result in all cases. 

A diagonal matrix can also be transposed or conjugate-transposed, giving the expected
result. Extracting a leading submatrix of a diagonal matrix, i.e. @code{D(1:m,1:n)},
will produce a diagonal matrix, other indexing expressions will implicitly convert to
full matrix.

Adding a diagonal matrix to a full matrix only operates on the diagonal elements. Thus,
@example
A = A + eps * eye (n)
@end example
is an efficient method of augmenting the diagonal of a matrix. Subtraction works
analogically.

When involved in expressions with other element-by-element operators, @code{.*},
@code{./}, @code{.\} or @code{.^}, an implicit conversion to full matrix will
take place. This is not always strictly necessary but chosen to facilitate
better consistency with Matlab.

@node Expressions Involving Permutation Matrices
@subsection Expressions Involving Permutation Matrices

If @var{P} is a permutation matrix and @var{M} a matrix, the expression
@code{P*M} will permute the rows of @var{M}. Similarly, @code{M*P} will
yield a column permutation. 
Matrix division @code{P\M} and @code{M/P} can be used to do inverse permutation.

The previously described syntax for creating permutation matrices can actually
help an user to understand the connection between a permutation matrix and
a permuting vector. Namely, the following holds, where @code{I = eye (n)}
is an identity matrix:
@example
  I(p,:) * M = (I*M) (p,:) = M(p,:)
@end example
Similarly,
@example
  M * I(:,p) = (M*I) (:,p) = M(:,p)
@end example

The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.

A permutation matrix can be transposed (or conjugate-transposed, which is the
same, because a permutation matrix is never complex), inverting the
permutation, or equivalently, turning a row-permutation matrix into a
column-permutation one. For permutation matrices, transpose is equivalent to
inversion, thus @code{P\M} is equivalent to @code{P'*M}.  Transpose of a
permutation matrix (or inverse) is a constant-time operation, flipping only a
flag internally, and thus the choice between the two above equivalent
expressions for inverse permuting is completely up to the user's taste.

Two permutation matrices can be multiplied or divided (if their sizes match), performing
a composition of permutations. Also a permutation matrix can be indexed by a permutation
vector (or two vectors), giving again a permutation matrix.
Any other operations do not generally yield a permutation matrix and will thus
trigger the implicit conversion.

@node Function Support
@section Functions That Are Aware of These Matrices

This section lists the built-in functions that are aware of diagonal and
permutation matrices on input, or can return them as output. Passed to other
functions, these matrices will in general trigger an implicit conversion.
(Of course, user-defined dynamically linked functions may also work with
diagonal or permutation matrices).

@menu
* Diagonal Matrix Functions::
* Permutation Matrix Functions::
@end menu

@node Diagonal Matrix Functions
@subsection Diagonal Matrix Functions

@dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
a diagonal matrix. @dfn{det} will use an efficient straightforward calculation
when given a diagonal matrix, as well as @dfn{cond}.
The following mapper functions can be applied to a diagonal matrix
without converting it to a full one:
@dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}. 
A diagonal matrix can also be returned from the @dfn{balance}
and @dfn{svd} functions.

@node Permutation Matrix Functions
@subsection Permutation Matrix Functions

@dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
specialness. @dfn{det} can be applied to a permutation matrix, efficiently
calculating the sign of the permutation (which is equal to the determinant).

A permutation matrix can also be returned from the built-in functions
@dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.

@node Example Codes
@section Some Examples of Usage

The following can be used to solve a linear system @code{A*x = b}
using the pivoted LU factorization:
@example
  [L, U, P] = lu (A); ## now L*U = P*A
  x = U \ L \ P*b;
@end example

@noindent
This is how you normalize columns of a matrix @var{X} to unit norm:
@example
  s = norm (X, "columns");
  X = diag (s) \ X;
@end example

@noindent
The following expression is a way to efficiently calculate the sign of a
permutation, given by a permutation vector @var{p}. It will also work
in earlier versions of Octave, but slowly.
@example
  det (eye (length (p))(p, :))
@end example

@noindent
Finally, here's how you solve a linear system @code{A*x = b} 
with Tikhonov regularization (ridge regression) using SVD (a skeleton only):
@example
  m = rows (A); n = columns (A);
  [U, S, V] = svd (A);
  ## determine the regularization factor alpha
  ## alpha = ...
  ## transform to orthogonal basis
  b = U'*b;
  ## Use the standard formula, replacing A with S.
  ## S is diagonal, so the following will be very fast and accurate.
  x = (S'*S + alpha^2 * eye (n)) \ (S' * b);
  ## transform to solution basis
  x = V*x;
@end example


@node Zeros Treatment
@section The Differences in Treatment of Zero Elements

Making diagonal and permutation matrices special matrix objects in their own
right and the consequent usage of smarter algorithms for certain operations
implies, as a side effect, small differences in treating zeros.
The contents of this section applies also to sparse matrices, discussed in
the following chapter.

The IEEE standard defines the result of the expressions @code{0*Inf} and 
@code{0*NaN} as @code{NaN}, as it has been generally agreed that this is the
best compromise.
Numerical software dealing with structured and sparse matrices (including
Octave) however, almost always makes a distinction between a "numerical zero"
and an "assumed zero". 
A "numerical zero" is a zero value occuring in a place where any floating-point
value could occur. It is normally stored somewhere in memory as an explicit
value. 
An "assumed zero", on the contrary, is a zero matrix element implied by the
matrix structure (diagonal, triangular) or a sparsity pattern; its value is
usually not stored explicitly anywhere, but is implied by the underlying
data structure.

The primary distinction is that an assumed zero, when multiplied 
by any number, or divided by any nonzero number,
yields *always* a zero, even when, e.g., multiplied by @code{Inf}
or divided by @code{NaN}.
The reason for this behavior is that the numerical multiplication is not
actually performed anywhere by the underlying algorithm; the result is
just assumed to be zero. Equivalently, one can say that the part of the
computation involving assumed zeros is performed symbolically, not numerically.

This behavior not only facilitates the most straightforward and efficient
implementation of algorithms, but also preserves certain useful invariants,
like:
@itemize
@item scalar * diagonal matrix is a diagonal matrix
@item sparse matrix / scalar preserves the sparsity pattern
@item permutation matrix * matrix is equivalent to permuting rows
@end itemize
all of these natural mathematical truths would be invalidated by treating
assumed zeros as numerical ones.

Note that certain competing software does not strictly follow this principle
and converts assumed zeros to numerical zeros in certain cases, while not doing
so in other cases. As of today, there are no intentions to mimick such behavior 
in Octave.

Examples of effects of assumed zeros vs. numerical zeros:
@example
Inf * eye (3)
@result{}
   Inf     0     0
     0   Inf     0
     0     0   Inf

Inf * speye (3)
@result{}
Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])

  (1, 1) -> Inf
  (2, 2) -> Inf
  (3, 3) -> Inf

Inf * full (eye (3))
@result{}
   Inf   NaN   NaN
   NaN   Inf   NaN
   NaN   NaN   Inf

@end example

@example
diag(1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3

sparse(1:3,1:3,1:3) * [NaN; 1; 1]
@result{}
   NaN
     2
     3
[1,0,0;0,2,0;0,0,3] * [NaN; 1; 1]
@result{}
   NaN
   NaN
   NaN
@end example