changeset 8839:fcba62cc4549

add chapter about diagonal and permutation matrices to manual
author Jaroslav Hajek <highegg@gmail.com>
date Mon, 23 Feb 2009 13:55:44 +0100
parents dea5bd01e6d7
children c690e3772583
files doc/ChangeLog doc/interpreter/Makefile.in doc/interpreter/diagperm.txi doc/interpreter/matrix.txi doc/interpreter/octave.texi
diffstat 5 files changed, 420 insertions(+), 5 deletions(-) [+]
line wrap: on
line diff
--- a/doc/ChangeLog	Mon Feb 23 12:47:30 2009 +0100
+++ b/doc/ChangeLog	Mon Feb 23 13:55:44 2009 +0100
@@ -1,3 +1,11 @@
+2009-02-23  Jaroslav Hajek  <highegg@gmail.com>
+
+	* interpreter/diagperm.txi: New file.
+	* interpreter/octave.texi: Reference it.
+	* interpreter/Makefile.in: Include it.
+	* interpreter/matrix.txi: Move @DOCSTRING(diag) to diagperm.txi.
+
+
 2009-02-19  John W. Eaton  <jwe@octave.org>
 
 	* doc/interperter: Include @DOCSTRING commands for the following
--- a/doc/interpreter/Makefile.in	Mon Feb 23 12:47:30 2009 +0100
+++ b/doc/interpreter/Makefile.in	Mon Feb 23 13:55:44 2009 +0100
@@ -110,9 +110,9 @@
 
 SUB_SOURCE := arith.txi audio.txi basics.txi bugs.txi \
 	container.txi contrib.txi cp-idx.txi data.txi \
-	debug.txi diffeq.txi dynamic.txi emacs.txi errors.txi eval.txi \
-	expr.txi fn-idx.txi func.txi geometry.txi gpl.txi \
-	grammar.txi image.txi install.txi interp.txi \
+	debug.txi diffeq.txi diagperm.txi dynamic.txi emacs.txi \
+	errors.txi eval.txi expr.txi fn-idx.txi func.txi geometry.txi \
+	gpl.txi grammar.txi image.txi install.txi interp.txi \
 	intro.txi io.txi linalg.txi matrix.txi nonlin.txi numbers.txi \
 	oop.txi op-idx.txi optim.txi package.txi plot.txi poly.txi preface.txi \
 	quad.txi set.txi signal.txi sparse.txi stats.txi \
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/interpreter/diagperm.txi	Mon Feb 23 13:55:44 2009 +0100
@@ -0,0 +1,399 @@
+@c Copyright (C) 2009 Jaroslav Hajek
+@c
+@c This file is part of Octave.
+@c
+@c Octave is free software; you can redistribute it and/or modify it
+@c under the terms of the GNU General Public License as published by the
+@c Free Software Foundation; either version 3 of the License, or (at
+@c your option) any later version.
+@c 
+@c Octave is distributed in the hope that it will be useful, but WITHOUT
+@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+@c for more details.
+@c 
+@c You should have received a copy of the GNU General Public License
+@c along with Octave; see the file COPYING.  If not, see
+@c <http://www.gnu.org/licenses/>.
+
+@ifhtml
+@set htmltex
+@end ifhtml
+@iftex
+@set htmltex
+@end iftex
+
+@node Diagonal and Permutation Matrices 
+@chapter Diagonal and Permutation Matrices
+
+@menu
+* Basic Usage::          Creation and Manipulation of Diagonal and Permutation Matrices
+* Matrix Algebra::   	 Linear Algebra with Diagonal and Permutation Matrices
+* Function Support::     Functions That Are Aware of These Matrices
+* Example Codes::        Some Examples of Usage
+* Zeros Treatment::      The Differences in Treatment of Zero Elements
+@end menu
+
+@node Basic Usage
+@section The Creation and Manipulation of Diagonal and Permutation Matrices
+
+A diagonal matrix is defined as a matrix that has zero entries outside the main diagonal;
+that is, @code{D(i,j) == 0} if @code{i != j}.
+Most often, square diagonal matrices are considered; however, the definition can equally
+be applied to nonsquare matrices, in which case we usually speak of a rectangular diagonal 
+matrix. For more information, see @url{http://en.wikipedia.org/wiki/Diagonal_matrix}.
+
+A permutation matrix is defined as a square matrix that has a single element equal to unity
+in each row and each column; all other elements are zero. That is, there exists a 
+permutation (vector) @code{p} such that @code{P(i,j) == 1} if @code{j == p(i)} and
+@code{P(i,j) == 0} otherwise.  For more information, see 
+@url{http://en.wikipedia.org/wiki/Permutation_matrix}.
+
+Octave provides special treatment of real and complex rectangular diagonal matrices,
+as well as permutation matrices.  They are stored as special objects, using efficient
+storage and algorithms, which makes it easy to write matrix algebra expressions in
+the Octave language that are both readable and efficient.
+
+@menu
+* Creating Diagonal Matrices::
+* Creating Permutation Matrices::
+* Explicit and Implicit Conversions::
+@end menu
+
+@node Creating Diagonal Matrices
+@subsection Creating Diagonal Matrices
+
+The most common and easiest way to create a diagonal matrix is using the built-in
+function @dfn{diag}. The expression @code{diag (@var{v})}, with @var{v} a vector,
+will create a square diagonal matrix with elements on the main diagonal given
+by the elements of @var{v}, and size equal to the length of @var{v}.
+@code{diag (@var{v}, m, n)} can be used to construct a rectangular diagonal matrix.
+The result of these expressions will be a special diagonal matrix object, rather
+than a general matrix object.
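+
+For example, a minimal sketch of both forms (the variable names are arbitrary):
+
+@example
+  D = diag ([1, 2, 3]);          ## 3-by-3 square diagonal matrix object
+  R = diag ([1, 2, 3], 3, 5);    ## 3-by-5 rectangular diagonal matrix
+@end example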
+
+Some other built-in functions can also return diagonal matrices. Examples include
+@dfn{balance} and @dfn{inv}.
+
+@DOCSTRING(diag)
+
+@node Creating Permutation Matrices
+@subsection Creating Permutation Matrices
+
+For creating permutation matrices, Octave does not introduce a new function, but
+rather overrides an existing syntax. That is, if @var{q} is a permutation vector
+of length @var{n}, the expression
+@example
+  @var{P} = eye (@var{n}) (:, @var{q});
+@end example
+will create a permutation matrix, stored as a special matrix object.  The expression
+@example
+eye (@var{n}) (@var{q}, :)
+@end example
+will also work, giving a row-permutation matrix, as will
+@example
+eye (@var{n}) (@var{q1}, @var{q2}).
+@end example
+Note that the identity, @code{eye (@var{n})}, is a diagonal matrix by definition,
+but should work in any place where a permutation matrix is requested.
+
+Some other built-in functions can also return permutation matrices. Examples include
+@dfn{inv} and @dfn{lu}.
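+
+As a short sketch (assuming the permutation vector shown), a permutation matrix
+can be created and applied like this:
+
+@example
+  q = [2, 3, 1];
+  P = eye (3) (:, q);    ## column-permutation matrix object
+  A = magic (3);
+  B = A * P;             ## permutes the columns of A, same as A(:, q)
+@end example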
+
+@node Explicit and Implicit Conversions
+@subsection Explicit and Implicit Conversions
+
+The diagonal and permutation matrices are special objects in their own right.  A number
+of operations and built-in functions are defined for these matrices to use special,
+more efficient code than would be used for a full matrix in the same place.  Examples
+are given in the following sections.
+
+To facilitate smooth mixing with full matrices, backward compatibility, and
+compatibility with Matlab, the diagonal and permutation matrices should allow
+any operation that works on full matrices, and will either handle it specially
+or implicitly convert themselves to full matrices.
+
+Instances include matrix indexing (except for extracting a single element),
+indexed assignment, and applying most mapper functions, such as @dfn{exp}.
+
+An explicit conversion to a full matrix can be requested using the built-in
+function @dfn{full}. It should also be noted that the diagonal and permutation
+matrix objects will cache the result of the conversion after it is first
+requested (explicitly or implicitly), so that subsequent conversions will
+be very cheap.
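+
+A brief sketch of the explicit conversion (and of an implicit one caused by a
+mapper function):
+
+@example
+  D = diag ([1, 2, 3]);
+  F = full (D);     ## F is an ordinary full matrix
+  E = exp (D);      ## exp triggers an implicit conversion to a full matrix
+@end example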
+
+@node Matrix Algebra
+@section Linear Algebra with Diagonal and Permutation Matrices
+
+As already mentioned, diagonal and permutation matrices make it
+possible to use efficient algorithms while preserving natural linear
+algebra syntax.  This section describes in detail the operations that
+are treated specially when performed on these special matrix objects.
+
+@menu
+* Expressions Involving Diagonal Matrices::
+* Expressions Involving Permutation Matrices::
+@end menu
+
+@node Expressions Involving Diagonal Matrices
+@subsection Expressions Involving Diagonal Matrices
+
+Assume @var{D} is a diagonal matrix. If @var{M} is a full matrix,
+then @code{@var{D}*@var{M}} will scale the rows of @var{M}.  That is,
+if @code{@var{S} = @var{D}*@var{M}}, then for each pair of indices
+i,j:
+@example
+S(i,j) = D(i,i) * M(i,j).
+@end example
+Similarly, @code{M*D} will do a column scaling.
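+
+For instance, a small sketch of both scalings:
+
+@example
+  D = diag ([2, 3]);
+  M = [1, 1; 1, 1];
+  D * M     ## row scaling: [2, 2; 3, 3]
+  M * D     ## column scaling: [2, 3; 2, 3]
+@end example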
+
+The matrix @var{D} may also be rectangular, m-by-n where @code{m != n}.
+If @code{m < n}, then the expression @code{D*M} is equivalent to
+@example
+D(:,1:m) * M(1:m,:),
+@end example
+i.e. trailing @code{n-m} rows of @var{M} are ignored. If @code{m > n}, 
+then @code{D*M} is equivalent to 
+@example
+[D(1:n,:) * M; zeros(m-n, columns (M))],
+@end example
+i.e. null rows are appended to the result.
+The situation for right-multiplication @code{M*D} is analogous.
+
+The expressions @code{D \ M} and @code{M / D} perform inverse scaling.
+They are equivalent to solving a diagonal (or rectangular diagonal)
+linear system in a least-squares minimum-norm sense.  In exact arithmetic,
+this is equivalent to multiplying by the pseudoinverse.  The pseudoinverse of
+a rectangular diagonal matrix is again a rectangular diagonal matrix
+with swapped dimensions, where each nonzero diagonal element is replaced
+by its reciprocal.
+The matrix division algorithms do, in fact, use division rather than 
+multiplication by reciprocals for better numerical accuracy; otherwise, they
+honor the above definition.  Note that a diagonal matrix is never truncated due
+to ill-conditioning; otherwise, it would not be of much use for scaling. This
+is typically consistent with linear algebra needs. A full matrix that only
+happens to be diagonal (and is thus not a special object) is of course treated
+normally.
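+
+As a small sketch of inverse scaling with a rectangular diagonal matrix
+(the values are chosen arbitrarily):
+
+@example
+  D = diag ([2, 4], 2, 3);    ## 2-by-3 rectangular diagonal matrix
+  M = [2, 2; 4, 4];
+  X = D \ M;                  ## minimum-norm solution: [1, 1; 1, 1; 0, 0]
+@end example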
+
+If @var{D1} and @var{D2} are both diagonal matrices, then the expressions
+@example
+@var{D1} + @var{D2}
+@var{D1} - @var{D2} 
+@var{D1} * @var{D2} 
+@var{D1} / @var{D2} 
+@var{D1} \ @var{D2}
+@end example
+again produce diagonal matrices, provided that the usual
+dimension matching rules are obeyed.  The relations used are the same as described above.
+
+Also, a diagonal matrix @var{D} can be multiplied or divided by a scalar, or raised
+to a scalar power if it is square, producing a diagonal matrix result in all cases.
+
+A diagonal matrix can also be transposed or conjugate-transposed, giving the expected
+result.  Extracting a leading submatrix of a diagonal matrix, i.e. @code{@var{D}(1:m,1:n)},
+will produce a diagonal matrix; other indexing expressions will implicitly convert to
+a full matrix.
+
+When involved in expressions with the element-by-element operators @code{.*},
+@code{./}, @code{.\} or @code{.^}, an implicit conversion to a full matrix will
+also take place.  This is not always strictly necessary, but is chosen to facilitate
+better consistency with Matlab.
+
+@node Expressions Involving Permutation Matrices
+@subsection Expressions Involving Permutation Matrices
+
+If @var{P} is a permutation matrix and @var{M} a matrix, the expression
+@code{P*M} will permute the rows of @var{M}. Similarly, @code{M*P} will
+yield a column permutation. 
+Matrix division @code{P\M} and @code{M/P} can be used to do inverse permutation.
+
+The previously described syntax for creating permutation matrices can actually
+help a user understand the connection between a permutation matrix and
+a permuting vector. Namely, the following holds, where @code{I = eye (n)}
+is an identity matrix:
+@example
+  I(p,:) * M = (I*M) (p,:) = M(p,:)
+@end example
+Similarly,
+@example
+  M * I(:,p) = (M*I) (:,p) = M(:,p)
+@end example
+
+The expressions @code{I(p,:)} and @code{I(:,p)} are permutation matrices.
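+
+A short sketch verifying these identities numerically (each check should
+yield 1):
+
+@example
+  p = [3, 1, 2];
+  M = magic (3);
+  I = eye (3);
+  all (all (I(p,:) * M == M(p,:)))
+  all (all (M * I(:,p) == M(:,p)))
+@end example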
+
+A permutation matrix can be transposed (or conjugate-transposed, which is the
+same, because a permutation matrix is never complex), inverting the
+permutation, or equivalently, turning a row-permutation matrix into a
+column-permutation one.  For permutation matrices, the transpose is equivalent to
+the inverse, thus @code{P\M} is equivalent to @code{P'*M}.  Transposing (or
+inverting) a permutation matrix is a constant-time operation that merely flips an
+internal flag, so the choice between the two equivalent expressions above for
+inverse permuting is entirely up to the user's taste.
+
+Two permutation matrices can be multiplied or divided (if their sizes match), performing
+a composition of permutations.  Also, a permutation matrix can be indexed by a permutation
+vector (or two vectors), again giving a permutation matrix.
+Any other operations do not generally yield a permutation matrix and will thus
+trigger the implicit conversion.
+
+@node Function Support
+@section Functions That Are Aware of These Matrices
+
+This section lists the built-in functions that are aware of diagonal and
+permutation matrices on input, or can return them as output. Passed to other
+functions, these matrices will in general trigger an implicit conversion.
+(Of course, user-defined dynamically linked functions may also work with
+diagonal or permutation matrices).
+
+@menu
+* Diagonal Matrix Functions::
+* Permutation Matrix Functions::
+@end menu
+
+@node Diagonal Matrix Functions
+@subsection Diagonal Matrix Functions
+
+@dfn{inv} and @dfn{pinv} can be applied to a diagonal matrix, yielding again
+a diagonal matrix. @dfn{det} will use an efficient straightforward calculation
+when given a diagonal matrix, as will @dfn{cond}.
+The following mapper functions can be applied to a diagonal matrix
+without converting it to a full one:
+@dfn{abs}, @dfn{real}, @dfn{imag}, @dfn{conj}, @dfn{sqrt}. 
+A diagonal matrix can also be returned from the @dfn{balance}
+and @dfn{svd} functions.
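+
+For example, a small sketch of operations that preserve the diagonal
+structure:
+
+@example
+  D = diag ([2, -3, 4]);
+  Di = inv (D);    ## again a diagonal matrix, with reciprocal entries
+  d = det (D);     ## -24, computed from the diagonal entries
+  A = abs (D);     ## diagonal matrix with entries 2, 3, 4
+@end example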
+
+@node Permutation Matrix Functions
+@subsection Permutation Matrix Functions
+
+@dfn{inv} and @dfn{pinv} will invert a permutation matrix, preserving its
+specialness. @dfn{det} can be applied to a permutation matrix, efficiently
+calculating the sign of the permutation (which is equal to the determinant).
+
+A permutation matrix can also be returned from the built-in functions
+@dfn{lu} and @dfn{qr}, if a pivoted factorization is requested.
+
+@node Example Codes
+@section Some Examples of Usage
+
+The following can be used to solve a linear system @code{A*x = b}
+using the pivoted LU factorization:
+@example
+  [L, U, P] = lu (A); ## now L*U = P*A
+  x = U \ (L \ (P*b));
+@end example
+
+This is how you normalize columns of a matrix @var{X} to unit norm:
+@example
+  s = norm (X, "columns");
+  X = X / diag (s);
+@end example
+
+The following expression is a way to efficiently calculate the sign of a
+permutation, given by a permutation vector @var{p}. It will also work
+in earlier versions of Octave, but slowly.
+@example
+  det (eye (length (p))(p, :))
+@end example
+
+Finally, here's how you solve a linear system @code{A*x = b} 
+with Tikhonov regularization using SVD (a skeleton only):
+@example
+  m = rows (A); n = columns (A);
+  [U, S, V] = svd (A);
+  ## determine the regularization factor alpha
+  ## ...
+  ## and solve
+  x = U'*b;
+  x = (S + alpha*eye (m, n)) \ x; ## this will be very efficient
+  x = V*x;
+@end example
+
+@node Zeros Treatment
+@section The Differences in Treatment of Zero Elements
+
+Making diagonal and permutation matrices special matrix objects in their own
+right, and the consequent use of smarter algorithms for certain operations,
+implies, as a side effect, small differences in the treatment of zeros.
+The contents of this section also apply to sparse matrices, discussed in
+the following chapter.
+
+The IEEE standard defines the result of the expressions @code{0*Inf} and 
+@code{0*NaN} as @code{NaN}, as it has been generally agreed that this is the
+best compromise.
+Numerical software dealing with structured and sparse matrices (including
+Octave), however, almost always makes a distinction between a "numerical zero"
+and an "assumed zero". 
+A "numerical zero" is a zero value occurring in a place where any floating-point
+value could occur. It is normally stored somewhere in memory as an explicit
+value. 
+An "assumed zero", on the contrary, is a zero matrix element implied by the
+matrix structure (diagonal, triangular) or a sparsity pattern; its value is
+usually not stored explicitly anywhere, but is implied by the underlying
+data structure.
+
+The primary distinction is that an assumed zero, when multiplied or divided
+by any number, @emph{always} yields a zero, even when, e.g., multiplied by @code{Inf}
+or divided by @code{NaN}.
+The reason for this behavior is that the numerical multiplication is not
+actually performed anywhere by the underlying algorithm; the result is
+just assumed to be zero. Equivalently, one can say that the part of the
+computation involving assumed zeros is performed symbolically, not numerically.
+
+This behavior not only facilitates the most straightforward and efficient
+implementation of algorithms, but also preserves certain useful invariants,
+like:
+@itemize
+@item scalar * diagonal matrix is a diagonal matrix
+@item sparse matrix / scalar preserves the sparsity pattern
+@item permutation matrix * matrix is equivalent to permuting rows
+@end itemize
+All of these natural mathematical identities would be invalidated by treating
+assumed zeros as numerical ones.
+
+Note that certain competing software does not strictly follow this principle
+and converts assumed zeros to numerical zeros in certain cases, while not doing
+so in other cases. As of today, there are no intentions to mimic such behavior 
+in Octave.
+
+Examples of effects of assumed zeros vs. numerical zeros:
+@example
+Inf * eye (3)
+@result{}
+   Inf     0     0
+     0   Inf     0
+     0     0   Inf
+
+Inf * speye (3)
+@result{}
+Compressed Column Sparse (rows = 3, cols = 3, nnz = 3 [33%])
+
+  (1, 1) -> Inf
+  (2, 2) -> Inf
+  (3, 3) -> Inf
+
+Inf * full (eye (3))
+@result{}
+   Inf   NaN   NaN
+   NaN   Inf   NaN
+   NaN   NaN   Inf
+
+@end example
+
+@example
+diag(1:3) * [NaN; 1; 1]
+@result{}
+   NaN
+     2
+     3
+
+sparse(1:3,1:3,1:3) * [NaN; 1; 1]
+@result{}
+   NaN
+     2
+     3
+
+[1,0,0;0,2,0;0,0,3] * [NaN; 1; 1]
+@result{}
+   NaN
+   NaN
+   NaN
+@end example
+
--- a/doc/interpreter/matrix.txi	Mon Feb 23 12:47:30 2009 +0100
+++ b/doc/interpreter/matrix.txi	Mon Feb 23 13:55:44 2009 +0100
@@ -239,8 +239,6 @@
 
 @DOCSTRING(randperm)
 
-@DOCSTRING(diag)
-
 The functions @code{linspace} and @code{logspace} make it very easy to
 create vectors with evenly or logarithmically spaced elements.
 @xref{Ranges}.
--- a/doc/interpreter/octave.texi	Mon Feb 23 12:47:30 2009 +0100
+++ b/doc/interpreter/octave.texi	Mon Feb 23 13:55:44 2009 +0100
@@ -165,6 +165,7 @@
 * Arithmetic::                  
 * Linear Algebra::              
 * Nonlinear Equations::         
+* Diagonal and Permutation Matrices::
 * Sparse Matrices::
 * Numerical Integration::                  
 * Differential Equations::      
@@ -419,6 +420,14 @@
 * Matrix Factorizations::       
 * Functions of a Matrix::       
 
+Diagonal and Permutation Matrices
+
+* Basic Usage::
+* Matrix Algebra::
+* Function Support::
+* Example Codes::
+* Zeros Treatment::
+
 Sparse Matrices
 
 * Basics::
@@ -594,6 +603,7 @@
 @include arith.texi
 @include linalg.texi
 @include nonlin.texi
+@include diagperm.texi
 @include sparse.texi
 @include quad.texi
 @include diffeq.texi