view doc/interpreter/numbers.txi @ 31069:43974344fe19 stable

doc: Expand documentation about type promotion and demotion (bug #62283)
author Arun Giridhar <arungiridhar@gmail.com>
date Sat, 04 Jun 2022 17:13:38 -0400
parents 796f54d4ddbf
children 2dee06f4635c
line wrap: on
line source

@c Copyright (C) 1996-2022 The Octave Project Developers
@c
@c This file is part of Octave.
@c
@c Octave is free software: you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by
@c the Free Software Foundation, either version 3 of the License, or
@c (at your option) any later version.
@c
@c Octave is distributed in the hope that it will be useful, but
@c WITHOUT ANY WARRANTY; without even the implied warranty of
@c MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
@c GNU General Public License for more details.
@c
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <https://www.gnu.org/licenses/>.

@node Numeric Data Types
@chapter Numeric Data Types
@cindex numeric constant
@cindex numeric value

A @dfn{numeric constant} may be a scalar, a vector, or a matrix, and it may
contain complex values.

The simplest form of a numeric constant, a scalar, is a single number.  Note
that by default numeric constants are represented within Octave by IEEE 754
double precision (binary64) floating-point format (complex constants are
stored as pairs of binary64 values).  It is, however, possible to represent
real integers as described in @ref{Integer Data Types}.

If the numeric constant is a real integer, it can be defined in decimal,
hexadecimal, or binary notation.  Hexadecimal notation starts with @samp{0x} or
@samp{0X}, binary notation starts with @samp{0b} or @samp{0B}, otherwise
decimal notation is assumed.  As a consequence, @samp{0b} is not a hexadecimal
number, in fact, it is not a valid number at all.

For better readability, digits may be partitioned by the underscore separator
@samp{_}, which is ignored by the Octave interpreter.  Here are some examples
of real-valued integer constants, which all represent the same value and are
internally stored as binary64:

@example
@group
42            # decimal notation
0x2A          # hexadecimal notation
0b101010      # binary notation
0b10_1010     # underscore notation
round (42.1)  # also binary64
@end group
@end example

In decimal notation, the numeric constant may be denoted as decimal fraction
or even in scientific (exponential) notation.  Note that this is not possible
for hexadecimal or binary notation.  Again, in the following example all
numeric constants represent the same value:

@example
@group
.105
1.05e-1
.00105e+2
@end group
@end example

Unlike most programming languages, complex numeric constants are denoted as
the sum of real and imaginary parts.  The imaginary part is denoted by a
real-valued numeric constant followed immediately by a complex value indicator
(@samp{i}, @samp{j}, @samp{I}, or @samp{J} which represents
@tex
  $\sqrt{-1}$).
@end tex
@ifnottex
  @code{sqrt (-1)}).
@end ifnottex
No spaces are allowed between the numeric constant and the complex value
indicator.  Some examples of complex numeric constants that all represent the
same value:

@example
@group
3 + 42i
3 + 42j
3 + 42I
3 + 42J
3.0 + 42.0i
3.0 + 0x2Ai
3.0 + 0b10_1010i
0.3e1 + 420e-1i
@end group
@end example

@DOCSTRING(double)

@DOCSTRING(complex)

@menu
* Matrices::
* Ranges::
* Single Precision Data Types::
* Integer Data Types::
* Bit Manipulations::
* Logical Values::
* Promotion and Demotion of Data Types::
* Predicates for Numeric Objects::
@end menu

@node Matrices
@section Matrices
@cindex matrices

@opindex [
@opindex ]
@opindex ;
@opindex ,

It is easy to define a matrix of values in Octave.  The size of the
matrix is determined automatically, so it is not necessary to explicitly
state the dimensions.  The expression

@example
a = [1, 2; 3, 4]
@end example

@noindent
results in the matrix
@tex
$$ a = \left[ \matrix{ 1 & 2 \cr 3 & 4 } \right] $$
@end tex
@ifnottex

@example
@group

        /      \
        | 1  2 |
  a  =  |      |
        | 3  4 |
        \      /

@end group
@end example

@end ifnottex

Elements of a matrix may be arbitrary expressions, provided that the
dimensions all make sense when combining the various pieces.  For
example, given the above matrix, the expression

@example
[ a, a ]
@end example

@noindent
produces the matrix

@example
@group
ans =

  1  2  1  2
  3  4  3  4
@end group
@end example

@noindent
but the expression

@example
[ a, 1 ]
@end example

@noindent
produces the error

@example
error: number of rows must match (1 != 2) near line 13, column 6
@end example

@noindent
(assuming that this expression was entered as the first thing on line
13, of course).

Inside the square brackets that delimit a matrix expression, Octave
looks at the surrounding context to determine whether spaces and newline
characters should be converted into element and row separators, or
simply ignored, so an expression like

@example
@group
a = [ 1 2
      3 4 ]
@end group
@end example

@noindent
will work.  However, some possible sources of confusion remain.  For
example, in the expression

@example
[ 1 - 1 ]
@end example

@noindent
the @samp{-} is treated as a binary operator and the result is the
scalar 0, but in the expression

@example
[ 1 -1 ]
@end example

@noindent
the @samp{-} is treated as a unary operator and the result is the
vector @code{[ 1, -1 ]}.  Similarly, the expression

@example
[ sin (pi) ]
@end example

@noindent
will be parsed as

@example
[ sin, (pi) ]
@end example

@noindent
and will result in an error since the @code{sin} function will be
called with no arguments.  To get around this, you must omit the space
between @code{sin} and the opening parenthesis, or enclose the
expression in a set of parentheses:

@example
[ (sin (pi)) ]
@end example

Whitespace surrounding the single quote character (@samp{'}, used as a
transpose operator and for delimiting character strings) can also cause
confusion.  Given @code{a = 1}, the expression

@example
[ 1 a' ]
@end example

@noindent
results in the single quote character being treated as a
transpose operator and the result is the vector @code{[ 1, 1 ]}, but the
expression

@example
[ 1 a ' ]
@end example

@noindent
produces the error message

@example
@group
parse error:

  syntax error

>>> [ 1 a ' ]
              ^
@end group
@end example

@noindent
because not doing so would cause trouble when parsing the valid expression

@example
[ a 'foo' ]
@end example

For clarity, it is probably best to always use commas and semicolons to
separate matrix elements and rows.

The maximum number of elements in a matrix is fixed when Octave is compiled.
The allowable number can be queried with the function @code{sizemax}.  Note
that other factors, such as the amount of memory available on your machine,
may limit the maximum size of matrices to something smaller.

@DOCSTRING(sizemax)

When you type a matrix or the name of a variable whose value is a
matrix, Octave responds by printing the matrix in with neatly aligned
rows and columns.  If the rows of the matrix are too large to fit on the
screen, Octave splits the matrix and displays a header before each
section to indicate which columns are being displayed.  You can use the
following variables to control the format of the output.

@DOCSTRING(output_precision)

It is possible to achieve a wide range of output styles by using
different values of @code{output_precision}.  Reasonable combinations can be
set using the @code{format} function.  @xref{Basic Input and Output}.

@DOCSTRING(split_long_rows)

Octave automatically switches to scientific notation when values become
very large or very small.  This guarantees that you will see several
significant figures for every value in a matrix.  If you would prefer to
see all values in a matrix printed in a fixed point format, you can use
the function @code{fixed_point_format}.  But doing so is not
recommended, because it can produce output that can easily be
misinterpreted.

@DOCSTRING(fixed_point_format)

@menu
* Empty Matrices::
@end menu

@node Empty Matrices
@subsection Empty Matrices

A matrix may have one or both dimensions zero, and operations on empty
matrices are handled as described by @nospell{Carl de Boor} in
@cite{An Empty Exercise}, SIGNUM, Volume 25, pages 2--6, 1990 and
@nospell{C. N. Nett and W. M. Haddad}, in
@cite{A System-Theoretic Appropriate Realization of the Empty Matrix Concept},
IEEE Transactions on Automatic Control, Volume 38, Number 5, May 1993.
@tex
Briefly, given a scalar $s$, an $m\times n$ matrix $M_{m\times n}$,
and an $m\times n$ empty matrix $[\,]_{m\times n}$ (with either one or
both dimensions equal to zero), the following are true:
$$
\eqalign{%
s \cdot [\,]_{m\times n} = [\,]_{m\times n} \cdot s &= [\,]_{m\times n}\cr
[\,]_{m\times n} + [\,]_{m\times n} &= [\,]_{m\times n}\cr
[\,]_{0\times m} \cdot M_{m\times n} &= [\,]_{0\times n}\cr
M_{m\times n} \cdot [\,]_{n\times 0} &= [\,]_{m\times 0}\cr
[\,]_{m\times 0} \cdot [\,]_{0\times n} &=  0_{m\times n}}
$$
@end tex
@ifnottex
Briefly, given a scalar @var{s}, an @var{m} by
@var{n} matrix @code{M(mxn)}, and an @var{m} by @var{n} empty matrix
@code{[](mxn)} (with either one or both dimensions equal to zero), the
following are true:

@example
@group
s * [](mxn) = [](mxn) * s = [](mxn)

    [](mxn) + [](mxn) = [](mxn)

    [](0xm) *  M(mxn) = [](0xn)

     M(mxn) * [](nx0) = [](mx0)

    [](mx0) * [](0xn) =  0(mxn)
@end group
@end example

@end ifnottex

By default, dimensions of the empty matrix are printed along with the
empty matrix symbol, @samp{[]}.  The built-in variable
@code{print_empty_dimensions} controls this behavior.

@DOCSTRING(print_empty_dimensions)

Empty matrices may also be used in assignment statements as a convenient
way to delete rows or columns of matrices.
@xref{Assignment Ops,,Assignment Expressions}.

When Octave parses a matrix expression, it examines the elements of the
list to determine whether they are all constants.  If they are, it
replaces the list with a single matrix constant.

@node Ranges
@section Ranges
@cindex range expressions
@cindex expression, range

@opindex :, range expressions

A @dfn{range} is a convenient way to write a row vector with evenly
spaced elements.  A range expression is defined by the value of the first
element in the range, an optional value for the increment between
elements, and a maximum value which the elements of the range will not
exceed.  The base, increment, and limit are separated by colons (the
@samp{:} character) and may contain any arithmetic expressions and
function calls.  If the increment is omitted, it is assumed to be 1.
For example, the range

@example
1 : 5
@end example

@noindent
defines the set of values @code{[ 1, 2, 3, 4, 5 ]}, and the range

@example
1 : 3 : 5
@end example

@noindent
defines the set of values @code{[ 1, 4 ]}.

Although a range constant specifies a row vector, Octave does @emph{not}
normally convert range constants to vectors unless it is necessary to do so.
This allows you to write a constant like @code{1 : 10000} without using
80,000 bytes of storage on a typical 32-bit workstation.

A common example of when it does become necessary to convert ranges into
vectors occurs when they appear within a vector (i.e., inside square
brackets).  For instance, whereas

@example
x = 0 : 0.1 : 1;
@end example

@noindent
defines @var{x} to be a variable of type @code{range} and occupies 24
bytes of memory, the expression

@example
y = [ 0 : 0.1 : 1];
@end example

@noindent
defines @var{y} to be of type @code{matrix} and occupies 88 bytes of
memory.

This space saving optimization may be disabled using the function
@dfn{optimize_range}.

@DOCSTRING(optimize_range)

Note that the upper (or lower, if the increment is negative) bound on
the range is not always included in the set of values, and that ranges
defined by floating point values can produce surprising results because
Octave uses floating point arithmetic to compute the values in the
range.  If it is important to include the endpoints of a range and the
number of elements is known, you should use the @code{linspace} function
instead (@pxref{Special Utility Matrices}).

When adding a scalar to a range, subtracting a scalar from it (or subtracting a
range from a scalar) and multiplying by scalar, Octave will attempt to avoid
unpacking the range and keep the result as a range, too, if it can determine
that it is safe to do so.  For instance, doing

@example
a = 2*(1:1e7) - 1;
@end example

@noindent
will produce the same result as @code{1:2:2e7-1}, but without ever forming a
vector with ten million elements.

Using zero as an increment in the colon notation, as @code{1:0:1} is not
allowed, because a division by zero would occur in determining the number of
range elements.  However, ranges with zero increment (i.e., all elements equal)
are useful, especially in indexing, and Octave allows them to be constructed
using the built-in function @code{ones}.  Note that because a range must be a
row vector, @code{ones (1, 10)} produces a range, while @code{ones (10, 1)}
does not.

When Octave parses a range expression, it examines the elements of the
expression to determine whether they are all constants.  If they are, it
replaces the range expression with a single range constant.

@node Single Precision Data Types
@section Single Precision Data Types

Octave includes support for single precision data types, and most of the
functions in Octave accept single precision values and return single
precision answers.  A single precision variable is created with the
@code{single} function.

@DOCSTRING(single)

for example:

@example
@group
sngl = single (rand (2, 2))
     @result{} sngl =
        0.37569   0.92982
        0.11962   0.50876
class (sngl)
    @result{} single
@end group
@end example

Many functions can also return single precision values directly.  For
example

@example
@group
ones (2, 2, "single")
zeros (2, 2, "single")
eye (2, 2,  "single")
rand (2, 2, "single")
NaN (2, 2, "single")
NA (2, 2, "single")
Inf (2, 2, "single")
@end group
@end example

@noindent
will all return single precision matrices.

@node Integer Data Types
@section Integer Data Types

Octave supports integer matrices as an alternative to using double
precision.  It is possible to use both signed and unsigned integers
represented by 8, 16, 32, or 64 bits.  It should be noted that most
computations require floating point data, meaning that integers will
often change type when involved in numeric computations.  For this
reason integers are most often used to store data, and not for
calculations.

In general most integer matrices are created by casting
existing matrices to integers.  The following example shows how to cast
a matrix into 32 bit integers.

@example
@group
float = rand (2, 2)
     @result{} float = 0.37569   0.92982
                0.11962   0.50876
integer = int32 (float)
     @result{} integer = 0  1
                  0  1
@end group
@end example

@noindent
As can be seen, floating point values are rounded to the nearest integer
when converted.

@DOCSTRING(isinteger)

@DOCSTRING(int8)

@DOCSTRING(uint8)

@DOCSTRING(int16)

@DOCSTRING(uint16)

@DOCSTRING(int32)

@DOCSTRING(uint32)

@DOCSTRING(int64)

@DOCSTRING(uint64)

@DOCSTRING(intmax)

@DOCSTRING(intmin)

@DOCSTRING(flintmax)

@menu
* Integer Arithmetic::
@end menu

@node Integer Arithmetic
@subsection Integer Arithmetic

While many numerical computations can't be carried out in integers,
Octave does support basic operations like addition and multiplication
on integers.  The operators @code{+}, @code{-}, @code{.*}, and @code{./}
work on integers of the same type.  So, it is possible to add two 32 bit
integers, but not to add a 32 bit integer and a 16 bit integer.

When doing integer arithmetic one should consider the possibility of
underflow and overflow.  This happens when the result of the computation
can't be represented using the chosen integer type.  As an example it is
not possible to represent the result of @math{10 - 20} when using
unsigned integers.  Octave makes sure that the result of integer
computations is the integer that is closest to the true result.  So, the
result of @math{10 - 20} when using unsigned integers is zero.

When doing integer division Octave will round the result to the nearest
integer.  This is different from most programming languages, where the
result is often floored to the nearest integer.  So, the result of
@code{int32 (5) ./ int32 (8)} is @code{1}.

@DOCSTRING(idivide)

@node Bit Manipulations
@section Bit Manipulations

Octave provides a number of functions for the manipulation of numeric
values on a bit by bit basis.  The basic functions to set and obtain the
values of individual bits are @code{bitset} and @code{bitget}.

@DOCSTRING(bitset)

@DOCSTRING(bitget)

The arguments to all of Octave's bitwise operations can be scalar or
arrays, except for @code{bitcmp}, whose @var{k} argument must a
scalar.  In the case where more than one argument is an array, then all
arguments must have the same shape, and the bitwise operator is applied
to each of the elements of the argument individually.  If at least one
argument is a scalar and one an array, then the scalar argument is
duplicated.  Therefore

@example
bitget (100, 8:-1:1)
@end example

@noindent
is the same as

@example
bitget (100 * ones (1, 8), 8:-1:1)
@end example

It should be noted that all values passed to the bit manipulation
functions of Octave are treated as integers.  Therefore, even though the
example for @code{bitset} above passes the floating point value
@code{10}, it is treated as the bits @code{[1, 0, 1, 0]} rather than the
bits of the native floating point format representation of @code{10}.

As the maximum value that can be represented by a number is important
for bit manipulation, particularly when forming masks, Octave supplies
two utility functions: @code{flintmax} for floating point integers, and
@code{intmax} for integer objects (@code{uint8}, @code{int64}, etc.).

Octave also includes the basic bitwise 'and', 'or', and 'exclusive or'
operators.

@DOCSTRING(bitand)

@DOCSTRING(bitor)

@DOCSTRING(bitxor)

The bitwise 'not' operator is a unary operator that performs a logical
negation of each of the bits of the value.  For this to make sense, the
mask against which the value is negated must be defined.  Octave's
bitwise 'not' operator is @code{bitcmp}.

@DOCSTRING(bitcmp)

Octave also includes the ability to left-shift and right-shift values bitwise.

@DOCSTRING(bitshift)

Bits that are shifted out of either end of the value are lost.  Octave
also uses arithmetic shifts, where the sign bit of the value is kept
during a right shift.  For example:

@example
@group
bitshift (-10, -1)
@result{} -5
bitshift (int8 (-1), -1)
@result{} -1
@end group
@end example

Note that @code{bitshift (int8 (-1), -1)} is @code{-1} since the bit
representation of @code{-1} in the @code{int8} data type is @code{[1, 1,
1, 1, 1, 1, 1, 1]}.

@node Logical Values
@section Logical Values

Octave has built-in support for logical values, i.e., variables that
are either @code{true} or @code{false}.  When comparing two variables,
the result will be a logical value whose value depends on whether or
not the comparison is true.

The basic logical operations are @code{&}, @code{|}, and @code{!},
which correspond to ``Logical And'', ``Logical Or'', and ``Logical
Negation''.  These operations all follow the usual rules of logic.

It is also possible to use logical values as part of standard numerical
calculations.  In this case @code{true} is converted to @code{1}, and
@code{false} to 0, both represented using double precision floating
point numbers.  So, the result of @code{true*22 - false/6} is @code{22}.

Logical values can also be used to index matrices and cell arrays.
When indexing with a logical array the result will be a vector containing
the values corresponding to @code{true} parts of the logical array.
The following example illustrates this.

@example
@group
data = [ 1, 2; 3, 4 ];
idx = (data <= 2);
data(idx)
     @result{} ans = [ 1; 2 ]
@end group
@end example

@noindent
Instead of creating the @code{idx} array it is possible to replace
@code{data(idx)} with @w{@code{data( data <= 2 )}} in the above code.

Logical values can also be constructed by
casting numeric objects to logical values, or by using the @code{true}
or @code{false} functions.

@DOCSTRING(logical)

@DOCSTRING(true)

@DOCSTRING(false)

@node Promotion and Demotion of Data Types
@section Promotion and Demotion of Data Types

Many operators and functions can work with mixed data types.  For example,

@example
@group
uint8 (1) + 1
    @result{} 2
@end group
@end example

@noindent
where the above operator works with an 8-bit integer and a double precision
value and returns an 8-bit integer value.  Note that the type is demoted
to an 8-bit integer, rather than promoted to a double precision value as
might be expected.  The reason is that if Octave promoted values in
expressions like the above with all numerical constants would need to be
explicitly cast to the appropriate data type like

@example
@group
uint8 (1) + uint8 (1)
    @result{} 2
@end group
@end example

@noindent
which becomes difficult for the user to apply uniformly and might allow
hard to find bugs to be introduced.  The same applies to single precision
values where a mixed operation such as

@example
@group
single (1) + 1
    @result{} 2
@end group
@end example

@noindent
returns a single precision value.  The mixed operations that are valid
and their returned data types are

@multitable @columnfractions .2 .3 .3 .2
@headitem @tab Mixed Operation @tab Result @tab
@item @tab double OP single @tab single @tab
@item @tab double OP integer @tab integer @tab
@item @tab double OP char @tab double @tab
@item @tab double OP logical @tab double @tab
@item @tab single OP integer @tab integer @tab
@item @tab single OP char @tab single @tab
@item @tab single OP logical @tab single @tab
@end multitable

The same logic applies to functions with mixed arguments such as

@example
@group
min (single (1), 0)
   @result{} 0
@end group
@end example

@noindent
where the returned value is single precision.

Many functions and operators will also promote integer or logical types to
double, or single to double, especially if they take only one argument.

@example
@group
a = det (int8 ([1 2; 3 4]))
    @result{} a = -2
class (a)
    @result{} double
@end group
@end example

But there are also exceptions for promoting to double, especially if the
function or operator in question can take multiple arguments.

@example
@group
a = eig (int8 ([1 2; 3 4]))
    @result{} error: eig: wrong type argument 'int8 matrix'
@end group
@end example

When the two operands are both integers but of different widths, then the
behavior depends on the operator or the function in question.  For some
operators and functions, narrow-bitwidth operands are promoted to a wider
bitwidth:

@example
@group
a = min (int8 (100), int16 (200))
    @result{} 100
class (a)
    @result{} int16
@end group
@end example

However, not all functions or operators will accept integer operands of
differing types:

@example
@group
int8 (100) + int16 (200)
   @result{} error: binary operator '+' not implemented
   for 'int8 scalar' by 'int16 scalar' operations
@end group
@end example

Further, in most cases, both operands need to be signed or both need to be
unsigned.  Mixing signed and unsigned usually causes an error, even if they
are of the same bitwidth.

@example
@group
min (int8 (100), uint16 (200))
   @result{} error: min: cannot compute min (int8 scalar, uint16 scalar)
@end group
@end example

In the case of mixed type indexed assignments, the type is not
changed.  For example,

@example
@group
x = ones (2, 2);
x(1, 1) = single (2)
   @result{} x = 2   1
          1   1
@end group
@end example

@noindent
where @code{x} remains of the double precision type.

@node Predicates for Numeric Objects
@section Predicates for Numeric Objects

Since the type of a variable may change during the execution of a
program, it can be necessary to do type checking at run-time.  Doing this
also allows you to change the behavior of a function depending on the
type of the input.  As an example, this naive implementation of @code{abs}
returns the absolute value of the input if it is a real number, and the
length of the input if it is a complex number.

@example
@group
function a = abs (x)
  if (isreal (x))
    a = sign (x) .* x;
  elseif (iscomplex (x))
    a = sqrt (real(x).^2 + imag(x).^2);
  endif
endfunction
@end group
@end example

The following functions are available for determining the type of a
variable.

@DOCSTRING(isnumeric)

@DOCSTRING(islogical)

@DOCSTRING(isfloat)

@DOCSTRING(isreal)

@DOCSTRING(iscomplex)

@DOCSTRING(ismatrix)

@DOCSTRING(isvector)

@DOCSTRING(isrow)

@DOCSTRING(iscolumn)

@DOCSTRING(isscalar)

@DOCSTRING(issquare)

@DOCSTRING(issymmetric)

@DOCSTRING(ishermitian)

@DOCSTRING(isdefinite)

@DOCSTRING(isbanded)

@DOCSTRING(isdiag)

@DOCSTRING(istril)

@DOCSTRING(istriu)

@DOCSTRING(isprime)

If instead of knowing properties of variables, you wish to know which
variables are defined and to gather other information about the
workspace itself, @pxref{Status of Variables}.