view doc/interpreter/stats.txi @ 8920:eb63fbe60fab

update copyright notices
author John W. Eaton <jwe@octave.org>
date Sat, 07 Mar 2009 10:41:27 -0500
parents 8463d1a2e544
children 2d0f8692a82e

@c Copyright (C) 1996, 1997, 1999, 2000, 2002, 2004, 2005, 2006,
@c               2007, 2008, 2009 John W. Eaton
@c
@c This file is part of Octave.
@c
@c Octave is free software; you can redistribute it and/or modify it
@c under the terms of the GNU General Public License as published by the
@c Free Software Foundation; either version 3 of the License, or (at
@c your option) any later version.
@c 
@c Octave is distributed in the hope that it will be useful, but WITHOUT
@c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
@c FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
@c for more details.
@c 
@c You should have received a copy of the GNU General Public License
@c along with Octave; see the file COPYING.  If not, see
@c <http://www.gnu.org/licenses/>.

@node Statistics
@chapter Statistics

Octave supports a range of statistical methods, including basic
descriptive statistics, statistical plots, statistical tests,
probability distributions, and random number generation.

The functions that analyze data all assume that multidimensional data
is arranged in a matrix where each row is an observation, and each
column is a variable.  So, the matrix defined by

@example
a = [ 0.9, 0.7;
      0.1, 0.1;
      0.5, 0.4 ];
@end example

@noindent
contains three observations from a two-dimensional distribution.
While this is the default data arrangement, most functions support
different arrangements.
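
As a brief illustration (a sketch using the matrix @code{a} defined
above), @code{mean} operates on each column by default, so it returns one
value per variable.  Its optional second argument selects the dimension
along which to operate.

@example
a = [ 0.9, 0.7;
      0.1, 0.1;
      0.5, 0.4 ];
col_means = mean (a)       # one mean per variable: about [0.5, 0.4]
row_means = mean (a, 2)    # one mean per observation: about [0.8; 0.1; 0.45]
@end example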

Note that the statistics functions do not check for data containing
NaN, NA, or Inf.  Such values must be handled explicitly before the
data is passed to these functions.
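
One possible way to handle such values, shown here only as a sketch with
made-up data, is to select the finite elements with @code{isfinite}
before computing a statistic.

@example
x = [1.2; NaN; 0.7; Inf; 2.1];     # hypothetical data with bad entries
ok = isfinite (x);                 # false for NaN, NA, and +/-Inf
m = mean (x(ok))                   # mean of the remaining values, about 1.3333
@end example

@noindent
Whether dropping the values is appropriate depends on the analysis;
this example only shows the mechanics.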

@menu
* Descriptive Statistics::
* Basic Statistical Functions:: 
* Statistical Plots:: 
* Tests::                       
* Models::                      
* Distributions::     
* Random Number Generation::          
@end menu

@node Descriptive Statistics
@section Descriptive Statistics

Octave can compute various statistics such as the moments of a data set.
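
For example (a small sketch, reusing the three-observation matrix from
the start of this chapter), the functions below each return one value
per column.

@example
a = [ 0.9, 0.7;
      0.1, 0.1;
      0.5, 0.4 ];
med = median (a)           # about [0.5, 0.4]
s   = std (a)              # about [0.4, 0.3]
v   = var (a)              # about [0.16, 0.09]
@end example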

@DOCSTRING(mean)

@DOCSTRING(median)

@DOCSTRING(quantile)

@DOCSTRING(prctile)

@DOCSTRING(meansq)

@DOCSTRING(std)

@DOCSTRING(var)

@DOCSTRING(mode)

@DOCSTRING(cov)

@DOCSTRING(cor)

@DOCSTRING(corrcoef)

@DOCSTRING(kurtosis)

@DOCSTRING(skewness)

@DOCSTRING(statistics)

@DOCSTRING(moment)

@node Basic Statistical Functions
@section Basic Statistical Functions

Octave also supports various helpful statistical functions.
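
A few of these functions are sketched below on small, made-up inputs.

@example
n_pairs = nchoosek (5, 2)          # 10 ways to choose 2 items out of 5
p = perms ([1, 2, 3]);             # one permutation per row
n_perms = rows (p)                 # 3! = 6 permutations
c = center ([1; 2; 3])             # subtracts the mean: [-1; 0; 1]
@end example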

@DOCSTRING(mahalanobis)

@DOCSTRING(center)

@DOCSTRING(studentize)

@DOCSTRING(nchoosek)

@DOCSTRING(perms)

@DOCSTRING(values)

@DOCSTRING(table)

@DOCSTRING(spearman)

@DOCSTRING(run_count)

@DOCSTRING(ranks)

@DOCSTRING(range)

@DOCSTRING(probit)

@DOCSTRING(logit)

@DOCSTRING(cloglog)

@DOCSTRING(kendall)

@DOCSTRING(iqr)

@DOCSTRING(cut)

@node Statistical Plots
@section Statistical Plots

@c Should hist be moved to here, or perhaps the qqplot and ppplot
@c functions should be moved to the Plotting Chapter?

Octave can create Quantile Plots (QQ-Plots) and Probability Plots
(PP-Plots).  These are simple graphical tests for determining whether a
data set comes from a certain distribution.
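
The following sketch assumes that @code{qqplot} compares against the
standard normal distribution when no distribution is named, and that it
returns the plot coordinates instead of drawing when output arguments
are requested.  The sample is random, so results vary from run to run.

@example
x = randn (100, 1);        # sample to compare with a normal distribution
[q, s] = qqplot (x);       # q: theoretical quantiles, s: sorted sample
n = numel (q)              # one point per observation
@end example

@noindent
If the sample follows the reference distribution, the points
@code{(q(i), s(i))} lie close to a straight line.  Calling
@code{qqplot (x)} without output arguments draws the plot directly.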

Note that Octave can also show histograms of data
using the @code{hist} function as described in
@ref{Two-Dimensional Plots}.

@DOCSTRING(qqplot)

@DOCSTRING(ppplot)

@node Tests
@section Tests

Octave can perform several different statistical tests.  The following
table summarizes the available tests.

@iftex
@tex
\vskip 6pt
{\hbox to \hsize {\hfill\vbox{\offinterlineskip \tabskip=0pt 
\halign{
\vrule height2.0ex depth1.ex width 0.6pt #\tabskip=0.3em &
# \hfil & \vrule # & # \hfil & # \vrule width 0.6pt \tabskip=0pt\cr
\noalign{\hrule height 0.6pt}
& {\bf Hypothesis} && {\bf Test Functions} &\cr
\noalign{\hrule}
& Equal mean values && anova, hotelling\_test\_2, t\_test\_2, &\cr
&                   && welch\_test, wilcoxon\_test, z\_test\_2 &\cr
& Equal medians && kruskal\_wallis\_test, sign\_test &\cr
& Equal variances && bartlett\_test, manova, var\_test &\cr
& Equal distributions && chisquare\_test\_homogeneity, &\cr
&                     && kolmogorov\_smirnov\_test\_2, u\_test &\cr
& Equal marginal frequencies && mcnemar\_test &\cr
& Equal success probabilities && prop\_test\_2 &\cr
& Independent observations && chisquare\_test\_independence, &\cr
&                          && run\_test &\cr
& Uncorrelated observations && cor\_test &\cr
& Given mean value && hotelling\_test, t\_test, z\_test &\cr
& Observations from given distribution && kolmogorov\_smirnov\_test &\cr
& Regression && f\_test\_regression, t\_test\_regression &\cr
\noalign{\hrule height 0.6pt}
}}\hfill}}
@end tex
@end iftex
@ifnottex
@multitable @columnfractions .4 .5
@item @strong{Hypothesis}
  @tab @strong{Test Functions}
@item Equal mean values
  @tab @code{anova}, @code{hotelling_test_2}, @code{t_test_2},
       @code{welch_test}, @code{wilcoxon_test}, @code{z_test_2}
@item Equal medians
  @tab @code{kruskal_wallis_test}, @code{sign_test}
@item Equal variances
  @tab @code{bartlett_test}, @code{manova}, @code{var_test}
@item Equal distributions
  @tab @code{chisquare_test_homogeneity}, @code{kolmogorov_smirnov_test_2},
       @code{u_test}
@item Equal marginal frequencies
  @tab @code{mcnemar_test}
@item Equal success probabilities
  @tab @code{prop_test_2}
@item Independent observations
  @tab @code{chisquare_test_independence}, @code{run_test}
@item Uncorrelated observations
  @tab @code{cor_test}
@item Given mean value
  @tab @code{hotelling_test}, @code{t_test}, @code{z_test}
@item Observations from given distribution
  @tab @code{kolmogorov_smirnov_test}
@item Regression
  @tab @code{f_test_regression}, @code{t_test_regression}
@end multitable
@end ifnottex

The tests return a p-value that describes the outcome of the test.
Assuming that the test hypothesis (the null hypothesis) is true, the
p-value is the probability of obtaining a result at least as extreme as
the observed one.  So a large p-value means that the data are consistent
with the test hypothesis, and a small p-value is evidence against it.
Usually a test hypothesis is rejected if the p-value is below
@math{0.05}, and is not rejected otherwise.
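
The decision rule can be sketched with @code{t_test}, which tests the
hypothesis that a sample has a given mean.  The measurements and the
hypothesized mean of 5.0 are made up for illustration.

@example
x = [5.1, 4.9, 5.3, 5.0, 5.4, 4.8, 5.2, 5.1];   # hypothetical data
[pval, t, df] = t_test (x, 5.0);   # test hypothesis: the mean is 5.0
if (pval < 0.05)
  disp ("test hypothesis rejected at the 5% level");
else
  disp ("test hypothesis not rejected at the 5% level");
endif
@end example

@noindent
The threshold of 0.05 is a convention, not a property of the test; a
stricter or looser level may suit a particular application better.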

@DOCSTRING(anova)

@DOCSTRING(bartlett_test)

@DOCSTRING(chisquare_test_homogeneity)

@DOCSTRING(chisquare_test_independence)

@DOCSTRING(cor_test)

@DOCSTRING(f_test_regression)

@DOCSTRING(hotelling_test)

@DOCSTRING(hotelling_test_2)

@DOCSTRING(kolmogorov_smirnov_test)

@DOCSTRING(kolmogorov_smirnov_test_2)

@DOCSTRING(kruskal_wallis_test)

@DOCSTRING(manova)

@DOCSTRING(mcnemar_test)

@DOCSTRING(prop_test_2)

@DOCSTRING(run_test)

@DOCSTRING(sign_test)

@DOCSTRING(t_test)

@DOCSTRING(t_test_2)

@DOCSTRING(t_test_regression)

@DOCSTRING(u_test)

@DOCSTRING(var_test)

@DOCSTRING(welch_test)

@DOCSTRING(wilcoxon_test)

@DOCSTRING(z_test)

@DOCSTRING(z_test_2)

@node Models
@section Models

@DOCSTRING(logistic_regression)

@node Distributions
@section Distributions

Octave has functions for computing the Probability Density Function
(PDF), the Cumulative Distribution Function (CDF), and the quantile
(the inverse of the CDF) of a large number of distributions.
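
The relationship between the three kinds of function can be sketched
with the normal distribution.  This example relies on the default
parameters, assumed here to be mean 0 and standard deviation 1.

@example
p = normcdf (1.96)         # P(X <= 1.96), about 0.975
x = norminv (p)            # the quantile inverts the CDF: about 1.96
d = normpdf (0)            # density at the mean, 1 / sqrt (2*pi)
@end example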

The following table summarizes the supported distributions (in 
alphabetical order).

@c Do the table explicitly in TeX if possible to get a better layout.
@iftex
@tex
\vskip 6pt
{\hbox to \hsize {\hfill\vbox{\offinterlineskip \tabskip=0pt 
\halign{
\vrule height2.0ex depth1.ex width 0.6pt #\tabskip=0.3em &
# \hfil & \vrule # & # \hfil & \vrule # & # \hfil & \vrule # & # \hfil &
# \vrule width 0.6pt \tabskip=0pt\cr
\noalign{\hrule height 0.6pt}
& {\bf Distribution} && {\bf PDF}      && {\bf CDF}     && {\bf Quantile}&\cr
\noalign{\hrule}
&Beta         && betapdf        && betacdf       && betainv&\cr
&Binomial     && binopdf        && binocdf       && binoinv&\cr
&Cauchy       && cauchy\_pdf    && cauchy\_cdf   && cauchy\_inv&\cr
&Chi-Square   && chi2pdf        && chi2cdf       && chi2inv&\cr
&Univariate Discrete       && discrete\_pdf  && discrete\_cdf && discrete\_inv&\cr
&Empirical    && empirical\_pdf  && empirical\_cdf && empirical\_inv&\cr
&Exponential  && exppdf         && expcdf        && expinv&\cr
&F            && fpdf           && fcdf          && finv&\cr
&Gamma        && gampdf         && gamcdf        && gaminv&\cr
&Geometric    && geopdf         && geocdf        && geoinv&\cr
&Hypergeometric            && hygepdf      && hygecdf       && hygeinv&\cr
&Kolmogorov-Smirnov && {\it Not Available} && kolmogorov\_&& {\it Not Available}&\cr
&             &&                && smirnov\_cdf &&&\cr
&Laplace      && laplace\_pdf    && laplace\_cdf   && laplace\_inv&\cr
&Logistic     && logistic\_pdf   && logistic\_cdf  && logistic\_inv&\cr
&Log-Normal   && lognpdf        && logncdf       && logninv&\cr
&Pascal       && nbinpdf        && nbincdf       && nbininv&\cr
&Univariate Normal && normpdf   && normcdf       && norminv&\cr
&Poisson      && poisspdf       && poisscdf      && poissinv&\cr
&t (Student)  && tpdf           && tcdf          && tinv&\cr
&Discrete Uniform && unidpdf    && unidcdf       && unidinv&\cr
&Uniform      && unifpdf        && unifcdf       && unifinv&\cr
&Weibull      && wblpdf         && wblcdf        && wblinv&\cr
\noalign{\hrule height 0.6pt}
}}\hfill}}
@end tex
@end iftex
@ifnottex
@multitable @columnfractions .31 .23 .23 .23
@item @strong{Distribution}
  @tab @strong{PDF}
  @tab @strong{CDF}
  @tab @strong{Quantile}
@item Beta Distribution
  @tab @code{betapdf}
  @tab @code{betacdf}
  @tab @code{betainv}
@item Binomial Distribution
  @tab @code{binopdf}
  @tab @code{binocdf}
  @tab @code{binoinv}
@item Cauchy Distribution
  @tab @code{cauchy_pdf}
  @tab @code{cauchy_cdf}
  @tab @code{cauchy_inv}
@item Chi-Square Distribution
  @tab @code{chi2pdf}
  @tab @code{chi2cdf}
  @tab @code{chi2inv}
@item Univariate Discrete Distribution
  @tab @code{discrete_pdf}
  @tab @code{discrete_cdf}
  @tab @code{discrete_inv}
@item Empirical Distribution
  @tab @code{empirical_pdf}
  @tab @code{empirical_cdf}
  @tab @code{empirical_inv}
@item Exponential Distribution
  @tab @code{exppdf}
  @tab @code{expcdf}
  @tab @code{expinv}
@item F Distribution
  @tab @code{fpdf}
  @tab @code{fcdf}
  @tab @code{finv}
@item Gamma Distribution
  @tab @code{gampdf}
  @tab @code{gamcdf}
  @tab @code{gaminv}
@item Geometric Distribution
  @tab @code{geopdf}
  @tab @code{geocdf}
  @tab @code{geoinv}
@item Hypergeometric Distribution
  @tab @code{hygepdf}
  @tab @code{hygecdf}
  @tab @code{hygeinv}
@item Kolmogorov-Smirnov Distribution
  @tab @emph{Not Available}
  @tab @code{kolmogorov_smirnov_cdf}
  @tab @emph{Not Available}
@item Laplace Distribution
  @tab @code{laplace_pdf}
  @tab @code{laplace_cdf}
  @tab @code{laplace_inv}
@item Logistic Distribution
  @tab @code{logistic_pdf}
  @tab @code{logistic_cdf}
  @tab @code{logistic_inv}
@item Log-Normal Distribution
  @tab @code{lognpdf}
  @tab @code{logncdf}
  @tab @code{logninv}
@item Pascal Distribution
  @tab @code{nbinpdf}
  @tab @code{nbincdf}
  @tab @code{nbininv}
@item Univariate Normal Distribution
  @tab @code{normpdf}
  @tab @code{normcdf}
  @tab @code{norminv}
@item Poisson Distribution
  @tab @code{poisspdf}
  @tab @code{poisscdf}
  @tab @code{poissinv}
@item t (Student) Distribution
  @tab @code{tpdf}
  @tab @code{tcdf}
  @tab @code{tinv}
@item Discrete Uniform Distribution
  @tab @code{unidpdf}
  @tab @code{unidcdf}
  @tab @code{unidinv}
@item Uniform Distribution
  @tab @code{unifpdf}
  @tab @code{unifcdf}
  @tab @code{unifinv}
@item Weibull Distribution
  @tab @code{wblpdf}
  @tab @code{wblcdf}
  @tab @code{wblinv}
@end multitable
@end ifnottex

@DOCSTRING(betacdf)

@DOCSTRING(betainv)

@DOCSTRING(betapdf)

@DOCSTRING(binocdf)

@DOCSTRING(binoinv)

@DOCSTRING(binopdf)

@DOCSTRING(cauchy_cdf)

@DOCSTRING(cauchy_inv)

@DOCSTRING(cauchy_pdf)

@DOCSTRING(chi2cdf)

@DOCSTRING(chi2inv)

@DOCSTRING(chi2pdf)

@DOCSTRING(discrete_cdf)

@DOCSTRING(discrete_inv)

@DOCSTRING(discrete_pdf)

@DOCSTRING(empirical_cdf)

@DOCSTRING(empirical_inv)

@DOCSTRING(empirical_pdf)

@DOCSTRING(expcdf)

@DOCSTRING(expinv)

@DOCSTRING(exppdf)

@DOCSTRING(fcdf)

@DOCSTRING(finv)

@DOCSTRING(fpdf)

@DOCSTRING(gamcdf)

@DOCSTRING(gaminv)

@DOCSTRING(gampdf)

@DOCSTRING(geocdf)

@DOCSTRING(geoinv)

@DOCSTRING(geopdf)

@DOCSTRING(hygecdf)

@DOCSTRING(hygeinv)

@DOCSTRING(hygepdf)

@DOCSTRING(kolmogorov_smirnov_cdf)

@DOCSTRING(laplace_cdf)

@DOCSTRING(laplace_inv)

@DOCSTRING(laplace_pdf)

@DOCSTRING(logistic_cdf)

@DOCSTRING(logistic_inv)

@DOCSTRING(logistic_pdf)

@DOCSTRING(logncdf)

@DOCSTRING(logninv)

@DOCSTRING(lognpdf)

@DOCSTRING(nbincdf)

@DOCSTRING(nbininv)

@DOCSTRING(nbinpdf)

@DOCSTRING(normcdf)

@DOCSTRING(norminv)

@DOCSTRING(normpdf)

@DOCSTRING(poisscdf)

@DOCSTRING(poissinv)

@DOCSTRING(poisspdf)

@DOCSTRING(tcdf)

@DOCSTRING(tinv)

@DOCSTRING(tpdf)

@DOCSTRING(unidcdf)

@DOCSTRING(unidinv)

@DOCSTRING(unidpdf)

@DOCSTRING(unifcdf)

@DOCSTRING(unifinv)

@DOCSTRING(unifpdf)

@DOCSTRING(wblcdf)

@DOCSTRING(wblinv)

@DOCSTRING(wblpdf)

@node Random Number Generation
@section Random Number Generation

Octave can generate random numbers from a large number of distributions.
These generators are built on the basic random number generators
described in @ref{Special Utility Matrices}.
@c Should rand, randn, rande, randp, and randg be moved to here?
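
As a sketch, the following draws a sample with @code{normrnd} and checks
its moments.  It assumes that @code{normrnd} takes the mean and standard
deviation followed by the output dimensions, and that it draws from
@code{randn}, so that setting the state of @code{randn} makes the sample
repeatable.

@example
randn ("state", 42);               # fix the generator state (assumption)
x = normrnd (10, 2, 1000, 1);      # 1000 draws, mean 10, std. deviation 2
m = mean (x)                       # should be close to 10
s = std (x)                        # should be close to 2
@end example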

The following table summarizes the available random number generators
(in alphabetical order).

@iftex
@tex
\vskip 6pt
{\hbox to \hsize {\hfill\vbox{\offinterlineskip \tabskip=0pt 
\halign{
\vrule height2.0ex depth1.ex width 0.6pt #\tabskip=0.3em &
# \hfil & \vrule # & # \hfil & # \vrule width 0.6pt \tabskip=0pt\cr
\noalign{\hrule height 0.6pt}
& {\bf Distribution}                && {\bf Function} &\cr
\noalign{\hrule}
& Beta Distribution                 && betarnd &\cr
& Binomial Distribution             && binornd &\cr
& Cauchy Distribution               && cauchy\_rnd &\cr
& Chi-Square Distribution           && chi2rnd &\cr
& Univariate Discrete Distribution  && discrete\_rnd &\cr
& Empirical Distribution            && empirical\_rnd &\cr
& Exponential Distribution          && exprnd &\cr
& F Distribution                    && frnd &\cr
& Gamma Distribution                && gamrnd &\cr
& Geometric Distribution            && geornd &\cr
& Hypergeometric Distribution       && hygernd &\cr
& Laplace Distribution              && laplace\_rnd &\cr
& Logistic Distribution             && logistic\_rnd &\cr
& Log-Normal Distribution           && lognrnd &\cr
& Pascal Distribution               && nbinrnd &\cr
& Univariate Normal Distribution    && normrnd &\cr
& Poisson Distribution              && poissrnd &\cr
& t (Student) Distribution          && trnd &\cr
& Discrete Uniform Distribution     && unidrnd &\cr
& Uniform Distribution              && unifrnd &\cr
& Weibull Distribution              && wblrnd &\cr
& Wiener Process                    && wienrnd &\cr
\noalign{\hrule height 0.6pt}
}}\hfill}}
@end tex
@end iftex
@ifnottex
@multitable @columnfractions .4 .3
@item @strong{Distribution}             @tab @strong{Function}
@item Beta Distribution                 @tab @code{betarnd}
@item Binomial Distribution             @tab @code{binornd}
@item Cauchy Distribution               @tab @code{cauchy_rnd}
@item Chi-Square Distribution           @tab @code{chi2rnd}
@item Univariate Discrete Distribution  @tab @code{discrete_rnd}
@item Empirical Distribution            @tab @code{empirical_rnd}
@item Exponential Distribution          @tab @code{exprnd}
@item F Distribution                    @tab @code{frnd}
@item Gamma Distribution                @tab @code{gamrnd}
@item Geometric Distribution            @tab @code{geornd}
@item Hypergeometric Distribution       @tab @code{hygernd}
@item Laplace Distribution              @tab @code{laplace_rnd}
@item Logistic Distribution             @tab @code{logistic_rnd}
@item Log-Normal Distribution           @tab @code{lognrnd}
@item Pascal Distribution               @tab @code{nbinrnd}
@item Univariate Normal Distribution    @tab @code{normrnd}
@item Poisson Distribution              @tab @code{poissrnd}
@item t (Student) Distribution          @tab @code{trnd}
@item Discrete Uniform Distribution     @tab @code{unidrnd}
@item Uniform Distribution              @tab @code{unifrnd}
@item Weibull Distribution              @tab @code{wblrnd}
@item Wiener Process                    @tab @code{wienrnd}
@end multitable
@end ifnottex

@DOCSTRING(betarnd)

@DOCSTRING(binornd)

@DOCSTRING(cauchy_rnd)

@DOCSTRING(chi2rnd)

@DOCSTRING(discrete_rnd)

@DOCSTRING(empirical_rnd)

@DOCSTRING(exprnd)

@DOCSTRING(frnd)

@DOCSTRING(gamrnd)

@DOCSTRING(geornd)

@DOCSTRING(hygernd)

@DOCSTRING(laplace_rnd)

@DOCSTRING(logistic_rnd)

@DOCSTRING(lognrnd)

@DOCSTRING(nbinrnd)

@DOCSTRING(normrnd)

@DOCSTRING(poissrnd)

@DOCSTRING(trnd)

@DOCSTRING(unidrnd)

@DOCSTRING(unifrnd)

@DOCSTRING(wblrnd)

@DOCSTRING(wienrnd)