CS 530 Octave

Octave Statistical Functions

- Function File:  mean (X, OPT)
     If X is a vector, compute the mean of the elements of X
 
          mean (x) = SUM_i x(i) / N
     If X is a matrix, compute the mean for each column and return them
     in a row vector.
 
     With the optional argument OPT, the kind of mean computed can be
     selected.  The following options are recognized:
 
    `"a"'
          Compute the (ordinary) arithmetic mean.  This is the default.
 
    `"g"'
          Compute the geometric mean.
 
    `"h"'
          Compute the harmonic mean.
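
     As a rough illustration of the three kinds of mean, the sketch
     below computes each one by hand from basic functions instead of
     going through the OPT argument, whose availability may depend on
     the Octave version.  The sample vector is made up for the example.

```octave
% Illustrative data (an assumption, not taken from the text above).
x = [1 2 4];
N = numel (x);

a = sum (x) / N;               % arithmetic mean: SUM_i x(i) / N
g = exp (sum (log (x)) / N);   % geometric mean: N-th root of the product
h = N / sum (1 ./ x);          % harmonic mean: reciprocal of the mean reciprocal

printf ("arithmetic %g, geometric %g, harmonic %g\n", a, g, h);
```

     For positive data the three values are ordered harmonic <=
     geometric <= arithmetic, which gives a quick sanity check.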
 
 - Function File:  median (X)
     If X is a vector, compute the median value of the elements of X.
 
                      x(ceil(N/2)),             N odd
          median(x) =
                      (x(N/2) + x((N/2)+1))/2,  N even
     If X is a matrix, compute the median value for each column
     and return them in a row vector.
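
     The indices in the formula above refer to the data after sorting.
     A minimal sketch of that reading, on a made-up sample, compared
     against the built-in function:

```octave
x = [7 1 3 5];          % made-up sample
s = sort (x);           % the formula applies to the sorted values
N = numel (s);

if mod (N, 2) == 1
  m = s(ceil (N / 2));                  % N odd: the middle element
else
  m = (s(N / 2) + s(N / 2 + 1)) / 2;    % N even: average of the two middle elements
end

printf ("by hand %g, median (x) %g\n", m, median (x));
```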
 
 - Function File:  std (X)
     If X is a vector, compute the standard deviation of the elements
     of X.
 
          std (x) = sqrt (sumsq (x - mean (x)) / (n - 1))
     If X is a matrix, compute the standard deviation for each
     column and return them in a row vector.
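
     The formula can be checked directly against the built-in function.
     The data below are made up for the example; n is the number of
     elements.

```octave
x = [2 4 4 4 5 5 7 9];   % made-up sample
n = numel (x);

% sumsq returns the sum of squares of its argument.
s = sqrt (sumsq (x - mean (x)) / (n - 1));

printf ("by hand %g, std (x) %g\n", s, std (x));
```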

- Function File:  cov (X, Y)
     If each row of X and Y is an observation and each column is a
     variable, the (I,J)-th entry of `cov (X, Y)' is the covariance
     between the I-th variable in X and the J-th variable in Y.  If
     called with one argument, compute `cov (X, X)'.
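
     To make the (I,J) indexing concrete, the sketch below builds the
     cross-covariance by hand.  The data are made up, and the divisor
     n - 1 is an assumption: the text above does not state the
     normalization, and the behavior of `cov' with two arguments may
     differ between Octave versions.

```octave
% Made-up data: 4 observations (rows); 2 variables in X, 1 variable in Y.
X = [1 2; 2 4; 3 7; 4 7];
Y = [2; 1; 4; 5];
n = rows (X);

% Subtract the column means from every row.
Xc = X - ones (n, 1) * mean (X);
Yc = Y - ones (n, 1) * mean (Y);

% Entry (i,j) pairs variable i of X with variable j of Y.
% The divisor n - 1 is an assumption of this sketch.
C = Xc' * Yc / (n - 1);
disp (C)
```

     With 2 variables in X and 1 in Y, the result has 2 rows and 1
     column.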
 
 - Function File:  corrcoef (X, Y)
     If each row of X and Y is an observation and each column is a
     variable, the (I,J)-th entry of `corrcoef (X, Y)' is the
     correlation between the I-th variable in X and the J-th variable
     in Y.  If called with one argument, compute `corrcoef (X, X)'.

 - Function File:  kurtosis (X)
     If X is a vector of length N, return the kurtosis
 
          kurtosis (x) = N^(-1) std(x)^(-4) sum ((x - mean(x)).^4) - 3
 
     of X.  If X is a matrix, return the row vector containing the
     kurtosis of each column.
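
     A direct transcription of the formula above, on a made-up sample.
     It is only a sketch of the formula as written here: the built-in
     `kurtosis' in a given Octave version may normalize differently or
     omit the `- 3' term, so its result can differ.

```octave
x = [2 4 4 4 5 5 7 9];   % made-up sample
N = numel (x);

% std (x) uses its default normalization, as the formula is written.
k = sum ((x - mean (x)) .^ 4) / (N * std (x) ^ 4) - 3;

printf ("kurtosis by the formula above: %g\n", k);
```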

 - Function File:  mahalanobis (X, Y)
     Return the Mahalanobis' D-square distance between the multivariate
     samples X and Y, which must have the same number of components
     (columns), but may have a different number of observations (rows).

 - Function File:  skewness (X)
     If X is a vector of length N, return the skewness
 
          skewness (x) = N^(-1) std(x)^(-3) sum ((x - mean(x)).^3)
 
     of X.  If X is a matrix, return the row vector containing the
     skewness of each column.
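
     The skewness formula can be transcribed the same way.  This is a
     sketch of the formula as written here, on made-up data; the
     built-in `skewness' of a given Octave version may use a different
     normalization.

```octave
x = [2 4 4 4 5 5 7 9];   % made-up sample
N = numel (x);

sk = sum ((x - mean (x)) .^ 3) / (N * std (x) ^ 3);

printf ("skewness by the formula above: %g\n", sk);
```

     The value is positive here because the sample has a longer tail
     to the right of its mean.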
 
 - Function File:  values (X)
     Return the different values in a column vector, arranged in
     ascending order.
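
     `values' may not be available in every Octave version.  As a
     sketch, the same result can be obtained with `unique' applied to
     the data as a column vector (the sample is made up):

```octave
x = [3 1 2 3 1];     % made-up data with repeated values
v = unique (x(:));   % x(:) forces a column; unique sorts in ascending order
disp (v')
```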

 - Function File:  var (X)
     For vector arguments, return the (real) variance of the values.
     For matrix arguments, return a row vector containing the variance
     for each column.
 
 - Function File: [T, L_X] = table (X)
 - Function File: [T, L_X, L_Y] = table (X, Y)
     Create a contingency table T from data vectors.  The L vectors are
     the corresponding levels.
 
     Currently, only 1- and 2-dimensional tables are supported.
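
     A 2-dimensional contingency table simply counts how often each
     pair of levels occurs.  The sketch below does this by hand with
     `unique' and `accumarray' on made-up data; it is not the
     implementation of `table', only an illustration of the result.

```octave
x = [1 1 2 2 2];
y = [1 2 1 2 2];

[l_x, ~, ix] = unique (x);   % levels of x and the level index of each entry
[l_y, ~, iy] = unique (y);

% Count how often each (level of x, level of y) pair occurs.
t = accumarray ([ix(:), iy(:)], 1, [numel(l_x), numel(l_y)]);
disp (t)
```

     Row I of the table corresponds to level l_x(I) and column J to
     level l_y(J).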

 - Function File:  studentize (X)
     If X is a vector, subtract its mean and divide by its standard
     deviation.
 
     If X is a matrix, do the above for each column.
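
     For a vector the operation is one line.  The sketch below applies
     it to made-up data and checks that the result has mean 0 and
     standard deviation 1.

```octave
x = [2 4 4 4 5 5 7 9];          % made-up sample
z = (x - mean (x)) / std (x);   % subtract the mean, divide by the std

printf ("mean %g, std %g\n", mean (z), std (z));
```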
 
 - Function File:  statistics (X)
     If X is a matrix, return a matrix with the minimum, first
     quartile, median, third quartile, maximum, mean, standard
     deviation, skewness and kurtosis of the columns of X as its rows.
 
     If X is a vector, treat it as a column vector.

 - Function File:  spearman (X, Y)
     Compute Spearman's rank correlation coefficient RHO for each of
     the variables specified by the input arguments.
 
     For matrices, each row is an observation and each column a
     variable; vectors are always observations and may be row or column
     vectors.
 
     `spearman (X)' is equivalent to `spearman (X, X)'.
 
     For two data vectors X and Y, Spearman's RHO is the correlation of
     the ranks of X and Y.

     If X and Y are drawn from independent distributions, RHO has zero
     mean and variance `1 / (n - 1)', where n is the number of
     observations, and is asymptotically normally distributed.
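
     The definition as "correlation of the ranks" can be sketched by
     hand for two made-up vectors without ties.  The rank computation
     below is only valid when all entries are distinct.

```octave
x = [10 30 20 50 40];
y = [ 1  2  3  5  4];
n = numel (x);

% Ranks for data without ties: position of each element in sorted order.
rx = zeros (1, n);  [~, ix] = sort (x);  rx(ix) = 1:n;
ry = zeros (1, n);  [~, iy] = sort (y);  ry(iy) = 1:n;

% Ordinary (Pearson) correlation of the two rank vectors.
dx = rx - mean (rx);
dy = ry - mean (ry);
rho = sum (dx .* dy) / sqrt (sum (dx .^ 2) * sum (dy .^ 2));

printf ("rho = %g\n", rho);
```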
 
 - Function File:  run_count (X, N)
     Count the upward runs in the columns of X of length 1, 2, ..., N-1
     and greater than or equal to N.

 - Function File:  ranks (X)
     If X is a vector, return the (column) vector of ranks of X
     adjusted for ties.
 
     If X is a matrix, do the above for each column of X.
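
     The sketch below assumes that "adjusted for ties" means tied
     values share the average of the ranks they occupy (midranks).
     The data are made up.

```octave
x = [10 20 20 30];   % made-up data with one tie
r = zeros (size (x));

for i = 1:numel (x)
  % Number of strictly smaller elements, plus the midpoint of the
  % block of elements equal to x(i).
  r(i) = sum (x < x(i)) + (sum (x == x(i)) + 1) / 2;
end

disp (r)
```

     The two tied values would occupy ranks 2 and 3, so each receives
     2.5.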
 
 - Function File:  range (X)
     If X is a vector, return the range, i.e., the difference between
     the maximum and the minimum, of the input data.
 
     If X is a matrix, do the above for each column of X.
 
 - Function File: [Q, S] = qqplot (X, DIST, PARAMS)
     Perform a QQ-plot (quantile plot).

     If F is the CDF of the distribution DIST with parameters PARAMS
     and G its inverse, and X a sample vector of length N, the QQ-plot
     graphs ordinate S(I) = I-th largest element of X versus abscissa
     Q(I) = G((I - 0.5)/N).
 
     If the sample comes from F except for a transformation of location
     and scale, the pairs will approximately follow a straight line.
 
     The default for DIST is the standard normal distribution.  The
     optional argument PARAMS contains a list of parameters of DIST.
     For example, for a quantile plot of the uniform distribution on
     [2,4] and X, use
 
          qqplot (x, "uniform", 2, 4)
 
     If no output arguments are given, the data are plotted directly.
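
     For the default standard normal case, the two output vectors can
     be sketched by hand.  The sketch reads "I-th largest" as the I-th
     element in ascending order, and writes the inverse normal CDF with
     `erfinv'; both are assumptions of this example, and the sample is
     made up.

```octave
x = [0.3 -1.2 0.8 2.1 -0.4];   % made-up sample
N = numel (x);

s = sort (x);                       % S(I): I-th element in ascending order
p = ((1:N) - 0.5) / N;              % plotting positions (I - 0.5)/N
q = sqrt (2) * erfinv (2 * p - 1);  % G(p) for the standard normal

disp ([q; s]')                      % one (abscissa, ordinate) pair per row
```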
 
 - Function File:  probit (P)
     For each component of P, return the probit (the quantile of the
     standard normal distribution) of P.
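
     The standard normal quantile can be written in terms of `erfinv'.
     The sketch below uses that identity, not the `probit' function
     itself, which may not be present in every installation.

```octave
p = [0.025 0.5 0.975];
z = sqrt (2) * erfinv (2 * p - 1);   % quantile of the standard normal
disp (z)
```

     The result is symmetric about p = 0.5, where the probit is 0.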

 - Function File: [P, Y] = ppplot (X, DIST, PARAMS)
     Perform a PP-plot (probability plot).
 
     If F is the CDF of the distribution DIST with parameters PARAMS
     and X a sample vector of length N, the PP-plot graphs ordinate
     Y(I) = F (I-th largest element of X) versus abscissa P(I) = (I -
     0.5)/N.  If the sample comes from F, the pairs will approximately
     follow a straight line.
 
     The default for DIST is the standard normal distribution.  The
     optional argument PARAMS contains a list of parameters of DIST.
     For example, for a probability plot of the uniform distribution on
     [2,4] and X, use
 
          ppplot (x, "uniform", 2, 4)
 
     If no output arguments are given, the data are plotted directly.
 
 - Function File:  moment (X, P, OPT)
     If X is a vector, compute the P-th moment of X.
 
     If X is a matrix, return the row vector containing the P-th moment
     of each column.
 
     With the optional string OPT, the kind of moment to be computed can
     be specified.  If OPT contains `"c"' or `"a"', central and/or
     absolute moments are returned.  For example,
 
          moment (x, 3, "ac")
 
     computes the third central absolute moment of X.
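
     The options can be illustrated by hand.  The sketch assumes that
     the P-th moment is the mean of the P-th powers, that "c" subtracts
     the mean first, and that "a" takes absolute values; the data are
     made up.

```octave
x = [1 2 3 6];   % made-up sample
p = 3;

m_raw = mean (x .^ p);                    % plain P-th moment
m_c   = mean ((x - mean (x)) .^ p);       % central moment ("c")
m_ac  = mean (abs (x - mean (x)) .^ p);   % absolute central moment ("ac")

printf ("raw %g, central %g, absolute central %g\n", m_raw, m_c, m_ac);
```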
 
 - Function File:  meansq (X)
     For vector arguments, return the mean square of the values.  For
     matrix arguments, return a row vector containing the mean square of
     each column.

 - Function File:  logit (P)
     For each component of P, return the logit `log (P / (1-P))' of P.
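
     A short sketch of the expression and its inverse (the logistic
     function), written with basic operators so it does not depend on
     `logit' being installed:

```octave
p = [0.2 0.5 0.8];
l = log (p ./ (1 - p));         % logit, element by element
p_back = 1 ./ (1 + exp (-l));   % inverse: recovers p

disp ([p; l; p_back])
```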
 
 - Function File:  kendall (X, Y)
     Compute Kendall's TAU for each of the variables specified by the
     input arguments.
 
     For matrices, each row is an observation and each column a
     variable; vectors are always observations and may be row or column
     vectors.
 
     `kendall (X)' is equivalent to `kendall (X, X)'.
 
     For two data vectors X, Y of common length N, Kendall's TAU is the
     correlation of the signs of all rank differences of X and Y;
     i.e., if both X and Y have distinct entries, then
 
                   1
          tau = -------   SUM sign (q(i) - q(j)) * sign (r(i) - r(j))
                N (N-1)   i,j
 
     in which the Q(I) and R(I)  are the ranks of X and Y, respectively.
 
     If X and Y are drawn from independent distributions, Kendall's TAU
     is asymptotically normal with mean 0 and variance `(2 * (2N+5)) /
     (9 * N * (N-1))'.
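
     The sum in the formula can be evaluated directly for small
     samples.  The sketch below uses made-up vectors with distinct
     entries; in that case the sign of a difference of values equals
     the sign of the difference of their ranks, so the raw values are
     used in place of the ranks.

```octave
x = [1 2 3 4];
y = [1 3 2 4];
n = numel (x);

% Matrices of all pairwise differences: d(i,j) = v(i) - v(j).
dx = x(:) * ones (1, n) - ones (n, 1) * x(:)';
dy = y(:) * ones (1, n) - ones (n, 1) * y(:)';

% Sum over all ordered pairs (i,j); the diagonal contributes zero.
tau = sum (sum (sign (dx) .* sign (dy))) / (n * (n - 1));

printf ("tau = %g\n", tau);
```

     Of the six unordered pairs, five agree in order and one disagrees,
     giving tau = (5 - 1) / 6.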

 - Function File:  iqr (X)
     If X is a vector, return the interquartile range, i.e., the
     difference between the upper and lower quartile, of the input data.
 
     If X is a matrix, do the above for each column of X.
 
 - Function File:  cut (X, BREAKS)
     Create categorical data out of numerical or continuous data by
     cutting into intervals.
 
     If BREAKS is a scalar, the data is cut into that many equal-width
     intervals.  If BREAKS is a vector of break points, the category
     has `length (BREAKS) - 1' groups.
 
     The returned value is a vector of the same size as X telling which
     group each point in X belongs to.  Groups are labelled from 1 to
     the number of groups; points outside the range of BREAKS are
     labelled by `NaN'.
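
     The labelling for a vector of break points can be sketched by
     hand.  Which end of each interval is closed is not stated above,
     so the choice below is an assumption, and the made-up data avoid
     points that fall exactly on a break.

```octave
x = [0.5 1.5 2.5 5];
breaks = [0 1 2 3];               % 4 break points give 3 groups

g = NaN (size (x));               % points outside the breaks stay NaN
for k = 1:numel (breaks) - 1
  % Which end of each interval is closed is an assumption of this sketch.
  inside = x > breaks(k) & x <= breaks(k + 1);
  g(inside) = k;
end

disp (g)
```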

 - Function File:  cor (X, Y)
     The (I,J)-th entry of `cor (X, Y)' is the correlation between the
     I-th variable in X and the J-th variable in Y.
 
     For matrices, each row is an observation and each column a
     variable; vectors are always observations and may be row or column
     vectors.
 
     `cor (X)' is equivalent to `cor (X, X)'.
 
 - Function File:  cloglog (X)
     Return the complementary log-log function of X, defined as
 
          - log (- log (X))
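
     A short sketch of the expression on made-up values in (0, 1),
     written with basic operators so it does not depend on `cloglog'
     being installed:

```octave
x = [0.1 exp(-1) 0.9];
c = -log (-log (x));   % element by element
disp (c)
```

     The function is increasing on (0, 1) and is zero at x = exp (-1).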

 
 - Function File:  center (X)
     If X is a vector, subtract its mean.  If X is a matrix, do the
     above for each column.