The Statistics module contains basic statistics functionality.

`Statistics.std`

Function
std(itr; corrected::Bool=true, mean=nothing[, dims])

Compute the sample standard deviation of collection `itr`.

The algorithm returns an estimator of the generative distribution's standard deviation under the assumption that each entry of `itr` is an independent and identically distributed (IID) sample drawn from that generative distribution. For arrays, this computation is equivalent to calculating `sqrt(sum((itr .- mean(itr)).^2) / (length(itr) - 1))`. If `corrected` is `true`, the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n` is the number of elements in `itr`.

A pre-computed `mean` may be provided.

If `itr` is an `AbstractArray`, `dims` can be provided to compute the standard deviation over dimensions, and `mean` may contain means for each dimension of `itr`.

Note

If the array contains `NaN` or `missing` values, the result is also `NaN` or `missing` (`missing` takes precedence if the array contains both). Use the `skipmissing` function to omit `missing` entries and compute the standard deviation of non-missing values.
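The two scalings described above can be checked against the hand-written formula. The following is a minimal sketch; the data values and variable names are chosen here purely for illustration.

```julia
using Statistics

# Illustrative data: mean is 5, sum of squared deviations is 32
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = length(x)

s_corrected   = std(x)                    # sum scaled by n - 1 (default)
s_uncorrected = std(x; corrected=false)   # sum scaled by n, ≈ 2.0 for this data

# Hand-written equivalents of the two scalings
manual_corrected   = sqrt(sum((x .- mean(x)).^2) / (n - 1))
manual_uncorrected = sqrt(sum((x .- mean(x)).^2) / n)

# Standard deviation of each column of a matrix
A = [1.0 2.0; 3.0 6.0]
col_std = std(A; dims=1)                  # 1×2 matrix, one value per column

# Omitting missing entries before computing the standard deviation
s_skip = std(skipmissing([1.0, missing, 3.0]))
```

Passing `dims=1` reduces along the rows, so the result keeps one entry per column.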

`Statistics.stdm`

Function
stdm(itr, m; corrected::Bool=true)

Compute the sample standard deviation of collection `itr`, with known mean(s) `m`.

The algorithm returns an estimator of the generative distribution's standard deviation under the assumption that each entry of `itr` is an independent and identically distributed (IID) sample drawn from that generative distribution. For arrays, this computation is equivalent to calculating `sqrt(sum((itr .- mean(itr)).^2) / (length(itr) - 1))`. If `corrected` is `true`, the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n` is the number of elements in `itr`.

The pre-computed mean is supplied as `m`.

If `itr` is an `AbstractArray`, `dims` can be provided to compute the standard deviation over dimensions, and `m` may contain means for each dimension of `itr`.

Note

If the array contains `NaN` or `missing` values, the result is also `NaN` or `missing` (`missing` takes precedence if the array contains both). Use the `skipmissing` function to omit `missing` entries and compute the standard deviation of non-missing values.
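As a small sketch of how a known mean can be reused, the snippet below computes the mean once and passes it on. The data and variable names are illustrative only.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]
m = mean(x)                  # computed once and reused below

s_known = stdm(x, m)         # standard deviation around the supplied mean
s_plain = std(x)             # computes the mean itself
s_kw    = std(x; mean=m)     # keyword form of the same idea
```

All three calls should agree when `m` is the actual mean of `x`.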

`Statistics.var`

Function
var(itr; dims, corrected::Bool=true, mean=nothing)

Compute the sample variance of collection `itr`.

The algorithm returns an estimator of the generative distribution's variance under the assumption that each entry of `itr` is an independent and identically distributed (IID) sample drawn from that generative distribution. For arrays, this computation is equivalent to calculating `sum((itr .- mean(itr)).^2) / (length(itr) - 1)`. If `corrected` is `true`, the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n` is the number of elements in `itr`.

A pre-computed `mean` may be provided.

If `itr` is an `AbstractArray`, `dims` can be provided to compute the variance over dimensions, and `mean` may contain means for each dimension of `itr`.

Note

If the array contains `NaN` or `missing` values, the result is also `NaN` or `missing` (`missing` takes precedence if the array contains both). Use the `skipmissing` function to omit `missing` entries and compute the variance of non-missing values.

`Statistics.varm`

Function
varm(itr, m; dims, corrected::Bool=true)

Compute the sample variance of collection `itr`, with known mean(s) `m`.

The algorithm returns an estimator of the generative distribution's variance under the assumption that each entry of `itr` is an independent and identically distributed (IID) sample drawn from that generative distribution. For arrays, this computation is equivalent to calculating `sum((itr .- mean(itr)).^2) / (length(itr) - 1)`. If `corrected` is `true`, the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n` is the number of elements in `itr`.

If `itr` is an `AbstractArray`, `dims` can be provided to compute the variance over dimensions, and `m` may contain means for each dimension of `itr`.

Note

If the array contains `NaN` or `missing` values, the result is also `NaN` or `missing` (`missing` takes precedence if the array contains both). Use the `skipmissing` function to omit `missing` entries and compute the variance of non-missing values.
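The relationship between `var`, its `mean` keyword, and `varm` can be sketched as follows. The data and names are illustrative; the matrix example assumes the per-column means are supplied with the same shape that `mean(A; dims=1)` returns.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]           # mean 2.5, sum of squared deviations 5
m = mean(x)

v_default     = var(x)                    # 5 / 3, scaled by n - 1
v_uncorrected = var(x; corrected=false)   # 5 / 4, scaled by n
v_keyword     = var(x; mean=m)            # reuses the pre-computed mean
v_known       = varm(x, m)                # same result via varm

# Per-column variance of a matrix with pre-computed per-column means
A = [1.0 2.0; 3.0 6.0]
col_means = mean(A; dims=1)               # 1×2 matrix
v_cols    = var(A; dims=1, mean=col_means)
v_cols_m  = varm(A, col_means; dims=1)
```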

`Statistics.cor`

Function
cor(x::AbstractVector)

Return the number one.

cor(X::AbstractMatrix; dims::Int=1)

Compute the Pearson correlation matrix of the matrix `X` along the dimension `dims`.

cor(x::AbstractVector, y::AbstractVector)

Compute the Pearson correlation between the vectors `x` and `y`.

cor(X::AbstractVecOrMat, Y::AbstractVecOrMat; dims=1)

Compute the Pearson correlation between the vectors or matrices `X` and `Y` along the dimension `dims`.
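A brief sketch of the `cor` methods listed above, using made-up vectors that are perfectly correlated or anti-correlated so the expected values are easy to see:

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]      # increases linearly with x
z = [4.0, 3.0, 2.0, 1.0]      # decreases linearly with x

r_xy = cor(x, y)              # ≈ 1.0
r_xz = cor(x, z)              # ≈ -1.0

# Pearson correlation written out from covariance and standard deviations
r_manual = cov(x, y) / (std(x) * std(y))

# Correlation matrix: with the default dims=1, each column is one variable
X = hcat(x, y, z)             # 4×3 matrix
C = cor(X)                    # 3×3 matrix, C[i, j] correlates columns i and j
```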

`Statistics.cov`

Function
cov(x::AbstractVector; corrected::Bool=true)

Compute the variance of the vector `x`. If `corrected` is `true` (the default), the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n = length(x)`.

cov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)

Compute the covariance matrix of the matrix `X` along the dimension `dims`. If `corrected` is `true` (the default), the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n = size(X, dims)`.

cov(x::AbstractVector, y::AbstractVector; corrected::Bool=true)

Compute the covariance between the vectors `x` and `y`. If `corrected` is `true` (the default), computes $\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar x) (y_i-\bar y)^*$ where $*$ denotes the complex conjugate and `n = length(x) = length(y)`. If `corrected` is `false`, computes $\frac{1}{n}\sum_{i=1}^n (x_i-\bar x) (y_i-\bar y)^*$.

cov(X::AbstractVecOrMat, Y::AbstractVecOrMat; dims::Int=1, corrected::Bool=true)

Compute the covariance between the vectors or matrices `X` and `Y` along the dimension `dims`. If `corrected` is `true` (the default), the sum is scaled by `n-1`; if `corrected` is `false`, it is scaled by `n`, where `n = size(X, dims) = size(Y, dims)`.
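The covariance formulas above can be compared with a direct computation. This is a minimal sketch on illustrative real-valued data; `conj` is kept in the hand-written version only to mirror the formula and has no effect on real numbers.

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0]      # mean 2.5
y = [1.0, 3.0, 2.0, 6.0]      # mean 3.0
n = length(x)

c_corrected   = cov(x, y)                    # scaled by n - 1
c_uncorrected = cov(x, y; corrected=false)   # scaled by n

# Hand-written versions of the two formulas
dx = x .- mean(x)
dy = y .- mean(y)
manual_corrected   = sum(dx .* conj.(dy)) / (n - 1)   # 7 / 3 for this data
manual_uncorrected = sum(dx .* conj.(dy)) / n         # 7 / 4 for this data

# Covariance matrix: columns are variables with the default dims=1
X = hcat(x, y)
C = cov(X)                    # 2×2 matrix with variances on the diagonal
```

The single-vector method `cov(x)` reduces to the variance, so `C[1, 1]` matches `var(x)`.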

`Statistics.mean!`

Function
mean!(r, v)

Compute the mean of `v` over the singleton dimensions of `r`, and write results to `r`.

**Examples**

    julia> v = [1 2; 3 4]
    2×2 Array{Int64,2}:
     1  2
     3  4

    julia> mean!([1., 1.], v)
    2-element Array{Float64,1}:
     1.5
     3.5

    julia> mean!([1. 1.], v)
    1×2 Array{Float64,2}:
     2.0  3.0

`Statistics.mean`

Function
mean(itr)

Compute the mean of all elements in a collection.

Note

If `itr` contains `NaN` or `missing` values, the result is also `NaN` or `missing` (`missing` takes precedence if `itr` contains both). Use the `skipmissing` function to omit `missing` entries and compute the mean of non-missing values.

**Examples**

    julia> mean(1:20)
    10.5

    julia> mean([1, missing, 3])
    missing

    julia> mean(skipmissing([1, missing, 3]))
    2.0
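The precedence of `missing` over `NaN` described in the note can be observed directly. The sketch below uses illustrative data; the `filter(!isnan, ...)` step is one possible way to drop `NaN` values and is not something the note itself prescribes.

```julia
using Statistics

with_nan     = [1.0, NaN, 3.0]
with_missing = [1.0, missing, 3.0]
with_both    = [1.0, NaN, missing]

r_nan     = mean(with_nan)                    # NaN propagates
r_missing = mean(with_missing)                # missing propagates
r_both    = mean(with_both)                   # missing takes precedence over NaN
r_skip    = mean(skipmissing(with_missing))   # mean of 1.0 and 3.0

# Assumption for illustration: drop NaN entries explicitly before averaging
r_clean = mean(filter(!isnan, with_nan))
```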

mean(f::Function, itr)

Apply the function `f` to each element of collection `itr` and take the mean.

    julia> mean(√, [1, 2, 3])
    1.3820881233139908

    julia> mean([√1, √2, √3])
    1.3820881233139908

mean(A::AbstractArray; dims)

Compute the mean of an array over the given dimensions.

Julia 1.1

`mean` for empty arrays requires at least Julia 1.1.

**Examples**

    julia> A = [1 2; 3 4]
    2×2 Array{Int64,2}:
     1  2
     3  4

    julia> mean(A, dims=1)
    1×2 Array{Float64,2}:
     2.0  3.0

    julia> mean(A, dims=2)
    2×1 Array{Float64,2}:
     1.5
     3.5

`Statistics.median!`

Function
median!(v)

Like `median`, but may overwrite the input vector.
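Because `median!` may reorder its argument, a copy can be passed when the original order still matters. A minimal sketch with illustrative data:

```julia
using Statistics

v = [5, 1, 4, 2, 3]

m_safe = median(v)        # v keeps its original order

w = copy(v)
m_inplace = median!(w)    # w may be reordered by the call
```

Both calls return the same value; only the state of the input afterwards differs, and the exact order `w` ends up in should not be relied on.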

`Statistics.median`

Function
median(itr)

Compute the median of all elements in a collection. For an even number of elements no exact median element exists, so the result is the mean of the two middle elements.

Note

If `itr` contains `NaN` or `missing` values, the result is also `NaN` or `missing` (`missing` takes precedence if `itr` contains both). Use the `skipmissing` function to omit `missing` entries and compute the median of non-missing values.

**Examples**

    julia> median([1, 2, 3])
    2.0

    julia> median([1, 2, 3, 4])
    2.5

    julia> median([1, 2, missing, 4])
    missing

    julia> median(skipmissing([1, 2, missing, 4]))
    2.0

median(A::AbstractArray; dims)

Compute the median of an array along the given dimensions.

**Examples**

    julia> median([1 2; 3 4], dims=1)
    1×2 Array{Float64,2}:
     2.0  3.0

`Statistics.middle`

Function
middle(x)

Compute the middle of a scalar value, which is equivalent to `x` itself, but of the type of `middle(x, x)` for consistency.

middle(x, y)

Compute the middle of two reals `x` and `y`, which is equivalent in both value and type to computing their mean (`(x + y) / 2`).

middle(range)

Compute the middle of a range, which consists of computing the mean of its extrema. Since a range is sorted, the mean is computed from its first and last elements.

    julia> middle(1:10)
    5.5

middle(a)

Compute the middle of an array `a`, which consists of finding its extrema and then computing their mean.

    julia> a = [1,2,3.6,10.9]
    4-element Array{Float64,1}:
      1.0
      2.0
      3.6
     10.9

    julia> middle(a)
    5.95
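The four `middle` methods can be compared side by side. This sketch reuses the values from the examples above plus a few illustrative ones:

```julia
using Statistics

m_scalar = middle(3)          # same value as the input, returned as a Float64
m_pair   = middle(1, 4)       # mean of the two values
m_range  = middle(1:10)       # mean of first and last element
a        = [1, 2, 3.6, 10.9]
m_array  = middle(a)          # mean of the extrema of a

# Hand-written equivalent for the array case
m_manual = (minimum(a) + maximum(a)) / 2
```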

`Statistics.quantile!`

Function
quantile!([q::AbstractArray, ] v::AbstractVector, p; sorted=false)

Compute the quantile(s) of a vector `v` at a specified probability or vector or tuple of probabilities `p` on the interval [0,1]. If `p` is a vector, an optional output array `q` may also be specified. (If not provided, a new output array is created.) The keyword argument `sorted` indicates whether `v` can be assumed to be sorted; if `false` (the default), then the elements of `v` will be partially sorted in-place.

Quantiles are computed via linear interpolation between the points `((k-1)/(n-1), v[k])`, for `k = 1:n` where `n = length(v)`. This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R default.

Note

An `ArgumentError` is thrown if `v` contains `NaN` or `missing` values.

- Hyndman, R.J. and Fan, Y. (1996) "Sample Quantiles in Statistical Packages", *The American Statistician*, Vol. 50, No. 4, pp. 361-365

**Examples**

    julia> x = [3, 2, 1];

    julia> quantile!(x, 0.5)
    2.0

    julia> x
    3-element Array{Int64,1}:
     1
     2
     3

    julia> y = zeros(3);

    julia> quantile!(y, x, [0.1, 0.5, 0.9]) === y
    true

    julia> y
    3-element Array{Float64,1}:
     1.2
     2.0
     2.8

`Statistics.quantile`

Function
quantile(itr, p; sorted=false)

Compute the quantile(s) of a collection `itr` at a specified probability or vector or tuple of probabilities `p` on the interval [0,1]. The keyword argument `sorted` indicates whether `itr` can be assumed to be sorted.

Quantiles are computed via linear interpolation between the points `((k-1)/(n-1), v[k])`, for `k = 1:n` where `n = length(itr)` and `v` holds the elements of `itr` in sorted order. This corresponds to Definition 7 of Hyndman and Fan (1996), and is the same as the R default.
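The interpolation rule can be written out by hand to see what it computes. The function `quantile_def7` below is a hypothetical helper written for this illustration, not part of `Statistics`; it is a sketch of the rule as stated above for a single probability `p`, without the input validation the library performs.

```julia
using Statistics

# Hypothetical helper: linear interpolation between the points
# ((k-1)/(n-1), v[k]) for k = 1:n, where v is the sorted input.
function quantile_def7(itr, p::Real)
    v = sort!(collect(itr))       # sorted copy of the data
    n = length(v)
    h = (n - 1) * p + 1           # fractional 1-based position of the quantile
    lo = floor(Int, h)            # index of the point at or below h
    hi = min(lo + 1, n)           # index of the next point, clamped at n
    return v[lo] + (h - lo) * (v[hi] - v[lo])
end

q_manual  = quantile_def7(0:20, 0.1)
q_library = quantile(0:20, 0.1)
```

Solving `(k-1)/(n-1) = p` for `k` gives the position `h = (n-1)p + 1`; the result is then interpolated between the two sorted elements on either side of `h`.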

Note

An `ArgumentError` is thrown if `itr` contains `NaN` or `missing` values. Use the `skipmissing` function to omit `missing` entries and compute the quantiles of non-missing values.

- Hyndman, R.J. and Fan, Y. (1996) "Sample Quantiles in Statistical Packages", *The American Statistician*, Vol. 50, No. 4, pp. 361-365

**Examples**

    julia> quantile(0:20, 0.5)
    10.0

    julia> quantile(0:20, [0.1, 0.5, 0.9])
    3-element Array{Float64,1}:
      2.0
     10.0
     18.0

    julia> quantile(skipmissing([1, 10, missing]), 0.5)
    5.5

© 2009–2019 Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and other contributors

Licensed under the MIT License.

https://docs.julialang.org/en/v1.2.0/stdlib/Statistics/