Statistics
For the following functions, when the input is a collection, it can take the following forms:
- Multiple arguments, e.g.
["Mean", 1, 2, 3]
- A list of numbers, e.g.
["Mean", ["List", 1, 2, 3]]
- A matrix, e.g.
["Mean", ["List", ["List", 1, 2], ["List", 3, 4]]]
- A range, e.g.
["Mean", ["Range", 1, 10]]
- A linear space, e.g.
["Mean", ["Linspace", 1, 5, 10]]
Functions
Mean: (collection)
Evaluate to the arithmetic mean of a collection of numbers.
The arithmetic mean is the average of the numbers: their sum divided by how many numbers there are.
The formula for the mean of a list of numbers is
\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i
where n is the number of numbers in the list, and x_i is the i-th number in the list.
["Mean", ["List", 7, 8, 3.1, 12, 77]]
// 21.42
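As a rough illustration of the formula, here is a minimal Python sketch. It is a standalone example rather than the library's implementation, and the function name `mean` and the sample data are chosen for illustration only.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    values = list(values)
    if not values:
        raise ValueError("mean of an empty collection is undefined")
    return sum(values) / len(values)


print(mean([2, 4, 6, 8]))   # 5.0
print(mean(range(1, 11)))   # 5.5
```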
Median: (collection)
Evaluate to the median of a collection of numbers.
The median is the value separating the higher half from the lower half of a data sample. For a list of numbers sorted in ascending order, the median is the middle value of the list. If the list has an odd number of elements, the median is the middle element. If the list has an even number of elements, the median is the average of the two middle elements.
["Median", ["List", 1, 2, 3, 4, 5]]
// 3
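The odd and even cases described above can be sketched in Python as follows. This is an illustrative standalone function (the name `median` is an assumption of the sketch), not the library's code.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("median of an empty collection is undefined")
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]                    # odd count: the single middle element
    return (xs[mid - 1] + xs[mid]) / 2    # even count: average the two middle elements


print(median([1, 2, 3, 4, 5]))  # 3
print(median([4, 1, 3, 2]))     # 2.5
```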
Mode: (collection)
Evaluate to the mode of a collection of numbers.
The mode is the value that appears most often in a list of numbers. A list of numbers can have more than one mode. If there are two modes, the list is called bimodal, for example \lbrack 2, 5, 5, 3, 2\rbrack. If there are three modes, the list is called trimodal. If there are more than three modes, the list is called multimodal.
["Mode", ["List", 1, 2, 2, 3, 4, 4, 5, 5]]
// 2
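To make the idea of several modes concrete, the following Python sketch lists every value that ties for the highest frequency. It is an illustration only: the helper name `modes` is hypothetical, and a function that returns a single value has to pick one of the tied values.

```python
from collections import Counter


def modes(values):
    """Return every value with the highest frequency, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


print(modes([2, 5, 5, 3, 2]))           # [2, 5]
print(modes([1, 2, 2, 3, 4, 4, 5, 5]))  # [2, 4, 5]
```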
Variance: (collection)
Evaluate to the variance of a collection of numbers.
The variance is a measure of the amount of variation or dispersion of a set of values. A low variance indicates that the values tend to be close to the mean of the set, while a high variance indicates that the values are spread out over a wider range.
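The formula for the variance of a list of numbers is
\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2
where n is the number of numbers in the list and \bar{x} is the mean of the list. This is the sample variance; dividing by n instead gives the population variance.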
PopulationVariance: (collection)
Evaluate to the population variance of a collection of numbers.
The population variance is the variance calculated by dividing the sum of squared differences from the mean by the number of elements in the population.
The formula for the population variance is
\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2
where N is the size of the population, and \mu is the population mean.
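The population variance formula translates almost directly into code. The Python sketch below is a standalone illustration (the function name and data are assumptions of the example), not the library's implementation.

```python
def population_variance(values):
    """Mean of the squared deviations from the mean (divisor N)."""
    xs = list(values)
    n = len(xs)
    if n == 0:
        raise ValueError("variance of an empty collection is undefined")
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


# mean is 5, squared deviations sum to 32, and 32 / 8 = 4
print(population_variance([2, 4, 4, 4, 5, 5, 7, 9]))  # 4.0
```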
StandardDeviation: (collection)
Evaluate to the standard deviation of a collection of numbers.
The standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.
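The formula for the standard deviation of a collection of numbers is
\sqrt{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2}
where n is the number of numbers in the collection and \bar{x} is their mean. This is the sample standard deviation; dividing by n instead gives the population standard deviation.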
PopulationStandardDeviation: (collection)
Evaluate to the population standard deviation of a collection of numbers.
The population standard deviation is the square root of the population variance.
The formula for the population standard deviation is
\sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2}
where N is the size of the population, and \mu is the population mean.
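As a small illustration, the population standard deviation can be computed by taking the square root of the mean squared deviation. The Python sketch below is a standalone example with a hypothetical function name, not the library's code.

```python
import math


def population_standard_deviation(values):
    """Square root of the population variance (divisor N)."""
    xs = list(values)
    n = len(xs)
    if n == 0:
        raise ValueError("standard deviation of an empty collection is undefined")
    mu = sum(xs) / n
    variance = sum((x - mu) ** 2 for x in xs) / n
    return math.sqrt(variance)


# the population variance of this data is 4, so the result is 2
print(population_standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```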
Skewness: (collection)
Evaluate to the skewness of a collection of numbers.
The skewness is a measure of the asymmetry of the distribution of a real-valued random variable about its mean. The skewness value can be positive, negative, zero, or undefined.
The formula for the skewness of a collection of numbers is
\frac{1}{n} \sum_{i=1}^n \left(\frac{x_i - \mu}{\sigma}\right)^3
where \mu is the mean of the collection, and \sigma is the standard deviation of the collection.
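The sign of the skewness is easier to see with a quick computation. The Python sketch below follows the formula above, under the assumption that \sigma is computed with the divisor n. The function name is hypothetical, and this is not the library's implementation.

```python
import math


def skewness(values):
    """Mean of the cubed standardized deviations (sigma uses divisor n, an assumption)."""
    xs = list(values)
    n = len(xs)
    if n == 0:
        return math.nan
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    if sigma == 0:
        return math.nan  # undefined when all values are equal
    return sum(((x - mu) / sigma) ** 3 for x in xs) / n


print(skewness([1, 2, 3, 4, 10]) > 0)  # True: long tail to the right
print(skewness([10, 9, 8, 7, 1]) < 0)  # True: long tail to the left
```

Symmetric data such as 1, 2, 3, 4, 5 gives a skewness of zero, up to rounding.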
Kurtosis: (collection)
Evaluate to the kurtosis of a collection of numbers.
The kurtosis is a measure of the "tailedness" of the distribution of a real-valued random variable. As defined by the formula below, the kurtosis is never negative, and it is undefined when the standard deviation is zero.
The formula for the kurtosis of a collection of numbers is
\frac{1}{n} \sum_{i=1}^n \left(\frac{x_i - \mu}{\sigma}\right)^4
where \mu is the mean of the list, and \sigma is the standard deviation of the list.
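The kurtosis formula can be sketched in Python as the fourth central moment divided by the squared variance, which is algebraically the same expression. The sketch assumes \sigma is computed with the divisor n and uses a hypothetical function name. It is not the library's implementation.

```python
import math


def kurtosis(values):
    """Fourth central moment divided by the squared variance (divisor n, an assumption)."""
    xs = list(values)
    n = len(xs)
    if n == 0:
        return math.nan
    mu = sum(xs) / n
    variance = sum((x - mu) ** 2 for x in xs) / n
    if variance == 0:
        return math.nan  # undefined when all values are equal
    fourth_moment = sum((x - mu) ** 4 for x in xs) / n
    return fourth_moment / variance ** 2


# variance is 2, the fourth moment is 34 / 5 = 6.8, so the result is about 1.7
print(kurtosis([1, 2, 3, 4, 5]))
```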
Quantile: (collection, q:number)
Evaluate to the quantile of a collection of numbers.
The quantile for q, a number between 0 and 1, is the value below which a fraction q of the collection lies. The quantile is a generalization of the median, which divides a collection of numbers into two equal-sized groups.
So, \operatorname{median} = \operatorname{quantile}(0.5).
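Several conventions exist for computing a quantile from a finite collection. The Python sketch below uses linear interpolation between neighbouring sorted values, which is an assumption of this example and not necessarily the convention the library follows. With q = 0.5 it reproduces the median.

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation between neighbouring sorted values."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    xs = sorted(values)
    if not xs:
        raise ValueError("quantile of an empty collection is undefined")
    position = q * (len(xs) - 1)  # fractional index into the sorted data
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return xs[lower] + fraction * (xs[upper] - xs[lower])


print(quantile([1, 2, 3, 4, 5], 0.5))  # 3.0
print(quantile([1, 2, 3, 4], 0.5))     # 2.5
```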
Quartiles: (collection)
Evaluate to the quartiles of a collection of numbers.
The quartiles are the three points that divide a collection of numbers into four equal groups, each group comprising a quarter of the collection.
["Quartiles", ["List", 1, 2, 3, 4, 5, 6, 7, 8]]
// [2.5, 4.5, 6.5]
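One convention that reproduces the values in the example above is to take the median of the lower half, the median of the whole collection, and the median of the upper half. The Python sketch below illustrates that convention. Whether the library uses it for every input is an assumption, and the function name is hypothetical.

```python
from statistics import median


def quartiles(values):
    """Quartiles via medians of the lower half, the whole data, and the upper half."""
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = xs[:half]
    # For an odd count, the middle element belongs to neither half.
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return [median(lower), median(xs), median(upper)]


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # [2.5, 4.5, 6.5]
```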
InterquartileRange: (collection)
Evaluate to the interquartile range (IQR) of a collection of numbers.
The interquartile range is the difference between the third quartile and the first quartile.
Histogram: (collection, bins:number)
Evaluate to the histogram of a collection of numbers.
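The histogram groups the data into a specified number of bins and counts the number of elements in each bin. The counts below assume equal-width bins spanning the minimum to the maximum of the data, with the maximum included in the last bin.
["Histogram", ["List", 1, 2, 2, 3, 4, 5, 5, 5], 3]
// [3, 1, 4]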
BinCounts: (collection, bins:number)
Evaluate to the bin counts of a collection of numbers.
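Bin counts are the number of elements that fall in each bin, for a given number of bins. The counts below assume equal-width bins spanning the minimum to the maximum of the data, with the maximum included in the last bin.
["BinCounts", ["List", 1, 2, 2, 3, 4, 5, 5, 5], 3]
// [3, 1, 4]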
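Counting elements per bin can be sketched as follows. The Python example assumes equal-width bins that span the minimum to the maximum of the data, with the maximum placed in the last bin. That binning rule and the function name are assumptions of the sketch, not a statement of how the library chooses its bins.

```python
def bin_counts(values, bins):
    """Count values in `bins` equal-width bins spanning min(values) to max(values)."""
    xs = list(values)
    if bins < 1 or not xs:
        raise ValueError("need at least one bin and one value")
    lo, hi = min(xs), max(xs)
    if lo == hi:
        return [len(xs)] + [0] * (bins - 1)  # degenerate range: all in the first bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in xs:
        index = min(int((x - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return counts


print(bin_counts([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 3))  # [3, 3, 4]
```

Whatever binning rule is used, the counts always add up to the number of elements in the collection.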
SlidingWindow: (collection, windowSize:number)
Evaluate to the sliding windows of a collection of numbers.
A sliding window is a moving subset of the data of a specified window size.
["SlidingWindow", ["List", 1, 2, 3, 4, 5], 3]
// [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
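A sliding window can be sketched with list slicing. The Python function below is an illustration with a hypothetical name. It advances the window one element at a time, which matches the example above.

```python
def sliding_window(values, window_size):
    """Return every run of `window_size` consecutive elements, advancing by one."""
    xs = list(values)
    if window_size < 1:
        raise ValueError("window size must be at least 1")
    return [xs[i:i + window_size] for i in range(len(xs) - window_size + 1)]


print(sliding_window([1, 2, 3, 4, 5], 3))  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

A collection of n elements yields n - windowSize + 1 windows, and none when the window is larger than the collection.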
Sample: (collection, size:number)
Evaluate to a random sample of a specified size from a collection of numbers.
Sampling is done without replacement unless otherwise specified.
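The difference between sampling with and without replacement can be illustrated with Python's standard `random` module. The helper names and the `seed` parameter are assumptions of this sketch and are not part of the function described above.

```python
import random


def sample_without_replacement(values, size, seed=None):
    """Pick `size` distinct positions; raises ValueError if size exceeds the collection."""
    rng = random.Random(seed)
    return rng.sample(list(values), size)


def sample_with_replacement(values, size, seed=None):
    """Pick `size` elements independently; the same element may appear more than once."""
    rng = random.Random(seed)
    return rng.choices(list(values), k=size)


picked = sample_without_replacement([10, 20, 30, 40, 50], 3, seed=1)
print(len(picked), len(set(picked)))  # 3 3
```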
Rank: (collection)
Evaluate to the rank of each element in a collection of numbers.
The rank is the position of each element in the sorted order of the collection.
Argsort: (collection)
Evaluate to the indices that would sort a collection of numbers.
Taking the elements of the collection in the order given by these indices yields the sorted collection.
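Argsort and Rank are closely related: the rank of each element is the inverse of the argsort permutation. The Python sketch below illustrates both. It assumes zero-based indices and breaks ties by original order. Both choices are assumptions of the example, and the library may number positions differently.

```python
def argsort(values):
    """Indices that would sort the values in ascending order (zero-based)."""
    xs = list(values)
    return sorted(range(len(xs)), key=lambda i: xs[i])


def rank(values):
    """Position of each element in the sorted order (zero-based)."""
    order = argsort(values)
    ranks = [0] * len(order)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


data = [30, 10, 20]
print(argsort(data))                     # [1, 2, 0]
print([data[i] for i in argsort(data)])  # [10, 20, 30]
print(rank(data))                        # [2, 0, 1]
```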