median, mean, mode as minimizers
2024年5月9日Median, mean, and mode are the most common measures of central tendency.
Norms
In mathematics, a norm measures the “distance” from a point to another. A norm is denoted by double vertical lines: \(||x||\).
Absolute-value norm is a norm on a single value, see \(\eqref{absolute}\).
$$
\begin{equation}
||x||=|x|
\label{absolute}
\end{equation}
$$
Euclidean norm is a norm on a list, see \(\eqref{euclidean}\).
$$
\begin{equation}
||\mathbf{x}||= \sqrt{x_1^2 + \cdots + x_n^2}
\label{euclidean}
\end{equation}
$$
p-norm is a generalization of Euclidean norm. It is a norm on a list.
Let \(p \geq 1\) be a real number. The p-norm of list \(\mathbf{x} = (x_1, \ldots, x_n)\) is
$$ ||\mathbf{x}||_p := \left(\sum_{i=1}^n \left|x_i\right|^p\right)^{1/p}. $$
When p=1, p-norm reduces to \(||\mathbf{x}||_1 := \sum_{i=1}^n \left|x_i\right| \). It takes absolute values, but it is not the absolute-value norm due to the \(\sum\) symbol.
When p=2, we get Euclidean norm.
As p approaches \(\infty\), the p-norm approaches the infinity norm or maximum norm: \(||\mathbf{x}||_\infty := \max_i \left|x_i\right|\) .
When p=0, 0-norm is not defined.
Statistical dispersions
In statistics, dispersion is the extent to which a distribution is stretched or squeezed.
For a given (finite) data set \(\mathbf{x}\), the dispersion about a point c is the “distance” from \(x_i\) to the c in the p-norm:
$$ f_p(c) = \left\| \mathbf{x} – \mathbf{c} \right\|_p = \left( \frac{1}{n} \sum_{i=1}^n \left| x_i – c\right| ^p \right) ^{1/p} $$
p | dispersion | central tendency |
---|---|---|
0 | variation ratio | mode |
1 | average absolute deviation | median |
2 | standard deviation | mean |
\(\infty\) | maximum deviation | mid-range |
Average Absolute Deviation has the identical form of Mean Absolute Error (MAE) which is from the viewpoint of prediction.
The 2-norm is Mean Squared Error (MSE) in addition with the power of \(\frac{1}{2}\). MSE is \(\frac{1}{n} \sum_{i=1}^n \left(x_i-c \right)^2\).
The 0-norm counts the number of unequal points. Mode minimizes this 0-norm.
For \( p = \infty\) the largest number dominates, and thus mid-range \(M=\frac{\max x + \min x}{2} \) minimizes \(\infty\)-norm.
See also
Minimum Cost to Make Array Equalindromic
[…] 向量的模表示向量的大小,记为(|boldsymbol{a}|) 或(|vec{a}|)。[1]模是绝对值在二维和三维空间的推广,可以认为是向量的长度。推广到高维空间中称为范数(norm)。[2](另见《median, mean, mode as minimizers》) […]