Returns the Mahalanobis distance between x, which is
either a vector or matrix of row vectors, and the
centroid of the observations in the matrix :y.
Arguments:
x -- either a vector or a matrix of row vectors
Options:
:y -- Defaults to x. Must be a matrix of row vectors; it is used to calculate the centroid.
:W -- Defaults to (solve (covariance y)), the inverse of the covariance matrix of y. If an identity
matrix is provided, the result equals the Euclidean distance.
:centroid -- Defaults to (map mean (trans y)), the column means of y.
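The options above map onto the standard definition of the Mahalanobis distance given in the referenced article. As a sketch, assuming the function follows that definition, and using notation that is not part of the library (c for the centroid, S for the covariance matrix of y, I for the identity matrix):

```latex
d(x) = \sqrt{(x - c)^{\top} \, W \, (x - c)}, \qquad W = S^{-1} \ \text{by default}

% With W = I the weighting drops out, leaving the Euclidean distance:
d(x) = \sqrt{(x - c)^{\top} (x - c)} = \lVert x - c \rVert_{2}
```

Here W corresponds to the :W option and c to the :centroid option. When x is a matrix, the same expression would apply to each row.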
References:
http://en.wikipedia.org/wiki/Mahalanobis_distance
Examples:
(use '(incanter core stats charts))
;; generate some multivariate normal data with a single outlier.
(def data (bind-rows
            (bind-columns
              (sample-mvn 100
                          :sigma (matrix [[1 0.9]
                                          [0.9 1]])))
            [-1.75 1.75]))
;; view a scatter plot of the data
(let [[x y] (trans data)]
  (doto (scatter-plot x y)
    (add-points [(mean x)] [(mean y)])
    (add-pointer -1.75 1.75 :text "Outlier")
    (add-pointer (mean x) (mean y) :text "Centroid")
    view))
;; calculate the distances of each point from the centroid.
(def dists (mahalanobis-distance data))
;; view a bar-chart of the distances
(view (bar-chart (range (nrow data)) dists))
;; Now contrast with the Euclidean distance.
(def dists (mahalanobis-distance data :W (matrix [[1 0] [0 1]])))
;; view a bar-chart of the distances
(view (bar-chart (range (nrow data)) dists))
;; distance of a single point from the centroid of data
(mahalanobis-distance [-1.75 1.75] :y data)
;; the same point, using an identity W (Euclidean distance)
(mahalanobis-distance [-1.75 1.75]
                      :y data
                      :W (matrix [[1 0]
                                  [0 1]]))