• incanter

1.2.3-SNAPSHOT

sorensen-index

incanter.stats

• (sorensen-index a b)

http://en.wikipedia.org/wiki/S%C3%B8rensen_similarity_index#cite_note-4

The Sørensen index, also known as Sørensen’s similarity coefficient, is a statistic used for comparing the similarity of two samples. It is defined as QS = 2C / (A + B), where A and B are the numbers of species in samples A and B, respectively, and C is the number of species shared by the two samples.

The Sørensen index is identical to Dice's coefficient, which always lies in the range [0, 1]. Used as a distance measure, 1 − QS, the Sørensen index is identical to Hellinger distance and Bray–Curtis dissimilarity.

The Sørensen coefficient is mainly useful for ecological community data (e.g. Looman & Campbell, 1960[3]). Justification for its use is primarily empirical rather than theoretical (although it can be justified theoretically as the intersection of two fuzzy sets[4]). Compared to Euclidean distance, Sørensen distance retains sensitivity in more heterogeneous data sets and gives less weight to outliers.

This function assumes you pass in a and b as sets.
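As a rough illustration of the set-based calculation described above, the sketch below computes 2C / (A + B) directly with `clojure.set`. The name `sorensen-sketch` and the sample data are hypothetical and not part of incanter.stats; returning nil for two empty sets is an assumption of this sketch, not documented library behaviour.

```clojure
(require '[clojure.set :as set])

;; Hypothetical stand-alone helper, not part of incanter.stats.
(defn sorensen-sketch
  "Returns 2C / (A + B) for two sets a and b, where C is the size of
  their intersection and A, B are the set sizes."
  [a b]
  (let [shared (count (set/intersection a b))
        total  (+ (count a) (count b))]
    ;; With two empty sets the ratio is 0/0; this sketch returns nil.
    (when (pos? total)
      (/ (* 2 shared) total))))

;; Illustrative species lists for two sampling sites.
(def site-1 #{:oak :birch :pine})
(def site-2 #{:oak :pine :maple :ash})

(sorensen-sketch site-1 site-2) ; 2 shared species, 3 + 4 in total
```

Passing sets matters because the intersection and the counts are only meaningful when each species appears once per sample; a vector with repeated entries would inflate A or B.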

The Sørensen index extended to abundance instead of incidence of species is called the Czekanowski index.
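The abundance-based extension can be sketched as follows, assuming each sample is a map from species to a non-negative count and using the common formulation 2 · Σ min(aᵢ, bᵢ) / (Σ aᵢ + Σ bᵢ). The name `czekanowski-sketch` and the data are hypothetical; nothing here is taken from incanter.stats.

```clojure
;; Hypothetical helper, not part of incanter.stats.
(defn czekanowski-sketch
  "a and b are maps of species -> abundance. Returns twice the summed
  per-species minimum abundance divided by the total abundance."
  [a b]
  (let [species (into (set (keys a)) (keys b))
        ;; A species missing from one sample counts as abundance 0 there.
        shared  (reduce + (map #(min (get a % 0) (get b % 0)) species))
        total   (+ (reduce + (vals a)) (reduce + (vals b)))]
    ;; Two empty samples give 0/0; this sketch returns nil.
    (when (pos? total)
      (/ (* 2 shared) total))))

;; Illustrative abundance counts for two sampling sites.
(def plot-1 {:oak 3 :pine 1})
(def plot-2 {:oak 1 :maple 2})

(czekanowski-sketch plot-1 plot-2) ; shared = 1, total = 7
```

When every abundance is 1, the summed minimum equals the number of shared species and the totals equal the set sizes, so the result coincides with the incidence-based index.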

Source incanter/stats.clj:3148

```clojure
(defn sorensen-index
"
http://en.wikipedia.org/wiki/S%C3%B8rensen_similarity_index#cite_note-4

The Sørensen index, also known as Sørensen’s similarity coefficient, is a statistic used for comparing the similarity of two samples. where A and B are the species numbers in samples A and B, respectively, and C is the number of species shared by the two samples.

The Sørensen index is identical to Dice's coefficient which is always in [0, 1] range. Sørensen index used as a distance measure, 1 − QS, is identical to Hellinger distance and Bray–Curtis dissimilarity.

The Sørensen coefficient is mainly useful for ecological community data (e.g. Looman & Campbell, 1960[3]). Justification for its use is primarily empirical rather than theoretical (although it can be justified theoretically as the intersection of two fuzzy sets[4]). As compared to Euclidean distance, Sørensen distance retains sensitivity in more heterogeneous data sets and gives less weight to outliers

This function assumes you pass in a and b as sets.

The sorensen index extended to abundance instead of incidence of species is called the Czekanowski index."
[a b]
(dice-coefficient a b))
```
Vars in incanter.stats/sorensen-index: defn
Used in 0 other vars