1.2.3-SNAPSHOT

dice-coefficient-str

incanter.stats

  • (dice-coefficient-str a b)

http://en.wikipedia.org/wiki/Dice%27s_coefficient

When used as a string similarity measure, the coefficient can be calculated for two strings x and y from their character bigrams: s = 2 * nt / (nx + ny), where nt is the number of character bigrams found in both strings, nx is the number of bigrams in string x, and ny is the number of bigrams in string y. For example, to calculate the similarity between:

night
nacht

We would find the set of bigrams in each word:

{ni,ig,gh,ht}
{na,ac,ch,ht}

Each set has four elements, and the intersection of these two sets has only one element: ht.

Plugging these values into the formula, we get s = (2 * 1) / (4 + 4) = 0.25.
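The worked example above can be reproduced with a minimal sketch in plain Clojure. The helper names `char-bigrams` and `dice-similarity` are hypothetical and chosen for illustration; this is not incanter's own implementation, only an assumption about how bigram sets and the formula s = 2 * nt / (nx + ny) fit together.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helpers for illustration; not incanter's own code.
(defn char-bigrams
  "Returns the set of adjacent two-character substrings of s."
  [s]
  (set (map #(apply str %) (partition 2 1 s))))

(defn dice-similarity
  "Computes 2 * nt / (nx + ny) over the bigram sets of strings x and y."
  [x y]
  (let [bx (char-bigrams x)
        by (char-bigrams y)
        nt (count (set/intersection bx by))] ; bigrams shared by both strings
    (/ (* 2.0 nt) (+ (count bx) (count by)))))

(char-bigrams "night")             ; the set of "ni", "ig", "gh", "ht"
(dice-similarity "night" "nacht")  ; => 0.25
```

Because the bigrams are held in sets, a bigram that repeats within one string is counted once, which matches the set-based description above. The sketch makes no attempt to handle strings shorter than two characters, which have no bigrams.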

Source incanter/stats.clj:3087

(defn dice-coefficient-str
"
http://en.wikipedia.org/wiki/Dice%27s_coefficient

When used as a string similarity measure, the coefficient can be calculated for two strings x and y from their character bigrams: s = 2 * nt / (nx + ny), where nt is the number of character bigrams found in both strings, nx is the number of bigrams in string x, and ny is the number of bigrams in string y. For example, to calculate the similarity between:

    night
    nacht

We would find the set of bigrams in each word:

    {ni,ig,gh,ht}
    {na,ac,ch,ht}

Each set has four elements, and the intersection of these two sets has only one element: ht.

Plugging these values into the formula, we get s = (2 * 1) / (4 + 4) = 0.25.
"
  [a b]
  (dice-coefficient
   (bigrams a)
   (bigrams b)))
Vars in incanter.stats/dice-coefficient-str: defn
Used in 0 other vars