Geoffrey Falk wrote:Suppose I give you a 4-letter word with letters chosen from an alphabet of 26 letters. What is the probability that a random 13-letter string contains the word as a substring?
Alternatively, write a Java program to calculate the probability.
Geoffrey
Notation:
Pn = prob that word occurs at position n of string.
For 0 <= n <= 9, Pn = (1/26) ^ 4 = 1/456976
For 9 < n, Pn = 0.
P(word does NOT appear in string) = [(1 - P1) * (1-P2) * ... (1 - P12)]
= (1- (1/456976)) ^ 13
= (456975/456976) ^ 13
P(word DOES exist in string) = 1 - P(word does NOT appear in string)
= 1 - (456975/456976)^13
approx= .0000284475
...which is about 1 in 35,000
According to this formula, you reach the 50-50 point around with a string around 300,000 letters long. That seems long.