Given two independent sequences of letters, we seek the probability distribution of the length of the longest matching word. The word may occur at different positions in the two sequences, and we consider both perfect and nearly perfect matching. We derive bounds and approximations for this distribution and compare them with other bounds and approximations. The results apply to DNA sequences in molecular biology and to generalized matching between two independent random sequences.
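To make the quantity under study concrete, the following minimal sketch computes the length of the longest perfectly matching word between two sequences and estimates its distribution by simulation. It is an illustration only, not the bounds or approximations derived in the paper. The function names `longest_common_word` and `simulate_distribution`, the four-letter alphabet, and the assumption of independent, uniformly distributed letters are all hypothetical choices made for this example; nearly perfect matching is not covered.

```python
import random
from collections import Counter


def longest_common_word(a: str, b: str) -> int:
    """Length of the longest contiguous word occurring in both a and b.

    The word may sit at different positions in the two sequences.
    Only perfect (exact) matching is handled here.
    """
    best = 0
    # prev[j] holds the length of the match ending at the previous
    # letter of a and at b[j - 1].
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0] * (len(b) + 1)
        for j, other in enumerate(b, start=1):
            if ch == other:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def simulate_distribution(n, m, alphabet="ACGT", trials=1000, seed=0):
    """Monte Carlo estimate of the distribution of the longest match length.

    Assumption (not from the source): letters are i.i.d. and uniform
    over the given alphabet.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        a = "".join(rng.choices(alphabet, k=n))
        b = "".join(rng.choices(alphabet, k=m))
        counts[longest_common_word(a, b)] += 1
    return {length: c / trials for length, c in sorted(counts.items())}


if __name__ == "__main__":
    # Estimated probabilities for each observed longest-match length.
    for length, prob in simulate_distribution(100, 100).items():
        print(length, prob)
```

A simulation of this kind is one way to sanity-check an analytical bound or approximation for small sequence lengths, since the empirical frequencies can be compared directly with the predicted probabilities.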
All Science Journal Classification (ASJC) codes
- Agricultural and Biological Sciences(all)
- Environmental Science(all)
- Biochemistry, Genetics and Molecular Biology(all)
- Computational Theory and Mathematics