Symbols of String Pattern Matching

Time: 2022-07-25 13:56:09

This note collects the notation for string pattern matching used in Introduction to Algorithms.

Discussing the string-matching problem clearly requires precise notation, so we adopt the symbols defined in Introduction to Algorithms.

Text:    $T[1, ..., n]$.

Pattern:  $P[1, ..., m]$.

Every character satisfies $T[i], P[j] \in \Sigma$, and an array of characters such as $T[1, ..., n]$ or $P[1, ..., m]$ is called a string.

Alphabet:  $\Sigma$.

Set of all finite-length strings over $\Sigma$:  $\Sigma^*$.

The empty string, denoted $\epsilon$, has length zero. Since it is a finite-length string, we also have $\epsilon \in \Sigma^*$.

Pattern $P$ occurs with shift $s$ in text $T$ (that is, $s$ is a valid shift) if $0 \le s \le n - m$ and $T[s+1, s+2, ..., s+m] = P[1, 2, ..., m]$.
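To make the definition of a valid shift concrete, here is a minimal Python sketch that checks every candidate shift by direct comparison. The function name `valid_shifts` is hypothetical (it does not come from the book), and because Python strings are 0-indexed, the 1-indexed range $T[s+1, ..., s+m]$ is assumed to correspond to the slice `T[s:s + m]`.

```python
def valid_shifts(T: str, P: str) -> list[int]:
    """Return every shift s with T[s+1..s+m] = P[1..m] (1-indexed notation)."""
    n, m = len(T), len(P)
    # Python slices are 0-indexed and half-open: T[s:s + m] is T[s+1..s+m].
    # Candidate shifts run over 0 <= s <= n - m; the range is empty if m > n.
    return [s for s in range(n - m + 1) if T[s:s + m] == P]


print(valid_shifts("abcabaabcabac", "abaa"))  # [3]
```

With text `abcabaabcabac` and pattern `abaa`, the only valid shift is $s = 3$, since the pattern lines up with the 4th through 7th characters of the text.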

$\omega$ is a prefix of string $x$:  there exists a string $y\in \Sigma^*$ such that $x=\omega y$, denoted $\omega \sqsubset x$.

$\omega$ is a suffix of string $x$:  there exists a string $y\in \Sigma^*$ such that $x=y \omega$, denoted $\omega \sqsupset x$.
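The two relations can be sketched as small Python predicates. The names `is_prefix` and `is_suffix` are hypothetical helpers chosen for illustration; they test whether the required string $y$ exists by comparing the leading or trailing $|\omega|$ characters of $x$.

```python
def is_prefix(w: str, x: str) -> bool:
    """True when x = w y for some string y, i.e. w is a prefix of x."""
    return x[:len(w)] == w


def is_suffix(w: str, x: str) -> bool:
    """True when x = y w for some string y, i.e. w is a suffix of x."""
    # Slice from an explicit start index; x[-0:] would return all of x.
    return len(w) <= len(x) and x[len(x) - len(w):] == w


print(is_prefix("ab", "abcca"))   # True
print(is_suffix("cca", "abcca"))  # True
print(is_prefix("", "abcca"))     # True: the empty string is a prefix of every string
```

Note that the empty string $\epsilon$ is both a prefix and a suffix of every string, because $y = x$ satisfies both definitions.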

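Define $P_k$ as the $k$-character prefix $P[1, ..., k]$ of the string $P[1, ..., m]$. Thus we have $P_0 = \epsilon$ and $P_m = P = P[1, ..., m]$. Similarly, $T_k$ denotes the $k$-character prefix $T[1, ..., k]$ of the text $T$.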

With this notation, the matching problem can be restated as:

Find all shifts $s$ in the range $0 \le s \le n - m$ such that $P \sqsupset T_{m+s}$, that is, such that $P$ is a suffix of the first $m+s$ characters of $T$.
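The restated problem can be checked literally in Python. This is only an illustrative sketch, not an efficient matcher: the helper names `prefix` and `shifts_by_suffix` are hypothetical, and building each prefix of the text by slicing costs far more than a practical algorithm would.

```python
def prefix(x: str, k: int) -> str:
    """Return x_k, the first k characters of x; prefix(x, 0) is the empty string."""
    return x[:k]


def shifts_by_suffix(T: str, P: str) -> list[int]:
    """Return every s in 0..n-m such that P is a suffix of T_{s+m}."""
    n, m = len(T), len(P)
    # str.endswith tests the suffix relation on the prefix T_{s+m}.
    return [s for s in range(n - m + 1) if prefix(T, s + m).endswith(P)]


print(shifts_by_suffix("abcabaabcabac", "abaa"))  # [3]
```

For the text `abcabaabcabac` and pattern `abaa`, the prefix $T_7$ is `abcabaa`, which ends in `abaa`, so $s = 3$ is reported.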