Rabin – Karp Algorithm :

Rabin-karp
In computer science, the Rabin–Karp algorithm or Karp–Rabin algorithm is a string searching algorithm created by Richard M. Karp and Michael O. Rabin that uses hashing to find any one of a set of pattern strings in a text.

The Rabin-karp algorithm is a better way to solve string matching problem. The main feature of this algorithm which makes it best is that :
* It uses hashing function.
* It can deal with multiple pattern matching.
*  Effective and easy to implement.

Description :-

* p denote the decimal value of P[1….m] where m is the length of the pattern .
* t(s) denote the decimal value of Text txt[s+1…s+m] where s is the shift value.
* We can compute p in time O(m) using HORNER’S RULE.
p = P[m]+ d(P[m-1] + d(P[m-2]+……d(P[2] + dP[1]….)))  [ Where d is decimal value. If we take string as numbers from 0 to 9 d will be 10 so as for characters d will be 256]
* Similarly when shift value is s=0 we can calculate txt[1….m] as t(0) in O(m) time .
t(1) is txt[2….m+1] which can again be calculated in O(m) time .
To Compute remaining ti we can say that :

ts+1 = d[ts – dm-1 T[s+1] ] + T[s+m+1]

* Subtracting d(m-1) T[s+1] removes the higher order digit and adding T[s+m+1] brings in the appropriate lower digit.

  • We compute p and t(s) values modulo a suitable prime number in O(n-m+1) to overcome the problem of overflow when decimal value is too large.
  • We compute hash values for all the pattern If the two hash values are equal but the characters are not the same it is called a SPURIOUS HIT . As:
    Fig-32-5-Rabin-Karp-Examples-b
    How do we calculate ts+1 %q from ts % q (q=prime number) ??? –
    3 : old – high order value, 2 : new – low order value
    14152 ≡ ((31415 – 3 * 10000)*10 +2) mod 13
    ≡ 8 ( mod 13 )
    now c++ implementation of Rabin- Karp Algorithm :
#include<iostream>
#include<cstring>
#include<cstdio>
#define d 256 // Number of characters in input alphabet
using namespace std;
void search(char *txt , char *pat , int q){
int m=strlen(pat);
int n=strlen(txt);

int i,j;
int p=0; // hash value for pattern
int t=0; // hash value for text
int h=1; // d^(m-1) value

// The value of h would be "pow(d, M-1)%q"

for(i=0;i<m-1;i++){
h=(d*h)%q;
}

// Calculate the hash value of pattern and first window of text

for(i=0;i<m;i++){
p=(d*p+pat[i])%q;
t=(d*t+txt[i])%q;
}

for(i=0;i<=n-m;i++){

// Check if the current sliding window of text and pattern have same hash values

if(t==p){
// Check if all characters are same or it's a SPURIOUS HIT !

for(j=0;j<m;j++){
if(txt[i+j]!=pat[j]) {
break;
}
}
if(j==m)
cout<<endl<<"Pattern matched at index "<<i<<endl;
}

// Calculate hash value for next window of text: Removing Old-high oder digit
// adding new-low order digit

if(i<n-m){
t = (d*(t - txt[i]*h) + txt[i+m])%q;

// We might get negative value of t, converting it to positive

if(t < 0)
t = (t + q);
}
}
}

int main(){
char a[50],b[50];
cout<<"Enter the txt string :"<<endl;
gets(a);
cout<<endl<<"Enter the sub-string :"<<endl;
gets(b);
int q=101; // A Prime Number
search(a,b,q);
return 0;
}

Output :-
rk

Leave a comment