C++ Program to Implement Bitap Algorithm for String Matching

This is a C++ program that implements the Bitap algorithm. The bitap algorithm (also known as the shift-or, shift-and or Baeza-Yates–Gonnet algorithm) is best known as an approximate string matching algorithm. In its general form it tells whether a given text contains a substring that is "approximately equal" to a given pattern, where approximate equality is defined in terms of Levenshtein distance: if the substring and the pattern are within a given distance k of each other, the algorithm considers them equal. The program on this page implements the exact-matching case (k = 0) and returns the index of the first occurrence of the pattern. The algorithm begins by precomputing a set of bitmasks, one per character of the alphabet, each containing one bit for each position of the pattern. It then does most of the work with bitwise operations: as long as the pattern fits in one machine word (at most 63 characters here), each text character costs only an OR, a shift and a test.

Here is the source code of the C++ program that implements the Bitap algorithm for string matching. The program compiles and runs on a Linux system. The program output is shown below.

#include <cstdint>
#include <iostream>
#include <string>

using namespace std;

// Returns the index of the first exact occurrence of pattern in text, or -1.
int bitap_search(const string& text, const string& pattern)
{
    const size_t m = pattern.length();
    // Fixed-width unsigned words: 'long' is only 32 bits on some platforms,
    // and shifting into the sign bit of a signed type is not portable.
    uint64_t pattern_mask[256];
    /** Initialize the bit array R (a 0 bit marks a matched prefix) **/
    uint64_t R = ~uint64_t(1);
    if (m == 0)
        return -1;
    if (m > 63)
    {
        cout << "Pattern is too long!\n";
        return -1;
    }

    /** Initialize the pattern bitmasks **/
    for (int i = 0; i <= 255; ++i)
        pattern_mask[i] = ~uint64_t(0);
    // Index through unsigned char so bytes above 127 cannot give a negative index
    for (size_t i = 0; i < m; ++i)
        pattern_mask[static_cast<unsigned char>(pattern[i])] &= ~(uint64_t(1) << i);
    for (size_t i = 0; i < text.length(); ++i)
    {
        /** Update the bit array **/
        R |= pattern_mask[static_cast<unsigned char>(text[i])];
        R <<= 1;
        // Bit m is 0 when all m pattern characters match the text ending at i
        if ((R & (uint64_t(1) << m)) == 0)
            return static_cast<int>(i + 1 - m);
    }
    return -1;
}

void findPattern(const string& t, const string& p)
{
    int pos = bitap_search(t, p);
    if (pos == -1)
        cout << "\nNo Match\n";
    else
        cout << "\nPattern found at position : " << pos << "\n";
}

int main()
{
    cout << "Bitap Algorithm Test\n";
    cout << "Enter Text\n";
    string text;
    cin >> text;
    cout << "Enter Pattern\n";
    string pattern;
    cin >> pattern;
    findPattern(text, pattern);
}

Output:

$ g++ BitapStringMatching.cpp
$ ./a.out
 
Bitap Algorithm Test
Enter Text
DharmendraHingu
Enter Pattern
Hingu
 
Pattern found at position : 10

Sanfoundry Global Education & Learning Series – 1000 C++ Programs.

Manish Bhojasia - Founder & CTO at Sanfoundry