1.String Pattern Search: Naive Algorithm:
Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that n > m.
Examples:
1) Input:
Pattern found at index 10
2) Input:
Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that n > m.
Examples:
1) Input:
txt[] = "THIS IS A TEST TEXT" pat[] = "TEST"Output:
Pattern found at index 10
2) Input:
txt[] = "AABAACAADAABAAABAA" pat[] = "AABA"Output:
Pattern found at index 0 Pattern found at index 9 Pattern found at index 13
http://www.geeksforgeeks.org/searching-for-patterns-set-1-naive-pattern-searching/
2.String Pattern Search-II -- Rabin Karp Algorithm:
- uses an hashing function;
- preprocessing phase in O(m) time complexity and constant space;
- searching phase in O(mn) time complexity;
- O(n+m) expected running time.
Instead of checking at each position of the text if the pattern occurs or not, it is better to check first if the contents of the current string "window" looks like the pattern or not. In order to check the resemblance between these two patterns, a hashing function is used.
Hashing a string involves computing a numerical value from the value of its characters using a hash function. The Rabin-Karp method uses the rule that if two strings are equal, their hash values must also be equal. Note that the converse of this statement is not always true, but a good hash function tries to reduce the number of such hash collisions.
Rabin-Karp computes hash value of the pattern, and then goes through the string computing hash values of all of its substrings and checking if the pattern's hash value is equal to the substring hash value, and advancing by 1 character every time. If the two hash values are the same, then the algorithm verifies if the two string really are equal, rather than this being a fluke of the hashing scheme. It uses regular string comparison for this final check.
Rabin-Karp is an algorithm of choice for multiple pattern search. If we want to find any of a large number, say k, fixed length patterns in a text, a variant Rabin-Karp that uses a hash table to check whether the hash of a given string belongs to a set of hash values of patterns we are looking for. Other algorithms can search for a single pattern in time order O(n), hence they will search for k patterns in time order O(n*k). The variant Rabin-Karp will still work in time order O(n) in the best and average case because a hash table allows to check whether or not substring hash equals any of the pattern hashes in time order of O(1).
3.String Pattern Search-III -- KMP Algorithm:
Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that n > m.
Input: txt[] = "AABAACAADAABAAABAA"
pat[] = "AABA"
Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13
http://www.geeksforgeeks.org/searching-for-patterns-set-2-kmp-algorithm/
Strategy:
void KMPSearch(char *pat, char *txt)
{
int M = strlen(pat);
int N = strlen(txt);
// create lps[] that will hold the longest prefix suffix values for pattern
int *lps = (int *)malloc(sizeof(int)*M);
int j = 0; // index for pat[]
// Preprocess the pattern (calculate lps[] array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
while(i < N)
{
if(pat[j] == txt[i])
{
j++;
i++; }
if (j == M)
{
printf("Found pattern at index %d \n", i-j);
j = lps[j-1]; }
// mismatch after j matches
else if(pat[j] != txt[i])
{
// Do not match lps[0..lps[j-1]] characters,
// they will match anyway
if(j != 0)
j = lps[j-1];
else
i = i+1;
}
}
free(lps); // to avoid memory leak
}
void computeLPSArray(char *pat, int M, int *lps)
{
int len = 0; // lenght of the previous longest prefix suffix
int i;
lps[0] = 0; // lps[0] is always 0
i = 1;
// the loop calculates lps[i] for i = 1 to M-1
while(i < M)
{
if(pat[i] == pat[len])
{
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
if( len != 0 )
{
// This is tricky. Consider the example AAACAAAA and i = 7.
len = lps[len-1];
// Also, note that we do not increment i here
}
else // if (len == 0)
{
lps[i] = 0;
i++;
}
}
}
}
int main()
{
char *txt = "ABABDABACDABABCABAB";
char *pat = "ABABCABAB";
KMPSearch(pat, txt);
getchar();
return 0;
}
|
#include < stdio.h >
ReplyDelete#include < limits.h >
#include < conio.h >
/* Check if s1 and s2 are anagrams of each other */
int isAnagram(const char *s1, const char *s2)
{
int ha[CHAR_MAX] = {0};
int i;
while ( *s1 && *s2 )
{
ha[*s1++]++;
ha[*s2++]--;
}
if ( *s1 || *s2 )
{
return 0;
}
for ( i = 0; i < CHAR_MAX; ++i )
{
if ( ha[i] )
{
return 0;
}
}
return 1;
}
void anagram_test(const char *s1, const char *s2)
{
printf("isAnagram(\"%s\",\"%s\") = %d\n", s1, s2, isAnagram(s1, s2));
}
int main(void)
{
/** http://en.wikipedia.org/wiki/Anagram */
anagram_test("orchestra", "carthorse");
anagram_test("A decimal point", "I'm a dot in place");
getch();
return 0;
}
/* my output
isAnagram("orchestra","carthorse") = 1
isAnagram("A decimal point","I'm a dot in place") = 0
*/