Sunday, March 11, 2012

Find a string/pattern in a 2d character array

You are given a 2D array of characters and a character pattern. WAP to find if pattern is present in 2D array. Pattern can be in any way (all 8 neighbors to be considered) but you can’t use same character twice while matching. Return 1 if match is found, 0 if not.

Example: Find “microsoft” in below matrix.

We can see the read color character, which form “Microsoft” in above 2D array.


Strategy:
Manually finding the solution of this problem is relatively intuitive, we just need to describe an algorithm for it. Ironically, describing the algorithm is not the easy part.

How do we do it manually? First we match the first element; if it matched we matched the 2nd element in the 8 neighbors of first match; do this process recursively; when last character of input pattern matches, return true.

During above process, you take care not to use any cell in 2D array twice. For this purpose, you mark every visited cell with some sign. If your pattern matching fails at some place, you start matching from the beginning (of pattern) in remaining cells. While returning, you unmark visited cells.

Lets convert above intuitive method in algorithm. Since we are doing similar checks every time for pattern matching, a recursive solution is what we need here.

In recursive solution, we need to check if the substring passed is matched in the given matrix or not. The condition is not to use the already used cell. For finding already used cell, we need to have another 2D array to the function (or we can use an unused bit in input array itself.) Also, we need the current position of input matrix, from where we need to start. Since we need to pass a lot more information than actually given, we should be having a wrapper function to initialize that extra information to be passed



Algorithm:

If you are past the last character in pattern
    Return true


If you got an used cell again
    Return falseIf you got past the 2D matrix  
    Return false


If searching for first element and cell doesn’t match
    findmatch with next cell in row-first order (or column first order)


otherwise if character matches
    mark this cell as used
    res = findmatch  with next position of pattern in 8 neighbors
    mark this cell as unused
    Return res


Otherwise
    Return false


Code:


#define MAX 100


bool findmatch_wrapper(char mat[MAX][MAX], char *pat, int nrow, int ncol)
{
    if (strlen(pat) > nrow*ncol)
        return false;
    
    int used[MAX][MAX] = {{0,},};


    return findmatch(mat, pat, used, 0, 0, nrow, ncol, 0);
}


//level: index till which pattern is matched
//x, y: current position in 2D array


bool findmatch(char mat[MAX][MAX], char *pat, int used[MAX][MAX], int x, int y, int nrow, int ncol, int level)
{
    if (level == strlen(pat)) //pattern matched
        return true;
        
    if (nrow == x || ncol == y)
        return false;
        
    if (used[x][y])
        return false;    
    
    if (mat[x][y] != pat[level] && level == 0)
    {
        if (x < (nrow - 1))
            return findmatch(mat, pat, used, x+1, y, nrow, ncol, level); //next element in same row
        else if (y < (ncol - 1))
            return findmatch(mat, pat, used, 0, y+1, nrow, ncol, level); //first element from same column
        else 
            return false;
    }    
    else if (mat[x][y] == pat[level])
    {
        bool res;
        //marking this cell as used
        used[x][y] = 1;       
             
        //finding subpattern in 8 neighbours     
        res = (x > 0                                    ? findmatch(mat, pat, used, x-1, y, nrow, ncol, level+1) :   false) ||
              (res = x < (nrow - 1)                     ? findmatch(mat, pat, used, x+1, y, nrow, ncol, level+1) :   false) ||
              (res = y > 0                              ? findmatch(mat, pat, used, x, y-1, nrow, ncol, level+1) :   false) ||
              (res = y < (ncol - 1)                     ? findmatch(mat, pat, used, x, y+1, nrow, ncol, level+1) :   false) ||
              (res = x < (nrow - 1) && (y < ncol -1)    ? findmatch(mat, pat, used, x+1, y+1, nrow, ncol, level+1) : false) ||
              (res = x < (nrow - 1) && y > 0            ? findmatch(mat, pat, used, x+1, y-1, nrow, ncol, level+1) : false) ||
              (res = x > 0 && y < (ncol - 1)            ? findmatch(mat, pat, used, x-1, y+1, nrow, ncol, level+1) : false) ||
              (res = x > 0 && y > 0                     ? findmatch(mat, pat, used, x-1, y-1, nrow, ncol, level+1) : false);


        //marking this cell as unused
        used[x][y] = 0;
        return res;
    }
    else
        return false;
}

1 comment: