What's time complexity of this algorithm for Wildcard Matching?
Wildcard Matching
Implement wildcard pattern matching with support for ' ? ' and ' * '.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Some examples:
Question:
Personally, I think
Have tried:
Write out Time complexity Expression, then draw recursion tree:
TC Expression => T(n) = T(n - 1) + O(1), when pChar == '?' or pChar == sChar,
= T(n - 1) + T(n - 1) + O(1), when pChar == '*'.
I tried to draw recursion tree, but can not figure out how to draw it based on this kind of Time Complexity Expression.
Additional question:
Accurately, I hope to know how to calculate the time complexity for this kind of recursion , which has multi-unforeseen-branch based on input.
Note:
Code: Java, Solution: Recursion.
public class Solution {
public boolean isMatch(String s, String p) {
// Input checking.
if (s == null || p == null) return false;
int sLen = s.length();
int pLen = p.length();
return helper(s, 0, sLen, p, 0, pLen);
}
private boolean helper(String s, int sIndex, int sLen,
String p, int pIndex, int pLen) {
// Base case.
if (sIndex >= sLen && pIndex >= pLen) return true;
else if (sIndex >= sLen) {
// Check whether the remaining part of p all "*".
while (pIndex < pLen) {
if (p.charAt(pIndex) != '*') return false;
pIndex ++;
}
return true;
} else if (pIndex >= pLen) {
return false;
}
char sc = s.charAt(sIndex);
char pc = p.charAt(pIndex);
if (pc == '?' || pc == sc) {
return helper(s, sIndex + 1, sLen, p, pIndex + 1, pLen);
} else if (pc == '*') {
return helper(s, sIndex, sLen, p, pIndex + 1, pLen) ||
helper(s, sIndex + 1, sLen, p, pIndex, pLen);
} else return false;
}
}
In order to get an upper bound (ie, big-O) on the worst case running time, you need to assume the very worst. The correct recurrence for an upper bound on the asymptotic running time of matching a string of length s
with a pattern of length p
is as follows.
T(s, p) | s == 0 || p == 0 = 1
| s > 0 && p > 0 = 1 + max(T(s, p - 1) + T(s - 1, p), // *
T(s - 1, p - 1)) // ? or literal
Solving two-variable recurrences like this can be tricky. In this particular case, one can show fairly easily by induction that T
is non-decreasing in both arguments, and so we can simplify the max.
T(s, p) | s == 0 || p == 0 = 1
| s > 0 && p > 0 = 1 + T(s, p - 1) + T(s - 1, p)
Now one, with experience, can recognize the strong resemblance to a recurrence for binomial coefficients and make the (admittedly slightly magical) substitutions s = n - k
and p = k
and T(s, p) = 2 U(n, k) - 1
.
2 U(n, k) - 1 | n == k || k == 0 = 1
| n > k && k > 0 = 1 + 2 U(n - 1, k - 1) - 1 + 2 U(n - 1, k) - 1
U(n, k) | n == k || k == 0 = 1
| n > k && k > 0 = U(n - 1, k - 1) + U(n - 1, k)
We conclude that T(s, p) = 2 U(s + p, p) - 1 = 2 ((s + p) choose p) - 1 = O(2^(s + p)/sqrt(s + p))
by Stirling's approximation (that's the best big-O bound possible in the single quantity s + p
, but it's confusing if I write big-Theta).
So far we have proved only that T(s, p)
is an upper bound. Since *
was the more troublesome case, an idea for the worst case presents itself: make the pattern all *
s. We have to be a little bit careful, because if the match succeeds, then there's some short-circuiting possible. However, it takes very little to prevent a match: consider the string 0000000000
and the pattern **********1
(adjust the number of 0
s and *
as desired). This example shows that the quoted bound is tight to within a polynomial factor (negligible, since the running time already is exponential).
For the purpose of getting just an upper bound, it's not necessary to work out these recurrences nearly so precisely. For example, I might guess that T(s, p) <= 3^(s + p)
and proceed to verify that claim by induction.
T(s, p) | s = 0 || p = 0 = 1 <= 3^(s + p) // base case
| s > 0 || p > 0 = 1 + T(s, p - 1) + T(s - 1, p) // induction
<= 3^(s + p - 1) + 3^(s + p - 1) + 3^(s + p - 1)
= 3^(s + p)
Now, 3^(s + p)
is a valid upper bound, though in light of the rest of this answer it's not tight. One now can look for waste in the bounds; 1 <= 3^(s + p - 1)
, for example, is a gross overestimate, and with some tricks, we can get the exponential base 2
.
The more important order of business, however, is to get an exponential lower bound. From drawing the recursion tree for the bad example above, I might conjecture that T(s, p) >= 2^min(s, p)
. This can be verified by induction.
T(s, p) | s = 0 || p = 0 = 1 >= 2^min(s, p) = 2^0 = 1 // base case
| s > 0 && p > 0 = 1 + T(s, p - 1) + T(s - 1, p) // induction
>= 2^min(s, p - 1) + 2^min(s - 1, p)
>= 2^(min(s, p) - 1) + 2^(min(s, p) - 1)
= 2^min(s, p)
链接地址: http://www.djcxy.com/p/39334.html
上一篇: O递归解决方案的复杂性
下一篇: 通配符匹配算法的时间复杂度是多少?