What would cause an algorithm to have O(log n) complexity?
My knowledge of big-O is limited, and when log terms show up in the equation it throws me off even more.
Can someone maybe explain to me in simple terms what a O(log n)
algorithm is? Where does the logarithm come from?
This specifically came up when I was trying to solve this midterm practice question:
Let X(1..n) and Y(1..n) contain two lists of integers, each sorted in nondecreasing order. Give an O(log n)-time algorithm to find the median (or the nth smallest integer) of all 2n combined elements. For ex, X = (4, 5, 7, 8, 9) and Y = (3, 5, 8, 9, 10), then 7 is the median of the combined list (3, 4, 5, 5, 7, 8, 8, 9, 9, 10). [Hint: use concepts of binary search]
I have to agree that it's pretty weird the first time you see an O(log n) algorithm... where on earth does that logarithm come from? However, it turns out that there's several different ways that you can get a log term to show up in big-O notation. Here are a few:
Repeatedly dividing by a constant
Take any number n; say, 16. How many times can you divide n by two before you get a number less than or equal to one? For 16, we have that
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
Notice that this ends up taking four steps to complete. Interestingly, we also have that log2 16 = 4. Hmmm... what about 128?
128 / 2 = 64
64 / 2 = 32
32 / 2 = 16
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
This took seven steps, and log2 128 = 7. Is this a coincidence? Nope! There's a good reason for this. Suppose that we divide a number n by 2 i times. Then we get the number n / 2i. If we want to solve for the value of i where this value is at most 1, we get
n / 2i ≤ 1
n ≤ 2i
log2 n ≤ i
In other words, if we pick an integer i such that i ≥ log2 n, then after dividing n in half i times we'll have a value that is at most 1. The smallest i for which this is guaranteed is roughly log2 n, so if we have an algorithm that divides by 2 until the number gets sufficiently small, then we can say that it terminates in O(log n) steps.
An important detail is that it doesn't matter what constant you're dividing n by (as long as it's greater than one); if you divide by the constant k, it will take logk n steps to reach 1. Thus any algorithm that repeatedly divides the input size by some fraction will need O(log n) iterations to terminate. Those iterations might take a lot of time and so the net runtime needn't be O(log n), but the number of steps will be logarithmic.
So where does this come up? One classic example is binary search, a fast algorithm for searching a sorted array for a value. The algorithm works like this:
For example, to search for 5 in the array
1 3 5 7 9 11 13
We'd first look at the middle element:
1 3 5 7 9 11 13
^
Since 7 > 5, and since the array is sorted, we know for a fact that the number 5 can't be in the back half of the array, so we can just discard it. This leaves
1 3 5
So now we look at the middle element here:
1 3 5
^
Since 3 < 5, we know that 5 can't appear in the first half of the array, so we can throw the first half array to leave
5
Again we look at the middle of this array:
5
^
Since this is exactly the number we're looking for, we can report that 5 is indeed in the array.
So how efficient is this? Well, on each iteration we're throwing away at least half of the remaining array elements. The algorithm stops as soon as the array is empty or we find the value we want. In the worst case, the element isn't there, so we keep halving the size of the array until we run out of elements. How long does this take? Well, since we keep cutting the array in half over and over again, we will be done in at most O(log n) iterations, since we can't cut the array in half more than O(log n) times before we run out of array elements.
Algorithms following the general technique of divide-and-conquer (cutting the problem into pieces, solving those pieces, then putting the problem back together) tend to have logarithmic terms in them for this same reason - you can't keep cutting some object in half more than O(log n) times. You might want to look at merge sort as a great example of this.
Processing values one digit at a time
How many digits are in the base-10 number n? Well, if there are k digits in the number, then we'd have that the biggest digit is some multiple of 10k. The largest k-digit number is 999...9, k times, and this is equal to 10k + 1 - 1. Consequently, if we know that n has k digits in it, then we know that the value of n is at most 10k + 1 - 1. If we want to solve for k in terms of n, we get
n ≤ 10k+1 - 1
n + 1 ≤ 10k+1
log10 (n + 1) ≤ k + 1
(log10 (n + 1)) - 1 ≤ k
From which we get that k is approximately the base-10 logarithm of n. In other words, the number of digits in n is O(log n).
For example, let's think about the complexity of adding two large numbers that are too big to fit into a machine word. Suppose that we have those numbers represented in base 10, and we'll call the numbers m and n. One way to add them is through the grade-school method - write the numbers out one digit at a time, then work from the right to the left. For example, to add 1337 and 2065, we'd start by writing the numbers out as
1 3 3 7
+ 2 0 6 5
==============
We add the last digit and carry the 1:
1
1 3 3 7
+ 2 0 6 5
==============
2
Then we add the second-to-last ("penultimate") digit and carry the 1:
1 1
1 3 3 7
+ 2 0 6 5
==============
0 2
Next, we add the third-to-last ("antepenultimate") digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
4 0 2
Finally, we add the fourth-to-last ("preantepenultimate"... I love English) digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
3 4 0 2
Now, how much work did we do? We do a total of O(1) work per digit (that is, a constant amount of work), and there are O(max{log n, log m}) total digits that need to be processed. This gives a total of O(max{log n, log m}) complexity, because we need to visit each digit in the two numbers.
Many algorithms get an O(log n) term in them from working one digit at a time in some base. A classic example is radix sort, which sorts integers one digit at a time. There are many flavors of radix sort, but they usually run in time O(n log U), where U is the largest possible integer that's being sorted. The reason for this is that each pass of the sort takes O(n) time, and there are a total of O(log U) iterations required to process each of the O(log U) digits of the largest number being sorted. Many advanced algorithms, such as Gabow's shortest-paths algorithm or the scaling version of the Ford-Fulkerson max-flow algorithm, have a log term in their complexity because they work one digit at a time.
As to your second question about how you solve that problem, you may want to look at this related question which explores a more advanced application. Given the general structure of problems that are described here, you now can have a better sense of how to think about problems when you know there's a log term in the result, so I would advise against looking at the answer until you've given it some thought.
Hope this helps!
When we talk about big-Oh descriptions, we are usually talking about the time it takes to solve problems of a given size. And usually, for simple problems, that size is just characterized by the number of input elements, and that's usually called n, or N. (Obviously that's not always true-- problems with graphs are often characterized in numbers of vertices, V, and number of edges, E; but for now, we'll talk about lists of objects, with N objects in the lists.)
We say that a problem "is big-Oh of (some function of N)" if and only if:
For all N > some arbitrary N_0, there is some constant c, such that the runtime of the algorithm is less than that constant c times (some function of N.)
In other words, don't think about small problems where the "constant overhead" of setting up the problem matters, think about big problems. And when thinking about big problems, big-Oh of (some function of N) means that the run-time is still always less than some constant times that function. Always.
In short, that function is an upper bound, up to a constant factor.
So, "big-Oh of log(n)" means the same thing that I said above, except "some function of N" is replaced with "log(n)."
So, your problem tells you to think about binary search, so let's think about that. Let's assume you have, say, a list of N elements that are sorted in increasing order. You want to find out if some given number exists in that list. One way to do that which is not a binary search is to just scan each element of the list and see if it's your target number. You might get lucky and find it on the first try. But in the worst case, you'll check N different times. This is not binary search, and it is not big-Oh of log(N) because there's no way to force it into the criteria we sketched out above.
You can pick that arbitrary constant to be c=10, and if your list has N=32 elements, you're fine: 10*log(32) = 50, which is greater than the runtime of 32. But if N=64, 10*log(64) = 60, which is less than the runtime of 64. You can pick c=100, or 1000, or a gazillion, and you'll still be able to find some N that violates that requirement. In other words, there is no N_0.
If we do a binary search, though, we pick the middle element, and make a comparison. Then we throw out half the numbers, and do it again, and again, and so on. If your N=32, you can only do that about 5 times, which is log(32). If your N=64, you can only do this about 6 times, etc. Now you can pick that arbitrary constant c, in such a way that the requirement is always met for large values of N.
With all that background, what O(log(N)) usually means is that you have some way to do a simple thing, which cuts your problem size in half. Just like the binary search is doing above. Once you cut the problem in half, you can cut it in half again, and again, and again. But, critically, what you can't do is some preprocessing step that would take longer than that O(log(N)) time. So for instance, you can't shuffle your two lists into one big list, unless you can find a way to do that in O(log(N)) time, too.
(NOTE: Nearly always, Log(N) means log-base-two, which is what I assume above.)
In the following solution, all the lines with a recursive call are done on half of the given sizes of the sub-arrays of X and Y. Other lines are done in a constant time. The recursive function is T(2n)=T(2n/2)+c=T(n)+c=O(lg(2n))=O(lgn).
You start with MEDIAN(X, 1, n, Y, 1, n).
MEDIAN(X, p, r, Y, i, k)
if X[r]<Y[i]
return X[r]
if Y[k]<X[p]
return Y[k]
q=floor((p+r)/2)
j=floor((i+k)/2)
if r-p+1 is even
if X[q+1]>Y[j] and Y[j+1]>X[q]
if X[q]>Y[j]
return X[q]
else
return Y[j]
if X[q+1]<Y[j-1]
return MEDIAN(X, q+1, r, Y, i, j)
else
return MEDIAN(X, p, q, Y, j+1, k)
else
if X[q]>Y[j] and Y[j+1]>X[q-1]
return Y[j]
if Y[j]>X[q] and X[q+1]>Y[j-1]
return X[q]
if X[q+1]<Y[j-1]
return MEDIAN(X, q, r, Y, i, j)
else
return MEDIAN(X, p, q, Y, j, k)
链接地址: http://www.djcxy.com/p/40030.html
上一篇: 为什么二进制搜索是一个分而治之的算法?