What to do after recursion ends

I apologize if this question does not belong here. My problem is with the algorithm rather than the code, so it may be better suited to another site, but the good people of Stack Overflow have never let me down.

Here is the question:

Given two sorted arrays A and B that have the same number of elements, say n, that share no elements, and in which no element appears twice, find the median of the union of the arrays in logarithmic time.

Very important note: if n is odd, then the median is the middle element. But if n is even, the median is not the average of the middle elements; it is defined as the minimum of the middle elements.
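To make the definition concrete: the union always has 2n elements, so under the rule above the median is the smaller of the two middle elements, which is the element at 0-based position n-1 of the sorted union. The following is a minimal linear-time sketch that can serve as a reference when checking a faster solution. The function name `lower_median_by_merge` is hypothetical, and the code assumes both arrays are sorted, have length n, and contain distinct values.

```c
#include <assert.h>
#include <stddef.h>

/* Reference check in O(n): walk both sorted arrays in merge order and
   stop at the n-th smallest element of the union (0-based index n-1),
   which is the smaller of the two middle elements. */
int lower_median_by_merge(const int *a, const int *b, size_t n)
{
    size_t i = 0, j = 0;
    int current = 0;

    assert(n > 0);
    /* Only n elements are consumed in total, so i and j both stay
       below n whenever a[i] and b[j] are read. */
    for (size_t taken = 0; taken < n; taken++) {
        if (a[i] < b[j])
            current = a[i++];
        else
            current = b[j++];
    }
    return current;
}
```

This does not meet the logarithmic bound; it is only meant as a simple oracle for testing.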

Solution: The idea is quite simple. Since the arrays are sorted, we can find the median of A (called med1) and the median of B (called med2) in O(1). If med1>med2, then we know that the median of the union is an element of A that is smaller than med1 or an element of B that is larger than med2, and the reverse if med2>med1. So we throw away the redundant elements and repeat the process until A and B are sufficiently small, say with 2 elements each, and then we just need to find the median of these 4 numbers. Since 4 is even, the median of 4 numbers is the second smallest, which can be found in O(1).

This is my code:

#include<stdio.h>
#include<stdlib.h>
#include<conio.h>
int *scan_array(int* array_length);
int second_min_four_numbers(int a,int b,int c,int d);
int first_question(int *arr1,int *arr2,int left1,int right1,int left2,int right2);
int main(void)
{
    int *arr1,*arr2,length_arr1=0,length_arr2=0;
    printf("For the first sorted array:\n");
    arr1=scan_array(&length_arr1);
    printf("\nFor the second sorted array, enter %d numbers:\n",length_arr1);
    arr2=scan_array(&length_arr2);
    if(length_arr1==1) //edge case, arrays are length one. return the min
    {
        if(arr1[0] > arr2[0])
            printf("The Median is %d",arr2[0]);
        else
            printf("The Median is %d",arr1[0]);
    }
    else
        printf("The Median is %d",first_question(arr1,arr2,0,length_arr1-1,0,length_arr2-1));
    getch();
}
int *scan_array(int* array_length) //nothing fancy. just scan the arrays.
{
    int* temp,temp_length,array_element,i=0,*real_array;
    temp=(int*)malloc(50*sizeof(int));
    printf("Enter positive numbers. To stop enter negative or zero.\nDon't enter more than 50 numbers\n");
    scanf("%d",&array_element);
    while(array_element>0)
    {
        (*array_length)++;
        temp[i]=array_element;
        i++;
        scanf("%d",&array_element);
    }
    real_array=(int*)malloc((*array_length)*sizeof(int));
    for(i=0;i<*array_length;i++)
        real_array[i]=temp[i];
    free(temp);
    return real_array;
}
int first_question(int *arr1,int *arr2,int left1,int right1,int left2,int right2) 
{
    int med1,med2;
    if(right1-left1+right2-left2 == 2) //we are done. reached 4 elements. we will always be here for arrays larger than 1 element each
        return second_min_four_numbers(arr1[left1],arr1[right1],arr2[left2],arr2[right2]);
    med1=arr1[(left1+right1)/2]; //not done. find the medians in O(1).
    med2=arr2[(left2+right2)/2];
    if(med1 < med2)//the median of the union is somewhere between them
        return first_question(arr1,arr2,(left1+right1)/2,right1,left2,(left2+right2)/2);
    else
        return first_question(arr1,arr2,left1,(left1+right1)/2,(left2+right2)/2,right2);
}
int second_min_four_numbers(int a,int b,int c,int d) //find second min between four numbers
{
    int min=0,second_min=0; //very crude, and inefficient but simple to understand and still O(1)
    min = a;
    if(min > b)
        min = b;
    if(min > c)
        min = c;
    if(min > d)
        min = d;
    if(a == min) 
    {
        second_min=b;
        if(second_min > c)
            second_min = c;
        if(second_min > d)
            second_min = d;
        return second_min;
    }
    if(b == min)
    {
        second_min=a;
        if(second_min > c)
            second_min=c;
        if(second_min > d)
            second_min = d;
        return second_min;
    }
    if(c == min)
    {
        second_min=a;
        if(second_min > b)
            second_min = b;
        if(second_min > d)
            second_min = d;
        return second_min;
    }
    if(d == min)
    {
        second_min=a;
        if(second_min > b)
            second_min=b;
        if(second_min > c)
            second_min=c;
        return second_min;
    }
}

It compiles and runs as intended. As I said, the problem is not with my code but with the algorithm. Here is an example that demonstrates the problem:

Suppose our input is A=[1,3,5] and B=[2,4,6]. Then med1=3 and med2=4. Throw away the redundant elements and we have A=[3,5] and B=[2,4]. Now there are only 4 elements overall, so the data is sufficiently small and we just find the median of the 4 numbers [3,5,2,4]. The median is 3, which is also the median of the union of A and B, so the result is correct.

Now assume our input is A=[1,3,5,7] and B=[2,4,6,8]. Then med1=3 and med2=4. Throw away the redundant elements to get A=[3,5,7] and B=[2,4]. Now med1=5 and med2=2. Again throw away the redundant elements to get A=[3,5] and B=[2,4]. The data is now sufficiently small, so find the median of [3,5,2,4], which again gives 3. But that result is incorrect: 3 is not the median of the union of A and B. The correct result is 4.

How can we fix this problem?


The algorithm needs to implement a binary search for the median, i.e. propose a possible value for the median. If that value is too low, choose a higher value on the next iteration; if it is too high, choose a lower value.

At each iteration, we choose a candidate from A, and choose a candidate from B. The smaller candidate is proposed as the median, and evaluated. If the proposed median is too small, then all smaller values from A and B can be removed from consideration. Likewise, if the proposed median is too large, then larger values from A and B can be ignored.
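One way to evaluate a proposed median is to count how many elements of the union are smaller than it. With 2n distinct elements and the question's definition (the smaller of the two middle elements), the median has exactly n-1 smaller elements. The sketch below illustrates that test; the names `count_less` and `judge_candidate` are hypothetical, and it assumes equal-length sorted arrays with distinct values and a candidate taken from one of them.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in sorted arr[0..n) strictly less than value,
   found by binary search. */
static size_t count_less(const int *arr, size_t n, int value)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (arr[mid] < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Returns -1 if candidate is too low, +1 if it is too high, and 0 if
   it has exactly n-1 smaller elements in the union, i.e. it is the
   median as the question defines it. */
int judge_candidate(const int *a, const int *b, size_t n, int candidate)
{
    size_t smaller;

    assert(n > 0);
    smaller = count_less(a, n, candidate) + count_less(b, n, candidate);
    if (smaller < n - 1)
        return -1;
    if (smaller > n - 1)
        return 1;
    return 0;
}
```

Counting with a binary search costs O(log n) per evaluation, so using this test inside an outer binary search would give O(log² n) overall. When the candidate's own index is known, the count in its array is free and the other array can be checked in O(1).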

For example, given A=[1,2,7,19,22] the candidate from A would be 7. Assume that B proposes a larger candidate, so 7 is chosen as the possible median. If 7 is too low, then we can eliminate all elements <= 7 in both A and B as possible candidates. So A becomes A=[1,2,7,{19,22}] where the elements in curly braces are the remaining possible candidates for the median. The process is repeated, but this time the candidate from A would be 19.

To continue the example, let's say that B=[20,25,26,27]. The proposed candidate from B is 25. A's candidate is lower, so we evaluate 19. List A has 3 values lower than 19 and 1 higher. List B has 4 values higher. Total: 3 lower, 5 higher. Conclusion: 19 is too low, so eliminate as possible candidates all numbers <= 19. After two passes we have

A=[1,2,7,19,{22}]  B=[{20,25,26,27}]

A's candidate is 22, B's is 25, propose 22 as the median. 22 is too high so numbers >= 22 can be ignored and we have

A=[1,2,7,19,{},22]  // 19 was too low and 22 was too high, so no candidates are left in A
B=[{20},25,26,27]   // 22 was too high, so the only remaining candidate in B is 20

20 is the only remaining candidate in either list, and is therefore the answer.
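The propose-and-narrow idea can be sketched in C for the question's setting (two sorted arrays of equal length n with distinct values). This is a variant, not a transcription of the procedure above: it binary-searches A alone first and falls back to B if no element of A qualifies, instead of tracking candidate windows in both arrays at once. A candidate xs[i] has i smaller elements in its own array, so it is the median exactly when n-1-i elements of the other array are smaller, which takes O(1) to check. The names `find_in` and `lower_median_binary_search` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Binary-search sorted xs for the element that has exactly n-1 smaller
   elements in the union of xs and ys. Returns 1 and stores it in *out
   on success, 0 if no element of xs qualifies. */
static int find_in(const int *xs, const int *ys, size_t n, int *out)
{
    size_t lo = 0, hi = n;                 /* candidates are xs[lo..hi) */

    while (lo < hi) {
        size_t i = lo + (hi - lo) / 2;
        size_t need = n - 1 - i;           /* ys elements that must be smaller */

        if (ys[need] < xs[i]) {
            hi = i;                        /* more than `need` are smaller: too high */
        } else if (need > 0 && ys[need - 1] > xs[i]) {
            lo = i + 1;                    /* fewer than `need` are smaller: too low */
        } else {
            *out = xs[i];
            return 1;
        }
    }
    return 0;
}

/* Median of the union as the question defines it (smaller middle element). */
int lower_median_binary_search(const int *a, const int *b, size_t n)
{
    int result = 0;

    assert(n > 0);
    if (!find_in(a, b, n, &result)) {
        /* The median belongs to one of the arrays, so this must succeed. */
        int found = find_in(b, a, n, &result);
        assert(found);
        (void)found;
    }
    return result;
}
```

Each call to `find_in` halves its window per step, so the whole search is O(log n). On the question's failing input, A=[1,3,5,7] and B=[2,4,6,8], the search of A rejects 5 as too high and 3 as too low, and the search of B then accepts 4.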


Let me suggest a different way of conceptualizing this problem. Suppose there are 4 elements in each array. Consider this grid:

a1 a2 a3 a4
b1 b2 b3 b4

We are looking for a line through the center of the arrangement, which guarantees that the number of entries left of the line and the number of entries right of the line are equal. Note also that there are two different horizontal lines that divide the entries (smaller above or smaller below). So the number of lines we need to consider is 5 in this case, and n+1 in general. Now, a binary search through the lines ought to do the trick.
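One possible reading of this idea in C: number the n+1 lines by k, the count of elements of A that fall left of the line, so that B contributes the remaining n-k. The line is valid when everything on the left is smaller than everything on the right, and the median under the question's definition is then the largest element on the left. The function name `lower_median_by_partition` is hypothetical, and the sketch assumes equal-length sorted arrays with distinct values.

```c
#include <assert.h>
#include <stddef.h>

/* Binary search over the n+1 dividing lines. Line k puts a[0..k) and
   b[0..n-k) on the left, which is n elements in total. */
int lower_median_by_partition(const int *a, const int *b, size_t n)
{
    size_t lo = 0, hi = n;

    assert(n > 0);
    while (lo < hi) {
        size_t k = lo + (hi - lo) / 2;     /* k < n, so a[k] and b[n-k-1] exist */
        if (b[n - k - 1] > a[k])
            lo = k + 1;                    /* a[k] belongs on the left: move line right */
        else
            hi = k;
    }
    /* lo is now the valid line; k == 0 and k == n are the cases where
       the left side comes entirely from one array. */
    if (lo == 0)
        return b[n - 1];
    if (lo == n)
        return a[n - 1];
    /* The median is the larger of the last elements on the left. */
    return a[lo - 1] > b[n - lo - 1] ? a[lo - 1] : b[n - lo - 1];
}
```

For A=[1,3,5,7] and B=[2,4,6,8] the search settles on k=2, leaving 1, 3 and 2, 4 on the left, so the result is 4.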
