Problem
There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).Algorithm
If A.length + B.length is odd, the median is well defined, otherwise return the average of lower median and upper median.
Find the median of merged array has an O(m+n) running time.
The kth Smallest in Two Sorted Arrays.
The kth smallest element is defined as the element larger than k-1 other elements in the array.
Assume, array A has length m and array B has length n, both are larger than k/2.
Let ka = k/2 and kb = k-ka. Compare A[ka-1] and B[kb-1],
- A[ka-1] == B[kb-1]. There are ka-1 (kb-1) elements in A (B) smaller than A[ka-1], therefore ka+kb-1 = k -1 elements smaller than A[ka-1], which is the kth smallest.
- A[ka-1] < B[kb-1], drop A[start to ka] and B[kb to end].
- A[ka-1] > B[kb-1], drop A[ka to end] and B[start to kb].
- Let x be an element in the range A[start to ka], x < A[ka-1] < B[kb-1]. There are less than ka -1 elements smaller than x in A, and less than kb - 1 elements in B smaller than x. In total, there are less than k -1 elements smaller than x. Therefore, x cannot be the kth smallest.
- Let y be an element in the range B[kb to end], y > B[kb-1] > A[ka-1]. There are more than kb elements smaller than y and more than ka elements smaller than y, therefore in total more than k elements smaller than y. So y cannot be the kth smallest.
So, we can drop region A[start to ka] and region B[kb to end] in case 2.
Run Time Complexity
Assume A[ka-1] < B[kb-1]. If k is small, we drop a small part of A, and a large part of B. If k is large, we drop large part of A and small part of B. Therefore, each divide, we reduce the size of problem by 2. Initially, the size is m+n, therefore, the running time is O(log(m+n)).
Assume A[ka-1] < B[kb-1]. If k is small, we drop a small part of A, and a large part of B. If k is large, we drop large part of A and small part of B. Therefore, each divide, we reduce the size of problem by 2. Initially, the size is m+n, therefore, the running time is O(log(m+n)).
T(m+n) = T((m+n)/2) + O(1) = log(m+n).
No comments:
Post a Comment