BFPRT algorithm

1, Let's start with a question

How do you find the k-th smallest value in an unsorted array?

Many people's first solution is a max-heap: push the elements through the heap and poll until the k-th value is reached. That costs O(n log n) in the worst case. Is there an O(n) algorithm? Yes: the BFPRT algorithm.
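For comparison, the heap approach can be sketched as follows. This is one common variant, not necessarily the one the author had in mind: it keeps a max-heap capped at k elements, which costs O(n log k) and degrades to O(n log n) when k is close to n. The class name HeapKthSketch and the method name kthSmallest are made up for this illustration.

```java
import java.util.Collections;
import java.util.PriorityQueue;

final class HeapKthSketch {
    /** Returns the k-th smallest value (k is 1-based) using a max-heap capped at k elements. */
    static int kthSmallest(int[] arr, int k) {
        if (k < 1 || k > arr.length) {
            throw new IllegalArgumentException("k out of range");
        }
        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        for (int value : arr) {
            maxHeap.offer(value);
            if (maxHeap.size() > k) {
                maxHeap.poll(); // drop the largest, keeping the k smallest seen so far
            }
        }
        return maxHeap.peek(); // the largest of the k smallest values
    }

    public static void main(String[] args) {
        int[] arr = {5, 7, 12, 4, 2, 7, 2, 5, 6, 0, 1, 2};
        System.out.println(kthSmallest(arr, 9)); // prints 6
    }
}
```

Every element passes through the heap once, so the log factor never goes away; that is the cost BFPRT removes.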

2, Groundwork

Here you need to know the solution to the Dutch national flag problem: pick a pivot (reference) value, then move the smaller elements to the left, the larger elements to the right, and the elements equal to the pivot into the middle. This is the partition step of quicksort. Take the array {5,7,12,4,2,7,2,5,6,0,1,2}. If the chosen pivot is 6, then after partitioning the array looks like this (the order inside each region may differ):

{5,4,2,2,5,0,1,2 | 6 | 7,12,7}

This is exactly the Dutch flag layout. We now record the left boundary l and the right boundary r of the middle region (the 6s) and check whether index k-1 lies between l and r. If it does, the answer is the pivot, 6. If not, we recurse into the left part when k-1 < l, or into the right part when k-1 > r. Repeating this always finds the value. With a randomly chosen pivot the time complexity is O(n), but only in expectation.
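The procedure just described can be sketched as a loop. This is a minimal illustration that assumes the pivot is chosen uniformly at random from the current range, which is what gives the expected O(n) bound. The names QuickSelectSketch and kthSmallest are invented here and are not part of the code later in this post.

```java
import java.util.concurrent.ThreadLocalRandom;

final class QuickSelectSketch {
    /** Returns the k-th smallest value (k is 1-based); works on a copy of the input. */
    static int kthSmallest(int[] arr, int k) {
        if (k < 1 || k > arr.length) {
            throw new IllegalArgumentException("k out of range");
        }
        int[] a = arr.clone();
        int lo = 0, hi = a.length - 1, target = k - 1;
        while (true) {
            // Assumption: random pivot taken from the current range a[lo..hi]
            int pivot = a[ThreadLocalRandom.current().nextInt(lo, hi + 1)];
            // Dutch flag partition of a[lo..hi] around the pivot
            int small = lo - 1, cur = lo, big = hi + 1;
            while (cur != big) {
                if (a[cur] < pivot) {
                    swap(a, ++small, cur++);
                } else if (a[cur] > pivot) {
                    swap(a, cur, --big);
                } else {
                    cur++;
                }
            }
            int l = small + 1, r = big - 1; // a[l..r] all equal the pivot
            if (target < l) {
                hi = l - 1;       // continue in the left part
            } else if (target > r) {
                lo = r + 1;       // continue in the right part
            } else {
                return a[target]; // target index lies in the middle region
            }
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] arr = {5, 7, 12, 4, 2, 7, 2, 5, 6, 0, 1, 2};
        System.out.println(kthSmallest(arr, 9)); // prints 6
    }
}
```

The middle region always contains at least the pivot itself, so the range shrinks on every pass and the loop terminates.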

Consider the extreme case where every pivot leaves nothing on the right and everything else on the left. For n elements we get T(n) = T(n-1) + n, so the time complexity is O(n^2), which defeats the purpose. So the real question is how to choose the pivot (the 6 above). If we can guarantee that the parts to the left and right of the pivot each hold at least a constant fraction of the elements, we get O(n) in the worst case. This bound is deterministic, not based on probability. The BFPRT algorithm is exactly a way of choosing such a pivot.
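Unrolling the worst-case recurrence makes the quadratic bound explicit. This is a sketch that assumes T(1) = 1 and that one partition pass over n elements costs exactly n:

```latex
T(n) = T(n-1) + n = T(n-2) + (n-1) + n = \dots = 1 + 2 + \dots + n = \frac{n(n+1)}{2} = O(n^2)
```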

3, The main course

1. A little analysis

For example, take the array arr = [1,2,4,1,2,5,7,3,21,7,3,1,6,0,6,3,1,34,65,62,2,1,5,7,10].

First, divide it into groups of five (if fewer than 5 numbers remain at the end, they form a group of their own):

[1,2,4,1,2] [5,7,3,21,7] [3,1,6,0,6] [3,1,34,65,62] [2,1,5,7,10]

Then sort each group. A group holds at most 5 numbers, a constant, so sorting one group takes O(1) time; with n/5 groups the total is O(n). Note that the sorting happens only inside each group, not across the whole array. After sorting, the groups look like this:

[1,1,2,2,4] [3,5,7,7,21] [0,1,3,6,6] [1,3,34,62,65] [1,2,5,7,10]

Then take the median to form a new array:

The new array is [2,7,3,34,5]. We then run BFPRT on this new array, again in groups of five, but now the k we are looking for is the middle position of the new array, because what we want from it is its median. In this example the median is 5 (sorted, the new array is [2,3,5,7,34]). We take 5 as the pivot for the original array, put the smaller numbers on its left and the larger ones on its right, and check whether the k we are looking for falls in the middle (equal) region. If it does, we return 5 directly; if not, we recurse into the left or the right part.
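The grouping step on this example can be reproduced with a few lines of Java. This sketch uses Arrays.sort on a copy of each group for brevity, whereas the code later in this post sorts the groups in place with insertion sort. The names GroupMediansSketch and groupMedians are made up for the illustration.

```java
import java.util.Arrays;

final class GroupMediansSketch {
    /** Splits arr into groups of five (the last group may be shorter) and returns each group's median. */
    static int[] groupMedians(int[] arr) {
        int groups = (arr.length + 4) / 5; // round up
        int[] medians = new int[groups];
        for (int g = 0; g < groups; g++) {
            int from = g * 5;
            int to = Math.min(from + 5, arr.length); // exclusive end
            int[] group = Arrays.copyOfRange(arr, from, to);
            Arrays.sort(group); // at most 5 elements, so constant work per group
            medians[g] = group[group.length / 2]; // upper middle for even-sized groups
        }
        return medians;
    }

    public static void main(String[] args) {
        int[] arr = {1, 2, 4, 1, 2, 5, 7, 3, 21, 7, 3, 1, 6, 0, 6,
                     3, 1, 34, 65, 62, 2, 1, 5, 7, 10};
        int[] medians = groupMedians(arr);
        System.out.println(Arrays.toString(medians)); // [2, 7, 3, 34, 5]

        int[] sorted = medians.clone();
        Arrays.sort(sorted);
        System.out.println(sorted[sorted.length / 2]); // 5, the median of the medians
    }
}
```

Sorting the five medians here is only a shortcut for the demo; the real algorithm finds their median by calling itself recursively.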

Let's see why this reaches O(n). Suppose the array has n elements. Grouping by five gives about n/5 groups, and their medians form a new array of n/5 numbers. In that array, about n/10 numbers, that is n/(5*2), are greater than or equal to its median, which is our pivot. Each of those n/10 medians comes from a group in which 3 numbers (the median itself and the two above it) are greater than or equal to it, and therefore greater than or equal to the pivot. So at least 3n/10 numbers in the original array are greater than or equal to the pivot, which means at most 7n/10 numbers are smaller than it. By the same argument, at most 7n/10 numbers are larger than it. The extreme case described earlier can therefore never occur.

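The time complexity satisfies T(N) = T(N/5) + T(7N/10) + O(N): T(N/5) to find the median of the medians, T(7N/10) for the larger side of the recursion at worst, and O(N) for grouping and partitioning. This solves to O(N).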
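One way to check that the recurrence solves to O(N) is the substitution method. The sketch below assumes the O(N) term is at most aN for some constant a, and guesses that T(m) <= cm holds for all m < N; the constants a and c are names introduced here, not part of the original analysis.

```latex
\begin{aligned}
T(N) &\le T\!\left(\tfrac{N}{5}\right) + T\!\left(\tfrac{7N}{10}\right) + aN \\
     &\le \tfrac{cN}{5} + \tfrac{7cN}{10} + aN \\
     &= \tfrac{9c}{10}\,N + aN \\
     &\le cN \quad \text{whenever } c \ge 10a .
\end{aligned}
```

The argument works because 1/5 + 7/10 = 9/10 < 1, so the two recursive calls together always handle a shrinking fraction of the input.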

In short, the most important step is choosing the pivot: the median of the medians.

2. Code

    public static void main(String[] args) {
        int[] arr = { 6, 9, 1, 3, 1, 2, 2, 5, 6, 1, 3, 5, 9, 7, 2, 5, 6, 1, 9 };
        System.out.println(getMinKthByBFPRT(arr, 10));
    }

    /**
     * Returns the K-th smallest number (K is 1-based).
     */
    public static int getMinKthByBFPRT(int[] arr, int K) {
        // Work on a copy so the caller's array is not reordered
        int[] copyArr = arr.clone();
        return select(copyArr, 0, copyArr.length - 1, K - 1);
    }

The main function calls this method, which in turn passes the array, the start position, the end position, and the index to find (K - 1, because indices are 0-based) to select.

    /**
     * BFPRT: returns the value that would sit at index i if arr[begin..end] were sorted.
     */
    public static int select(int[] arr, int begin, int end, int i) {
        if (begin == end) {
            return arr[begin];
        }
        int pivot = medianOfMedians(arr, begin, end);
        int[] pivotRange = partition(arr, begin, end, pivot);
        if (i >= pivotRange[0] && i <= pivotRange[1]) {
            return arr[i];
        } else if (i < pivotRange[0]) {
            return select(arr, begin, pivotRange[0] - 1, i);
        } else {
            return select(arr, pivotRange[1] + 1, end, i);
        }
    }

The select method is the core of the BFPRT algorithm:

(1) If the range being searched starts and ends at the same position, return that element directly.

(2) medianOfMedians returns the median of the medians, which is our pivot.

(3) partition rearranges the range around this pivot: smaller values on the left, larger values on the right, equal values in the middle.

(4) pivotRange[0] is the left boundary of the region equal to the pivot, and pivotRange[1] is its right boundary. If i falls inside this range, arr[i] is returned directly; otherwise select recurses into the left or the right part.

    /**
     * Returns the median of the group medians of arr[begin..end], to be used as the partition value.
     */
    public static int medianOfMedians(int[] arr, int begin, int end) {
        int num = end - begin + 1;
        //Take five numbers as a group
        int offset = num % 5 == 0 ? 0 : 1;
        //Median array
        int[] mArr = new int[num / 5 + offset];
        //Fill the array of median
        for (int i = 0; i < mArr.length; i++) {
            int beginI = begin + i * 5;
            int endI = beginI + 4;
            mArr[i] = getMedian(arr, beginI, Math.min(end, endI));
        }
        //Recursively get median
        return select(mArr, 0, mArr.length - 1, mArr.length / 2);
    }

To get the pivot, split the range into groups of 5, sort each group and take its median, then call select again on the array of medians. This time the index being searched for is the middle of that array, so the value returned is the median of the medians. That value goes back to the caller, which then partitions around it.

    /**
     * Get median
     */
    public static int getMedian(int[] arr, int begin, int end) {
        insertionSort(arr, begin, end);
        int sum = end + begin;
        int mid = (sum / 2) + (sum % 2);
        return arr[mid];
    }

    /**
     * Sort small array
     */
    public static void insertionSort(int[] arr, int begin, int end) {
        for (int i = begin + 1; i != end + 1; i++) {
            for (int j = i; j != begin; j--) {
                if (arr[j - 1] > arr[j]) {
                    swap(arr, j - 1, j);
                } else {
                    break;
                }
            }
        }
    }

    public static void swap(int[] arr, int index1, int index2) {
        int tmp = arr[index1];
        arr[index1] = arr[index2];
        arr[index2] = tmp;
    }

This is the code that sorts each group and returns its median.

    /**
     * Partitions arr[begin..end] around pivotValue: smaller values on the left, larger values on the right.
     * Returns a two-element array: position 0 is the first index equal to the pivot, position 1 is the last.
     */
    public static int[] partition(int[] arr, int begin, int end, int pivotValue) {
        int small = begin - 1;
        int cur = begin;
        int big = end + 1;
        while (cur != big) {
            if (arr[cur] < pivotValue) {
                swap(arr, ++small, cur++);
            } else if (arr[cur] > pivotValue) {
                swap(arr, cur, --big);
            } else {
                cur++;
            }
        }
        int[] range = new int[2];
        range[0] = small + 1;
        range[1] = big - 1;
        return range;
    }

partition is exactly the Dutch flag code.
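To see what the returned range looks like, the sketch below runs the same partition logic on the Dutch flag example array from the groundwork section. The pivot 7 is chosen here because it occurs twice, so the middle region has more than one element. The class name PartitionDemoSketch is made up for this illustration.

```java
import java.util.Arrays;

final class PartitionDemoSketch {
    /** Dutch flag partition of arr[begin..end]; returns {first, last} index of the values equal to pivotValue. */
    static int[] partition(int[] arr, int begin, int end, int pivotValue) {
        int small = begin - 1; // arr[begin..small] holds values < pivotValue
        int big = end + 1;     // arr[big..end] holds values > pivotValue
        int cur = begin;
        while (cur != big) {
            if (arr[cur] < pivotValue) {
                swap(arr, ++small, cur++);
            } else if (arr[cur] > pivotValue) {
                swap(arr, cur, --big); // cur stays: the swapped-in value is not examined yet
            } else {
                cur++;
            }
        }
        return new int[] { small + 1, big - 1 };
    }

    private static void swap(int[] arr, int i, int j) {
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }

    public static void main(String[] args) {
        int[] arr = {5, 7, 12, 4, 2, 7, 2, 5, 6, 0, 1, 2};
        int[] range = partition(arr, 0, arr.length - 1, 7);
        System.out.println(Arrays.toString(range)); // [9, 10]
        System.out.println(Arrays.toString(arr));   // nine values < 7, then 7, 7, then 12
    }
}
```

Nine values are smaller than 7, so the two 7s land at indices 9 and 10. Any search index in that range is answered immediately with 7.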

And with that we have the answer: for the array in main, the 10th smallest value is 5.

Posted by Mistat2000 on Mon, 08 Jun 2020 22:22:38 -0700