Sorting - The smallest k numbers, median of a data stream, the heapq module, two heaps

Keywords: Algorithm leetcode

Sword finger Offer 40. The smallest k numbers

The previous article covered two approaches, the min-heap and quicksort, both of which can solve the k-th element problem.

The official quicksort partitions by swapping elements through two pointers, i and j.
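To see the i/j swapping on its own, here is a minimal sketch of one partition round. The function name `partition` and the sample list are made up for illustration; the loop body follows the same pattern as the solutions in this post, with arr[l] as the pivot.

```python
from typing import List

def partition(arr: List[int], l: int, r: int) -> int:
    """Partition arr[l..r] in place around the pivot arr[l]; return the pivot's final index."""
    pivot = arr[l]
    i, j = l, r
    while i < j:
        # Move j left past elements that are >= pivot
        while i < j and arr[j] >= pivot:
            j -= 1
        # Move i right past elements that are <= pivot
        while i < j and arr[i] <= pivot:
            i += 1
        # arr[i] is too large for the left side and arr[j] too small for the right side
        arr[i], arr[j] = arr[j], arr[i]
    # i == j here and arr[i] <= pivot, so the pivot can be swapped into place
    arr[l], arr[i] = arr[i], arr[l]
    return i

data = [3, 5, 1, 4, 2]
p = partition(data, 0, len(data) - 1)
```

After the call, everything left of index p is no larger than data[p] and everything right of it is no smaller, but neither side is sorted.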

Here is my first, flawed idea:

To find the k-th element, you first need to be able to write quicksort.
After one round of partitioning, everything to the left of the pivot (still unsorted) is no larger than the pivot, and everything to the right (still unsorted) is no smaller:
If the pivot lands at position k, we are done.
If the pivot's position is less than k, go to the right and look for the (k minus pivot position)-th element.
If the pivot's position is greater than k, search on the left, still for the k-th element.
...
I ran into a problem: k counts from 1, while the pivot's index counts from 0.

The mistake is in the right-hand case. There is no need to look for the (k minus pivot position)-th element there, because the recursion works on the same array and the indices do not change. Subtracting may be the right thing to do in some other problems.
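The index bookkeeping can be sketched as follows. `kth_smallest` is a hypothetical helper, not part of the official solution, and it assumes 1 <= k <= len(arr). The 1-based k is converted to a 0-based target index once, and that target is never reduced when the search moves right, because l and r are absolute positions in the same list.

```python
from typing import List

def kth_smallest(arr: List[int], k: int) -> int:
    """Return the k-th smallest element of arr (k counts from 1). Works on a copy."""
    a = list(arr)
    target = k - 1              # convert the 1-based k to a 0-based index once
    l, r = 0, len(a) - 1
    while l < r:
        # One partition round around the pivot a[l]
        i, j = l, r
        while i < j:
            while i < j and a[j] >= a[l]:
                j -= 1
            while i < j and a[i] <= a[l]:
                i += 1
            a[i], a[j] = a[j], a[i]
        a[l], a[i] = a[i], a[l]
        if i == target:
            break
        if i < target:
            l = i + 1           # search the right part; target stays the same
        else:
            r = i - 1           # search the left part; target stays the same
    return a[target]

sample = [3, 5, 1, 4, 2, 2]
```

For example, kth_smallest(sample, 1) gives the minimum of the list and kth_smallest(sample, len(sample)) gives the maximum.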

The official solution checks whether the pivot is the (k+1)-th smallest element (index k), so that everything to its left is exactly the answer.
I check whether the pivot is the k-th smallest element (index k-1) instead:

from typing import List

class Solution:
    def getLeastNumbers(self, arr: List[int], k: int) -> List[int]:
        def quick_sort(arr, l, r, key):
            # Stop when the subarray has at most one element
            if l >= r: return
            # Partition around the pivot arr[l]
            i, j = l, r
            while i < j:
                while i < j and arr[j] >= arr[l]: j -= 1
                while i < j and arr[i] <= arr[l]: i += 1  # <= is fine here
                arr[i], arr[j] = arr[j], arr[i]
            arr[l], arr[i] = arr[i], arr[l]
            # A full quicksort would recurse into both halves:
            # quick_sort(arr, l, i - 1)
            # quick_sort(arr, i + 1, r)
            # Here we recurse only into the half that contains index key
            if i == key:
                return
            elif i < key:
                quick_sort(arr, i + 1, r, key)
            else:
                quick_sort(arr, l, i - 1, key)
        # if k >= len(arr): return arr
        quick_sort(arr, 0, len(arr) - 1, k - 1)  # passing k instead of k - 1 also works
        return arr[:k]  # indices 0 to k-1

Official solution:
Note that k can be used inside the nested function: a nested Python function can read variables from its enclosing scope.

from typing import List

class Solution:
    def getLeastNumbers(self, arr: List[int], k: int) -> List[int]:
        if k >= len(arr): return arr
        def quick_sort(l, r):
            # Partition arr[l..r] around the pivot arr[l]
            i, j = l, r
            while i < j:
                while i < j and arr[j] >= arr[l]: j -= 1
                while i < j and arr[i] <= arr[l]: i += 1
                arr[i], arr[j] = arr[j], arr[i]
            arr[l], arr[i] = arr[i], arr[l]
            # The pivot now sits at index i; recurse only toward index k
            if k < i: return quick_sort(l, i - 1)
            if k > i: return quick_sort(i + 1, r)
            # i == k: arr[:k] holds the k smallest numbers
            return arr[:k]

        return quick_sort(0, len(arr) - 1)
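The way the nested quick_sort reads k and arr without receiving them as parameters can be seen in isolation. The names `make_checker` and `is_target` below are invented for this sketch.

```python
def make_checker(k: int):
    def is_target(i: int) -> bool:
        # k is neither a parameter nor a local of is_target;
        # Python finds it in the enclosing function's scope
        return i == k
    return is_target

check = make_checker(3)
hit = check(3)
miss = check(2)
```

Reading an outer variable works as shown. Rebinding it from inside the nested function (for example `k = k - 1`) would need a `nonlocal k` declaration first.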

Alternatively, use the min-heap from the heapq module and solve it in a few lines.
nsmallest(k, arr) returns a list of the k smallest elements of arr, in ascending order:

from heapq import heapify, nsmallest
from typing import List

class Solution:
    def getLeastNumbers(self, arr: List[int], k: int) -> List[int]:
        heapify(arr)  # not required by nsmallest, which accepts any iterable
        res = nsmallest(k, arr)
        return res

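Or build the heap and pop from it k times:

from heapq import heapify, heappop
from typing import List

class Solution:
    def getLeastNumbers(self, arr: List[int], k: int) -> List[int]:
        heapify(arr)  # turn arr into a min-heap in place
        res = []
        for _ in range(k):
            res.append(heappop(arr))  # each pop removes the current minimum
        return res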
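Both heap variants can be checked against a plain sort. This is a small sketch with invented helper names; unlike the solutions above, it copies the list before heapifying so the caller's list is left untouched, and it caps the number of pops at the list length.

```python
from heapq import heapify, heappop, nsmallest
from typing import List

def k_smallest_nsmallest(arr: List[int], k: int) -> List[int]:
    # nsmallest accepts any iterable; no heapify is needed beforehand
    return nsmallest(k, arr)

def k_smallest_pop(arr: List[int], k: int) -> List[int]:
    heap = list(arr)        # copy so the input is not mutated
    heapify(heap)           # build a min-heap in O(n)
    # Each heappop removes the current minimum in O(log n)
    return [heappop(heap) for _ in range(min(k, len(heap)))]

data = [4, 5, 1, 6, 2, 7, 3, 8]
a = k_smallest_nsmallest(data, 4)
b = k_smallest_pop(data, 4)
```

Both calls return the same four numbers in ascending order, matching sorted(data)[:4].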

Sword finger Offer 41. Median of a data stream

How do you get the median of a data stream? If an odd number of values have been read from the stream, the median is the middle value after all values are sorted. If an even number of values have been read, the median is the average of the two middle values after all values are sorted.

Design a data structure that supports the following two operations:

void addNum(int num) - adds an integer from the data stream to the data structure.
double findMedian() - returns the median of all current elements.

All I could come up with was this: start with an empty sorted sequence in init, insert each new element at its correct position, keep track of the length, and read the median off directly.
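That sorted-list idea can be sketched with the standard bisect module. The class name `SortedListMedian` is made up here, and the sketch assumes findMedian is only called after at least one number has been added.

```python
from bisect import insort

class SortedListMedian:
    def __init__(self) -> None:
        self.data = []              # always kept in ascending order

    def addNum(self, num: int) -> None:
        # Binary search finds the position; the list insert itself is O(n)
        insort(self.data, num)

    def findMedian(self) -> float:
        n = len(self.data)          # len() replaces a separately maintained counter
        mid = n // 2
        if n % 2 == 1:
            return float(self.data[mid])
        return (self.data[mid - 1] + self.data[mid]) / 2.0

m = SortedListMedian()
m.addNum(1)
m.addNum(2)
first = m.findMedian()
m.addNum(3)
second = m.findMedian()
```

insort places a new value to the right of any equal values. For the median this choice does not matter, since equal values are interchangeable. The cost is the O(n) insert, which is what the two-heap approach avoids.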

Now look at how the official solution uses two heaps to find the median.

Binary insertion sort: should the new element go to the left or to the right of its equals?
See binary insertion on LeetCode.


Key point:
Whichever heap a number is destined for, A or B, it must first pass through the other heap.
A exposes its minimum, which can be handed to the smaller half; B exposes its maximum, which can be handed to the larger half.

The heapq functions used here (heappush, heappop) implement a min-heap only. To simulate a max-heap, negate each element when pushing it onto the min-heap and negate it again when popping it.
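A minimal sketch of the negation trick, using only heappush and heappop on a made-up list of numbers:

```python
from heapq import heappush, heappop

max_heap = []
for x in [3, 1, 4, 1, 5]:
    heappush(max_heap, -x)      # store the negated value

largest = -heappop(max_heap)    # negate again on the way out
next_largest = -max_heap[0]     # peek at the new maximum without removing it
```

The smallest stored value is the negation of the largest original value, so the min-heap's top always corresponds to the current maximum.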

from heapq import heappush, heappop

class MedianFinder:
    def __init__(self):
        """
        initialize your data structure here.
        """
        self.A = []  # Min-heap, holds the larger half
        self.B = []  # Max-heap (values stored negated), holds the smaller half
        # Number of elements added so far
        self.nums = 0

    def addNum(self, num: int) -> None:
        if self.nums % 2 == 0:  # Even count: pass num through B so that A gains one element
            heappush(self.B, -num)
            heappush(self.A, -heappop(self.B))
        else:  # Odd count: A holds one more than B, so pass num through A and B gains one element
            heappush(self.A, num)
            heappush(self.B, -heappop(self.A))
        self.nums += 1

    def findMedian(self) -> float:
        # Note the minus sign before self.B[0]: B stores negated values, so its
        # smallest stored value is the negation of the smaller half's maximum
        return self.A[0] if self.nums % 2 == 1 else (self.A[0] - self.B[0]) / 2.0

# Your MedianFinder object will be instantiated and called as such:
# obj = MedianFinder()
# obj.addNum(num)
# param_2 = obj.findMedian()
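As a sanity check, the same two-heap scheme can be written as a plain function and compared with statistics.median on every prefix of a stream. The function name `running_medians` and the sample stream are invented for this sketch; the asserts inside the loop spell out the invariants the scheme relies on.

```python
from heapq import heappush, heappop
from statistics import median
from typing import List

def running_medians(stream: List[int]) -> List[float]:
    A: List[int] = []   # min-heap holding the larger half
    B: List[int] = []   # max-heap (negated values) holding the smaller half
    out = []
    for count, num in enumerate(stream):
        if count % 2 == 0:
            heappush(B, -num)
            heappush(A, -heappop(B))
        else:
            heappush(A, num)
            heappush(B, -heappop(A))
        # Invariants: A has the same size as B or one more,
        # and the largest value in B is no bigger than the smallest in A
        assert len(A) - len(B) in (0, 1)
        assert not B or -B[0] <= A[0]
        out.append(float(A[0]) if len(A) > len(B) else (A[0] - B[0]) / 2.0)
    return out

stream = [5, 2, 8, 1, 9, 3]
expected = [float(median(stream[:n])) for n in range(1, len(stream) + 1)]
result = running_medians(stream)
```

Each addition costs O(log n) and each median lookup is O(1), compared with the O(n) insert of a sorted list.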

Posted by DangerousDave86 on Sat, 09 Oct 2021 21:58:01 -0700