Selection sort, insertion sort, and their comparison - Sorting - Algorithms, 4th Edition

Keywords: Java Algorithm data structure

1, Selection sort

1. Algorithm description

  • First, find the smallest element in the array
  • Second, exchange it with the first element of the array
  • Then find the smallest of the remaining elements and exchange it with the second element of the array
  • Repeat until the entire array is sorted

2. Analysis

  • Comparing elements: each element is compared with the smallest element found so far (the inner loop also increments the current index and checks that it is still within bounds)
  • Exchanging elements: each pass performs exactly one exchange, so the total number of exchanges is N (the array length)

The running time of the algorithm is therefore dominated by the number of comparisons.

Proposition A: for an array of length N, selection sort takes about $N^2/2$ comparisons and N exchanges.

Proof: the code shows that the number of exchanges is N, one per iteration of the outer loop.

The number of comparisons is $(N-1) + (N-2) + \dots + 2 + 1 = N(N-1)/2 \sim N^2/2$.

3. Characteristics

  • Running time is independent of the input (its initial order)
  • Data movement is minimal: each exchange changes the values of two array elements, so selection sort uses N exchanges; the number of exchanges grows linearly with the array size.
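Proposition A and the two characteristics above can be checked by counting operations. The following is a minimal sketch, not part of the original code: the class name SelectionCount and its counters are hypothetical, and it sorts int arrays rather than Comparable objects to stay self-contained.

```java
public class SelectionCount {
    // Hypothetical counters added for illustration only.
    static long compares = 0;
    static long exchanges = 0;

    // Selection sort on an int array, counting compares and exchanges.
    static void sort(int[] a) {
        compares = 0;
        exchanges = 0;
        int n = a.length;
        for (int i = 0; i < n; i++) {
            int min = i;
            for (int j = i + 1; j < n; j++) {
                compares++;
                if (a[j] < a[min]) min = j;
            }
            // One exchange per pass, even when min == i.
            int t = a[i];
            a[i] = a[min];
            a[min] = t;
            exchanges++;
        }
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};

        sort(sorted);
        System.out.println(compares + " compares, " + exchanges + " exchanges");
        sort(reversed);
        System.out.println(compares + " compares, " + exchanges + " exchanges");
    }
}
```

For N = 8 both inputs give 28 compares (8 * 7 / 2) and 8 exchanges, which matches the claim that the counts do not depend on the initial order.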

4. Code implementation

import edu.princeton.cs.algs4.StdIn;
import edu.princeton.cs.algs4.StdOut;

/**
 * Selection sort
 * Algorithm:
 * First, find the smallest element in the array
 * Second, exchange it with the first element of the array
 * Then find the smallest of the remaining elements and exchange it with the second element of the array
 * Repeat until the entire array is sorted
 */
public class Selection {

    /**
     * Sorting method
     * @param a     The array to sort; its elements must implement Comparable
     */
    public static void sort(Comparable[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            int min = i;
            for (int j = i + 1; j < N; j++) {
                if (less(a[j], a[min])) min = j;
            }
            exch(a, i, min);
        }
    }

    /**
     * Compare two elements
     * @param a     First element
     * @param b     Second element
     * @return      true if a is smaller than b, false otherwise
     */
    private static  boolean less(Comparable a, Comparable b) {
        return a.compareTo(b) < 0;
    }

    /**
     * Swap array elements
     * @param a     array
     * @param i     Indexes
     * @param j     Indexes
     */
    private static void exch(Comparable[] a, int i, int j) {
        Comparable t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    /**
     * Print array
     * @param a     array
     */
    private static void show(Comparable[] a) {
        // Single line print array
        for (int i = 0; i < a.length; i++) {
            StdOut.print(a[i] + " ");
        }
        StdOut.println();
    }

    /**
     * Test whether the array is sorted
     * @param a     The array to test
     * @return      true if the array is in order; false otherwise
     */
    public static boolean isSorted(Comparable[] a) {
        // Test whether the array is ordered
        for (int i = 1; i < a.length; i++) {
            if (less(a[i], a[i-1])) return false;
        }
        return  true;
    }

    public static void main(String[] args) {
        // Read strings from standard input, sort them and output them
        String[] a = StdIn.readAllStrings();
        sort(a);
        assert isSorted(a);
        show(a);
    }
}

The main method can also be moved into a separate test class. In that case, change the modifier of show() from private to public.

2, Insertion sort

1. Algorithm description

  1. Insertion sort assumes that the first n-1 elements (where n >= 2) of the sequence are already in order
  2. Insert the nth element into this sorted sequence at the position that keeps the sequence, now of length n, in order
  3. Insert every element in this way until the whole sequence is ordered

2. Analysis

  • All elements to the left of the current index are in order, but their final positions are not yet determined; when the index reaches the right end of the array, the sort is complete
  • The running time of insertion sort depends on the initial order of the elements in the input.

Proposition B. For randomly ordered arrays of length N with distinct keys, insertion sort uses on average ~$N^2/4$ comparisons and ~$N^2/4$ exchanges. The worst case is ~$N^2/2$ comparisons and ~$N^2/2$ exchanges, and the best case is N-1 comparisons and 0 exchanges.
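The best and worst cases in Proposition B can be observed by counting operations on an already sorted array and on a reversed one. This is a minimal sketch under assumptions: the class name InsertionCount and its counters are hypothetical and not part of the original code, and int arrays are used to keep it self-contained.

```java
public class InsertionCount {
    // Hypothetical counters added for illustration only.
    static long compares = 0;
    static long exchanges = 0;

    // Counts every comparison between two array elements.
    static boolean less(int x, int y) {
        compares++;
        return x < y;
    }

    // Insertion sort with the same loop structure as Insertion.sort().
    static void sort(int[] a) {
        compares = 0;
        exchanges = 0;
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--) {
                int t = a[j];
                a[j] = a[j - 1];
                a[j - 1] = t;
                exchanges++;
            }
        }
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};

        sort(sorted);    // best case: N-1 compares, 0 exchanges
        System.out.println(compares + " compares, " + exchanges + " exchanges");
        sort(reversed);  // worst case: N(N-1)/2 compares and exchanges
        System.out.println(compares + " compares, " + exchanges + " exchanges");
    }
}
```

With N = 8 the sorted input takes 7 compares and 0 exchanges, while the reversed input takes 28 compares and 28 exchanges, that is N(N-1)/2 of each.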

3. Code implementation

import edu.princeton.cs.algs4.StdIn;
import edu.princeton.cs.algs4.StdOut;

/**
 * Insertion sort
 * Algorithm:
 * 1. Insertion sort assumes that the first n-1 elements (where n >= 2) of the sequence are already in order
 * 2. Insert the nth element into this sorted sequence at the position that keeps the sequence, now of length n, in order
 * 3. Insert every element in this way until the whole sequence is ordered
 */
public class Insertion {

    /**
     * Sorting method
     * @param a     The array to sort; its elements must implement Comparable
     */
    public static void sort(Comparable[] a) {
        int N = a.length;
        for (int i = 1; i < N; i++) {
            for (int j = i ; j > 0 && less(a[j], a[j - 1]);  j--) {
                exch(a, j, j - 1);
            }

        }
    }

    /**
     * Compare two elements
     * @param a     First element
     * @param b     Second element
     * @return      true if a is smaller than b, false otherwise
     */
    private static  boolean less(Comparable a, Comparable b) {
        return a.compareTo(b) < 0;
    }

    /**
     * Swap array elements
     * @param a     array
     * @param i     Indexes
     * @param j     Indexes
     */
    private static void exch(Comparable[] a, int i, int j) {
        Comparable t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    /**
     * Print array
     * @param a     array
     */
    private static void show(Comparable[] a) {
        // Single line print array
        for (int i = 0; i < a.length; i++) {
            StdOut.print(a[i] + " ");
        }
        StdOut.println();
    }

    /**
     * Test whether the array is sorted
     * @param a     The array to test
     * @return      true if the array is in order; false otherwise
     */
    public static boolean isSorted(Comparable[] a) {
        // Test whether the array is ordered
        for (int i = 1; i < a.length; i++) {
            if (less(a[i], a[i-1])) return false;
        }
        return  true;
    }

    public static void main(String[] args) {
        // Read strings from standard input, sort them and output them
        String[] a = StdIn.readAllStrings();
        sort(a);
        assert isSorted(a);
        show(a);
    }
}

4. Summary

A more general case to consider is a partially ordered array.

An inversion is a pair of elements in an array that are out of order. For example, E X A M P L E has 11 inversions: E-A, X-A, X-M, X-P, X-L, X-E, M-L, M-E, P-L, P-E, and L-E. If the number of inversions in an array is less than a constant multiple of the array size, we say that the array is partially ordered.

Typical partially ordered arrays:

  • An array in which each element is not far from its final position
  • A small array appended to a large sorted array
  • An array in which only a few elements are out of place

Insertion sort is very efficient for such arrays.

Proposition C. The number of exchanges used by insertion sort equals the number of inversions in the array. The number of comparisons is at least the number of inversions and at most the number of inversions plus the array size minus 1.

Proof. Each exchange swaps two adjacent elements that are out of order, which reduces the number of inversions by one. When the number of inversions reaches 0, the array is sorted. Every exchange corresponds to a comparison, and for each i from 1 to N-1 there may be one additional comparison (when a[i] stops moving before it reaches the left end of the array).

To speed up insertion sort, move the larger elements one position to the right in the inner loop instead of always exchanging two elements (this halves the number of array accesses).
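Proposition C can be checked on the E X A M P L E array used earlier. The sketch below is illustrative only: the class name InversionCheck, the brute-force inversions() helper, and the counters are hypothetical additions, not part of the original code.

```java
public class InversionCheck {
    // Hypothetical counters added for illustration only.
    static int compares = 0;
    static int exchanges = 0;

    // Brute-force inversion count: pairs (i, j) with i < j and a[i] > a[j].
    static int inversions(char[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) count++;
        return count;
    }

    static boolean less(char x, char y) {
        compares++;
        return x < y;
    }

    // Insertion sort that counts compares and exchanges.
    static void sort(char[] a) {
        compares = 0;
        exchanges = 0;
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--) {
                char t = a[j];
                a[j] = a[j - 1];
                a[j - 1] = t;
                exchanges++;
            }
        }
    }

    public static void main(String[] args) {
        char[] a = "EXAMPLE".toCharArray();
        int inv = inversions(a);  // counted before sorting
        sort(a);
        System.out.println("inversions=" + inv
                + " exchanges=" + exchanges
                + " compares=" + compares);
        // prints: inversions=11 exchanges=11 compares=16
    }
}
```

The 11 exchanges equal the 11 inversions, and the 16 compares fall between 11 and 11 + 7 - 1 = 17, as the proposition states.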
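One way to realize this improvement is to hold the element being inserted in a temporary variable and shift larger elements right, writing the held element once at the end. The following is a sketch under assumptions, not the author's implementation: the class name InsertionShift is hypothetical, and it uses a generic signature instead of the raw Comparable type of the listings above.

```java
public class InsertionShift {

    // Insertion sort that shifts larger elements one position to the right
    // instead of exchanging adjacent elements.
    public static <T extends Comparable<T>> void sort(T[] a) {
        for (int i = 1; i < a.length; i++) {
            T key = a[i];          // element to insert
            int j = i;
            while (j > 0 && key.compareTo(a[j - 1]) < 0) {
                a[j] = a[j - 1];   // one array write per step instead of two
                j--;
            }
            a[j] = key;            // drop the element into its slot
        }
    }

    public static void main(String[] args) {
        String[] a = "S O R T E X A M P L E".split(" ");
        sort(a);
        System.out.println(String.join(" ", a));
        // prints: A E E L M O P R S T X
    }
}
```

The number of comparisons is unchanged; only the data movement per step is reduced.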

3, Comparison of two sorting algorithms

1. Comparison steps

  1. Implement and debug them
  2. Analyze their basic properties
  3. Guess their relative performance
  4. Test our conjecture with experiments

The first step is already done. Proposition A, Proposition B, and Proposition C make up the second step. Property D below is the third step. The SortCompare class that follows, which compares two sorting algorithms, carries out the fourth step.

These simple steps conceal a lot of algorithm implementation, debugging, analysis, and testing. Every programmer knows that such code is only obtained after long debugging and refinement. Only experts studying the most important algorithms go through the complete research process, but every programmer who uses an algorithm should understand the scientific process behind its performance characteristics.

For sorting, the natural input model used in Proposition A, Proposition B, and Proposition C assumes that the elements of the array are randomly ordered and that the keys are distinct. For applications with many duplicate keys, we need a more complex model.

How can we estimate the performance of insertion sort and selection sort on randomly ordered arrays? From the implementations and from Propositions A, B, and C, the running time of both algorithms on randomly ordered arrays is quadratic. In other words, on such input the running time of insertion sort is proportional to $N^2$ times a small constant, and the running time of selection sort is proportional to $N^2$ times another small constant. These two constants depend on the cost of comparing and exchanging elements on the computer in use. For many data types and typical computers, it is reasonable to assume that these costs are similar. So we state a hypothesis directly.

Property D. For randomly ordered arrays with distinct keys, the running times of insertion sort and selection sort are quadratic, and their ratio is a small constant.

2. Code implementation

Implementation of the SortCompare class, which compares two sorting algorithms

import edu.princeton.cs.algs4.*;

/**
 * Time spent comparing the two algorithms
 */
public class SortCompare {

    /**
     * Calculating the time of an algorithm
     * @param alg       Algorithm name
     * @param a         Random array
     * @return          time
     */
    public static double time(String alg, Double[] a) {
        Stopwatch timer = new Stopwatch();
        if (alg.equals("Insertion")) Insertion.sort(a);
        if (alg.equals("Selection")) Selection.sort(a);
        if (alg.equals("Shell")) Shell.sort(a);
        if (alg.equals("Merge")) Merge.sort(a);
        if (alg.equals("Quick")) Quick.sort(a);
        if (alg.equals("Heap")) Heap.sort(a);
        return timer.elapsedTime();
    }

    /**
     * Total time taken to sort T random arrays of length N
     * @param alg      Algorithm name
     * @param N        Array length
     * @param T        Number of trials
     * @return          Total time
     */
    public static double timeRandomInput(String alg, int N, int T) {
        // The algorithm alg is used to sort T arrays of length N
        double total = 0.0;
        Double[] a = new Double[N];
        for (int t = 0; t < T; t++) {
            //Perform a test (generate an array and sort)
            for (int i = 0; i < N; i++)
                a[i] = StdRandom.uniform();
            total += time(alg, a);
        }

        return total;
    }

    public static void main(String[] args) {
        String alg1 = args[0];
        String alg2 = args[1];

        int N = Integer.parseInt(args[2]);
        int T = Integer.parseInt(args[3]);

        // Algorithm 1 total time
        double t1 = timeRandomInput(alg1, N, T);
        // Algorithm 2 total time
        double t2 = timeRandomInput(alg2, N, T);

        StdOut.printf("For %d random Doubles\n %s is ", N, alg1);
        StdOut.printf("%.1f times faster than %s\n", t2/t1, alg2);
    }
}

Property D deliberately leaves out the values of the small constants and the assumption that comparisons and exchanges cost about the same. This keeps it applicable to a wide range of situations while still capturing the essence of each algorithm.

For practical applications, one more important step remains: validate the hypothesis with experiments on actual data. This step is covered in Section 2.5 and the exercises. If the keys contain duplicates or are not randomly ordered, Property D may not hold. The case of many duplicate keys requires a more detailed analysis.

Posted by kbaker on Thu, 23 Sep 2021 03:47:57 -0700