11.1 Sorting algorithms¶

Sorting algorithms (sorting algorithm) are used to arrange a set of data in a specific order. Sorting algorithms have a wide range of applications because ordered data can usually be searched, analyzed, and processed more efficiently.

As shown in Figure 11-1, the data types in sorting algorithms can be integers, floating point numbers, characters, or strings, etc. Sorting rules can be set according to needs, such as numerical size, character ASCII order, or custom rules.

Figure 11-1 Data types and comparator examples

11.1.1 Evaluation dimensions¶

Execution efficiency: We expect the time complexity of sorting algorithms to be as low as possible, with a lower number of overall operations (reduction in the constant factor of time complexity). For large data volumes, execution efficiency is particularly important.

In-place property: As the name implies, in-place sorting is achieved by directly manipulating the original array, without the need for additional auxiliary arrays, thus saving memory. Generally, in-place sorting involves fewer data movement operations and is faster.

Stability: Stable sorting ensures that the relative order of equal elements in the array does not change after sorting.

Stable sorting is a necessary condition for multi-level sorting scenarios. Suppose we have a table storing student information, with the first and second columns being name and age, respectively. In this case, unstable sorting might lead to a loss of orderedness in the input data:

# Input data is sorted by name
# (name, age)
  ('A', 19)
  ('B', 18)
  ('C', 21)
  ('D', 19)
  ('E', 23)

# Assuming an unstable sorting algorithm is used to sort the list by age,
# the result changes the relative position of ('D', 19) and ('A', 19),
# and the property of the input data being sorted by name is lost
  ('B', 18)
  ('D', 19)
  ('A', 19)
  ('C', 21)
  ('E', 23)

Adaptability: Adaptive sorting has a time complexity that depends on the input data, i.e., the best time complexity, worst time complexity, and average time complexity are not exactly equal.

Adaptability needs to be assessed according to the specific situation. If the worst time complexity is worse than the average, it suggests that the performance of the sorting algorithm might deteriorate under certain data, hence it is seen as a negative attribute; whereas, if the best time complexity is better than the average, it is considered a positive attribute.

Comparison-based: Comparison-based sorting relies on comparison operators (\(<\), \(=\), \(>\)) to determine the relative order of elements and thus sort the entire array, with the theoretical optimal time complexity being \(O(n \log n)\). Meanwhile, non-comparison sorting does not use comparison operators and can achieve a time complexity of \(O(n)\), but its versatility is relatively poor.

11.1.2 Ideal sorting algorithm¶

Fast execution, in-place, stable, positively adaptive, and versatile. Clearly, no sorting algorithm that combines all these features has been found to date. Therefore, when selecting a sorting algorithm, it is necessary to decide based on the specific characteristics of the data and the requirements of the problem.

Next, we will learn about various sorting algorithms together and analyze the advantages and disadvantages of each based on the above evaluation dimensions.

11.1 Sorting algorithms¶

11.1.1 Evaluation dimensions¶

11.1.2 Ideal sorting algorithm¶

Feel free to drop your insights, questions or suggestions