Sorting is the ordering of objects in a list, with emphasis on “ordering.” We often instinctively associate sorting with alphabetizing—and this is only natural because hearing our name read out from an alphabetically-arranged list is one of our first exposures to the concept of sorting. However, one can order items by date, size, location, etc. The task of sorting is inherently algorithmic, that is to say, it can be done by applying a set of steps repeatedly. For this reason, sorting is well suited to be implemented using computers, and so it is at the very heart of computer science.

According to Brian Christian and Tom Griffiths, authors of the bestselling *Algorithms to Live By*, “in many ways it was sorting that brought the computer into being.” This is why algorithm courses focus heavily on analyzing different sorting algorithms, and why I spent one dismal Spring Break agonizing over a project comparing them. But the struggle to find the most efficient sorting algorithm was essentially born out of the human desire to organize. Some of us are a little obsessive in this regard. We must sort everything because we think it will make life easier in the long run. However, sorting isn’t always necessary.

So why *do* we sort? Sorting enables quicker searching. In fact, searching algorithms like binary search only perform efficiently when the input is sorted. Binary search is one of the oldest and most fundamental fast search techniques. Given a sorted list, it repeatedly divides the remaining input in two (hence “binary”), narrowing down the location of the target. Computer scientists measure the efficiency of algorithms using what is called Big-O notation, which describes how an algorithm’s running time grows with the size of its input, typically in the worst case. However, algorithm performance is one of those mathematically laden concepts that everyday technology users don’t need to know. Simply put, a binary search discards half of the remaining input at every step, so it examines only a handful of items even in an enormous list, giving it a performance edge. Keep in mind that this edge rests entirely on the assumption that the input is *sorted*.
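To make the halving concrete, here is a minimal sketch of binary search in Python (the function name and list are illustrative, not from any particular library):

```python
def binary_search(items, target):
    """Return the index of target in a sorted list, or -1 if absent.

    Each iteration halves the remaining range, so a list of a
    million items needs only about 20 comparisons.
    """
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # midpoint of the remaining range
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1              # target must be in the upper half
        else:
            hi = mid - 1              # target must be in the lower half
    return -1

print(binary_search([2, 3, 5, 7, 11, 13], 11))  # found at index 4
print(binary_search([2, 3, 5, 7, 11, 13], 6))   # absent: -1
```

Note that handing this function an *unsorted* list silently breaks it, which is exactly the point: the speed is bought with the up-front cost of sorting.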

In today’s world of information overload, the swiftness with which we can find a particular piece of information depends on its position in the wider pool of data. The notion that sorting only simplifies searching leads to one conclusion: it is better to err on the side of messiness. Sorting something you will never search is futile, but searching something you never sorted is downright inefficient. The trick, therefore, is to know ahead of time the extent to which you will need to search.

Let’s analyze two situations. Billions of people are attempting to access information online via Google at any given moment. Google *knows* a number of facts ahead of time: (a) its data will be searched, (b) the data will be searched *repeatedly*, and (c) the time it takes to return results is valued more highly than the order in which the results are returned. This is why Google and its fellow search engines focus heavily on sorting data beforehand. On the other hand, is it really necessary to alphabetize your bookshelves? None of the above conditions clearly holds in this case. In fact, if you have a rough idea of where you put a book in the first place, it will be fairly easy to locate. But what actually renders sorting useless in this domestic case? The time your eyes take to scan a bookshelf is significantly less than the time it would take to sort the books by hand.

Computer science reveals that we can quantify both the cost of having a mess and the cost of sorting, with time as the unit of measurement. Sometimes keeping a mess isn’t mere procrastination, stealing time from your future self. A bit of unfortunate news for us compulsive sorters: sometimes a mess is the optimal choice.
