The Core Concept
Binary search is an elegant algorithm that finds a target value in a sorted array by repeatedly eliminating half of the remaining possibilities. Imagine searching for a word in a dictionary—you don't start at page 1 and flip through sequentially. You open somewhere in the middle, check if you're too far ahead or behind, then repeat the process with the correct half.
This "divide-and-conquer" strategy is what makes binary search so powerful: every comparison reduces the search space by 50%, leading to logarithmic time complexity.
The essential requirement: The array must be sorted. Without order, we can't make logical deductions about which half contains our target.
The Problem
Searching is everywhere in computing. Every time you:
- Look up a contact in your phone
- Find a file on your computer
- Query a database
- Use autocomplete
...you're using a search algorithm.
Why Linear Search Falls Short
Linear search (checking each element one by one) works, but it doesn't scale: in the worst case it examines all n elements.
For a billion elements, linear search might take seconds. In a world where users expect millisecond response times, this is unacceptable.
The Key Insight
If data is sorted, we can be strategic. When we check the middle element:
- If target < middle: target must be in the left half (if it exists)
- If target > middle: target must be in the right half (if it exists)
- If target == middle: found it!
We've just eliminated half the array with a single comparison. Do this repeatedly, and we can search a billion elements in about 30 comparisons.
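The halving arithmetic is easy to verify with a few lines of Python (a quick sketch; the helper name `comparisons_needed` is ours, not a library function):

```python
def comparisons_needed(n: int) -> int:
    """Count how many halvings shrink a search space of n candidates to one."""
    count = 0
    while n > 1:
        n //= 2  # each comparison discards half of the remaining candidates
        count += 1
    return count

print(comparisons_needed(1_000_000_000))  # → 29, i.e. about 30 comparisons
```

Doubling the data adds just one comparison; multiplying it by a thousand adds only about ten.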
The trade-off: Sorting costs O(n log n), but if we search many times, the cost amortizes. One sort, many searches—that's the sweet spot for binary search.
Historical Context
The Surprisingly Tricky History
Binary search seems simple, but correct implementation eluded programmers for years:
1946: John Mauchly first described the algorithm in "Theory and Techniques for Design of Electronic Digital Computers"
- Context: Searching sorted punch card systems
- Problem: Computers had tiny memories; efficiency was critical
1962: The first published bug-free implementation appeared, a milestone Knuth later documented in The Art of Computer Programming
- That's a 16-year gap!
- Why? Off-by-one errors in boundary conditions are notoriously subtle
1960s-1980s: Studies found that most published binary search implementations had bugs
2006: Joshua Bloch reported a long-standing bug in Java's Arrays.binarySearch()
- The issue: `mid = (low + high) / 2` causes integer overflow for large arrays
- The fix: `mid = low + (high - low) / 2`
This history reminds us: simple concepts can have complex implementations.
How Binary Search Works: The Mechanics
The Setup
We maintain three pointers into the array:
Initial state:
- `left = 0` (first index)
- `right = n - 1` (last index)
- `middle = left + (right - left) / 2` (midpoint)
The Algorithm: Step by Step
Step 1: Calculate middle
We use this formula: `middle = left + (right - left) / 2`.
Why not just (left + right) / 2?
Consider left = 2,000,000,000 and right = 2,000,000,100:
- `left + right = 4,000,000,100`
- In 32-bit systems, max int = 2,147,483,647
- Overflow! Result wraps to negative number
- Bug: returns wrong index or crashes
Using left + (right - left) / 2:
- `right - left = 100`
- `100 / 2 = 50`
- `left + 50 = 2,000,000,050`
- No overflow, correct answer
This is the infamous Java bug mentioned earlier.
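Python's integers never overflow, but we can emulate 32-bit wraparound to watch the failure happen (the helper `to_int32` is our own, for illustration only):

```python
def to_int32(x: int) -> int:
    """Emulate 32-bit two's-complement wraparound."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x

left, right = 2_000_000_000, 2_000_000_100

bad_mid = to_int32(left + right) // 2          # the sum wraps negative
good_mid = left + to_int32(right - left) // 2  # the difference stays tiny

print(bad_mid)   # a negative, invalid index
print(good_mid)  # → 2000000050
```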
Step 2: Compare
We compare `array[middle]` with `target`; if they're equal, return `middle`. If we're lucky, we find it immediately. Best case: O(1).
Step 3: Eliminate half
Otherwise, we discard the half that cannot contain the target: if `array[middle] < target`, set `left = middle + 1`; if `array[middle] > target`, set `right = middle - 1`.
Critical detail: We set `right = middle - 1`, not `right = middle`.
Why? We already checked `middle`, so we can exclude it. If we set `right = middle`, we might loop forever.
Step 4: Repeat
Continue while left <= right. When left > right, the search space is empty—target doesn't exist.
Complete Visual Walkthrough
Let's search for target = 45 in [2, 5, 8, 12, 16, 23, 38, 45, 56, 67, 78]
Initial state:
left = 0, right = 10, middle = 5 → array[5] = 23 < 45, so search the right half
[2, 5, 8, 12, 16, (23), 38, 45, 56, 67, 78]
After iteration 1:
left = 6, right = 10, middle = 8 → array[8] = 56 > 45, so search the left half
[_, _, _, _, _, _, 38, 45, (56), 67, 78]
After iteration 2:
left = 6, right = 7, middle = 6 → array[6] = 38 < 45, so search the right half
[_, _, _, _, _, _, (38), 45, _, _, _]
After iteration 3:
left = 7, right = 7, middle = 7 → array[7] = 45 = target. Found!
Total comparisons: 4 (compared to 8 for linear search starting from index 0)
Mathematical Formalization
Formal Algorithm Definition
Given:
- A sorted array A of n elements where A[i] ≤ A[i+1] for all 0 ≤ i < n - 1
- A target value t
Find:
- An index i such that A[i] = t, or
- Return -1 (or null) if no such i exists
Algorithm:
1. left ← 0, right ← n - 1
2. While left ≤ right:
   a. middle ← left + ⌊(right - left) / 2⌋
   b. If A[middle] = t, return middle
   c. Else if A[middle] < t, left ← middle + 1
   d. Else right ← middle - 1
3. Return -1
Correctness Proof
We prove binary search is correct using a loop invariant:
Invariant: If t exists in A, then t is in the subarray A[left..right].
Proof by induction:
Base case (initialization):
- Initially, left = 0 and right = n - 1
- The subarray A[0..n-1] is the entire array
- If t exists anywhere, it's in this range
- Invariant holds
Inductive step (maintenance):
Assume the invariant holds before an iteration. Three cases:
Case 1: A[middle] = t
- We return middle immediately
- Correct!
Case 2: A[middle] < t
- Since the array is sorted, all elements at or left of middle are ≤ A[middle] < t
- So t cannot be in A[left..middle]
- If t exists, it must be in A[middle+1..right]
- We set left = middle + 1
- Invariant maintained
Case 3: A[middle] > t
- Since the array is sorted, all elements at or right of middle are ≥ A[middle] > t
- So t cannot be in A[middle..right]
- If t exists, it must be in A[left..middle-1]
- We set right = middle - 1
- Invariant maintained
Termination:
- Each iteration, either:
  - We find t and return, or
  - The search space shrinks by at least half
- Since the search space size strictly decreases, eventually left > right
- When left > right, the search space is empty
- By the invariant, if we haven't found t, it doesn't exist
- Returning -1 is correct
Thus, binary search is correct.
Time Complexity Analysis
Let T(n) be the worst-case number of comparisons for an array of size n.
Recurrence relation:
After one comparison, we search a subarray of size at most n/2:
T(n) = T(n/2) + 1
where the "+1" is for the comparison at this level.
Base case: T(1) = 1 (one element, one comparison)
Expanding the recurrence:
T(n) = T(n/2) + 1 = T(n/4) + 2 = ... = T(n/2^k) + k
We stop when n/2^k = 1, which means 2^k = n, so k = log₂ n.
Therefore:
T(n) = T(1) + log₂ n = log₂ n + 1
Result: T(n) = O(log n)
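The bound of roughly log₂ n + 1 comparisons can be checked empirically with an instrumented search (a sketch; `count_comparisons` is our own helper, not a library function):

```python
import math

def count_comparisons(arr, target):
    """Binary search that returns the number of three-way comparisons made."""
    left, right, comps = 0, len(arr) - 1, 0
    while left <= right:
        middle = left + (right - left) // 2
        comps += 1
        if arr[middle] == target:
            return comps
        elif arr[middle] < target:
            left = middle + 1
        else:
            right = middle - 1
    return comps  # comparisons spent on an unsuccessful search

for n in (10, 1_000, 1_000_000):
    arr = list(range(n))
    worst = max(count_comparisons(arr, t) for t in (-1, n))  # miss at both ends
    assert worst <= math.floor(math.log2(n)) + 1
```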
Space Complexity
Iterative implementation: O(1)
We only use a constant number of variables (left, right, middle), regardless of n.
Recursive implementation: O(log n)
Each recursive call adds a frame to the call stack. The maximum recursion depth is log₂ n (the height of the implicit binary search tree).
Example: For n = 1,000:
- Iterative: 3-4 integer variables ≈ 12-16 bytes
- Recursive: ~10 stack frames × ~20 bytes/frame ≈ 200 bytes
For large n, the iterative version is more memory-efficient.
Why Logarithmic is Powerful
Let's see how log₂ n grows:
- n = 10 → ~3 comparisons
- n = 100 → ~7 comparisons
- n = 1,000 → ~10 comparisons
- n = 1,000,000 → ~20 comparisons
- n = 1,000,000,000 → ~30 comparisons
Notice: Every time we add a zero to n (multiply by 10), we add only ~3 comparisons, since log₂ 10 ≈ 3.3.
Compare to linear search, where worst-case comparisons grow in direct proportion to n: ten times the data means ten times the work.
This is why binary search is called "logarithmic" and why it's so effective for large datasets.
Worked Example: Complete Trace
Problem
Find target = 31 in array [3, 7, 12, 18, 24, 31, 42, 55, 63]
Array size: n = 9
Expected comparisons: at most ⌊log₂ 9⌋ + 1 = 4 (worst case)
Detailed Execution
Initial State:
[3, 7, 12, 18, 24, 31, 42, 55, 63]
Variables: `left = 0`, `right = 8`, `target = 31`
Iteration 1:
Calculate middle: middle = 0 + (8 - 0) / 2 = 4
Compare: array[4] = 24 < 31, so the target is in the right half.
Update:
- `left = middle + 1 = 5`
- `right = 8` (unchanged)
Search space reduced to [31, 42, 55, 63]
New size: 4 elements (was 9)
Iteration 2:
Calculate middle: middle = 5 + (8 - 5) / 2 = 6
Compare: array[6] = 42 > 31, so the target is in the left half (of our current range).
Update:
- `left = 5` (unchanged)
- `right = middle - 1 = 5`
Search space reduced to [31]
New size: 1 element (was 4)
Iteration 3:
Calculate middle: middle = 5 + (5 - 5) / 2 = 5
Compare: array[5] = 31 = target
Match! Return middle = 5.
Summary:
- Total comparisons: 3
- Predicted max: 4
- Linear search would need: 6 comparisons (if starting from left)
Binary search is twice as fast in this example, and its advantage grows rapidly with array size.
Practical Implementation Examples
Iterative Implementation (Recommended)
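A standard iterative version (a sketch of the textbook algorithm, returning -1 on a miss):

```python
def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    left, right = 0, len(arr) - 1
    while left <= right:
        middle = left + (right - left) // 2  # overflow-safe midpoint
        if arr[middle] == target:
            return middle
        elif arr[middle] < target:
            left = middle + 1   # target can only be right of middle
        else:
            right = middle - 1  # target can only be left of middle
    return -1  # search space empty: target not present

print(binary_search([2, 5, 8, 12, 16, 23, 38, 45, 56, 67, 78], 45))  # → 7
```

Note the loop condition `left <= right` and the `middle ± 1` updates: these are exactly the boundary details the history section warns about.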
Recursive Implementation
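A recursive version along the same lines (a sketch; the default arguments stand in for an explicit wrapper function):

```python
def binary_search_recursive(arr, target, left=0, right=None):
    """Recursive binary search over the sorted list arr; index or -1."""
    if right is None:          # first call: search the whole array
        right = len(arr) - 1
    if left > right:           # base case: empty search space
        return -1
    middle = left + (right - left) // 2
    if arr[middle] == target:
        return middle
    elif arr[middle] < target:
        return binary_search_recursive(arr, target, middle + 1, right)
    else:
        return binary_search_recursive(arr, target, left, middle - 1)

print(binary_search_recursive([3, 7, 12, 18, 24, 31, 42, 55, 63], 31))  # → 5
```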
Recursion tree for n = 8: the first call handles 8 elements, the next at most 4, then 2, then 1, for four levels in total. Each level halves the problem size—classic divide-and-conquer.
Advanced: Finding Insertion Point (Lower Bound)
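One way to sketch it, written to behave like Python's `bisect.bisect_left`:

```python
import bisect

def lower_bound(arr, target):
    """First index at which target could be inserted while keeping
    arr sorted, i.e. the index of the first element >= target."""
    left, right = 0, len(arr)  # note: right starts one past the end
    while left < right:
        middle = left + (right - left) // 2
        if arr[middle] < target:
            left = middle + 1
        else:
            right = middle  # middle itself may still be the answer
    return left

data = [2, 4, 4, 4, 9]
print(lower_bound(data, 4))                                 # → 1
print(lower_bound(data, 4) == bisect.bisect_left(data, 4))  # → True
```

Notice the half-open convention (`right = len(arr)`, loop on `left < right`): it lets the function return `len(arr)` when every element is smaller than the target.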
This variant is useful for:
- Maintaining sorted lists with insertions
- Finding ranges of duplicate elements
- Database indexing operations
Variations & Extensions
Lower Bound and Upper Bound
Lower bound: First position where element >= target
Upper bound: First position where element > target
These let you find the range of duplicates in O(log n) time.
Example: In [5, 7, 7, 7, 9], the lower bound of 7 is index 1 and the upper bound is index 4, so there are 4 - 1 = 3 copies of 7.
Useful for database queries like "find all records between dates A and B."
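In Python, counting duplicates this way is exactly what the standard `bisect` module provides:

```python
import bisect

data = [5, 7, 7, 7, 9]
lo = bisect.bisect_left(data, 7)   # first index with element >= 7
hi = bisect.bisect_right(data, 7)  # first index with element > 7
print(lo, hi, hi - lo)  # → 1 4 3: three copies of 7, found in O(log n)
```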
Binary Search on Answer Space
Concept: Binary search doesn't only work on arrays—it works on any monotonic function.
Problem: Find the square root of a number x to 6 decimal places.
Solution: Binary search on the range [0, max(1, x)]:
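A sketch of the idea: bisect the interval until it is narrower than the required precision (the function name is ours):

```python
def binary_search_sqrt(x, precision=1e-6):
    """Approximate sqrt(x) for x >= 0 by binary search on [0, max(1, x)]."""
    low, high = 0.0, max(1.0, x)  # for x < 1, sqrt(x) > x, so widen to 1
    while high - low > precision:
        mid = (low + high) / 2
        if mid * mid < x:  # mid is too small: the root lies above mid
            low = mid
        else:              # mid is too large (or exact): the root lies below
            high = mid
    return (low + high) / 2

print(binary_search_sqrt(2))  # close to 1.4142135...
```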
We're searching for the value v where v² ≈ x, not an array index.
Applications:
- Finding roots of equations
- Minimizing/maximizing functions
- Optimization problems ("minimize cost such that constraint is satisfied")
Binary Search on Rotated Arrays
Problem: Array is sorted, then rotated at an unknown pivot.
Example: [4, 5, 6, 7, 0, 1, 2] (rotated at index 4)
Challenge: Normal binary search fails because ordering is broken.
Solution: Modified binary search that determines which half is sorted:
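A sketch of that modified search, assuming distinct values (variable names ours):

```python
def search_rotated(arr, target):
    """Search a sorted-then-rotated array of distinct values; index or -1."""
    left, right = 0, len(arr) - 1
    while left <= right:
        middle = left + (right - left) // 2
        if arr[middle] == target:
            return middle
        if arr[left] <= arr[middle]:              # left half is sorted
            if arr[left] <= target < arr[middle]:
                right = middle - 1                # target is in the sorted left half
            else:
                left = middle + 1
        else:                                     # right half is sorted
            if arr[middle] < target <= arr[right]:
                left = middle + 1                 # target is in the sorted right half
            else:
                right = middle - 1
    return -1

print(search_rotated([4, 5, 6, 7, 0, 1, 2], 0))  # → 4
```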
Still O(log n), but it requires careful logic to handle the rotation.
Common Misconceptions
Misconception 1: "Binary search works on unsorted data"
False. Binary search's correctness relies on the sorted property.
Counterexample: Search for 7 in the unsorted array [7, 2, 9, 4]. The first middle element is 2; since 7 > 2, the algorithm discards the left half (exactly where 7 sits) and returns "not found".
Binary search fails silently on unsorted data—no error, just wrong results.
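A quick demonstration, using a plain textbook binary search defined inline:

```python
def binary_search(arr, target):
    """Standard binary search: only correct when arr is sorted."""
    left, right = 0, len(arr) - 1
    while left <= right:
        middle = left + (right - left) // 2
        if arr[middle] == target:
            return middle
        elif arr[middle] < target:
            left = middle + 1
        else:
            right = middle - 1
    return -1

unsorted = [7, 2, 9, 4]
print(7 in unsorted)               # True: the value really is there
print(binary_search(unsorted, 7))  # → -1: silently wrong, no error raised
```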
Misconception 2: "(left + right) / 2 is fine"
Risky. This causes integer overflow in languages with fixed-size integers.
Example in Java (32-bit ints): `int mid = (low + high) / 2;` overflows when `low + high` exceeds `Integer.MAX_VALUE`, wrapping to a negative number.
Correct: `int mid = low + (high - low) / 2;`
This bug lurked in Java's standard library for roughly a decade before Joshua Bloch reported it in 2006!
Misconception 3: "Binary search is always faster than linear"
Not for small arrays. Binary search has higher per-lookup overhead (branching and index arithmetic), while a linear scan is simple and cache-friendly.
Crossover point: Typically somewhere around a few dozen elements, depending on hardware and language.
Many optimized sorting algorithms (like Timsort, used in Python) switch to insertion sort for small subarrays.
Rule of thumb:
- Very small arrays (tens of elements): Linear search often faster
- Around the crossover point: About equal
- Large arrays (hundreds of elements and up): Binary search dominates
Advantages & Limitations
Advantages
- Extremely efficient for large datasets
  - Searching 1 million elements: ~20 comparisons
  - Searching 1 billion elements: ~30 comparisons
- Predictable performance
  - Worst case = average case = O(log n)
  - No performance surprises
- Minimal memory
  - Iterative version: O(1) extra space
  - Just a few integer variables
- Provably optimal
  - For comparison-based search in sorted arrays, you can't beat O(log n)
Limitations
- Requires sorted data
  - Sorting costs O(n log n)
  - Only worth it if searching multiple times
  - Breakeven: If you'll search more than about log n times, sorting pays off
- Requires random access
  - Doesn't work on linked lists (accessing the middle takes O(n))
  - Requires an array or similar structure
- Not optimal for small arrays
  - Overhead outweighs benefits for small n
- Implementation subtlety
  - Easy to get wrong (off-by-one errors, overflow)
  - Use well-tested library implementations when possible
When to Use
Use binary search when:
- Data is already sorted or will be searched many times
- Working with large datasets (hundreds of elements or more)
- Need guaranteed performance
- Using arrays or random-access structures
Avoid binary search when:
- Data is unsorted and sorting cost isn't justified
- Working with linked lists or sequential structures
- Array is very small (a few dozen elements or fewer)
- Data changes frequently (insertions/deletions break sorting)
Alternative: Use hash tables for O(1) average-case search (but with O(n) extra space).
Comparison with Alternatives
| Aspect | Binary Search | Linear Search | Hash Table | Binary Search Tree |
|---|---|---|---|---|
| Search Time | O(log n) | O(n) | O(1) average, O(n) worst | O(log n) balanced |
| Requires Sorted | Yes | No | No | Maintains order |
| Space | O(1) | O(1) | O(n) | O(n) |
| Insert Cost | O(n) | O(1) at end | O(1) average | O(log n) balanced |
| Delete Cost | O(n) | O(n) | O(1) average | O(log n) balanced |
| Best Use Case | Read-heavy sorted data | Small/unsorted data | Frequent lookups | Dynamic sorted data |
Key insight: Choose data structure based on operation frequency:
- Mostly searching, rarely modifying → Binary search on sorted array
- Frequent insertions/deletions → Balanced BST
- Pure lookups, no order needed → Hash table
- Small data → Linear search (simplicity wins)
Practice Exercises
Conceptual Questions
Q1: Why must the array be sorted?
Click to reveal answer
Binary search's correctness depends on making logical deductions from comparisons. When we find array[middle] < target, we conclude "target must be in the right half." This is only valid if the array is sorted.
In an unsorted array like [5, 2, 8, 1], searching for 1 compares against the middle element 2 and concludes "go left." But 1 is actually at index 3 (the right side). The algorithm fails because we can't trust that everything left of middle is smaller.
Q2: What's the maximum number of comparisons for n = 1,000?
Click to reveal answer
⌈log₂ 1,000⌉ = 10 comparisons.
Why? After k comparisons, we've narrowed the search to n / 2ᵏ elements. We stop when this reaches 1: 2ᵏ ≥ 1,000 gives k ≥ log₂ 1,000 ≈ 9.97.
Round up to 10 since we need an integer number of comparisons.
Q3: Can binary search work on linked lists efficiently?
Click to reveal answer
No. Binary search requires O(1) access to the middle element. In a linked list, reaching the middle means walking node by node from the head.
Each iteration of binary search would take O(n) to find the middle, giving total time O(n log n)—worse than linear search!
Solution for linked lists: Use a balanced binary search tree instead (O(log n) search, insert, and delete).
Coding Problems
Easy:
1. First Bad Version: Given `n` versions and a function `isBadVersion(v)`, find the first bad version while minimizing API calls.
   - Hint: Binary search on version numbers
Medium:
2. Search in Rotated Sorted Array: Array is sorted then rotated (e.g., [4,5,6,7,0,1,2]). Find a target in O(log n).
   - Hint: Determine which half is sorted
3. Find Peak Element: In an array where `arr[i] != arr[i+1]`, find any index `i` where `arr[i] > arr[i-1]` and `arr[i] > arr[i+1]`.
   - Hint: Binary search on slope direction
Hard:
4. Median of Two Sorted Arrays: Given two sorted arrays, find the median in O(log(m + n)).
   - Hint: Binary search on partition points
Real-World Applications
Database Indexes (B-Trees)
Databases use B-trees, which are generalized binary search trees optimized for disk I/O: instead of scanning every row, an indexed lookup descends a shallow tree of page-sized nodes, touching only a handful of disk pages.
Impact: on large tables, this can mean speedups of tens of thousands of times. This is why "add an index" often magically fixes slow queries.
Git Bisect
Find which commit introduced a bug: run `git bisect start`, mark endpoints with `git bisect bad` and `git bisect good <commit>`, and Git binary-searches the history between them.
For 1000 commits, this finds the bug in ~10 tests instead of testing all 1000.
Operating Systems
The OS maintains sorted free-block lists, so allocators can locate a suitable block with a binary-search-style lookup instead of scanning every block.
Critical for OS performance, as memory operations happen millions of times per second.
Standard Libraries
Binary search is ubiquitous:
- C++ STL: `std::binary_search()`, `std::lower_bound()`, `std::upper_bound()`
- Java: `Arrays.binarySearch()`, `Collections.binarySearch()`
- Python: `bisect.bisect_left()`, `bisect.bisect_right()`
- Go: `sort.Search()`
These are highly optimized, battle-tested implementations. Use them instead of rolling your own.
Related Topics
Prerequisites
- Arrays - Random access data structure
- Algorithm Analysis - Big-O notation
- Recursion - For recursive implementation
Related Algorithms
- Divide and Conquer - General paradigm
- Binary Search Trees - Binary search in tree form
- Sorting Algorithms - Prerequisite for binary search
Advanced Topics
- Interpolation Search - O(log log n) average for uniformly distributed data
- Exponential Search - For unbounded arrays
- Ternary Search - For finding extrema in unimodal functions
Further Reading
Textbooks
- Introduction to Algorithms (CLRS), Chapter 2
- Rigorous mathematical treatment
- The Algorithm Design Manual by Skiena
- Practical applications and war stories
- Programming Pearls by Bentley
- Chapter on binary search edge cases and bugs
Papers
- Bentley, J. (1986). "Programming Pearls: Algorithm Design Techniques" Communications of the ACM
- Discusses the difficulty of getting binary search right
References
- Knuth, D. (1998). The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley.
- Cormen, T., Leiserson, C., Rivest, R., Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
- Bentley, J. (1999). Programming Pearls (2nd ed.). Addison-Wesley.
- Bloch, J. (2006). "Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken." Google Research Blog.
