Never make a mistake in a Binary Search again!

Binary Search is Hard (well, at least Medium)

When writing code for binary search problems, it is common to make a "one-off" mistake in the implementation of the algorithm - to forget "whether it is 'mid+1' or just 'mid'".

To solve this issue and firmly understand how to write error-less binary search, it can be useful to:

  1. Choose "a truthful statement", that is always being upheld in the window, defined by the 'left' and the 'right' indexes (inclusive)

  2. What value are we looking for when we check by 'mid'

  3. If by 'mid' is not the desired value, what parts of window should we maintain: [left, mid] or [mid, right]

An example is in order:

https://leetcode.com/problems/find-first-and-last-position-of-element-in-sorted-array/description/

We are asked to find the leftest & rightest entry of given number.

[4,6,8,9,9,9,9,42,69,1337], target: 9

We would get [3,6]

Let's try to think out a binary search, which would find the leftest entry of a number, using the mentioned 3 steps:

The Truthful Statement.

We are asked to find the leftest value of X, so it is reasonable to tie the statement to it.

Let's make the Truthful Statement definitively inclusive of both its ends L & R. The Statement will be "between L & R inclusive, there IS the leftest value of X".

What value is 'mid' supposed to be?

It is often reasonable to check 'mid' for exactly what we are trying to find. In our example - "the leftest value of X".

If it is not a desired value, what parts should we save - left or right?

In our example [4,6,8,9,9,9,9,42,69,1337], the first 'L' will be 0, the first 'R' will be '9', and the first 'mid' will be 4. Let's say we are searching for the leftest 9.

By this definition of the sought value, "leftest X" will be smaller than all "non-leftest X" (so we could definitively understand whether to go into the left/right part of split). The value by index 4 will not be the sought value - "the leftest 9", so it will be larger than "the leftest 9".

So we have 3 parts, that the current window [0;9] is split into:

  1. [0;3] - values smaller or equal to 9 - so, smaller or equal to sought val

  2. [4;4] - the mid value == 9, which we concluded, is not the sought value - so, larger than the sought val

  3. [5;9] - values larger or equal to 9 - so, larger than the sought val

The only part, where the sought value can be is (1), because, so we need to go left, and move our inclusive window there - make R == 3, so it becomes [0; 3].

Let's do the other iterations of this binary search

2. Remaining window [4,6,8,9], L == 0, R == 3, mid == 1; is sought val? no, is larger or smaller than sought? smaller; window split into: [0;0] - less than sought, [1;1] - less than sought, [2;3] - more or equal to sought; Move L to 2

3. Remaining window [8,9], L == 2, R == 3, mid == 2; sought val? no, is larger or smaller than sought? smaller; window split into [2;1] - values lefter than idx 2, no values, [2;2] - smaller than sought, [3;3] - more or equal to sought, so move L to 3

4. Remaining window [9,9], L == 3, R == 3, mid == 3; sought val? true -> return.

Let's run this approach on a case when there is no correct answer in the array:

[4,6,8,9,9,9,9,42,69,1337], searching for the leftest 7.

1. [4,6,8,9,9,9,9,42,69,1337], L == 0, R == 9, mid == 4, is sought? no; by mid larger or smaller? larger; [1;3] - less or eq, [4;4] - larger, [5;9] - larger; R = 3

2. [4,6,8,9], L == 0, R == 3, mid == 1; sought? no, by mid: smaller, [0;0] - smaller, [1;1] - smaller, [2;3] - larger or equal; L = 2

3. [8;9], L == 2, R == 3, mid == 2, is sought? no, by mid: larger; [2;1] - non-ex smaller, [2;2] - larger, [3;3] - larger; R = 1;

4. The remaining window is L == 2, R == 1 - non-existing window, so exit without finding the answer.

The code which implements the given principle:

    int leftBS(int[] a, int t) {
        //initial window
        int l = 0, r = a.length-1;
        //inclusive window
        while(l<=r) {
            int mid = (l+r)/2;            
            boolean isByMidSought = a[mid] == t && (mid == 0 || a[mid-1] < t);
            if(isByMidSought) return mid;
            boolean isByMidLarger = a[mid] == t || a[mid] > t;
            //always not include mid into new window
            if(isByMidLarger) r = mid-1;
            else l = mid+1;
        }

        //if the inclusive window became non-existent, 
        //that means we didn't find the sought value
        return -1;
    }

The Leetcode task also requires finding the rightest X, so you can try to use this principle to solve it as a practice.

For now - that's it folks, like, share, subscribe :)