World’s Fastest Binary Search?

(Edited 9 Sept 2011: A few minor improvements have been made; see the updated post instead: Binary Search Revisited.)

Every CS student knows how to write a binary search algorithm. The question is, are we really making full use of the fact that we are using a binary computer to do a binary search? While it is true that computers are good at doing arithmetic, they are even better at doing pure bit logic. Often times our grammar school arithmetic way of thinking can get in the way of solving a problem that has a more elegant solution in terms of lower-level bit logic. Binary search in an ordered list is one such problem.

Your homework problem: re-write the typical textbook binary search in an ordered list algorithm, making it faster, without doing any arithmetic within the main loop of the search algorithm (underlying pointer arithmetic by direct access of array elements is ok).

Solution in C++:

// Fastest binary search
#include <iostream>
#include <math.h>

// Given a value x, return the index of the largest
// element in a sorted list less than or equal to x.
// Return -1 if x is less than all elements, or list
// is empty. T can be any type for which <= is defined.
template <typename T>
int fbsearch( const T *sorted_list, size_t list_size, T x )
    if( list_size <= 1 )
        return list_size && *sorted_list <= x ? 0 : -1;
    unsigned i = 0;
    unsigned b = 1 << int( log(list_size-1) / M_LN2 );
    for( ; b ; b >>= 1 )
        unsigned j = i | b;
        if( j < list_size && sorted_list[j] <= x ) i = j;
    return i || *sorted_list <= x ? i : -1;

using namespace std;

int main()
    const int sorted_list[] =
        { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 };
    const size_t list_size = sizeof(sorted_list)/sizeof(int);
    int i = 7;
    cout << "fbsearch(sorted_list,"<<list_size<<‘,’<<i<<") = "
         << fbsearch(sorted_list,list_size,i) << endl;
    return 0;

Technical Notes:

  • Feel free to add a statement within the loop that returns the index if found. Note that for large n, you will spend more time on average checking for equality than the time you will save by returning early.
  • Instead of dividing by ln(2), feel free to save the constant 1/ln(2) and multiply by it instead.

So what’s going on here? Take a list of the first 10 prime numbers, for example, and enumerate them using base-2:

0000   2
0001   3
0010   5
0011   7
0100  11
0101  13
0110  17
0111  19
1000  23
1001  29

If we are given one of the items on the list, say 13, then we are looking for its 4-bit index. Each bit is a yes/no question. The first bit asks, “Are you at index 1000 or higher?” If no, then the second bit asks, “Are you at index 0100 or higher?” If yes, then the third bit asks, “Are you at 0110 or higher?” If no then the last bit asks, “Are you at 0101 or higher?” By now we have asked all 4 questions, and thus have all 4 bits of the index.

Moral of the story: Sometimes it’s better to think like a silicon being than a carbon one.

p.s. Happy birthday, Nate!

  • Reddit
  • Technorati
  • Twitter
  • Facebook
  • Google Bookmarks
  • HackerNews
  • PDF
  • RSS
This entry was posted in programming. Bookmark the permalink. Post a comment or leave a trackback: Trackback URL.
  • Chao Xu

    This is great. It is only faster than the normal binary search with a bit more optimization:
    unsigned b = (1<<(31-__builtin_clz(list_size-1)));
    without this line, it is horribly inefficient(at least on small arrays)

  • Matt Pulver

    @Chao: Thanks for that. There’s also a condition in which the array index doesn’t need to be checked to be in bounds every iteration. Other improvements have been made here: Binary Search Revisited.

  • Daniel

    You have a mistake. The calculation: int( log(list_size-1) / M_LN2 ) is not accurate so you might loose a number. For example when list length is 8193, due to floating point division it is treated as 8192. There is more accurate method of computing log() using only integer calculations

    __inline int getClosestPowerOfTwo1(int x){


    x |= x >> 1;

    x |= x >> 2;

    x |= x >> 4;

    x |= x >> 8;

    x |= x >> 16;


    return (x>>1);


  • Matt Pulver

    int( log(list_size-1) / M_LN2 ) evaluates to 13 when list_size=8193, which it is supposed to. Just as it evaluates to 3 when list_size=9, as in the example.