Boyer-Moore string search algorithm in Ruby

Just a quick post. I’ve converted the C code from the Wikipedia entry (this version) on the Boyer-Moore string search algorithm to Ruby, and extended it to support searches on token arrays and regular expressions.

You can find the code on GitHub.

Usage:

, needle)   # returns index of needle or nil


Basic search in string:

"ANPANMAN", "ANP")   # => 0
"ANPANMAN", "ANPXX") # => nil
"foobar", "bar")     # => 3

You can also search an array of tokens:

["<b>", "hi", "</b>"], ["hi"])         # => 1
["bam", "foo", "bar"], ["foo", "bar"]) # => 1
["bam", "bar", "baz"], ["foo"])        # => nil
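To make it concrete how one routine can handle both strings and token arrays, here is a minimal sketch. It is not the code from the repository: the name `horspool_search` is made up for this example, and it uses only the bad-character shift (Horspool's simplification of Boyer-Moore), without the good-suffix rule and without regular-expression support.

```ruby
# Hypothetical helper, not the author's implementation: Horspool's
# simplification of Boyer-Moore (bad-character shifts only).
def horspool_search(haystack, needle)
  # Treat strings as arrays of characters so one code path handles both.
  hay = haystack.is_a?(String) ? haystack.chars : haystack
  pat = needle.is_a?(String) ? needle.chars : needle
  m = pat.length
  return 0 if m.zero?
  return nil if m > hay.length

  # Shift table: for every pattern element except the last, the distance
  # from its rightmost occurrence to the end of the pattern. Elements
  # that are not in the table shift by the full pattern length.
  shift = Hash.new(m)
  pat[0...-1].each_with_index { |tok, i| shift[tok] = m - 1 - i }

  pos = 0
  while pos <= hay.length - m
    # Compare right to left, as Boyer-Moore does.
    j = m - 1
    j -= 1 while j >= 0 && hay[pos + j] == pat[j]
    return pos if j < 0

    # Shift according to the haystack element under the pattern's last slot.
    pos += shift[hay[pos + m - 1]]
  end
  nil
end

p horspool_search("foobar", "bar")
p horspool_search(["bam", "foo", "bar"], ["foo", "bar"])
```

Because the shift table is an ordinary Hash, any object with sensible `==` and `hash` behaviour can serve as a token, which is why strings inside an array work without extra code.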

A token can be a regular expression:

["Sing", "99", "Luftballon"], [/\d+/]) == 1
["Nate Murray", "5 Pine Street", "Los Angeles", "CA", "90210"], [/^\w{2}$/, /^\d{5}$/]) == 3


The regular-expression token matching is a bit of a hack and will be fairly slow, because every token that misses the hash lookup is then compared against every regular-expression key. You probably shouldn’t use the regular-expression token search for anything more than a toy.
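The cost described above can be illustrated with a small sketch of such a lookup table. This is an assumption about the general shape of the approach, not the repository's actual code; the class name `TokenShiftTable` and its interface are invented for the example.

```ruby
# Hypothetical illustration of a shift table whose keys may be regular
# expressions. Not taken from the author's code.
class TokenShiftTable
  def initialize(default)
    @default = default
    @exact = {}     # literal tokens: constant-time hash lookup
    @patterns = []  # [regexp, shift] pairs: scanned linearly on a miss
  end

  def []=(key, shift)
    if key.is_a?(Regexp)
      @patterns << [key, shift]
    else
      @exact[key] = shift
    end
  end

  def [](token)
    return @exact[token] if @exact.key?(token)

    # Hash miss: the token has to be tested against every regexp key.
    # Taking the smallest matching shift keeps the search conservative.
    matching = @patterns.select { |re, _| token.is_a?(String) && re.match?(token) }
    matching.map { |_, shift| shift }.min || @default
  end
end

table = TokenShiftTable.new(3)
table["foo"] = 2
table[/\A\d+\z/] = 1
p table["foo"]  # exact key, no regexp is touched
p table["99"]   # hash miss, falls through to the regexp scan
p table["bar"]  # hash miss, every regexp tried, default returned
```

With k regular-expression keys, each miss costs k match attempts, so the per-token lookup is no longer constant time and the usual Boyer-Moore speed advantage shrinks accordingly.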

Download the Boyer-Moore string search algorithm in Ruby.
