I'm Nate Murray, a programmer, musician, and beekeeper. I work at IFTTT and have been working with terabyte-scale data since 2009. My work involves large-scale data mining with MapReduce, distributed computing, iOS apps, and a few web apps.
Some of my recent projects include:
- Building a web crawler from the ground up for AT&T Interactive that can fetch 1 million pages per hour from a single machine. The fetcher was written in Clojure, and the crawl plan was computed with Hadoop MapReduce.
- BeeSaving.com, a site committed to saving bees from Colony Collapse Disorder, in part by connecting beekeepers with geographically relevant swarms.
- iPad Game for Cats, a top-100 iPad entertainment app featured in The New York Times, the Los Angeles Times, ABC News, and more.
I have a wide variety of open-source projects, which you can view on GitHub. Here’s a handful:
- cascading-simhash: Simple simhashing in Hadoop with Cascading
- Smoker: A library for writing Hive UDFs in Clojure
- Similarity: Document similarity in large collections with MapReduce in Scala
- Chordjerl: An Erlang implementation of the CHORD distributed hash lookup protocol
- Interval: A Ruby library that calculates musical note pitch and interval arithmetic
- gen_cluster: An Erlang behavior for distributed node clustering
- Stoplight: Fully distributed mutex server in Erlang (based on the SIGMA algorithm)
- Backupgem: An end-to-end rotational backup and archive tool written in Ruby
- Shapes Panels: An Objective-C iPhone library for creating interactive pages of panels (used in our iPhone game Jacob’s Shapes)
- BoyeRMoore: Boyer-Moore string search algorithm in Ruby (supports tokens and regexps)