The recent Hack/Reduce hackathon in Montreal was a tonne of fun. Our team tackled a data set consisting of Bixi (Montreal’s bicycle share system) station states at one-minute temporal resolution. We used Hadoop and MapReduce to pull out some features of user behaviour. One of the things we extracted was the flux at each station, which we defined as the number of bikes arriving at and departing from a given station per unit time…
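To make the flux definition concrete, here is a minimal sketch in plain Python rather than the Hadoop job we actually ran. It assumes each record is a hypothetical `(station_id, minute, bikes_available)` tuple; the real Bixi data layout is not shown here, so treat the field names and the `station_flux` helper as illustrative only. A rise in the bike count between consecutive snapshots is counted as arrivals and a drop as departures.

```python
from collections import defaultdict


def station_flux(snapshots):
    """Estimate per-station flux from minute-resolution snapshots.

    snapshots: iterable of (station_id, minute, bikes_available) tuples
    (a hypothetical record layout, assumed for illustration).
    Returns {station_id: (arrivals, departures, flux_per_minute)}.
    """
    # Group the observations by station, much as a map step would key them.
    by_station = defaultdict(list)
    for station_id, minute, bikes in snapshots:
        by_station[station_id].append((minute, bikes))

    result = {}
    for station_id, series in by_station.items():
        series.sort()  # order each station's observations by time
        arrivals = departures = 0
        # Compare each snapshot with the next one.
        for (_, prev), (_, curr) in zip(series, series[1:]):
            delta = curr - prev
            if delta > 0:
                arrivals += delta
            else:
                departures += -delta
        elapsed = series[-1][0] - series[0][0]
        flux = (arrivals + departures) / elapsed if elapsed > 0 else 0.0
        result[station_id] = (arrivals, departures, flux)
    return result


if __name__ == "__main__":
    sample = [("A", 0, 5), ("A", 1, 7), ("A", 2, 4), ("B", 0, 3), ("B", 1, 3)]
    print(station_flux(sample))
```

Because this only sees the net change between snapshots, a bike arriving and another leaving within the same minute cancel out, so the result is a lower bound on the true flux.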
The day started out with coffee and a presentation by the team at Hopper about MapReduce and about using the infrastructure and data mappers we had created. We quickly ran through a tutorial and examples available on GitHub. Most of the people present hadn’t used MapReduce before, so the overview was quite exhaustive and took about an hour and a half. We also gave a rundown of accessing and running jobs on the Amazon Elastic MapReduce instance that we were using.
Greg Lu, Joost Ouwerkerk and Frédéric Lalonde giving the rundown on the infrastructure and MapReduce
Some teams had already started hacking during the presentation. For those who didn’t already have a team, we gave them the opportunity to pitch an idea quickly or just talk about what they might want to build, and about 15 people gave a short pitch. Several teams formed from this once people got an idea of what the others were interested in and could connect with other developers with matching interests.