July 8, 2011
A few weeks back I was at the Basis Technology Government Users Conference in Washington, DC. It was an interesting experience, meeting people from agencies responsible for processing lots of important data. One thing I noticed is that in the Bay Area, your name tag at an event tries to convey that you’re working on super-cool stuff. Here in DC, it’s cooler to be classified. For example, name tags more…
November 16, 2009
I’m going to be giving a talk at the Bay Area ACM data mining SIG in December, and I need to finalize my topic soon – like today 🙂 I was going to expand on my Elastic Web Mining talk (“Web mining for SEO keywords”) from the ACM data mining unconference a few weeks back. But the fact that I’ll have tens to hundreds of millions of web page data more…
November 2, 2009
Here’s the presentation I gave at the ACM data mining unconference on elastic web mining – how to create scalable, reliable, and cost-effective web mining solutions using an open source stack (Hadoop, Cascading, Bixo) running in Amazon’s Elastic Compute Cloud (EC2). [slideshare id=2407600&doc=acmuctalk-091102194640-phpapp02] But I don’t see my notes showing up, so here’s the PDF version with full notes, which make the resulting slides a lot more meaningful. [slideshare more…
October 30, 2009
This coming Sunday is the big Bay Area data mining “unconference”, and with more than 200 people already signed up, it’s going to be a lot of fun. I’ll be presenting at some point during the day – since it’s an unconference, you don’t really know who’s going to be talking about what/when. My topic is “Elastic web mining using open source (Hadoop/Cascading/Bixo) in Amazon’s EC2 cloud”. If you scan more…