Please see our Training page for details about all of our courses.
In my previous blog post on text feature selection, I covered some of the key steps:
Extract the relevant text from the content.
Tokenize this text into discrete words.
Normalize these words (case-folding, stemming), plus a bit of filtering out of “bad words”.
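As a rough illustration of the tokenize, normalize, and filter steps above, here is a minimal Python sketch. It assumes the relevant text has already been extracted from the content. The names `STOP_WORDS`, `tokenize`, `naive_stem`, and `extract_terms` are hypothetical, the stop list is made up for the example, and the suffix stripping is only a crude stand-in for a real stemmer such as Porter; none of this is the code from the earlier post.

```python
import re

# Hypothetical "bad words" (stop word) list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is"}


def tokenize(text):
    """Split text into discrete word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[A-Za-z0-9']+", text)


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_terms(text):
    """Tokenize, case-fold, drop stop words, then stem what is left."""
    terms = []
    for token in tokenize(text):
        word = token.lower()  # case-folding
        if word in STOP_WORDS:  # filter out "bad words"
            continue
        terms.append(naive_stem(word))
    return terms


print(extract_terms("The cats were jumping"))
```

In practice the stop list and stemmer would come from whatever text analysis library you already use; the point here is only the order of the steps.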
In this blog post I'm going to talk about improving the quality of more...
Ken will be giving a talk on Thursday, September 11th at this year's Cassandra Summit in San Francisco. His presentation describes how Early Warning (one of Scale Unlimited's clients) uses Cassandra and Solr to handle fuzzy entity matching across hundreds of millions of people and companies.