Reading notes, updated regularly
2009-01-23
Page numbers below refer to the PDF file; subtract 27 to get the page number in the printed book.
P8 content
***** 2009-01-23 *****
P26
The story of the cover illustration
Inhabitants are distinguished by their dress codes
P31
Ref book "The Wisdom of Crowds" by James Surowiecki
'When four basic conditions are met, a crowd's collective intelligence will produce better results than those of a small group of experts'
Comment: good point, but too simple
P32
Ref book "The Hundredth Monkey" by Ken Keyes; the story starts in 1952
Early research in group behavior.
Example:
In 1952, one monkey learnt to wash potatoes
By 1958, say, 99 monkeys had learnt to wash potatoes
The next day, thousands of monkeys had learnt it
So 100 is the tipping point
End 47
***** 2009-2-10 Chap 2 *****
P50
List and architecture of asynchronous and synchronous services
P57-58
Vector space model for text analysis
Tokenize -> Normalize -> Eliminate stop words -> Stem
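
To make the four steps concrete, here is a minimal Python sketch that turns a piece of text into a term vector (term -> count). It is not the book's code: the stop-word list is a tiny assumed sample, and stem() is a crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for"}


def tokenize(text):
    # Split on anything that is not a letter or a digit.
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def normalize(tokens):
    # Lower-case every token so "The" and "the" count as one term.
    return [t.lower() for t in tokens]


def eliminate_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    # Crude suffix stripping (illustrative only, not a real stemmer).
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def term_vector(text):
    # Run the four steps in order and count the resulting terms.
    tokens = eliminate_stop_words(normalize(tokenize(text)))
    return Counter(stem(t) for t in tokens)
```

Calling term_vector() on two texts gives two sparse vectors that can then be weighted (for example with TF-IDF) and compared.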
P60
Similarity metric computed from metadata (P53)
Treat users the same way as items
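
A small sketch of one common similarity metric, cosine similarity, over sparse term vectors. Assumption: users and items are both represented as term -> weight dictionaries built from metadata, so the same function compares item with item, user with user, or user with item. The vectors in the example are made up.

```python
import math


def cosine_similarity(a, b):
    # Dot product over the terms of a; terms missing from b contribute 0.
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty vector has no direction; treat it as not similar.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical metadata vectors: a user profile has the same shape as an item.
item = {"monkey": 1.0, "potato": 1.0}
user = {"monkey": 2.0, "potato": 2.0}
other_item = {"search": 1.0}
```

Vectors pointing in the same direction score close to 1 regardless of length, and vectors with no shared terms score 0.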
End 68
***** 2009-2-24 Chap 3 *****
P72 correlation calculation
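
A minimal sketch of a correlation calculation, assuming the Pearson correlation between two equal-length rating lists is what is wanted. The function name and the zero-variance convention (return 0.0) are my own choices.

```python
import math


def pearson(x, y):
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant sequence has no defined correlation; report 0.0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)
```

Unlike plain cosine similarity, this subtracts each sequence's mean first, so a user who rates everything high and one who rates everything low can still correlate strongly.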
P81 three kinds of tags: professional, user, machine
P84 tag cloud
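
A sketch of the core of a tag cloud: map each tag's usage count to a font-size bucket. Assumptions: linear scaling and five buckets (1 to 5); the function and parameter names are made up for illustration.

```python
def tag_cloud(tag_counts, min_size=1, max_size=5):
    # Returns tag -> font-size bucket, scaled linearly between the
    # least-used and the most-used tag.
    if not tag_counts:
        return {}
    low = min(tag_counts.values())
    high = max(tag_counts.values())
    span = high - low
    sizes = {}
    for tag, count in sorted(tag_counts.items()):  # alphabetical order
        if span == 0:
            # All tags are equally popular: use the smallest size.
            size = min_size
        else:
            size = min_size + round((count - low) / span * (max_size - min_size))
        sizes[tag] = size
    return sizes
```

When a few tags dominate, scaling log(count) instead of count keeps the rare tags from all collapsing into the smallest bucket.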
P94 tag database and SQL queries
coding...
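
For the coding part, a minimal sketch of a tag table and a frequency query. It uses Python's built-in sqlite3 with an in-memory database; the single item_tag table, its columns, and the sample rows are all hypothetical and not the book's schema.

```python
import sqlite3


def build_demo_db():
    # In-memory database with one hypothetical table: who tagged what with which tag.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item_tag ("
        " user_id TEXT NOT NULL,"
        " item_id TEXT NOT NULL,"
        " tag TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO item_tag (user_id, item_id, tag) VALUES (?, ?, ?)",
        [
            ("alice", "post1", "java"),
            ("bob", "post1", "java"),
            ("alice", "post2", "search"),
            ("bob", "post2", "java"),
        ],
    )
    conn.commit()
    return conn


def tag_frequencies(conn):
    # Count how often each tag was applied, most frequent first.
    return conn.execute(
        "SELECT tag, COUNT(*) AS n FROM item_tag GROUP BY tag ORDER BY n DESC, tag"
    ).fetchall()
```

The (tag, count) rows from tag_frequencies() are the kind of input a tag cloud needs.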
P109 Chap 4
P115 blog database, plus wiki, group, and so on
coding...
P125 tag cloud for the blog text
P134 Chap 5
P136 RSS history; RSS ended up as an XML format
P155 Technorati blog search: don't understand it yet
Search blogs through blog-tracking providers via API or RSS, without a crawler
P162 Bloglines search API
End 172
***** 2009-3-26 Chap 6 *****
P197 a not-so-good example of MapReduce
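
For comparison, the usual introductory MapReduce example is word count. Below is a single-process Python sketch of the three phases (map, shuffle, reduce). It only illustrates the programming model; it is not the book's example and not a distributed implementation.

```python
from collections import defaultdict


def map_phase(document):
    # Emit a (word, 1) pair for every word in one document.
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    # Group all emitted values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    # Collapse all values for one key into a single result.
    return word, sum(counts)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data between them; only the two user-supplied functions stay the same.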
P202 Chap 7 data mining
P208 Summary of data mining algorithms
End 210
***** 2009-5-6 Chap 7 *****
P213 WEKA and its package APIs (Java, open source)
P221 standard API: JDM (Java Data Mining)
P233 Chap 8 text analysis
P235 Lucene for text search (Java, open source)
P264 text matching for the ad business
End 267
***** 2009-7-27 Chap 9 *****
P289 WEKA API; P295 JDM API
P301 Chap 10 prediction
Decision trees, Naive Bayes, and belief networks?
P336 Chap 11 intelligent search
P376 Chap 12 Recommendation engine
End