The past few evenings I’ve been working through a review copy of Google’s PageRank and Beyond, by Amy Langville and Carl Meyer. Unlike some recent books on Google, this isn’t exactly an easy and engaging summer read. However, if you have an interest in search algorithms, applied math, or search engine optimization, or are considering building your own search engine, this is a book for you.
Students of search and information retrieval literature may recognize the authors, Langville and Meyer, from their review paper, Deeper Inside PageRank. Their new book expands on the technical subject material in the original paper, and adds many anecdotes and observations in numerous sidebars throughout the text. The side notes provide some practical, social, and recent historical context for the math being presented, including topics such as “PageRank and Link Spamming”, “How Do Search Engines Make Money?”, “SearchKing vs Google”, and a reference to Jeremy Zawodny’s PageRank is Dead post. There is also some sample Matlab code and pointers to web resources related to search engines, linear algebra, and crawler implementations. (The aspiring search engine builder will want to explore some of these resources, and look elsewhere, to learn about web crawlers and large scale computation, which are not the focus here.)
This book could serve as an excellent introduction to search algorithms for someone with a programming or mathematics background, covering PageRank at length, along with some discussion of HITS, SALSA, and antispam approaches. Some current topics, such as clustering, personalization, and reputation (TrustRank/SpamRank), are not covered here, although they are mentioned briefly. The bibliography and web resources provide a comprehensive source list for further research (up through around 2004), which will help point motivated readers in the right direction. I’m sure it will be popular at Google and Yahoo, and perhaps at various SEO agencies as well.
Those with less interest in the innards of search technology may enjoy a more casual summer read about Google; try John Battelle’s The Search. Or get Langville and Meyer’s book, skip the math, and just read the sidebars.
See also: A Reading List on PageRank and Search Algorithms, my del.icio.us links on search algorithms