Publication Type: Conference Paper
Source: Proceedings of ICDE 2008, Cancun, Mexico (2008)
Keywords: crawling, graph theory
Given a dynamic corpus whose content and attention change on a daily basis, is it possible to collect and maintain its high-quality resources with minimal investment? We address two problems that arise from this question for hyperlinked corpora such as Web pages or blogs: how to efficiently discover the correct set of authoritative resources given a fixed network, and how to track these resources over time as new entrants arrive, old standbys depart, and existing participants change roles.