Track: Search
Paper Title:
First-order Focused Crawling
Authors:
Abstract:
This paper presents a new general framework for focused web
crawling based on relational subgroup discovery. Predicates
explicitly represent the relevance clues of unvisited pages
in the crawl frontier, and first-order classification rules
are then induced using a subgroup discovery technique. The
learned relational rules with sufficient support and
confidence then guide the crawling process.
We describe the key features of our proposed first-order
focused crawler, together with promising preliminary
experimental results.
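
As a rough illustration of the idea summarized above, the following minimal sketch filters rules by support and confidence and uses the surviving rules to order the crawl frontier. It is not the paper's implementation. It assumes ground facts instead of first-order rules with variables, takes hand-written candidate rules instead of inducing them by subgroup discovery, and defines support as the fraction of all training pages that a rule covers and that are relevant. All names (Rule, support_confidence, select_rules, rank_frontier) and the example predicates are hypothetical.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Rule:
    # Conjunction of ground facts, e.g. ("anchor_has", "crawler").
    # Assumption: a propositional stand-in for a first-order rule body.
    body: frozenset

    def covers(self, facts):
        # The rule fires when every fact in its body holds for the page.
        return self.body <= facts


def support_confidence(rule, examples):
    """examples: list of (facts, is_relevant) pairs.

    Assumed definitions:
      support    = relevant covered pages / all pages
      confidence = relevant covered pages / covered pages
    """
    covered = [relevant for facts, relevant in examples if rule.covers(facts)]
    if not covered:
        return 0.0, 0.0
    positives = sum(covered)
    return positives / len(examples), positives / len(covered)


def select_rules(candidates, examples, min_support, min_confidence):
    # Keep only rules that clear both thresholds, paired with their confidence.
    kept = []
    for rule in candidates:
        support, confidence = support_confidence(rule, examples)
        if support >= min_support and confidence >= min_confidence:
            kept.append((rule, confidence))
    return kept


def rank_frontier(frontier, scored_rules):
    """frontier: dict mapping URL -> frozenset of facts about that unvisited page.

    Each URL is scored by the highest confidence among the rules that cover it
    (0.0 if none do); URLs are returned best first.
    """
    heap = []
    for url, facts in frontier.items():
        score = max(
            (conf for rule, conf in scored_rules if rule.covers(facts)),
            default=0.0,
        )
        heapq.heappush(heap, (-score, url))  # negate for max-first ordering
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```

In this sketch a crawler would call select_rules once after training and rank_frontier whenever new links are added to the frontier. Taking the maximum confidence over matching rules is only one possible design choice for combining rules.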