Track: Semantic Web
Paper Title:
Using Google Distance to Weight Approximate Ontology Matches
Authors:
Abstract:
Discovering mappings between concept hierarchies is widely regarded as
one of the hardest and most urgent problems facing the Semantic Web. The
problem is even harder in domains where concepts are inherently vague
and cannot be given a crisp definition. Such domains require a notion of
approximate concept mapping, but until now no such notion has been
available.
The first contribution of this paper is a definition for approximate mappings between concepts. Roughly, a mapping between two concepts is decomposed into a number of submappings, and a sloppiness value determines the fraction of these submappings that can be ignored when establishing the mapping.
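The idea of ignoring a fraction of submappings can be illustrated with a minimal sketch. It assumes that each submapping has already been evaluated to a boolean (holds or fails) and that all submappings count equally; the function names `required_sloppiness` and `approx_maps` are hypothetical and not taken from the paper.

```python
from typing import Sequence


def required_sloppiness(submappings_hold: Sequence[bool]) -> float:
    """Return the fraction of submappings that fail.

    Under the assumptions above, this is the smallest sloppiness value
    at which the overall mapping would be accepted.
    """
    if not submappings_hold:
        # No submappings to check: nothing needs to be ignored.
        return 0.0
    failed = sum(1 for holds in submappings_hold if not holds)
    return failed / len(submappings_hold)


def approx_maps(submappings_hold: Sequence[bool], sloppiness: float) -> bool:
    """Accept the mapping if the failing fraction does not exceed the sloppiness."""
    return required_sloppiness(submappings_hold) <= sloppiness


# Illustrative input: one of four submappings fails.
example = [True, True, False, True]
strict = approx_maps(example, 0.0)    # classical mapping: nothing may be ignored
sloppy = approx_maps(example, 0.25)   # a quarter of the submappings may be ignored
```

With a sloppiness of 0 the sketch reduces to a crisp mapping in which every submapping must hold, and with a sloppiness of 1 any two concepts map to each other.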
A potential problem of such a definition is that, as the sloppiness value increases, it gradually allows mappings between any two arbitrary concepts. To avoid this trivial behaviour, we need a heuristic weighting that minimises the sloppiness required to conclude desirable matches, while maximising the sloppiness required to conclude undesirable matches. The second contribution of this paper is to show that a Google-based similarity measure has exactly these desirable properties.
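As an illustration of what a Google-based measure can look like, the sketch below computes the Normalized Google Distance of Cilibrasi and Vitányi from search hit counts. That this is the exact measure used here is an assumption, since the abstract does not spell out the formula; the function name, the parameter names, and all hit counts are hypothetical.

```python
import math


def normalized_google_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Normalized Google Distance between two terms x and y.

    fx, fy : number of pages containing x, respectively y
    fxy    : number of pages containing both x and y
    n      : total number of indexed pages (assumed larger than fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur, or never co-occur, are treated as
        # infinitely far apart.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Made-up hit counts, for illustration only.
close_terms = normalized_google_distance(1000, 1000, 1000, 10**6)  # always co-occur
far_terms = normalized_google_distance(1000, 1000, 10, 10**6)      # rarely co-occur
```

Terms that frequently occur together receive a small distance. Such a distance could then serve as a weight on the submappings, so that ignoring a submapping between closely related terms costs less sloppiness than ignoring one between unrelated terms.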
We establish these results by experimental validation in the domain of musical genres. We show that this domain does indeed suffer from ill-defined concepts. We take two real-life genre hierarchies from the Web, compute approximate mappings between them at varying levels of sloppiness, and validate our results against a hand-crafted Gold Standard.
Our method makes use of the huge amount of knowledge that is implicit in the current Web, and exploits this knowledge as a heuristic for establishing approximate mappings between ill-defined concepts.