Cross-outlier detection

Spiros Papadimitriou, Christos Faloutsos

Research output: Chapter in Book/Report/Conference proceeding › Chapter

24 Scopus citations

Abstract

The problem of outlier detection has been studied in the context of several domains and has received attention from the database research community. To the best of our knowledge, work to date focuses exclusively on the problem as follows [10]: "given a single set of observations in some space, find those that deviate so as to arouse suspicion that they were generated by a different mechanism." However, in several domains, we have more than one set of observations (or, equivalently, a single set with class labels assigned to each observation). For example, in astronomical data, labels may involve types of galaxies (e.g., spiral galaxies with an abnormal concentration of elliptical galaxies in their neighborhood); in biodiversity data, labels may involve different population types (e.g., patches of different species populations, food types, diseases, etc.). A single observation may look normal both within its own class and within the entire set of observations. However, when examined with respect to other classes, it may still arouse suspicion. In this paper, we consider the problem "given a set of observations with class labels, find those that arouse suspicions, taking into account the class labels." This variant has significant practical importance. Many of the existing outlier detection approaches cannot be extended to this case. We present one practical approach for dealing with this problem and demonstrate its performance on real and synthetic datasets.
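The problem statement in the abstract can be illustrated with a small sketch. This is not the approach presented in the chapter, which the abstract does not describe; it is a simple neighbor-share heuristic chosen only to make the problem concrete. The function name `cross_outlier_flags`, the parameters `radius` and `k`, and the toy data are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def cross_outlier_flags(primary, reference, radius, k=2.0):
    """Flag points of `primary` whose neighborhood holds an unusual
    share of `reference` points (hypothetical heuristic, not the
    method from the chapter)."""

    def within(p, q):
        return math.dist(p, q) <= radius

    shares = []
    for p in primary:
        n_ref = sum(within(p, q) for q in reference)
        # Counts p itself, so the denominator below is never zero.
        n_own = sum(within(p, q) for q in primary)
        shares.append(n_ref / (n_ref + n_own))

    mu, sigma = mean(shares), pstdev(shares)
    if sigma == 0:
        # Every point sees the same class mix; nothing stands out.
        return [False] * len(primary)
    # A point is suspicious if its share deviates by more than k sigma.
    return [abs(s - mu) > k * sigma for s in shares]


# Toy data: a 5x5 grid of one class, plus one point of that class
# sitting inside a small cluster of the other class.
primary = [(float(x), float(y)) for x in range(5) for y in range(5)]
primary.append((10.0, 10.0))
reference = [(10.0, 10.5), (10.5, 10.0), (9.5, 10.0), (10.0, 9.5)]

flags = cross_outlier_flags(primary, reference, radius=1.0)
print([i for i, flagged in enumerate(flags) if flagged])  # prints [25]
```

The key point of the sketch is that the score for an observation depends on the labels of its neighbors, so two observations at equally dense locations can receive different scores depending on which class surrounds them.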

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Thanasis Hadzilacos, Yannis Theodoridis, Yannis Manolopoulos, John F. Roddick
Publisher: Springer Verlag
Pages: 199-213
Number of pages: 15
ISBN (Print): 3540405356, 9783540405351
State: Published - 2003
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2750

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
