File name: Spotting Outliers in Large Distributed Datasets using
Data Mining KDD
ABSTRACT
Outliers are abnormal instances or observations. Detecting outliers
in data is an important task in knowledge discovery in databases (KDD).
Outlier detection has been studied in a large number of research
areas, such as large distributed systems, data mining, wireless
sensor networks (WSN), health monitoring, environmental science,
and statistics. Density-based (DB) outlier detection techniques are
robust in detecting outliers. In many applications, large volumes of
distributed data are generated every day.
Finding deviating observations across a large distributed database,
rather than within any individual database, is not a simple task.
Integrating distributed databases causes two major problems. First,
it requires bringing together massive data from different databases.
Second, data integration may violate data security and leak
sensitive information. In this work we propose a cell-density-based
mechanism for outlier detection (CDOD) in large distributed
databases. A centralized detection paradigm is used; it avoids
both the expensive data integration and the information
leakage. The experimental results show that the mechanism is robust
at finding outliers across large numbers of databases, instances, and
attributes.
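
The general idea of cell-based density detection without moving raw data can be sketched as follows. This is only an illustrative sketch and not the authors' CDOD algorithm, which the abstract does not specify. It assumes a uniform grid, that each site shares only per-cell counts with a coordinator, and that a point is an outlier when its cell is globally sparse. The function names and the parameters `width` and `min_count` are hypothetical.

```python
from collections import Counter
from math import floor


def cell_of(point, width):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(floor(x / width) for x in point)


def local_cell_counts(points, width):
    """Run at each site: summarize local data as per-cell counts."""
    return Counter(cell_of(p, width) for p in points)


def merge_counts(per_site_counts):
    """Run at the coordinator: add up the per-cell counts from all sites."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)
    return total


def local_outliers(points, global_counts, width, min_count):
    """Run at each site: flag points whose cell is globally sparse."""
    return [p for p in points if global_counts[cell_of(p, width)] < min_count]


# Toy data for two sites (assumed values, for illustration only).
site_a = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (5.5, 5.5)]
site_b = [(0.4, 0.3), (0.6, 0.7), (9.1, 0.2)]
width, min_count = 1.0, 2  # hypothetical cell width and density threshold

# Only the cell counts leave each site; the raw points stay local.
global_counts = merge_counts(local_cell_counts(s, width) for s in (site_a, site_b))
outliers_a = local_outliers(site_a, global_counts, width, min_count)
outliers_b = local_outliers(site_b, global_counts, width, min_count)
```

In this sketch the coordinator sees only aggregated counts per cell, which is one plausible way a centralized scheme could limit both data transfer and exposure of sensitive records.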