
Data Mining: Instance-Based Classifiers - Nearest Neighbor (KNN)

Author: Cache_wood · 2022-04-12


Instance-Based Classifiers

Examples:

  • Rote-learner

    Memorizes the entire training set and performs classification only if the attributes of a record exactly match one of the training examples.

  • Nearest neighbor

    Uses the k "closest" points (nearest neighbors) to perform classification.

Nearest Neighbor Classifiers

Requires three things:

  • The set of stored records.
  • Distance Metric to compute distance between records.
  • The value of k, the number of nearest neighbors to retrieve.

To classify an unknown record:

  • Compute its distance to the training records.
  • Identify the k nearest neighbors.
  • Use the class labels of the nearest neighbors to determine the class label of the unknown record (e.g., by taking a majority vote); a minimal sketch follows.
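A minimal sketch of these three steps in Python. The dataset and the names `euclidean`, `classify`, `train_X`, and `train_y` are illustrative, not from the original text:

```python
import numpy as np
from collections import Counter

def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sqrt(np.sum((p - q) ** 2)))

def classify(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training records."""
    # 1. Compute the distance from x to every stored training record.
    dists = [euclidean(x, rec) for rec in train_X]
    # 2. Identify the k nearest neighbors.
    nearest = np.argsort(dists)[:k]
    # 3. Take the majority vote over their class labels.
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Toy usage: two clusters, query point near cluster "A".
train_X = [(1, 1), (1, 2), (5, 5), (6, 5)]
train_y = ["A", "A", "B", "B"]
print(classify(train_X, train_y, (2, 1), k=3))  # -> A
```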

Definition of Nearest Neighbor

The k-nearest neighbors of a record x are the data points that have the k smallest distances to x.

Nearest Neighbor Classification

Compute the distance between two points, e.g., the Euclidean distance used in the sketch above:

$$d(p, q) = \sqrt{\sum_{i}\left(p_i - q_i\right)^2}$$

Determine the class from the nearest neighbor list:

  • Take the majority vote of the class labels among the k nearest neighbors.
  • Optionally, weight each vote according to distance (e.g., by a factor such as $w = 1/d^2$), as in the sketch below.
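A sketch of the distance-weighted variant, reusing `euclidean` from the earlier sketch. The weight $w = 1/d^2$ is one common choice, used here as an assumption rather than a prescription:

```python
from collections import defaultdict

def classify_weighted(train_X, train_y, x, k=3, eps=1e-12):
    """Like classify(), but each neighbor's vote is weighted by 1/d^2."""
    # Pair each training record's distance with its label, nearest first.
    pairs = sorted((euclidean(x, rec), y) for rec, y in zip(train_X, train_y))
    scores = defaultdict(float)
    for d, y in pairs[:k]:
        scores[y] += 1.0 / (d * d + eps)  # eps guards against d == 0
    return max(scores, key=scores.get)
```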

Choosing the value of k:

  • If k is too small, sensitive to noise points.
  • If k is too large, neighborhood may include points from other classes.
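One simple way to navigate this trade-off is to evaluate several candidate values of k on held-out data and keep the best one. A minimal sketch reusing `classify` from above; the candidate list is arbitrary:

```python
def choose_k(train_X, train_y, val_X, val_y, candidates=(1, 3, 5, 7, 9)):
    """Return the candidate k with the highest validation accuracy."""
    best_k, best_acc = candidates[0], -1.0
    for k in candidates:
        preds = [classify(train_X, train_y, x, k=k) for x in val_X]
        acc = sum(p == y for p, y in zip(preds, val_y)) / len(val_y)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k
```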

Scaling issues

  • Attributes may have to be scaled to prevent distance measures from being dominated by one of the attributes.
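For example, if one attribute is a height in meters (roughly 1.5 to 1.8) and another is an income in dollars (thousands to millions), the income differences swamp the height differences. A minimal min-max scaling sketch; in practice the training minima and maxima would also be applied to test records:

```python
import numpy as np

def min_max_scale(X):
    """Rescale each attribute (column) to the [0, 1] range."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span
```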

Problem with Euclidean measure:

  • High dimensional data: curse of dimensionality
  • Can produce counter-intuitive results.
  • Solution: Normalize the vectors to unit length
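A sketch of the unit-length normalization suggested above (zero vectors are left unchanged):

```python
import numpy as np

def to_unit_length(X):
    """Scale every record (row) to Euclidean norm 1."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms > 0, norms, 1.0)
```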

k-NN classifiers are lazy learners

  • They do not build models explicitly.
  • Unlike eager learners such as decision tree induction and rule-based systems.
  • Classifying unknown records is relatively expensive.

Example: PEBLS

PEBLS: Parallel Exemplar-Based Learning System (Cost & Salzberg)

  • Works with both continuous and nominal features.

    • For nominal features, the distance between two values $V_1$ and $V_2$ is computed with the (modified) value difference metric:

      $$d(V_1, V_2) = \sum_{i}\left|\frac{n_{1i}}{n_1} - \frac{n_{2i}}{n_2}\right|$$

      where $n_{1i}$ is the number of records with value $V_1$ that belong to class $i$, and $n_1$ is the total number of records with value $V_1$ (likewise for $V_2$).

  • Each record is assigned a weight factor.

  • The number of nearest neighbors is k = 1.

Distance between record X and record Y:

$$\Delta(X, Y) = w_X\, w_Y \sum_{i=1}^{d} d(X_i, Y_i)^2$$

where:

$$w_X = \frac{\text{number of times } X \text{ is used for prediction}}{\text{number of times } X \text{ predicts correctly}}$$

  • $w_X \approx 1$ if X makes accurate predictions most of the time.
  • $w_X > 1$ if X is not reliable for making predictions.
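A sketch of both PEBLS ingredients as described above. The function names and the callable `feature_dist` parameter are my own, not from the original system:

```python
from collections import Counter

def mvdm(values, labels, v1, v2):
    """Value difference metric between nominal values v1 and v2.

    Assumes both values actually occur in `values` (n1, n2 > 0).
    """
    n1 = sum(v == v1 for v in values)
    n2 = sum(v == v2 for v in values)
    c1 = Counter(l for v, l in zip(values, labels) if v == v1)
    c2 = Counter(l for v, l in zip(values, labels) if v == v2)
    return sum(abs(c1[c] / n1 - c2[c] / n2) for c in set(labels))

def pebls_distance(x, y, w_x, w_y, feature_dist):
    """Delta(X, Y) = w_X * w_Y * sum_i d(X_i, Y_i)^2."""
    return w_x * w_y * sum(feature_dist(a, b) ** 2 for a, b in zip(x, y))
```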
