Multiclass classification can be divided into two categories: nominal classification and ordinal classification.
- For nominal classification, think of a model that outputs a probability distribution such as [0.7, 0.1, 0.2] to distinguish between car, human, and tree.
- For ordinal classification, think of a problem that grades a child's computer addiction into four ordered levels: Very Severe, Severe, Moderate, and Good.
Solving Nominal Classification Problems
- No order or magnitude relationship between classes.
- Sum of outputs must be 1 (probability).
- Independent threshold setting for each class.
- Primarily uses the one-vs-rest approach.
- Typically uses the Softmax function.
# Example of a typical multiclass (nominal) classification network
import torch
import torch.nn as nn

input_size = 20  # number of input features (example value)

class NominalClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(input_size, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 3)  # 3 classes; raw logits, no Softmax here
        )

    def forward(self, x):
        return self.model(x)

# Loss function: nn.CrossEntropyLoss applies log-softmax internally,
# so the model must output raw logits; applying nn.Softmax before this
# loss would distort the gradients
criterion = nn.CrossEntropyLoss()
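A quick usage sketch (with a random mini-batch, purely illustrative): the loss consumes logits during training, and softmax is applied separately whenever probabilities that sum to 1 are needed.

model = NominalClassifier()
x = torch.randn(8, input_size)        # fake mini-batch (illustrative)
y = torch.randint(0, 3, (8,))         # fake integer labels

logits = model(x)
loss = criterion(logits, y)           # CrossEntropyLoss on raw logits
probs = torch.softmax(logits, dim=1)  # rows sum to 1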
What does threshold optimization mean in nominal classification problems:
- Generally, multi-class models output probability values for each class.
- By default, predictions are made based on the class with the highest probability.
- However, this default approach isn't always optimal.
- Therefore, we use threshold optimization to:
  - Address class imbalance problems
  - Adjust False Positive/Negative ratios for specific classes
1. Class Imbalance
# For imbalanced data
# Dogs: 1000 samples, Cats: 100 samples, Birds: 50 samples
# Lower thresholds for minority classes to increase prediction opportunities
thresholds = [0.6, 0.4, 0.3] # Higher for majority class, lower for minority classes
2. Different Misclassification Costs
# Example: misclassifying a bird as a dog is worse than misclassifying
# it as a cat, so the bird threshold is set lower
import numpy as np

def cost_sensitive_predict(probs):
    # Lower threshold for the bird class (class order: dog, cat, bird)
    thresholds = np.array([0.5, 0.5, 0.3])
    exceeds = probs >= thresholds
    # Decision logic (one simple rule): most probable class among those
    # over their threshold, falling back to plain argmax when none qualify
    masked = np.where(exceeds, probs, -np.inf)
    return np.where(exceeds.any(axis=1), masked.argmax(axis=1), probs.argmax(axis=1))
3. Adjusting Precision/Recall for Specific Classes
# To raise precision for the dog class,
# set the dog-class threshold higher
thresholds = [0.7, 0.5, 0.5]
# To raise recall for the dog class,
# set the dog-class threshold lower
thresholds = [0.3, 0.5, 0.5]
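A minimal sketch of this trade-off, treating the dog class one-vs-rest; here `probs` and `y_true` are placeholder names for a fitted model's probability matrix and the integer labels.

# Effect of the dog-class threshold on its precision/recall,
# treated one-vs-rest; `probs` / `y_true` are placeholder names
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_dog = (y_true == 0).astype(int)             # 1 if the sample is a dog
for t in (0.7, 0.3):                          # high vs. low threshold
    pred_dog = (probs[:, 0] >= t).astype(int)
    p = precision_score(y_dog, pred_dog, zero_division=0)
    r = recall_score(y_dog, pred_dog, zero_division=0)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")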
Types:
1. Optimization through Grid Search
- Set threshold values independently for each class.
- Search for optimal values by trying all possible combinations.
# No order relationship: treat each class independently
thresholds = np.arange(0.1, 0.9, 0.1)
for t1 in thresholds:
    for t2 in thresholds:
        pred = (proba > np.array([t1, t2])).astype(int)
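A fuller version of this search might score every combination and keep the best, as sketched below; `proba` (an (n_samples, 3) probability matrix) and `y_true` (integer labels) are assumed to exist already.

# Grid search over per-class thresholds scored by macro-F1 (a sketch)
import itertools
import numpy as np
from sklearn.metrics import f1_score

best_score, best_thresholds = -1.0, None
for combo in itertools.product(np.arange(0.1, 0.9, 0.1), repeat=proba.shape[1]):
    # Most probable class among those over threshold, else plain argmax
    masked = np.where(proba >= np.array(combo), proba, -np.inf)
    ok = np.isfinite(masked).any(axis=1)
    pred = np.where(ok, masked.argmax(axis=1), proba.argmax(axis=1))
    score = f1_score(y_true, pred, average='macro')
    if score > best_score:
        best_score, best_thresholds = score, combo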
2. ROC Curve Analysis
- Process by converting each class into a binary classification problem.
- Optimize performance of each class independently without considering order.
# Independent ROC analysis for each class (one-vs-rest);
# y_true is assumed to be one-hot (binarized) here
for class_idx in range(n_classes):
    fpr, tpr, thresholds = roc_curve(y_true[:, class_idx], y_pred[:, class_idx])
    # Youden's J statistic: the threshold that maximizes TPR - FPR
    j_scores = tpr - fpr
    optimal_threshold = thresholds[np.argmax(j_scores)]
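For a self-contained version, the targets can be binarized first; the synthetic data below is purely illustrative.

# Self-contained sketch of per-class ROC threshold selection
# (synthetic data, illustrative only)
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
n_classes = 3
labels = rng.integers(0, n_classes, size=200)         # integer labels
y_true = label_binarize(labels, classes=[0, 1, 2])    # one-hot targets
y_pred = rng.dirichlet(np.ones(n_classes), size=200)  # fake probabilities

optimal_thresholds = []
for class_idx in range(n_classes):
    fpr, tpr, thresholds = roc_curve(y_true[:, class_idx], y_pred[:, class_idx])
    optimal_thresholds.append(thresholds[np.argmax(tpr - fpr)])  # Youden's J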
3. Using Precision-Recall Curves
- Independent optimization for each class.
- Effective with imbalanced data.
# Independent PR-curve analysis for each class
for i in range(n_classes):
    precision, recall, thresholds = precision_recall_curve(y_true[:, i], y_pred[:, i])
    # precision/recall have one more element than thresholds;
    # the small epsilon avoids division by zero
    f1_scores = 2 * (precision * recall) / (precision + recall + 1e-8)
    best_threshold = thresholds[np.argmax(f1_scores[:-1])]
4. Cost Function-based Optimization
- Considers only misclassification costs without regard to order relationships.
- Enables independent cost setting for each class.
def custom_cost(threshold, proba, y_true):
    pred = (proba > threshold).astype(int)
    fp_cost = 1  # cost of a false positive
    fn_cost = 2  # cost of a false negative
    fp = np.sum((pred == 1) & (y_true == 0))
    fn = np.sum((pred == 0) & (y_true == 1))
    return fp * fp_cost + fn * fn_cost
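One way to use this cost (a sketch assuming a single one-vs-rest class with binary arrays `proba_class` and `y_class`, both placeholder names) is a simple scan over candidate thresholds:

# Scan candidate thresholds and keep the cheapest one;
# `proba_class` / `y_class` are placeholder one-vs-rest arrays
candidates = np.arange(0.05, 0.95, 0.05)
costs = [custom_cost(t, proba_class, y_class) for t in candidates]
best_threshold = candidates[int(np.argmin(costs))]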
5. Validation through Cross-Validation
- Validation technique applicable to all optimization methods.
- Used regardless of order relationships.
for train_idx, val_idx in kf.split(X):
    fold_threshold = find_optimal_threshold(X[train_idx], y[train_idx])
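Fleshed out, the loop might average the per-fold thresholds; `find_optimal_threshold` is the undefined helper referenced above, so this remains a sketch.

# Average the threshold found on each fold (sketch; find_optimal_threshold
# is assumed to return a threshold or an array of per-class thresholds)
import numpy as np
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=42)
fold_thresholds = [find_optimal_threshold(X[train_idx], y[train_idx])
                   for train_idx, val_idx in kf.split(X)]
final_threshold = np.mean(fold_thresholds, axis=0)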
Solving Ordinal Classification Problems
- Order relationship exists between classes.
- Relationships between adjacent classes are important.
- Requires special ordinal encoding/decoding.
class OrdinalClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.base_model = nn.Sequential(
            nn.Linear(input_size, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU()
        )
        # One binary classifier per cut point
        self.thresholds = nn.Linear(64, 3)  # 4 classes -> 3 cut points

    def forward(self, x):
        features = self.base_model(x)
        # Cumulative probabilities P(y > k) for each cut point k
        cumulative_probs = torch.sigmoid(self.thresholds(features))
        # Class probabilities as differences of cumulative probabilities
        # (without a monotonicity constraint these can go slightly negative)
        probs = torch.zeros(x.size(0), 4, device=x.device)  # 4 classes
        probs[:, 0] = 1 - cumulative_probs[:, 0]
        probs[:, 1] = cumulative_probs[:, 0] - cumulative_probs[:, 1]
        probs[:, 2] = cumulative_probs[:, 1] - cumulative_probs[:, 2]
        probs[:, 3] = cumulative_probs[:, 2]
        return probs
# Loss function that takes class order into account: each sample is
# weighted by the distance between its predicted and true class, so
# far-off mistakes are penalized more heavily
import torch.nn.functional as F

class OrdinalLoss(nn.Module):
    def forward(self, predictions, targets):
        pred_classes = predictions.argmax(dim=1)
        # Distance-based weight (+1 so correct predictions still contribute)
        weights = 1 + torch.abs(pred_classes - targets).float()
        # Note: F.cross_entropy expects raw logits rather than probabilities
        per_sample = F.cross_entropy(predictions, targets, reduction='none')
        return torch.mean(weights * per_sample)
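A minimal training-step sketch with random data (illustrative only; `input_size` reuses the example value defined earlier):

# One training step on a fake mini-batch (illustrative only)
model = OrdinalClassifier()
criterion = OrdinalLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(32, input_size)  # fake features
y = torch.randint(0, 4, (32,))   # ordinal targets 0..3

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()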
What does threshold optimization mean in ordinal classification problems:
- Finding the optimal thresholds for converting continuous predicted values into actual classes.
- Finds the boundary values that best divide continuous model predictions into the four classes (0, 1, 2, 3).
- Aims, for example, to find thresholds that maximize the Quadratic Weighted Kappa score when that is the evaluation metric.
from scipy.optimize import minimize

KappaOptimizer = minimize(evaluate_predictions,
                          x0=[0.5, 1.5, 2.5],         # initial thresholds
                          args=(y, oof_non_rounded),  # actual and predicted values
                          method='Nelder-Mead')       # optimization algorithm
- Separation
  - x < 0.5 is class 0
  - 0.5 ≤ x < 1.5 is class 1
  - 1.5 ≤ x < 2.5 is class 2
  - x ≥ 2.5 is class 3
- Process
  - Uses the Nelder-Mead algorithm to iteratively adjust thresholds
  - On each attempt, calls the evaluate_predictions function to:
    - Convert predicted values to classes using the current thresholds
    - Calculate the Quadratic Weighted Kappa score
    - Return the negative score (since minimize minimizes, negating the score turns this into maximization)
  - tpTuned = threshold_Rounder(tpm, KappaOptimizer.x)
  - Uses the found optimal thresholds (KappaOptimizer.x) for the final predictions
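The helpers referenced above (evaluate_predictions, threshold_Rounder) are not shown in this post; a plausible sketch consistent with the description, with all details assumed, is:

# Plausible sketches of the helpers referenced above; their actual
# bodies are not shown here, so these are assumptions
import numpy as np
from sklearn.metrics import cohen_kappa_score

def threshold_Rounder(preds, thresholds):
    # Map continuous predictions to classes 0..3 via the cut points
    return np.digitize(preds, bins=np.sort(thresholds))

def evaluate_predictions(thresholds, y, preds):
    rounded = threshold_Rounder(preds, thresholds)
    # Negative QWK, because scipy.optimize.minimize minimizes
    return -cohen_kappa_score(y, rounded, weights='quadratic')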
Types:
1. Kappa Optimization using Nelder-Mead
- Also called the downhill simplex method.
- A derivative-free nonlinear optimization algorithm.
- Particularly useful for objective functions that are non-differentiable or otherwise hard to handle analytically.
2. Cumulative Probability-based Threshold Optimization
3. Cost Function Optimization Considering Order
4. Binary Classifier Combination using the Frank & Hall Method (a brief sketch follows this list)
5. Threshold Optimization through Cross-validation
6. Ensemble-based Threshold Optimization
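As a rough illustration of item 4 (the setup is assumed, not taken from this post): the Frank & Hall method trains one binary classifier per ordered cut point, asking "is the target greater than class k?", and combines the resulting cumulative probabilities.

# Rough sketch of the Frank & Hall decomposition for 4 ordered classes;
# model choice and data names are illustrative assumptions
import numpy as np
from sklearn.linear_model import LogisticRegression

def frank_hall_fit(X, y, n_classes=4):
    # One binary classifier per cut point k: target is 1 when y > k
    return [LogisticRegression().fit(X, (y > k).astype(int))
            for k in range(n_classes - 1)]

def frank_hall_predict(models, X, n_classes=4):
    cum = np.column_stack([m.predict_proba(X)[:, 1] for m in models])  # P(y > k)
    probs = np.empty((X.shape[0], n_classes))
    probs[:, 0] = 1 - cum[:, 0]
    for k in range(1, n_classes - 1):
        probs[:, k] = cum[:, k - 1] - cum[:, k]
    probs[:, -1] = cum[:, -1]
    return probs.argmax(axis=1)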
Needs to be updated with more details (2024.11.17)
I'm not here to take part; I'm here to take over.
- Conor McGregor -