
CIBMTR - Equity in post-HCT Survival Predictions #12 Deep understanding of (C-index) evaluation measure for better model

dongsunseng 2025. 2. 10. 20:11

Annotation of this discussion: https://www.kaggle.com/competitions/equity-post-HCT-survival-predictions/discussion/550152

 

CIBMTR - Equity in post-HCT Survival Predictions

Improve prediction of transplant survival rates equitably for allogeneic HCT patients

www.kaggle.com

Deep understanding of (C-index) evaluation measure for better model

I will try to explain the C-index evaluation measure used in this competition so that we can train the model well. Because 75% of the data is not included in the test data, understanding the measure is very important.

 

Let's start with three patient groups:

  • Group A
  • Group B
  • Group C

For each patient, we predict a risk score (a higher score means a higher risk of an early event).

Step 1: Understanding Concordance Index

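The Concordance Index (C-index) evaluates how well the model ranks survival times: for each comparable pair of patients, it checks whether the patient with the shorter survival time received the higher risk score.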

 

Understand with sample data:

Group A has 3 patients with actual survival times and predicted risk scores:

Comparable pairs:

  • (P1, P2): P2 has a shorter survival time and a higher risk score → Concordant 
  • (P1, P3): P3 has a longer survival time and a lower risk score → Concordant 
  • (P2, P3): P3 has a longer survival time and a lower risk score → Concordant 

Total pairs = 3
Total concordant pairs = 3

C-index for Group A = Concordant pairs / Total pairs = 3/3 = 1.0
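The pair counting above can be sketched in a few lines of Python. This is only an illustration: the survival times and risk scores are hypothetical values chosen to match the ordering described (P2 shortest, P3 longest), and the function name `concordance_index` and its `events` argument are made up for this sketch, not taken from the competition code. It assumes every event was observed unless an `events` list says otherwise, and it simply skips pairs with tied survival times.

```python
from itertools import combinations

def concordance_index(times, risks, events=None):
    """Fraction of comparable pairs that the risk scores order correctly."""
    if events is None:
        events = [1] * len(times)  # assumption: every event was observed
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied survival times are skipped in this simplified sketch
        # let `a` be the patient with the shorter survival time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # shorter survival got the higher risk score
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical Group A values (P1, P2, P3), not from the original post
times = [10, 5, 20]
risks = [0.5, 0.9, 0.1]
print(concordance_index(times, risks))  # 1.0
```

Swapping the risk scores so that longer survivors get higher scores (for example `[0.5, 0.1, 0.9]`) drives the same function down to 0.0, which is the worst possible ranking.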

Step 2: Calculate C-index for All Groups

Repeat the process for all groups.

 

For now we can assume:

  • Group A: C-index = 1.0
  • Group B: C-index = 0.8
  • Group C: C-index = 0.6
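To make "repeat the process for all groups" concrete, here is a small sketch that computes one C-index per group. The patient data is entirely hypothetical and was constructed only so that the three groups reproduce the assumed scores of 1.0, 0.8, and 0.6; the helper name `c_index` is also just for illustration. It assumes all events were observed and that there are no ties in survival time or risk score.

```python
from itertools import combinations

def c_index(patients):
    """patients: list of (survival_time, risk_score) tuples.

    Simplified: assumes no censoring and no tied times or scores.
    """
    pairs = list(combinations(patients, 2))
    concordant = sum(
        1
        for (t1, r1), (t2, r2) in pairs
        # concordant when the shorter survival time has the higher risk score
        if (t1 < t2 and r1 > r2) or (t2 < t1 and r2 > r1)
    )
    return concordant / len(pairs)

# Hypothetical data: each tuple is (survival_time, risk_score)
groups = {
    "A": [(10, 0.5), (5, 0.9), (20, 0.1)],
    "B": [(1, 4), (2, 5), (3, 2), (4, 3), (5, 1)],
    "C": [(1, 3), (2, 4), (3, 5), (4, 1), (5, 2)],
}

per_group = {name: c_index(rows) for name, rows in groups.items()}
print(per_group)  # {'A': 1.0, 'B': 0.8, 'C': 0.6}
```

Group B has 2 misordered pairs out of 10 and Group C has 4 out of 10, which is where 0.8 and 0.6 come from in this toy data.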

Step 3: Stratified Concordance Index

The Stratified Concordance Index combines the C-index scores of all groups, focusing on two things:

  1. Average performance across groups (mean of C-indices).
  2. Consistency across groups (low standard deviation of C-indices).

Formula:

Stratified C-index = Mean(C-index scores) - Standard Deviation(C-index scores)

  1. Calculate the mean:
    Mean = (1.0 + 0.8 + 0.6)/3 = 0.8
  2. Calculate the standard deviation:
    Standard Deviation = sqrt(((1.0-0.8)^2 + (0.8-0.8)^2 + (0.6-0.8)^2)/3) = sqrt(0.08/3) ≈ 0.16
  3. Stratified C-index:
    Stratified C-index = 0.8 - 0.16 = 0.64
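The arithmetic in this step can be checked with a short sketch using only the Python standard library. The function name `stratified_c_index` is illustrative. It uses the population standard deviation (`statistics.pstdev`, which divides by the number of groups), since the worked example divides by 3 rather than 2.

```python
from statistics import mean, pstdev

def stratified_c_index(scores):
    """Mean of the per-group C-indices minus their population standard deviation."""
    return mean(scores) - pstdev(scores)

scores = [1.0, 0.8, 0.6]  # per-group C-indices for Groups A, B, C

print(round(mean(scores), 4))                # 0.8
print(round(pstdev(scores), 4))              # 0.1633
print(round(stratified_c_index(scores), 4))  # 0.6367
```

Without rounding the standard deviation to 0.16 first, the result is about 0.637 rather than 0.64. Note also that a model scoring 0.7 in every group would get 0.7 - 0 = 0.7, higher than this example despite its lower mean, which shows how the metric rewards consistency.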

Step 4: Interpret the Results

A high Stratified C-index means:

  • The model predicts well overall (high mean C-index).
  • The model predicts equitably across racial groups (low standard deviation).

Finally, we can say:

  • Group A predictions are perfect (C-index = 1.0).
  • Group B is decent (C-index = 0.8).
  • Group C struggles (C-index = 0.6).

The Stratified C-index of 0.64 shows that while predictions are good overall (mean of 0.8), the inconsistency across groups costs the model 0.16 points.


There is no need to fear failure in advance.
- Bertrand Russell -