#1 Detecting missing values
# credit: https://www.kaggle.com/willkoehrsen/start-here-a-gentle-introduction.
import pandas as pd

def missing_values_table(df):
    # Total missing values per column
    mis_val = df.isnull().sum()
    # Percentage of missing values per column
    mis_val_percent = 100 * mis_val / len(df)
    # Make a table with the results
    mis_val_table = pd.concat([mis_val, mis_val_percent], axis=1)
    # Rename the columns
    mis_val_table_ren_columns = mis_val_table.rename(
        columns={0: 'Missing Values', 1: '% of Total Values'})
    # Keep only columns with missing values, sorted by percentage missing (descending)
    mis_val_table_ren_columns = mis_val_table_ren_columns[
        mis_val_table_ren_columns.iloc[:, 1] != 0].sort_values(
        '% of Total Values', ascending=False).round(1)
    # Print some summary information
    print("Your selected dataframe has " + str(df.shape[1]) + " columns.\n"
          "There are " + str(mis_val_table_ren_columns.shape[0]) +
          " columns that have missing values.")
    # Return the dataframe with missing information
    return mis_val_table_ren_columns

# Assumes `train` is a pandas DataFrame that has already been loaded
train_missing = missing_values_table(train)
train_missing
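
To see what kind of table this produces without loading the competition data, here is a minimal sketch on a made-up DataFrame. The `toy` frame and its column names are hypothetical stand-ins for `train`, and the table is built with the `keys` argument of `pd.concat` instead of a separate rename step, so it is a compact variant of the function above rather than the original author's code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for `train`
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan],
    "fare": [7.25, 71.28, np.nan, 8.05],
    "sex": ["m", "f", "f", "m"],
})

# Count and percentage of missing values per column
counts = toy.isnull().sum()
percent = 100 * counts / len(toy)

# Name the two columns directly via `keys`
summary = pd.concat(
    [counts, percent], axis=1,
    keys=["Missing Values", "% of Total Values"],
)

# Keep only columns that have missing values, most-missing first
summary = (
    summary[summary["Missing Values"] > 0]
    .sort_values("% of Total Values", ascending=False)
    .round(1)
)
print(summary)
```

Here `age` is missing in 2 of 4 rows and `fare` in 1 of 4, so only those two columns appear in the table, and `sex` is dropped because it has no missing values.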
#2
You don't have to be great to start, but you have to start to be great.
- Zig Ziglar -