Imbalanced Classification Master Class in Python
This course is designed around the major imbalanced classification techniques that are directly relevant to real-world problems. It is a master class on handling real-world class imbalance.
Course Introduction
Course Outcomes
Course Structure
Imbalanced Classification Defined
Causes of Class Imbalance
Challenge of Imbalanced Classification
Examples of Class Imbalance
Create Synthetic Dataset with Class Distribution
Effect of Skewed Class Distributions
Visualizing Extreme Skew
Why Imbalanced Classification Is Hard
Compounding Effect of Dataset Size
Compounding Effect of Label Noise
Compounding Effect of Data Distribution
Evaluation Metrics and Imbalance
Taxonomy of Classifier Evaluation Metrics
Ranking Metrics for Imbalanced Classification
Probabilistic Metrics for Imbalanced Classification
How to Choose an Evaluation Metric
Accuracy Fails for Imbalanced Classification
Accuracy Paradox
Demo: Accuracy for Imbalanced Classification
Precision for Imbalanced Classification
Precision for Multi-Class Classification
Recall for Imbalanced Classification
Demo: Recall for Imbalanced Classification
F-Measure for Imbalanced Classification
Demo: F-Measure for Imbalanced Classification
ROC Curves and Precision-Recall Curves
ROC Curve
Demo: ROC Curve
ROC Area Under Curve (AUC) Score
Precision-Recall Curves
Precision-Recall Area Under Curve (AUC) Score
ROC AUC With a Severe Imbalance
ROC and Precision-Recall Curves With a Severe Imbalance
Probability Scoring Methods in Python
Log Loss Score
Brier Score
Cross-Validation for Imbalanced Classification
Challenge of Evaluating Classifiers
Failure of k-Fold Cross-Validation
Data Sampling Methods for Imbalanced Classification
Oversampling Techniques
Undersampling Techniques
Combinations of Techniques
Random Resampling Imbalanced Datasets
Demo: Random Oversampling Imbalanced Datasets
Demo: Random Undersampling Imbalanced Datasets
Demo: Combining Random Oversampling and Undersampling Techniques
Synthetic Minority Oversampling Technique (SMOTE)
SMOTE for Balancing Data
SMOTE for Classification
Borderline-SMOTE SVM
Adaptive Synthetic Sampling (ADASYN)
Undersampling Methods
Near Miss Undersampling (NearMiss-1)
Near Miss Undersampling (NearMiss-2 and NearMiss-3)
Condensed Nearest Neighbor Rule Undersampling
Tomek Links for Undersampling
Edited Nearest Neighbors Rule for Undersampling (ENN)
Neighborhood Cleaning Rule for Undersampling
Cost-Sensitive Learning for Imbalanced Classification
Not All Classification Errors Are Equal
Cost-Sensitive Learning
Cost-Sensitive Imbalanced Classification
Cost-Sensitive Methods
Cost-Sensitive Algorithms
Cost-Sensitive Ensembles
Cost-Sensitive Logistic Regression
Logistic Regression for Imbalanced Classification
Weighted Logistic Regression with Scikit-Learn
Grid Search Weighted Logistic Regression
Cost-Sensitive Decision Trees for Imbalanced Classification
Decision Trees for Imbalanced Classification
Weighted Decision Tree With Scikit-Learn
Grid Search Weighted Decision Tree
Develop a Cost-Sensitive Neural Network for Imbalanced Classification
Neural Network Model in Keras
Deep Learning for Imbalanced Classification
Weighted Neural Network With Keras
Project: Breast Cancer Dataset
Haberman Breast Cancer Survival Dataset
Dataset Exploration
Model Test and Baseline Result
Evaluate Probabilistic Models
Model Evaluation With Scaled Inputs
Model Evaluation With Power Transform