Today I'm going to share a powerful Python library called AutoGluon.
https://github.com/autogluon/autogluon
AutoGluon is an AutoML toolkit for deep learning that automates end-to-end machine learning tasks, letting you achieve strong predictive performance in your applications with just a few lines of code.
First experience
Installing AutoGluon
We can install it directly with pip:
pip install autogluon
Loading datasets
We can load datasets using TabularDataset.
from autogluon.tabular import TabularDataset, TabularPredictor
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
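TabularDataset behaves like a pandas DataFrame, so a quick look at the loaded data works the way you would expect. A minimal sketch; the "class" column here is the label used later in this article:
print(train_data.shape)                     # number of rows and columns
print(train_data.head())                    # first few rows
print(train_data['class'].value_counts())   # distribution of the target column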
Model building
Before building the model, we need to specify the evaluation metric, the dependent variable (label), and the directory where results are stored.
In the following example, we use f1 as the evaluation metric, the dependent variable is "class", and the trained models are saved to the "output_models" folder.
evaluation_metric = "f1"
data_label = "class"
save_path = "output_models"
# Create the predictor with the chosen metric, label, and output directory
predictor = TabularPredictor(label=data_label, path=save_path, eval_metric=evaluation_metric)
predictor = predictor.fit(train_data, time_limit=120)  # fit models for up to 120 seconds

# Compare all trained models on the test set
leaderboard = predictor.leaderboard(test_data)
The leaderboard, shown in the image below, lists every model that was tried and the score each one achieved.
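After training, generating predictions and scoring them takes only a couple of calls. A minimal sketch, reusing the predictor and data_label defined above:
# Predict on the test set; dropping the label first mimics real inference
y_pred = predictor.predict(test_data.drop(columns=[data_label]))
print(y_pred.head())

# Score the predictions against the true labels in the test set
print(predictor.evaluate(test_data))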
Now let's take a look at feature importance.
# Compute feature importance (the data passed in must include the label column)
X = train_data
predictor.feature_importance(X)
All of the trained models are stored in the output folder "output_models".
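Because everything is written to that folder, the predictor can be loaded back later without retraining. A minimal sketch, assuming the same save_path defined above:
from autogluon.tabular import TabularPredictor

# Reload the trained predictor from disk and reuse it for inference
loaded_predictor = TabularPredictor.load(save_path)
y_pred = loaded_predictor.predict(test_data.drop(columns=[data_label]))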