
Version 1.2.0

We're happy to announce the AutoGluon 1.2.0 release.

AutoGluon 1.2 contains massive improvements to both the Tabular and TimeSeries modules, each achieving a 70% win-rate vs AutoGluon 1.1. This release additionally adds support for Python 3.12 and drops support for Python 3.8.

We are also excited to announce AutoGluon-Assistant (AG-A), our first venture into the realm of Automated Data Science!


For Tabular, the primary enhancements, the new TabPFNMix tabular foundation model and the parallel fit strategy, are bundled into the new "experimental_quality" preset to ensure a smooth transition period for those who wish to try the new cutting-edge features. We will use this release to gather feedback before incorporating these features into the other presets. We also introduce a new stack-layer model pruning technique that yields a 3x inference speedup on small datasets with zero performance loss, along with greatly improved post-hoc calibration across the board, particularly on small datasets.

For TimeSeries, we introduce Chronos-Bolt, our latest foundation model integrated into AutoGluon, with massive improvements to both accuracy and inference speed compared to Chronos, along with fine-tuning capabilities. We also added covariate regressor support!

See more details in the Spotlights below!

Spotlight

AutoGluon Becomes the Gold Standard for Competition ML in 2024

Before diving into the new features of 1.2, we would like to highlight the widespread adoption AutoGluon received on competition ML sites like Kaggle in 2024. Across all of 2024, AutoGluon was used to achieve a top-3 finish in 15 of 18 tabular Kaggle competitions, including 7 first-place finishes, and was never outside the top 1% of private leaderboard placements, with an average of over 1000 competing human teams in each competition. In the $75,000 prize money 2024 Kaggle AutoML Grand Prix, AutoGluon was used by the 1st, 2nd, and 3rd place teams, with the 2nd place team led by two AutoGluon developers: Lennart Purucker and Nick Erickson! For comparison, in 2023 AutoGluon achieved only 1 first-place and 1 second-place solution. We attribute the bulk of this increase to the improvements introduced in AutoGluon 1.0 and beyond.

We'd like to emphasize that these results are achieved via human expert interaction with AutoGluon and other tools, and often include manual feature engineering and hyperparameter tuning to get the most out of AutoGluon. For live tracking of all AutoGluon solution placements on Kaggle, refer to the ML competition section of our AWESOME.md, where we provide links to all solution write-ups.

AutoGluon-Assistant: Automating Data Science with AutoGluon and LLMs

We are excited to share the release of the new AutoGluon-Assistant module (AG-A), powered by LLMs from AWS Bedrock or OpenAI. AutoGluon-Assistant empowers users to solve tabular machine learning problems using only natural language descriptions, with zero lines of code, via our simple user interface. Running fully autonomously, AG-A outperforms 74% of human ML practitioners in Kaggle competitions and secured a live top-10 finish in the $75,000 prize money 2024 Kaggle AutoML Grand Prix competition as Team AGA 🤖!

TabularPredictor presets="experimental_quality"

TabularPredictor has a new "experimental_quality" preset that offers even better predictive quality than "best_quality". On the AutoMLBenchmark, we observe a 70% win-rate vs best_quality when running for 4 hours on a 64-CPU machine. This preset is a testing ground for cutting-edge features and models that we hope to incorporate into best_quality in future releases. We recommend using a machine with at least 16 CPU cores, 64 GB of memory, and a time_limit of at least 4 hours to get the most benefit out of experimental_quality. Please let us know via a GitHub issue if you run into any problems running the experimental_quality preset.
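
As a minimal sketch of enabling the preset (the dataset path and label column below are hypothetical):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

# Hypothetical dataset path and label column, for illustration only.
train_data = TabularDataset("train.csv")

predictor = TabularPredictor(label="target").fit(
    train_data,
    presets="experimental_quality",  # enables the cutting-edge models and parallel fitting
    time_limit=4 * 3600,             # 4+ hours recommended to get the most benefit
)
```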

TabPFNMix: A Foundation Model for Tabular Data

TabPFNMix is the first tabular foundation model created by the AutoGluon team, and it was pre-trained exclusively on synthetic data. The model builds upon the prior work of TabPFN and TabForestPFN. To the best of our knowledge, TabPFNMix achieves a new state of the art for individual open-source model performance on datasets between 1,000 and 10,000 samples, and it also supports regression tasks! Across the 109 classification datasets in TabRepo with at most 10,000 training samples, fine-tuned TabPFNMix outperforms all prior models, with a 64% win-rate vs the strongest tree model, CatBoost, and a 61% win-rate vs fine-tuned TabForestPFN.

The model is available via the TABPFNMIX hyperparameters key and is used in the new experimental_quality preset. We recommend using this model for datasets with fewer than 50,000 training samples, ideally with a large time limit and 64+ GB of memory. This work is still in its early stages, and we appreciate any feedback from the community to help us iterate and improve in future releases. You can learn more on the Hugging Face model pages (tabpfn-mix-1.0-classifier, tabpfn-mix-1.0-regressor). Give us a like on Hugging Face if you want to see more! A paper providing more details about the model is planned for the future.
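
A hedged sketch of trying the model on its own via the TABPFNMIX key; the empty config dict assumes the default TabPFNMix settings are used:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # hypothetical dataset with <50,000 rows

# Train only TabPFNMix via its hyperparameters key.
# The empty config dict assumes the default TabPFNMix configuration.
predictor = TabularPredictor(label="target").fit(
    train_data,
    hyperparameters={"TABPFNMIX": {}},
)
```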

fit_strategy="parallel"

AutoGluon's TabularPredictor now supports the new fit argument fit_strategy with the new "parallel" option, enabled by default in the new experimental_quality preset. For machines with 16 or more CPU cores, the parallel fit strategy offers a major speedup over the previous "sequential" strategy. We estimate that with 64 CPU cores, most datasets will experience a 2-4x speedup, with larger speedups as the number of CPU cores increases.
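
A minimal sketch of opting in explicitly, outside the experimental_quality preset (dataset path and label are hypothetical):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # hypothetical dataset

predictor = TabularPredictor(label="target").fit(
    train_data,
    fit_strategy="parallel",  # fit models in parallel instead of one after another
)
```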

Chronos-Bolt⚡: a 250x faster, more accurate Chronos model

Chronos-Bolt is our latest foundation model for forecasting that has been integrated into AutoGluon. It is based on the T5 encoder-decoder architecture and has been trained on nearly 100 billion time series observations. It chunks the historical time series context into patches of multiple observations, which are then input into the encoder. The decoder then uses these representations to directly generate quantile forecasts across multiple future steps—a method known as direct multi-step forecasting. Chronos-Bolt models are up to 250 times faster and 20 times more memory-efficient than the original Chronos models of the same size.

The following plot compares the inference time of Chronos-Bolt against the original Chronos models for forecasting 1024 time series with a context length of 512 observations and a prediction horizon of 64 steps.

Chronos-Bolt models are not only significantly faster but also more accurate than the original Chronos models. The following plot reports the probabilistic and point forecasting performance of Chronos-Bolt in terms of the Weighted Quantile Loss (WQL) and the Mean Absolute Scaled Error (MASE), respectively, aggregated over 27 datasets (see the Chronos paper for details on this benchmark). Remarkably, despite having no prior exposure to these datasets during training, the zero-shot Chronos-Bolt models outperform commonly used statistical models and deep learning models that have been trained on these datasets (highlighted by *). Furthermore, they also perform better than other foundation models (denoted by +), which indicates that those models were pretrained on certain datasets in our benchmark and are not entirely zero-shot. Notably, Chronos-Bolt (Base) also surpasses the original Chronos (Large) model in forecasting accuracy while being over 600 times faster.

Chronos-Bolt models are now available through AutoGluon in four sizes—Tiny (9M), Mini (21M), Small (48M), and Base (205M)—and can also be used on the CPU. With the addition of Chronos-Bolt models and other enhancements, AutoGluon v1.2 achieves a 70%+ win rate against the previous release!
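
A minimal zero-shot sketch, assuming long-format data with item_id, timestamp, and target columns; the "bolt_base" model alias is an assumption based on the size names above:

```python
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

# Hypothetical long-format CSV with item_id, timestamp, and target columns.
train_data = TimeSeriesDataFrame.from_path("train.csv")

predictor = TimeSeriesPredictor(prediction_length=48).fit(
    train_data,
    # "bolt_base" is an assumed alias for the Chronos-Bolt (Base) model.
    hyperparameters={"Chronos": {"model_path": "bolt_base"}},
)
forecasts = predictor.predict(train_data)  # zero-shot quantile forecasts
```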

In addition to the new Chronos-Bolt models, we have also added support for effortless fine-tuning of Chronos and Chronos-Bolt models. Check out the updated Chronos tutorial to learn how to use and fine-tune Chronos-Bolt models.
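
Fine-tuning goes through the same hyperparameters interface; the fine_tune flag below is an assumption based on the tutorial, not a confirmed signature:

```python
from autogluon.timeseries import TimeSeriesPredictor

predictor = TimeSeriesPredictor(prediction_length=48).fit(
    train_data,  # TimeSeriesDataFrame from the previous sketch
    hyperparameters={
        # fine_tune=True is assumed to fine-tune the pretrained weights on train_data
        "Chronos": {"model_path": "bolt_small", "fine_tune": True},
    },
    time_limit=3600,
)
```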

Time Series Covariate Regressors

We have added support for covariate regressors for all forecasting models. Covariate regressors are tabular regression models that can be combined with univariate forecasting models to incorporate exogenous information. These are particularly useful for foundation models like Chronos-Bolt, which rely solely on the target time series' historical data and cannot directly use exogenous information (such as holidays or promotions). To improve the predictions of univariate models when covariates are available, a covariate regressor is first fit on the known covariates and static features to predict the target column at each time step. The predictions of the covariate regressor are then subtracted from the target column, and the univariate model then forecasts the residuals. The Chronos tutorial showcases how covariate regressors can be used with Chronos-Bolt.
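
A hedged sketch of pairing Chronos-Bolt with a covariate regressor; the "CAT" regressor name and "standard" scaler value are assumptions, and the covariate column names are hypothetical:

```python
from autogluon.timeseries import TimeSeriesPredictor

predictor = TimeSeriesPredictor(
    prediction_length=24,
    known_covariates_names=["promotion", "holiday"],  # hypothetical covariate columns
).fit(
    train_data,
    hyperparameters={
        "Chronos": {
            "model_path": "bolt_small",
            "covariate_regressor": "CAT",  # assumed name for a CatBoost covariate regressor
            "target_scaler": "standard",   # assumed scaler option
        },
    },
)

# Forecasting requires the future values of the known covariates;
# future_covariates is a hypothetical TimeSeriesDataFrame covering the horizon.
forecasts = predictor.predict(train_data, known_covariates=future_covariates)
```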


TimeSeries

New Features

  • Add fine-tuning support for Chronos and Chronos-Bolt models @abdulfatir (#4608, #4645, #4653, #4655, #4659, #4661, #4673, #4677)
  • Add Chronos-Bolt @canerturkmen (#4625)
  • TimeSeriesPredictor.leaderboard can now compute extra metrics and return hyperparameters for each model (see the sketch after this list) @shchur (#4481)
  • Add target_scaler support for all forecasting models @shchur (#4460, #4644)
  • Add covariate_regressor support for all forecasting models @shchur (#4566, #4641)
  • Add method to convert a TimeSeriesDataFrame to a regular pd.DataFrame @shchur (#4415)
  • [experimental] Add the weighted cumulative error forecasting metric @shchur (#4594)
  • [experimental] Allow custom ensemble model types for time series @shchur (#4662)
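
A hedged sketch of the expanded leaderboard; both keyword names are assumptions inferred from the feature description above:

```python
# Hypothetical usage of the expanded leaderboard.
lb = predictor.leaderboard(
    test_data,                      # hypothetical held-out TimeSeriesDataFrame
    extra_metrics=["MASE", "WQL"],  # assumed keyword: compute additional metrics
    extra_info=True,                # assumed keyword: return hyperparameters per model
)
print(lb)
```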

Fixes and Improvements

  • Update presets @canerturkmen @shchur (#4656, #4658, #4666, #4672)
  • Unify all Croston models into a single class @shchur (#4564)
  • Bump statsforecast version to 1.7 @canerturkmen @shchur (#4194, #4357)
  • Fix deep learning models failing if item_ids have StringDtype @rsj123 (#4539)
  • Update logic for inferring the time series frequency @shchur (#4540)
  • Speed up and reduce memory usage of the TimeSeriesFeatureGenerator preprocessing logic @shchur (#4557)
  • Update to GluonTS v0.16.0 @shchur (#4628)
  • Refactor GluonTS default parameter handling, update TiDE parameters @canerturkmen (#4640)
  • Move covariate scaling logic into a separate class @shchur (#4634)
  • Prune timeseries unit and smoke tests @canerturkmen (#4650)
  • Minor fixes @abdulfatir @canerturkmen @shchur (#4259, #4299, #4395, #4386, #4409, #4533, #4565, #4633, #4647)

