cleanlab helps you clean data and labels by automatically detecting issues in a ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, published in this paper and blog. See some of the datasets cleaned with cleanlab at labelerrors.com. This package helps you find label issues and other data issues, so you can train reliable ML models. All features of cleanlab work with any dataset and any model. Yes, any model: PyTorch, Tensorflow, Keras, JAX, HuggingFace, OpenAI, XGBoost, scikit-learn, etc. If you use a sklearn-compatible classifier, all cleanlab methods work out-of-the-box.

Features

  • Binary and multi-class classification
  • Multi-label classification (e.g. image/document tagging)
  • Token classification (e.g. entity recognition in text)
  • Classification with data labeled by multiple annotators
  • Active learning with multiple annotators (suggest which data to label or re-label to improve model most)
  • Outlier and out of distribution detection

Project Samples

Project Activity

See All Activity >

License

Affero GNU Public License

Follow Cleanlab

Cleanlab Web Site

Other Useful Business Software
Our Free Plans just got better! | Auth0 Icon
Our Free Plans just got better! | Auth0

With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
Try free now
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Cleanlab!

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Data Labeling Tool, Python Data Quality Tool

Registered

2023-05-23