| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-03-19 | 4.2 kB | |
| v0.17.0 source code.tar.gz | 2025-03-19 | 7.2 MB | |
| v0.17.0 source code.zip | 2025-03-19 | 7.3 MB | |
| Totals: 3 Items | | 14.6 MB | 0 |
Highlights:
- Light-weight installation without UMAP and HDBSCAN by @MaartenGr in #2289
- Add Model2Vec as an embedding backend by @MaartenGr in #2245
- Add LiteLLM as a representation model by @MaartenGr in #2213
- Interactive DataMapPlot by @MaartenGr in #2287
Fixes:
- Lightweight installation: use safetensors without torch by @hedgeho in #2306
- Fix missing links by @MaartenGr in #2305
- Set up pre-commit hooks by @afuetterer in #2283
- Fix handling OpenAI returning None objects by @jeaninejuliettes in #2280
- Add support for python 3.13 by @afuetterer in #2173
- Added system prompts by @Leo-LiHao in #2145
- More documentation for topic reduction by @MaartenGr in #2260
- Drop support for python 3.8 by @afuetterer in #2243
- Fixed online topic modeling on GPU by @SSivakumar12 in #2181
- Fixed hierarchical cluster visualization by @PipaFlores in #2191
- Remove duplicated phrase by @AndreaFrancis in #2197
Model2Vec
Model2Vec gives BERTopic a pipeline for light-weight, static embeddings. Combined with the light-weight installation, you can now run BERTopic without PyTorch!
Installation is straightforward:
    pip install --no-deps bertopic
    pip install --upgrade numpy pandas scikit-learn tqdm plotly pyyaml
This installs BERTopic without UMAP or HDBSCAN, so you can use other techniques instead. If they are not installed, BERTopic falls back to PCA with scikit-learn's HDBSCAN. To install them together with Model2Vec:

    pip install model2vec umap-learn hdbscan
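To see which path an environment would take, a small check of the optional packages can help. The sketch below only mirrors the fallback described above and is not BERTopic's internal logic; the helper names `is_installed` and `describe_backends` are hypothetical, and treating the two packages independently is an assumption.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


def describe_backends(installed=is_installed) -> dict:
    """Report which backends the fallback described above would suggest.

    `installed` is injectable so the logic can be exercised without
    changing the real environment.
    """
    return {
        # umap-learn installs the top-level module `umap`
        "dimensionality_reduction": "UMAP" if installed("umap") else "PCA (scikit-learn)",
        "clustering": "hdbscan package" if installed("hdbscan") else "HDBSCAN (scikit-learn)",
    }


if __name__ == "__main__":
    for step, backend in describe_backends().items():
        print(f"{step}: {backend}")
```

Running the script prints one line per step, which makes it easy to confirm whether the light-weight or the full installation is active.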
Then, creating a BERTopic model is as straightforward as you are used to:
    :::python
    from bertopic import BERTopic
    from model2vec import StaticModel

    # Model2Vec
    embedding_model = StaticModel.from_pretrained("minishlab/potion-base-8M")

    # BERTopic
    topic_model = BERTopic(embedding_model=embedding_model)
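If you prefer to make the light-weight pipeline explicit rather than rely on the fallback, the reduction and clustering steps can be passed in directly. This is a minimal sketch, assuming scikit-learn 1.3 or newer (for `sklearn.cluster.HDBSCAN`); the function name `build_lightweight_model` is hypothetical, and the values for `n_components` and `min_cluster_size` are illustrative rather than BERTopic's defaults.

```python
def build_lightweight_model(min_cluster_size: int = 10, n_components: int = 5):
    """Build a BERTopic model that needs neither PyTorch, umap-learn, nor hdbscan."""
    # Imported lazily so this snippet can be loaded even when the
    # optional packages are not installed yet.
    from bertopic import BERTopic
    from model2vec import StaticModel
    from sklearn.cluster import HDBSCAN  # available in scikit-learn >= 1.3
    from sklearn.decomposition import PCA

    # Static embeddings from Model2Vec (same model as in the release notes)
    embedding_model = StaticModel.from_pretrained("minishlab/potion-base-8M")

    return BERTopic(
        embedding_model=embedding_model,
        # PCA stands in for UMAP as the dimensionality reduction step
        umap_model=PCA(n_components=n_components),
        # scikit-learn's HDBSCAN stands in for the hdbscan package
        hdbscan_model=HDBSCAN(min_cluster_size=min_cluster_size),
    )
```

Calling `build_lightweight_model().fit_transform(docs)` on your own list of documents then works as with any other BERTopic model.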
DataMapPlot
To use the interactive version of DataMapPlot, you only need to run the following:
    :::python
    from umap import UMAP

    # Assumes `docs`, their precomputed `embeddings`, and a fitted `topic_model` already exist
    # Reduce your embeddings to 2-dimensions
    reduced_embeddings = UMAP(n_neighbors=10, n_components=2, min_dist=0.0, metric='cosine').fit_transform(embeddings)

    # Create an interactive DataMapPlot figure
    topic_model.visualize_document_datamap(docs, reduced_embeddings=reduced_embeddings, interactive=True)
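The snippet above relies on `docs`, `embeddings`, and a fitted `topic_model`. One way to produce them with Model2Vec is sketched below. The helper names `check_reduced_embeddings` and `prepare_datamap_inputs` are hypothetical; the sketch also assumes `umap-learn` is installed and that `docs` is a list of strings large enough for topic modeling.

```python
def check_reduced_embeddings(docs, reduced_embeddings) -> bool:
    """Raise ValueError unless there is exactly one 2-D point per document."""
    if len(reduced_embeddings) != len(docs):
        raise ValueError("expected one reduced embedding per document")
    for point in reduced_embeddings:
        if len(point) != 2:
            raise ValueError("each reduced embedding must have 2 dimensions")
    return True


def prepare_datamap_inputs(docs):
    """Fit a topic model and return it with 2-D embeddings for plotting."""
    # Imported lazily so the helper above stays usable without these packages
    from bertopic import BERTopic
    from model2vec import StaticModel
    from umap import UMAP

    embedding_model = StaticModel.from_pretrained("minishlab/potion-base-8M")
    # Embed once and reuse the result for both fitting and plotting
    embeddings = embedding_model.encode(docs)

    topic_model = BERTopic(embedding_model=embedding_model)
    topic_model.fit_transform(docs, embeddings)

    reduced_embeddings = UMAP(
        n_neighbors=10, n_components=2, min_dist=0.0, metric="cosine"
    ).fit_transform(embeddings)
    check_reduced_embeddings(docs, reduced_embeddings)
    return topic_model, reduced_embeddings
```

The returned pair can be passed straight to `visualize_document_datamap` together with `docs`; embedding once avoids recomputing the vectors for the plot.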