Release v0.13.0-alpha.3

Name                                 Modified     Size
README.md                            2023-07-19   975 Bytes
v0.13.0-alpha.3 source code.tar.gz   2023-07-19   604.1 kB
v0.13.0-alpha.3 source code.zip      2023-07-19   648.7 kB

WIP 0.13.0

Major changes

  • Add datapipe.metastore.TransformMetaTable. Each transform now gets its own meta table that tracks the status of each transformation
  • Generalize BatchTransform and DatatableBatchTransform through BaseBatchTransformStep
  • Add transform_keys to *BatchTransform
  • Move changed idx computation out of DataStore to BaseBatchTransformStep
  • Add a priority column to the transform meta table; sort work by priority
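
The idea of a per-transform meta table with a priority column can be illustrated with a minimal sketch. This is not Datapipe's implementation: the schema, the function names (create_meta_table, fill_metadata, pending_work, mark_done) and the choice to process higher priority values first are all assumptions made for illustration, using only the standard-library sqlite3 module.

```python
import sqlite3


def create_meta_table(conn, transform_name):
    # One table per transform. The schema is hypothetical, not Datapipe's.
    conn.execute(
        f'CREATE TABLE "{transform_name}_meta" ('
        "idx TEXT PRIMARY KEY, "
        "is_success INTEGER NOT NULL DEFAULT 0, "
        "priority INTEGER NOT NULL DEFAULT 0)"
    )


def fill_metadata(conn, transform_name, indices):
    # Register (idx, priority) pairs as pending work; existing rows are kept.
    conn.executemany(
        f'INSERT OR IGNORE INTO "{transform_name}_meta" (idx, priority) '
        "VALUES (?, ?)",
        indices,
    )


def pending_work(conn, transform_name):
    # Assumption: a higher priority value means "process earlier".
    rows = conn.execute(
        f'SELECT idx FROM "{transform_name}_meta" '
        "WHERE is_success = 0 ORDER BY priority DESC, idx"
    ).fetchall()
    return [row[0] for row in rows]


def mark_done(conn, transform_name, idx):
    # Record that the transform succeeded for this index.
    conn.execute(
        f'UPDATE "{transform_name}_meta" SET is_success = 1 WHERE idx = ?',
        (idx,),
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_meta_table(conn, "resize_images")
    fill_metadata(conn, "resize_images", [("a", 0), ("b", 5), ("c", 1)])
    print(pending_work(conn, "resize_images"))
    mark_done(conn, "resize_images", "b")
    print(pending_work(conn, "resize_images"))
```

Keeping the status per transform rather than per table means two transforms reading the same input can progress independently, which is the motivation the first bullet above suggests.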

New features

  • Add step reset-metadata CLI command
  • Add step fill-metadata CLI command that populates the transform meta table with all indices to process
  • Add helm chart for running regular loops in k8s as CronJob
  • Switch from vanilla tqdm to tqdm_loggable for better display in logs
  • Add step run-idx CLI command
  • Executors: datapipe.executor.SingleThreadExecutor, datapipe.executor.ray.RayExecutor
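
The release notes do not show the executor interface, so the following is only a sketch of the general pattern of swapping a sequential executor for a parallel one behind the same method. The class names SketchSingleThreadExecutor and SketchPoolExecutor and the run(func, batches) signature are hypothetical, and a standard-library thread pool stands in for Ray.

```python
from concurrent.futures import ThreadPoolExecutor


class SketchSingleThreadExecutor:
    """Hypothetical sequential executor: runs batches one after another."""

    def run(self, func, batches):
        return [func(batch) for batch in batches]


class SketchPoolExecutor:
    """Hypothetical parallel executor; a thread pool stands in for Ray."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, func, batches):
        # pool.map returns results in the same order as the input batches.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(func, batches))


def process_batch(batch):
    # Placeholder for a transform applied to one batch of indices.
    return sum(batch)


if __name__ == "__main__":
    batches = [[1, 2], [3, 4], [5]]
    for executor in (SketchSingleThreadExecutor(), SketchPoolExecutor()):
        print(type(executor).__name__, executor.run(process_batch, batches))
```

Because both classes expose the same run method, the calling code does not need to change when the execution backend does.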

Bugfixes

  • Fix QdrantStore.read_rows when no idx is specified
Source: README.md, updated 2023-07-19