| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-12-10 | 41.5 kB | |
| v25.12.00 source code.tar.gz | 2025-12-10 | 9.4 MB | |
| v25.12.00 source code.zip | 2025-12-10 | 11.8 MB | |
| Totals: 3 Items | | 21.2 MB | 0 |
What's Changed
🚨 Breaking Changes
- Rewrite JNI functions to use `JNI_TRY/JNI_CATCH` by @ttnghia in https://github.com/rapidsai/cudf/pull/19053
- Remove compatibility with nvCOMP versions before 5.0 by @vuule in https://github.com/rapidsai/cudf/pull/20140
- Remove DataFrame.apply_chunks, Groupby.apply_grouped by @mroeschke in https://github.com/rapidsai/cudf/pull/20194
- Change .str.starts/endswith with tuple argument to match any pattern instead of pairwise matching by @mroeschke in https://github.com/rapidsai/cudf/pull/20249
- [cudf-polars] CUDA stream by @madsbk in https://github.com/rapidsai/cudf/pull/20154
- Chunked read parquet, prepend index column, and apply deletion vector by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20201
- Zero-copy `hostdevice_vector` on integrated systems by @vuule in https://github.com/rapidsai/cudf/pull/20225
- Use int64_t for the num_rows slot in parquet_reader_options by @wence- in https://github.com/rapidsai/cudf/pull/20256
- Require CUDA 12.2+ by @jakirkham in https://github.com/rapidsai/cudf/pull/20416
- Remove compatibility for CCCL < 3.1 by @bdice in https://github.com/rapidsai/cudf/pull/20468
- Remove deprecated types and APIs by @vuule in https://github.com/rapidsai/cudf/pull/20422
- Support signed integers and decimals in `SUM_WITH_OVERFLOW` groupby by @PointKernel in https://github.com/rapidsai/cudf/pull/19598
- Change groupby-scan COUNT to 1-based results by @davidwendt in https://github.com/rapidsai/cudf/pull/20168
- Change strings::like() pattern parameter from string_scalar to string_view by @davidwendt in https://github.com/rapidsai/cudf/pull/20428
- No-op performance tracking wrappers by @galipremsagar in https://github.com/rapidsai/cudf/pull/20595
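The `.str.startswith`/`.str.endswith` change listed above is easiest to see with a small comparison. The following is a plain-Python sketch, not cudf code: the helper names `startswith_any` and `startswith_pairwise` are hypothetical, and the "pairwise" variant reflects one reading of the previous behavior as the entry describes it (the i-th string checked only against the i-th pattern).

```python
def startswith_any(values, patterns):
    """New semantics (as described): True if a value starts with ANY pattern."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [v.startswith(tuple(patterns)) for v in values]


def startswith_pairwise(values, patterns):
    """Assumed old semantics: the i-th value is tested against the i-th pattern only."""
    return [v.startswith(p) for v, p in zip(values, patterns)]


values = ["apple", "banana", "cherry"]
patterns = ("ba", "ap", "ch")

any_result = startswith_any(values, patterns)
pairwise_result = startswith_pairwise(values, patterns)
print(any_result)       # [True, True, True]
print(pairwise_result)  # [False, False, True]
```

Code that relied on the pairwise interpretation may see different boolean masks after upgrading, so it is worth re-checking any filters built from tuple arguments.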
🐛 Bug Fixes
- Copy `attrs` at correct place in `DataFrame` constructor by @galipremsagar in https://github.com/rapidsai/cudf/pull/20074
- Handle missing nightly runs in pandas tests job by @galipremsagar in https://github.com/rapidsai/cudf/pull/20081
- Fix numpy ufunc for `DataFrame` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20070
- Unproxy few unnecessary testing utilities in pandas by @galipremsagar in https://github.com/rapidsai/cudf/pull/20088
- Fix libcudf groupby benchmarks to not include internal cache by @davidwendt in https://github.com/rapidsai/cudf/pull/20038
- Fix cudf.date_range with non-iso start and end date strings by @mroeschke in https://github.com/rapidsai/cudf/pull/20116
- Fix create_distinct_rows_column to create non-nullable columns by @davidwendt in https://github.com/rapidsai/cudf/pull/20082
- Fix arrow timestamp frequency cases in `cudf.pandas` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20128
- Cast inputs to true division from decimal to float by @Matt711 in https://github.com/rapidsai/cudf/pull/20077
- Handle NVMLError_NotSupported in cudf-polars by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20179
- Fix RMM JNI pinned_fallback_host_memory_resource for CCCL 3.1.0 by @bdice in https://github.com/rapidsai/cudf/pull/20160
- Require passing memory resources to from_libcudf methods by @vyasr in https://github.com/rapidsai/cudf/pull/20171
- Enable hash-groupby for decimal32/64 type and MEAN aggregation by @davidwendt in https://github.com/rapidsai/cudf/pull/20040
- Align decimal dtypes in predicate before conditional join by @Matt711 in https://github.com/rapidsai/cudf/pull/20060
- Change stream_checking_resource_adaptor::do_deallocate to noexcept by @vyasr in https://github.com/rapidsai/cudf/pull/20218
- Deallocation should be noexcept by @bdice in https://github.com/rapidsai/cudf/pull/20219
- Fix a race condition in the decode of delta encoded Parquet columns by @vuule in https://github.com/rapidsai/cudf/pull/20216
- Fix the host-device tdigest offsets by using cuda::std::span by @PointKernel in https://github.com/rapidsai/cudf/pull/20220
- Add `stream` and `mr` arguments to `Column.from_arrow` type stub by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20244
- Pin `deltalake` in cudf-polars-polars-tests CI job by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20255
- Pin ibis-framework<11.0.0 by @Matt711 in https://github.com/rapidsai/cudf/pull/20267
- Add private attributes for `cudf.pandas` proxy objects by @galipremsagar in https://github.com/rapidsai/cudf/pull/20276
- Add Proxy for `SparseAccessor` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20278
- We need this to pacify mypy by @wence- in https://github.com/rapidsai/cudf/pull/20285
- Purge non-empty nulls for the generated lists columns in data generation utility by @ttnghia in https://github.com/rapidsai/cudf/pull/20283
- Fix missing table compatibility check in two_table_comparator constructor by @PointKernel in https://github.com/rapidsai/cudf/pull/20305
- Fix the check for equal `num_cols` across empty parquet sources by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20320
- Add `nans_to_nulls` to `Frame` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20314
- Add support for list type in `get` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20332
- Fix decimal dtype serialization in cudf-polars by @Matt711 in https://github.com/rapidsai/cudf/pull/20300
- Make the `GroupedRollingWindow` expression node reconstructable in cudf-polars by @Matt711 in https://github.com/rapidsai/cudf/pull/20288
- Ensure pylibcudf.Scalar.from_py uses CUDA streams by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20340
- Skip failing cudf-polars test due to hash groupby bug by @Matt711 in https://github.com/rapidsai/cudf/pull/20356
- Support order by keys for order-sensitive scalar aggregations in grouped windows by @Matt711 in https://github.com/rapidsai/cudf/pull/20350
- Honor user-passed stream in slice_strings for scalar inputs by @mroeschke in https://github.com/rapidsai/cudf/pull/20349
- Thread missing streams in column/table view creation to char size calculation by @vyasr in https://github.com/rapidsai/cudf/pull/20351
- Fix missed-sync for `mapping_indices_kernel` in hash-based groupby aggregation by @ttnghia in https://github.com/rapidsai/cudf/pull/20370
- Fix a few SPDX-related issues by @KyleFromNVIDIA in https://github.com/rapidsai/cudf/pull/20364
- Fix a `dtype` bug in column constructor by @galipremsagar in https://github.com/rapidsai/cudf/pull/20384
- Refactor `as_column` dtype parameter calls by @galipremsagar in https://github.com/rapidsai/cudf/pull/20379
- Add CUDA stream to `cudf_polars.Column.deserialize` by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20396
- Add missing CUDA stream to cudf-polars left-semi join by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20398
- Fix various string APIs to work with extension types by @galipremsagar in https://github.com/rapidsai/cudf/pull/20368
- Add parameter validation for `merge` and `MultiIndex.from_frame` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20382
- Fix nvtext::normalize_characters special token case by @davidwendt in https://github.com/rapidsai/cudf/pull/20242
- Fix pinned memory resource `shared_pointer` lifetime in tests. by @bdice in https://github.com/rapidsai/cudf/pull/20407
- Support new `nvcompStatus_t` enum value by @vuule in https://github.com/rapidsai/cudf/pull/20376
- Don't skip blank CSV lines rows after the header in cudf-polars scan_csv by @mroeschke in https://github.com/rapidsai/cudf/pull/20341
- Fix OOB accesses in JSON_CornerCase_Empty test and get_row_array_parent_col_id function by @bdice in https://github.com/rapidsai/cudf/pull/20421
- Change calls to cudaMemcpyToSymbol to cudaMemcpyToSymbolAsync by @davidwendt in https://github.com/rapidsai/cudf/pull/20374
- Do not accelerate `pandas._config.config` by @Matt711 in https://github.com/rapidsai/cudf/pull/20413
- Return timedelta instead of datetime type with std with datetime type with missing values by @mroeschke in https://github.com/rapidsai/cudf/pull/20439
- Disallow non-bool skipna arguments to reduction methods by @mroeschke in https://github.com/rapidsai/cudf/pull/20436
- Fix parquet scans for duckDB PDS-DS by @Matt711 in https://github.com/rapidsai/cudf/pull/20388
- Support `__array_function__` on the proxy array type by @Matt711 in https://github.com/rapidsai/cudf/pull/20419
- Make `memory_usage` and `__sizeof__` proxy attributes and always skip all memory usage tests by @Matt711 in https://github.com/rapidsai/cudf/pull/20425
- Add input validation for `from_records` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20412
- Use computed reduction result type for empty sum and product aggregations by @mroeschke in https://github.com/rapidsai/cudf/pull/20438
- Correct level arg validation for Index.isin, unique by @mroeschke in https://github.com/rapidsai/cudf/pull/20449
- Add private `_grouper` attribute to `DataFrameGroupBy` proxy type by @Matt711 in https://github.com/rapidsai/cudf/pull/20448
- Raise ValueError when indexing with zero step slice by @mroeschke in https://github.com/rapidsai/cudf/pull/20453
- Raise IndexError for float-like indexers in RangeIndex/MultiIndex.getitem by @mroeschke in https://github.com/rapidsai/cudf/pull/20454
- Disallow slice(bool, ...) in DataFrame.loc with MultiIndex by @mroeschke in https://github.com/rapidsai/cudf/pull/20457
- Fix core dump in MemoryCleaner by @res-life in https://github.com/rapidsai/cudf/pull/19872
- Disallow multiple ellipse values in loc/iloc indexing by @mroeschke in https://github.com/rapidsai/cudf/pull/20456
- Fix `scan` operations for `string` columns by @galipremsagar in https://github.com/rapidsai/cudf/pull/20460
- Fix UTF8 data generator in libcudf benchmarks utility by @davidwendt in https://github.com/rapidsai/cudf/pull/20465
- Handle dealloc in stream-ordered cudf-polars ops by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20467
- Raise on unsupported unstack cases by @Matt711 in https://github.com/rapidsai/cudf/pull/20463
- Allow early exit for left semi-/anti- joins with empty build/probe tables by @shrshi in https://github.com/rapidsai/cudf/pull/20452
- Fix OOB memory access in JSON reader ingest_raw utility by @davidwendt in https://github.com/rapidsai/cudf/pull/20451
- Round up small-type groupby outputs to 4-byte boundary by @PointKernel in https://github.com/rapidsai/cudf/pull/20455
- Fix GPU acceleration bug in decimal type-cast by @galipremsagar in https://github.com/rapidsai/cudf/pull/20471
- Add missing CUDA stream in cudf_polars Distinct by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20477
- Support `__arrow_array__` on proxy extension array by @Matt711 in https://github.com/rapidsai/cudf/pull/20478
- Enable scan operation for `datetime64` and `timedelta64` types by @galipremsagar in https://github.com/rapidsai/cudf/pull/20464
- Remove unneeded type check in cudf::strings::slice_strings by @davidwendt in https://github.com/rapidsai/cudf/pull/20437
- Fix join match context tests by @PointKernel in https://github.com/rapidsai/cudf/pull/20472
- Fix the statistics_mr in benchmark fixture by @PointKernel in https://github.com/rapidsai/cudf/pull/20496
- Guard `__sizeof__` in pandas compatibility mode by @Matt711 in https://github.com/rapidsai/cudf/pull/20495
- Fix OOB memory access in Orc and Parquet stacks from fixed-width unaligned loads by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20458
- Fix cudf.pandas Timestamp/Timedelta not subclassing stdlib datetime objects by @mroeschke in https://github.com/rapidsai/cudf/pull/20433
- Revert benchmark input generation logic for list type by @davidwendt in https://github.com/rapidsai/cudf/pull/20498
- Avoid using pylibcudf directly in rapidsmpf runtime by @rjzamora in https://github.com/rapidsai/cudf/pull/20501
- Suppress NVRTC arch warnings by @brandon-b-miller in https://github.com/rapidsai/cudf/pull/20517
- Fix `ChannelManager` and `Lineariser` by @rjzamora in https://github.com/rapidsai/cudf/pull/20516
- Synchronize streams in `LocalShuffle` by @rjzamora in https://github.com/rapidsai/cudf/pull/20515
- Make `argsort` have return type `np.intp` to match pandas by @Matt711 in https://github.com/rapidsai/cudf/pull/20487
- Fix `polars.concat_str` with one column in cudf_polars by @mroeschke in https://github.com/rapidsai/cudf/pull/20535
- Override `__sizeof__` for `cudf.Index` by @Matt711 in https://github.com/rapidsai/cudf/pull/20530
- Fix `pl.scan_csv(...).slice(...).collect(engine="gpu")` with None endpoint by @mroeschke in https://github.com/rapidsai/cudf/pull/20519
- Fix DataChunkSourceTest by syncing default stream by @davidwendt in https://github.com/rapidsai/cudf/pull/20492
- Fix data size errors in some libcudf benchmarks by @davidwendt in https://github.com/rapidsai/cudf/pull/20512
- Pin cython and pytest dependencies by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20571
- Pin Cython pre-3.2.0 and PyTest pre-9 by @jakirkham in https://github.com/rapidsai/cudf/pull/20573
- Handle `Empty` child IRs in `_decompose` by @Matt711 in https://github.com/rapidsai/cudf/pull/20409
- Skip flaky pandas datetime test by @Matt711 in https://github.com/rapidsai/cudf/pull/20585
- Fix max-pool-size-exceeded error in DATA_CHUNK_SOURCE_TEST by @davidwendt in https://github.com/rapidsai/cudf/pull/20534
- Fix racecheck in nvtext wordpiece tokenizer kernel by @davidwendt in https://github.com/rapidsai/cudf/pull/20588
- Fix the check to determine if all column chunk pages are dict encoded by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20524
- Add stream synchronize to QUANTILES_TEST PercentileApprox gtests by @davidwendt in https://github.com/rapidsai/cudf/pull/20558
- updated update-version.sh to handle release branch version changes by @rockhowse in https://github.com/rapidsai/cudf/pull/20598
- Fix nvtext tokenizers handling invalid UTF8 data by @davidwendt in https://github.com/rapidsai/cudf/pull/20514
- Fix overflow errors in distinct and filtered joins when hash table size exceeds int32 limits by @shrshi in https://github.com/rapidsai/cudf/pull/20594
- [FEA] Optimize JIT Filter for Low-Selectivity by @lamarrr in https://github.com/rapidsai/cudf/pull/20222
- Compute boolean function(NOT) on integers as a bitwise invert by @Matt711 in https://github.com/rapidsai/cudf/pull/20599
- Cast output dtype of rolling aggregations to match pandas by @Matt711 in https://github.com/rapidsai/cudf/pull/20526
- Add noop path for `Frame.astype` by @Matt711 in https://github.com/rapidsai/cudf/pull/20581
- Fix `copy` semantics bugs thus reduce copies and memory usage by @galipremsagar in https://github.com/rapidsai/cudf/pull/20121
- Ensure the sum after expression decomposition for mean has float output dtype by @Matt711 in https://github.com/rapidsai/cudf/pull/20596
- Use `Decimal(0)` literal for all-null decimal groups in groupby-sum by @Matt711 in https://github.com/rapidsai/cudf/pull/20591
- Do not drop `freq` when constructing `DatetimeIndex` from pandas by @brandon-b-miller in https://github.com/rapidsai/cudf/pull/18778
- Fix --validation flag for cudf.pandas PDSH benchmarks by @mroeschke in https://github.com/rapidsai/cudf/pull/20540
- Enable GPU acceleration for more binops by @galipremsagar in https://github.com/rapidsai/cudf/pull/20507
- Fix `rmm` function calls due to removed deprecated APIs and macro by @ttnghia in https://github.com/rapidsai/cudf/pull/20661
- Fix orc reader bool bug due to not being able to resume rle decode by @pmattione-nvidia in https://github.com/rapidsai/cudf/pull/20666
- Fix categorical comparisons in `cudf` to match `pandas` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20674
- Fix `any` and `all` to match pandas by @galipremsagar in https://github.com/rapidsai/cudf/pull/20679
- Fix return types of string APIs in `cudf.pandas` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20683
- Resolve pandas test failures by @galipremsagar in https://github.com/rapidsai/cudf/pull/20704
- Fix DatetimeIndex pickling by @vyasr in https://github.com/rapidsai/cudf/pull/20709
- `DatetimeIndex.serialize()` headers are msgpack serializable by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20714
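For the entry above about computing NOT on integers as a bitwise invert, the following plain-Python sketch shows what a bitwise invert yields for signed integers. It does not call cudf or polars; the helper name `bitwise_not` is hypothetical, and passing `None` through unchanged is an assumption made only to model nulls.

```python
def bitwise_not(values):
    """Apply a bitwise invert to each integer, leaving None (null) untouched."""
    # For two's-complement signed integers, ~x equals -x - 1.
    return [None if v is None else ~v for v in values]


result = bitwise_not([0, 1, -1, None])
print(result)  # [-1, -2, 0, None]
```

This differs from a logical NOT, which would map every non-zero integer to `False` and zero to `True`.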
📖 Documentation
- Add note that --rmm-async only affects distributed scheduler. by @bdice in https://github.com/rapidsai/cudf/pull/20129
- Add profiling guide by @bdice in https://github.com/rapidsai/cudf/pull/20292
- Find RMM before CCCL by @wence- in https://github.com/rapidsai/cudf/pull/20336
- Use current system architecture in conda environment creation command by @bdice in https://github.com/rapidsai/cudf/pull/20500
- Use uname -m instead of arch command by @bdice in https://github.com/rapidsai/cudf/pull/20502
- Use RAPIDS_BRANCH file for documentation links by @bdice in https://github.com/rapidsai/cudf/pull/20494
🚀 New Features
- Add memory resources to unary, transform, and filling modules by @vyasr in https://github.com/rapidsai/cudf/pull/20054
- Add memory resources to binaryop, copying, and stream_compaction by @vyasr in https://github.com/rapidsai/cudf/pull/20059
- Add memory resources to groupby, datetime, and lists modules by @vyasr in https://github.com/rapidsai/cudf/pull/20102
- Add memory resources to search, reshape, and partitioning module by @vyasr in https://github.com/rapidsai/cudf/pull/20101
- Add memory resources to rolling, sorting, and quantiles modules by @vyasr in https://github.com/rapidsai/cudf/pull/20099
- [FEA] Implement JIT Filter for read_parquet by @lamarrr in https://github.com/rapidsai/cudf/pull/19831
- Add memory resources to all nvtext APIs by @vyasr in https://github.com/rapidsai/cudf/pull/20119
- Add memory resource to all strings modules by @vyasr in https://github.com/rapidsai/cudf/pull/20123
- Add memory resources to reduce, column, column_factories, and contiguous_split by @vyasr in https://github.com/rapidsai/cudf/pull/20135
- Add memory resources to I/O modules by @vyasr in https://github.com/rapidsai/cudf/pull/20136
- Remove rounding from cudf java by @pmattione-nvidia in https://github.com/rapidsai/cudf/pull/20110
- Add memory resources to replace, json, and hashing by @vyasr in https://github.com/rapidsai/cudf/pull/20150
- Add support for maintain_order param in joins by @Matt711 in https://github.com/rapidsai/cudf/pull/17698
- Add an example to inspect parquet files and dump row group and page level metadata information by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20117
- Support forward/backward filling null values in a grouped window context by @Matt711 in https://github.com/rapidsai/cudf/pull/19907
- Allow multiple calls to `cudf::initialize` and `cudf::deinitialize` by @vuule in https://github.com/rapidsai/cudf/pull/20111
- Add remaining memory resources by @vyasr in https://github.com/rapidsai/cudf/pull/20197
- Add memory resources to scalars by @vyasr in https://github.com/rapidsai/cudf/pull/20196
- Add pylibcudf is_valid_reduce_aggregation API by @davidwendt in https://github.com/rapidsai/cudf/pull/20145
- Support decimal literals in cudf-polars by @Matt711 in https://github.com/rapidsai/cudf/pull/20147
- Support `cum_sum(...).over(...)` expressions in cudf-polars by @Matt711 in https://github.com/rapidsai/cudf/pull/19908
- Passthrough unary ops through Parquet predicate pushdown by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20127
- Implement `ARGMIN` and `ARGMAX` aggregations for reduction by @ttnghia in https://github.com/rapidsai/cudf/pull/20207
- Skip decompression of pruned parquet pages by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20192
- Add an example to demonstrate the use of next-gen parquet reader to read a parquet file with highly selective filters by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/19469
- Evaluate `IS_NULL` at row group and page level in Parquet filtering by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20144
- [Java] Add optional native deps loader by @zpuller in https://github.com/rapidsai/cudf/pull/20414
- Add cudf-polars + rapidsmpf CI check by @rjzamora in https://github.com/rapidsai/cudf/pull/20355
- Add Python bindings for the hybrid scan reader by @vyasr in https://github.com/rapidsai/cudf/pull/20381
- RapidsMPF streaming-engine translation by @rjzamora in https://github.com/rapidsai/cudf/pull/20161
- [JNI] Use a read/write lock pattern in Rmm.class by @abellina in https://github.com/rapidsai/cudf/pull/20521
- [Java] Supports output projection indices for `contiguousSplitGroupsAndGenUniqKeys` by @res-life in https://github.com/rapidsai/cudf/pull/20391
- Support `Series.at` and `Series.iat` for pandas compatibility by @Matt711 in https://github.com/rapidsai/cudf/pull/20529
- Add COUNT_VALID aggregation support to groupby-scan by @davidwendt in https://github.com/rapidsai/cudf/pull/20531
- Use RapidsMPF `read_parquet` in "rapidsmpf" runtime by @rjzamora in https://github.com/rapidsai/cudf/pull/20497
- Support decimal128 SUM aggregation in hash-based groupby by @PointKernel in https://github.com/rapidsai/cudf/pull/20509
- Add stream testing in pylibcudf by @vyasr in https://github.com/rapidsai/cudf/pull/20625
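To illustrate the grouped forward-fill entry above, here is a minimal plain-Python sketch of filling nulls forward within each group. It does not use cudf-polars; `grouped_forward_fill` is a hypothetical helper, nulls are modelled as `None`, and rows are assumed to already be in the desired order.

```python
def grouped_forward_fill(keys, values):
    """Replace each None with the most recent non-null value from the same group."""
    last_seen = {}  # group key -> last non-null value seen so far
    filled = []
    for key, value in zip(keys, values):
        if value is None:
            # No earlier non-null value in this group: the null stays a null.
            filled.append(last_seen.get(key))
        else:
            last_seen[key] = value
            filled.append(value)
    return filled


keys = ["a", "b", "a", "b", "a"]
values = [1, None, None, 2, None]
result = grouped_forward_fill(keys, values)
print(result)  # [1, None, 1, 2, 1]
```

A backward fill follows the same idea with the rows traversed in reverse order.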
🛠️ Improvements
- Deprecate .from_pandas constructor by @mroeschke in https://github.com/rapidsai/cudf/pull/19996
- Prune entries in Sphinx nitpick_ignore by @mroeschke in https://github.com/rapidsai/cudf/pull/20045
- Avoid direct CategoricalColumn calls in dask_cudf by @mroeschke in https://github.com/rapidsai/cudf/pull/20080
- Fix typing issues in pylibcudf by @vyasr in https://github.com/rapidsai/cudf/pull/20069
- Avoid shadowing module names by @vyasr in https://github.com/rapidsai/cudf/pull/20071
- Remove calling to `purge_nonempty_nulls` in `make_lists_column` by @ttnghia in https://github.com/rapidsai/cudf/pull/12873
- Reduce verbosity of running the pandas test suite by @vyasr in https://github.com/rapidsai/cudf/pull/20107
- Clean up detail device atomic logic using atomic_ref by @PointKernel in https://github.com/rapidsai/cudf/pull/19924
- Use 8 processes for pandas tests, show top 10 test times by @bdice in https://github.com/rapidsai/cudf/pull/20109
- Update nvbench by @bdice in https://github.com/rapidsai/cudf/pull/19619
- Cleanup of some libcudf aggregation code by @davidwendt in https://github.com/rapidsai/cudf/pull/20053
- Run cudf-polars conda unit tests with more than 1 process by @mroeschke in https://github.com/rapidsai/cudf/pull/19980
- Avoid running pandas unit tests for private functionality with cudf.pandas by @mroeschke in https://github.com/rapidsai/cudf/pull/20115
- Remove MultiIndex.from_pandas pytest benchmark by @mroeschke in https://github.com/rapidsai/cudf/pull/20112
- Switch host_vector and host_span dependency by @davidwendt in https://github.com/rapidsai/cudf/pull/20106
- Have ListColumn.from_sequence go through pylibcudf by @mroeschke in https://github.com/rapidsai/cudf/pull/20098
- Fix `RAPIDS_BRANCH` version and update script by @galipremsagar in https://github.com/rapidsai/cudf/pull/20091
- Add pyarrow stubs to mypy environment and fix associated errors by @vyasr in https://github.com/rapidsai/cudf/pull/20118
- Fix slowdown in cudf-polars distributed tests by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20137
- Improve performance of string column size computation during parquet reads. by @nvdbaranec in https://github.com/rapidsai/cudf/pull/19986
- Disable async MR priming in cudf.pandas by @bdice in https://github.com/rapidsai/cudf/pull/20133
- Rework reduction case statement as dispatch_type_and_aggregation by @davidwendt in https://github.com/rapidsai/cudf/pull/20078
- Fix type annotations in cudf-polars by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20131
- Add tests for AUTO and HYBRID (de)compression modes by @vuule in https://github.com/rapidsai/cudf/pull/20126
- Branch 25.12 merge branch 25.10 by @vyasr in https://github.com/rapidsai/cudf/pull/20152
- Manual forward merger for Branch 25.12 - branch 25.10 by @galipremsagar in https://github.com/rapidsai/cudf/pull/20157
- Temporarily disable conda-java-tests by @bdice in https://github.com/rapidsai/cudf/pull/20162
- Remove unused ColumnBase.view by @mroeschke in https://github.com/rapidsai/cudf/pull/20141
- Avoid NumericalColumn call from CategoricalColumn.children by @mroeschke in https://github.com/rapidsai/cudf/pull/20153
- Deprecate legacy public row operators by @PointKernel in https://github.com/rapidsai/cudf/pull/20097
- Avoid more explicit calls to IntervalColumn and StructColumn by @mroeschke in https://github.com/rapidsai/cudf/pull/20064
- Run cudf-polars wheels unit tests with more than 1 process by @mroeschke in https://github.com/rapidsai/cudf/pull/20124
- Trace node execution in cudf-polars by @TomAugspurger in https://github.com/rapidsai/cudf/pull/19895
- Make ColumnBase.as_*_column convert via pylibcudf by @mroeschke in https://github.com/rapidsai/cudf/pull/20149
- Reduce execution times for parquet dictionary tests by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20176
- Update to rapids-logger 0.2 by @bdice in https://github.com/rapidsai/cudf/pull/20172
- Adjust rmm pool handling in PDSH benchmarks by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20138
- Don't assume cudf_polars benchmarking scale factor is always an integer by @mroeschke in https://github.com/rapidsai/cudf/pull/20182
- Skip filtering Parquet row groups with dictionaries if there are non-dict encoded pages by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20175
- Remove unnecessary work from `read_parquet_metadata` by @vuule in https://github.com/rapidsai/cudf/pull/20180
- Improve performance of groupby tdigests gtests by @davidwendt in https://github.com/rapidsai/cudf/pull/20173
- Revert "Temporarily disable conda-java-tests" by @bdice in https://github.com/rapidsai/cudf/pull/20184
- Add PDSH benchmark runner for cudf.pandas by @mroeschke in https://github.com/rapidsai/cudf/pull/20164
- Make Column.set_mask go through pylibcudf by @mroeschke in https://github.com/rapidsai/cudf/pull/20103
- Pin pydantic<2.12 in ci/test_cudf_polars_polars_tests.sh by @mroeschke in https://github.com/rapidsai/cudf/pull/20200
- Add an overhead field to cudf-polars tracing by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20198
- Support binops between float scalar to decimal column by @mroeschke in https://github.com/rapidsai/cudf/pull/20199
- Reduce output buffer sizes for pruned pages of columns with a `list` parent by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20086
- Make ListColumn._transform_leaves convert via pylibcudf by @mroeschke in https://github.com/rapidsai/cudf/pull/20151
- Rename `comparison_binop_generator` to `arg_minmax_binop_generator` and corresponding file to `nested_types_extrema_utils.cuh` by @Copilot in https://github.com/rapidsai/cudf/pull/20212
- Pin polars version <1.34 and >=1.29 by @Matt711 in https://github.com/rapidsai/cudf/pull/19912
- Stop using libcudf default parameters in pylibcudf by @vyasr in https://github.com/rapidsai/cudf/pull/20204
- Fix various typing errors by @vyasr in https://github.com/rapidsai/cudf/pull/20205
- Cleanup parquet for simple columns by @pmattione-nvidia in https://github.com/rapidsai/cudf/pull/19869
- Configuration for which metrics are enabled during tracing by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20223
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/rapidsai/cudf/pull/20189
- Fix parquet row number check for page bounds by @pmattione-nvidia in https://github.com/rapidsai/cudf/pull/20217
- More mypy and docs fixes by @vyasr in https://github.com/rapidsai/cudf/pull/20224
- Prevent accidental copies of expensive-to-copy object types by @vuule in https://github.com/rapidsai/cudf/pull/20226
- Split row operator header by @PointKernel in https://github.com/rapidsai/cudf/pull/20166
- Standardize setting StructDtype field names post libcudf conversion by @mroeschke in https://github.com/rapidsai/cudf/pull/20235
- Add arm testing of cudf.pandas unit tests by @vyasr in https://github.com/rapidsai/cudf/pull/20251
- Enable `sccache-dist` connection pool by @trxcllnt in https://github.com/rapidsai/cudf/pull/20264
- Run polars tests with the streaming and in-memory executors by @Matt711 in https://github.com/rapidsai/cudf/pull/19354
- Move and rename `ScanPartitionPlan` by @rjzamora in https://github.com/rapidsai/cudf/pull/20248
- Unpin DuckDB and Ibis in cudf.pandas thirdparty tests by @mroeschke in https://github.com/rapidsai/cudf/pull/20269
- Add pylibcudf to pre-commit linting and fix outstanding errors by @vyasr in https://github.com/rapidsai/cudf/pull/20250
- Update `ConfigOptions` for rapidsmpf-streaming integration by @rjzamora in https://github.com/rapidsai/cudf/pull/20252
- Handle unordered grouped windows properly for null filling and cum sums by @Matt711 in https://github.com/rapidsai/cudf/pull/20275
- Add more type annotations to cudf/core/column subclasses by @mroeschke in https://github.com/rapidsai/cudf/pull/20277
- Remove extraneous host_memory_resource include by @bdice in https://github.com/rapidsai/cudf/pull/20284
- Add `MultiIndex.dtypes` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20279
- Skip mypy in pre-commit.ci by @bdice in https://github.com/rapidsai/cudf/pull/20286
- Make ColumnBase.deserialize construct via pylibcudf by @mroeschke in https://github.com/rapidsai/cudf/pull/20142
- Add numpy to the mypy pre-commit environment by @vyasr in https://github.com/rapidsai/cudf/pull/20282
- Add ability to set the source_info of parquet_reader_options by @wence- in https://github.com/rapidsai/cudf/pull/20253
- Add more Python type annotations to `cudf/core` by @mroeschke in https://github.com/rapidsai/cudf/pull/20287
- Use main in RAPIDS_BRANCH by @bdice in https://github.com/rapidsai/cudf/pull/20312
- Move "All rights reserved" statements to copyright line by @KyleFromNVIDIA in https://github.com/rapidsai/cudf/pull/20313
- Add `inferred_type` and missing `IntervalIndex` properties by @galipremsagar in https://github.com/rapidsai/cudf/pull/20294
- Avoid unseeded, random data generation in cuDF classic tests by @mroeschke in https://github.com/rapidsai/cudf/pull/20319
- Improve hash-based groupby aggregation: direct write to the dense output columns whenever possible by @ttnghia in https://github.com/rapidsai/cudf/pull/19764
- Avoid accessing range values in cudf::strings::contains_re logic by @davidwendt in https://github.com/rapidsai/cudf/pull/20122
- Migrate mixed join to use the multiset data structure by @PointKernel in https://github.com/rapidsai/cudf/pull/19989
- Add benchmark for strings cast to/from integer APIs by @davidwendt in https://github.com/rapidsai/cudf/pull/20247
- Use main shared-workflows branch by @bdice in https://github.com/rapidsai/cudf/pull/20324
- Use the thread pool for Parquet metadata processing by @vuule in https://github.com/rapidsai/cudf/pull/20263
- Add `.dt.day_of_week` and `.dt.daysinmonth` by @galipremsagar in https://github.com/rapidsai/cudf/pull/20298
- Avoid Column materialization in RangeIndex.nans_to_nulls by @mroeschke in https://github.com/rapidsai/cudf/pull/20331
- Update the code to be compatible with the new cuco stream-ordered allocator by @PointKernel in https://github.com/rapidsai/cudf/pull/20258
- Deprecate Series.data by @mroeschke in https://github.com/rapidsai/cudf/pull/20281
- Align cudf Python's Column constructors by @mroeschke in https://github.com/rapidsai/cudf/pull/20233
- Make type annotations of ColumnBase.set_mask stricter by @mroeschke in https://github.com/rapidsai/cudf/pull/20261
- Make type annotations of ColumnBase.find_and_replace stricter by @mroeschke in https://github.com/rapidsai/cudf/pull/20259
- Make type annotations of ColumnBase.apply_boolean_mask stricter by @mroeschke in https://github.com/rapidsai/cudf/pull/20262
- Skip Python LZ4 tests when nvCOMP is disabled by @vuule in https://github.com/rapidsai/cudf/pull/20293
- Move cudf/io/nvcomp_adapter.hpp to cudf/io/detail by @davidwendt in https://github.com/rapidsai/cudf/pull/20327
- Add context to IR.do_evaluate by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20322
- Update mypy `# type: ignore` comments according to stricter mypy configs by @mroeschke in https://github.com/rapidsai/cudf/pull/20272
- Remove duplicated enforce null consistency code by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20342
- Use SPDX for all copyright headers by @KyleFromNVIDIA in https://github.com/rapidsai/cudf/pull/20321
- Add more type annotations to `cudf/core/series.py` by @mroeschke in https://github.com/rapidsai/cudf/pull/20304
- Remove/Replace uses of numba.cuda arrays in pytest benchmarks and tests by @mroeschke in https://github.com/rapidsai/cudf/pull/20359
- Add duckdb pdsh query queries by @Matt711 in https://github.com/rapidsai/cudf/pull/20257
- Use stream in cudf_polars.DataFrame.to_polars by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20323
- Add `join_streams` to pylibcudf API by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20316
- Use CUDA streams in all pylibcudf calls made by cudf-polars by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20291
- Add cudf/io/config_utils.hpp to doxygen by @davidwendt in https://github.com/rapidsai/cudf/pull/20329
- Test coverage for parallel metadata parsing by @vuule in https://github.com/rapidsai/cudf/pull/20334
- Support serializing more polars types by @Matt711 in https://github.com/rapidsai/cudf/pull/20347
- Add CUDAStreamPolicy to cudf-polars configuration by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20366
- Unskip cudf-polars groupby test by @Matt711 in https://github.com/rapidsai/cudf/pull/20406
- Deprecate pylibcudf interop arrow APIs by @Matt711 in https://github.com/rapidsai/cudf/pull/20405
- Get rid of the hashing helper header by @PointKernel in https://github.com/rapidsai/cudf/pull/20360
- Minor cleanup and fixes for libcudf generate_input.cu by @davidwendt in https://github.com/rapidsai/cudf/pull/20363
- Ignore assert_produces_warning and shares_memory pandas unit tests for cudf.pandas by @mroeschke in https://github.com/rapidsai/cudf/pull/20434
- Short circuit RangeIndex.append for length 0 input, proxy private attribute by @mroeschke in https://github.com/rapidsai/cudf/pull/20442
- Mark DataFrame.insert as _external_only_api by @Copilot in https://github.com/rapidsai/cudf/pull/20403
- Deprecate `get_current_device_resource` in favor of `get_current_device_resource_ref` by @PointKernel in https://github.com/rapidsai/cudf/pull/20386
- Promote `JoinNoneValue` to public as `JoinNoMatch` for clear non-match Join semantics by @PointKernel in https://github.com/rapidsai/cudf/pull/20440
- Remove duplicate entries in NODEIDS_THAT_FAIL_WITH_CUDF_PANDAS by @mroeschke in https://github.com/rapidsai/cudf/pull/20447
- Use the thread pool in the compact protocol reader by @vuule in https://github.com/rapidsai/cudf/pull/20417
- Update README.md generalizing all cuDF components by @mroeschke in https://github.com/rapidsai/cudf/pull/20357
- Skip TestDatetimelikeCoercion pandas tests that assert ._value identity by @mroeschke in https://github.com/rapidsai/cudf/pull/20459
- Add PSDH Q2-9 for cudf.pandas by @mroeschke in https://github.com/rapidsai/cudf/pull/20418
- Add s3fs to `test_cudf_python` common dependencies by @trxcllnt in https://github.com/rapidsai/cudf/pull/20473
- Use public pandas APIs in StringColumn.to_pandas by @mroeschke in https://github.com/rapidsai/cudf/pull/20474
- Expose java GatherMap internals and add toString to AST by @revans2 in https://github.com/rapidsai/cudf/pull/20483
- Add create_ascii_string_column to the libcudf benchmark data generator by @davidwendt in https://github.com/rapidsai/cudf/pull/20354
- Skip more pandas unit tests that test BlockManager, private sparse types by @mroeschke in https://github.com/rapidsai/cudf/pull/20489
- Add boto3/botocore/aiobotocore to common test dependencies by @trxcllnt in https://github.com/rapidsai/cudf/pull/20490
- Use a lower bound when estimating the partial file-size by @rjzamora in https://github.com/rapidsai/cudf/pull/20193
- Performance improvement for nvtext::edit_distance for long strings by @davidwendt in https://github.com/rapidsai/cudf/pull/20268
- Add MemoryResourceConfig to cudf-polars config by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20042
- Improve project automation by @vyasr in https://github.com/rapidsai/cudf/pull/20523
- Fuse simple streaming reductions in cudf-polars by @rjzamora in https://github.com/rapidsai/cudf/pull/18757
- Migrate to new CCCL memory resource interface by @bdice in https://github.com/rapidsai/cudf/pull/20513
- Add empty input gtest for cudf::transform by @davidwendt in https://github.com/rapidsai/cudf/pull/20505
- Rework internal json headers to allow converting gtests files from .cu to .cpp by @davidwendt in https://github.com/rapidsai/cudf/pull/20491
- Set continue on error in the cudf-polars-rapidsmpf nightly CI job by @Matt711 in https://github.com/rapidsai/cudf/pull/20550
- Permanently back cuDF column by a pylibcudf.Column by @mroeschke in https://github.com/rapidsai/cudf/pull/20306
- Skip flaky upstream polars rolling test by @Matt711 in https://github.com/rapidsai/cudf/pull/20552
- Accelerate data page mask computation on device by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20280
- Change default rapidsmpf stream policy to 'pool' by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20527
- Increase gtests coverage for cudf::strings::like patterns by @davidwendt in https://github.com/rapidsai/cudf/pull/20348
- Add cuda::std::span operator to cudf::column_view by @davidwendt in https://github.com/rapidsai/cudf/pull/20541
- Update ArrowStringView compare benchmark for gather by @davidwendt in https://github.com/rapidsai/cudf/pull/19935
- Add pytest stubs and remove ujson usage by @vyasr in https://github.com/rapidsai/cudf/pull/20560
- Skip arrow array constructor tests by @Matt711 in https://github.com/rapidsai/cudf/pull/20579
- Add Polars to mypy environment and fix errors by @vyasr in https://github.com/rapidsai/cudf/pull/20563
- Ensure table chunks are unspilled and available by @madsbk in https://github.com/rapidsai/cudf/pull/20583
- Skip tests that assert behavior when copy-on-write is False by @Matt711 in https://github.com/rapidsai/cudf/pull/20506
- Pass streams through `Column.from_array/from_iterable_of_py` by @Matt711 in https://github.com/rapidsai/cudf/pull/20569
- Stop using Dtype annotation by @vyasr in https://github.com/rapidsai/cudf/pull/20590
- Workaround to enable running PDS-H via WebHDFS by @kingcrimsontianyu in https://github.com/rapidsai/cudf/pull/20132
- Update RMM includes from `<rmm/mr/device/*>` to `<rmm/mr/*>` by @bdice in https://github.com/rapidsai/cudf/pull/20607
- Stricter typing import for cudf-polars by @TomAugspurger in https://github.com/rapidsai/cudf/pull/20614
- Avoid the unnecessary H2H copy in the `std::vector` sink by @vuule in https://github.com/rapidsai/cudf/pull/20602
- Preprocessing offsets for Parquet non-dictionary string columns by @pmattione-nvidia in https://github.com/rapidsai/cudf/pull/20430
- Move more pandas unit tests that test private APIs by @mroeschke in https://github.com/rapidsai/cudf/pull/20511
- Use `.plc_column` instead of `.to_pylibcudf` in rolling, string utilities by @mroeschke in https://github.com/rapidsai/cudf/pull/20562
- Skip TestSetitemNADatetimeLikeDtype pandas unit tests due to private assertion by @mroeschke in https://github.com/rapidsai/cudf/pull/20578
- Pin Polars version <1.35 by @Matt711 in https://github.com/rapidsai/cudf/pull/20266
- Skip pandas unit tests in `test_old_base.py` that test private APIs by @mroeschke in https://github.com/rapidsai/cudf/pull/20572
- Use `.plc_column` attribute instead of `to_pylibcudf` more internally by @mroeschke in https://github.com/rapidsai/cudf/pull/20559
- Skip arrow-backed arithmetic tests and categorize the remaining failing tests by @Matt711 in https://github.com/rapidsai/cudf/pull/20577
- Fix a pytest execution that is spawned in a subprocess by @galipremsagar in https://github.com/rapidsai/cudf/pull/20660
- Accelerated parquet page header decoding when page index is available by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20369
- feat: add error handling for non-existent columns in parquet reader by @gforsyth in https://github.com/rapidsai/cudf/pull/20659
- Optimize row mask computation for single filter column by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20335
- Skip MultiIndex pandas unit tests testing private functionality, `test_chaining_and_caching.py` by @mroeschke in https://github.com/rapidsai/cudf/pull/20575
- Address minor comments from recent hybrid scan PRs by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/20672
- Add a timeout for the rapidsmpf test run by @vyasr in https://github.com/rapidsai/cudf/pull/20681
- Use `sccache-dist` build cluster for conda and wheel builds by @trxcllnt in https://github.com/rapidsai/cudf/pull/20488
New Contributors
- @Copilot made their first contribution in https://github.com/rapidsai/cudf/pull/20212
- @rockhowse made their first contribution in https://github.com/rapidsai/cudf/pull/20598
Full Changelog: https://github.com/rapidsai/cudf/compare/v25.12.00a...v25.12.00