| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-07-15 | 4.3 kB | |
| v0.2.8 source code.tar.gz | 2025-07-15 | 1.4 MB | |
| v0.2.8 source code.zip | 2025-07-15 | 1.9 MB | |
| Totals: 3 items | | 3.3 MB | 0 |
## What's Changed
- [fix] fix BatchAttention CTA_TILE_KV mask issue by @happierpig in https://github.com/flashinfer-ai/flashinfer/pull/1206
- feat: enable and update all-reduce fused quantization by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1164
- Fix the issue with auxiliary kernel launch and grid dim calculation by @Anerudhan in https://github.com/flashinfer-ai/flashinfer/pull/1208
- Fix test_groupwise_scaled_gemm_fp8.py by @jinyangyuan-nvidia in https://github.com/flashinfer-ai/flashinfer/pull/1211
- [TVM] Remove `enable_pdl` from TVM binding interface by @MasterJH5574 in https://github.com/flashinfer-ai/flashinfer/pull/1217
- misc: minor adds in readme by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1218
- bugfix: fix blackwell fmha hanging issue for empty kv_len by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1198
- update trtllm-gen decode attention kernel launcher by @wenscarl in https://github.com/flashinfer-ai/flashinfer/pull/1189
- Hand off allocation of cutlass fused MoE output to the caller by @wenscarl in https://github.com/flashinfer-ai/flashinfer/pull/1225
- Fix missing hash in the cudnn cubin path by @Anerudhan in https://github.com/flashinfer-ai/flashinfer/pull/1227
- bugfix: add logits processor to pyproject.toml by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1224
- fix: add trtllm-allreduce-fusion api notes and fix memory error by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1229
- feat: Add non-causal cudnn prefill kernels by @Anerudhan in https://github.com/flashinfer-ai/flashinfer/pull/1230
- minor: update oneshot handling, add params notes by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1232
- Enable cudnn decode and add tests for the cudnn decode kernel by @Anerudhan in https://github.com/flashinfer-ai/flashinfer/pull/1221
- docker: add cuda-python to CI docker image by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1233
- bugfix: Fix building without `get_requires*()` invocation by @mgorny in https://github.com/flashinfer-ai/flashinfer/pull/1226
- bugfix: support uint8_t for vec_t class template by @chenyang78 in https://github.com/flashinfer-ai/flashinfer/pull/1234
- feat: trtllm-gen fp8 moe kernels by @aleozlx in https://github.com/flashinfer-ai/flashinfer/pull/1212
- Patch fp8 cubin availability by @aleozlx in https://github.com/flashinfer-ai/flashinfer/pull/1240
- [comm] TRT-LLM's Multi-Node NVLink All-Reduce Kernel by @nvmbreughe in https://github.com/flashinfer-ai/flashinfer/pull/1213
- feat: Support MXFP8 x MXFP4 CUTLASS grouped GEMM by @jinyangyuan-nvidia in https://github.com/flashinfer-ai/flashinfer/pull/1241
- feat: add trtllm-gen mla cubin by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1222
- Add DeepGEMM kernels by @cyx-6 in https://github.com/flashinfer-ai/flashinfer/pull/1209
- Remove sm100+ requirement for trtllm allreduce kernels by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1249
- Defer mpi import for comm module by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1250
- feat: support environment variable overrides for NVSHMEM paths and linker flags by @EmilienM in https://github.com/flashinfer-ai/flashinfer/pull/1253
- release: bump version to v0.2.8 by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1257
- TRT-LLM's Multi-Node NVLink AR + fused RMSNorm kernel by @nvmbreughe in https://github.com/flashinfer-ai/flashinfer/pull/1255
## New Contributors
- @jinyangyuan-nvidia made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1211
- @mgorny made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1226
- @chenyang78 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1234
- @aleozlx made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1212
- @nvmbreughe made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1213
- @EmilienM made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1253
**Full Changelog**: https://github.com/flashinfer-ai/flashinfer/compare/v0.2.7.post1...v0.2.8
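After upgrading, it can be handy to confirm that the installed release actually matches v0.2.8. The sketch below uses only the standard-library `importlib.metadata`; the distribution names checked (`flashinfer-python`, `flashinfer`) are assumptions about how the package is published, not something stated in these notes.

```python
from importlib.metadata import version, PackageNotFoundError


def dist_version_matches(dist_names, expected):
    """Return True if any of the given distribution names is installed
    at exactly the `expected` version string."""
    for name in dist_names:
        try:
            if version(name) == expected:
                return True
        except PackageNotFoundError:
            # This candidate name is not installed; try the next one.
            continue
    return False


# Hypothetical usage: distribution names are assumed, adjust to your install.
is_v028 = dist_version_matches(("flashinfer-python", "flashinfer"), "0.2.8")
```

If the check fails, `pip show` on the distribution name you installed from will reveal the exact version and name to compare against.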