Name Modified Size
README.md 2025-09-04 39.1 kB
Release v3.7.0 source code.tar.gz 2025-09-04 9.8 MB
Release v3.7.0 source code.zip 2025-09-04 13.3 MB
iree_tools_tflite-20250904.1374-py3-none-any.whl 2025-09-04 3.6 kB
iree_tools_tf-20250904.1374-py3-none-any.whl 2025-09-04 32.6 kB
iree_base_runtime-3.7.0-cp313-cp313-macosx_13_0_universal2.whl 2025-09-04 3.9 MB
iree_base_runtime-3.7.0-cp312-cp312-macosx_13_0_universal2.whl 2025-09-04 3.9 MB
iree_base_runtime-3.7.0-cp311-cp311-macosx_13_0_universal2.whl 2025-09-04 3.9 MB
iree_base_runtime-3.7.0-cp313-cp313-win_amd64.whl 2025-09-04 5.7 MB
iree_base_runtime-3.7.0-cp312-cp312-win_amd64.whl 2025-09-04 5.7 MB
iree_base_runtime-3.7.0-cp311-cp311-win_amd64.whl 2025-09-04 5.7 MB
iree_base_runtime-3.7.0-cp313-cp313-manylinux_2_28_x86_64.whl 2025-09-04 8.1 MB
iree_base_runtime-3.7.0-cp312-cp312-manylinux_2_28_x86_64.whl 2025-09-04 8.1 MB
iree_base_runtime-3.7.0-cp311-cp311-manylinux_2_28_x86_64.whl 2025-09-04 8.1 MB
iree_base_runtime-3.7.0-cp310-cp310-manylinux_2_28_x86_64.whl 2025-09-04 8.1 MB
iree_base_runtime-3.7.0-cp313-cp313t-manylinux_2_28_x86_64.whl 2025-09-04 8.1 MB
iree_base_runtime-3.7.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 8.3 MB
iree_base_runtime-3.7.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 8.3 MB
iree_base_runtime-3.7.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 8.3 MB
iree_base_runtime-3.7.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 8.3 MB
iree_base_runtime-3.7.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 8.3 MB
iree_base_runtime-3.7.0-cp39-cp39-manylinux_2_28_x86_64.whl 2025-09-04 8.1 MB
iree_base_runtime-3.7.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 8.3 MB
iree_base_compiler-3.7.0-cp313-cp313-win_amd64.whl 2025-09-04 52.1 MB
iree_base_compiler-3.7.0-cp312-cp312-win_amd64.whl 2025-09-04 52.1 MB
iree_base_compiler-3.7.0-cp311-cp311-win_amd64.whl 2025-09-04 52.1 MB
iree_base_compiler-3.7.0-cp313-cp313-macosx_13_0_universal2.whl 2025-09-04 66.7 MB
iree_base_compiler-3.7.0-cp312-cp312-macosx_13_0_universal2.whl 2025-09-04 66.7 MB
iree_base_compiler-3.7.0-cp311-cp311-macosx_13_0_universal2.whl 2025-09-04 66.7 MB
iree_base_compiler-3.7.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 80.8 MB
iree_base_compiler-3.7.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 80.8 MB
iree_base_compiler-3.7.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-09-04 81.6 MB
iree_base_compiler-3.7.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 80.9 MB
iree_base_compiler-3.7.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-09-04 81.5 MB
iree_base_compiler-3.7.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 80.8 MB
iree_base_compiler-3.7.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-09-04 81.5 MB
iree_base_compiler-3.7.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-09-04 81.5 MB
iree_base_compiler-3.7.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-09-04 81.5 MB
iree_base_compiler-3.7.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 80.8 MB
iree_base_compiler-3.7.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-09-04 81.5 MB
iree_base_compiler-3.7.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-09-04 80.8 MB
iree-dist-3.7.0rc20250904-linux-aarch64.tar.xz 2025-09-04 87.0 MB
iree-dist-3.7.0rc20250904-linux-x86_64.tar.xz 2025-09-04 75.2 MB
Totals: 43 Items, 1.6 GB
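
The wheel names in the listing above follow the standard wheel filename convention: distribution, version, an optional build tag, Python tag, ABI tag, and platform tag, separated by dashes. The following is a minimal sketch, assuming only that convention and nothing IREE-specific, of how a name from the listing could be split into its tags to see which interpreter and platform it targets. `WheelTags` and `parse_wheel_filename` are hypothetical helper names introduced for illustration; they are not part of IREE or pip.

```python
from typing import NamedTuple, Optional, Tuple


class WheelTags(NamedTuple):
    """Hypothetical container for the fields of a wheel filename."""
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tags: Tuple[str, ...]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated fields.

    Assumes the standard layout:
    {distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # A compressed platform tag lists several platforms separated by dots.
    return WheelTags(dist, version, build, py, abi, tuple(plat.split(".")))


if __name__ == "__main__":
    name = ("iree_base_runtime-3.7.0-cp313-cp313t-"
            "manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl")
    tags = parse_wheel_filename(name)
    print(tags.python_tag, tags.abi_tag, tags.platform_tags)
```

In the listing, an ABI tag of `cp313t` marks a wheel built for the free-threaded CPython 3.13 build, and `py3-none-any` marks a pure-Python wheel that is not tied to a specific interpreter ABI or platform.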

Highlights in IREE v3.7 Release

1. Compiler

1.1 FP4 and BFLOAT Support on CPUs:

1.2 Dispatch and Fusion Improvements:

1.3 GPU Codegen and Optimization:

1.4 Convolution and IGEMM Enhancements:

1.5 Data Tiling and Materialization Updates:

1.6 Reduction and Vectorization Improvements:

1.7 Codegen and Canonicalization:

1.8 CPU Pipeline and Lowering Config:

1.9 Python Bindings & Tuner:

2. Runtime

Note: There is a known issue where IREE may miscompile some matmul dispatches on RDNA4.

Change Log

Git History

* Fix string parsing of i8 and i16 cl values by @qedawkins in https://github.com/iree-org/iree/pull/21409
* [Codegen] Rename `thread_basis` to `lane_basis`. NFC. by @kuhar in https://github.com/iree-org/iree/pull/21412
* [Codegen] Don't add map_scatter for only reshapes by @Max191 in https://github.com/iree-org/iree/pull/21414
* [Encoding] Remove ambiguity from encoding propagation interface methods by @Max191 in https://github.com/iree-org/iree/pull/21415
* [Dispatch Creation] Fuse bit-truncate ops with producers by @IanWood1 in https://github.com/iree-org/iree/pull/21346
* [LLVMCPU] Populate fp4 expansion patterns on CPUs by @krzysz00 in https://github.com/iree-org/iree/pull/21413
* [Codegen] Materialize 0D set_encoding into no-op by @Max191 in https://github.com/iree-org/iree/pull/21418
* [Codegen][Tuner] add python binding for VirtualMMAIntrinsic by @bangtianliu in https://github.com/iree-org/iree/pull/21403
* Integrate LLVM to llvm/llvm-project@5f53182 by @bangtianliu in https://github.com/iree-org/iree/pull/21408
* [CPU] Propagate cache tiling sizes in lowering config propagation. by @hanhanW in https://github.com/iree-org/iree/pull/21410
* [DispatchCreation] Fuse reshape op chains along with set_encoding ops by @Max191 in https://github.com/iree-org/iree/pull/21365
* Integrate LLVM at 92c55a3 by @bjacob in https://github.com/iree-org/iree/pull/21429
* [DT][SVE] DT support for scalable tiles - encoding materialization for mmt4d by @egebeysel in https://github.com/iree-org/iree/pull/21304
* [CPU] Use lowering config attribute interface in LLVMCPUTileAndFuse. by @hanhanW in https://github.com/iree-org/iree/pull/21405
* Simplify the resolution of `scf.forall` created by split reductions. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21422
* [Codegen] Support multi-result and interchanged generic materialization by @Max191 in https://github.com/iree-org/iree/pull/21416
* Removing iree/base/internal/file_io.h by migrating to file handle. by @benvanik in https://github.com/iree-org/iree/pull/21411
* [Codegen] Collect slices to fuse producers of loop destinations into lane foralls. by @YashDeshpande25 in https://github.com/iree-org/iree/pull/21432
* [TensorExt] Add inliner interface by @qedawkins in https://github.com/iree-org/iree/pull/21437
* [codegen] use vector.broadcast instead of vector.splat by @newling in https://github.com/iree-org/iree/pull/21435
* [Dispatch Creation] Don't place bit-truncate in consumer dispatch by @IanWood1 in https://github.com/iree-org/iree/pull/21379
* [Codegen] Fold bitcastss of inner dimensions into binding.subspan by @krzysz00 in https://github.com/iree-org/iree/pull/21443
* [NFC] removing debug statement by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/21446
* [CPU][NFC] Update pack ops to not carry artificial padding. by @hanhanW in https://github.com/iree-org/iree/pull/21440
* [NFC][LLVMGPU] Move intrinsic sorting to deduceMMASchedule by @Groverkss in https://github.com/iree-org/iree/pull/21447
* [Codegen] Add llvm_unreachable for unhandled WorkgroupId cases by @KyleHerndon in https://github.com/iree-org/iree/pull/21442
* Fix an issue in ReferencePatitioning. by @AWoloszyn in https://github.com/iree-org/iree/pull/21343
* Integrate LLVM at aa1b416 by @raikonenfnu in https://github.com/iree-org/iree/pull/21455
* Bump version to 3.7.0 after 3.6.0 release. by @sa-faizal in https://github.com/iree-org/iree/pull/21460
* [Codegen][Tuner]: expose python binding for mma single subgroup layout by @bangtianliu in https://github.com/iree-org/iree/pull/21454
* [Codegen] Add canonicalizer with IREE codegen specific patterns by @Max191 in https://github.com/iree-org/iree/pull/21456
* Update regression tests to not have artificial padding. by @hanhanW in https://github.com/iree-org/iree/pull/21436
* [NFC] Switch dynamic inputs to flow.tensor.dynamic_constant. by @hanhanW in https://github.com/iree-org/iree/pull/21461
* Integrate LLVM at [8fff23] by @bjacob in https://github.com/iree-org/iree/pull/21463
* [Codegen][LLVMGPU] Add fallback patterns for fp4/f8E8M0FNU handling by @krzysz00 in https://github.com/iree-org/iree/pull/21453
* [Flow] Add support for moving operations with dependencies into dispatch regions by @jtuyls in https://github.com/iree-org/iree/pull/21399
* [GPU] Sort intrinsic pairs for attention configuration by @Groverkss in https://github.com/iree-org/iree/pull/21448
* Integrate LLVM at 1c3e4e99 by @bjacob in https://github.com/iree-org/iree/pull/21476
* [LinalgExt] Fix reshape fusion crash by @IanWood1 in https://github.com/iree-org/iree/pull/21472
* [HAL][AMDGPU] Use doorbell handle in iree_amd_make_cached_queue by @atgutier in https://github.com/iree-org/iree/pull/21479
* [Codegen][AMDGPU] Resolve swizzling hints with GatherToLDSOp by @lialan in https://github.com/iree-org/iree/pull/21478
* [build] Add GFX ARCH type to bitcode file names by @atgutier in https://github.com/iree-org/iree/pull/21484
* [DT][NFC] Unified DEBUG_TYPE for encoding materialization implementations. by @hanhanW in https://github.com/iree-org/iree/pull/21480
* [Codegen] Add SwapExtractWithCollapsePattern by @yzhang93 in https://github.com/iree-org/iree/pull/21419
* [DT][VMVX][NFC] Rename and update VMVX encoding materialization tests. by @hanhanW in https://github.com/iree-org/iree/pull/21488
* [GPU][NFC] Delete unused legacy LLVMGPUTensorPad pass. by @hanhanW in https://github.com/iree-org/iree/pull/21489
* [LinalgExt] Add pattern to make attention more static by @IanWood1 in https://github.com/iree-org/iree/pull/21481
* [DispatchCreation] Add unset encoding through generic propagation by @jtuyls in https://github.com/iree-org/iree/pull/21426
* [Codegen][LLVMGPU] Use inner reduction lowering for multi_reduction by @kuhar in https://github.com/iree-org/iree/pull/21486
* [Codegen][LLVMGPU] Remove math scalarization patterns by @krzysz00 in https://github.com/iree-org/iree/pull/21490
* [LinalgExt] Adding lowering to inner_tiled ops for contraction like ops with scales by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/21358
* Integrate LLVM at dc58a08 by @bjacob in https://github.com/iree-org/iree/pull/21494
* [CPU] Add CombineLayoutTransformation passes after distribution passes. by @hanhanW in https://github.com/iree-org/iree/pull/21444
* [HAL] Allow HAL dialect to store attributes in properties structs by @krzysz00 in https://github.com/iree-org/iree/pull/21485
* Add support for split-reduction-tiling of multiple reduction dimensions. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21474
* [NFC] Followup from https://github.com/iree-org/iree/pull/21474 by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21498
* [Flow][NFC] Fix deprecation warnings for ArrayRef(std::nullopt). by @hanhanW in https://github.com/iree-org/iree/pull/21502
* Integrate LLVM at 9e09c4d by @bjacob in https://github.com/iree-org/iree/pull/21495
* [CPU] adjust CPUPrepareUKernelsPass to accept iree_cpu.lowering by @egebeysel in https://github.com/iree-org/iree/pull/21493
* [DispatchCreation] Ensure that the dynamic quantized kernel gets fused into a single dispatch by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21492
* [CPU] Skip distribution passes if the tile sizes are known as zeros. by @hanhanW in https://github.com/iree-org/iree/pull/21508
* Integrate LLVM at llvm/llvm-project@1381ad497b9a. by @hanhanW in https://github.com/iree-org/iree/pull/21510
* [Codegen] Add infra for lowering MLIR ukernels based on descriptors by @jtuyls in https://github.com/iree-org/iree/pull/21428
* [TensorExt] Add folder for bitcast(tensor.cast) by @qedawkins in https://github.com/iree-org/iree/pull/21507
* [Codegen][Util] Remove TiedOpInterface implementation from IREE::Codegen::InnerTiledOp. by @hanhanW in https://github.com/iree-org/iree/pull/21517
* [Codegen] Remove to_buffer from bufferization deny list by @qedawkins in https://github.com/iree-org/iree/pull/21505
* [CPU] Switch CPUDefault pipeline to use IREE::CPU::LoweringConfigAttr. by @hanhanW in https://github.com/iree-org/iree/pull/21515
* [CPU] Switch CPUDoubleTilingExpert pipeline to use IREE::CPU::LoweringConfigAttr. by @hanhanW in https://github.com/iree-org/iree/pull/21354
* [Codegen] Add pattern to bubble bitcast past extract_slice by @qedawkins in https://github.com/iree-org/iree/pull/21518
* [CPU] Tile reduction dimensions for non-root reduction ops. by @hanhanW in https://github.com/iree-org/iree/pull/21500
* [HAL] Add hal.allocator.resolve_memory_properties by @ziereis in https://github.com/iree-org/iree/pull/21115
* Integrate LLVM at llvm/llvm-project@a28e7f1aad3e by @hanhanW in https://github.com/iree-org/iree/pull/21520
* [CPU] Convert accumulating GEMMs to GEMMs. by @hanhanW in https://github.com/iree-org/iree/pull/21473
* Migrate existing mi300 runners to new mi325 capacity. by @deedongala in https://github.com/iree-org/iree/pull/21523
* [Codegen][NFC] Switch to new LDBG macro. by @hanhanW in https://github.com/iree-org/iree/pull/21525
* [Dispatch Creation] Run multi-use fusion after forming dispatches by @IanWood1 in https://github.com/iree-org/iree/pull/21524
* Integrate stablehlo at openxla/stablehlo@69d6dae46e by @hanhanW in https://github.com/iree-org/iree/pull/21529
* [LinalgExt] Use IndexingMapOpInterface for attention by @IanWood1 in https://github.com/iree-org/iree/pull/21469
* [CPU][NFC] Switch existing tests to use IREE::CPU::LoweringConfig. by @hanhanW in https://github.com/iree-org/iree/pull/21516
* [docs] Add a configuration example for ROCm/HIP targets. by @hanhanW in https://github.com/iree-org/iree/pull/21535
* [Codegen][GPU] Tile fully dynamic root ops to the subgroup size by @krzysz00 in https://github.com/iree-org/iree/pull/21526
* [NFC] Add a dev flag to not do reduction vector distribution by @nirvedhmeshram in https://github.com/iree-org/iree/pull/21532
* [Dispatch] Fix return in multiuse fusion by @IanWood1 in https://github.com/iree-org/iree/pull/21536
* Integrate LLVM at llvm/llvm-project@8e9a0fc0f2e5 by @hanhanW in https://github.com/iree-org/iree/pull/21533
* [LLVMGPU][Codegen] Set FMF for arith.mulf + arith.addf -> math.fma by @efric in https://github.com/iree-org/iree/pull/21528
* [GPU] Add col_major optional attribution to VirtualMMAAttr by @bangtianliu in https://github.com/iree-org/iree/pull/21537
* Pattern to hoist pack unpack ops from scf.for op by @YashDeshpande25 in https://github.com/iree-org/iree/pull/21431
* [Codegen][Encoding] Fix generic op materialization with 0D tensors by @Max191 in https://github.com/iree-org/iree/pull/21545
* [CPU] Refresh CPU pipeline verification. by @hanhanW in https://github.com/iree-org/iree/pull/21541
* [CPU] Drop empty tile sizes from lowering config. by @hanhanW in https://github.com/iree-org/iree/pull/21542
* [DT] Perform vectorization if the value is defined by scf.for by @Abhishek-Varma in https://github.com/iree-org/iree/pull/21543
* iree/runtime: iree-cpuinfo: add SME/SVE feature checks for ARM64 macOS by @Manewing in https://github.com/iree-org/iree/pull/21427
* [iree][gpu] Add LLVM func attributes when setting lowering attention config and change default MNTile seed by @fabianmcg in https://github.com/iree-org/iree/pull/21547
* [Codegen] Fix dominance issues blocking consumer fusions by @Max191 in https://github.com/iree-org/iree/pull/21551
* Revert "[iree][gpu] Add LLVM func attributes when setting lowering attention config and change default MNTile seed" by @fabianmcg in https://github.com/iree-org/iree/pull/21561
* Fix parentheses warning in ireeGPUGetSingleSubgroupLayout by @jtuyls in https://github.com/iree-org/iree/pull/21558
* Integrate LLVM at llvm/llvm-project@1194353 by @jtuyls in https://github.com/iree-org/iree/pull/21559
* [CPU][NFCI] Drop the use of TilingConfig from pipeline. by @hanhanW in https://github.com/iree-org/iree/pull/21556
* [Codegen] Add padding for convolutions before IGEMM by @yzhang93 in https://github.com/iree-org/iree/pull/21470
* [LinalgExt] Implement unit dim folding pattern for map_scatter by @Max191 in https://github.com/iree-org/iree/pull/21563
* [GPU] Add vector distribution pattern for map_scatter by @Max191 in https://github.com/iree-org/iree/pull/21124
* [Codegen] Fix dynamic tensor ukernel descriptor lowering by @jtuyls in https://github.com/iree-org/iree/pull/21570
* [Codegen][e2e testing] Add regression tests of matvec with dynamic reduction by @newling in https://github.com/iree-org/iree/pull/21538
* [VMVX] Migrate VMVX backend to use IREE::CPU::LoweringConfigAttr. by @hanhanW in https://github.com/iree-org/iree/pull/21566
* [CPU][NFC] Migrate TilingConfig to interface methods in split reduction pass. by @hanhanW in https://github.com/iree-org/iree/pull/21564
* [Codegen] Introduce lowering config interface methods for vectorization. by @hanhanW in https://github.com/iree-org/iree/pull/21555
* [CPU][NFC] Migrate TilingConfig to interface methods in LLVMCPU2DScalableTo1DScalable pass. by @hanhanW in https://github.com/iree-org/iree/pull/21565
* [CPU] Drop TilingConfig from KernelDispatch.cpp by @hanhanW in https://github.com/iree-org/iree/pull/21567
* [CPU][NFC] Delete TilingConfig. by @hanhanW in https://github.com/iree-org/iree/pull/21568
* [CPU][DT] Implement data layout propagation for CPU dispatches. by @hanhanW in https://github.com/iree-org/iree/pull/21554
* [Codegen][IGEMM] Fix pre-padding for group convolutions by @yzhang93 in https://github.com/iree-org/iree/pull/21583
* [CPU][AArch64][Test] Add more tests for encoding materialisation by @banach-space in https://github.com/iree-org/iree/pull/21560
* [ROCM] Add ukernel descriptor PDL pattern infra by @jtuyls in https://github.com/iree-org/iree/pull/21572
* Integrate LLVM at llvm/llvm-project@215e6beae02334 by @hanhanW in https://github.com/iree-org/iree/pull/21576
* [DT] Add support for materializing func.func and func.return op. by @hanhanW in https://github.com/iree-org/iree/pull/21582
* [Codegen][GPU] Adding heuristic strategy to reduce tile size to fill workloads to all CUs by @jerryyin in https://github.com/iree-org/iree/pull/21546
* Register `VectorExt` Dialect in LLVMCPUTarget by @NoumanAmir657 in https://github.com/iree-org/iree/pull/21593
* [Codegen] Refactor CombineLayoutTransformation with scope options by @Max191 in https://github.com/iree-org/iree/pull/21577
* [Codegen] Cater to bitwidth of largest operand in reduction by @kuhar in https://github.com/iree-org/iree/pull/21438
* [DT] Fix a bug in encoding propagation when there are scalar inputs. by @hanhanW in https://github.com/iree-org/iree/pull/21596
* LDBG fixes for "[Codegen] Cater to bitwidth of largest operand in reduction" by @hanhanW in https://github.com/iree-org/iree/pull/21601
* [LLVMGPU] Support map_scatter in LLVMGPUVectorDistribute pipeline by @Max191 in https://github.com/iree-org/iree/pull/21595
* [Codegen] linalg.generic with dynamic reduction dim: use `LLVMGPUVectorDistribution`. by @newling in https://github.com/iree-org/iree/pull/21430
* [Dispatch Creation] Fuse pad with generic conv consumer by @IanWood1 in https://github.com/iree-org/iree/pull/21606
* Integrate llvm-project@cfd1ee781f by @krzysz00 in https://github.com/iree-org/iree/pull/21598
* [CODEGEN] Remove special case logic for poison padding by @newling in https://github.com/iree-org/iree/pull/21574
* [CODEGEN] Allow pack-unpack pairs to be hoisted through multiple forOps by @YashDeshpande25 in https://github.com/iree-org/iree/pull/21569
* [Codegen][VectorDistribute] Add pattern to distribute poison by @newling in https://github.com/iree-org/iree/pull/21573
* [Codegen] Refactor CombineLayoutTransformation to use patterns by @Max191 in https://github.com/iree-org/iree/pull/21592
* [ROCM] Add support for multiple-of/bounds PDL constraints by @jtuyls in https://github.com/iree-org/iree/pull/21578
* Integrate llvm-project@351b38f2 by @krzysz00 in https://github.com/iree-org/iree/pull/21609
* Update the logic for resolve `scf.forall` to account for maximum number of workgroups. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21584
* [CPU] Re-enable math tests for RISC-V targets. by @hanhanW in https://github.com/iree-org/iree/pull/21608
* [NFC] Trim compile flags from GPU sharktank tests. by @hanhanW in https://github.com/iree-org/iree/pull/21617
* [Codegen][ROCDL] Add test to ensure fp4 truncation is packed by @krzysz00 in https://github.com/iree-org/iree/pull/21553
* [Codegen] Add CPU e2e tests for fp4 conversions by @krzysz00 in https://github.com/iree-org/iree/pull/21445
* [DispatchCreation] Allow more encoding op fusions by @Max191 in https://github.com/iree-org/iree/pull/21612
* [Codegen] Fix return of non-owning reference in CombineLayoutTransformation by @Max191 in https://github.com/iree-org/iree/pull/21618
* [CPU] Use default flags + iree-opt-level in sharktank tests. by @hanhanW in https://github.com/iree-org/iree/pull/21607
* Cleanup the way config values are retrieved. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21610
* [GPU][Codegen] Distribute to single subgroup for large parallel dimension in reduction by @efric in https://github.com/iree-org/iree/pull/21499
* Revert "[Codegen] Fix dominance issues blocking consumer fusions (#21…551)" by @Max191 in https://github.com/iree-org/iree/pull/21632
* [ROCM] Add PDL pattern driver for embedding ukernels by @jtuyls in https://github.com/iree-org/iree/pull/21591
* [GPU] Add pass to tile convolution operations to matmul by @nirvedhmeshram in https://github.com/iree-org/iree/pull/21552
* [GPU][NFCI] Make dot/mma field optional and trim the IR. by @hanhanW in https://github.com/iree-org/iree/pull/21626
* Reapply "[Codegen] Fix dominance issues blocking consumer fusions (#21551)" by @Max191 in https://github.com/iree-org/iree/pull/21637
* [GPU][NFC] Deprecate iree-codegen-gpu-native-math-precision flag. by @hanhanW in https://github.com/iree-org/iree/pull/21636
* Bump llvm/torch-mlir@46925eb by @zjgarvey in https://github.com/iree-org/iree/pull/21628
* [GPU][DT] dce unused tensor.dim ops in SpecializeExports by @jtuyls in https://github.com/iree-org/iree/pull/21624
* Integrate llvnm-project@ff616a19 by @krzysz00 in https://github.com/iree-org/iree/pull/21614
* [docs] Add documentation for updating golden outputs by @efric in https://github.com/iree-org/iree/pull/21641
* Unifying dispatch/dispatch_indirect and adding extended configuration. by @benvanik in https://github.com/iree-org/iree/pull/21627
* Adding iree_hal_device_queue_dispatch. by @benvanik in https://github.com/iree-org/iree/pull/21630
* [LLVMCPU] Fix llvmcpu check before conversion for complex types by @castigli in https://github.com/iree-org/iree/pull/21644
* [Codegen] Update tests to be in correct state for strategy selection by @newling in https://github.com/iree-org/iree/pull/21647
* Toggle the default option to false for pre-padding convolution flag by @yzhang93 in https://github.com/iree-org/iree/pull/21579
* Update workgroup count op syntax by @rkayaith in https://github.com/iree-org/iree/pull/21656
* Integrate llvm/llvm-project@1ffc38ca4 by @kuhar in https://github.com/iree-org/iree/pull/21658
* [Codegen] Select for pad value just before yielding by @newling in https://github.com/iree-org/iree/pull/21581
* Free buffers synchronously if async caching is disabled. by @AWoloszyn in https://github.com/iree-org/iree/pull/21668
* [DT][NFCI] Switch SetEncoding pass to walk-based pass. by @hanhanW in https://github.com/iree-org/iree/pull/21662
* Bump the github-actions group with 3 updates by @dependabot[bot] in https://github.com/iree-org/iree/pull/21655
* Add support for ml_dtypes to python runtime bindings by @rsuderman in https://github.com/iree-org/iree/pull/21549
* Integrate llvm/llvm-project@8071d279 by @kuhar in https://github.com/iree-org/iree/pull/21669
* [LLVMCPU] Tracks the dimension mapping for multi lowering config by @Yu-Zhewen in https://github.com/iree-org/iree/pull/21649
* [CPU][NFC] Improve code quality and make few methods local. by @hanhanW in https://github.com/iree-org/iree/pull/21673
* Bump llvm/torch-mlir@155680c by @vivekkhandelwal1 in https://github.com/iree-org/iree/pull/21680
* [DT][CPU] Exclude pack ops with reshape producers from lowering config setting by @Yu-Zhewen in https://github.com/iree-org/iree/pull/21675
* [ROCM] Add ukernel descriptor lowering to pipeline by @jtuyls in https://github.com/iree-org/iree/pull/21634
* Fix workgroup_count_from_slice assembly format in test by @jtuyls in https://github.com/iree-org/iree/pull/21685
* [Codegen][GPU] Use arithmetic intensity to guide gemm size categorization - Step 1 by @jerryyin in https://github.com/iree-org/iree/pull/21638
* Integrate llvm/llvm-project@0ff92fe2f by @kuhar in https://github.com/iree-org/iree/pull/21689
* [DispatchCreation] Drop unit dims for flow.parameter.named by @Groverkss in https://github.com/iree-org/iree/pull/21687
* [DT] Set encodings if `iree.opt.data_tiling` unit attribute is attached. by @hanhanW in https://github.com/iree-org/iree/pull/21676
* Add `ChipDetails` definition for MI350X and MI355X target. by @amd-eochoalo in https://github.com/iree-org/iree/pull/21690
* Numerical tests: softmax with dynamic reduction size by @newling in https://github.com/iree-org/iree/pull/21594
* Integrate llvm/llvm-project@9a14b1d254a by @kuhar in https://github.com/iree-org/iree/pull/21702
* [Codegen] Skip scalar ops in large tensor tiling pass by @qedawkins in https://github.com/iree-org/iree/pull/21704
* Revert "[codegen][gpu] Add the `iree-rocdl-use-buffer-instructions` pass (#21335)" by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21695
* [docs] Fix a typo and attach the pass link in tuning.md by @hanhanW in https://github.com/iree-org/iree/pull/21707
* [Codegen][LLVMGPU] Config tests for matmuls by @newling in https://github.com/iree-org/iree/pull/21697
* [Codegen] Use vector distribute for softmax with dynamic reduction size by @newling in https://github.com/iree-org/iree/pull/21650
* Work around gcc bug. NFC. by @kuhar in https://github.com/iree-org/iree/pull/21711
* Move windows builds to experimental to unblock release packages. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21712
* Fix misc coding issues. NFC. by @kuhar in https://github.com/iree-org/iree/pull/21713
* Exclude broken ninja version for Windows package builds. by @ScottTodd in https://github.com/iree-org/iree/pull/21717
* Disable failing ARM-SME tests. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21715
* Use range small vector constructors. NFC. by @kuhar in https://github.com/iree-org/iree/pull/21719
* Expose loops transforms through python api by @Hardcode84 in https://github.com/iree-org/iree/pull/21710
* [Integrate] Drop LLVM revert of "Remove matmul_transpose variants" by @hanhanW in https://github.com/iree-org/iree/pull/21344
* Drop needless template parameters from patterns. NFC. by @kuhar in https://github.com/iree-org/iree/pull/21721
* Revert "Move windows builds to experimental to unblock release packages." by @ScottTodd in https://github.com/iree-org/iree/pull/21723
* [DT] Drop the data-tiling hint after encodings are set. by @hanhanW in https://github.com/iree-org/iree/pull/21724
* [ROCM] Readd SpecializeExports pass by @qedawkins in https://github.com/iree-org/iree/pull/21727
* Fix SmallVector conversion error with gcc by @jtuyls in https://github.com/iree-org/iree/pull/21725
* Bump sarisia/actions-status-discord from 1.15.3 to 1.15.4 in the github-actions group by @dependabot[bot] in https://github.com/iree-org/iree/pull/21730
* Integrate llvm/llvm-project@6fc1deb8b749 by @hanhanW in https://github.com/iree-org/iree/pull/21732
* Remove myself from samples/ CODEOWNERS. by @ScottTodd in https://github.com/iree-org/iree/pull/21726
* [Codegen] Improve early bufferized padding codegen by @Max191 in https://github.com/iree-org/iree/pull/21694
* [CPU] Improve TileRootAndFuseProducerConsumer pass and deprecate TileAndFuse pass. by @hanhanW in https://github.com/iree-org/iree/pull/21674
* Apply UnsignedWhenEquivalent at the ModuleOp level. by @amd-eochoalo in https://github.com/iree-org/iree/pull/21743
* Integrate LLVM at llvm/llvm-project@c65c0e87fc73 by @hanhanW in https://github.com/iree-org/iree/pull/21744
* Integrate LLVM at [bfab80] by @Groverkss in https://github.com/iree-org/iree/pull/21747
* Adding semaphore creation and wait flags for controlling behavior. by @benvanik in https://github.com/iree-org/iree/pull/21619
* Adding iree_hal_device_queue_host_call and emulation. by @benvanik in https://github.com/iree-org/iree/pull/21653
* Fixing merge conflict from [#21619] + [#21653]. by @benvanik in https://github.com/iree-org/iree/pull/21751
* [ConstEval] Do not jit parameterized flow.tensor.constants by @Groverkss in https://github.com/iree-org/iree/pull/21748
* [Dispatch] CollapseDims for extract_slice and scf.forall by @IanWood1 in https://github.com/iree-org/iree/pull/21708
* [Codegen] Add matmul and batched matmul to list of ops to generalize by @newling in https://github.com/iree-org/iree/pull/21720
* [NFC] Moving iree_hal_amdgpu_bitmap to iree/base/internal/. by @benvanik in https://github.com/iree-org/iree/pull/21666
* Temporarily disable the circular buffer for parameter uploads. by @AWoloszyn in https://github.com/iree-org/iree/pull/21758
* [RISCV] Remove unused cmake variables. by @HanKuanChen in https://github.com/iree-org/iree/pull/21746
* Adding IREE_HAL_COMMAND_BUFFER_MODE_UNRETAINED flag. by @benvanik in https://github.com/iree-org/iree/pull/21755
* [DT] Graduate data-tiling fusion from experimental flag to binding option. by @hanhanW in https://github.com/iree-org/iree/pull/21745
* [ROCM] Port mlir ukernels to ukernel descriptor lowering flow by @jtuyls in https://github.com/iree-org/iree/pull/21683
* [Codegen] PV and QK matmul's must have same acc layout by @newling in https://github.com/iree-org/iree/pull/21729
* [DispatchCreation] Fix trailing unit dims case for collapse of expand folding by @dan-garvey in https://github.com/iree-org/iree/pull/21677
* [Codegen] Add corner case for SwapExtractWithCollapsePattern by @yzhang93 in https://github.com/iree-org/iree/pull/21773
* [ROCM] Fix redefinition of symbol error for including tensor ukernels by @jtuyls in https://github.com/iree-org/iree/pull/21780
* [Codegen][IGEMM] Fix and preserve padding dim order for convs by @yzhang93 in https://github.com/iree-org/iree/pull/21772
* [ROCM] Update Ukernel infra to allow ROCM-specific bitcode ukernel lowering by @Abhishek-Varma in https://github.com/iree-org/iree/pull/21681
* [Codegen] Add XOR-based Swizzle Attribute by @sebvince in https://github.com/iree-org/iree/pull/21562
* [GPU][DT] Fix matmul narrow dim selection by @Yu-Zhewen in https://github.com/iree-org/iree/pull/21764
* [NFC] Remove debug messages by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/21768
* Integrate LLVM at llvm/llvm-project@4b84223aad4f by @IanWood1 in https://github.com/iree-org/iree/pull/21791
* [Codegen][Tuner] expose python binding to query target info by @bangtianliu in https://github.com/iree-org/iree/pull/21782
* [Codegen] Remove WarpReduction from ROCDL pipeline by @newling in https://github.com/iree-org/iree/pull/21795
* [Codegen][GPU] Use arithmetic intensity to guide gemm size categorization - step 2 by @jerryyin in https://github.com/iree-org/iree/pull/21691
* [Dispatch][GlobalOpt] Improve transpose fusion for conv by @IanWood1 in https://github.com/iree-org/iree/pull/21778
* [Codegen][LLVMGPU] Give ops same config irrespective of generalized/specialized by @newling in https://github.com/iree-org/iree/pull/21769
* Drop TensorCore/MMA pipelines. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21741
* Integrate LLVM at llvm/llvm-project@f2e6ca805dbb by @IanWood1 in https://github.com/iree-org/iree/pull/21805
* [Codegen][GPU] Adding new heuristics to take all dimensions into account when distributing tiles by @jerryyin in https://github.com/iree-org/iree/pull/21803
* [GPU] Add pattern to sink extract_slice through generic ops by @nirvedhmeshram in https://github.com/iree-org/iree/pull/21796
* [ROCM] Add zero fill check to ukernel patterns by @jtuyls in https://github.com/iree-org/iree/pull/21793
* [GPU][DT] Fix LHS operand offset calculation for DataTiledMMAAttr by @Yu-Zhewen in https://github.com/iree-org/iree/pull/21808
* [VectorDistribute] Correctly find new dimensions during reduction config by @Groverkss in https://github.com/iree-org/iree/pull/21797
* [VectorDistribute] Do not handle bit extend during matmul configuration by @Groverkss in https://github.com/iree-org/iree/pull/21798
* [codegen] more consumer fusion by @ftynse in https://github.com/iree-org/iree/pull/21521
* Move ROCM tests to fix dialect not registered error by @jtuyls in https://github.com/iree-org/iree/pull/21811
* Migrate ROCM ukernels from tuning spec to ukernel descriptor lowering by @jtuyls in https://github.com/iree-org/iree/pull/21794
* [Codegen] Rewrite test so LLVMGPUWarpReduction is not used by @newling in https://github.com/iree-org/iree/pull/21770
* [LinalgExt][NFC] Delete duplicated SingleBlockImplicitTerminator trait. by @hanhanW in https://github.com/iree-org/iree/pull/21818
* Revert "[codegen] more consumer fusion (#21521)" by @pravg-amd in https://github.com/iree-org/iree/pull/21819
* [Codegen][LLVMGPU] Remove LLVMGPUWarpReduction pipeline by @newling in https://github.com/iree-org/iree/pull/21821
* [codegen][rocdl] Remove ROCDLKernelConfig and ROCDLSelectLoweringStrategy by @fabianmcg in https://github.com/iree-org/iree/pull/21820
* Revert "[VectorDistribute] Correctly find new dimensions during reduction config" by @Groverkss in https://github.com/iree-org/iree/pull/21810
* Integrate LLVM at llvm/llvm-project@74275a11038c by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/21831
* [Codegen][GPU] Use arithmetic intensity to guide gemm size categorization - step 3 by @jerryyin in https://github.com/iree-org/iree/pull/21826
* [Hoisting] Fix the double-free issue in `HoistIntoGlobalsPass::cleanupDeadOp`. by @JerryShih in https://github.com/iree-org/iree/pull/21699
* [iree-test-suites] Add data tiling tests for LLAMA 8B by @Abhishek-Varma in https://github.com/iree-org/iree/pull/21832
* Integrate LLVM at llvm/llvm-project@9c7727c62af0 by @fabianmcg in https://github.com/iree-org/iree/pull/21835

New Contributors

Full Changelog: https://github.com/iree-org/iree/compare/v3.6.0...v3.7.0

Source: README.md, updated 2025-09-04