IREE v3.5.0
Name Modified Size
iree-dist-3.5.0rc20250609-linux-aarch64.tar.xz 2025-06-11 84.3 MB
iree-dist-3.5.0rc20250609-linux-x86_64.tar.xz 2025-06-11 72.4 MB
iree_base_runtime-3.5.0-cp39-cp39-manylinux_2_28_x86_64.whl 2025-06-11 8.1 MB
iree_base_runtime-3.5.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 8.2 MB
iree_base_runtime-3.5.0-cp313-cp313-win_amd64.whl 2025-06-11 5.6 MB
iree_base_runtime-3.5.0-cp313-cp313t-manylinux_2_28_x86_64.whl 2025-06-11 8.1 MB
iree_base_runtime-3.5.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 8.2 MB
iree_base_runtime-3.5.0-cp313-cp313-manylinux_2_28_x86_64.whl 2025-06-11 8.1 MB
iree_base_runtime-3.5.0-cp313-cp313-macosx_13_0_universal2.whl 2025-06-11 3.9 MB
iree_base_runtime-3.5.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 8.2 MB
iree_base_runtime-3.5.0-cp312-cp312-win_amd64.whl 2025-06-11 5.6 MB
iree_base_runtime-3.5.0-cp312-cp312-manylinux_2_28_x86_64.whl 2025-06-11 8.1 MB
iree_base_runtime-3.5.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 8.2 MB
iree_base_runtime-3.5.0-cp312-cp312-macosx_13_0_universal2.whl 2025-06-11 3.9 MB
iree_base_runtime-3.5.0-cp311-cp311-win_amd64.whl 2025-06-11 5.6 MB
iree_base_runtime-3.5.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 8.2 MB
iree_base_runtime-3.5.0-cp311-cp311-manylinux_2_28_x86_64.whl 2025-06-11 8.1 MB
iree_base_runtime-3.5.0-cp311-cp311-macosx_13_0_universal2.whl 2025-06-11 3.9 MB
iree_base_runtime-3.5.0-cp310-cp310-manylinux_2_28_x86_64.whl 2025-06-11 8.1 MB
iree_base_runtime-3.5.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 8.2 MB
iree_base_compiler-3.5.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-06-11 76.4 MB
iree_base_compiler-3.5.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 75.6 MB
iree_base_compiler-3.5.0-cp313-cp313-win_amd64.whl 2025-06-11 51.0 MB
iree_base_compiler-3.5.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-06-11 76.4 MB
iree_base_compiler-3.5.0-cp313-cp313t-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 75.6 MB
iree_base_compiler-3.5.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-06-11 76.4 MB
iree_base_compiler-3.5.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 75.6 MB
iree_base_compiler-3.5.0-cp313-cp313-macosx_13_0_universal2.whl 2025-06-11 62.3 MB
iree_base_compiler-3.5.0-cp312-cp312-win_amd64.whl 2025-06-11 51.0 MB
iree_base_compiler-3.5.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-06-11 76.4 MB
iree_base_compiler-3.5.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 75.6 MB
iree_base_compiler-3.5.0-cp312-cp312-macosx_13_0_universal2.whl 2025-06-11 62.3 MB
iree_base_compiler-3.5.0-cp311-cp311-win_amd64.whl 2025-06-11 51.0 MB
iree_base_compiler-3.5.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-06-11 76.4 MB
iree_base_compiler-3.5.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 75.6 MB
iree_base_compiler-3.5.0-cp311-cp311-macosx_13_0_universal2.whl 2025-06-11 62.3 MB
iree_base_compiler-3.5.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl 2025-06-11 76.4 MB
iree_base_compiler-3.5.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl 2025-06-11 75.6 MB
iree_tools_tflite-20250609.1287-py3-none-any.whl 2025-06-11 3.6 kB
iree_tools_tf-20250609.1287-py3-none-any.whl 2025-06-11 32.6 kB
README.md 2025-06-11 31.7 kB
Release v3.5.0 source code.tar.gz 2025-06-11 9.4 MB
Release v3.5.0 source code.zip 2025-06-11 12.8 MB
Totals: 43 Items, 1.6 GB
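The `.whl` names in the listing follow the standard wheel naming layout (distribution, version, Python tag, ABI tag, platform tag), so the right file for a given interpreter can be identified from the name alone. The following is a minimal sketch of that split; the helper `parse_wheel_filename` and the `WheelName` type are hypothetical names introduced here for illustration, and the optional build tag that the wheel format allows is deliberately not handled because none of the files listed use one.

```python
from typing import NamedTuple, Tuple


class WheelName(NamedTuple):
    """Fields of a wheel filename (hypothetical helper type, not part of IREE)."""
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: Tuple[str, ...]


def parse_wheel_filename(filename: str) -> WheelName:
    """Split 'dist-version-python-abi-platform.whl' into its five fields.

    Assumption: no optional build tag is present (true for the files listed).
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    distribution, version, python_tag, abi_tag, platform = parts
    # A compressed platform tag set is written as several tags joined by '.'.
    return WheelName(distribution, version, python_tag, abi_tag,
                     tuple(platform.split(".")))
```

For example, `parse_wheel_filename("iree_base_runtime-3.5.0-cp313-cp313t-manylinux_2_28_x86_64.whl")` yields the ABI tag `cp313t`, the tag used for free-threaded CPython 3.13 builds, while the `py3-none-any` wheels for `iree_tools_tf` and `iree_tools_tflite` are pure Python and not tied to a platform.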

Notable changes

Compiler

  • Added support for AMD Radeon 9060XT and Radeon PRO AI R9070 GPUs #21035.
  • Defined the new gfx950 target in AMDGPU support, together with its new MFMA (Matrix Fused Multiply-Add) operations #20623.
  • Introduced CombineLayoutTransformation to consolidate transpose, reshape, and slice into a single iree_linalg_ext.map_scatter operation #20655.
  • Supported medium-sized expanded-shape FP8 in the pingpong strategy #20735, and removed dynamic M bounds checks for pingpong strategies in AMDGPU support #20738.
  • Enhanced CombineLayoutTransformation to support folding tensor.pad operations into map_scatter operations #20797.
  • Constrained GPUAllocPrivateMemoryForDPSOps pass to pure tensor semantics, addressing recent implementation issues #20939.
  • Enhanced the SpecializeEncodings pass to handle pad-based encodings, allowing non-serializable encodings to be converted into serializable forms #20845.
  • Prevented the hoisting of set_encoding and unset_encoding operations that carry padding encodings #20733. Dispatch creation now uses patterns to bubble expand_shape operations up across collapse_shape operations #20648, removes unit dimensions from the mask operand of the attention operation #20796, and limits when padding encoding is applied #20732.
  • Added the F8E8M0FNU type as a valid HAL element type #20783. The valid HAL element types now cover the scaled MFMA types f8E8M0FNU, f6E3M2FN, and f4E2M1FN.
  • Linalg Extensions Dialect improvements (#20688, #20728, #20747, #20776, #19719, #20827, #20863, #20568, #20916).
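To make the small floating-point types named above more concrete, here is a minimal decoding sketch. It assumes that f8E8M0FNU, f6E3M2FN, and f4E2M1FN denote the scale and element formats of the OCP Microscaling (MX) specification: f8E8M0FNU as an unsigned power-of-two exponent with bias 127 and 0xFF reserved for NaN, and the other two as sign/exponent/mantissa formats with no infinities or NaNs. The function names are hypothetical and are not IREE APIs; how IREE or the hardware actually consumes these values is not described in these notes.

```python
import math


def decode_e8m0(byte: int) -> float:
    """Decode an f8E8M0FNU value: 8 exponent bits, no sign, no mantissa.

    Assumption (OCP MX scale format): value = 2**(byte - 127), 0xFF is NaN.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if byte == 0xFF:
        return math.nan
    return math.ldexp(1.0, byte - 127)


def decode_small_float(bits: int, exp_bits: int, man_bits: int, bias: int) -> float:
    """Decode a sign/exponent/mantissa float that has no Inf or NaN encodings.

    f4E2M1FN: exp_bits=2, man_bits=1, bias=1
    f6E3M2FN: exp_bits=3, man_bits=2, bias=3
    """
    total_bits = 1 + exp_bits + man_bits
    if not 0 <= bits < (1 << total_bits):
        raise ValueError(f"expected a {total_bits}-bit value")
    sign = -1.0 if (bits >> (exp_bits + man_bits)) & 1 else 1.0
    exponent = (bits >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = bits & ((1 << man_bits) - 1)
    if exponent == 0:
        # Subnormal: no implicit leading one, exponent fixed at 1 - bias.
        return sign * math.ldexp(mantissa, 1 - bias - man_bits)
    # Normal: implicit leading one in front of the mantissa bits.
    return sign * math.ldexp((1 << man_bits) + mantissa, exponent - bias - man_bits)
```

Under these assumptions, `decode_small_float(0b0111, 2, 1, 1)` gives the largest f4E2M1FN magnitude, 6.0, and multiplying it by a shared scale such as `decode_e8m0(128)` (2.0) illustrates the general idea of block scaling: a group of narrow elements shares one power-of-two scale factor.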

Runtime

  • Introduced hal.executable.export condition regions, enhancing dispatch decision-making at each site based on device capabilities and workload parameters #20739.
  • Added user-defined IREE_ALLOCATOR_SYSTEM support #20727, providing the ability for external override of the allocator control function.
  • Added mimalloc v3 as an optional system allocator #20730; setting -DIREE_ALLOCATOR_SYSTEM=mimalloc statically links mimalloc into iree::base.
  • Enabled the import of external streams into HIP for scenarios requiring close integration with external applications #20972.
  • Heterogeneous device support is under development. It will allow compiled programs to allocate buffers across compatible devices and synchronize operations via semaphores. This initial phase supports CPU-only configurations #20851.
  • The IREE PJRT plugin now supports memory-related APIs and logging control, and has been updated to the latest API version #20911.
  • Implemented an experimental #hal.device.optimal<...> affinity attribute for runtime-resolvable device affinities, initially focusing on allocation-related operations #20879.

New Contributors

Full changelog

List of changes

* Move ASM to the end of languages list in CMakeLists.txt by @javidcf in https://github.com/iree-org/iree/pull/19781
* Add MathToROCDL patterns in ConvertToROCLPass. by @benvanik in https://github.com/iree-org/iree/pull/20684
* [Codegen] Add pass to bufferize dispatch.tensor.load/store ops by @Max191 in https://github.com/iree-org/iree/pull/20627
* Raise an error in demotion passes if illegal extern funcs are present. by @benvanik in https://github.com/iree-org/iree/pull/20679
* Properly handle unaligned refs in VM ABI marshaling. by @benvanik in https://github.com/iree-org/iree/pull/20671
* Integrate llvm/llvm-project@7b70fc7 by @IanWood1 in https://github.com/iree-org/iree/pull/20674
* [Codegen][GPU] Add placeholder op for buffer casts on tensors by @qedawkins in https://github.com/iree-org/iree/pull/20589
* Call IREE_TRACE_APP_ENTER/EXIT in compiler tool main functions. by @benvanik in https://github.com/iree-org/iree/pull/20686
* Revert "Call IREE_TRACE_APP_ENTER/EXIT in compiler tool main functions." by @benvanik in https://github.com/iree-org/iree/pull/20691
* Revert "Update workflows to run on macOS 15 (#20675)" by @marbre in https://github.com/iree-org/iree/pull/20690
* Add `flow.tensor.bitcast` for torch view as complex/real. by @benvanik in https://github.com/iree-org/iree/pull/20689
* [Dispatch Creation] Fix infinite reshape loop by @IanWood1 in https://github.com/iree-org/iree/pull/20162
* Fold iree_tensor_ext.dispatch.workload.ordinal on constants. by @benvanik in https://github.com/iree-org/iree/pull/20687
* Add relative error for buffer comparison by @nirvedhmeshram in https://github.com/iree-org/iree/pull/19464
* [Encoding] Allow `PadEncodingAttribute` to support dynamic padding. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20662
* [Codegen] Clean up TilingInterfaceUtils. NFC. by @kuhar in https://github.com/iree-org/iree/pull/20661
* [iree-benchmark] Ensure destructors run before `IREE_TRACE_APP_EXIT` by @rkayaith in https://github.com/iree-org/iree/pull/20694
* [DispatchCreation] Set padding encodings on intermediate tensors. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20634
* [GPU] Cross lane reduction rather than serial by @pashu123 in https://github.com/iree-org/iree/pull/20680
* [Codegen] Drop read_only from LoadFromMemrefOp. by @hanhanW in https://github.com/iree-org/iree/pull/20693
* [Im2col] Remain batch dimension untiled during decomposition when it is contiguous and innermost by @yzhang93 in https://github.com/iree-org/iree/pull/20633
* Runtime float type conversion helpers: Fix handling of denormals. by @bjacob in https://github.com/iree-org/iree/pull/20676
* [NFC] Converting the VM dialect to use tablegen passes. by @benvanik in https://github.com/iree-org/iree/pull/20698
* [LLVMGPU] Vector distribute config to handle dyn dims by @pashu123 in https://github.com/iree-org/iree/pull/20603
* [VectorDistribution] Improve vector.broadcast distribution by @Groverkss in https://github.com/iree-org/iree/pull/20652
* [Codegen][GPU] Improve intrinsic based Attention heuristics by @Groverkss in https://github.com/iree-org/iree/pull/20695
* Adding pass documentation for IREE dialects and pipelines. by @benvanik in https://github.com/iree-org/iree/pull/20705
* [AMDGPU] Support mask optimization for multiple users by @nirvedhmeshram in https://github.com/iree-org/iree/pull/20697
* [Im2col] Fix bug when there is no batch dimension by @yzhang93 in https://github.com/iree-org/iree/pull/20711
* Runtime float conversion helpers: add FP6, FP4 and E8M0 types. by @bjacob in https://github.com/iree-org/iree/pull/20707
* [Codegen][DerivedConfig] Add support to set outermost tile size as vector size by @yzhang93 in https://github.com/iree-org/iree/pull/20692
* Fixing missing allocator arg on the vulkan dynamic symbol table. by @benvanik in https://github.com/iree-org/iree/pull/20712
* [Encoding] Add convertType interface to generalize type conversion by @jtuyls in https://github.com/iree-org/iree/pull/20700
* [Codegen][GPU] Keep range and divisibility annotations on push constants by @krzysz00 in https://github.com/iree-org/iree/pull/19348
* [Codegen][AMDGPU] Add pingpong to default gfx942 tuning by @qedawkins in https://github.com/iree-org/iree/pull/20678
* [Encoding] Drop resolver interface implementation for SpecializedEncodingAttr by @hanhanW in https://github.com/iree-org/iree/pull/20718
* Bump version to 3.5.0 after 3.4.0 release. by @ScottTodd in https://github.com/iree-org/iree/pull/20721
* [LinalgExt] Implement tiling interface for map_scatter by @Max191 in https://github.com/iree-org/iree/pull/20688
* [NFC][Encoding] Move materializeEncodingValueFn to type converter by @jtuyls in https://github.com/iree-org/iree/pull/20720
* Adding user-defined IREE_ALLOCATOR_SYSTEM support. by @benvanik in https://github.com/iree-org/iree/pull/20727
* Added missing dependencies for the bazel build by @TheCBaH in https://github.com/iree-org/iree/pull/20719
* Print opt flags to more accurately reproduce (.linked -> .optimized) by @newling in https://github.com/iree-org/iree/pull/20716
* [NFC] Use ShapedType::isDynamicShape when possible. by @hanhanW in https://github.com/iree-org/iree/pull/20731
* [Codegen][Common] Add transform op to check for lowering configs when matching by @qedawkins in https://github.com/iree-org/iree/pull/20724
* [DispatchCreation] Avoid hoisting set encoding operations with padding encodings by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20733
* [Codegen] Add pass to combine layout transformations by @Max191 in https://github.com/iree-org/iree/pull/20655
* [NFC] Move pack/unpack e2e tests to linalg/. by @hanhanW in https://github.com/iree-org/iree/pull/20728
* Adding mimalloc v3 as an optional system allocator. by @benvanik in https://github.com/iree-org/iree/pull/20730
* [Docs] Make op/attr/type summary styles consistent by @qedawkins in https://github.com/iree-org/iree/pull/20726
* Integrate llvm/llvm-project@15f7c6e by @IanWood1 in https://github.com/iree-org/iree/pull/20725
* [Codegen] Fix invalid use of iterators in `PropagateReshapesByExpansion` by @rkayaith in https://github.com/iree-org/iree/pull/20740
* [AMDGPU] Drop dynamic M bounds checks for pingpong by @qedawkins in https://github.com/iree-org/iree/pull/20738
* [DispatchCreation] Use patterns to bubble up expand shape across collapse shapes. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20648
* Continue trying executable loaders when a loader reports NOT_FOUND. by @benvanik in https://github.com/iree-org/iree/pull/20745
* Adding a `hal.executable.export` condition region. by @benvanik in https://github.com/iree-org/iree/pull/20739
* [LinalgExt][NFC] Move transformtion method declarations to Transforms.h by @hanhanW in https://github.com/iree-org/iree/pull/20747
* Pingpong: add medium-sized expanded-shape FP8 by @bjacob in https://github.com/iree-org/iree/pull/20735
* Add regression test for [#20740] / [#20736] by @rkayaith in https://github.com/iree-org/iree/pull/20750
* [Encoding] Use struct directive for encodingAttr assembly format by @jtuyls in https://github.com/iree-org/iree/pull/20746
* [hip] Add flag for disabling caching for async allocations. by @AWoloszyn in https://github.com/iree-org/iree/pull/20753
* e2e matmul test improvements: faster diagnostics, finer control with environment variables by @bjacob in https://github.com/iree-org/iree/pull/20755
* [Flow] Improve DumpDispatchGraph pass for programs at model level. by @hanhanW in https://github.com/iree-org/iree/pull/20756
* [GPU] Enable vector distribute pipeline for Matvecs by default by @pashu123 in https://github.com/iree-org/iree/pull/20706
* Adding quality and benchmark config docs by @geomin12 in https://github.com/iree-org/iree/pull/20759
* Integrate LLVM to llvm/llvm-project@8404b29 by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20757
* Fix iree-codegen-llvmcpu-configuration-pipeline registration by @RSchwan in https://github.com/iree-org/iree/pull/20761
* Metal HAL: remove shadowed variable by @ziereis in https://github.com/iree-org/iree/pull/20760
* [AMDGPU] Define gfx950 target and its MFMAs by @krzysz00 in https://github.com/iree-org/iree/pull/20623
* Emit a warning when one of the iree-input-demote-* passes is used. by @benvanik in https://github.com/iree-org/iree/pull/20784
* Workaround for stack overflow in stream refine usage. by @benvanik in https://github.com/iree-org/iree/pull/20749
* [NFC][Codegen] Move EncodingNop LayoutAttrInterface to external model by @jtuyls in https://github.com/iree-org/iree/pull/20778
* [HAL] Add F8E8M0FNU by @tgymnich in https://github.com/iree-org/iree/pull/20783
* [Codegen][NFC] Move bufferization test out from LLVMCPU/test. by @hanhanW in https://github.com/iree-org/iree/pull/20789
* [DispatchCreation] White list ops that can be cloned. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20791
* [CPU][NFC] Lit tests cleanup and improvements. by @hanhanW in https://github.com/iree-org/iree/pull/20790
* [NFC][Codegen] Move getEncodingInfo to PackedLayoutAttrInterface by @jtuyls in https://github.com/iree-org/iree/pull/20780
* [NFC] Move LayoutAttrInterface to Encoding by @jtuyls in https://github.com/iree-org/iree/pull/20782
* [LinalgExt] Remove region from LinalgExt::GatherOp by @Groverkss in https://github.com/iree-org/iree/pull/20776
* [NFC][Encoding] Move convertType to LayoutAttrInterface by @jtuyls in https://github.com/iree-org/iree/pull/20794
* [Codegen][LLVMGPU] Optionally linearize the number of workgroups specified by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20787
* [GPU] Enable vector distribute on reduction operations by default by @pashu123 in https://github.com/iree-org/iree/pull/20751
* [Flow] Set known dimensions on concat output by @AaronStGeorge in https://github.com/iree-org/iree/pull/20795
* [DispatchCreation] Remove unit dim from attn mask by @IanWood1 in https://github.com/iree-org/iree/pull/20796
* [ROCm] Set ABI version control variable correctly by @krzysz00 in https://github.com/iree-org/iree/pull/20800
* Removing invalid folder for vm.add + vm.sub ops. by @benvanik in https://github.com/iree-org/iree/pull/20808
* [Encoding] Add getOffsetSizesStrides interface for load/store materialization by @jtuyls in https://github.com/iree-org/iree/pull/20741
* [NFC] Refactor duplicated getEncodingInfo logic by @jtuyls in https://github.com/iree-org/iree/pull/20820
* [GPU] Increase the VAE benchmark threshold by @pashu123 in https://github.com/iree-org/iree/pull/20809
* [PJRT] Fix tensor element type for signed integers by @PragmaTwice in https://github.com/iree-org/iree/pull/19496
* [Codegen] split-k on argmax op by @bangtianliu in https://github.com/iree-org/iree/pull/20717
* [GlobalOptimization] Do not hoist fill-like operations by @Groverkss in https://github.com/iree-org/iree/pull/19719
* [NFC] Cleaning up flow canonicalize pass. by @benvanik in https://github.com/iree-org/iree/pull/20826
* [DispatchCreation] Remove CollapseReductionDimensionsPass by @IanWood1 in https://github.com/iree-org/iree/pull/20829
* [DispatchCreation] Set limits on when padding encoding is applied. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20732
* [iree-test-suite] Update the sharktank models benchmark time by @pashu123 in https://github.com/iree-org/iree/pull/20830
* [LinalgExt] Add a canonicalization pattern to drop unused results from sort op by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/20827
* Update `hanhanW` for CODEOWNERS based on recent activities. by @hanhanW in https://github.com/iree-org/iree/pull/20840
* Sink cast-like flow ops across flow.tensor.transfer/barrier. by @benvanik in https://github.com/iree-org/iree/pull/20839
* [Codegen][GPU] Add support for allocating private memory for unused DPS results by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/20793
* [Codegen] Make ReconcileTranslationInfo work with multiple exports by @qedawkins in https://github.com/iree-org/iree/pull/20801
* Cleaning up iree_hal_module_debug_sink_t destroy. by @benvanik in https://github.com/iree-org/iree/pull/20841
* [Codegen][ROCDL] Drop nominal support for dynamic shared mem by @qedawkins in https://github.com/iree-org/iree/pull/20805
* Integrate llvm-project@faf5d747f174cc by @krzysz00 in https://github.com/iree-org/iree/pull/20828
* [Codegen][NFC] Refresh remove_single_iteration_loop.mlir test. by @hanhanW in https://github.com/iree-org/iree/pull/20842
* [TensorExt] Drop space from count_from_slice printer by @qedawkins in https://github.com/iree-org/iree/pull/20850
* [LinalgExt] Clone iree_linalg_ext.gather (5/5) by @IanWood1 in https://github.com/iree-org/iree/pull/20563
* Fix logic for yieldReplacements in tileDispatchUsingForall by @pashu123 in https://github.com/iree-org/iree/pull/20844
* [Dispatch Creation] Handle `linalg.fill` in collapse dimensions by @IanWood1 in https://github.com/iree-org/iree/pull/20863
* [VectorExt] Fix transfer_gather printer by @IanWood1 in https://github.com/iree-org/iree/pull/20860
* [Codegen] Fix dominance issue in collapse shape fusion by @jtuyls in https://github.com/iree-org/iree/pull/20864
* [VectorExt] Vectorize `iree_linalg_ext.gather` by @IanWood1 in https://github.com/iree-org/iree/pull/20807
* [Dispatch Creation] Clone iree_linalg_ext.gather for attn by @IanWood1 in https://github.com/iree-org/iree/pull/20866
* [LinalgExt] Add map_scatter e2e tests for CPU and VMVX backends. by @hanhanW in https://github.com/iree-org/iree/pull/20861
* [Codegen][GPU] Support padding in CombineLayoutTransformation by @Max191 in https://github.com/iree-org/iree/pull/20797
* Fix padding to nop encoding specialization by @jtuyls in https://github.com/iree-org/iree/pull/20837
* [CodeGen] Fix a MemoryEffectsOpInterface bug in FuseConsumerOp. by @hanhanW in https://github.com/iree-org/iree/pull/20869
* [GPU] Vector distribution support for multiple stores by @pashu123 in https://github.com/iree-org/iree/pull/20816
* [AMDGPU] Rewrite some gpu.shuffle xor to ds_swizzle, per upstream by @krzysz00 in https://github.com/iree-org/iree/pull/20868
* [Codegen] Support multiple forall ops in ReconcileTranslationInfo by @Max191 in https://github.com/iree-org/iree/pull/20848
* [Codegen] Add ukernel support for argmax on BF16 and enable optional max value return by @bangtianliu in https://github.com/iree-org/iree/pull/20768
* [VectorExt] Fix illegal transfer_read during gather vectorization by @IanWood1 in https://github.com/iree-org/iree/pull/20876
* Align iree_hal_sync_device_t allocation to 16 bytes. by @FantasqueX in https://github.com/iree-org/iree/pull/20773
* [LinalgExt] Canonicalize gather to an extract_slice by @IanWood1 in https://github.com/iree-org/iree/pull/20878
* [NFC][Codegen] Rename early bufferization op operands by @Max191 in https://github.com/iree-org/iree/pull/20874
* [LinalgExt] Fold unit dims for iree_linalg_ext.gather by @Groverkss in https://github.com/iree-org/iree/pull/20877
* [Preprocessing] Add SinkReshapesPass in MakeSingleDispatchPassPipeline by @yzhang93 in https://github.com/iree-org/iree/pull/20882
* [Codegen] Add patterns to fold reshapes into load_from/store_to_memref by @Max191 in https://github.com/iree-org/iree/pull/20881
* [Flow] Dump affinity info in DumpDispatchGraph pass. by @hanhanW in https://github.com/iree-org/iree/pull/20888
* [Dispatch Creation] Fix GatherFusionPattern crash by @IanWood1 in https://github.com/iree-org/iree/pull/20887
* Temporary automatic reference counting(ish) pass for inserting async deallocations. by @benvanik in https://github.com/iree-org/iree/pull/20765
* Adding tryLookupResourceUsageAffinity. by @benvanik in https://github.com/iree-org/iree/pull/20891
* Adding support for `#hal.device.optimal<...>` through to runtime. by @benvanik in https://github.com/iree-org/iree/pull/20879
* Fix Link error when `IREECompiler.lib` hits 4GiB by @amd-justchen in https://github.com/iree-org/iree/pull/20892
* [Codegen][NFC] Make namespace usage follow IREE::[Encoding|Codegen]. by @hanhanW in https://github.com/iree-org/iree/pull/20894
* Integrate llvm-project@7a8090c037255b54895d61df2eb141fee48d6d83 by @Groverkss in https://github.com/iree-org/iree/pull/20873
* Add support for dynamic unit trip scf.for to scf.if by @nirvedhmeshram in https://github.com/iree-org/iree/pull/20880
* Adding --iree-rocm-container-type= flag. by @benvanik in https://github.com/iree-org/iree/pull/20902
* [NFC] Rename load_from/store_to_memref to load_from/store_to_buffer by @Max191 in https://github.com/iree-org/iree/pull/20897
* [Codegen] Add pass for specializing executable variants by @qedawkins in https://github.com/iree-org/iree/pull/20771
* Lower `linalg.copy` to direct global load by @lialan in https://github.com/iree-org/iree/pull/20568
* [PJRT] Support PJRT_Memory related APIs in IREE PJRT plugin by @PragmaTwice in https://github.com/iree-org/iree/pull/20911
* Removing canonicalization from CloneToConsumersPass. by @benvanik in https://github.com/iree-org/iree/pull/20917
* Speeding up two hotspots in large programs. by @benvanik in https://github.com/iree-org/iree/pull/20909
* Ignoring single-user ops in CloneToConsumers. by @benvanik in https://github.com/iree-org/iree/pull/20921
* Cleaning up some solver/affinity logging/comments. by @benvanik in https://github.com/iree-org/iree/pull/20922
* [Integrate] Make IREE compatible with the new memref.assume_alignment semantic change. by @hanhanW in https://github.com/iree-org/iree/pull/20913
* [NFC] Simplify constant checks with isZeroInteger and isOneInteger utils. by @hanhanW in https://github.com/iree-org/iree/pull/20915
* Refresh the uses of memref.assume_alignment in lit tests. by @hanhanW in https://github.com/iree-org/iree/pull/20925
* [PJRT] Enable CUDA build for PJRT plugin in pkgci by @PragmaTwice in https://github.com/iree-org/iree/pull/20927
* [LinalgExt] Add gather reshape propagation by @IanWood1 in https://github.com/iree-org/iree/pull/20916
* Adding affinity solver max iterations flag and upping default. by @benvanik in https://github.com/iree-org/iree/pull/20923
* Integrate llvm-project@d45031ce5281 by @hanhanW in https://github.com/iree-org/iree/pull/20924
* [PJRT] Add simple level control to the logger in PJRT plugin by @PragmaTwice in https://github.com/iree-org/iree/pull/20932
* [CPU][RISCV][NFC] Trim IRs from lowering strategy selection tests. by @hanhanW in https://github.com/iree-org/iree/pull/20933
* [Integrate] Drop two reverts for getBackwardSlice changes. by @hanhanW in https://github.com/iree-org/iree/pull/20934
* [GPU] Fix reduction kernel config for vectordistribute by @pashu123 in https://github.com/iree-org/iree/pull/20903
* [LinalgExt][NFC] Check for tensor semantics directly by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/20936
* Revert "[Integrate] Drop two reverts for getBackwardSlice changes." by @hanhanW in https://github.com/iree-org/iree/pull/20941
* Register `ub` dialect and `gpu` passes by @rkayaith in https://github.com/iree-org/iree/pull/20938
* [Codegen] Propagate relayout ops before combining by @Max191 in https://github.com/iree-org/iree/pull/20901
* [Codegen] Enable reshape into buffer folding in BlockDynamicDimensions by @Max191 in https://github.com/iree-org/iree/pull/20898
* Restrict GPUAllocPrivateMemoryForDPSOps to pure tensor semantics by @bjacob in https://github.com/iree-org/iree/pull/20939
* [PJRT] Update PJRT API version to 0.68 by @PragmaTwice in https://github.com/iree-org/iree/pull/20930
* [NFC][LinalgExt] Remove duplicate logic from ReshapeFusion by @IanWood1 in https://github.com/iree-org/iree/pull/20940
* Adds a limit to the solver update-on-initialize recursion depth. by @benvanik in https://github.com/iree-org/iree/pull/20944
* Integrate llvm-project@28eb66b79413 by @hanhanW in https://github.com/iree-org/iree/pull/20942
* [NFC][PJRT] Add a notice for log level setting to PJRT README by @PragmaTwice in https://github.com/iree-org/iree/pull/20951
* Remove reference to linalg op tests in iree-test-suites. by @ScottTodd in https://github.com/iree-org/iree/pull/20956
* Precomputing pinned value affinities during analysis. by @benvanik in https://github.com/iree-org/iree/pull/20945
* [NFC] Fix test issues on Windows. by @lialan in https://github.com/iree-org/iree/pull/20957
* Remove IREE_DISABLE_THREAD_SAFETY_ANALYSIS by @bjacob in https://github.com/iree-org/iree/pull/20954
* Integrate llvm-project@587d6fcbb685e3a57 by @hanhanW in https://github.com/iree-org/iree/pull/20948
* [CPU][NFC] Trim IRs from tile_and_fuse and illegal_configuration tests. by @hanhanW in https://github.com/iree-org/iree/pull/20967
* Fix 'failed to legalize' in padding materialization by @jtuyls in https://github.com/iree-org/iree/pull/20969
* Use adaptor instead of storeOp to fix 'failed to legalize' by @jtuyls in https://github.com/iree-org/iree/pull/20971
* [Codegen][GPU][NFC] Remove dead MMA interface method by @krzysz00 in https://github.com/iree-org/iree/pull/20960
* [GPU] Add overflow flag to index addition in prefetching pass by @nirvedhmeshram in https://github.com/iree-org/iree/pull/20975
* Run windows_x64_msvc on postsubmit and opt-in on presubmit (retry). by @ScottTodd in https://github.com/iree-org/iree/pull/20958
* [DumpExecutableBenchmarks] Use MapVector to simplify code by @rkayaith in https://github.com/iree-org/iree/pull/20976
* Allow importing an external stream into HIP. by @AWoloszyn in https://github.com/iree-org/iree/pull/20972
* [Codegen][GPU] Refactor the way use_direct_load is propagated. by @lialan in https://github.com/iree-org/iree/pull/20926
* [CodeGen][NFC] Delete empty file that was accidentally added. by @hanhanW in https://github.com/iree-org/iree/pull/20983
* [Encoding] Teach specialize encodings to handle pad encodings. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/20845
* [Integrate] Mirror and prioritize the old ConvertVectorStore pattern. by @hanhanW in https://github.com/iree-org/iree/pull/20981
* [NFC] Refresh the interface names for Encoding dialect and data-tiling specifics. by @hanhanW in https://github.com/iree-org/iree/pull/20985
* [Util] Improve Util::FoldDimOp folder to handle memref.assume_alignment ops. by @hanhanW in https://github.com/iree-org/iree/pull/20984
* Assign optimal affinities to allocations during ScheduleAllocation. by @benvanik in https://github.com/iree-org/iree/pull/20965
* Prefetch shared memory in presence of scf.if by @nirvedhmeshram in https://github.com/iree-org/iree/pull/20904
* Bump dawidd6/action-download-artifact from 9 to 10 in the github-actions group by @dependabot in https://github.com/iree-org/iree/pull/20978
* Integrate llvm-project@7797824297e17d4c02fbb1cb904c7919f21af47e by @nirvedhmeshram in https://github.com/iree-org/iree/pull/20987
* [CodeGen] Drop the workaround for memref.assume_alignment chain. by @hanhanW in https://github.com/iree-org/iree/pull/20973
* [Codegen][GPU][NFC] More MMA dead code removal by @krzysz00 in https://github.com/iree-org/iree/pull/20980
* [Codegen] Use attributes to define default tuning specs by @qedawkins in https://github.com/iree-org/iree/pull/20979
* Release our buffer reference regardless of buffer_view success. by @AWoloszyn in https://github.com/iree-org/iree/pull/20988
* Disable flaky tensorcore_vectorization test by @qedawkins in https://github.com/iree-org/iree/pull/21002
* [docs] Update sharktuner documentation by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/20704
* [CPU][NFC] Trim more redundant IRs from lit tests. by @hanhanW in https://github.com/iree-org/iree/pull/21003
* [Stream] Adding AffinityTopologyAttrInterface and HAL implementation. by @ziereis in https://github.com/iree-org/iree/pull/20885
* [Codegen][GPU] Lower gpu.subgroup_reduce to DPP intrinsics on AMD GPUs by @Muzammiluddin-Syed-ECE in https://github.com/iree-org/iree/pull/20468
* [Codegen][NFC] Create tiling utilities file by @AaronStGeorge in https://github.com/iree-org/iree/pull/20961
* [Codegen] Create `scf.forall` -> `scf.for` pass by @AaronStGeorge in https://github.com/iree-org/iree/pull/20962
* [ROCM] Fix tuning module string parameter by @qedawkins in https://github.com/iree-org/iree/pull/21008
* [Dispatch Creation] Merge bubbling of expand and extract by @IanWood1 in https://github.com/iree-org/iree/pull/20989
* Integrate llvm-20250604 by @nirvedhmeshram in https://github.com/iree-org/iree/pull/21010
* Remove restriction on IGEMM lowering for dilated convolutions by @yzhang93 in https://github.com/iree-org/iree/pull/21011
* Add note about GPU time synchronization by @erieaton-amd in https://github.com/iree-org/iree/pull/20997
* Revert "[Codegen][ROCDL] Drop nominal support for dynamic shared mem … by @pravg-amd in https://github.com/iree-org/iree/pull/21020
* [LLVMGPU] Fix linking error when one of the variants has no modules. by @MaheshRavishankar in https://github.com/iree-org/iree/pull/21027
* Folding util.assume.int values that have a single possible value. by @benvanik in https://github.com/iree-org/iree/pull/21025
* [NFC][DispatchCreation] Add better extract of expand test by @IanWood1 in https://github.com/iree-org/iree/pull/21013
* Properly order multiple emplaced dispatch results. by @benvanik in https://github.com/iree-org/iree/pull/21026
* Fix LHS addressing in medium pingpong f16 by @bjacob in https://github.com/iree-org/iree/pull/21017
* [GPU] When Prefetching do not duplicate read stage ops in write stage by @nirvedhmeshram in https://github.com/iree-org/iree/pull/21031
* Integrates/llvm 20250606 by @nirvedhmeshram in https://github.com/iree-org/iree/pull/21030
* Adding IREE::HAL::AnnotateTargetDevicesPass. by @benvanik in https://github.com/iree-org/iree/pull/21022
* [ROCm][Vulkan] Add known targets for Radeon R9070 and 9060XT by @kuhar in https://github.com/iree-org/iree/pull/21035
* Adding AMDGPU HAL driver skeleton. by @benvanik in https://github.com/iree-org/iree/pull/20990
* [Codegen] split-k on argmax to ensure ukernel support by @bangtianliu in https://github.com/iree-org/iree/pull/20906

Commit history: https://github.com/iree-org/iree/compare/v3.4.0...v3.5.0

Source: README.md, updated 2025-06-11