| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| arm64-v8a-linux-1.13.0.tgz | 2023-07-19 | 676.6 MB | |
| armeabi-v7a-android-1.13.0.tgz | 2023-07-19 | 580.3 MB | |
| arm64-v8a-android-1.13.0.tgz | 2023-07-19 | 665.1 MB | |
| armeabi-v7a-softfp-linux-1.13.0.tgz | 2023-07-19 | 580.6 MB | |
| MegEngine v1.13.tar.gz | 2023-07-07 | 7.9 MB | |
| MegEngine v1.13.zip | 2023-07-07 | 11.1 MB | |
| README.md | 2023-07-07 | 3.7 kB | |
| Totals: 7 Items | | 2.5 GB | 0 |
MegEngine
Highlights
- MegEngine can now compile, optimize, and execute traced graphs with XLA. On cuda11.8/cudnn8.6.0, typical classification networks run 10%~80% faster. This feature is experimental. For more information, see the linked documentation.
- Subsequent versions will no longer support cuda10.1.
Bug Fixes
Dataloader
- Improve the dataloader's error reporting so that Dataloader workers no longer crash silently or hang.
- Remove the future warning raised by pyarrow.SerializationContext().
- Fix repeated warnings when the pyarrow version is higher than 1.12.
Third-Party Hardware
- Support multiple input formats (nhwc, nchw, nc1hwc0) when aipp is enabled on atlas.
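The three layout names above differ only in how a logical element (n, c, h, w) maps to a flat memory offset. The sketch below illustrates those mappings in plain Python; it is not MegEngine or atlas code, the function names are hypothetical, and the channel block size `C0 = 16` for nc1hwc0 is an assumption that the release notes do not state.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat offset with channels as the second-slowest axis."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat offset with channels as the fastest-varying axis."""
    return ((n * H + h) * W + w) * C + c


def offset_nc1hwc0(n, c, h, w, C, H, W, C0=16):
    """Flat offset with channels split into C1 blocks of C0 (C0=16 is assumed)."""
    C1 = (C + C0 - 1) // C0      # number of channel blocks, rounded up
    c1, c0 = divmod(c, C0)       # block index and position inside the block
    return (((n * C1 + c1) * H + h) * W + w) * C0 + c0


# Same logical element, three different physical positions.
C, H, W = 20, 2, 3
n, c, h, w = 0, 17, 1, 2
nchw = offset_nchw(n, c, h, w, C, H, W)
nhwc = offset_nhwc(n, c, h, w, C, H, W)
nc1hwc0 = offset_nc1hwc0(n, c, h, w, C, H, W)
```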
General Components
- Fix incorrect index results when the start of a slice is negative.
- Fix a TracedModule compatibility issue caused by parameter type information in ArgSpec being serialized.
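As a reference for the slice fix, a negative start is conventionally counted from the end of the axis. The sketch below shows that convention with a plain Python list, not a MegEngine tensor, on the assumption that MegEngine follows the same Python/NumPy-style semantics; `normalize_start` is a hypothetical helper, not part of any library.

```python
def normalize_start(start, length):
    """Map a possibly negative slice start to a non-negative index."""
    if start < 0:
        start += length
    # Clamp to the valid range, as Python slicing does.
    return max(0, min(start, length))


data = [10, 20, 30, 40, 50]
start = -2

# A negative start selects the same elements as its normalized counterpart.
sliced = data[start:]
normalized = data[normalize_start(start, len(data)):]
```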
New Features
Python API
- Support conversion between megengine tensors and dlpack tensors.
- Add a trilinear mode to the interpolate operator.
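For readers unfamiliar with the trilinear mode, the following is a minimal sketch of textbook trilinear interpolation at a single point inside a unit cube. It does not use MegEngine and is not a description of how the interpolate operator is implemented; the function names and the `corners[i][j][k]` layout are chosen purely for illustration.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def trilinear(corners, x, y, z):
    """Interpolate inside a unit cube.

    corners[i][j][k] is the value at (x=i, y=j, z=k) with i, j, k in {0, 1}.
    """
    # Collapse the x axis: four linear interpolations.
    c00 = lerp(corners[0][0][0], corners[1][0][0], x)
    c01 = lerp(corners[0][0][1], corners[1][0][1], x)
    c10 = lerp(corners[0][1][0], corners[1][1][0], x)
    c11 = lerp(corners[0][1][1], corners[1][1][1], x)
    # Collapse the y axis: two linear interpolations.
    c0 = lerp(c00, c10, y)
    c1 = lerp(c01, c11, y)
    # Collapse the z axis: one final linear interpolation.
    return lerp(c0, c1, z)


# Corner values follow 4*i + 2*j + k, so the result is easy to check by hand.
corners = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]
center = trilinear(corners, 0.5, 0.5, 0.5)
corner = trilinear(corners, 1.0, 0.0, 1.0)
```

Seven linear interpolations in total (four, then two, then one) give the value at any interior point, and at a cube corner the result equals that corner's value.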
CUDA
- Add cuda/naive MHA proxy implementation.
General Components
- jit.trace supports a without host mode, currently intended mainly for integrating other deep learning compilers such as XLA. When without host is True, the compiled function wrapped by trace no longer executes its original python code and does not check whether the operator sequence matches the one recorded by trace, so you must ensure that the traced part is completely static.
- Support computation between external-framework tensors and mge tensors; for example, mge.tensor(torch.tensor)+mge.tensor returns the sum of the two.
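The requirement that the traced part be completely static can be illustrated with a toy recorder. This is a conceptual sketch only: `Proxy` and `ToyTrace` are hypothetical classes written for this example and do not reflect MegEngine's jit.trace implementation or API. The first call records the operator sequence; later calls replay it without running the Python body or re-checking anything.

```python
import operator


class Proxy:
    """Wraps a value and records every arithmetic op applied to it."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def _apply(self, op, other):
        self.tape.append((op, other))
        return Proxy(op(self.value, other), self.tape)

    def __add__(self, other):
        return self._apply(operator.add, other)

    def __mul__(self, other):
        return self._apply(operator.mul, other)


class ToyTrace:
    """Toy stand-in for a trace wrapper that never returns to Python."""

    def __init__(self, func):
        self.func = func
        self.tape = None  # recorded op sequence, filled on the first call

    def __call__(self, x):
        if self.tape is None:
            tape = []
            result = self.func(Proxy(x, tape))
            self.tape = tape
            return result.value
        # Replay only: the Python body is skipped and nothing is re-checked.
        for op, operand in self.tape:
            x = op(x, operand)
        return x


python_calls = []


@ToyTrace
def step(x):
    python_calls.append(x)   # Python side effect, runs only while tracing
    if x.value > 0:          # data-dependent branch, frozen at trace time
        return x * 2 + 1
    return x * 0


first = step(3)    # traced: takes the positive branch and records mul, add
second = step(-5)  # replayed: the branch is not re-evaluated
```

On the second call the recorded positive branch is replayed even though the input is negative, which is exactly the kind of silent mismatch that a non-static traced function would produce in such a mode.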
XLA
- Implement lowering rules from mge ops to XLA HLO IR, enabling MegEngine to compile and call XLA.