| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| Neuron SDK Release - July 31, 2025 source code.tar.gz | 2025-07-31 | 56.9 MB | |
| Neuron SDK Release - July 31, 2025 source code.zip | 2025-07-31 | 57.9 MB | |
| README.md | 2025-07-31 | 3.2 kB | |
| Totals: 3 Items | | 114.8 MB | 0 |

Neuron 2.25.0 delivers updates across several key areas: inference performance optimizations, expanded model support, enhanced profiling capabilities, improved monitoring and observability tools, framework updates, and refreshed development environments and container offerings. The release includes bug fixes across the SDK components, along with updated tutorials and documentation for new features and model deployments.

Inference Optimizations (NxD Core and NxDI)

Neuron 2.25.0 introduces performance optimizations and new capabilities, including:

- On-device Forward Pipeline, reducing latency by up to 43% in models like Pixtral
- Context and Data Parallel support for improved batch scaling
- Chunked Attention for efficient long sequence processing
- 128k context length support for Llama 70B models
- Automatic Aliasing (Beta) for faster tensor operations
- Disaggregated Serving (Beta) showing 20% improvement in ITL/TTST
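
These notes do not describe how Chunked Attention is implemented in NxDI. As a rough illustration of one common reading of the term, the sketch below assumes that each token attends causally only to tokens inside its own fixed-size chunk, so the number of attended pairs grows with the chunk size rather than with the full sequence length. The function names, `chunk_size`, and the masking rule are assumptions for illustration, not Neuron APIs.

```python
def chunked_causal_mask(seq_len: int, chunk_size: int) -> list[list[bool]]:
    """Build mask[q][k], True where query position q may attend to key position k.

    Assumed rule (not confirmed by the release notes): attention is causal
    and restricted to keys that fall in the same chunk as the query.
    """
    if seq_len < 0 or chunk_size <= 0:
        raise ValueError("seq_len must be >= 0 and chunk_size must be > 0")
    return [
        [k <= q and k // chunk_size == q // chunk_size for k in range(seq_len)]
        for q in range(seq_len)
    ]


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the (query, key) pairs the mask allows."""
    return sum(sum(row) for row in mask)


# Two chunks of four tokens versus one chunk covering the whole sequence.
chunked = chunked_causal_mask(seq_len=8, chunk_size=4)
full = chunked_causal_mask(seq_len=8, chunk_size=8)
print(attended_pairs(chunked), attended_pairs(full))
```

With a chunk size equal to the sequence length, the mask reduces to an ordinary causal mask, which makes the two cases easy to compare.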

Model Support (NxDI)

Neuron 2.25.0 expands model support to include:

- Qwen3 dense models (0.6B to 32B parameters)
- Flux.1-dev model for text-to-image generation (Beta)
- Pixtral-Large-Instruct-2411 for image-to-text generation (Beta)

Profiling Updates

Enhancements to profiling capabilities include:

- Addition of timestamp sync points to align device execution with CPU events
- Expanded JSON output providing the same detailed data set used by the Neuron Profiler UI
- New total active time metric showing accelerator utilization as percentage of total runtime
- Fixed DMA active time calculation for more accurate measurements
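
The total active time metric can be read as busy time divided by total runtime. The following is a minimal sketch of that arithmetic, assuming activity is available as (start, end) intervals that may overlap. The interval data and the function name are hypothetical and do not reflect the profiler's actual JSON schema.

```python
def active_time_percent(intervals, total_runtime):
    """Return the share of total_runtime covered by the union of the intervals.

    intervals: iterable of (start, end) pairs in the same time unit as
    total_runtime. Overlapping intervals are merged so that concurrent
    activity is not counted twice.
    """
    if total_runtime <= 0:
        raise ValueError("total_runtime must be positive")

    active = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if end <= start:
            continue  # skip empty or inverted intervals
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                active += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        active += current_end - current_start

    return 100.0 * active / total_runtime


# Hypothetical trace: two overlapping busy periods and one separate period.
print(active_time_percent([(0, 30), (20, 50), (70, 80)], total_runtime=100))
```

Merging before summing matters because naive summation of overlapping intervals could report more than 100% utilization.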

Monitoring and Observability

- neuron-ls now displays CPU and NUMA node affinity information
- neuron-ls adds NeuronCore IDs display for each Neuron Device
- neuron-monitor improves accuracy of device utilization metrics

Framework Updates

- JAX 0.6.1 support added, maintaining compatibility with versions 0.4.31-0.4.38 and 0.5
- vLLM support upgraded to version 0.9.x V0
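
To check an environment against the JAX versions listed above, a small helper along these lines may be useful. It is a sketch that assumes "0.5" means any 0.5.x patch release and that 0.6 support is limited to exactly 0.6.1. `is_listed_jax_version` is a hypothetical name, not part of the Neuron SDK.

```python
def is_listed_jax_version(version: str) -> bool:
    """Return True if version matches the JAX versions named in these notes.

    Assumptions: plain "major.minor.patch" strings only; any 0.5.x counts
    as "0.5"; only 0.6.1 counts for the 0.6 series.
    """
    try:
        parts = tuple(int(piece) for piece in version.split("."))
    except ValueError:
        return False  # pre-release or otherwise non-numeric version strings
    if len(parts) != 3:
        return False

    major, minor, patch = parts
    if (major, minor) == (0, 4):
        return 31 <= patch <= 38
    if (major, minor) == (0, 5):
        return True
    return parts == (0, 6, 1)


for candidate in ("0.4.30", "0.4.35", "0.5.3", "0.6.1"):
    print(candidate, is_listed_jax_version(candidate))
```

In practice the string passed in could come from `jax.__version__` in the target environment.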

Development Environment Updates

Neuron SDK updated to version 2.25.0 in:

- Deep Learning AMIs on Ubuntu 22.04 and Amazon Linux 2023
- Multi-framework DLAMI with environments for both PyTorch and JAX
- PyTorch 2.7 Single Framework DLAMI
- JAX 0.6 Single Framework DLAMI

Container Support

Neuron SDK updated to version 2.25.0 in:

- PyTorch 2.7 Training and Inference DLCs
- JAX 0.6 Training DLC
- vLLM 0.9.1 Inference DLC
- Neuron Device Plugin and Scheduler container images for Kubernetes integration