Alternatives to NVIDIA HPC SDK

Compare NVIDIA HPC SDK alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to NVIDIA HPC SDK in 2026. Compare features, ratings, user reviews, pricing, and more from NVIDIA HPC SDK competitors and alternatives in order to make an informed decision for your business.

  • 1
    CUDA
    CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs. In GPU-accelerated applications, the sequential part of the workload runs on the CPU – which is optimized for single-threaded performance – while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers program in popular languages such as C, C++, Fortran, Python, and MATLAB and express parallelism through extensions in the form of a few basic keywords. The CUDA Toolkit from NVIDIA provides everything you need to develop GPU-accelerated applications, including GPU-accelerated libraries, a compiler, development tools, and the CUDA runtime.
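
    The blurb above mentions programming CUDA from Python; a minimal hedged sketch using Numba's CUDA support (one of several Python routes onto CUDA, and an assumption here) looks like this:

    ```python
    # Hedged sketch: a SAXPY kernel written in Python with Numba's CUDA support.
    # Assumes the numba package and a CUDA-capable GPU; sizes are arbitrary.
    import numpy as np
    from numba import cuda

    @cuda.jit
    def saxpy(a, x, y, out):
        i = cuda.grid(1)              # global thread index
        if i < out.size:              # guard threads past the end of the array
            out[i] = a * x[i] + y[i]

    n = 1 << 20
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(x)

    threads_per_block = 256
    blocks = (n + threads_per_block - 1) // threads_per_block
    saxpy[blocks, threads_per_block](np.float32(2.0), x, y, out)  # host arrays are copied to and from the GPU
    print(out[:4])
    ```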
  • 2
    Linaro Forge
    Linaro Forge is an integrated HPC debugging and performance analysis suite that helps developers build reliable, optimized code for servers and high-performance computing environments by combining three core tools: Linaro DDT, a market-leading debugger for C, C++, Fortran, and Python applications; Linaro MAP, a performance profiler that highlights bottlenecks and suggests optimization strategies; and Linaro Performance Reports, which generates concise, one-page summaries of application performance. It supports a wide range of parallel architectures and programming models, including MPI, OpenMP, CUDA, and GPU-accelerated environments on x86-64, 64-bit Arm, and other CPUs and GPUs, and offers a common user interface that makes it easy to switch between debugging and profiling during development.
  • 3
    Arm Allinea Studio
    Arm Allinea Studio is a suite of tools for developing server and HPC applications on Arm-based platforms. It contains Arm-specific compilers and libraries, and debug and optimization tools. Arm Performance Libraries provide optimized standard core math libraries for high-performance computing applications on Arm processors. The library routines are available through both Fortran and C interfaces, as the sketch below illustrates. Arm Performance Libraries are built with OpenMP across many BLAS, LAPACK, FFT, and sparse routines to maximize performance in multi-processor environments.
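
    As a concrete illustration of those C interfaces, here is a hedged ctypes sketch calling a CBLAS matrix multiply from Python; the shared-library name "libarmpl.so" and its presence on the loader path are assumptions about your installation:

    ```python
    # Hedged sketch: calling Arm Performance Libraries' CBLAS dgemm via ctypes.
    import ctypes
    import numpy as np

    armpl = ctypes.CDLL("libarmpl.so")                # assumption: Arm PL installed
    CblasRowMajor, CblasNoTrans = 101, 111            # standard CBLAS enum values
    m = n = k = 4
    A = np.random.rand(m, k)                          # C-contiguous float64
    B = np.random.rand(k, n)
    C = np.zeros((m, n))
    dp = ctypes.POINTER(ctypes.c_double)

    # C = 1.0 * A @ B + 0.0 * C
    armpl.cblas_dgemm(
        ctypes.c_int(CblasRowMajor), ctypes.c_int(CblasNoTrans), ctypes.c_int(CblasNoTrans),
        ctypes.c_int(m), ctypes.c_int(n), ctypes.c_int(k),
        ctypes.c_double(1.0), A.ctypes.data_as(dp), ctypes.c_int(k),
        B.ctypes.data_as(dp), ctypes.c_int(n),
        ctypes.c_double(0.0), C.ctypes.data_as(dp), ctypes.c_int(n),
    )
    print(np.allclose(C, A @ B))                      # True if the call worked
    ```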
  • 4
    NVIDIA GPU-Optimized AMI
    The NVIDIA GPU-Optimized AMI is a virtual machine image for GPU-accelerated Machine Learning, Deep Learning, Data Science, and HPC workloads. Using this AMI, you can spin up a GPU-accelerated EC2 VM instance in minutes with a pre-installed Ubuntu OS, GPU driver, Docker, and the NVIDIA container toolkit. The AMI provides easy access to NVIDIA's NGC Catalog, a hub for GPU-optimized software, for pulling and running performance-tuned, tested, and NVIDIA-certified Docker containers. The NGC catalog provides free access to containerized AI, Data Science, and HPC applications, pre-trained models, AI SDKs, and other resources, enabling data scientists, developers, and researchers to focus on building and deploying solutions. The AMI itself is free, with an option to purchase enterprise support through NVIDIA AI Enterprise.
    Starting Price: $3.06 per hour
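
    For illustration, spinning up an instance from this AMI can be scripted with boto3; the AMI ID, key pair, and instance type below are placeholders, with the real AMI ID found in the AWS Marketplace listing for your region:

    ```python
    # Hedged sketch: launching an EC2 instance from the AMI with boto3.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder NVIDIA GPU-Optimized AMI ID
        InstanceType="g4dn.xlarge",        # any supported GPU instance type
        KeyName="my-keypair",              # placeholder SSH key pair
        MinCount=1,
        MaxCount=1,
    )
    print(resp["Instances"][0]["InstanceId"])
    ```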
  • 5
    Bright Cluster Manager
    NVIDIA Bright Cluster Manager offers fast deployment and end-to-end management for heterogeneous high-performance computing (HPC) and AI server clusters at the edge, in the data center, and in multi/hybrid-cloud environments. It automates provisioning and administration for clusters ranging in size from a couple of nodes to hundreds of thousands, supports CPU-based and NVIDIA GPU-accelerated systems, and enables orchestration with Kubernetes. Heterogeneous high-performance Linux clusters can be quickly built and managed with NVIDIA Bright Cluster Manager, supporting HPC, machine learning, and analytics applications that span from core to edge to cloud. NVIDIA Bright Cluster Manager is ideal for heterogeneous environments, supporting Arm® and x86-based CPU nodes, and is fully optimized for accelerated computing with NVIDIA GPUs and NVIDIA DGX™ systems.
  • 6
    NVIDIA NGC
    NVIDIA GPU Cloud (NGC) is a GPU-accelerated cloud platform optimized for deep learning and scientific computing. NGC manages a catalog of fully integrated and optimized deep learning framework containers that take full advantage of NVIDIA GPUs in both single-GPU and multi-GPU configurations. NVIDIA Train, Adapt, and Optimize (TAO) is an AI-model-adaptation platform that simplifies and accelerates the creation of enterprise AI applications and services. By fine-tuning pre-trained models with custom data through a UI-based, guided workflow, enterprises can produce highly accurate models in hours rather than months, eliminating the need for large training runs and deep AI expertise. Private Registries from NGC allow you to secure, manage, and deploy your own assets to accelerate your journey to AI.
  • 7
    Arm Forge
    Build reliable and optimized code for the right results on multiple Server and HPC architectures, from the latest compilers and C++ standards to Intel, 64-bit Arm, AMD, OpenPOWER, and NVIDIA GPU hardware. Arm Forge combines Arm DDT, the leading debugger for time-saving high-performance application debugging; Arm MAP, the trusted performance profiler for invaluable optimization advice across native and Python HPC codes; and Arm Performance Reports for advanced reporting capabilities. Arm DDT and Arm MAP are also available as standalone products. Arm Forge enables efficient application development for Linux Server and HPC, with full technical support from Arm experts. Arm DDT is the debugger of choice for developing C++, C, or Fortran parallel and threaded applications on CPUs and GPUs. Its powerful, intuitive graphical interface helps you easily detect memory bugs and divergent behavior at all scales, making Arm DDT the number one debugger in research, industry, and academia.
  • 8
    NVIDIA Magnum IO
    NVIDIA Magnum IO is the architecture for parallel, intelligent data center I/O. It maximizes storage, network, and multi-node, multi-GPU communications for the world’s most important applications, including large language models, recommender systems, imaging, simulation, and scientific research. Magnum IO utilizes storage I/O, network I/O, in-network compute, and I/O management to simplify and speed up data movement, access, and management for multi-GPU, multi-node systems. It supports NVIDIA CUDA-X libraries and makes the best use of a range of NVIDIA GPU and networking hardware topologies to achieve optimal throughput and low latency. In multi-GPU, multi-node systems, slow single-threaded CPU performance sits in the critical path of data access from local or remote storage devices. With storage I/O acceleration, the GPU bypasses the CPU and system memory and accesses remote storage via 8x 200 Gb/s NICs, achieving up to 1.6 TB/s of raw storage bandwidth.
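
    One developer-visible piece of that storage path is GPUDirect Storage; a hedged Python sketch using the RAPIDS KvikIO binding (file name and sizes are placeholders) reads from disk straight into GPU memory:

    ```python
    # Hedged sketch: reading a file into GPU memory with KvikIO, the RAPIDS
    # Python binding over cuFile/GPUDirect Storage. KvikIO falls back to a
    # POSIX path when GDS is unavailable.
    import cupy
    import kvikio

    buf = cupy.empty(1_000_000, dtype=cupy.float32)   # destination in device memory
    f = kvikio.CuFile("data.bin", "r")                # placeholder file name
    nbytes = f.read(buf)                              # DMA from storage to the GPU
    f.close()
    print(f"read {nbytes} bytes into device memory")
    ```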
  • 9
    NVIDIA Parabricks
    NVIDIA® Parabricks® is the only GPU-accelerated suite of genomic analysis applications that delivers fast and accurate analysis of genomes and exomes for sequencing centers, clinical teams, genomics researchers, and high-throughput sequencing instrument developers. NVIDIA Parabricks provides GPU-accelerated versions of tools used every day by computational biologists and bioinformaticians, enabling significantly faster runtimes, workflow scalability, and lower compute costs. From FastQ to Variant Call Format (VCF), NVIDIA Parabricks accelerates runtimes across a series of hardware configurations with NVIDIA A100 Tensor Core GPUs. Genomic researchers can experience acceleration across every step of their analysis workflows, from alignment to sorting to variant calling. When more GPUs are used, compute time scales near-linearly compared to CPU-only systems, allowing up to 107X acceleration.
  • 10
    NVIDIA Base Command Manager
    NVIDIA Base Command Manager offers fast deployment and end-to-end management for heterogeneous AI and high-performance computing clusters at the edge, in the data center, and in multi- and hybrid-cloud environments. It automates the provisioning and administration of clusters ranging in size from a couple of nodes to hundreds of thousands, supports NVIDIA GPU-accelerated and other systems, and enables orchestration with Kubernetes. The platform integrates with Kubernetes for workload orchestration and offers tools for infrastructure monitoring, workload management, and resource allocation. Base Command Manager is optimized for accelerated computing environments, making it suitable for diverse HPC and AI workloads. It is available with NVIDIA DGX systems and as part of the NVIDIA AI Enterprise software suite. High-performance Linux clusters can be quickly built and managed with NVIDIA Base Command Manager, supporting HPC, machine learning, and analytics applications.
  • 11
    NVIDIA Isaac
    NVIDIA Isaac is an AI robot development platform that comprises NVIDIA CUDA-accelerated libraries, application frameworks, and AI models to expedite the creation of AI robots, including autonomous mobile robots, robotic arms, and humanoids. The platform features NVIDIA Isaac ROS, a collection of CUDA-accelerated computing packages and AI models built on the open source ROS 2 framework, designed to streamline the development of advanced AI robotics applications. Isaac Manipulator, built on Isaac ROS, enables the development of AI-powered robotic arms that can seamlessly perceive, understand, and interact with their environments. Isaac Perceptor facilitates the rapid development of advanced AMRs capable of operating in unstructured environments like warehouses or factories. For humanoid robotics, NVIDIA Isaac GR00T serves as a research initiative and development platform for general-purpose robot foundation models and data pipelines.
  • 12
    oneAPI
    Intel oneAPI is an open, unified programming model designed to simplify development across CPUs, GPUs, and other accelerators. It provides developers with a highly productive software stack for AI, HPC, and accelerated computing workloads. oneAPI supports scalable hybrid parallelism, enabling performance portability across different hardware architectures. The platform includes optimized libraries, SYCL-based C++ extensions, and powerful developer tools for profiling, debugging, and optimization. Developers can build, optimize, and deploy applications with confidence across data centers, edge systems, and PCs. oneAPI is built on open standards to avoid vendor lock-in while maximizing performance. It empowers developers to write code once and run it efficiently everywhere.
  • 13
    NVIDIA DGX Cloud
    NVIDIA DGX Cloud offers a fully managed, end-to-end AI platform that leverages the power of NVIDIA’s advanced hardware and cloud computing services. This platform allows businesses and organizations to scale AI workloads seamlessly, providing tools for machine learning, deep learning, and high-performance computing (HPC). DGX Cloud integrates seamlessly with leading cloud providers, delivering the performance and flexibility required to handle the most demanding AI applications. This service is ideal for businesses looking to enhance their AI capabilities without the need to manage physical infrastructure.
  • 14
    NVIDIA Isaac Sim
    NVIDIA Isaac Sim is an open source reference robotics simulation application built on NVIDIA Omniverse, enabling developers to design, simulate, test, and train AI-driven robots in physically realistic virtual environments. It is built atop Universal Scene Description (OpenUSD), offering full extensibility so developers can create custom simulators or seamlessly integrate Isaac Sim's capabilities into existing validation pipelines. The platform supports three essential workflows: large-scale synthetic data generation for training foundation models with photorealistic rendering and automatic ground truth labeling; software-in-the-loop testing, which connects actual robot software with simulated hardware to validate control and perception systems; and robot learning through NVIDIA’s Isaac Lab, which accelerates training of behaviors in simulation before real-world deployment. Isaac Sim delivers GPU-accelerated physics (via NVIDIA PhysX) and RTX-enabled sensor simulation.
  • 15
    NVIDIA TensorRT
    NVIDIA TensorRT is an ecosystem of APIs for high-performance deep learning inference, encompassing an inference runtime and model optimizations that deliver low latency and high throughput for production applications. Built on the CUDA parallel programming model, TensorRT optimizes neural network models trained on all major frameworks, calibrating them for lower precision with high accuracy, and deploying them across hyperscale data centers, workstations, laptops, and edge devices. It employs techniques such as quantization, layer and tensor fusion, and kernel tuning on all types of NVIDIA GPUs, from edge devices to PCs to data centers. The ecosystem includes TensorRT-LLM, an open source library that accelerates and optimizes inference performance of recent large language models on the NVIDIA AI platform, enabling developers to experiment with new LLMs for high performance and quick customization through a simplified Python API.
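
    A hedged sketch of that simplified Python API, following TensorRT-LLM's published LLM-API pattern (the model name and sampling settings below are placeholder assumptions):

    ```python
    # Hedged sketch of TensorRT-LLM's high-level LLM API.
    from tensorrt_llm import LLM, SamplingParams

    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")   # HF checkpoint, compiled to an engine on first use
    params = SamplingParams(max_tokens=64, temperature=0.8)

    for output in llm.generate(["Explain GPU inference in one sentence."], params):
        print(output.outputs[0].text)
    ```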
  • 16
    NVIDIA DRIVE
    Software is what turns a vehicle into an intelligent machine. The NVIDIA DRIVE™ Software stack is open, empowering developers to efficiently build and deploy a variety of state-of-the-art AV applications, including perception, localization and mapping, planning and control, driver monitoring, and natural language processing. The foundation of the DRIVE Software stack, DRIVE OS is the first safe operating system for accelerated computing. It includes NvMedia for sensor input processing, NVIDIA CUDA® libraries for efficient parallel computing implementations, NVIDIA TensorRT™ for real-time AI inference, and other developer tools and modules to access hardware engines. The NVIDIA DriveWorks® SDK provides middleware functions on top of DRIVE OS that are fundamental to autonomous vehicle development. These consist of the sensor abstraction layer (SAL) and sensor plugins, data recorder, vehicle I/O support, and a deep neural network (DNN) framework.
  • 17
    NVIDIA Morpheus
    NVIDIA Morpheus is a GPU-accelerated, end-to-end AI framework that enables developers to create optimized applications for filtering, processing, and classifying large volumes of streaming cybersecurity data. Morpheus incorporates AI to reduce the time and cost associated with identifying, capturing, and acting on threats, bringing a new level of security to the data center, cloud, and edge. Morpheus also extends human analysts’ capabilities with generative AI by automating real-time analysis and responses, producing synthetic data to train AI models that identify risks accurately and run what-if scenarios. Morpheus is available as open-source software on GitHub for developers interested in using the latest pre-release features and who want to build from source. Get unlimited usage on all clouds, access to NVIDIA AI experts, and long-term support for production deployments with a purchase of NVIDIA AI Enterprise.
  • 18
    ccminer
    ccminer is an open-source project for CUDA-compatible (NVIDIA) GPUs, compatible with both Linux and Windows platforms. The project site is intended to share cryptocurrency mining tools you can trust; the available open-source binaries are compiled and signed by the maintainers. Most of these projects are open-source but may require technical ability to compile correctly.
  • 19
    Amazon EC2 P5 Instances
    Amazon Elastic Compute Cloud (Amazon EC2) P5 instances, powered by NVIDIA H100 Tensor Core GPUs, and P5e and P5en instances powered by NVIDIA H200 Tensor Core GPUs deliver the highest performance in Amazon EC2 for deep learning and high-performance computing applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce the cost to train ML models by up to 40%. These instances help you iterate on your solutions at a faster pace and get to market more quickly. You can use P5, P5e, and P5en instances for training and deploying increasingly complex large language models and diffusion models powering the most demanding generative artificial intelligence applications. These applications include question-answering, code generation, video and image generation, and speech recognition. You can also use these instances to deploy demanding HPC applications at scale for pharmaceutical discovery.
  • 20
    NVIDIA Clara
    Clara’s domain-specific tools, AI pre-trained models, and accelerated applications are enabling AI breakthroughs in numerous fields, including medical devices, imaging, drug discovery, and genomics. Explore the end-to-end pipeline of medical device development and deployment with the Holoscan platform. Build containerized AI apps with the Holoscan SDK and MONAI, and streamline deployment in next-generation AI devices with the NVIDIA IGX developer kits. The NVIDIA Holoscan SDK includes healthcare-specific acceleration libraries, pre-trained AI models, and reference applications for computational medical devices.
  • 21
    Amazon EC2 G4 Instances
    Amazon EC2 G4 instances are optimized for machine learning inference and graphics-intensive applications, offering a choice between NVIDIA T4 GPUs (G4dn) and AMD Radeon Pro V520 GPUs (G4ad). G4dn instances combine NVIDIA T4 GPUs with custom Intel Cascade Lake CPUs, providing a balance of compute, memory, and networking resources. These instances are ideal for deploying machine learning models, video transcoding, game streaming, and graphics rendering. G4ad instances, featuring AMD Radeon Pro V520 GPUs and 2nd-generation AMD EPYC processors, deliver cost-effective solutions for graphics workloads. Both G4dn and G4ad instances support Amazon Elastic Inference, allowing users to attach low-cost GPU-powered inference acceleration to Amazon EC2 and reduce deep learning inference costs. They are available in various sizes to accommodate different performance needs and are integrated with AWS services such as Amazon SageMaker, Amazon ECS, and Amazon EKS.
  • 22
    AWS Elastic Fabric Adapter (EFA)
    Elastic Fabric Adapter (EFA) is a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS. Its custom-built operating system (OS) bypass hardware interface enhances the performance of inter-instance communications, which is critical to scaling these applications. With EFA, High-Performance Computing (HPC) applications using the Message Passing Interface (MPI) and Machine Learning (ML) applications using NVIDIA Collective Communications Library (NCCL) can scale to thousands of CPUs or GPUs. As a result, you get the application performance of on-premises HPC clusters with the on-demand elasticity and flexibility of the AWS cloud. EFA is available as an optional EC2 networking feature that you can enable on any supported EC2 instance at no additional cost. Plus, it works with the most commonly used interfaces, APIs, and libraries for inter-node communications.
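
    The kind of traffic EFA accelerates is easy to sketch with mpi4py; the example below is generic MPI (nothing EFA-specific appears in application code, which is the point) and would be launched with your MPI distribution's launcher on EFA-enabled instances:

    ```python
    # Generic mpi4py sketch of the collectives EFA accelerates.
    # Launch with e.g. `mpirun -n 4 python app.py` (launcher details vary by MPI).
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    local = np.full(4, rank, dtype=np.float64)   # each rank's contribution
    total = np.empty_like(local)
    comm.Allreduce(local, total, op=MPI.SUM)     # the inter-node hot path

    if rank == 0:
        print(total)
    ```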
  • 23
    NVIDIA Quadro Virtual Workstation
    NVIDIA Quadro Virtual Workstation delivers Quadro-level computing power directly from the cloud, allowing businesses to combine the performance of a high-end workstation with the flexibility of cloud computing. As workloads grow more compute-intensive and the need for mobility and collaboration increases, cloud-based workstations, alongside traditional on-premises infrastructure, offer companies the agility required to stay competitive. The NVIDIA virtual machine image (VMI) comes with the latest GPU virtualization software pre-installed, including updated Quadro drivers and ISV certifications. The virtualization software runs on select NVIDIA GPUs based on Pascal or Turing architectures, enabling faster rendering and simulation from anywhere. Key benefits include enhanced performance with RTX technology support, certified ISV reliability, IT agility through fast deployment of GPU-accelerated virtual workstations, scalability to match business needs, and more.
  • 24
    Amazon EC2 P4 Instances
    Amazon EC2 P4d instances deliver high performance for machine learning training and high-performance computing applications in the cloud. Powered by NVIDIA A100 Tensor Core GPUs, they offer industry-leading throughput and low-latency networking, supporting 400 Gbps instance networking. P4d instances provide up to 60% lower cost to train ML models, with an average of 2.5x better performance for deep learning models compared to previous-generation P3 and P3dn instances. Deployed in hyperscale clusters called Amazon EC2 UltraClusters, P4d instances combine high-performance computing, networking, and storage, enabling users to scale from a few to thousands of NVIDIA A100 GPUs based on project needs. Researchers, data scientists, and developers can utilize P4d instances to train ML models for use cases such as natural language processing, object detection and classification, and recommendation engines, as well as to run HPC applications like pharmaceutical discovery and more.
    Starting Price: $11.57 per hour
  • 25
    NVIDIA RAPIDS
    The RAPIDS suite of software libraries, built on CUDA-X AI, gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces. RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms for end-to-end pipeline accelerations without paying typical serialization costs. RAPIDS also includes support for multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger dataset sizes. Accelerate your Python data science toolchain with minimal code changes and no new tools to learn. Increase machine learning model accuracy by iterating on models faster and deploying them more frequently.
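
    A minimal sketch of that DataFrame-to-ML handoff with cuDF and cuML; the CSV path and column names are placeholders:

    ```python
    # Minimal sketch of the cuDF-to-cuML handoff on the GPU.
    import cudf
    from cuml.cluster import KMeans

    df = cudf.read_csv("measurements.csv")       # parsed on the GPU
    features = df[["x", "y"]].astype("float32")

    km = KMeans(n_clusters=8)
    labels = km.fit_predict(features)            # stays on-device; no serialization hop
    print(labels[:8])
    ```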
  • 26
    NVIDIA Iray
    NVIDIA® Iray® is an intuitive physically based rendering technology that generates photorealistic imagery for interactive and batch rendering workflows. Leveraging AI denoising, CUDA®, NVIDIA OptiX™, and Material Definition Language (MDL), Iray delivers world-class performance and impeccable visuals—in record time—when paired with the newest NVIDIA RTX™-based hardware. The latest version of Iray adds support for RTX, which includes dedicated ray-tracing-acceleration hardware support (RT Cores) and an advanced acceleration structure to enable real-time ray tracing in your graphics applications. In the 2019 release of the Iray SDK, all render modes utilize NVIDIA RTX technology. In combination with AI denoising, this enables you to create photorealistic rendering in seconds instead of minutes. Using Tensor Cores on the newest NVIDIA hardware brings the power of deep learning to both final-frame and interactive photorealistic renderings.
  • 27
    QumulusAI
    QumulusAI delivers supercomputing without constraint, combining scalable HPC with grid-independent data centers to break bottlenecks and power the future of AI. QumulusAI is universalizing access to AI supercomputing, removing the constraints of legacy HPC and delivering the scalable, high-performance computing AI demands today and tomorrow. No virtualization overhead, no noisy neighbors, just dedicated, direct access to AI servers optimized with NVIDIA’s latest GPUs (H200) and Intel/AMD CPUs. QumulusAI offers HPC infrastructure configured around your specific workloads instead of legacy providers’ one-size-fits-all approach. We collaborate with you from design and deployment through ongoing optimization, adapting as your AI projects evolve, so you get exactly what you need at each step. We own the entire stack, which means better performance, greater control, and more predictable costs than with providers who coordinate with third-party vendors.
  • 28
    Fortran
    Fortran has been designed from the ground up for computationally intensive applications in science and engineering. Mature and battle-tested compilers and libraries allow you to write code that runs close to the metal, fast. Fortran is statically and strongly typed, which allows the compiler to catch many programming errors early on for you. This also allows the compiler to generate efficient binary code. Fortran is a relatively small language that is surprisingly easy to learn and use. Expressing most mathematical and arithmetic operations over large arrays is as simple as writing them as equations on a whiteboard. Fortran is a natively parallel programming language with intuitive array-like syntax to communicate data between CPUs. You can run almost the same code on a single CPU, on a shared-memory multicore system, or on a distributed-memory HPC or cloud-based system.
  • 29
    NVIDIA Virtual PC
    NVIDIA GRID® Virtual PC (GRID vPC) and Virtual Apps (GRID vApps) are virtualization solutions that deliver a user experience that’s nearly indistinguishable from a native PC. With server-side graphics and comprehensive management and monitoring capabilities, GRID future-proofs your VDI environment. Deliver the power of GPU acceleration to every VM (virtual machine) in your organization, creating an unparalleled user experience and leaving your IT team the time they need to work on business goals and strategy. Whether at home or in the office, the way people work is changing dynamically, and today’s applications demand exponentially more graphics power. Although tools like Microsoft Teams and Zoom help teams collaborate in real time regardless of location, modern workers require multiple monitors to run a range of apps simultaneously. GPU acceleration with NVIDIA vPC takes on the needs of this new digital world.
  • 30
    NemoClaw
    NemoClaw from NVIDIA is an AI development framework designed to help developers build and deploy intelligent AI agents and automation workflows. Built on NVIDIA’s NeMo ecosystem, the platform provides tools for creating advanced AI applications powered by large language models and GPU acceleration. NemoClaw allows developers to integrate AI agents that can interact with data, tools, and external services to perform complex tasks automatically. The framework supports scalable deployment on NVIDIA GPUs, enabling high-performance AI processing for demanding workloads. Developers can use NemoClaw to build applications such as conversational agents, workflow automation tools, and AI-powered assistants. The platform also includes capabilities for integrating custom tools and APIs, giving agents the ability to perform real-world actions. By combining NVIDIA’s AI infrastructure with agent-based development, NemoClaw helps organizations build powerful AI-driven systems efficiently.
  • 31
    NVIDIA Modulus
    NVIDIA Modulus is a neural network framework that blends the power of physics, in the form of governing partial differential equations (PDEs), with data to build high-fidelity, parameterized surrogate models with near-real-time latency. Whether you’re looking to get started with AI-driven physics problems or designing digital twin models for complex non-linear, multi-physics systems, NVIDIA Modulus can support your work. It offers building blocks for developing physics machine learning surrogate models that combine both physics and data. The framework is generalizable to different domains and use cases, from engineering simulations to life sciences and from forward simulations to inverse/data assimilation problems. It provides a parameterized system representation that solves for multiple scenarios in near real time, letting you train once offline and infer in real time repeatedly.
  • 32
    AI-Q NVIDIA Blueprint
    Create AI agents that reason, plan, reflect, and refine to produce high-quality reports based on source materials of your choice. An AI research agent, informed by many data sources, can synthesize hours of research in minutes. The AI-Q NVIDIA Blueprint enables developers to build AI agents that use reasoning and connect to many data sources and tools to distill in-depth source materials with efficiency and precision. Using AI-Q, agents summarize large data sets, generating tokens 5x faster and ingesting petabyte-scale data 15x faster with better semantic accuracy. Features include multimodal PDF data extraction and retrieval with NVIDIA NeMo Retriever, 15x faster ingestion of enterprise data, 3x lower retrieval latency, multilingual and cross-lingual retrieval, reranking to further improve accuracy, and GPU-accelerated index creation and search.
  • 33
    Fortran Package Manager
    Fortran Package Manager (fpm) is a package manager and build system for Fortran. There are already many packages available for use with fpm, providing an easily accessible and rich ecosystem of general-purpose and high-performance code. Its key goal is to improve the user experience of Fortran programmers by making it easier to build your Fortran program or library, run the executables, tests, and examples, and distribute it as a dependency to other Fortran projects. fpm’s user interface is modeled after Rust’s Cargo, and its long-term vision is to nurture and grow the ecosystem of modern Fortran applications and libraries. fpm has a plugin system that allows its functionality to be easily extended; the fpm-search project, for example, is a plugin that queries the package registry, and since it is itself built with fpm it can be easily installed.
  • 34
    NVIDIA Holoscan
    NVIDIA® Holoscan is a domain-agnostic AI computing platform that delivers the accelerated, full-stack infrastructure required for scalable, software-defined, and real-time processing of streaming data running at the edge or in the cloud. Holoscan supports a camera serial interface and front-end sensors for video capture, ultrasound research, data acquisition, and connection to legacy medical devices. Use the NVIDIA Holoscan SDK’s data transfer latency tool to measure complete, end-to-end latency for video processing applications. Access AI reference pipelines for radar, high-energy light sources, endoscopy, ultrasound, and other streaming video applications. NVIDIA Holoscan includes optimized libraries for network connectivity, data processing, and AI, as well as examples to create and run low-latency data-streaming applications using C++, Python, or Graph Composer.
  • 35
    FPT Cloud
    FPT Cloud is a next‑generation cloud computing and AI platform that streamlines innovation by offering a robust, modular ecosystem of over 80 services, from compute, storage, database, networking, and security to AI development, backup, disaster recovery, and data analytics, built to international standards. Its offerings include scalable virtual servers with auto‑scaling and 99.99% uptime; GPU‑accelerated infrastructure tailored for AI/ML workloads; FPT AI Factory, a comprehensive AI lifecycle suite powered by NVIDIA supercomputing (including infrastructure, model pre‑training, fine‑tuning, model serving, AI notebooks, and data hubs); high‑performance object and block storage with S3 compatibility and encryption; Kubernetes Engine for managed container orchestration with cross‑cloud portability; managed database services across SQL and NoSQL engines; multi‑layered security with next‑gen firewalls and WAFs; centralized monitoring and activity logging.
  • 36
    RocketWhisper
    RocketWhisper, from Mojosoft Co., Ltd., is a powerful desktop speech recognition and transcription application that runs 100% offline on your computer. Your voice data never leaves your machine, so privacy is complete. Powered by OpenAI's Whisper engine with NVIDIA GPU (CUDA) acceleration, RocketWhisper delivers fast and accurate speech-to-text conversion for professionals, content creators, and anyone who works with voice and text. Key features include 100% offline processing, the OpenAI Whisper engine for high-accuracy speech recognition, NVIDIA CUDA GPU acceleration (up to 10x faster than CPU), real-time voice-to-text input with a global hotkey (Push-to-Talk with Right Alt), batch transcription of multiple audio/video files (MP3, WAV, M4A, MP4, MKV, AVI, etc.), SRT/VTT subtitle export for video content, and AI text formatting with LLM integration (OpenAI, Anthropic, Google Gemini, Grok, local LLM).
    Starting Price: $32 one-time
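
    RocketWhisper itself is a packaged desktop app, but the open source Whisper engine it builds on can be sketched in a few lines of Python (assumes the openai-whisper package and ffmpeg; the audio file name is a placeholder):

    ```python
    # Sketch of the open source Whisper engine underlying the app.
    import whisper

    model = whisper.load_model("base")            # uses CUDA automatically if present
    result = model.transcribe("meeting.mp3")      # any ffmpeg-readable audio/video
    print(result["text"])
    ```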
  • 37
    NVIDIA Blueprints
    NVIDIA Blueprints are reference workflows for agentic and generative AI use cases. Enterprises can build and operationalize custom AI applications, creating data-driven AI flywheels, using Blueprints along with NVIDIA AI and Omniverse libraries, SDKs, and microservices. Blueprints also include partner microservices, reference code, customization documentation, and a Helm chart for deployment at scale. With NVIDIA Blueprints, developers benefit from a unified experience across the NVIDIA stack, from cloud and data centers to NVIDIA RTX AI PCs and workstations. Use NVIDIA Blueprints to create AI agents that use sophisticated reasoning and iterative planning to solve complex problems. Check out new NVIDIA Blueprints, which equip millions of enterprise developers with reference workflows for building and deploying generative AI applications. Connect AI applications to enterprise data using industry-leading embedding and reranking models for information retrieval at scale.
  • 38
    RightNow AI
    RightNow AI is an AI-powered platform designed to automatically profile CUDA kernels, detect bottlenecks, and optimize the kernels for peak performance. It supports all major NVIDIA architectures, including Ampere, Hopper, Ada Lovelace, and Blackwell GPUs. It enables users to generate optimized CUDA kernels instantly using natural language prompts, eliminating the need for deep GPU expertise. With serverless GPU profiling, users can identify performance issues without relying on local hardware. RightNow AI replaces complex legacy optimization tools with a streamlined solution, offering features such as inference-time scaling and performance benchmarking. Trusted by leading AI and HPC teams worldwide, including NVIDIA, Adobe, and Samsung, RightNow AI has demonstrated performance improvements ranging from 2x to 20x over standard implementations.
    Starting Price: $20 per month
  • 39
    NVIDIA Isaac Lab
    NVIDIA Isaac Lab is a GPU-accelerated, open source robot learning framework built on top of Isaac Sim, designed to unify and simplify robotics research workflows such as reinforcement learning, imitation learning, and motion planning. It leverages realistic sensor and physics simulation to support accurate training of embodied agents, providing ready-to-use environments spanning manipulators, quadrupeds, and humanoids, with support for 30+ benchmark tasks and integration with popular RL libraries like RL Games, Stable Baselines, RSL RL, and SKRL. Isaac Lab features a modular, configuration-driven design that enables developers to easily create, modify, and scale learning environments; it also supports collecting demonstrations via peripherals (gamepads, keyboards) and allows custom actuator models to facilitate sim-to-real transfer. The framework is built for both local and cloud deployment, accommodating flexible scaling of compute resources.
  • 40
    NVIDIA PhysicsNeMo
    NVIDIA PhysicsNeMo is an open source Python deep-learning framework for building, training, fine-tuning, and inferring physics-AI models that combine physics knowledge with data to accelerate simulations, create high-fidelity surrogate models, and enable near-real-time predictions across domains such as computational fluid dynamics, structural mechanics, electromagnetics, weather and climate, and digital twin applications. It provides scalable, GPU-accelerated tools and Python APIs built on PyTorch and released under the Apache 2.0 license, offering curated model architectures including physics-informed neural networks, neural operators, graph neural networks, and generative AI–based approaches so developers can harness physics-driven causality alongside observed data for engineering-grade modeling. PhysicsNeMo includes end-to-end training pipelines from geometry ingestion to differential equations, reference application recipes to jump-start workflows.
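
    PhysicsNeMo's own APIs cover far more than fits here, so the following is a generic physics-informed-network sketch in plain PyTorch (not the PhysicsNeMo API) showing the core idea: the training loss penalizes a PDE residual, here u''(x) = -sin(x) with u(0) = u(pi) = 0, alongside the boundary conditions. The exact solution is u(x) = sin(x).

    ```python
    # Generic PINN sketch in plain PyTorch (not the PhysicsNeMo API).
    import torch

    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for step in range(2000):
        x = torch.rand(256, 1) * torch.pi
        x.requires_grad_(True)
        u = net(x)
        du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
        d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
        pde = ((d2u + torch.sin(x)) ** 2).mean()                    # interior residual
        bc = net(torch.tensor([[0.0], [torch.pi]])).pow(2).mean()   # boundary residual
        loss = pde + bc
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(net(torch.tensor([[torch.pi / 2]])).item())  # should approach 1.0
    ```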
  • 41
    Unicorn Render
    Unicorn Render is a professional rendering software that enables users to produce stunning realistic pictures and achieve high-end rendering levels without any prior skills. It offers a user-friendly interface designed to provide everything needed to obtain amazing results with minimal controls. Available as a standalone application or as a plugin, Unicorn Render integrates advanced AI technology and professional visualization tools. The software supports GPU+CPU acceleration through deep learning photorealistic rendering technology and NVIDIA CUDA technology, allowing joint support for CUDA GPUs and multicore CPUs. It features real-time progressive physics illumination, a Metropolis Light Transport sampler (MLT), a caustic sampler, and native NVIDIA MDL material support. Unicorn Render's WYSIWYG editing mode ensures that 100% of editing can be done in final image quality, eliminating surprises in the production of the final image.
  • 42
    NVIDIA virtual GPU
    NVIDIA virtual GPU (vGPU) software enables powerful GPU performance for workloads ranging from graphics-rich virtual workstations to data science and AI, enabling IT to leverage the management and security benefits of virtualization as well as the performance of NVIDIA GPUs required for modern workloads. Installed on a physical GPU in a cloud or enterprise data center server, NVIDIA vGPU software creates virtual GPUs that can be shared across multiple virtual machines and accessed by any device, anywhere. It delivers performance virtually indistinguishable from a bare metal environment, leverages common data center management tools such as live migration, provisions GPU resources with fractional or multi-GPU virtual machine (VM) instances, and stays responsive to changing business requirements and remote teams.
  • 43
    IONOS Cloud GPU Servers
    IONOS GPU Servers provide an accelerated computing infrastructure designed to handle workloads that require significantly more processing power than traditional CPU-based systems. They integrate enterprise-grade NVIDIA GPUs such as the H100, H200, and L40S, as well as specialized AI accelerators like Intel Gaudi, enabling massive parallel processing for compute-intensive applications. GPU-accelerated instances extend cloud infrastructure with dedicated graphics processors so virtual machines can perform complex calculations and data-heavy operations much faster than conventional servers. The servers are particularly suitable for artificial intelligence, deep learning, and data science tasks that involve training models on large datasets or performing high-speed inference, and they also support big data analytics, scientific simulations, and visualization workloads such as 3D rendering or modeling that require high computational throughput.
    Starting Price: $3,990 per month
  • 44
    NVIDIA AI Data Platform
    NVIDIA's AI Data Platform is a comprehensive solution designed to accelerate enterprise storage and optimize AI workloads, facilitating the development of agentic AI applications. It integrates NVIDIA Blackwell GPUs, BlueField-3 DPUs, Spectrum-X networking, and NVIDIA AI Enterprise software to enhance performance and accuracy in AI workflows. The platform optimizes workload distribution across GPUs and nodes, leveraging intelligent routing, load balancing, and advanced caching to enable scalable, complex AI processes. This infrastructure supports the deployment and scaling of AI agents across hybrid data centers, transforming raw data into actionable insights in real time. With the platform, enterprises can process and extract insights from structured or unstructured data across all available sources: text, PDF, images, and video.
  • 45
    NVIDIA NeMo Megatron
    NVIDIA NeMo Megatron is an end-to-end framework for training and deploying LLMs with billions and trillions of parameters. NVIDIA NeMo Megatron, part of the NVIDIA AI platform, offers an easy, efficient, and cost-effective containerized framework to build and deploy LLMs. Designed for enterprise application development, it builds upon the most advanced technologies from NVIDIA research and provides an end-to-end workflow for automated distributed data processing, training large-scale customized GPT-3, T5, and multilingual T5 (mT5) models, and deploying models for inference at scale. Harnessing the power of LLMs is made easy through validated and converged recipes with predefined configurations for training and inference. Customizing models is simplified by the hyperparameter tool, which automatically searches for the best hyperparameter configurations and performance for training and inference on any given distributed GPU cluster configuration.
  • 46
    NVIDIA Base Command
    NVIDIA Base Command™ is a software service for enterprise-class AI training that enables businesses and their data scientists to accelerate AI development. Part of the NVIDIA DGX™ platform, Base Command Platform provides centralized, hybrid control of AI training projects. It works with NVIDIA DGX Cloud and NVIDIA DGX SuperPOD. Base Command Platform, in combination with NVIDIA-accelerated AI infrastructure, provides a cloud-hosted solution for AI development, so users can avoid the overhead and pitfalls of deploying and running a do-it-yourself platform. Base Command Platform efficiently configures and manages AI workloads, delivers integrated dataset management, and executes workloads on right-sized resources ranging from a single GPU to large-scale, multi-node clusters in the cloud or on-premises. Because NVIDIA’s own engineers and researchers rely on it every day, the platform receives continuous software enhancements.
  • 47
    NVIDIA Nemotron
    NVIDIA Nemotron is a family of open-source models developed by NVIDIA, designed to generate synthetic data for training large language models (LLMs) for commercial applications. The Nemotron-4 340B model, in particular, is a significant release by NVIDIA, offering developers a powerful tool to generate high-quality data and filter it based on various attributes using a reward model.
  • 48
    VMware Private AI Foundation
    VMware Private AI Foundation is a joint, on‑premises generative AI platform built on VMware Cloud Foundation (VCF) that enables enterprises to run retrieval‑augmented generation workflows, fine‑tune and customize large language models, and perform inference in their own data centers, addressing privacy, choice, cost, performance, and compliance requirements. It integrates the Private AI Package (including vector databases, deep learning VMs, data indexing and retrieval services, and AI agent‑builder tools) with NVIDIA AI Enterprise (comprising NVIDIA microservices like NIM, NVIDIA’s own LLMs, and third‑party/open source models from places like Hugging Face). It supports full GPU virtualization, monitoring, live migration, and efficient resource pooling on NVIDIA‑certified HGX servers with NVLink/NVSwitch acceleration. Deployable via GUI, CLI, and API, it offers unified management through self‑service provisioning, model store governance, and more.
  • 49
    NVIDIA NeMo Retriever
    NVIDIA NeMo Retriever is a collection of microservices for building multimodal extraction, reranking, and embedding pipelines with high accuracy and maximum data privacy. It delivers quick, context-aware responses for AI applications like advanced retrieval-augmented generation (RAG) and agentic AI workflows. As part of the NVIDIA NeMo platform and built with NVIDIA NIM, NeMo Retriever allows developers to flexibly leverage these microservices to connect AI applications to large enterprise datasets wherever they reside and fine-tune them to align with specific use cases. NeMo Retriever provides components for building data extraction and information retrieval pipelines. The pipeline extracts structured and unstructured data (e.g., text, charts, tables), converts it to text, and filters out duplicates. A NeMo Retriever embedding NIM converts the chunks into embeddings and stores them in a vector database, accelerated by NVIDIA cuVS, for enhanced performance and speed of indexing.
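
    A hedged sketch of querying an embedding NIM over its OpenAI-style REST route; the local URL, model name, and "input_type" field follow NVIDIA's hosted-API examples and are assumptions about your deployment:

    ```python
    # Hedged sketch: calling a NeMo Retriever embedding NIM's REST endpoint.
    import requests

    resp = requests.post(
        "http://localhost:8000/v1/embeddings",               # assumed local NIM URL
        json={
            "model": "nvidia/nv-embedqa-e5-v5",              # assumed deployed model
            "input": ["What is GPU-accelerated retrieval?"],
            "input_type": "query",                           # "query" vs. "passage"
        },
        timeout=30,
    )
    vector = resp.json()["data"][0]["embedding"]
    print(len(vector))
    ```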
  • 50
    Google Cloud GPUs
    Speed up compute jobs like machine learning and HPC with a wide selection of GPUs to match a range of performance and price points, plus flexible pricing and machine customizations to optimize your workload. Google Cloud offers high-performance GPUs for machine learning, scientific computing, and 3D visualization. NVIDIA K80, P100, P4, T4, V100, and A100 GPUs provide a range of compute options to cover each cost and performance need. Optimally balance the processor, memory, high-performance disk, and up to 8 GPUs per instance for your individual workload, all with per-second billing, so you pay only for what you use. Run GPU workloads on Google Cloud Platform, where you have access to industry-leading storage, networking, and data analytics technologies. Compute Engine provides GPUs that you can add to your virtual machine instances. Learn what you can do with GPUs and what types of GPU hardware are available.
    Starting Price: $0.160 per GPU