The challenge is to run Stable Diffusion 1.5, which includes a large transformer model with almost 1 billion parameters, on a Raspberry Pi Zero 2, which is a microcomputer with 512MB of RAM, without adding more swap space and without offloading intermediate results to disk. For comparison, the recommended minimum RAM/VRAM for Stable Diffusion 1.5 is typically 8GB.

Major machine learning frameworks and libraries are generally focused on minimizing inference latency and/or maximizing throughput, usually at the cost of RAM usage. So I decided to write a super small and hackable inference library specifically focused on minimizing memory consumption: OnnxStream.

OnnxStream is based on the idea of decoupling the inference engine from the component responsible for providing the model weights, which is a class derived from WeightsProvider. A WeightsProvider specialization can implement any type of loading, caching, and prefetching of the model parameters.
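To make the decoupling idea concrete, here is a minimal C++ sketch of a provider interface plus a disk-streaming specialization. The names and signatures below (`get_weights`, `DiskStreamingProvider`, the per-tensor `.bin` file layout) are simplified assumptions for illustration only, not OnnxStream's actual API:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// NOTE: illustrative sketch only. OnnxStream's real WeightsProvider
// interface differs; the names and signatures here are assumptions.
class WeightsProvider {
public:
    virtual ~WeightsProvider() = default;

    // Called by the inference engine whenever it needs the weights of a
    // single tensor: the engine never holds the full parameter set in RAM.
    virtual std::vector<float> get_weights(const std::string& tensor_name,
                                           std::size_t element_count) = 0;
};

// One possible specialization: stream each tensor from disk on demand,
// so only the weights of the operation currently being executed are
// resident in memory.
class DiskStreamingProvider : public WeightsProvider {
public:
    explicit DiskStreamingProvider(std::string dir) : m_dir(std::move(dir)) {}

    std::vector<float> get_weights(const std::string& tensor_name,
                                   std::size_t element_count) override {
        std::vector<float> data(element_count);
        const std::string path = m_dir + "/" + tensor_name + ".bin";
        if (std::FILE* f = std::fopen(path.c_str(), "rb")) {
            std::fread(data.data(), sizeof(float), element_count, f);
            std::fclose(f);
        }
        return data; // can be freed by the engine as soon as the op completes
    }

private:
    std::string m_dir;
};
```

Because the engine only ever pulls one tensor's weights at a time through this interface, peak memory stays bounded by the working set of the current operation rather than by the size of the whole model.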
Features
- OnnxStream can consume as little as 1/55th of the memory required by OnnxRuntime, with only a 50% to 200% increase in latency
- Documentation available
- Inference engine decoupled from the component responsible for providing the model weights (a class derived from WeightsProvider); see the prefetching sketch after this list
- Examples available
- The OnnxStream Stable Diffusion example implementation now supports SDXL 1.0
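Building on the sketch above, a WeightsProvider specialization can also prefetch. The class below is a hypothetical example, assuming the engine requests tensors in a known, repeatable order (so the provider can load tensor i+1 on a background thread while the engine computes with tensor i); it reuses the `WeightsProvider` and `DiskStreamingProvider` classes from the earlier sketch and, again, does not reproduce OnnxStream's real classes:

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical prefetching specialization (builds on the sketch above).
class PrefetchingProvider : public WeightsProvider {
public:
    PrefetchingProvider(std::string dir, std::vector<std::string> order)
        : m_disk(std::move(dir)), m_order(std::move(order)) {}

    std::vector<float> get_weights(const std::string& tensor_name,
                                   std::size_t element_count) override {
        std::vector<float> result;
        if (m_next.valid() && tensor_name == m_prefetched_name)
            result = m_next.get(); // already loaded in the background
        else
            result = m_disk.get_weights(tensor_name, element_count);

        // Kick off the load of the next tensor, overlapping disk I/O with
        // the computation the engine is about to perform.
        ++m_index;
        if (m_index < m_order.size()) {
            m_prefetched_name = m_order[m_index];
            const std::string next_name = m_prefetched_name;
            // In a real implementation the element count would come from the
            // model metadata; it is reused here only to keep the sketch short.
            m_next = std::async(std::launch::async,
                                [this, next_name, element_count] {
                return m_disk.get_weights(next_name, element_count);
            });
        }
        return result;
    }

private:
    DiskStreamingProvider m_disk;
    std::vector<std::string> m_order;
    std::size_t m_index = 0;
    std::string m_prefetched_name;
    std::future<std::vector<float>> m_next;
};
```

The point of the design is that strategies like this live entirely in the provider: the inference engine is unchanged whether weights come from RAM, from disk on demand, or from a background prefetch thread.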