Connect home devices into a powerful cluster to accelerate LLM
FAIR Sequence Modeling Toolkit 2
The official SuiteSparse library: a suite of sparse matrix algorithms
UCCL is an efficient communication library for GPUs
MyDumper project
TT-NN operator library, and TT-Metalium low-level kernel programming
OneFlow is a deep learning framework designed to be user-friendly
A Ruby/Rack web server built for concurrency
GPU DataFrame Library
Solving the Satoshi Puzzle
Binary Modular DataFlow Machine (BMDFM)
GPU Raytracer from scratch in C++/CUDA
A flexible and efficient library for deep learning
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning
Dataflow Run Time
Mirror of the Glasgow Haskell Compiler
A Torch implementation of the object detection network
Performance and Productivity at Scale
Pattern-based multi/many-core parallel programming framework
An Embedded C++ Domain-Specific Language
Persistent shared object memory and parallelism for Node.js and Python
Parallel pairwise correlation computation on Intel Xeon Phi clusters