Open Source Windows Large Language Models (LLM) - Page 2

  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 1
    LLaMA 3

    LLaMA 3

    The official Meta Llama 3 GitHub site

    This repository is the former home for Llama 3 model artifacts and getting-started code, covering pre-trained and instruction-tuned variants across multiple parameter sizes. It introduced the public packaging of weights, licenses, and quickstart examples that helped developers fine-tune or run the models locally and on common serving stacks. As the Llama stack evolved, Meta consolidated repositories and marked this one deprecated, pointing users to newer, centralized hubs for models, utilities, and docs. Even as a deprecated repo, it documents the transition path and preserves references that clarify how Llama 3 releases map into the current ecosystem. Practically, it functioned as a bridge between Llama 2 and later Llama releases by standardizing distribution and starter code for inference and fine-tuning. Teams still treat it as historical reference material for version lineage and migration notes.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    AxonHub

    AxonHub

    Use any SDK to call 100+ LLMs

    AxonHub is an open-source AI gateway platform designed to simplify the process of integrating and switching between different large language model providers. The system acts as a compatibility layer that allows developers to use the same SDK interface while routing requests to various AI services behind the scenes. Instead of rewriting code when switching providers such as OpenAI or Anthropic, developers can simply change configuration settings within the gateway. AxonHub translates requests from one provider’s API format into another, enabling seamless interoperability across different AI platforms. The system also provides infrastructure features such as request routing, failover mechanisms, load balancing, and cost management for AI applications. This architecture makes it easier to experiment with multiple models and manage production deployments that rely on several providers simultaneously.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 3
    MiniMax-M2.1

    MiniMax-M2.1

    MiniMax M2.1, a SOTA model for real-world dev & agents.

    MiniMax-M2.1 is an open-source, state-of-the-art agentic language model released to democratize high-performance AI capabilities. It goes beyond a simple parameter upgrade, delivering major gains in coding, tool use, instruction following, and long-horizon planning. The model is designed to be transparent, controllable, and accessible, enabling developers to build autonomous systems without relying on closed platforms. MiniMax-M2.1 excels in real-world software engineering tasks, including multilingual development and complex workflow automation. It demonstrates strong generalization across agent frameworks and consistently improves upon its predecessor, MiniMax-M2. Benchmarks show that it rivals or approaches top proprietary models while remaining fully open for local deployment and customization.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    Neuron AI

    Neuron AI

    The PHP Agentic Framework to build production-ready AI driven apps

    Neuron AI is a PHP agentic framework for building production-ready AI applications that connect models, memory, vector databases, and tools into working agents. It is designed for developers who want to create systems such as RAG pipelines, multi-agent workflows, and business process automations without having to hand-build every integration from scratch. The framework provides an Agent class that can be extended to inherit core capabilities like memory, tools, function calling, and retrieval-augmented generation. Its design is modular, so developers can swap model providers with minimal changes to their application code, which makes it practical for teams that need flexibility across vendors. The project also supports structured output, monitoring, MCP connectivity, and workflow patterns that include human-in-the-loop intervention when automated flows need review or correction.
    Downloads: 6 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    TaxHacker

    TaxHacker

    Self-hosted AI accounting app. LLM analyzer for receipts

    TaxHacker is an open-source, self-hosted accounting application that uses artificial intelligence to automate financial record management for freelancers, independent developers, and small businesses. The system is designed to simplify bookkeeping by automatically processing financial documents such as receipts, invoices, and transaction records. It integrates large language models to analyze these documents, extract relevant financial information, and categorize expenses or income based on configurable rules. Users can deploy the application on their own infrastructure, ensuring that financial data remains private and under their control rather than being processed by external services. The software provides tools for tracking income streams, monitoring expenses, and organizing financial records in a structured format. Because the system supports customizable prompts and categories, users can adapt the AI analysis to match their accounting workflows or tax requirements.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    node-llama-cpp

    node-llama-cpp

    Run AI models locally on your machine with node.js bindings for llama

    node-llama-cpp is a JavaScript and Node.js binding that allows developers to run large language models locally using the high-performance inference engine provided by llama.cpp. The library enables applications built with Node.js to interact directly with local LLM models without requiring a remote API or external service. By using native bindings and optimized model execution, the framework allows developers to integrate advanced language model capabilities into desktop applications, server software, and command-line tools. The system automatically detects the available hardware on a machine and selects the most appropriate compute backend, including CPU or GPU acceleration. Developers can use the library to perform tasks such as text generation, conversational chat, embedding generation, and structured output generation. Because it runs models locally, the platform is particularly useful for privacy-sensitive environments or offline AI deployments.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    GPT Neo

    GPT Neo

    An implementation of model parallel GPT-2 and GPT-3-style models

    An implementation of model & data parallel GPT3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference is officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB, while neo can technically run a training step at 200B+ parameters, it is very inefficient at those scales. This, as well as the fact that many GPUs became available to us, among other things, prompted us to move development over to GPT-NeoX. All evaluations were done using our evaluation harness. Some results for GPT-2 and GPT-3 are inconsistent with the values reported in the respective papers. We are currently looking into why, and would greatly appreciate feedback and further testing of our eval harness.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    LLamaSharp

    LLamaSharp

    C#/.NET binding of llama.cpp, including LLaMa/GPT model inference

    The C#/.NET binding of llama.cpp. It provides APIs to infer the LLaMa Models and deploy it on the local environment. It works on both Windows, Linux and MAC without the requirement for compiling llama.cpp yourself. Its performance is close to llama.cpp. Furthermore, it provides integrations with other projects such as BotSharp to provide higher-level applications and UI.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    LangChain.js

    LangChain.js

    Building applications with LLMs through composability

    Building applications with LLMs through composability. Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications. This is built to integrate as seamlessly as possible with the LangChain Python package. Specifically, this means all objects (prompts, LLMs, chains, etc) are designed in a way where they can be serialized and shared between languages.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8 Monitoring Tools in One APM. Install in 5 Minutes. Icon
    8 Monitoring Tools in One APM. Install in 5 Minutes.

    Errors, performance, logs, uptime, hosts, anomalies, dashboards, and check-ins. One interface.

    AppSignal works out of the box for Ruby, Elixir, Node.js, Python, and more. 30-day free trial, no credit card required.
    Start Free
  • 10
    Lemon AI

    Lemon AI

    Full-stack Open-source Self-Evolving General AI Agent

    LemonAI is an open-source full-stack framework for building autonomous AI agents capable of performing complex tasks such as research, programming, data analysis, and document processing. The platform is designed to run primarily on local infrastructure, providing a privacy-focused alternative to cloud-dependent agent platforms. It integrates with local large language models through tools such as Ollama, vLLM, and other model runtimes while also allowing optional connections to external cloud models. The system includes a multi-agent architecture that supports planning, action execution, reflection, and memory, allowing the agent to reason through tasks and refine results iteratively. A key component of the framework is a virtual machine sandbox environment that safely executes code generated by the agent without affecting the host system.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 11
    MetaGPT

    MetaGPT

    The Multi-Agent Framework

    The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo. Assign different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories / competitive analysis/requirements/data structures / APIs / documents, etc. Internally, MetaGPT includes product managers/architects/project managers/engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    Superset LLM

    Superset LLM

    Run an army of Claude Code, Codex, etc. on your machine

    Superset is a development environment and terminal-based platform designed to orchestrate multiple AI coding agents simultaneously within a single workspace. The tool enables developers to run many autonomous coding agents in parallel without the typical overhead of manually managing multiple terminals, repositories, or branches. Each agent task is isolated in its own Git worktree, ensuring that code changes from different agents do not interfere with each other while allowing developers to track their progress independently. The platform includes built-in monitoring capabilities so users can observe the activity of each agent, receive notifications when tasks are completed, and quickly review changes produced by automated coding workflows. Superset also integrates tools for reviewing code differences, editing generated outputs, and managing the development environment directly from the interface.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    python-whatsapp-bot

    python-whatsapp-bot

    Build AI WhatsApp Bots with Pure Python

    python-whatsapp-bot is an open-source framework that demonstrates how to build AI-powered WhatsApp bots using pure Python and the official WhatsApp Cloud API. The project provides a practical implementation of a messaging automation system using the Flask web framework to handle webhook events and process incoming messages in real time. Developers can configure the bot to receive user messages through the WhatsApp API, route them through application logic, and generate automated responses powered by AI services such as large language models. The repository includes example scripts and project structures that illustrate how to integrate OpenAI or similar AI models into the bot workflow, enabling conversational agents capable of answering questions or performing automated tasks.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 14
    rtk

    rtk

    CLI proxy that reduces LLM token consumption

    rtk is an open-source command-line proxy designed to optimize interactions between AI coding agents and the terminal by reducing unnecessary token consumption. When AI assistants execute shell commands during software development tasks, the resulting terminal output often contains large amounts of repetitive or irrelevant information that can overwhelm the model’s context window. RTK intercepts these command outputs and compresses them into concise summaries before sending them to the language model. This process helps maintain important information while removing redundant data such as boilerplate logs, long directory listings, or repetitive test outputs. By minimizing the amount of noise sent to the AI model, the tool improves reasoning quality and allows longer development sessions within the same context window. The system is implemented as a lightweight Rust binary that runs locally and integrates easily with common AI coding environments.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    Autonomous Agents

    Autonomous Agents

    Autonomous Agents (LLMs) research papers. Updated Daily

    Autonomous-Agents is a research-focused repository that collects implementations, experiments, and academic resources related to autonomous multi-agent systems and intelligent robotics. The project explores how multiple agents can cooperate and interact with complex environments through machine learning, imitation learning, and multimodal sensing. It includes frameworks that integrate visual perception, tactile sensing, and spatial reasoning to guide the actions of robotic agents during manipulation or collaborative tasks. One of the central concepts explored in the repository is the integration of different sensory modalities using advanced machine learning techniques such as Feature-wise Linear Modulation and graph-based attention mechanisms. These methods allow agents to combine visual and geometric information while maintaining awareness of the spatial relationships between agents and objects.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    GraphRAG

    GraphRAG

    A modular graph-based Retrieval-Augmented Generation (RAG) system

    The GraphRAG project is a data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 17
    LLM Datasets

    LLM Datasets

    Curated list of datasets and tools for post-training

    LLM Datasets curates and standardizes datasets commonly used to train and fine-tune large language models, reducing the overhead of hunting down sources and normalizing formats. The repository aims to make datasets easy to inspect and transform, with scripts for downloading, deduping, cleaning, and converting to formats like JSONL that slot into training pipelines. It highlights instruction-tuning and conversation-style corpora while also pointing to code, math, or domain-specific sets for targeted capabilities. Quality is a recurring theme: examples and utilities help filter low-value samples, enforce length limits, and split train/validation consistently so results are comparable. Licensing and provenance are surfaced to encourage compliant usage and to guide dataset selection in commercial settings. For practitioners, the repo is a practical “starting pantry” that accelerates experimentation and helps keep data wrangling from dominating the project timeline.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 18
    OpenOutreach

    OpenOutreach

    Linkedin Automation Tool

    OpenOutreach is a self-hosted, open-source LinkedIn automation platform built for B2B lead generation and outbound prospecting. Instead of requiring a prebuilt contact list, it starts from a product description and target market definition, then uses AI to discover and prioritize likely leads on LinkedIn. The system generates search queries, evaluates candidate profiles, and learns over time which contacts best match the ideal customer profile. According to the repository, it combines large language model classification with a Bayesian machine learning layer based on profile embeddings, which helps it shift from broad exploration to more confident qualification as it gathers more decisions. It is designed to automate personalized outreach as well, including connection requests and follow-up messaging, while keeping deployment under the user’s control through a local or self-hosted setup.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    RWKV Runner

    RWKV Runner

    A RWKV management and startup tool, full automation, only 8MB

    RWKV (pronounced as RwaKuv) is an RNN with GPT-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, fast training, saves VRAM, "infinite" ctxlen, and free text embedding. Moreover it's 100% attention-free. Default configs has enabled custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter possible compatibility issues, go to the Configs page and turn off Use Custom CUDA kernel to Accelerate.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    oh my PI

    oh my PI

    AI Coding agent for the terminal

    Oh-My-Pi is an open-source AI agent toolkit focused on creating intelligent coding assistants that operate directly from the terminal environment. The project provides a command-line coding agent capable of analyzing repositories, generating commits, editing code, and interacting with development tools through an integrated tool system. Instead of functioning as a simple prompt-based assistant, the system includes an agent architecture that can inspect Git repositories, analyze changes, and perform development actions with fine-grained control. The platform also supports tool-based workflows where the agent can run shell commands, read files, modify code, and stage changes during development tasks. It includes infrastructure for integrating different AI providers and models through a unified API layer, allowing developers to switch between models while keeping the same agent interface.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    Cake

    Cake

    Distributed LLM and StableDiffusion inference

    Cake is a compact, powerful toolkit that combines a flexible TCP/UDP proxy, port forwarding system, and connection manager designed for both development and penetration testing scenarios. It enables users to create complex networking flows where traffic can be proxied, relayed, and manipulated between endpoints — useful for debugging networked applications, inspecting protocols, or tunneling traffic through different hops. The tool is designed to work with multiple protocols and supports dynamic rule definitions so that incoming and outgoing connections can be routed, rewritten, or logged according to user-defined policies. Unlike many simple proxies, Cake can act as a full connection broker: it can bind to arbitrary interfaces, handle simultaneous upstream/downstream sessions, and apply traffic rules on the fly. This makes it suitable for troubleshooting tricky network behavior, simulating network conditions, or chaining services in a modular test environment.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    DocStrange

    DocStrange

    Extract and convert data from any document, images, pdfs, word doc

    DocStrange is an open-source document understanding and extraction library designed to convert complex files into structured, LLM-ready outputs such as Markdown, JSON, CSV, and HTML. Developed by Nanonets, the project combines OCR, layout detection, table understanding, and structured extraction into one end-to-end pipeline, which reduces the need to stitch together multiple separate services. It is built for developers who need high-quality parsing from scans, photos, PDFs, office files, and other document sources while preserving privacy and control over the processing flow. One of its key differentiators is deployment flexibility: it offers a cloud API for managed usage as well as a fully private offline mode that runs locally on a GPU. The platform also supports synchronous extraction, streaming responses, and asynchronous processing for larger documents, which makes it adaptable to both interactive workflows and heavier back-end pipelines.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    Emscripten

    Emscripten

    Emscripten: An LLVM-to-WebAssembly Compiler

    Emscripten is a complete open-source compiler toolchain that transforms C, C++, and other LLVM-based source code into WebAssembly (and JavaScript), enabling native‑like applications to run in web browsers, Node.js, and other Wasm environments. While Emscripten mostly focuses on compiling C and C++ using Clang, it can be integrated with other LLVM-using compilers (for example, Rust has Emscripten integration, with the wasm32-unknown-emscripten and asmjs-unknown-emscripten targets). Emscripten provides Web support for popular portable APIs such as OpenGL and SDL2, allowing complex graphical native applications to be ported, such as the Unity game engine and Google Earth. It can probably port your codebase, too.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    Grounded Docs

    Grounded Docs

    Open-Source Alternative to Context7, Nia, and Ref.Tools

    Grounded Docs is an open-source implementation of a Model Context Protocol server designed to expose documentation and structured information as tools that AI agents can query. The project allows language models and agent frameworks to retrieve and interact with documentation through standardized MCP interfaces. By acting as an intermediary layer between documentation sources and AI tools, the server enables models to access structured documentation in a consistent and machine-readable format. This makes it easier for AI systems to answer technical questions, generate code examples, or retrieve reference material without requiring developers to manually integrate documentation into prompts. The architecture follows the MCP specification, which allows AI assistants and agent frameworks to connect to external tools through standardized protocols.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    Hollama

    Hollama

    A minimal LLM chat app that runs entirely in your browser

    Hollama is a lightweight open-source chat application designed to run entirely within the browser while interacting with large language model servers. The project provides a minimal but powerful user interface for communicating with local or remote LLMs, including servers powered by Ollama or OpenAI-compatible APIs. Because the application runs as a static web interface, it does not require complex backend infrastructure and can be easily deployed or self-hosted. Hollama supports both text-based and multimodal interactions, allowing users to work with models that process images as well as text. The interface includes features for editing prompts, retrying responses, copying generated code snippets, and storing conversation history locally within the browser. Mathematical expressions can be rendered using KaTeX, and Markdown formatting allows code blocks and structured outputs to appear clearly within conversations.
    Downloads: 3 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB