Search Results for "availability and fault tolerance"

Showing 108 open source projects for "availability and fault tolerance"

  • 1
    Erlang/OTP

    Build massively scalable soft real-time systems

    Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability. Some of its uses are in telecoms, banking, e-commerce, computer telephony and instant messaging. Erlang's runtime system has built-in support for concurrency, distribution and fault tolerance. OTP is a set of Erlang libraries and design principles providing middleware to develop these systems. It includes its own distributed database, applications to interface with other languages, and debugging and release handling tools. ...
    Downloads: 17 This Week
  • 2
    Bitalosdb

    Bitalosdb is a high-performance KV storage engine

    BitalosDB is a distributed, high-performance key-value database designed for cloud-native applications. It is optimized for scalability, supporting large workloads while maintaining low latency and high availability.
    Downloads: 0 This Week
  • 3
    Infinispan

    Infinispan is an open source data grid platform

    Infinispan is a distributed in-memory data grid and caching system designed for high-performance computing. It allows applications to scale dynamically by distributing data across multiple nodes, reducing latency and improving resilience.
    Downloads: 3 This Week
  • 4
    SOFAJRaft

    A production-grade Java implementation of the RAFT consensus algorithm

    SOFAJRaft is a production-level, high-performance Java implementation of the RAFT consensus algorithm that supports MULTI-RAFT-GROUP for high-load, low-latency scenarios. SOFAJRaft handles all RAFT-related technical challenges so that you can focus on your business area. It is user-friendly and provides several examples that make it easy to understand and use.
    Downloads: 0 This Week
  • 5
    dqlite

    Embeddable, replicated and fault tolerant SQL engine

    Dqlite is a fast, embedded, persistent SQL database with Raft consensus that is well suited to fault-tolerant IoT and Edge devices. Dqlite (distributed SQLite) extends SQLite across a cluster of machines, with automatic failover and high availability to keep your application running. It uses C-Raft, an optimised Raft implementation in C, to gain high-performance transactional consensus and fault tolerance while preserving SQLite’s efficiency and tiny footprint. ...
    Downloads: 0 This Week
  • 6
    rqlite

    The lightweight, distributed relational database built on SQLite

    rqlite is an easy-to-use, lightweight, distributed relational database that uses SQLite as its storage engine. rqlite is simple to deploy and straightforward to operate, and its clustering capabilities provide fault tolerance and high availability. rqlite is available for Linux, macOS, and Microsoft Windows. It gives you the functionality of a fault-tolerant, replicated relational database with very easy installation, deployment, and operation. ...
    Downloads: 1 This Week
  • 7
    Jepsen

    A framework for distributed systems verification, with fault injection

    Jepsen is a framework for testing the correctness of distributed systems, especially under network partitions, concurrency, and failures. It is widely used to verify the consistency and availability guarantees of databases, consensus systems, and other distributed architectures. Jepsen simulates real-world failure conditions and analyzes the system’s behavior using linearizability and other formal models.
    Downloads: 1 This Week
  • 8
    Olric

    Distributed, in-memory key/value store and cache

    A lightweight, distributed in-memory data store designed for key-value caching and ephemeral storage.
    Downloads: 0 This Week
  • 9
    ThingsBoard

    Device management, data collection, processing and visualization

    ...It enables device connectivity via industry standard IoT protocols, MQTT, CoAP and HTTP and supports both cloud and on-premises deployments. ThingsBoard combines scalability, fault-tolerance and performance so you will never lose your data. Provision, monitor and control your IoT entities in a secure way using rich server-side APIs. Define relations between your devices, assets, customers or any other entities. Collect and store telemetry data in a scalable and fault-tolerant way. Visualize your data with built-in or custom widgets and flexible dashboards. ...
    Downloads: 8 This Week
  • 10
    Liftbridge

    Lightweight, fault-tolerant message streams

    Lightweight, fault-tolerant message streams. Extend NATS with a Kafka-like durable pub/sub log API. Use Liftbridge as a simpler and lighter alternative to systems like Kafka and Pulsar or to add streaming semantics to an existing NATS deployment. Stream replication provides high availability and durability of messages. Clustering and partitioning provide horizontal scalability for streams and their consumers.
    Downloads: 0 This Week
  • 11
    Apache RocketMQ

    Distributed messaging and streaming platform with low latency

    Apache RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity, and flexible scalability. Messaging patterns include publish/subscribe, request/reply, and streaming. It offers financial-grade transactional messages. Built-in fault tolerance and high availability configuration options are based on DLedger. Clients are available in a variety of languages, such as Java, C/C++, Python, and Go. Transport protocols such as TCP, SSL, and AIO are pluggable. Message tracing is built in, and OpenTracing is also supported. Versatile big-data and streaming ecosystem integration. Message retroactivity by time or offset. ...
    Downloads: 1 This Week
  • 12
    ClickHouse

    A fast open-source OLAP database management system

    ...It is able to process hundreds of millions to more than a billion rows and tens of gigabytes of data per single server per second. Apart from its blazing speed, ClickHouse is highly reliable and fault tolerant. It supports multi-master asynchronous replication with the option of being deployed across multiple datacenters. With all nodes equal, there is no single point of failure. Downtime of a single node or even the whole datacenter won't affect the system's availability. ClickHouse also has exceptional hardware efficiency and a host of other features, including a feature-rich SQL database, vectorized query execution, real-time query processing and data ingestion, and more.
    Downloads: 8 This Week
  • 13
    Polly

    A .NET resilience and transient-fault-handling library for policies

    ...By providing resilience strategies in fluent-to-express policies such as Retry, WaitAndRetry, and CircuitBreaker, Polly can help you reduce fragility and keep your systems and customers connected. Example uses include fault tolerance for distributed systems and inter-process calls, such as WCF, RESTful calls between microservices, calls to cloud services, and Internet of Things connectivity.
    Downloads: 0 This Week
  • 14
    OceanBase

    OceanBase is an enterprise distributed relational database

    ...It is developed entirely by Ant Group. OceanBase Database is built on a common server cluster. Based on the Paxos protocol and its distributed structure, OceanBase Database provides high availability and linear scalability. OceanBase Database is not dependent on specific hardware architectures. It recovers automatically from the failure of a single server. OceanBase Database supports cross-city disaster tolerance for multiple IDCs with zero data loss. OceanBase Database meets the financial industry Level 6 disaster recovery standard (RPO=0, RTO<=30 seconds). ...
    Downloads: 0 This Week
  • 15
    Solace Agent Mesh

    An event-driven framework designed to build multi-agent AI systems

    ...The framework uses an asynchronous messaging architecture powered by an event broker, enabling agents to communicate reliably without tight coupling, which significantly improves scalability and fault tolerance. It introduces a standardized agent-to-agent communication protocol that allows different agents, regardless of their implementation or location, to exchange tasks, share data, and coordinate workflows efficiently. Solace Agent Mesh also includes orchestration mechanisms that dynamically break down user requests into smaller tasks and assign them to the most appropriate agents in real time.
    Downloads: 2 This Week
  • 16
    Arroyo

    Distributed stream processing engine in Rust

    Arroyo is a distributed stream processing engine written in Rust, designed to efficiently perform stateful computations on streams of data. Unlike traditional batch processing, streaming engines can operate on both bounded and unbounded sources, emitting results as soon as they are available.
    Downloads: 0 This Week
  • 17
    CoreNet

    CoreNet: A library for training deep neural networks

    ...CoreNet integrates tightly with Apple’s proprietary ML stack and hardware, serving as the foundation for research in computer vision, language models, and multimodal systems within Apple AI. The framework includes monitoring tools, fault tolerance mechanisms, and efficient checkpointing for massive training runs.
    Downloads: 0 This Week
  • 18
    StreamString

    Simple storage of data and structures in streams with fault tolerance

    Simple storage of data and data structures in streams using UTF-8 strings with checksums and fault/noise tolerance. The software is in a pre-pre-alpha stage right now, but it is usable. The interface is undergoing a major rewrite to use a result pattern instead of exceptions (though exceptions may remain available as an option). This unit is written for FreePascal 3.2 or later. I may update it to also support Delphi, but for now I use the FreePascal generics.
    Downloads: 0 This Week
  • 19
    Spring Cloud Tencent

    A Spring Cloud based Service Governance Framework

    Spring Cloud Tencent is a Spring Cloud-based service governance framework provided by Tencent. It is a one-stop microservice solution that implements the standard Spring Cloud SPI. It integrates Spring Cloud with Tencent middleware and makes it easy to develop microservices.
    Downloads: 0 This Week
  • 20
    Membrane Core

    The core of Membrane Framework, multimedia processing framework

    membrane_core is the foundation of the Membrane multimedia framework for Elixir, providing the abstractions and runtime needed to build real-time audio and video pipelines. It models media processing as a graph of lightweight, supervised OTP processes—elements connected by links—so work is isolated, fault-tolerant, and easy to scale or reconfigure at runtime. The core defines a clear lifecycle and callback API for elements, plus concepts like buffers, events, and capabilities/format...
    Downloads: 0 This Week
  • 21
    KubeRay

    A toolkit to run Ray applications on Kubernetes

    KubeRay is a powerful, open-source Kubernetes operator that simplifies the deployment and management of Ray applications on Kubernetes. It offers several key components. KubeRay core: This is the official, fully-maintained component of KubeRay that provides three custom resource definitions, RayCluster, RayJob, and RayService. These resources are designed to help you run a wide range of workloads with ease.
    Downloads: 1 This Week
  • 22
    Zeebe

    Distributed Workflow Engine for Microservices Orchestration

    Automate processes at scale with unprecedented performance and resilience. Zeebe is the workflow and decision engine that powers Camunda Platform 8. Zeebe’s cloud-native design provides the performance, resilience, and security enterprises need to future-proof their process orchestration efforts. Zeebe distributes data across all brokers in a cluster with storage directly on the server filesystem. If one broker goes down, another can replace it with no data loss. This pre-configured...
    Downloads: 0 This Week
  • 23
    tigerbeetle

    The financial transactions database designed for mission critical safe

    TigerBeetle is production-ready on Linux and seamlessly integrated with major programming languages. TigerBeetle is a financial transactions database designed for mission-critical safety and performance to power the next 30 years of OLTP. TigerBeetle redesigns the distributed database storage engine and consensus protocol for the OLTP workload. This solves the problem of OLTP write contention to unlock three orders of magnitude more performance than a general purpose (OLGP) database. The...
    Downloads: 2 This Week
  • 24
    Roxy-WI

    Web interface for managing HAProxy, NGINX, Apache and Keepalived

    For those who need a convenient interface for managing all services in one place. Roxy-WI was created for people who want a fault-tolerant infrastructure but do not want to plunge deep into the details of setting up and creating a cluster based on HAProxy, NGINX, Apache, and Keepalived. Use Roxy-WI to build a highly available cluster in a couple of clicks: install HAProxy, NGINX, Apache, Keepalived, and their exporters, and carry out the initial configuration for the services. Collect...
    Downloads: 6 This Week
  • 25
    springcloud-learning

    Build microservices with the Spring Cloud ecosystem

    springcloud-learning is a hands-on tutorial repository that walks Java developers through building microservices with the Spring Cloud ecosystem. It breaks concepts into small, runnable modules so you can focus on one capability at a time—service discovery, configuration management, gateway routing, fault tolerance, messaging, and observability. The code emphasizes practical integration with common infrastructure like Nacos/Eureka for registry, Nacos/Config Server for configuration, Sentinel/Resilience4j for resilience, and gateways for routing and cross-cutting concerns. Each module typically includes minimal bootstrapping, clear dependencies, and example endpoints, making it easy to start, test, and reason about the behavior. ...
    Downloads: 0 This Week