Search Results for "artificial intelligence projects in c# - text recognition" - Page 4

Showing 531 open source projects for "artificial intelligence projects in c# - text recognition"

  • 1
    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...
    Downloads: 18 This Week
  • 2
    KrillinAI

    Video translation and dubbing tool powered by LLMs

    KrillinAI is an end-to-end content localization, translation, and dubbing tool aimed at helping creators transform videos into multiple languages with minimal manual effort. It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed...
    Downloads: 7 This Week
  • 3
    GoogleTest

    Google Testing and Mocking Framework

    GoogleTest is Google's C++ mocking and test framework. It's used by many internal projects at Google, as well as a number of notable projects such as the Chromium projects, the OpenCV computer vision library, and the LLVM compiler. This GoogleTest project is actually a union of what used to be two separate projects: the old GoogleTest and GoogleMock, an extension of GoogleTest for writing and using C++ mock classes. Since they were so closely related, they were merged to create an even...
    Downloads: 14 This Week
  • 4
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 20 This Week
  • 5
    Gemini Next Chat

    Deploy your private Gemini application for free with one click

    Gemini Next Chat is an open-source web application that allows you to deploy your own private chat interface powered by Google’s Gemini models (e.g., Gemini 1.5, Gemini 2.0, etc.). It is built with Next.js/TypeScript and targets developers and hobbyists who want a self-hosted solution for interacting with advanced multimodal models (text, image, voice). It supports features like image recognition, voice-based conversation, plugins (web search, ArXiv search, weather, etc.), and client apps...
    Downloads: 2 This Week
  • 6
    Ultravox

    Fast multimodal LLM for real-time voice interaction and AI apps

    Ultravox is an open source multimodal large language model designed specifically for real-time voice-based interactions. It is built to process both text and spoken audio directly, eliminating the need for a separate speech recognition stage and enabling more seamless conversational experiences. Ultravox works by combining text prompts with encoded audio inputs, allowing it to understand spoken language alongside written instructions in a unified pipeline. Internally, it leverages pretrained...
    Downloads: 0 This Week
  • 7
    Aix-DB

    Based on the LangChain/LangGraph framework

    Aix-DB is an open-source intelligent data analysis platform that combines large language models with database technologies to enable conversational data exploration. The system is designed as a ChatBI solution that allows users to query datasets using natural language and receive structured insights, charts, and visualizations automatically. Built on frameworks such as LangChain and LangGraph, Aix-DB integrates retrieval-augmented generation and Text-to-SQL capabilities to convert user...
    Downloads: 0 This Week
  • 8
    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines from Hugging Face

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. It is designed to help...
    Downloads: 2 This Week
  • 9
    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware...
    Downloads: 1 This Week
  • 10
    VideoChat

    Real-time voice interactive digital human

    VideoChat is a real-time voice-interactive “digital human” system that combines automatic speech recognition, large language models, text-to-speech, and talking-head generation into a single conversational pipeline. It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head. It is built as a Gradio Python demo, exposing a...
    Downloads: 0 This Week
  • 11
    audioFlux

    A library for audio and music analysis and feature extraction

    A library for audio and music analysis and feature extraction. It can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. audioFlux is a deep learning tool library that supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training and is used to...
    Downloads: 1 This Week
  • 12
    Open Model Zoo

    Pre-trained Deep Learning models and demos

    Open Model Zoo is a large repository of high-quality pre-trained deep learning models and demonstration applications designed to work with the OpenVINO™ toolkit, offering a comprehensive starting point for a wide range of AI and computer vision workloads. It includes hundreds of models covering object detection, classification, segmentation, pose estimation, speech recognition, text-to-speech, and more, many of which are already converted into formats optimized for inference on CPUs, GPUs,...
    Downloads: 1 This Week
  • 13
    Generative AI Docs

    Documentation for Google's Gen AI site - including Gemini API & Gemma

    Generative AI Docs is Google’s official documentation repository for Gemini, Vertex AI, and related generative AI APIs. It contains guides, API references, and examples for developers building applications using Google’s large language models, text-to-image models, embeddings, and multimodal capabilities. The repository includes markdown source files that power the Google AI developer documentation site, as well as sample code snippets in Python, JavaScript, and other languages that...
    Downloads: 4 This Week
  • 14
    Google Research: Language

    Shared repository for open-sourced projects from the Google AI Lang

    Google Research: Language is a shared repository maintained by Google Research that contains open-source projects developed by the Google AI Language team. The repository hosts multiple subprojects related to natural language processing, machine learning, and large-scale language understanding systems. Many of the projects included in the repository correspond to research papers released by Google researchers and provide implementations of new NLP algorithms or experimental frameworks. These...
    Downloads: 0 This Week
  • 15
    GROBID

    A machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and restructuring raw documents such as PDFs into structured XML/TEI-encoded documents, with a particular focus on technical and scientific publications. Development started in 2008 as a hobby, and the tool was released as open source in 2011. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from articles in PDF format. The...
    Downloads: 3 This Week
  • 16
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models...
    Downloads: 11 This Week
  • 17
    Vowpal Wabbit

    Machine learning system which pushes the frontier of machine learning

    Vowpal Wabbit is a machine learning system that pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. There is a specific focus on reinforcement learning, with several contextual bandit algorithms implemented and an online design that suits the problem well. Vowpal Wabbit is a destination for implementing and maturing state-of-the-art algorithms with performance in mind. The input format for...
    Downloads: 0 This Week
  • 18
    Z-Image

    Image generation model with single-stream diffusion transformer

    Z-Image is an efficient, open-source image generation foundation model built to make high-quality image synthesis more accessible. With just 6 billion parameters — far fewer than many large-scale models — it uses a novel “single-stream diffusion Transformer” architecture to deliver photorealistic image generation, demonstrating that excellence does not always require extremely large model sizes. The project includes several variants: Z-Image-Turbo, a distilled version optimized for speed and...
    Downloads: 43 This Week
  • 19
    spacy-llm

    Integrating LLMs into structured NLP pipelines

    Large Language Models (LLMs) feature powerful natural language understanding capabilities. With only a few (and sometimes no) examples, an LLM can be prompted to perform custom NLP tasks such as text categorization, named entity recognition, coreference resolution, information extraction and more. This package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting, and turning unstructured responses into robust outputs for various...
    Downloads: 0 This Week
  • 20
    FAY

    Framework for building AI-powered interactive digital humans and agent

    Fay is an open source framework designed to build and deploy interactive digital humans powered by large language models. It acts as a middleware layer that connects digital character technologies with conversational AI systems and business applications. Fay supports various types of digital humans, including 2.5D and 3D avatars, and can be integrated with applications running on mobile devices, PCs, web platforms, and embedded systems. Its architecture allows developers to combine different...
    Downloads: 1 This Week
  • 21
    comfyui-mixlab-nodes

    Workflow and speech recognition app

    comfyui-mixlab-nodes is a large collection of custom nodes for ComfyUI that turns workflows into interactive apps and adds real-time multimedia, LLM, and TTS capabilities. It introduces a “Workflow-to-APP” concept, where a ComfyUI graph can be transformed into a Web App through an AppInfo node, complete with categories, batch prompts, and editable configurations. The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that...
    Downloads: 1 This Week
  • 22
    Chinese-XLNet

    Chinese XLNet pre-trained model

    Chinese-XLNet is a Chinese language pre-trained model based on the XLNet architecture, providing an advanced foundation for natural language processing tasks in Mandarin and other Chinese dialects. Unlike traditional masked language modeling, XLNet uses a permutation language modeling objective that captures bidirectional context more effectively by training over all possible token orderings, yielding richer contextual representations. This model is trained on large-scale Chinese text...
    Downloads: 1 This Week
  • 23
    VideoPipe

    A cross-platform video structuring (video analysis) framework

    VideoPipe is an open-source C++ framework designed for building modular video analysis pipelines that process and structure video data using computer vision models. It operates using a pipeline architecture where independent nodes can be combined flexibly to create customized workflows for tasks such as object detection, face recognition, and behavior analysis. The framework is designed to be lightweight and portable, with minimal dependencies compared to other video processing systems,...
    Downloads: 0 This Week
  • 24
    Streamer-Sales

    Large language model for live-stream sales hosts

    Streamer-Sales is an open-source large language model system designed specifically for e-commerce live streaming and automated product promotion. The project focuses on generating persuasive product descriptions and live presentation scripts that mimic the style of professional online sales hosts. By analyzing product characteristics and marketing information, the model can produce engaging explanations that emphasize benefits, features, and emotional appeal to encourage viewers to make...
    Downloads: 0 This Week
  • 25
    Beads

    A memory upgrade for your coding agent

    Beads is an open-source project providing a distributed, structured memory system for AI coding agents, replacing ad-hoc text plans with a git-backed graph that represents tasks, dependencies, and progress in a persistent, queryable format. Instead of storing plans as unstructured Markdown or ephemeral notes, Beads organizes agent state, task artifacts, and relationships as nodes and edges in a version-controlled graph so that long-horizon projects don’t lose context or coherence as the...
    Downloads: 6 This Week