Showing 391 open source projects for "vocabulary"

  • 1
    SAM 3

    Code for running inference and finetuning with the SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. ...
    Downloads: 89 This Week
  • 2
    Vosk Speech Recognition Toolkit

    Offline speech recognition API for Android, iOS, Raspberry Pi

    ...It enables speech recognition for 20+ languages and dialects: English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish. More to come. Vosk models are small (50 MB) but provide continuous large-vocabulary transcription, zero-latency response with a streaming API, reconfigurable vocabulary, and speaker identification. Speech recognition bindings are implemented for various programming languages such as Python, Java, Node.js, C#, C++, Rust, Go, and others. Vosk supplies speech recognition for chatbots, smart home appliances, and virtual assistants. ...
    Downloads: 86 This Week
  • 3
    My Vocabulary

    Simple Vocabulary app

    A tiny, always-on-top overlay flashcard app for effortless vocabulary learning while you work or browse. ⚠️ Note about full-screen games: some games use exclusive fullscreen (DirectX/OpenGL/Vulkan), and in that mode overlays cannot draw on top. Switch to borderless windowed (fullscreen) mode instead.
    Downloads: 7 This Week
  • 4
    HomeRobot

    Mobile manipulation research tools for roboticists

    ...It provides interfaces for Detic, Grounded-SAM, and Contact-GraspNet, allowing open-vocabulary detection and 3D grasping.
    Downloads: 1 This Week
  • 5
    Rime ICE

    rime-ice is a highly optimized schema for the RIME input method

    rime-ice is a highly optimized schema for the RIME (中州韻) input method engine, offering a clean, intelligent, and efficient Chinese input experience. Built with modular configuration files and designed for performance, rime-ice provides powerful input suggestions, simplified vocabulary, and flexible customization, catering to users who want a streamlined and practical Chinese typing setup.
    Downloads: 1 This Week
  • 6
    alphageometry

    AI-driven neuro-symbolic solver for high-school geometry problems

    ...The DDAR solver focuses purely on rule-based reasoning, while AlphaGeometry enhances this by using a learned model to suggest auxiliary constructions when logical reasoning alone is insufficient. The repository includes pre-trained weights, vocabulary files, and detailed configuration options for reproducing experiments.
    Downloads: 13 This Week
  • 7
    Schema.DTS

    JSON-LD TypeScript types for Schema vocabulary

    The project provides a comprehensive set of TypeScript typings based on the Schema vocabulary, enabling developers to author JSON-LD structured data with strong type safety. It supplies both high-level discriminated unions and helper types to model contexts, graphs, and linked data relationships with clarity and accuracy. Usage examples demonstrate how one can import types like Person, WithContext, or Graph and compose JSON-LD objects in a way that aligns with semantic-web and knowledge-graph practices. ...
    Downloads: 0 This Week
  • 8
    SentencePiece

    Unsupervised text tokenizer for Neural Network-based text generation

    SentencePiece is an unsupervised text tokenizer and detokenizer, mainly for neural-network-based text generation systems where the vocabulary size is fixed before the neural model is trained. SentencePiece implements subword units (e.g., byte-pair encoding (BPE) [Sennrich et al.] and the unigram language model [Kudo]) with the extension of direct training from raw sentences. SentencePiece allows us to make a purely end-to-end system that does not depend on language-specific pre/postprocessing. ...
    Downloads: 1 This Week
  • 9
    Pot Desktop

    Cross-platform software for text translation and recognition

    ...The tool supports external plugin extensions, so its functionality can be expanded well beyond the built-in options: you can add translation engines, OCR backends, TTS engines, vocabulary export (e.g. for language learning), and more. Pot-Desktop works on Windows, macOS, and Linux (including Wayland environments) and offers installers or package-manager installation (e.g. via brew or .deb), so it is accessible to users on all major desktop OSes.
    Downloads: 12 This Week
  • 10
    English-level-up-tips

    An advanced guide to learning English that might benefit you a lot

    English-level-up-tips is a comprehensive open-source guide designed to help learners improve their English language skills across a broad range of competencies, from vocabulary and grammar to listening, speaking, reading, and writing. Structured as a language learning tutorial, the project aggregates tips, strategies, explanations, and resources that go beyond simple phrase lists, encouraging learners to develop a deep understanding of how English works and how to use it effectively. The repository includes structured sections that address different skill areas with lessons, exercises, and recommended approaches tailored to learners at various stages of proficiency. ...
    Downloads: 0 This Week
  • 11
    IBM-ODM-Docker

    This repository lets you deploy IBM Operational Decision Manager

    ...IBM ODM is a decisioning platform to automate your business policies. Business rules are used at the heart of the platform to implement decision logic on a business vocabulary and run it as web decision services.
    Downloads: 1 This Week
  • 12
    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    ...It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into token IDs and decode tokens back into text. It is intentionally small and readable so developers can understand each stage of BPE, including the mechanics of pair counting, merge application, and vocabulary growth. ...
    Downloads: 0 This Week
  • 13
    ML Ferret

    Refer and Ground Anything Anywhere at Any Granularity

    Ferret is Apple’s end-to-end multimodal large language model designed specifically for flexible referring and grounding: it can understand references of any granularity (boxes, points, free-form regions) and then ground open-vocabulary descriptions back onto the image. The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo presents the vision-language pipeline, model assets, and paper resources that show how Ferret answers questions, follows instructions, and returns grounded outputs rather than just text. ...
    Downloads: 0 This Week
  • 14
    Mistral Finetune

    Memory-efficient and performant finetuning of Mistral's models

    ...The repo includes utilities for data preprocessing (e.g. reformat_data.py), validation scripts, and example YAML configs for training variants like 7B base or instruct models. It supports function-calling style datasets (via "messages" keys) as well as plain text formats, with guidelines on formatting, tokenization, and vocabulary extension (e.g. extending vocab to 32768 for some models) before finetuning. The project also provides tutorial notebooks (e.g. mistral_finetune_7b.ipynb) to walk through the steps.
    Downloads: 0 This Week
  • 15
    Hera

    Hera is an Argo Python SDK

    ...Hera aims to make the construction and submission of various Argo Project resources easy and accessible to everyone! Hera abstracts away low-level setup details while still maintaining a consistent vocabulary with Argo.
    Downloads: 1 This Week
  • 16
    Amical

    Open Source AI Dictation App

    ...It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate to the platform. Users can enhance transcription accuracy with custom vocabulary tailored to industry jargon, proper nouns, and personal terms, and set up personalized voice shortcuts to trigger workflows or dictate across applications. Amical supports multilingual dictation with over 50 languages at native-level accuracy. Its features include a floating desktop widget for easy access, voice-activated commands, custom hotkeys, transcription history, and more.
    Downloads: 7 This Week
  • 17
    Front-End Design Checklist

    The Design Checklist for Creative Web Designers

    ...The resource includes checks for responsive breakpoints, interaction states, accessibility considerations, and asset preparation, reducing rework later in the build. It promotes shared vocabulary and artifacts, helping teams avoid ambiguities around components, states, and edge cases. By using it early in the process, teams can prevent visual drift, inconsistent spacing, and incomplete specifications. The result is a repeatable, predictable path from mockup to production-quality UI.
    Downloads: 1 This Week
  • 18
    API Platform Core

    The server component of API Platform, hypermedia and GraphQL APIs

    ...It is a component of the API Platform framework and it can be integrated with the Symfony framework using the bundle distributed with the library. It natively supports popular open formats including JSON for Linked Data (JSON-LD), Hydra Core Vocabulary, OpenAPI v2 (formerly Swagger) and v3, HAL and Problem Details. Build a working and fully-featured CRUD API in minutes. Leverage the awesome features of the tool to develop complex and high-performance API-first projects. Extend or override everything you want.
    Downloads: 1 This Week
  • 19
    Read Frog

    Open Source Immersive Translate

    Read Frog is an open-source browser extension designed to transform everyday web reading into an immersive language learning experience powered by artificial intelligence. The tool integrates translation, contextual explanations, and content analysis directly into the browsing workflow so users can learn languages naturally while reading authentic online content. Instead of forcing learners to switch between translation tools and the original text, the extension displays translations...
    Downloads: 2 This Week
  • 20
    Marksheet

    Free tutorial to learn HTML and CSS

    ...It explains core building blocks—elements, attributes, selectors, the box model, positioning—and connects them to the mental models needed for real layouts. The writing style aims to demystify jargon and teach a consistent vocabulary so learners can understand documentation and tutorials elsewhere. It includes diagrams and compact examples that illustrate concepts without burying readers in boilerplate. The material emphasizes progressive mastery, encouraging learners to build small pages and refine them with better structure and style. It’s a useful reference to revisit after your first projects, reinforcing fundamentals that make larger frameworks easier to learn later.
    Downloads: 0 This Week
  • 21
    A Fluent Builder For Schema.org Types

    A fluent builder for Schema.org types and an ld+json generator

    spatie/schema-org provides a fluent builder for all Schema.org types and their properties. The code in src is generated from Schema.org's JSON-LD standards file, so it provides objects and methods for the entire core vocabulary. The classes and methods are also fully documented as a quick reference. We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall. If you don't want to break the chain of a large schema object, you can use the if method to conditionally modify the schema. ...
    Downloads: 1 This Week
  • 22
    WeClone

    One-stop solution for creating your digital avatar from chat history

    ...It is intended primarily as an experimental exploration of digital personality modeling and conversational AI personalization. By processing large volumes of conversation data, WeClone can build a profile of an individual’s writing tone, vocabulary preferences, and conversational tendencies. Developers can use the resulting model to create chatbots that simulate a specific user’s communication patterns for testing or research purposes. Overall, WeClone explores the idea of digital identity replication through machine learning and conversational modeling.
    Downloads: 0 This Week
  • 23
    llms.txt hub

    The largest directory for AI-ready documentation and tools

    llms-txt-hub serves as a central directory and knowledge base for the emerging llms.txt convention, a simple, text-based way for project owners to communicate preferences to AI tools. It catalogs implementations across projects and platforms, helping maintain a shared understanding of how LLM-powered services should interact with code and documentation. The repository aims to standardize patterns for allowlists, denylists, attribution, rate expectations, and contact information, mirroring...
    Downloads: 0 This Week
  • 24
    gTTS

    Python library and CLI tool to interface with Google Translate

    ...The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports customizable text pre-processors, which can correct pronunciations, tweak formatting, or handle domain-specific vocabulary before sending it to the API. gTTS is primarily aimed at developers who want a quick way to add cloud-backed speech to scripts, apps, or pipelines without managing any model weights locally. A small CLI utility, gtts-cli, makes it easy to test or batch-generate MP3 files right from the shell.
    Downloads: 1 This Week
  • 25
    seq2seq-couplet

    Play couplets with a seq2seq model

    ...In addition to local execution, the project includes Docker files, which make it easier to package and deploy the application in a more reproducible way. The repository also points users to an external dataset source and documents vocabulary formatting requirements for custom datasets, showing that it is meant for both experimentation and extension.
    Downloads: 0 This Week
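
The minbpe entry in this list describes byte-level BPE in terms of pair counting, merge application, and vocabulary growth. Those three stages can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not minbpe's actual code: the function names (pair_counts, merge, train, encode, decode) are hypothetical, and ties between equally frequent pairs are broken by first occurrence.

```python
from collections import Counter


def pair_counts(ids):
    """Count how often each adjacent pair of token IDs occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}                                   # (id, id) -> new id
    vocab = {i: bytes([i]) for i in range(256)}   # start with all raw bytes
    for _ in range(num_merges):
        counts = pair_counts(ids)
        if not counts:
            break                                 # nothing left to merge
        pair = max(counts, key=counts.get)        # most frequent pair
        new_id = 256 + len(merges)                # vocabulary grows by one
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
        vocab[new_id] = vocab[pair[0]] + vocab[pair[1]]
    return merges, vocab


def encode(text, merges):
    """Apply learned merges, earliest-learned rule first."""
    ids = list(text.encode("utf-8"))
    while len(ids) >= 2:
        counts = pair_counts(ids)
        # Pick the pair whose merge rule was learned earliest.
        pair = min(counts, key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break
        ids = merge(ids, pair, merges[pair])
    return ids


def decode(ids, vocab):
    """Map token IDs back to bytes and then to text."""
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

Because the base vocabulary is the 256 possible byte values, any input text can be encoded without a language-specific character set; each learned merge adds exactly one new token ID on top of that base.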