Linguistics Software for BSD

Browse free open source Linguistics software and projects for BSD below.

  • 1
    Virastyar

    Virastyar is a spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages.

    Publications:
    Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT.
    Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering, 19(2), 259-284.
    Rasooli, M. S., Kashefi, O., & Minaei-Bidgoli, B. (2011). Effect of adaptive spell checking in Persian. In NLP-KE.

    Contributors: Omid Kashefi, Azadeh Zamanifar, Masoumeh Mashaiekhi, Meisam Pourafzal, Reza Refaei, Mohammad Hedayati, Kamiar Kanani, Mehrdad Senobari, Sina Iravanin, Mohammad Sadegh Rasooli, Mohsen Hoseinalizadeh, Mitra Nasri, Alireza Dehlaghi, Fatemeh Ahmadi, Neda PourMorteza
    Downloads: 425 This Week
  • 2
    iramuteq
    IRAMUTEQ (Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires) is an R interface for multidimensional analysis of texts and questionnaires. It is data-processing software for text corpora or for tables of individuals by characters. In particular, it supports "ALCESTE"-type analyses.
    Downloads: 774 This Week
  • 3

    Presage

    the intelligent predictive text entry platform

    Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic predictive algorithms. Presage's predictive capabilities are implemented by predictive plugins. Predictive plugins use services provided by the platform to implement multiple prediction techniques.
    Downloads: 347 This Week
  • 4
    Artha ~ The Open Thesaurus
    Artha is a handy thesaurus based on WordNet with distinct features like global hotkey look-up, passive desktop notifications, regular-expression-based search, etc. Artha may be used as a free open-source replacement for the proprietary WordWeb Pro.
    Downloads: 48 This Week
  • 5
    Mishkal: Arabic Text Vocalization

    Arabic Text Vocalization system

    An automatic system for vocalizing Arabic text.
    Downloads: 21 This Week
  • 6
    Google Translate PHP

    Free Google Translate API PHP Package

    A simple and effective PHP library for translating text using Google Translate without needing an API key. It allows developers to integrate real-time translation features into their applications with minimal setup and supports multiple languages, leveraging Google Translate’s unofficial endpoint.
    Downloads: 3 This Week
  • 7
    STranslate

    A ready-to-use translation and OCR tool developed with WPF

    STranslate is a lightweight, open-source machine translation front end that lets users translate text between languages using a variety of supported back-end translation engines or APIs, offering a simple GUI for quick translation tasks without needing to write code or use complex web UIs. The application is designed to be small, cross-platform, and flexible, giving users the ability to type or paste text and receive instant translations while optionally selecting the desired language pairs or switching between multiple service providers. By abstracting backend complexity, STranslate makes it easy for both casual users and developers to get translations in local apps, offline modes (where supported), or even integrate translation workflows into larger projects via plugins or scripting hooks. It often includes features like clipboard monitoring, keyboard shortcuts, history tracking, and configurable translation preferences that enhance daily productivity.
    Downloads: 3 This Week
  • 8
    Varamozhi is a free English-Malayalam transliteration library. It can transliterate Malayalam text between Malayalam and English scripts. Varamozhi takes as the input, the mapping between a Malayalam font and a transliteration scheme; outputs functions i
    Downloads: 71 This Week
  • 9
    Apertium: Machine Translation Toolbox

    The free and open-source rule-based machine translation platform

    Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs.
    Downloads: 13 This Week
  • 10

    Wordcorr

    Data management for comparative linguistics

    Wordcorr automates the tedious and risky process of tabulating and managing the sound correspondences used in working out the historical development of natural languages. Initial support was from NSF.
    Downloads: 16 This Week
  • 11
    Free Dictionaries
    Free translating dictionaries. Source format: TEI-P5 XML. Delivery formats: DICT, Stardict, etc. The dictionaries may include information on the pronunciation, etymology and such, in a platform-independent format. Access: web/plugins/standalone.
    Downloads: 59 This Week
  • 12
    UnsupervisedMT

    Phrase-Based & Neural Unsupervised Machine Translation

    Unsupervised Machine Translation is a research repository that implements both phrase-based SMT and neural MT approaches for translation without parallel corpora. The neural component supports multiple architectures—seq2seq, biLSTM with attention, and Transformer—and allows extensive parameter sharing across languages to improve data efficiency. Training relies on denoising auto-encoding and back-translation, with on-the-fly, multithreaded generation of synthetic parallel data to continually refresh supervision signals. The project also provides scripts to fetch and preprocess monolingual data, learn BPE codes, and train cross-lingual embeddings that bootstrap unsupervised alignment between languages. Beyond the core EMNLP 2018 setup, the codebase exposes additional, optional capabilities such as multi-language training, language model pretraining with shared parameters, and adversarial training.
    Downloads: 2 This Week
  • 13
    Fresh Memory

    Flashcards application with Spaced Repetition method

    Fresh Memory is an application that helps you learn large amounts of any material using the Spaced Repetition method. Its main use is learning foreign words, but Fresh Memory can also be used to learn anything else. The learning data is stored as flashcards and dictionaries. The flashcards may have several fields, and the user controls which combination of fields to learn. The flashcards can contain formatted text and images.
    Downloads: 5 This Week
  • 14
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 8 This Week
  • 15
    srt-translator

    Subtitle translator from one natural language to another.

    Translates subtitles in SubRip format from one natural language to another. It is based on Google Translate without the API, and is therefore free of charge. The translator has automatic and manual spell checkers.
    Downloads: 9 This Week
  • 16
    LaBB-CAT

    A linguistic annotation store

    LaBB-CAT is a browser-based linguistics research tool that stores recordings of interviews along with text transcripts that are searchable by regular expression. The search results, entire transcripts, and media can be viewed or exported in a variety of formats.
    Downloads: 8 This Week
  • 17
    The Dictionary System
    The Dictionary System (DS) is a web application for creating one-way bilingual dictionaries or encyclopaedias. It offers a working environment for building a dictionary and a web page that lets the general public search it. It is a so-called DWS (Dictionary Writing System) or DPS (Dictionary Production/Publishing System) application.
    Downloads: 2 This Week
  • 18
    Maskouk : Arabic Collocations
    Maskouk: Arabic Collocations Dictionary (Arabic fixed expressions, collocations, and co-occurring words)
    Downloads: 3 This Week
  • 19
    XML-Print

    XML-Print: typesetting arbitrary XML documents in high quality

    "XML-Print" is a joint project of the FH Worms (Prof. Marc W. Küster) and the University of Trier (Prof. Claudine Moulin) with support from TU Darmstadt (Prof. Andrea Rapp). Its goal is the creation of an XML formatter designed especially for the needs of the "Digital Humanities". The project is funded by the DFG. Please visit https://sites.google.com/a/budabe.eu/xmlprint_de/kontakt and let us know what you think about XML-Print: Does it meet your expectations? What is missing? Do you use it regularly? Thank you.
    Downloads: 3 This Week
  • 20
    Better PO Editor is an editor for .po files, which are used to generate the compiled gettext .mo files that many programs and websites use to localize their user interface. It's worth a try! PLEASE NOTE: the project moved to GitHub: see https://github.com/mlocati/betterpoeditor/releases
    Downloads: 3 This Week
  • 21
    Unicode Conversion Gateway is a web-based proxy server that converts web pages in some Indian languages from proprietary encodings into Unicode. It extends and reimplements Padma, a popular Firefox extension, in PHP.
    Downloads: 2 This Week
  • 22
    Al-Mintiq: Arabic eSpeak

    Arabic voice files for eSpeak system

    Al-Mintiq: Arabic language files and voices for the eSpeak text-to-speech system.
    Downloads: 2 This Week
  • 23
    AzConvert is an open source program that converts between the different scripts of the Azerbaijani language (Latin, Arabic, and Cyrillic). It's written with Qt.
    Downloads: 2 This Week
  • 24

    Online Transcription Editor (OTE)

    A tool for Visual Transcriptions of biblical texts at INTF and ITSEE

    The Online Transcription Editor was developed as part of the joint project "Workspace for Collaborative Editing". It is used for transcriptions at the INTF in Münster and the ITSEE in Birmingham.
    Downloads: 2 This Week
  • 25
    A project that aims to create reusable components (C++ libraries, COM components, and Edit controls) for Phonetic Transliteration of Indian languages, such as Telugu, Tamil, Kannada etc.
    Downloads: 2 This Week