Browse free open source Linguistics software and projects for Windows and ChromeOS below.

  • 1
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 8 This Week
  • 2
    CRFSharp

    CRFSharp is a .NET (C#) implementation of Conditional Random Fields

    CRFSharp (aka CRF#) is a .NET (C#) implementation of Conditional Random Fields, a machine learning algorithm for learning from labeled sequences of examples. It is widely used in natural language processing (NLP) tasks such as word breaking, part-of-speech tagging, named entity recognition, and query chunking. CRF#'s main algorithm is the same as that of CRF++, written by Taku Kudo, and it optimizes model parameters with L-BFGS during encoding (training). It also improves on CRF++ in several ways, including fully parallel encoding and optimized memory usage. When training on a corpus, CRF# makes full use of multi-core CPUs and uses little memory, and its memory usage grows slowly and smoothly as the training corpus and the number of tags increase. With multi-threaded processing, CRF# is better suited than CRF++ to training on large data sets and large tag sets. For example, on a machine with 64 GB of memory, CRF# quickly encodes a model with more than 450 million features.
    Downloads: 0 This Week
  • 3

    LanguageTool

    Proofreading Software for 20+ Languages

    LanguageTool is an open source language and grammar checker. This repository is out of date; see https://github.com/languagetool-org instead.
    Downloads: 0 This Week
  • 4
    This project segments text into semantic parts by means of a language model.
    Downloads: 0 This Week
  • 5
    Thinknowlogy

    The world's only naturally intelligent knowledge technology

    Natural intelligence is the utilization of naturally occurring logic. This logic provides concrete clues for organizing natural objects, such as grouping objects that belong together, separating objects that don't belong together, and archiving objects that have become less important. Natural language and spatial information are sources of natural intelligence: natural language provides concrete logic for organizing knowledge objects, and spatial information provides concrete logic for organizing spatial objects (utilized in, e.g., self-driving cars). In this way, our brains know how to organize their knowledge and spatial information. I focus on natural language because this source of natural intelligence is hardly understood by scientists, hence the inability of Large Language Models to organize changes in their knowledge independently.
    Downloads: 0 This Week
  • 6
    WordSegment

    wordseg is a word segmentation module implemented in C#

    wordseg is a word segmentation module implemented in C#. It segments text into tokens and labels each token's attributes according to its context and semantics, using forward maximum matching and CRF algorithms. For example, consider the following sentences: 张晓晨和付仲恺一起坐在家(西坝河东里社区)里的沙发上看非诚勿扰。 百度公司的名字源于“众里寻他千百度”这诗句。 After wordseg segments these sentences, the results are: 张晓晨[PER] 和 付仲恺[PER] 一起 坐 在 家 ( 西坝河东里社区[LOC] ) 里 的 沙发[PDT] 上 看 非 诚 勿扰 。 百度公司[ORG] 的 名字 源于 “ 众 里 寻 他 千百度 ” 这 诗句 。 If a token has attributes, they are appended to the token within "[]". Since wordseg uses a statistical model to segment text by context, for the same substring in different contexts, dif
    Downloads: 0 This Week
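The WordSegment entry above mentions maximum matching as one of its segmentation steps. As a rough illustration only, here is a minimal Python sketch of greedy forward maximum matching. It is not wordseg's actual code (wordseg is written in C# and also uses a CRF model); the function name `forward_max_match` and the toy lexicon are assumptions made up for this example.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation: at each position, take the
    longest lexicon entry that matches, falling back to one character."""
    if max_len is None:
        # Longest word in the lexicon bounds the search window.
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop advances.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; a real segmenter would load a large dictionary.
lexicon = {"百度公司", "百度", "公司", "名字", "源于"}
tokens = forward_max_match("百度公司的名字源于", lexicon)
print(tokens)  # ['百度公司', '的', '名字', '源于']
```

Dictionary matching alone cannot decide between competing segmentations of the same substring or assign labels such as [PER] or [ORG]; per the description above, wordseg relies on a statistical (CRF) model for context-dependent decisions.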