Best Open Source Mac Linguistics Software 2026

Linguistics Software for Mac

Filters: Linguistics, Mac, BSD License

Browse free open source Linguistics software and projects for Mac below.

1

TEI LingSIG

Production space for the TEI Linguistics SIG

This used to be the experimentation and production space for the Special Interest Group (SIG) of the Text Encoding Initiative (TEI) called "TEI for Linguists", LingSIG for short. Currently, this is a storage place for documents produced by the SIG. Use https://github.com/LingSIG to access the current production space.

Downloads: 2 This Week

Last Update: 2025-09-18
See Project
2

Australian National Corpus

An ongoing project to collate and provide access to language data

Includes:
• Scripts for the program/code developed
• High-level architecture diagrams
• Install guides for developers
• Links to end-user documentation on the AusNC website

Note: The BSD license applies to customised plug-ins, scripts and ingest programs developed by the AusNC project team. Additional open source, third-party software products used by the AusNC solution are referenced on our SF wiki space.

Downloads: 1 This Week

Last Update: 2016-11-29
See Project
3

LINNAEUS

Entity recognition and normalization software for biomedical text

Downloads: 1 This Week

Last Update: 2016-05-05
See Project
4

MITRE Annotation Toolkit

A toolkit for managing and manipulating text annotations

The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g., named entity identification, de-identification of medical records). The goal of MAT is not to help you configure your training engine (in the default case, the Carafe CRF system) to achieve the best possible performance on your data. MAT is for "everything else": all the tools you end up wishing you had.

Downloads: 1 This Week

Last Update: 2023-04-19
See Project
5

Annoschemer

Annoschemer is a small tool for easy editing of MMAX2 annotation schemes.

Downloads: 0 This Week

Last Update: 2014-07-15
See Project
6

Argument Dynamics Simulation

A simulation package for investigating the dynamics of complex controversy.

Downloads: 0 This Week

Last Update: 2013-06-30
See Project
7

BioContext

Software for extraction of biomedical information from literature

Downloads: 0 This Week

Last Update: 2012-02-12
See Project
8

BioLemmatizer

Lemmatization tool for morphological analysis of biomedical literature

The BioLemmatizer is a domain-specific lemmatization tool for the morphological analysis of biomedical literature. It is tailored to the biological domain through integration of several published lexical resources related to molecular biology. It focuses on the inflectional morphology of English, including the plural forms of nouns, the conjugations of verbs, and the comparative and superlative forms of adjectives and adverbs. README: https://sourceforge.net/projects/biolemmatizer/files/ The BioLemmatizer 1.2 release adds optional functionality to normalize British English spellings into American English spellings and then retrieve the corresponding lemmas. If you use the BioLemmatizer to support academic research, please cite the following paper: Haibin Liu, Tom Christiansen, William A Baumgartner Jr, and Karin Verspoor. "BioLemmatizer: a lemmatization tool for morphological processing of biomedical text." Journal of Biomedical Semantics 2012, 3:3.

Downloads: 0 This Week

Last Update: 2013-10-23
See Project
9

CRFSharp

CRFSharp is a .NET (C#) implementation of Conditional Random Fields

CRFSharp (aka CRF#) is a .NET (C#) implementation of Conditional Random Fields, a machine learning algorithm for learning from labeled sequences of examples. It is widely used in Natural Language Processing (NLP) tasks such as word breaking, POS tagging, named entity recognition, and query chunking. CRF#'s main algorithm is the same as that of CRF++, written by Taku Kudo, and it encodes model parameters with L-BFGS. It also offers significant improvements over CRF++, such as fully parallel encoding and optimized memory usage. When training on a corpus, CRF# makes full use of multi-core CPUs and uses little memory compared with CRF++, and its memory use grows slowly and smoothly as the size of the training corpus and the number of tags increase. With multi-threaded processing, CRF# is currently better suited than CRF++ to training on large data and tag sets. For example, on a 64 GB machine, CRF# quickly encodes a model with more than 450 million features.

Downloads: 0 This Week

Last Update: 2015-08-03
See Project
10

Encode Arabic

Encode Arabic provides tools for encoding and decoding Arabic in Haskell, Python, Perl, or LaTeX. Interprets the ArabTeX notation to generate original orthography or phonetic transcription. Supports Buckwalter and other romanizations. Converts legacy byte encodings into Unicode. http://github.com/otakar-smrz/encode-arabic
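As a rough illustration of what a Buckwalter romanization round trip involves, here is a minimal Python sketch. It does not use the Encode Arabic library or reproduce its API; the function names `decode_buckwalter` and `encode_buckwalter` are hypothetical, and the table covers only the basic letters, leaving out hamza forms, short vowels, and other diacritics.

```python
# Partial Buckwalter table (basic letters only). This is an illustrative
# subset, not the full scheme and not the Encode Arabic implementation.
BUCKWALTER_TO_ARABIC = {
    "A": "ا", "b": "ب", "t": "ت", "v": "ث", "j": "ج", "H": "ح",
    "x": "خ", "d": "د", "*": "ذ", "r": "ر", "z": "ز", "s": "س",
    "$": "ش", "S": "ص", "D": "ض", "T": "ط", "Z": "ظ", "E": "ع",
    "g": "غ", "f": "ف", "q": "ق", "k": "ك", "l": "ل", "m": "م",
    "n": "ن", "h": "ه", "w": "و", "y": "ي",
}
ARABIC_TO_BUCKWALTER = {ar: bw for bw, ar in BUCKWALTER_TO_ARABIC.items()}


def decode_buckwalter(text):
    """Map Buckwalter ASCII to Arabic script; unknown characters pass through."""
    return "".join(BUCKWALTER_TO_ARABIC.get(ch, ch) for ch in text)


def encode_buckwalter(text):
    """Map Arabic script to Buckwalter ASCII; unknown characters pass through."""
    return "".join(ARABIC_TO_BUCKWALTER.get(ch, ch) for ch in text)


if __name__ == "__main__":
    word = decode_buckwalter("ktAb")
    print(word, encode_buckwalter(word))
```

Because the mapping is one character to one character, encoding and decoding are exact inverses for any text that stays within the table.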

1 Review

Downloads: 0 This Week

Last Update: 2016-06-28
See Project
11

FlexiTerm

NOTE: For latest version, please visit https://github.com/ispasic/FlexiTerm. FlexiTerm is an open-source software tool for automatic term recognition. FlexiTerm uses a range of methods to neutralise the main sources of term variation. FlexiTerm is robust enough for less formally structured texts, such as those found in patient blogs or medical notes.

Downloads: 0 This Week

Last Update: 2018-10-05
See Project
12

KneeTex

KneeTex is an open-source, stand-alone application for information extraction from narrative reports that describe an MRI scan of the knee. Given an MRI report as input, the system outputs the corresponding clinical findings in the form of JavaScript Object Notation objects. The extracted information is mapped onto TRAK, an ontology that formally models knowledge relevant for the rehabilitation of knee conditions. As a result, formally structured and coded information allows for complex searches to be conducted efficiently over the original MRI reports, thereby effectively supporting epidemiologic studies of knee conditions.

Downloads: 0 This Week

Last Update: 2015-09-11
See Project
13

MetaSyntaxesSharp

A collection of metasyntaxes such as EBNF for .NET, including a definition file parser and an expression tree.

Downloads: 0 This Week

Last Update: 2013-04-22
See Project
14

MinGen

MinGen is a Minimalist generator, the logical opposite to a parser. MinGen generates syntactically valid sentences by following the rules of Minimalism.

Downloads: 0 This Week

Last Update: 2013-11-01
See Project
15

Morfologik

ATTENTION! Morfologik is now at GitHub: https://github.com/morfologik/

1 Review

Downloads: 0 This Week

Last Update: 2015-09-10
See Project
16

Open Source Linguistics

A public repository of open source scripts and small programs related to linguistics and language.

Downloads: 0 This Week

Last Update: 2014-08-22
See Project
17

Open Translation Engine

The Open Translation Engine (OTE) is a web-based translation dictionary manager. The OTE allows a community of users to create and manage one or many translation dictionaries. The OTE is written in PHP and uses a MySQL database.

1 Review

Downloads: 0 This Week

Last Update: 2013-12-12
See Project
18

Semantic Segment

This project segments text into semantic parts by means of a language model.

Downloads: 0 This Week

Last Update: 2015-05-19
See Project
19

WordSegment

wordseg is a word segmentation module implemented in C#

wordseg is a word segmentation module implemented in C#. It segments text into tokens and labels each token's attributes according to its context and semantics, using forward maximum matching and CRF algorithms. The following sentences are examples to be segmented: 张晓晨和付仲恺一起坐在家（西坝河东里社区）里的沙发上看非诚勿扰。百度公司的名字源于“众里寻他千百度”这诗句。 After wordseg segments these sentences, the result for each sentence is as follows: 张晓晨[PER] 和付仲恺[PER] 一起坐在家（西坝河东里社区[LOC] ）里的沙发[PDT] 上看非诚勿扰。百度公司[ORG] 的名字源于 “ 众里寻他千百度 ” 这诗句。 Above, if a token has attributes, they are appended to the corresponding token within "[]". Since wordseg uses a statistical model to segment text by context, for the same substring in different contexts, dif
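The dictionary-based maximum matching step named in the description is commonly implemented as forward maximum matching: scan left to right and, at each position, take the longest lexicon entry that matches. The Python sketch below illustrates that general technique only. It is not the wordseg code, the function name `forward_max_match` and the toy lexicon are hypothetical, and it omits the CRF labeling stage entirely.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation using the longest lexicon match.

    Characters that start no lexicon entry are emitted as single tokens.
    """
    if max_len is None:
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    # Toy lexicon, assumed for illustration only.
    lexicon = {"百度", "公司", "百度公司", "的", "名字"}
    print(forward_max_match("百度公司的名字", lexicon))
```

Greedy matching is fast but cannot resolve ambiguity from context, which is one reason a segmenter might pair it with a statistical model such as a CRF.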

Downloads: 0 This Week

Last Update: 2014-03-05
See Project
20

stocleka

stocleka is a project divided into a UI and a library for cleaning user stories and converting them to ARFF files (used by Weka). It is mainly intended for research and scientific purposes.
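To make the ARFF target format concrete, the following Python sketch writes a list of user stories as ARFF text. It is not stocleka's code: the function name `to_arff`, the attribute names, and the whitespace-collapsing "cleaning" step are all assumptions chosen for illustration.

```python
def to_arff(relation, stories):
    """Render user stories as ARFF text with a string and a numeric attribute."""

    def quote(value):
        # Quote string values; escape backslashes and single quotes.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    lines = [
        f"@relation {relation}",
        "",
        "@attribute story string",
        "@attribute word_count numeric",
        "",
        "@data",
    ]
    for story in stories:
        # Hypothetical cleaning step: collapse runs of whitespace.
        cleaned = " ".join(story.split())
        lines.append(f"{quote(cleaned)},{len(cleaned.split())}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_arff("user_stories", ["As a user,  I want to log in"]))
```

The header declares each attribute and its type, and every line after `@data` is one instance with comma-separated values in the same order.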

Downloads: 0 This Week

Last Update: 2013-04-12
See Project
21

texrex

Web corpus creation software (moved to GitHub)

This project has moved to GitHub: https://github.com/rsling/texrex https://github.com/rsling/cow

Downloads: 0 This Week

Last Update: 2016-04-20
See Project
22

wquery

WQuery is a domain-specific query language designed to process WordNet-like lexical databases. It may be used as a standalone application or as an API to a lexical database in Java-based systems.

Downloads: 0 This Week

Last Update: 2013-05-02
See Project