Best Open Source Desktop Operating Systems Natural Language Processing (NLP) Tools 2026

TXM

Unicode XML TEI text analysis platform

TXM is a free and open-source, cross-platform, Unicode- and XML-based text analysis environment and graphical client that supports Windows, Linux and Mac OS X. It can also be used online as a J2EE-compliant web portal (GWT-based) with built-in access control. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en. TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full-text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org). Read the scientific background at the Textométrie project web site http://textometrie.ens-lyon.fr/?lang=en. Read a full description at the TEI Tools wiki http://wiki.tei-c.org/index.php/TXM.
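To give a rough idea of what a frequency list and a concordance are, here is a minimal sketch in plain Python. It does not use CQP, R, or any part of TXM; the function names `frequency_list` and `concordance` and the `width` parameter are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def frequency_list(tokens):
    """Return (token, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context rows as (left context, keyword, right context).

    `width` is the number of tokens kept on each side of the keyword.
    """
    rows = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


if __name__ == "__main__":
    sample = "the cat sat on the mat".split()
    print(frequency_list(sample))
    for left, key, right in concordance(sample, "the", width=2):
        print(f"{left:>10} [{key}] {right}")
```

A real corpus platform would index the text first so that such queries stay fast on large collections; this sketch scans the token list directly.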

Downloads: 12 This Week

Last Update: 2024-12-09

Darkbot

The IRC's Talking Robot

[ Please read https://sourceforge.net/p/darkbot/news/2014/01/darkbots-revitalization/ ] Darkbot is a portable IRC chat robot written in C that can be taught responses to user inquiries and can even hold conversations with users. Darkbot was originally created by Jason Hamilton as an aid for help channels on Internet Relay Chat.

Downloads: 6 This Week

Last Update: 2014-07-02

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. It is a command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies and web resources. CRGREP will search resources within resources in any combination and to any depth, for example text within a document within a zip archive. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/). The actual development and issue tracking can be found here: https://bitbucket.org/cryanfuse/crgrep
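The idea of searching resources within resources can be sketched as a recursive walk. The example below is not CRGREP's implementation or command-line interface; it is a minimal Python illustration limited to zip archives nested inside zip archives, and the function name `search_zip` is hypothetical.

```python
import io
import zipfile


def search_zip(data, pattern, path=""):
    """Yield (resource_path, line_number, line) for each line containing
    `pattern` in a zip archive given as bytes, recursing into nested zips."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            inner = archive.read(name)
            inner_path = f"{path}/{name}" if path else name
            if zipfile.is_zipfile(io.BytesIO(inner)):
                # The member is itself an archive: descend one level.
                yield from search_zip(inner, pattern, inner_path)
                continue
            # Assumption: members are treated as UTF-8 text; bad bytes are replaced.
            text = inner.decode("utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), 1):
                if pattern in line:
                    yield inner_path, lineno, line
```

Reporting the full nested path (for example `inner.zip/notes.txt`) is what lets a match several levels deep be traced back to its container.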

3 Reviews

Downloads: 1 This Week

Last Update: 2023-04-23

Arabic Corpus

Text categorization, Arabic language processing, language modeling

The Arabic Corpus was compiled by Dr. Mourad Abbas (http://sites.google.com/site/mouradabbas9/corpora). The Khaleej-2004 corpus contains 5690 documents divided into 4 topics (categories). The Watan-2004 corpus contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora should cite the two main references:

(1) For the Watan-2004 corpus: M. Abbas, K. Smaili, D. Berkani (2011). Evaluation of Topic Identification Methods on Arabic Corpora. Journal of Digital Information Management, vol. 9, no. 5, pp. 185-192.

(2) For the Khaleej-2004 corpus: M. Abbas, K. Smaili (2005). Comparison of Topic Identification Methods for Arabic Language. RANLP05: Recent Advances in Natural Language Processing, pp. 14-17, 21-23 September 2005, Borovets, Bulgaria.

More useful references: https://sites.google.com/site/mouradabbas9/corpora

Downloads: 2 This Week

Last Update: 2019-03-05

Bermuda Text-to-Speech

This project includes basic NLP and DSP techniques for Text-to-Speech

See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.

Downloads: 0 This Week

Last Update: 2014-03-24

MII Medical NLP Toolkit

This is a toolkit for medical natural language processing (NLP). The core engine is general enough to be used in a variety of text processing domains, though the toolkit includes specific support for medical reports and patient de-identification.

Downloads: 0 This Week

Last Update: 2014-07-02

Modular Suite of NLP Tools

This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.

Downloads: 0 This Week

Last Update: 2014-06-09

OBO Annotator

The OBO-Annotator is a semantic NLP tool designed to give its end users a great deal of flexibility: they can combine any number of OBO ontologies from the OBO Foundry, regardless of their format, and use them to annotate text-bases.

Downloads: 0 This Week

Last Update: 2014-10-08

SALM

A toolkit that uses suffix array indexing for empirical natural language processing. It provides functions such as searching for occurrences of n-grams in a corpus, and a suffix array language model that can use an arbitrarily long history.
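As an illustration of how a suffix array supports n-gram lookup, here is a minimal Python sketch. It is not SALM's code or API; the names `build_suffix_array` and `count_ngram` are hypothetical, and the naive construction below is only suitable for small inputs.

```python
def build_suffix_array(tokens):
    """Return the start positions of all token suffixes in lexicographic order.

    Naive construction: fine for a demo, too slow for a large corpus.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def _bound(tokens, suffix_array, ngram, upper):
    """Binary search for the first suffix whose n-token prefix is
    >= ngram (upper=False) or > ngram (upper=True)."""
    n = len(ngram)
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        prefix = tokens[start:start + n]
        if prefix < ngram or (upper and prefix == ngram):
            lo = mid + 1
        else:
            hi = mid
    return lo


def count_ngram(tokens, suffix_array, ngram):
    """Count occurrences of `ngram` (a sequence of tokens) in `tokens`."""
    ngram = list(ngram)
    first = _bound(tokens, suffix_array, ngram, upper=False)
    last = _bound(tokens, suffix_array, ngram, upper=True)
    return last - first


if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ran".split()
    sa = build_suffix_array(corpus)
    print(count_ngram(corpus, sa, ["the", "cat"]))
```

Because all suffixes that begin with the same n-gram sit next to each other in the sorted array, the count is the width of that block, found with two binary searches regardless of the n-gram's length.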

Downloads: 0 This Week

Last Update: 2015-04-16

UniBurma

NOTE: I couldn't keep this project aligned with the latest Unicode spec, and I'm not sure I will continue it. You can try Myanmar3 from Myanmar NLP, WinUniInnwa, https://sourceforge.net/projects/prahita/ or another, more compliant font. ~Victor --- [This is the UniBurma - UniMM project workshop area. This project currently has two products, UniBurma and UniMM. For more information about this project, please visit http://unimm.org/. You can browse the latest source from the SVN trunk.]

Downloads: 0 This Week

Last Update: 2023-07-22

training with NLP

Making training programs work with chunking strategies

Strategies make up everything we do. Some of our thinking strategies are effective; others are less so. It has proven possible to identify elements of successful strategies and apply them to training in all kinds of skills. By looking at the practices of successful people and breaking down the parts of their effective behaviours, we can teach these parts in order to improve particular areas of effectiveness. This is the basis of this learning and teaching strategy.

Downloads: 0 This Week

Last Update: 2013-08-19
