Search Results for "chinese word segmentation"

Showing 141 open source projects for "chinese word segmentation"

  • 1
    IK Analysis for Elasticsearch

    A plugin that integrates Lucene IK analyzer into elasticsearch

    IK Analyzer is an open source, lightweight Chinese word segmentation toolkit written in Java. Since the release of version 1.0 in December 2006, IKAnalyzer has gone through 4 major versions. Initially, it was a Chinese word segmentation component built mainly for the open source project Lucene, combining dictionary-based word segmentation with grammar analysis algorithms.
    Downloads: 0 This Week
  • 2
    gse

    Go efficient multilingual NLP and text segmentation

    Efficient multilingual NLP and text segmentation in Go, with support for English, Chinese, Japanese, and other languages. Gse is a Go implementation of jieba that aims to add NLP support and more features. It supports multiple word segmentation modes (common, search engine, full, precise, and HMM), user and embedded dictionaries, part-of-speech (POS) tagging, segment info analysis, and stop-word and trim handling.
    Downloads: 1 This Week
  • 3
    ZhParser

    PostgreSQL extension for full-text search of Chinese language

    zhparser is a PostgreSQL extension for full-text search of Chinese text. It integrates with PostgreSQL's text search engine to tokenize Chinese characters using a dictionary-based segmentation algorithm. zhparser is a valuable tool for improving search accuracy and performance in Chinese-language applications.
    Downloads: 1 This Week
  • 4
    HanLP

    Han Language Processing

    ...Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. It comes with pretrained models for numerous languages including Chinese and English. It offers efficient performance, clear structure and customizable features, with plenty more amazing features to look forward to on the roadmap.
    Downloads: 0 This Week
  • 5
    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 0 This Week
  • 6
    Dawarich

    Self-hostable alternative to Google Timeline

    Dawarich is a self-hostable alternative to Google Timeline for storing and visualizing your own location history.
    Downloads: 4 This Week
  • 7
    RealtimeSTT

    A robust, efficient, low-latency speech-to-text library

    RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.
    Downloads: 0 This Week
  • 8
    Fanyi

    A 🇨🇳 and 🇺🇸 translate tool in your command line

    Fanyi is a tool for translating words between the Chinese and English languages, right in your command line. It’s a good supportive tool for learning and reading the Chinese language from English, or the other way around. All translation data is fetched from iciba.com and fanyi.youdao.com, and with each translation comprehensive and related samples are given for better understanding and proper usage. There are translations for words as well as sentences, and in Mac/Linux bash, words can even...
    Downloads: 0 This Week
  • 9
    koishi-plugin-novelai

    Koishi plugin for NovelAI image generation with advanced controls

    ...It supports multiple configuration options, including model switching, sampler selection, and adjustable image sizes, giving users control over output quality and style. It includes advanced prompt syntax to refine results and allows automatic translation of Chinese keywords to improve usability across languages. A customizable banned word list helps filter unwanted content, while timed message recall can automatically delete generated outputs after a set period. Built on Koishi’s modular system, koishi-plugin-novelai can be extended with additional integrations, making it adaptable for different bot environments and use cases.
    Downloads: 2 This Week
  • 10
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 2 This Week
  • 11
    Collapse Launcher

    An Advanced Launcher for miHoYo/HoYoverse Games

    Collapse was originally designed for Honkai Impact 3rd. However, as the project evolved, the launcher became a game client for all currently released miHoYo games. The name comes from the title of Honkai Impact in Chinese and Japanese: Bēng huài in Chinese and Houkai in Japanese both mean "Collapse", which is why we chose it as the launcher name, with the added inspiration that this was meant to be an alternative (enhanced) launcher for Honkai Impact 3rd in the first place.
    Downloads: 8 This Week
  • 12
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 10 This Week
  • 13
    BaikalDB

    BaikalDB, A Distributed HTAP Database

    ...In a typical scenario, hundreds of millions of rows can be scanned and aggregated in a few seconds. BaikalDB also supports full-text search by building inverted indices after word segmentation. Users can enable the fuzzy search features simply by adding a FULLTEXT KEY type index when creating tables.
    Downloads: 0 This Week
  • 14
    Tally

    Your favorite dark mode word counter, now with even more themes!

    Tally - Word Counter is a free online tool to count the number of characters, words, paragraphs, and lines in your text. It can also show counts for different types of characters like letters, digits, spaces, punctuation, and symbols/special characters. Make sure you have the right number of words for your essay or post by counting them instantly with Tally.
    Downloads: 140 This Week
  • 15
    Qwen-Image

    Qwen-Image is a powerful image generation foundation model

    ...Qwen-Image supports sophisticated editing tasks such as style transfer, object insertion and removal, detail enhancement, and even human pose manipulation, making it suitable for both professional and casual users. It also includes advanced image understanding capabilities like object detection, semantic segmentation, depth and edge estimation, and novel view synthesis.
    Downloads: 11 This Week
  • 16
    The Ethers Project

    Complete Ethereum library and wallet implementation in JavaScript

    A complete Ethereum wallet implementation and utilities in JavaScript (and TypeScript). Keep your private keys in your client, safe and sound. Import and export JSON wallets (Geth, Parity and crowdsale). Import and export BIP 39 mnemonic phrases (12-word backup phrases) and HD Wallets (English as well as Czech, French, Italian, Japanese, Korean, Simplified Chinese, Spanish, Traditional Chinese). Meta-classes create JavaScript objects from any contract ABI, including ABIv2 and Human-Readable ABI. Connect to Ethereum nodes over JSON-RPC, INFURA, Etherscan, Alchemy, Ankr or MetaMask. ...
    Downloads: 0 This Week
  • 17
    Qwen-2.5-VL

    Qwen2.5-VL is the multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation...
    Downloads: 11 This Week
  • 18
    ChatGPT Academic

    ChatGPT extension for scientific research work

    ChatGPT extension for scientific research work, with a specially optimized academic paper polishing experience. It supports custom shortcut buttons, custom function plug-ins, markdown table display, dual display of TeX formulas, and complete code display. It adds project tree analysis for local Python/C++/Go projects, self-translation of the project's own source code, batch summarization of PDF and Word documents, and full-text translation of PDF papers. All buttons are...
    Downloads: 1 This Week
  • 19
    Apache OpenOffice

    The free and Open Source productivity suite

    Free alternative for Office productivity tools: Apache OpenOffice, formerly known as OpenOffice.org, is an open-source office productivity software suite containing word processor, spreadsheet, presentation, graphics, formula editor, and database management applications. OpenOffice is available in many languages, works on all common computers, stores data in ODF (the international open standard format), and is able to read and write files in other formats, including the format used by the...
    Downloads: 233,761 This Week
  • 20
    PaddleNLP

    Easy-to-use and powerful NLP library with Awesome model zoo

    PaddleNLP is a natural language processing development library for PaddlePaddle. It has three major features: an easy-to-use text-domain API, application examples for multiple scenarios, and high-performance distributed training. It aims to improve the modeling efficiency of PaddlePaddle developers in the text domain and provides rich examples of NLP applications. It offers industry-grade preset task capabilities (Taskflow) and a full-process text-domain API: the Dataset API for loading rich Chinese datasets, the Data API for flexible and efficient data preprocessing, the Embedding API with 60+ pretrained word vectors, and the Transformer API with 100+ pretrained models, which together can greatly improve the efficiency of NLP task modeling.
    Downloads: 0 This Week
  • 21
    Flying-Bird-Wallpaper

    Flying Bird Wallpaper is a feature-rich desktop wallpaper application

    Flying Bird Wallpaper is a feature-rich desktop wallpaper application that supports multiple wallpaper types including images, videos, rhythm wallpapers, and solid colors, making your desktop unique and vibrant.
    Downloads: 1 This Week
  • 22
    WinMerge

    Windows visual diff and merge for files and directories

    WinMerge is a Windows tool for visual difference display and merging, for both files and directories. It is highly useful for determining what has changed between file versions, and then merging those changes. WinMerge has Unicode support, Flexible syntax coloring editor, Visual SourceSafe integration, and Windows Shell integration. Regexp filtering for filenames and lines. Side-by-side line difference and highlights differences inside lines. A file map shows the overall file differences in...
    Downloads: 31,150 This Week
  • 23
    QPrompt

    Personal teleprompter software for all video makers.

    Free teleprompter software for all video creators. Built with ease of use, fast performance, control accuracy, and cross-platform support in mind. QPrompt works with studio teleprompters and tablet teleprompters, cellphones, and webcams. It was also designed for use in video conferences.
    Downloads: 1,403 This Week
  • 24
    phpList

    Powerful Open Source Email Marketing app with analytics & segmentation

    phpList delivers Open Source email marketing, including analytics, list segmentation, content personalisation and bounce processing. Extensive technical features and a secure and stable codebase are the result of over 17 years of continuous development. Used in 95 countries, available in 20+ languages, and used to send 25 billion email campaigns last year. Deploy it with your own SMTP server, or get a free hosted account at http://phplist.com.
    Downloads: 127 This Week
  • 25
    jEdit

    jEdit is a programmer's text editor written in Java.

    jEdit is a programmer's text editor written in Java. It uses the Swing toolkit for the GUI and can be configured as a rather powerful IDE through the use of its plugin architecture.
    Downloads: 587 This Week
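
Several entries in this listing mention dictionary-based word segmentation (IK Analyzer, zhparser) and building inverted indices after word segmentation (BaikalDB). The following is a minimal sketch of both ideas, not the implementation used by any project listed here. It assumes forward maximum matching as the segmentation strategy; the names `segment` and `build_inverted_index`, the toy dictionary, and the sample documents are all hypothetical.

```python
from collections import defaultdict


def segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position, take the longest
    dictionary word; fall back to a single character if none matches."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def build_inverted_index(docs, dictionary):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in segment(text, dictionary):
            index[token].add(doc_id)
    return index


# Toy data, made up for illustration only.
dictionary = {"中文", "分词", "全文", "搜索", "工具"}
docs = {1: "中文分词工具", 2: "全文搜索", 3: "中文全文搜索"}
index = build_inverted_index(docs, dictionary)

print(segment("中文分词工具", dictionary))
print(sorted(index["搜索"]))
```

Looking up a query term in `index` returns the ids of matching documents. Real segmenters typically add statistical models or ambiguity resolution on top of plain dictionary matching.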