IK Analyzer is an open-source, lightweight Chinese word segmentation toolkit written in Java. Since version 1.0 was released in December 2006, IK Analyzer has gone through four major versions. It began as a Chinese word segmentation component built mainly for the open-source Lucene project, combining dictionary-based segmentation with grammar analysis algorithms. Starting with version 3.0, IK became a general-purpose word segmentation component for Java, independent of the Lucene project, while still providing a default implementation optimized for Lucene. The 2012 version added a simple algorithm for resolving segmentation ambiguity, marking the evolution of the IK tokenizer from pure dictionary-based segmentation toward simulated semantic segmentation.

Features

  • Uses a distinctive "forward iterative finest-grained segmentation algorithm" and supports two segmentation modes: fine-grained and smart
  • In the 2012 version, smart mode supports simple segmentation ambiguity resolution and merged output of numbers and quantifiers
  • Uses multiple sub-processors for analysis, handling English letters, numbers, and Chinese vocabulary, and is compatible with Korean and Japanese characters
  • Optimized dictionary storage with a smaller memory footprint
  • Supports user-defined dictionary extensions. In particular, the 2012 version's dictionary supports words that mix Chinese, English, and numbers
  • Provides a simple algorithm for resolving segmentation ambiguity
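
The difference between the two segmentation modes can be illustrated with a small, self-contained sketch. This is not IK Analyzer's code or API: the class `ToySegmenter`, its methods, and the dictionary passed in are hypothetical, and the "smart" method uses plain forward maximum matching as a stand-in for IK's own ambiguity resolution, whose rules are not described here. It also assumes every character fits in a single Java `char`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of IK Analyzer.
final class ToySegmenter {

    // Fine-grained mode (sketch): scan forward from every start position and
    // emit every dictionary word found there. A character that no dictionary
    // word covers is emitted on its own.
    static List<String> fineGrained(String text, Set<String> dict) {
        List<String> out = new ArrayList<>();
        boolean[] covered = new boolean[text.length()];
        for (int start = 0; start < text.length(); start++) {
            for (int end = start + 1; end <= text.length(); end++) {
                String candidate = text.substring(start, end);
                if (dict.contains(candidate)) {
                    out.add(candidate);
                    for (int k = start; k < end; k++) {
                        covered[k] = true;
                    }
                }
            }
            // Words that could cover this position start at or before it,
            // so coverage is final by the time we get here.
            if (!covered[start]) {
                out.add(text.substring(start, start + 1));
            }
        }
        return out;
    }

    // Smart mode (stand-in): forward maximum matching. At each position take
    // the longest dictionary word, or a single character if none matches,
    // then continue after it. The result has no overlapping tokens.
    static List<String> smart(String text, Set<String> dict) {
        List<String> out = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int bestEnd = start + 1; // single-character fallback
            for (int end = start + 1; end <= text.length(); end++) {
                if (dict.contains(text.substring(start, end))) {
                    bestEnd = end;
                }
            }
            out.add(text.substring(start, bestEnd));
            start = bestEnd;
        }
        return out;
    }
}
```

With a toy dictionary containing 中华, 中华人民, 中华人民共和国, 华人, 人民, 共和, 共和国, and 国歌, calling `fineGrained` on 中华人民共和国国歌 returns all eight overlapping words, while `smart` returns only 中华人民共和国 and 国歌. Fine-grained output of this kind tends to suit indexing, where recall matters; the non-overlapping output is closer to what a reader would consider the natural segmentation.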

License

Apache License V2.0

Additional Project Details

Operating Systems

Windows

Programming Language

Java

Related Categories

Java Browser Extensions and Plugins, Java Languages Software

Registered

2021-05-17