How to Use Chinese Word Segmentation for Full-Text Search in a Java Spring Boot Project

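There are two common routes to Chinese word segmentation plus full-text search in Spring Boot:
1) Lightweight and simple: a MySQL full-text index with a Chinese-capable parser (suited to modest data volumes)
2) Professional and efficient: Elasticsearch + the IK analyzer (suited to large data volumes and high concurrency)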

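I. Simple option: MySQL + Chinese word segmentation (recommended for getting started quickly)

1. Set up a Chinese word segmentation parser for MySQL

MySQL's default full-text parser splits text on whitespace and punctuation, so it handles Chinese poorly. A Chinese-capable parser is required:

ngram parser (built into MySQL 5.7.6 and later; the simplest choice)
mysql-jieba segmentation plugin (better results, but must be installed manually)

1) Use the built-in ngram parser (simplest)

Edit my.cnf / my.ini: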

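[mysqld]
# Token length in characters for the ngram parser; 2 is also the default
ngram_token_size = 2
# MyISAM setting for the built-in parser; the ngram parser does not use it
ft_min_word_len = 1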

Restart MySQL so the setting takes effect.
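To make the effect of ngram_token_size = 2 concrete, here is a small Java sketch of bigram tokenization. It only illustrates the idea and is not MySQL's implementation: the names NgramSketch and ngrams are made up for this example, and the sketch does not reproduce how the real parser treats whitespace or stopwords.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: slices text into overlapping n-character tokens. */
final class NgramSketch {

    /** Returns every run of n consecutive characters, counted as Unicode code points. */
    static List<String> ngrams(String text, int n) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + n <= codePoints.length; i++) {
            tokens.add(new String(codePoints, i, n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("全文检索", 2)); // [全文, 文检, 检索]
        System.out.println(ngrams("全", 2));       // []
    }
}
```

A keyword shorter than the token size produces no tokens at all, which is why single-character searches typically return nothing when the token size is 2.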

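2. Create the table and the full-text index

Assuming an article table with title and content columns already exists, add the full-text index:

ALTER TABLE article
    ADD FULLTEXT INDEX idx_ft_title_content (title, content) WITH PARSER ngram;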

3. Query from Spring Boot (MyBatis / MyBatis-Plus)

<select id="searchByKeyword" resultType="Article">
    SELECT * FROM article
    WHERE MATCH(title, content) AGAINST(#{keyword} IN NATURAL LANGUAGE MODE)
</select>

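Pros: almost no learning curve and no extra service to run. Cons: segmentation quality is mediocre, and so is performance on large data sets.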

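II. Enterprise option: Elasticsearch + IK analyzer (recommended for production projects)

1. Environment setup

Install Elasticsearch
Install the IK analyzer (elasticsearch-analysis-ik); the plugin version must match the Elasticsearch version
IK provides two segmentation modes:
- ik_smart: coarse-grained segmentation
- ik_max_word: fine-grained, exhaustive segmentation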
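The difference between coarse-grained and fine-grained segmentation can be illustrated with a toy dictionary-based segmenter. This is a sketch under strong assumptions and not IK's actual algorithm: the five-word dictionary, the class name SegmentationSketch, and the methods coarse and fine are all hypothetical, and real IK output depends on its own dictionaries and disambiguation rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy segmenter contrasting coarse and fine granularity (not the IK implementation). */
final class SegmentationSketch {
    // Hypothetical mini dictionary, for illustration only
    static final Set<String> DICT = Set.of("中华人民共和国", "中华", "人民", "共和国", "国歌");
    static final int MAX_WORD_LEN = 7; // longest dictionary entry, in chars

    /** Coarse: forward maximum matching, i.e. take the longest dictionary word at each position. */
    static List<String> coarse(String text) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + MAX_WORD_LEN);
            while (end > i + 1 && !DICT.contains(text.substring(i, end))) {
                end--;
            }
            out.add(text.substring(i, end)); // falls back to a single character
            i = end;
        }
        return out;
    }

    /** Fine: emit every dictionary word found at every start position, overlaps included. */
    static List<String> fine(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (int end = Math.min(text.length(), i + MAX_WORD_LEN); end > i; end--) {
                String candidate = text.substring(i, end);
                if (DICT.contains(candidate)) {
                    out.add(candidate);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(coarse("中华人民共和国国歌")); // [中华人民共和国, 国歌]
        System.out.println(fine("中华人民共和国国歌"));   // [中华人民共和国, 中华, 人民, 共和国, 国歌]
    }
}
```

The fine-grained output contains more, overlapping terms, so it tends to suit indexing (more ways to be found), while the coarse output tends to suit query parsing (fewer, more specific terms).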

2. Integrate Elasticsearch with Spring Boot

1) Add the dependency


<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

2) application.yml


spring:
  elasticsearch:
    uris: http://localhost:9200

3) Entity class (specifying the IK analyzers)

@Document(indexName = "article")
public class Article {
    @Id
    private Long id;
    @Field(type = FieldType.Text, analyzer = "ik_max_word", searchAnalyzer = "ik_smart")
    private String title;
    @Field(type = FieldType.Text, analyzer = "ik_max_word", searchAnalyzer = "ik_smart")
    private String content;
}

4) Repository

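import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface ArticleRepository extends ElasticsearchRepository<Article, Long> {
    // Multi-field full-text search; pass the same keyword for both title and content
    Page<Article> findByTitleOrContent(String title, String content, Pageable pageable);
}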

5) More precise search with NativeQuery

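import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.client.elc.NativeQuery;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.Query;

@Autowired
private ElasticsearchOperations elasticsearchOperations;

public List<Article> search(String keyword) {
    // NativeQuery uses the new Elasticsearch Java client's lambda DSL,
    // not the legacy QueryBuilders from the RestHighLevelClient
    Query query = NativeQuery.builder()
        .withQuery(q -> q.multiMatch(m -> m
            .query(keyword)
            .fields("title", "content")
            .analyzer("ik_smart")))
        .withPageable(PageRequest.of(0, 10)) // first page, 10 hits
        .build();
    SearchHits<Article> hits = elasticsearchOperations.search(query, Article.class);
    return hits.stream()
        .map(SearchHit::getContent)
        .toList();
}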

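III. Even lighter: local full-text search with Lucene (no dependency on ES or MySQL)

Suited to small single-machine projects; embedding the JARs is enough:

Dependencies: lucene-core plus the SmartCN Chinese analyzer (lucene-analyzers-smartcn up to Lucene 8.x, renamed lucene-analysis-smartcn from Lucene 9)
Build the index files locally and search them by keyword

Pros: no middleware, implemented purely in code. Cons: no distributed support, and not suited to massive data sets.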
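The core idea behind a local index, an inverted index from terms to document ids, can be sketched in plain Java. This is not Lucene's API (a real setup would use classes such as IndexWriter, IndexSearcher, and SmartChineseAnalyzer); TinyInvertedIndex and its methods are hypothetical names, and it assumes simple bigram terms in place of a real Chinese analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Minimal in-memory inverted index over character bigrams (illustration only). */
final class TinyInvertedIndex {
    // term -> sorted ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Splits text into overlapping two-character terms. */
    private static List<String> bigrams(String text) {
        int[] cps = text.codePoints().toArray();
        List<String> terms = new ArrayList<>();
        for (int i = 0; i + 2 <= cps.length; i++) {
            terms.add(new String(cps, i, 2));
        }
        return terms;
    }

    /** Indexes one document under every bigram it contains. */
    void add(int docId, String text) {
        for (String term : bigrams(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns ids of documents that contain every bigram of the keyword (AND semantics). */
    Set<Integer> search(String keyword) {
        Set<Integer> result = null;
        for (String term : bigrams(keyword)) {
            Set<Integer> ids = postings.getOrDefault(term, Set.of());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? Set.of() : result; // keyword too short to tokenize
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Spring Boot 全文检索入门");
        index.add(2, "中文分词原理");
        index.add(3, "全文索引与中文分词");
        System.out.println(index.search("中文分词")); // [2, 3]
        System.out.println(index.search("全文"));     // [1, 3]
    }
}
```

Because the sketch only checks that each bigram occurs somewhere in the document, it can return false positives for longer keywords; Lucene avoids this by also storing term positions and scoring matches.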

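IV. Quick selection guide

Fewer than 1 million rows, simple and fast → MySQL + ngram
Medium to large projects that need highlighting, aggregations, pagination, or high concurrency → Elasticsearch + IK
Single-machine utilities where you do not want to install a service → Lucene + a Chinese analyzer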
