GET /scddb/_analyze
{
"text": "蓝瘦⾹菇",
"analyzer": "ik_max_word" //ik_smart
}
The tokenization result is as follows, and it is not ideal:
{
"tokens" : [
{
"token" : "蓝",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "瘦",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "⾹菇",
"start_offset" : 2,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 2
}
]
}
Refer here to add a custom IK dictionary:
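The referenced steps amount to registering a local extension dictionary with the IK plugin. A minimal sketch of its IKAnalyzer.cfg.xml (the config path, typically config/analysis-ik/ or plugins/ik/config/, varies by installation, and custom.dic is a placeholder file name):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- point ext_dict at the custom dictionary file, relative to this config directory -->
    <entry key="ext_dict">custom.dic</entry>
</properties>

custom.dic is a plain UTF-8 file listing one word per line, here just:

蓝瘦香菇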
Restart Elasticsearch: service elasticsearch restart
Test again:
{
"tokens" : [
{
"token" : "蓝瘦⾹菇",
"start_offset" : 0,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 0
}
]
}
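For the custom word to take effect on real queries, the IK analyzers also need to be set on the indexed fields, not just in ad-hoc _analyze calls. A minimal sketch; the index name test_ik and the field content are made up for illustration:

PUT /test_ik
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      }
    }
  }
}

Using ik_max_word at index time and ik_smart at search time is a commonly recommended combination: fine-grained tokens maximize recall in the index, while coarser search-time tokens keep queries precise.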