Elasticsearch Study Notes: Installation, Data Import, and Queries

Date: 2022-11-17 13:44:59

Download Elasticsearch from the official website. These notes use version 6.2.1, the latest release at the time they were written.

https://www.elastic.co/downloads/elasticsearch

For the Chinese documentation, see

https://www.elastic.co/guide/cn/elasticsearch/guide/current/index.html

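Download the tar package and extract it under /usr/local. Change the owner and group of the extracted directory so that a non-root user can start Elasticsearch (it refuses to run as root), then start it with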

./bin/elasticsearch

Then open http://127.0.0.1:9200/ to confirm that the node is running.

Next, import some data in JSON format. The data looks like this

{"index":{"_id":"1"}}
{"title":"许宝江","url":"7254863","chineseName":"许宝江","sex":"男","occupation":" 滦县农业局局长","nationality":"中国"}
{"index":{"_id":"2"}}
{"title":"鲍志成","url":"2074015","chineseName":"鲍志成","occupation":"医师","nationality":"中国","birthDate":"1901年","deathDate":"1973年","graduatedFrom":"香港大学"}

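Note that the action line {"index":{"_id":"1"}} before each document and a trailing newline at the end of the file are both required.

Without them the request fails with status 400. The two error messages are, respectively,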

Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]
The bulk request must be terminated by a newline [\n]
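The two requirements above can be sketched in Python. This is a minimal illustration, not part of the original notes: the helper name build_bulk_body and the shortened documents are assumptions made for the example. It writes one action line before every document and always ends the body with a newline.

```python
import json

def build_bulk_body(docs):
    """Serialize (doc_id, source) pairs into the newline-delimited bulk format."""
    lines = []
    for doc_id, source in docs:
        # Action/metadata line: tells Elasticsearch what to do with the next line.
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        # Source line: the document itself, kept on a single line.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The body must end with a newline, otherwise the request is rejected.
    return "\n".join(lines) + "\n"

# Shortened versions of the two sample documents (illustrative only).
docs = [
    (1, {"title": "许宝江", "nationality": "中国"}),
    (2, {"title": "鲍志成", "nationality": "中国"}),
]
body = build_bulk_body(docs)
print(body, end="")
```

Writing this string to a file in UTF-8 should give something that can be passed to curl with --data-binary, as in the import command.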

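Use the following command to import the JSON file

Here people.json is the path to the file; it can also be a full path such as /home/common/下载/xxx.json

In the URL, es is the index and people is the type. In the Elasticsearch 6.x used here, an index and a type can be loosely compared to a database and a table in a relational database, and both are required. (Mapping types were deprecated in 7.0 and removed in 8.0.)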

curl -H "Content-Type: application/json" -XPOST 'localhost:9200/es/people/_bulk?pretty&refresh' --data-binary "@people.json"

On success the HTTP status is 200 and the response body looks like this

{
  "took" : 233,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "es",
        "_type" : "people",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "forced_refresh" : true,
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "es",
        "_type" : "people",
        "_id" : "2",
        "_version" : 1,
        "result" : "created",
        "forced_refresh" : true,
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}
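A bulk request can return HTTP 200 even when individual items fail; in that case "errors" is true and the failing item carries its own status and error. The sketch below shows one way to pick out such items. The function name failed_items and the abridged failing response are hypothetical and only serve the illustration.

```python
def failed_items(bulk_response):
    """Return (id, status, error) for every item that did not succeed."""
    failures = []
    if not bulk_response.get("errors"):
        return failures
    for item in bulk_response["items"]:
        # Each item has a single key naming the action (index, create, update, delete).
        for result in item.values():
            status = result.get("status", 500)
            if status >= 300:
                failures.append((result.get("_id"), status, result.get("error")))
    return failures

# Hypothetical, abridged response in which the second document was rejected.
sample = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "result": "created", "status": 201}},
        {"index": {"_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse"}}},
    ],
}
failures = failed_items(sample)
for doc_id, status, error in failures:
    print(doc_id, status, error["type"])
```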

Next, query the data with the query statements below.

<1> Query by ID

http://localhost:9200/es/people/1

<2> Simple match query: find documents in which a field contains a keyword (GET)

http://localhost:9200/es/people/_search?q=_id:1
http://localhost:9200/es/people/_search?q=title:许
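A browser percent-encodes the Chinese keyword automatically, but a script has to do it explicitly. The following sketch builds the same search URL with the standard library; the helper name search_url and its parameters are assumptions for the example.

```python
from urllib.parse import urlencode

def search_url(index, doc_type, field, keyword, host="http://localhost:9200"):
    """Build a URI search of the form /index/type/_search?q=field:keyword."""
    # urlencode percent-encodes the colon and the UTF-8 bytes of the keyword.
    query = urlencode({"q": f"{field}:{keyword}"})
    return f"{host}/{index}/{doc_type}/_search?{query}"

url = search_url("es", "people", "title", "许")
print(url)  # http://localhost:9200/es/people/_search?q=title%3A%E8%AE%B8
```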

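<3> Multi-field query: search for a keyword across several fields (POST)

You can use the RESTer add-on for Firefox to build a POST request; the Poster add-on I used before stopped working after the upgrade to Firefox Quantum.

Search the title and sex fields for documents containing the character 许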

{
  "query": {
    "multi_match": {
      "query": "许",
      "fields": ["title", "sex"]
    }
  }
}

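You can also control what the response contains:

size sets the number of hits to return

from sets the offset of the first hit to return (the number of hits to skip), not a document ID

_source lists the fields to return

highlight lists the fields in which matching terms are highlighted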

{
  "query": {
    "multi_match": {
      "query": "中国",
      "fields": ["nationality", "sex"]
    }
  },
  "size": 2,
  "from": 0,
  "_source": ["title", "sex", "nationality"],
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}
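Because from is an offset rather than an ID, paging through results means computing it from a page number. The sketch below assembles the same request body for an arbitrary page. The function name paged_multi_match and the 1-based page convention are assumptions made for this example.

```python
import json

def paged_multi_match(query, fields, page, page_size, source_fields,
                      highlight_fields=()):
    """Build a multi_match request body for a 1-based page number."""
    body = {
        "query": {"multi_match": {"query": query, "fields": list(fields)}},
        "size": page_size,
        # Offset of the first hit on this page: skip all earlier pages.
        "from": (page - 1) * page_size,
        "_source": list(source_fields),
    }
    if highlight_fields:
        body["highlight"] = {"fields": {name: {} for name in highlight_fields}}
    return body

body = paged_multi_match(
    "中国", ["nationality", "sex"], page=3, page_size=2,
    source_fields=["title", "sex", "nationality"], highlight_fields=["title"],
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

With page=1 and page_size=2 this produces the same size and from values as the body shown above.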

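<4> Boosting

Boosting raises the weight of a field: the score contributed by a match in that field is multiplied by a factor, which changes the resulting _score and max_score. In the query below, nationality^3 gives nationality three times the weight of sex.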

{
  "query": {
    "multi_match": {
      "query": "中国",
      "fields": ["nationality^3", "sex"]
    }
  },
  "size": 2,
  "from": 0,
  "_source": ["title", "sex", "nationality"],
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}
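To make the effect of the ^3 suffix concrete, here is a deliberately simplified model. It assumes the default best_fields behaviour of multi_match with no tie breaker, where the best-scoring field wins, and it uses made-up raw scores; real scores come from the similarity model inside Elasticsearch, so treat the numbers as purely illustrative.

```python
def parse_field(spec):
    """Split "name^boost" into (name, boost); the boost defaults to 1.0."""
    name, sep, boost = spec.partition("^")
    return name, float(boost) if sep else 1.0

def best_field_score(raw_scores, field_specs):
    """Toy model: scale each field's raw score by its boost, keep the best one."""
    scored = []
    for spec in field_specs:
        name, boost = parse_field(spec)
        if name in raw_scores:
            scored.append(raw_scores[name] * boost)
    return max(scored, default=0.0)

# Hypothetical per-field scores for one document.
raw = {"nationality": 0.5, "sex": 0.75}
print(best_field_score(raw, ["nationality", "sex"]))    # 0.75
print(best_field_score(raw, ["nationality^3", "sex"]))  # 1.5
```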

<5> Compound (bool) queries, which can express more complex conditions

AND -> must

NOT -> must_not

OR -> should

{
  "query": {
    "bool": {
      "must": {
        "bool": {
          "should": [
            { "match": { "title": "鲍" }},
            { "match": { "title": "许" }}
          ],
          "must": { "match": { "nationality": "中国" }}
        }
      },
      "must_not": { "match": { "sex": "女" }}
    }
  }
}
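One detail of the query above is easy to miss: when a bool query contains a must clause, its should clauses become optional unless minimum_should_match is set, so they only influence the score. The toy evaluator below mimics that rule on plain dicts. It is a sketch, not how Elasticsearch works internally: match is replaced by substring containment, and the third document is hypothetical.

```python
import copy

def _as_list(value):
    return value if isinstance(value, list) else [value]

def matches(clause, doc):
    """Evaluate a tiny subset of the query DSL against a dict (toy semantics)."""
    if "match" in clause:
        field, text = next(iter(clause["match"].items()))
        # Stand-in for full-text matching: plain substring containment.
        return text in doc.get(field, "")
    bool_q = clause["bool"]
    must = _as_list(bool_q.get("must", []))
    should = _as_list(bool_q.get("should", []))
    must_not = _as_list(bool_q.get("must_not", []))
    # should is only required when there is no must clause.
    default_minimum = 1 if should and not must else 0
    minimum = bool_q.get("minimum_should_match", default_minimum)
    if not all(matches(c, doc) for c in must):
        return False
    if any(matches(c, doc) for c in must_not):
        return False
    return sum(matches(c, doc) for c in should) >= minimum

query = {"bool": {
    "must": {"bool": {
        "should": [{"match": {"title": "鲍"}}, {"match": {"title": "许"}}],
        "must": {"match": {"nationality": "中国"}},
    }},
    "must_not": {"match": {"sex": "女"}},
}}
strict = copy.deepcopy(query)
strict["bool"]["must"]["bool"]["minimum_should_match"] = 1

docs = [
    {"title": "许宝江", "sex": "男", "nationality": "中国"},
    {"title": "鲍志成", "nationality": "中国"},
    {"title": "张三", "sex": "男", "nationality": "中国"},  # hypothetical
]
print([matches(query, d) for d in docs])   # [True, True, True]
print([matches(strict, d) for d in docs])  # [True, True, False]
```

If the intent is "title contains 鲍 or 许, and nationality is 中国", adding "minimum_should_match": 1 to the inner bool is one way to enforce it.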

<6> Fuzzy query (POST)

{
  "query": {
    "multi_match": {
      "query": "厂长",
      "fields": ["title", "sex", "occupation"],
      "fuzziness": "AUTO"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 1
}

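Here fuzzy matching lets the query 厂长 (factory director) match 局长 (bureau director).

With "fuzziness": "AUTO" the allowed edit distance depends on the length of the term: 0 for terms of 1 or 2 characters, 1 for terms of 3 to 5 characters, and 2 for terms longer than 5 characters.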
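The AUTO rule can be written down in a few lines. The sketch below pairs it with a plain Levenshtein distance; Elasticsearch by default also counts a transposition of two adjacent characters as a single edit, which this simplified version does not. Under the rule, a two-character term such as 厂长 is allowed 0 edits, so the hit on 局长 may come from the shared single-character token 长 rather than from fuzziness; that reading assumes the default standard analyzer, which the notes do not confirm.

```python
def auto_fuzziness(term):
    """Allowed edit distance for "fuzziness": "AUTO", based on term length."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

def fuzzy_match(query_term, indexed_term):
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)

print(auto_fuzziness("厂长"), edit_distance("厂长", "局长"))  # 0 1
print(fuzzy_match("serch", "search"))                        # True
```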

<7> Wildcard query (POST)

? matches any single character

* matches zero or more characters

{
  "query": {
    "wildcard": {
      "title": "*宝"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 1
}
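A wildcard query is a term-level query: the pattern has to cover an entire indexed term, not just part of the original text. The sketch below imitates the two wildcard characters with Python regular expressions; the helper names are assumptions for the example. It shows why "*宝" can match a term that ends in 宝 but not the whole string 许宝江, which suggests the title field was split into smaller terms at index time.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? and * into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    # fullmatch: the pattern must cover the whole term.
    return wildcard_to_regex(pattern).fullmatch(term) is not None

print(wildcard_match("*宝", "宝"))       # True
print(wildcard_match("*宝", "许宝江"))   # False
print(wildcard_match("许?江", "许宝江"))  # True
```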

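<8> Regexp query (POST)

The sample documents shown earlier have no authors field, so against that data this query returns no hits; replace authors with a field that exists in your index.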

{
  "query": {
    "regexp": {
      "authors": "t[a-z]*y"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 3
}
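Like wildcard, regexp is matched against whole indexed terms, so the pattern is implicitly anchored at both ends. The sketch below uses Python's re.fullmatch to show which terms a pattern like t[a-z]*y would accept. The term list is made up, and Python's regular expression syntax is not identical to the Lucene syntax Elasticsearch uses, although this simple pattern is written the same way in both.

```python
import re

def regexp_match(pattern, term):
    # fullmatch mirrors the implicit anchoring: the pattern must cover the whole term.
    return re.fullmatch(pattern, term) is not None

terms = ["tony", "toy", "tonya", "ty", "story"]  # hypothetical indexed terms
hits = [t for t in terms if regexp_match("t[a-z]*y", t)]
print(hits)  # ['tony', 'toy', 'ty']
```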

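<9> Match Phrase query (POST)

A match phrase query requires that every term in the query string be present in the document, in the same order as in the query string, and adjacent to one another.

By default the terms must be directly adjacent, but you can set slop to specify how far apart the terms may be while still counting as a match.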

{
  "query": {
    "multi_match": {
      "query": "许长江",
      "fields": ["title", "sex", "occupation"],
      "type": "phrase"
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 3
}

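Note that slop distances are cumulative: 滦农局 is a distance of 2 away from 滦县农业局, so slop must be at least 2 for the query below to match.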

{
  "query": {
    "multi_match": {
      "query": "滦农局",
      "fields": ["title", "sex", "occupation"],
      "type": "phrase",
      "slop": 2
    }
  },
  "_source": ["title", "sex", "occupation"],
  "size": 3
}
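The distance of 2 can be reproduced with a simplified position model. The sketch assumes that the text is indexed as single-character tokens and that each query token occurs only once in the document; the real phrase matcher handles repeated terms and other cases this toy version ignores. For every query token it computes how far its position in the document is shifted relative to its position in the query, and takes the spread of those shifts as the required slop.

```python
def phrase_distance(query_tokens, doc_tokens):
    """Spread of position shifts between query and document, or None if a token is missing."""
    if not query_tokens:
        return None
    offsets = []
    for q_pos, token in enumerate(query_tokens):
        if token not in doc_tokens:
            return None
        # Shift of this token: document position minus query position.
        offsets.append(doc_tokens.index(token) - q_pos)
    return max(offsets) - min(offsets)

def phrase_match(query_tokens, doc_tokens, slop=0):
    distance = phrase_distance(query_tokens, doc_tokens)
    return distance is not None and distance <= slop

doc = list("滦县农业局")    # assumed single-character tokens
query = list("滦农局")
print(phrase_distance(query, doc))       # 2
print(phrase_match(query, doc, slop=1))  # False
print(phrase_match(query, doc, slop=2))  # True
```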

<10> Match Phrase Prefix query (POST)