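These notes cover two operations: indexing documents (insert) and querying documents (select).
1 An index has only one type
When indexing a document, use _doc in place of the type name.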
PUT /megacorp/_doc/3
{
"first_name" : "Douglas",
"last_name" : "Fir",
"age" : 35,
"about": "I like to build cabinets",
"interests": [ "forestry" ]
}
Fetch a single document by ID
GET /megacorp/_doc/3
Search for people whose last name is Smith
GET /megacorp/_search?q=last_name:Smith
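2 Search for people whose last name is Smith and who are older than 30, using the query DSL. Two approaches: (1) a AND b, with both clauses in must; (2) query on a and filter on b. A filter clause restricts the results without contributing to the relevance score.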
POST /megacorp/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"last_name": "Smith"
}
},
{
"range": {
"age": {
"gt": 30
}
}
}
]
}
}
}
POST /megacorp/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"last_name": "Smith"
}
}
],
"filter": {
"range": {
"age": {
"gt": 30
}
}
}
}
}
}
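The second request can also be assembled in code. The following is a minimal sketch, not a client library call: bool_query is a hypothetical helper that only builds the request body, and it passes filter as a list, which Elasticsearch accepts as well as the single object used above.

```python
import json


def bool_query(must, filters):
    """Build a search body: `must` clauses are scored, `filters` only restrict results."""
    return {"query": {"bool": {"must": list(must), "filter": list(filters)}}}


# Same conditions as the second request above: last_name matches Smith, age > 30.
body = bool_query(
    must=[{"match": {"last_name": "Smith"}}],
    filters=[{"range": {"age": {"gt": 30}}}],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of POST /megacorp/_search.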
3 Phrase search: the field must contain every term of the analyzed phrase, adjacent and in the same order
https://blog.csdn.net/sinat_29581293/article/details/81486761
GET /megacorp/_search
{
"query" : {
"match_phrase": {
"about" : "rock climbing"
}
}
}
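To make the matching rule concrete, here is a simplified sketch of what a phrase match checks. It is an illustration only: tokenize is a stand-in that lowercases and splits on whitespace, not the real analyzer, and phrase_matches is a hypothetical name.

```python
def tokenize(text):
    """Stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def phrase_matches(doc_text, phrase):
    """Return True if the phrase tokens appear consecutively and in order."""
    doc = tokenize(doc_text)
    wanted = tokenize(phrase)
    if not wanted:
        return False
    n = len(wanted)
    # Slide a window of n tokens over the document tokens.
    return any(doc[i:i + n] == wanted for i in range(len(doc) - n + 1))


print(phrase_matches("I love to go rock climbing", "rock climbing"))
print(phrase_matches("I like to collect rock albums", "rock climbing"))
```

A plain match query would accept the second document because it contains rock; the phrase check rejects it.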
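4 Inspect how a keyword is tokenized. The standard analyzer splits Chinese text into single characters and English text into words; the ik analyzer plugin provides ik_smart and ik_max_word.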
GET /megacorp/_analyze
{
"text": ["康师傅","rock climbing"],
"analyzer": "standard"
}
Response:
{
"tokens" : [
{
"token" : "康",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "师",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "傅",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
},
{
"token" : "rock",
"start_offset" : 4,
"end_offset" : 8,
"type" : "<ALPHANUM>",
"position" : 103
},
{
"token" : "climbing",
"start_offset" : 9,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 104
}
]
}
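The jump from position 2 to position 103 in the output above comes from the gap inserted between the values of an array. The sketch below reproduces those numbers under the assumption that the gap is 100 (the default position_increment_gap for text fields); token_positions is a hypothetical helper, and the token lists are typed in by hand rather than produced by an analyzer.

```python
POSITION_GAP = 100  # assumed default position_increment_gap


def token_positions(values):
    """Assign positions to tokens, adding a gap between consecutive array values."""
    result = []
    next_position = 0
    for index, tokens in enumerate(values):
        if index > 0:
            next_position += POSITION_GAP
        for token in tokens:
            result.append((token, next_position))
            next_position += 1
    return result


# Tokens as the standard analyzer returned them in the response above.
for token, position in token_positions([["康", "师", "傅"], ["rock", "climbing"]]):
    print(token, position)
```

The gap keeps a phrase query from matching across two separate array values.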
5 Inspect how a specific field tokenizes text at index time
GET /test/_analyze
{
"field": "t_name",
"text": ["康师傅","rock climbing"]
}
6 View the field mapping. The t_name field uses ik_max_word when indexing documents and ik_smart when searching.
https://segmentfault.com/a/1190000012553894?utm_source=tag-newest
http://localhost:9200/test/_mapping
t_name: {
type: "text",
similarity: "BM25",
fields: {
keyword: {
type: "keyword",
ignore_above: 256
}
},
analyzer: "ik_max_word",
search_analyzer: "ik_smart"
},
t_pyname: {
type: "text",
fields: {
keyword: {
type: "keyword",
ignore_above: 256
}
}
},
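A mapping like the one above could be declared when the index is created. The sketch below only builds the request body; it assumes Elasticsearch 7 or later (typeless mappings) and that the ik plugin is installed. text_field is a hypothetical helper, and the similarity setting is left out on the assumption that BM25 is already the default.

```python
import json


def text_field(analyzer=None, search_analyzer=None):
    """Hypothetical helper: a text field with a keyword subfield and optional analyzers."""
    field = {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
    if analyzer:
        field["analyzer"] = analyzer
    if search_analyzer:
        field["search_analyzer"] = search_analyzer
    return field


create_index_body = {
    "mappings": {
        "properties": {
            # Fine-grained tokens at index time, coarser tokens at search time.
            "t_name": text_field("ik_max_word", "ik_smart"),
            "t_pyname": text_field(),
        }
    }
}
print(json.dumps(create_index_body, indent=2))
```

The printed JSON would be the body of PUT /test on a cluster where the index does not exist yet.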
7 Highlight matched keywords
GET /megacorp/_search
{
"query" : {
"match_phrase": {
"about" : "rock climbing"
}
},
"highlight": {
"fields": {
"about": {}
}
}
}
8 The Elasticsearch equivalent of GROUP BY: aggregations (aggs), used for analysis and statistics
GET /megacorp/_search
{
"aggs": {
"all_inter": {
"terms": {
"field": "interests.keyword"
}
}
}
}
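9 Error when aggregating. Aggregating on a text field needs fielddata, which is disabled by default because it can use a lot of memory. Before aggregating, either enable fielddata on the field (the mapping update after the error message), or use the .keyword subfield as in section 8.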
Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead
PUT megacorp/_mapping
{
"properties": {
"interests": {
"type": "text",
"fielddata": true
}
}
}
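The choice between the two fixes can be read off the field mapping. The following sketch is one possible way to pick the field name for a terms aggregation; aggregatable_field is a hypothetical helper that works on a mapping fragment shaped like the ones returned by the _mapping endpoint.

```python
def aggregatable_field(name, field_mapping):
    """Return a field name usable in a terms aggregation, or None if there is none."""
    if field_mapping.get("type") != "text":
        return name  # keyword, numeric, date, ... can be aggregated directly
    if field_mapping.get("fielddata"):
        return name  # fielddata was explicitly enabled on the text field
    for sub_name, sub_mapping in field_mapping.get("fields", {}).items():
        if sub_mapping.get("type") == "keyword":
            return f"{name}.{sub_name}"  # prefer the keyword subfield
    return None


interests = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
print(aggregatable_field("interests", interests))
```

Preferring the keyword subfield avoids loading fielddata into memory at all.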
10 Aggregation is slow: try "execution_hint": "map". This mainly helps when the query matches only a small number of documents.
https://blog.csdn.net/laoyang360/article/details/79253294
GET /megacorp/_search
{
"query": {
"match": {
"last_name": "smith"
}
},
"aggs": {
"all_inter": {
"terms": {
"field": "interests", "execution_hint": "map"
}
}
}
}
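11 Query one field for multiple keywords (several search terms against the same field): interests contains music OR sports. The terms query does not analyze its input, so the values must match the indexed tokens exactly.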
GET /megacorp/_search
{
"query": {
"terms": {
"interests": [
"music",
"sports"
]
}
}
}
12 Query one field that must contain several keywords: interests contains music AND sports
GET /megacorp/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"interests": {
"value": "music"
}
}
},
{
"term": {
"interests": {
"value": "sports"
}
}
}
]
}
}
}
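The OR and AND variants differ only in how the keywords are wrapped, which a pair of small builders makes visible. This is a sketch that produces request bodies only; any_of and all_of are hypothetical names.

```python
def any_of(field, values):
    """OR: one terms query listing every accepted value."""
    return {"query": {"terms": {field: list(values)}}}


def all_of(field, values):
    """AND: one term clause per value, all required through bool/must."""
    clauses = [{"term": {field: {"value": value}}} for value in values]
    return {"query": {"bool": {"must": clauses}}}


print(any_of("interests", ["music", "sports"]))
print(all_of("interests", ["music", "sports"]))
```

Both bodies can be sent to GET /megacorp/_search unchanged.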
12 Query multiple fields for one keyword (the same search term against several fields)
https://blog.csdn.net/dm_vincent/article/details/41820537
GET /megacorp/_search
{
"query": {
"multi_match": {
"query": "Smith",
"fields": ["last_name","first_name"]
}
}
}
13 Sub-aggregations: compute statistics for each bucket by nesting an aggs inside an aggs
GET /megacorp/_search
{
"size":0,
"aggs": {
"all_inter": {
"terms": {
"field": "interests",
"execution_hint": "map"
},
"aggs": {
"avg_age": {
"avg": {
"field": "age"
}
}
}
}
}
}
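The response nests each sub-aggregation result inside its bucket. The sketch below flattens that structure into a simple mapping; bucket_averages is a hypothetical helper, and sample_response is made-up illustrative data in the shape Elasticsearch returns for a terms aggregation with an avg sub-aggregation, not output from a real cluster.

```python
def bucket_averages(response, terms_agg="all_inter", avg_agg="avg_age"):
    """Map each bucket key to the value of its nested avg aggregation."""
    buckets = response["aggregations"][terms_agg]["buckets"]
    return {bucket["key"]: bucket[avg_agg]["value"] for bucket in buckets}


# Illustrative data only.
sample_response = {
    "aggregations": {
        "all_inter": {
            "buckets": [
                {"key": "music", "doc_count": 2, "avg_age": {"value": 28.5}},
                {"key": "forestry", "doc_count": 1, "avg_age": {"value": 35.0}},
            ]
        }
    }
}
print(bucket_averages(sample_response))
```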
14 Multi-field search, for example matching a keyword against homophones, synonyms, visually similar characters, and so on
https://blog.csdn.net/questiontoomuch/article/details/48493991
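For homophones, add an extra field; for example, t_pyname holds the pinyin of t_name.
For synonyms, add an extra field such as t_shinglesname.
- Use a stemmer to index jumps, jumping, and jumped as their root form, jump. A user who searches for jumped then still matches documents that contain jumping.
- Include synonyms, such as jump, leap, and hop.
- Remove diacritics or tone marks: for example, ésta, está, and esta are all indexed as esta.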
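The last point can be tried outside Elasticsearch. The sketch below uses Python's standard unicodedata module to approximate what a folding token filter does at index time; it is an illustration of the idea, not the filter's actual implementation, and fold_diacritics is a hypothetical name.

```python
import unicodedata


def fold_diacritics(text):
    """Decompose accented characters, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for word in ["ésta", "está", "esta"]:
    print(word, "->", fold_diacritics(word))
```

All three inputs fold to the same token, so a search for any one of them can match documents containing the others.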