Elastic Search

Installing Elasticsearch in Docker

A Java environment is required.

Download the tar.gz archive, extract it, and move it into place:

mv elasticsearch-7.1.0 /usr/local/elasticsearch

Edit the configuration:

vi /usr/local/elasticsearch/config/elasticsearch.yml

Contents of the yml file:

network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["127.0.0.1", "[::1]"]
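# In 7.1, even a single-node setup must configure an initial master node; otherwise startup fails with:
# the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
# "node-1" must match this node's node.name
cluster.initial_master_nodes: ["node-1"]
# Cap the fielddata cache at 80% of the JVM heap; entries are evicted once the cache reaches this size
indices.fielddata.cache.size: 80%
# Memory limit of the request circuit breaker; the default is 60% of the JVM heap
indices.breaker.request.limit: 80%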

Create a non-root user, elsearch, to run the elasticsearch script. Elasticsearch refuses to start as root.

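# Elasticsearch cannot be run as root
adduser elsearch # a group with the same name (elsearch) is created automatically
# Change the owner of the directory and everything under it to elsearch (run from /usr/local)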
chown -R elsearch:elsearch elasticsearch
ll
# drwxr-xr-x 1 elsearch elsearch 4096 May 28 16:54 elasticsearch

What's new in 7.x

Removal of mapping types (official documentation): https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html

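7.x has a single default type, _doc, so document APIs no longer take a custom type in the URL. Most existing calls work once the type segment is removed from the URL (document endpoints use _doc in its place, e.g. index_name/_doc/1).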
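
As a minimal sketch of a type-less document call, the Python snippet below builds (but does not send) a request that indexes one document. The host `localhost:9200`, the document ID `1`, and the helper name `index_document_request` are assumptions made for illustration; they do not come from the original notes.

```python
import json
import urllib.request


def index_document_request(base_url, index, doc_id, document):
    """Build a PUT request that indexes `document` without a custom type."""
    # 7.x style path: /{index}/_doc/{id}
    # (in 6.x the middle segment would have been a user-defined type)
    url = f"{base_url}/{index}/_doc/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = index_document_request(
    "http://localhost:9200", "index_name", "1", {"test": "value"}
)
print(req.get_method(), req.full_url)

# Send it only when a node is actually reachable:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The request is only constructed here, so the snippet can be run without a cluster; uncomment the last lines to send it to a real node.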

not_analyzed no longer exists. If a field must not be tokenized:

Set the analyzer on the index. Making keyword the default analyzer means values are not split into tokens.
----------------------------------------------------------------
Setting the analyzer (the index must be closed first):
1. POST http://server_ip/index_name/_close?pretty
2. PUT http://server_ip/index_name/_settings?pretty
    BODY:
    {
        "index":{
            "analysis" : {
                "analyzer" : {
                    "default" : {
                        "type" : "keyword"
                    }
                }
            }
        }
    }
3. POST http://server_ip/index_name/_open?pretty
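
The three steps above can be sketched in Python as follows. This is an illustrative outline, not a tested deployment script: the base URL (including port 9200) and the helper names `analyzer_update_requests` and `run` are assumptions, and `run` is deliberately not called so the snippet works without a cluster.

```python
import json
import urllib.request


def analyzer_update_requests(base_url, index):
    """Return the close, update-settings, and reopen requests in order."""
    body = {
        "index": {
            "analysis": {"analyzer": {"default": {"type": "keyword"}}}
        }
    }
    return [
        # 1. Close the index so that analysis settings can be changed
        urllib.request.Request(f"{base_url}/{index}/_close", method="POST"),
        # 2. Make keyword the default analyzer
        urllib.request.Request(
            f"{base_url}/{index}/_settings",
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="PUT",
        ),
        # 3. Reopen the index
        urllib.request.Request(f"{base_url}/{index}/_open", method="POST"),
    ]


def run(requests):
    """Send the requests one by one; urlopen raises on an HTTP error."""
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            print(req.get_method(), req.full_url, resp.status)


steps = analyzer_update_requests("http://server_ip:9200", "index_name")
for req in steps:
    print(req.get_method(), req.full_url)
# run(steps)  # uncomment when the cluster is reachable
```

Because `urlopen` raises on a failed step, the index may be left closed if the settings update is rejected; reopening it in that case is left to the caller.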

The string field type no longer exists; use text or keyword instead.
Adding "track_total_hits": true to a search request returns the exact total hit count instead of capping it at 10,000.
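
To make the effect of track_total_hits concrete, here is a small Python sketch that builds a search body and reads the total from a 7.x-style response. The two sample responses are hand-written for illustration (they are not output captured from a real cluster), and the helper names are assumptions.

```python
def search_body(query=None, size=10):
    """Build a search body that asks for the exact total hit count."""
    return {
        "query": query or {"match_all": {}},
        "size": size,
        "track_total_hits": True,
    }


def total_hits(response):
    """Return (value, relation) from a 7.x search response.

    In 7.x, hits.total is an object: relation "eq" means the value is
    exact, "gte" means it is only a lower bound.
    """
    total = response["hits"]["total"]
    return total["value"], total["relation"]


# Hand-written sample responses, for illustration only
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
exact = {"hits": {"total": {"value": 123456, "relation": "eq"}, "hits": []}}

print(search_body())
print(total_hits(capped))
print(total_hits(exact))
```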

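Using the Elasticsearch API

This section describes the APIs used in the BGP project and how to write some of the specific payloads.

Note: the APIs are listed in no particular order, and each payload is only one of several possible ways to write it.

Create an index: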

PUT your_server_ip:9200/index_name

Payload

{
  "settings": {
    "number_of_shards": 5,
    "analysis": {
      "analyzer": {
        "default": {
          "type": "keyword"
        }
      }
    },
    "refresh_interval": "30s",
    "max_result_window": "1000000",
    "max_rescore_window": "1000000"
  },
  "mappings": {
    "properties": {
      "test": {
        "type": "keyword"
      }
    }
  }
}

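Notes

settings configures the index itself; mappings defines its fields (columns).
number_of_shards: the number of primary shards.
analysis: set here to disable tokenization; this is how it is configured in 7.x.
refresh_interval: how often the index is refreshed. To get the most out of _bulk indexing, a value of around 30s works well.
max_result_window: by default a search can only reach the first 10,000 results (from + size); this setting raises that limit. For very large result sets the scroll API is the usual alternative.
max_rescore_window: used by the rescore API; not used in this project.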
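
A possible way to issue the create-index call from Python is sketched below. It assembles the same kind of payload from a field-to-type dictionary; the helper name `create_index_request`, its parameters, and the host `localhost:9200` are assumptions for illustration, and the request is only built, not sent.

```python
import json
import urllib.request


def create_index_request(base_url, index, fields, shards=5):
    """Build the PUT request that creates `index` with the given fields."""
    body = {
        "settings": {
            "number_of_shards": shards,
            # keyword as the default analyzer: values are not tokenized
            "analysis": {"analyzer": {"default": {"type": "keyword"}}},
            "refresh_interval": "30s",
            "max_result_window": 1000000,
        },
        # 7.x mappings have no type level: properties sit directly under mappings
        "mappings": {
            "properties": {
                name: {"type": field_type} for name, field_type in fields.items()
            }
        },
    }
    return urllib.request.Request(
        f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = create_index_request("http://localhost:9200", "index_name", {"test": "keyword"})
print(req.get_method(), req.full_url)
print(json.dumps(json.loads(req.data), indent=2))

# with urllib.request.urlopen(req) as resp:  # send when a node is reachable
#     print(json.load(resp))
```

Round-tripping the body through `json.dumps` also catches structural mistakes such as an unbalanced brace before the request ever reaches the server.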

Update an index mapping:

PUT/POST your_server_ip:9200/index_name/_mappings

Payload

{
    "properties": {
        "test": {
            "type": "keyword"
        }
    }
}

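Notes

New fields (columns) can be added this way.
Some field type changes are not allowed on an existing field; in that case the data has to be copied into a new index with the _reindex API.
Pass properties directly as the top level of the payload.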
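
When a field type cannot be changed in place, one option is to create a second index with the desired mapping and copy the data over with _reindex. The sketch below only builds that request; the destination name `index_name_v2`, the host, and the helper name `reindex_request` are hypothetical.

```python
import json
import urllib.request


def reindex_request(base_url, source_index, dest_index):
    """Build a POST /_reindex request copying source_index into dest_index."""
    body = {
        "source": {"index": source_index},
        # dest_index should already exist with the corrected mapping
        "dest": {"index": dest_index},
    }
    return urllib.request.Request(
        f"{base_url}/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = reindex_request("http://localhost:9200", "index_name", "index_name_v2")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```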

Update index settings:

PUT/POST your_server_ip:9200/index_name/_settings

Payload

{
    "index":{
        "analysis" : {
            "analyzer" : {
                "default" : {
                    "type" : "keyword"
                }
            }
        },
        "refresh_interval": "30s",
        "max_result_window" : "1000000",
        "max_rescore_window": "1000000"
    }
}

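Notes

Several indices can be updated in one call by listing them comma-separated in the URL, e.g. index_name = index1,index2,index3.
Some settings, such as analysis, can only be changed after the index has been closed with the _close API.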
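
The comma-separated form can be sketched like this in Python. The index names, host, and helper name `settings_request` are placeholders chosen for illustration; only dynamic settings such as refresh_interval are used, since analysis would require closing the indices first.

```python
import json
import urllib.request


def settings_request(base_url, indices, settings):
    """Build one PUT _settings request covering every index in `indices`."""
    target = ",".join(indices)  # e.g. "index1,index2,index3"
    return urllib.request.Request(
        f"{base_url}/{target}/_settings",
        data=json.dumps({"index": settings}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = settings_request(
    "http://localhost:9200",
    ["index1", "index2", "index3"],
    {"refresh_interval": "30s", "max_result_window": 1000000},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```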

Create an index template:

PUT your_server_ip:9200/_template/template_name

Payload

{
  "index_patterns": ["test*"],
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "analyzer": {
        "default": {
          "type": "keyword"
        }
      }
    },
    "refresh_interval": "30s",
    "max_result_window": "1000000",
    "max_rescore_window": "1000000"
  },
  "mappings": {
    "properties": {
      "state": {
        "type": "keyword"
      }
    }
  }
}

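Notes

index_patterns means that every index whose name starts with test gets the settings and mappings below it.
Templates pay off when indices are created per time period: the settings and mappings do not have to be supplied each time a new index is created.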
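
To illustrate the time-based case, the sketch below generates daily index names and checks them against the template pattern. The naming scheme `test-YYYY.MM.DD` is an assumption, and Python's `fnmatchcase` is used only as a local approximation of how a simple `*` pattern matches; it is not Elasticsearch's own matcher.

```python
from datetime import date, timedelta
from fnmatch import fnmatchcase


def daily_index_name(prefix, day):
    """Return a per-day index name such as test-2019.05.28."""
    return f"{prefix}-{day:%Y.%m.%d}"


def matches_template(index_name, index_patterns):
    """Approximate template matching for simple '*' wildcard patterns."""
    return any(fnmatchcase(index_name, pattern) for pattern in index_patterns)


start = date(2019, 5, 28)
names = [daily_index_name("test", start + timedelta(days=i)) for i in range(3)]
for name in names:
    print(name, matches_template(name, ["test*"]))
```

Each of these names would pick up the template's settings and mappings the first time a document is written to it, so no explicit create-index call is needed per day.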