diff --git a/elastic search/02_elastic_search_index.md b/elastic search/02_elastic_search_index.md
index c3701af..632ef03 100644
--- a/elastic search/02_elastic_search_index.md
+++ b/elastic search/02_elastic_search_index.md
@@ -16,6 +16,12 @@
 - [path param](#path-param)
 - [query param](#query-param)
 - [Examples](#示例)
+- [Similarity module](#similarity-module)
+- [Configuring similarity](#configuring-similarity)
+- [slow query](#slow-query)
+- [Search Slow log](#search-slow-log)
+- [identify search log origin](#identify-search-log-origin)
+- [index slow log](#index-slow-log)
 
 # index modules
 
@@ -133,5 +139,140 @@ PUT /my-index-000001/_block/write
     }
   ]
 }
 ```
+## Similarity module
+The similarity module (scoring/ranking model) defines how matching documents are scored. Similarity is configured per field, which means a different similarity can be defined for each field through the mapping.
+Similarity applies only to `text` and `keyword` fields.
+### Configuring similarity
+Most similarities can be configured as follows:
+```
+PUT /index
+{
+  "settings": {
+    "index": {
+      "similarity": {
+        "my_similarity": {
+          "type": "DFR",
+          "basic_model": "g",
+          "after_effect": "l",
+          "normalization": "h2",
+          "normalization.h2.c": "3.0"
+        }
+      }
+    }
+  }
+}
+```
+The example above configures a DFR similarity, so it can then be referenced in the mapping as `my_similarity`:
+```
+PUT /index/_mapping
+{
+  "properties" : {
+    "title" : { "type" : "text", "similarity" : "my_similarity" }
+  }
+}
+```
+## slow query
+### Search Slow log
+The shard-level slow search log records slow queries to a dedicated log file.
+
+Thresholds can be configured separately for the query phase and the fetch phase:
+```
+index.search.slowlog.threshold.query.warn: 10s
+index.search.slowlog.threshold.query.info: 5s
+index.search.slowlog.threshold.query.debug: 2s
+index.search.slowlog.threshold.query.trace: 500ms
+
+index.search.slowlog.threshold.fetch.warn: 1s
+index.search.slowlog.threshold.fetch.info: 800ms
+index.search.slowlog.threshold.fetch.debug: 500ms
+index.search.slowlog.threshold.fetch.trace: 200ms
+```
+All of the settings above are `dynamic` and can be set per index:
+```
+PUT /my-index-000001/_settings
+{
+  "index.search.slowlog.threshold.query.warn": "10s",
+  "index.search.slowlog.threshold.query.info": "5s",
+  "index.search.slowlog.threshold.query.debug": "2s",
+  "index.search.slowlog.threshold.query.trace": "500ms",
+  "index.search.slowlog.threshold.fetch.warn": "1s",
+  "index.search.slowlog.threshold.fetch.info": "800ms",
+  "index.search.slowlog.threshold.fetch.debug": "500ms",
+  "index.search.slowlog.threshold.fetch.trace": "200ms"
+}
+```
+
+By default, each threshold is `-1`, which disables it.
+
+The log is scoped to the shard.
+
+The search slow log file is configured in `log4j2.properties`.
+
+### identify search log origin
+Setting `index.search.slowlog.include.user` to true adds information about the user who triggered the slow query to the slow log:
+```
+PUT /my-index-000001/_settings
+{
+  "index.search.slowlog.include.user": true
+}
+```
+With this setting, user information is included in the slow log:
+```json
+{
+  "@timestamp": "2024-02-21T12:42:37.255Z",
+  "log.level": "WARN",
+  "auth.type": "REALM",
+  "elasticsearch.slowlog.id": "tomcat-123",
+  "elasticsearch.slowlog.message": "[index6][0]",
+  "elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
+  "elasticsearch.slowlog.source": "{\"query\":{\"match_all\":{\"boost\":1.0}}}",
+  "elasticsearch.slowlog.stats": "[]",
+  "elasticsearch.slowlog.took": "747.3micros",
+  "elasticsearch.slowlog.took_millis": 0,
+  "elasticsearch.slowlog.total_hits": "1 hits",
+  "elasticsearch.slowlog.total_shards": 1,
+  "user.name": "elastic",
+  "user.realm": "reserved",
+  "ecs.version": "1.2.0",
+  "service.name": "ES_ECS",
+  "event.dataset": "elasticsearch.index_search_slowlog",
+  "process.thread.name": "elasticsearch[runTask-0][search][T#5]",
+  "log.logger": "index.search.slowlog.query",
+  "elasticsearch.cluster.uuid": "Ui23kfF1SHKJwu_hI1iPPQ",
+  "elasticsearch.node.id": "JK-jn-XpQ3OsDUsq5ZtfGg",
+  "elasticsearch.node.name": "node-0",
+  "elasticsearch.cluster.name": "distribution_run"
+}
+```
+### index slow log
+The index slow log is similar to the search slow log; its log file name ends with `_index_indexing_slowlog.json`. It is configured as follows:
+```
+index.indexing.slowlog.threshold.index.warn: 10s
+index.indexing.slowlog.threshold.index.info: 5s
+index.indexing.slowlog.threshold.index.debug: 2s
+index.indexing.slowlog.threshold.index.trace: 500ms
+index.indexing.slowlog.source: 1000
+```
+The index slow log settings are also dynamic and can be set as follows:
+```
+PUT /my-index-000001/_settings
+{
+  "index.indexing.slowlog.threshold.index.warn": "10s",
+  "index.indexing.slowlog.threshold.index.info": "5s",
+  "index.indexing.slowlog.threshold.index.debug": "2s",
+  "index.indexing.slowlog.threshold.index.trace": "500ms",
+  "index.indexing.slowlog.source": "1000"
+}
+```
+To include the user who triggered the slow indexing request in the log, apply the following setting:
+```
+PUT /my-index-000001/_settings
+{
+  "index.indexing.slowlog.include.user": true
+}
+```
+By default, Elasticsearch logs the first 1000 characters of the document source in the slow log. This can be changed with `index.indexing.slowlog.source`.
+- Setting `index.indexing.slowlog.source` to false or 0 skips logging the `source` entirely
+- Setting `index.indexing.slowlog.source` to true logs the entire `source`
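
The slow log thresholds in this patch all follow the same naming pattern, so the `_settings` request body can be generated rather than typed by hand. The following is a minimal sketch, not part of the patch: `build_slowlog_settings`, `PREFIXES`, and `LEVELS` are hypothetical names introduced here for illustration, and the code only builds the JSON body without sending it to a cluster.

```python
import json

# Setting-name prefixes copied from the settings shown in the patch.
PREFIXES = {
    "query": "index.search.slowlog.threshold.query",
    "fetch": "index.search.slowlog.threshold.fetch",
    "index": "index.indexing.slowlog.threshold.index",
}
LEVELS = ("warn", "info", "debug", "trace")


def build_slowlog_settings(phase, thresholds):
    """Return a settings body for one slow log phase (hypothetical helper).

    `phase` is "query", "fetch" or "index". `thresholds` maps a log level to
    a time value such as "10s". A level mapped to None becomes "-1", the
    value the notes describe as disabling the threshold.
    """
    if phase not in PREFIXES:
        raise ValueError(f"unknown phase: {phase!r}")
    body = {}
    for level, value in thresholds.items():
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
        body[f"{PREFIXES[phase]}.{level}"] = "-1" if value is None else value
    return body


if __name__ == "__main__":
    # Body for: PUT /my-index-000001/_settings
    settings = build_slowlog_settings("query", {"warn": "10s", "trace": None})
    print(json.dumps(settings, indent=2))
```

The printed JSON can be pasted as the body of a `PUT /my-index-000001/_settings` request like the ones in the patch. Validating the phase and level names up front catches a mistyped setting key before it reaches the cluster.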