第六节 排序及Doc Values&Fielddata
1、排序
- Elasticsearch 默认采⽤相关性算分对结果进⾏降序排序
- 可以通过设定 sorting 参数,⾃⾏设定排序
- 如果不指定_score,算分为 Null
 
 
1-1 单字段排序
#单字段排序
POST /kibana_sample_data_ecommerce/_search
{
  "size": 5,
  "query": {
    "match_all": {
    }
  },
  "sort": [
    {"order_date": {"order": "desc"}}
  ]
}
"_score" : null,
 "order_date" : "2020-10-10T23:45:36+00:00", 
.....
 "order_date" : "2020-10-10T23:31:12+00:00",
 ...
1-2 多字段进行排序
- 组合多个条件
- 优先考虑写在前⾯的排序
- ⽀持对相关性算分进⾏排序
#多字段排序
POST /kibana_sample_data_ecommerce/_search
{
  "size": 5,
  "query": {
    "match_all": {
    }
  },
  "sort": [
    {"order_date": {"order": "desc"}},
    {"_doc":{"order": "asc"}},
    {"_score":{ "order": "desc"}}
  ]
}
2、对 Text 类型排序
GET kibana_sample_data_ecommerce/_mapping
fielddata: 默认关闭
#对 text 字段进行排序。默认会报错,需打开fielddata
POST /kibana_sample_data_ecommerce/_search
{
  "size": 5,
  "query": {
    "match_all": {
    }
  },
  "sort": [
    {"customer_full_name": {"order": "desc"}}
  ]
}
 
 
400 - Bad Request
 {
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [customer_full_name] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
...
3、排序的过程
- 排序是针对字段原始内容进行的。 倒排索引⽆法发挥作⽤
- 需要⽤到正排索引。通过⽂档 Id 和字段快速得到字段原始内容
- Elasticsearch 有两种实现⽅法- Fielddata
- Doc Values (列式存储,对 Text 类型⽆效)
 
3-1 Doc Values vs Field Data
 
 
Demo
- 单字段多字段排序
- 对 Text 和 Keyword 类型进行排序
4、打开 Fielddata
 
 
- 默认关闭,可以通过 Mapping 设置打开。修改设置后,即时⽣效,⽆需重建索引
- 其他字段类型不不⽀支持,只支持对 Text 进⾏设定
- 打开后,可以对 Text 字段进⾏排序。但是是对分词后的 term 排序,所以,结果往往⽆法满足预期,不建议使⽤
- 部分情况下打开,满⾜⼀些聚合分析的特定需求
打开 text的 fielddata
PUT kibana_sample_data_ecommerce/_mapping
{
  "properties": {
    "customer_full_name" : {
          "type" : "text",
          "fielddata": true,
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
  }
}
POST /kibana_sample_data_ecommerce/_search
{
  "size": 5,
  "query": {
    "match_all": {
    }
  },
  "sort": [
    {"customer_full_name": {"order": "desc"}}
  ]
}
Output
"customer_full_name" : "Yuri Foster",
"customer_full_name" : "Yuri Roberson",
"customer_full_name" : "Yuri Greer",
"customer_full_name" : "Yuri Duncan",
5、关闭 Doc Values
- 默认启⽤,可以通过 Mapping 设置关闭- 增加索引的速度 / 减少磁盘空间
 
- 如果重新打开,需要重建索引- 什么时候需要关闭- 明确不不需要做排序及聚合分析
 
 
- 什么时候需要关闭
doc values是在创建时所产生,默认打开,关闭后可以对老的数据进行
query then update的操作,更新索引
 
 
#关闭 keyword的 doc values
PUT test_keyword
PUT test_keyword/_mapping
{
  "properties": {
    "user_name":{
      "type": "keyword",
      "doc_values":false
    }
  }
}
DELETE test_keyword
PUT test_text
PUT test_text/_mapping
{
  "properties": {
    "intro":{
      "type": "text",
      "doc_values":true
    }
  }
}
DELETE test_text
6、获取 Doc Values & Fielddata 中存储的内容
- Text 类型的不支持 Doc Values
- Text 类型打开 Fielddata后,可以查看分词后的数据
使⽤ docvalue_fields 查看存储的信息
DELETE temp_users
PUT temp_users
PUT temp_users/_mapping
{
  "properties": {
    "name":{"type": "text","fielddata": true},
    "desc":{"type": "text","fielddata": true}
  }
}
POST temp_users/_doc
{"name":"Jack","desc":"Jack is a good boy!","age":10}
6-1 查看 docvalue_fields数据
打开fielddata 后,查看 docvalue_fields数据
POST  temp_users/_search
{
  "docvalue_fields": [
    "name","desc"
    ]
}
 "hits" : [
      {
        "_index" : "temp_users",
        "_type" : "_doc",
        "_id" : "AX25U3UBaxLQOLM-FE5T",
        "_score" : 1.0,
        "_source" : {
          "name" : "Jack",
          "desc" : "Jack is a good boy!",
          "age" : 10
        },
        "fields" : {
          "name" : [
            "jack"
          ],
          "desc" : [
            "a",
            "boy",
            "good",
            "is",
            "jack"
          ]
        }
      }
    ]
查看整型字段的docvalues
POST  temp_users/_search
{
  "docvalue_fields": [
    "age"
    ]
}
查看整型字段的docvalues
"hits" : [
      {
        "_index" : "temp_users",
        "_type" : "_doc",
        "_id" : "AX25U3UBaxLQOLM-FE5T",
        "_score" : 1.0,
        "_source" : {
          "name" : "Jack",
          "desc" : "Jack is a good boy!",
          "age" : 10
        },
        "fields" : {
          "age" : [
            10
          ]
        }
      }
    ]
7、本节知识点回顾
- Elasticsearch ⽀持⾃定义排序
- Doc Values 和 Fielddata 的对⽐
- 如何设置 Doc Values和Fielddata