Elasticsearch Notes (5): Querying ES from Java, Part 04: Deep Pagination with Scroll Queries

1. Why use scroll queries for deep pagination

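   Pagination can be done with the from and size parameters of a search request. However, once from + size exceeds 10,000 this becomes inefficient, and by default ES rejects the request outright because the index.max_result_window setting defaults to 10000. The reason lies in how ES executes a from + size query: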

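  1. Analyze the keywords supplied by the user into terms.
  2. Look the terms up in the term dictionary (the inverted index) to obtain the ids of the matching documents.
  3. Fetch the corresponding data from each shard. This is the slowest step.
  4. Sort the data by relevance score. This also takes a long time.
  5. Discard part of the results: for a page covering hits 5 to 10, for example, everything outside that range is thrown away.
  6. Return the results.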
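
To make the cost of steps 3 to 5 concrete, here is a toy model in plain Java. It is not Elasticsearch code: the class DeepPagingSketch and its methods are made up for illustration, and documents are reduced to integer scores. It assumes every shard has to hand back its own top from + size hits, which the coordinating node then merges, sorts again, and trims.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model (not Elasticsearch code) of how a from + size page is assembled.
class DeepPagingSketch {

    // Each shard returns its own top (from + size) scores, because any of them
    // could still belong to the requested page after the global merge.
    static List<Integer> shardTop(List<Integer> shardScores, int from, int size) {
        List<Integer> sorted = new ArrayList<>(shardScores);
        sorted.sort(Collections.reverseOrder());
        int end = Math.min(from + size, sorted.size());
        return new ArrayList<>(sorted.subList(0, end));
    }

    // The coordinating node merges all candidates, sorts them again,
    // and then discards the first `from` entries.
    static List<Integer> page(List<List<Integer>> shards, int from, int size) {
        List<Integer> candidates = new ArrayList<>();
        for (List<Integer> shard : shards) {
            candidates.addAll(shardTop(shard, from, size));
        }
        candidates.sort(Collections.reverseOrder());
        int start = Math.min(from, candidates.size());
        int end = Math.min(from + size, candidates.size());
        return new ArrayList<>(candidates.subList(start, end));
    }

    // Number of candidates the coordinating node has to hold and sort.
    static long candidatesToSort(int shardCount, int from, int size) {
        return (long) shardCount * (from + size);
    }
}
```

In this model, 5 shards with from = 10000 and size = 10 give candidatesToSort(5, 10000, 10) = 50050 candidates, all collected and sorted just to return 10 hits. That is why deep pages get slower the further you go.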

How a scroll query works:

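  1. Analyze the keywords supplied by the user into terms.
  2. Look the terms up in the term dictionary (the inverted index) to obtain the ids of the matching documents.
  3. Store those ids in an ES context (the scroll context).
  4. Retrieve size documents from ES. The ids of the documents that were returned are removed from the context.
  5. When the next page is needed, read the following entries straight from the context.
  6. Repeat steps 4 and 5 until nothing is left.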
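
The steps above can be sketched as a small in-memory model. This is only an illustration of the description in this note, not how Elasticsearch implements scroll internally; the class ScrollContextSketch and its nextPage method are invented names, and document ids are plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a scroll context: the matching ids are computed once,
// then handed out page by page.
class ScrollContextSketch {
    private final Deque<String> remainingIds; // step 3: ids kept in the context
    private final int size;                   // page size

    ScrollContextSketch(List<String> matchedIds, int size) {
        this.remainingIds = new ArrayDeque<>(matchedIds);
        this.size = size;
    }

    // Steps 4 and 5: take the next `size` ids and remove them from the context.
    // An empty list means the scroll is exhausted.
    List<String> nextPage() {
        List<String> page = new ArrayList<>();
        while (page.size() < size && !remainingIds.isEmpty()) {
            page.add(remainingIds.poll());
        }
        return page;
    }
}
```

Each call to nextPage only touches the ids that are still in the context, so the work per page stays about the same no matter how deep the page is.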


2. Using scroll queries in Java

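    import java.io.IOException;

    import org.apache.http.HttpHost;
    import org.elasticsearch.action.search.ClearScrollRequest;
    import org.elasticsearch.action.search.ClearScrollResponse;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.search.SearchScrollRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestClientBuilder;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.unit.TimeValue; // package may differ in newer client releases
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.sort.SortOrder;
    import org.junit.Test; // JUnit 4 is assumed here

    public class ScrollQueryTest { // class name is illustrative

        @Test
        public void scrollQueryTest() throws IOException {
            // 1. Create the search request
            String index = "sms-logs-index";
            String type = "sms-logs-type";
            SearchRequest searchRequest = new SearchRequest(index); // target index
            searchRequest.types(type); // target type
            searchRequest.scroll(TimeValue.timeValueMinutes(1L)); // keep the scroll context alive for 1 minute

            // 2. Build the query
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.sort("fee", SortOrder.DESC);
            searchSourceBuilder.size(2); // 2 hits per page
            searchSourceBuilder.query(QueryBuilders.matchAllQuery());
            searchRequest.source(searchSourceBuilder);

            // 3. Execute the query; try-with-resources closes the client at the end
            HttpHost httpHost = new HttpHost("192.168.43.30", 9200);
            RestClientBuilder restClientBuilder = RestClient.builder(httpHost);
            try (RestHighLevelClient restHighLevelClient = new RestHighLevelClient(restClientBuilder)) {
                SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
                String scrollId = searchResponse.getScrollId();
                System.out.println(scrollId); // print the scroll id

                // 4. Read the first page
                SearchHit[] hits = searchResponse.getHits().getHits();
                for (SearchHit searchHit : hits) {
                    System.out.println(searchHit);
                }

                // Fetch all remaining pages, one after another
                // (useful, for example, when exporting every matching document)
                while (true) {
                    // Create the SearchScrollRequest for the next page
                    SearchScrollRequest searchScrollRequest = new SearchScrollRequest(scrollId);
                    searchScrollRequest.scroll(TimeValue.timeValueMinutes(1L)); // extend the keep-alive by 1 minute
                    SearchResponse scroll = restHighLevelClient.scroll(searchScrollRequest, RequestOptions.DEFAULT);
                    scrollId = scroll.getScrollId(); // the scroll id may change, so always use the latest one
                    SearchHit[] nextHits = scroll.getHits().getHits();
                    if (nextHits != null && nextHits.length > 0) {
                        System.out.println("------------ next page --------------");
                        for (SearchHit searchHit : nextHits) {
                            System.out.println(searchHit);
                        }
                    } else {
                        System.out.println("------------ end --------------");
                        break;
                    }
                }

                // Clear the scroll context so ES can release it right away
                ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
                clearScrollRequest.addScrollId(scrollId);
                ClearScrollResponse clearScrollResponse = restHighLevelClient.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
                System.out.println("Scroll cleared: " + clearScrollResponse.isSucceeded());
            }
        }
    }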
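
The shape of that loop can also be pulled out into a reusable helper. The following is a minimal sketch under my own assumptions, not part of the Elasticsearch client: ScrollSource, Page and drain are hypothetical names, with open, next and clear standing in for the search, scroll and clearScroll calls in the listing. It shows two habits worth keeping: always continue with the most recent scroll id, and clear the scroll in a finally block so the context is released even if processing a hit throws.

```java
import java.util.List;
import java.util.function.Consumer;

class ScrollLoopSketch {

    // Hypothetical stand-in for the three client calls used in the listing.
    interface ScrollSource {
        Page open(int size);         // plays the role of search(...) with a scroll keep-alive
        Page next(String scrollId);  // plays the role of scroll(...)
        void clear(String scrollId); // plays the role of clearScroll(...)
    }

    // One page of results plus the scroll id to use for the next request.
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    // Feeds every hit to `handler` and returns how many hits were seen.
    static int drain(ScrollSource source, int size, Consumer<String> handler) {
        Page page = source.open(size);
        String scrollId = page.scrollId;
        int total = 0;
        try {
            while (!page.hits.isEmpty()) {
                for (String hit : page.hits) {
                    handler.accept(hit);
                    total++;
                }
                page = source.next(scrollId);
                scrollId = page.scrollId; // keep the latest id for the next round
            }
        } finally {
            source.clear(scrollId); // release the context even if the handler failed
        }
        return total;
    }
}
```

A real implementation of ScrollSource would wrap RestHighLevelClient and map SearchHit objects to whatever the handler needs; the strings here only keep the sketch free of dependencies.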