
    韩超's blog (hanchao5272): Elasticsearch data refresh policy RefreshPo

    Time: 2021-09-05 13:18

    Related article: ElasticSearch: the refresh interval refresh_interval, the refresh API _refresh, and the refresh policy RefreshPolicy

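    Overview

    By default, an ElasticSearch index has a refresh_interval of 1 second, which means newly written data can take up to 1 second to become searchable.

    Each index refresh produces a new Lucene segment, which leads to frequent segment merges, and these are expensive in CPU and IO.

    If the product does not need near-real-time search, the index can be refreshed less often by lengthening the interval, for example: index.refresh_interval: 120s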
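
    As a rough illustration of this trade-off, the sketch below counts how many refreshes one hour of continuous indexing triggers at a given interval. The class and method names are hypothetical, and treating each refresh as one new segment per shard is an upper bound that assumes every interval has pending writes.

```java
/** Back-of-the-envelope arithmetic only; not part of any Elasticsearch API. */
class RefreshCount {

    /** Upper bound on the number of refreshes in one hour for the given interval. */
    static long refreshesPerHour(long intervalSeconds) {
        if (intervalSeconds <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return 3600 / intervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(refreshesPerHour(1));   // 3600
        System.out.println(refreshesPerHour(120)); // 30
    }
}
```

    Under these assumptions, moving from 1s to 120s cuts the number of refreshes (and candidate segments to merge) by a factor of 120.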

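    However, this behavior is awkward for functional tests:

    • Because data is not searchable immediately, every test has to sleep for a while after inserting test data before it can run its checks.
    • For the same reason, even a case that passes thanks to the sleep can still fail occasionally.

    To solve these problems, we need a policy that makes ElasticSearch refresh immediately after data is created, updated, or deleted.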

    Version

    • ElasticSearch 5.1.1

    Source code

    The method org.elasticsearch.action.support.WriteRequestBuilder#setRefreshPolicy is defined as follows:

    /**
     * Should this request trigger a refresh ({@linkplain RefreshPolicy#IMMEDIATE}), wait for a refresh (
     * {@linkplain RefreshPolicy#WAIT_UNTIL}), or proceed ignore refreshes entirely ({@linkplain RefreshPolicy#NONE}, the default).
     */
    @SuppressWarnings("unchecked")
    default B setRefreshPolicy(RefreshPolicy refreshPolicy) {
        request().setRefreshPolicy(refreshPolicy);
        return (B) this;
    }
    

    The enum org.elasticsearch.action.support.WriteRequest.RefreshPolicy defines three policies:

    /**
     * Don't refresh after this request. The default.
     */
    NONE,
    /**
     * Force a refresh as part of this request. This refresh policy does not scale for high indexing or search throughput but is useful
     * to present a consistent view to for indices with very low traffic. And it is wonderful for tests!
     */
    IMMEDIATE,
    /**
     * Leave this request open until a refresh has made the contents of this request visible to search. This refresh policy is
     * compatible with high indexing and search throughput but it causes the request to wait to reply until a refresh occurs.
     */
    WAIT_UNTIL; 
    

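    This gives the following three refresh policies:

    • RefreshPolicy#IMMEDIATE:
      • The request submits data to ElasticSearch, forces a refresh right away, and then returns.
      • Pros: data is searchable as soon as the request returns; low request latency.
      • Cons: high resource consumption.
    • RefreshPolicy#WAIT_UNTIL:
      • The request submits data to ElasticSearch, waits until a refresh has made the data searchable, and then returns.
      • Pros: data is searchable as soon as the request returns; low resource consumption.
      • Cons: high request latency, because the request stays open until a refresh occurs.
    • RefreshPolicy#NONE:
      • The default policy.
      • The request submits data to ElasticSearch and returns without caring whether the data has been refreshed yet.
      • Pros: low request latency; low resource consumption.
      • Cons: data is not searchable right away.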
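
    The differences between the three policies can be sketched with a toy in-memory model. This is not Elasticsearch code: ToyIndex, its Policy enum, and every method in it are hypothetical stand-ins, and the periodic refresh is triggered by hand instead of by a timer. The returned future stands for the moment the write request would return to the caller.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Toy model of the three refresh policies; an illustration, not the real implementation. */
class ToyIndex {
    enum Policy { NONE, IMMEDIATE, WAIT_UNTIL }

    private final List<String> buffer = new ArrayList<>();   // written but not yet searchable
    private final Set<String> searchable = new HashSet<>();  // visible to search
    private final List<CompletableFuture<Void>> waiters = new ArrayList<>();
    int refreshCount = 0;                                     // how many refreshes were paid for

    /** Writes a document id; the future completes when the request would return. */
    CompletableFuture<Void> write(String id, Policy policy) {
        buffer.add(id);
        switch (policy) {
            case IMMEDIATE:
                refresh(); // force a refresh as part of this request
                return CompletableFuture.completedFuture(null);
            case WAIT_UNTIL:
                CompletableFuture<Void> pending = new CompletableFuture<>();
                waiters.add(pending); // stays open until the next refresh
                return pending;
            default:
                return CompletableFuture.completedFuture(null); // NONE: return at once
        }
    }

    /** Stands in for the periodic refresh driven by refresh_interval. */
    void refresh() {
        refreshCount++;
        searchable.addAll(buffer);
        buffer.clear();
        List<CompletableFuture<Void>> done = new ArrayList<>(waiters);
        waiters.clear();
        done.forEach(f -> f.complete(null));
    }

    boolean isSearchable(String id) {
        return searchable.contains(id);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        CompletableFuture<Void> none = index.write("a", Policy.NONE);
        System.out.println(none.isDone() + " " + index.isSearchable("a")); // true false
        CompletableFuture<Void> wait = index.write("b", Policy.WAIT_UNTIL);
        System.out.println(wait.isDone() + " " + index.isSearchable("b")); // false false
        index.refresh();
        System.out.println(wait.isDone() + " " + index.isSearchable("b")); // true true
        CompletableFuture<Void> now = index.write("c", Policy.IMMEDIATE);
        System.out.println(now.isDone() + " " + index.isSearchable("c"));  // true true
    }
}
```

    In this model only IMMEDIATE adds to refreshCount on its own, which is the resource cost listed above, while WAIT_UNTIL pays with latency instead.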

    The main classes that implement this interface are:

    • DeleteRequestBuilder
    • IndexRequestBuilder
    • UpdateRequestBuilder
    • BulkRequestBuilder

    Example

    /**
     * Example: making ElasticSearch refresh immediately after each write.
     * Assumes `client` is an already initialized org.elasticsearch.client.Client.
     */
    @Test
    public void refreshImmediatelyTest() {
        // Delete
        client.prepareDelete("index", "type", "1").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
    
        // Index
        client.prepareIndex("index", "type", "2").setSource("{\"age\":1}").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
    
        // Update (assumes document 3 already exists)
        client.prepareUpdate("index", "type", "3").setDoc("{\"age\":1}").setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
    
        // Bulk: the policy is set once on the bulk request, not on each item
        client.prepareBulk()
                .add(client.prepareDelete("index", "type", "1"))
                .add(client.prepareIndex("index", "type", "2").setSource("{\"age\":1}"))
                .add(client.prepareUpdate("index", "type", "3").setDoc("{\"age\":1}"))
                .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE).get();
    }
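
    For tests that talk to ElasticSearch over REST instead of the Java client, the same choice is made with the refresh query parameter, whose values are true, wait_for, and false. The sketch below shows that mapping with a local stand-in enum; RefreshParam, its Policy enum, and both helper methods are hypothetical and exist only for this illustration.

```java
/** Maps a refresh policy to the REST `refresh` query parameter; illustration only. */
class RefreshParam {
    /** Local stand-in that mirrors the three policy names from the article. */
    enum Policy { NONE, IMMEDIATE, WAIT_UNTIL }

    /** Returns the value of the `refresh` query parameter for the given policy. */
    static String toQueryValue(Policy policy) {
        switch (policy) {
            case IMMEDIATE:
                return "true";
            case WAIT_UNTIL:
                return "wait_for";
            default:
                return "false";
        }
    }

    /** Builds the path and query string for a single-document write. */
    static String documentUrl(String index, String type, String id, Policy policy) {
        return "/" + index + "/" + type + "/" + id + "?refresh=" + toQueryValue(policy);
    }

    public static void main(String[] args) {
        System.out.println(documentUrl("index", "type", "2", Policy.IMMEDIATE));  // /index/type/2?refresh=true
        System.out.println(documentUrl("index", "type", "2", Policy.WAIT_UNTIL)); // /index/type/2?refresh=wait_for
    }
}
```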
    