Summary of Elasticsearch

Keywords: Big Data, Elasticsearch

1. Three clients of ES

a. TransportClient: connects over the transport protocol, using ES port 9300

b. JestClient: connects over HTTP, using ES port 9200

c. RestClient: connects over HTTP, using ES port 9200; this is the client officially recommended by ES
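To make "HTTP based" concrete, here is a minimal sketch that builds a plain HTTP request against port 9200 with the JDK's own java.net.http API (Java 11+). It is not Jest or RestClient code. The class name EsHttpPing and the host "localhost" are assumptions for illustration; /_cluster/health is a standard ES endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper: shows that the HTTP-based clients ultimately talk plain HTTP to port 9200. */
final class EsHttpPing {
    private EsHttpPing() {}

    /** Builds (but does not send) a GET request for the cluster health endpoint. */
    static HttpRequest healthRequest(String host, int httpPort) {
        URI uri = URI.create("http://" + host + ":" + httpPort + "/_cluster/health");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

To actually send it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). The TransportClient on port 9300 uses a binary protocol and cannot be reached this way.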

2. Reference sources:

ES aggregation with JestClient: https://blog.csdn.net/lvyuan1234/article/details/78655493 (this article covers the basic aggregation operations in Jest)

Time-based ES aggregation: https://blog.csdn.net/xuyingzhong/article/details/78839744 (this article solves the problem of empty buckets not being returned)

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/5.4

Scroll-based deep paging: https://www.jianshu.com/p/32f4d276d433 (pay attention to the scroll keep-alive duration passed in with each request)

First query: sort by _doc
 /index/type/_search?scroll=5m&size=10000 (Tip: 5m is how long the cursor is kept alive; by default ES does not allow a size larger than 10000)
"{\"sort\":[{\"_doc\":{\"order\":\"asc\"}}],\"query\":{\"match_all\":{}}}"

Second and later queries: each one must pass in the scroll_id returned by the previous query
 _search/scroll?scroll=5m (no need to specify index or type)
"{\"scroll_id\": " + "\"" + scroll_id + "\"" + "}"
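The two-step scroll flow above can be sketched as a loop. This is only a sketch under stated assumptions: ScrollSketch, Page, and the transport function are hypothetical names, and the transport stands in for whatever HTTP client sends the request and extracts the hits and the new scroll_id from the response. Only the paths and bodies come from the notes above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical sketch of scroll-based deep paging; no real network calls are made here. */
final class ScrollSketch {
    private ScrollSketch() {}

    /** One page of results: the hits plus the scroll_id to send with the next request. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Path of the first request: opens a cursor that is kept alive for keepAlive (e.g. "5m"). */
    static String firstPath(String index, String type, String keepAlive, int size) {
        return "/" + index + "/" + type + "/_search?scroll=" + keepAlive + "&size=" + size;
    }

    /** Body of the first request: sort by _doc and match every document. */
    static String firstBody() {
        return "{\"sort\":[{\"_doc\":{\"order\":\"asc\"}}],\"query\":{\"match_all\":{}}}";
    }

    /** Path of every follow-up request: no index or type, only the keep-alive. */
    static String nextPath(String keepAlive) {
        return "/_search/scroll?scroll=" + keepAlive;
    }

    /** Body of every follow-up request: the scroll_id returned by the previous response. */
    static String nextBody(String scrollId) {
        return "{\"scroll_id\":\"" + scrollId + "\"}";
    }

    /** Keeps requesting pages until one comes back with no hits, collecting everything on the way. */
    static List<String> fetchAll(String index, String type,
                                 BiFunction<String, String, Page> transport) {
        List<String> all = new ArrayList<>();
        Page page = transport.apply(firstPath(index, type, "5m", 10000), firstBody());
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            // Always use the scroll_id from the most recent response
            page = transport.apply(nextPath("5m"), nextBody(page.scrollId));
        }
        return all;
    }
}
```

The keep-alive only needs to cover the time between two consecutive requests, because every follow-up request renews it.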

3. Time-based aggregation code:

  

@Override
    public List<ResourceVO> summary(QueryParam queryParam) {
        List<Object> rangeTime = queryParam.getTerms().stream().filter(term -> term.getColumn().equals(ENTRYTIME))
                .map(Term::getValue).collect(Collectors.toList());
        // Exactly two values are required: the start and the end of the time range
        if (rangeTime.size() != 2) {
            throw new ParamNoExistException(ErrorConstant.PARAM_NO_EXIST, I18nConstant.DTAE_PARAM_NOT_EXIST);
        }
        Object beginTime = rangeTime.get(0);
        Object endTime = rangeTime.get(1);
        // Build query builder
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery().must(QueryBuilders.rangeQuery(ENTRYTIME).
                gte(beginTime).lte(endTime));
        // Aggregate on the ENTRYTIME field
        DateHistogramAggregationBuilder field = AggregationBuilders.dateHistogram("agg").field(ENTRYTIME);
        // One bucket per day
        field.dateHistogramInterval(DateHistogramInterval.DAY);
        field.format("yyyy-MM-dd");
        // Force return of empty buckets
        field.minDocCount(0);
        // Works together with minDocCount(0) to return every day in the requested range, including days with a count of 0
        field.extendedBounds(new ExtendedBounds(parseDate(beginTime), parseDate(endTime)));
        searchSourceBuilder.query(queryBuilder);
        searchSourceBuilder.aggregation(field);
        // Return only aggregate results, not query results
        searchSourceBuilder.size(0);
        Search videoSearch = new Search.Builder(searchSourceBuilder.toString()).addIndex(VIID).addType(VIDEOSLICE).build();
        Search imageSearch = new Search.Builder(searchSourceBuilder.toString()).addIndex(VIID).addType(IMAGE).build();
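        SearchResult videoResult;
        SearchResult imageResult;
        try {
            videoResult = jestClient.execute(videoSearch);
            imageResult = jestClient.execute(imageSearch);
        } catch (IOException e) {
            // Log the full stack trace and stop here; carrying on with null results would cause a NullPointerException below
            log.error("ES aggregation query failed", e);
            throw new IllegalStateException("ES aggregation query failed", e);
        }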
        List<ResourceVO> resourceVOS = Lists.newArrayList();
        // Create one entry per day from the video aggregation buckets
        List<DateHistogramAggregation.DateHistogram> videoBuckets = videoResult.getAggregations().getDateHistogramAggregation("agg").getBuckets();
        for (DateHistogramAggregation.DateHistogram videoBucket : videoBuckets) {
            ResourceVO resourceVO = new ResourceVO();
            resourceVO.setTime(videoBucket.getTimeAsString());
            resourceVO.setVideoCount(videoBucket.getCount());
            resourceVOS.add(resourceVO);
        }
        List<DateHistogramAggregation.DateHistogram> imageBuckets = imageResult.getAggregations().getDateHistogramAggregation("agg").getBuckets();
        for (DateHistogramAggregation.DateHistogram imageBucket : imageBuckets) {
            // Both queries share the same extended bounds, so each day should already have an entry from the video loop.
            // Update that entry in place; adding it to the list again would duplicate it.
            resourceVOS.stream()
                    .filter(resourceVO -> resourceVO.getTime().equals(imageBucket.getTimeAsString()))
                    .findFirst()
                    .ifPresent(resourceVO -> resourceVO.setImageCount(imageBucket.getCount()));
        }
        return resourceVOS;
    }
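
The last step of summary() is essentially a join of two bucket lists on the day string. Below is a minimal sketch of that idea using only the JDK, assuming the buckets have already been reduced to day -> count maps. DailyCounts and merge are hypothetical names, not part of Jest or ES.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: joins per-day video and image counts on the day string. */
final class DailyCounts {
    private DailyCounts() {}

    /**
     * Returns day -> {videoCount, imageCount}, keeping the order in which days are first seen.
     * A day that is missing from one input gets a count of 0 for that input.
     */
    static Map<String, long[]> merge(Map<String, Long> videoByDay, Map<String, Long> imageByDay) {
        Map<String, long[]> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : videoByDay.entrySet()) {
            merged.computeIfAbsent(e.getKey(), day -> new long[2])[0] = e.getValue();
        }
        for (Map.Entry<String, Long> e : imageByDay.entrySet()) {
            merged.computeIfAbsent(e.getKey(), day -> new long[2])[1] = e.getValue();
        }
        return merged;
    }
}
```

Looking each day up in a map avoids rescanning the result list for every image bucket, and it also copes with a day that appears in only one of the two results.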

 

Posted by pnoeric on Fri, 13 Dec 2019 09:11:44 -0800