Java implementation of date histogram aggregation in Elasticsearch

Keywords: Big Data

Requirement: count how many documents there are for each day, and compute the average value of a field for each day.

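1. Use DateHistogramAggregationBuilder to bucket the documents by day, then nest a sub-aggregation inside it to compute the average.

To sort the buckets, add .order(Histogram.Order.COUNT_DESC) to the date histogram builder (in Elasticsearch 6.x and later, the equivalent is .order(BucketOrder.count(false))).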

        AvgAggregationBuilder avgAggregationBuilder = AggregationBuilders
                .avg("avg_aggsName")
                .field("fieldName");
                
        DateHistogramAggregationBuilder dateHistogramAggregationBuilder = AggregationBuilders
                .dateHistogram("aggsName")
                .field("fieldName") // the date field to bucket on, e.g. "time"
                .dateHistogramInterval(DateHistogramInterval.DAY)
                .format("yyyy-MM-dd")
                .minDocCount(0L)
                .subAggregation(avgAggregationBuilder);
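
To make concrete what this aggregation computes, here is a minimal in-memory sketch of "bucket by day, then average a field per bucket". It is plain Java, not Elasticsearch code: the Doc class, its fields, and the sample values are hypothetical, and days are cut in UTC by assumption. Unlike minDocCount(0), it does not emit empty buckets for days without data.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory illustration of a daily histogram with a nested average. */
final class DailyAverageSketch {

    /** Hypothetical document: a timestamp plus one numeric field. */
    static final class Doc {
        final Instant time;
        final double value;

        Doc(Instant time, double value) {
            this.time = time;
            this.value = value;
        }
    }

    /** Groups docs by UTC calendar day; each bucket tracks the count and the average. */
    static Map<LocalDate, DoubleSummaryStatistics> aggregate(List<Doc> docs) {
        // TreeMap keeps the buckets sorted by day, like a date histogram's key order
        Map<LocalDate, DoubleSummaryStatistics> buckets = new TreeMap<>();
        for (Doc doc : docs) {
            LocalDate day = doc.time.atZone(ZoneOffset.UTC).toLocalDate();
            buckets.computeIfAbsent(day, d -> new DoubleSummaryStatistics()).accept(doc.value);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc(Instant.parse("2019-12-08T09:00:00Z"), 10.0));
        docs.add(new Doc(Instant.parse("2019-12-08T21:30:00Z"), 20.0));
        docs.add(new Doc(Instant.parse("2019-12-09T01:15:00Z"), 7.0));

        // One line per day: the doc count and the average of the numeric field
        aggregate(docs).forEach((day, stats) ->
                System.out.println(day + " count=" + stats.getCount() + " avg=" + stats.getAverage()));
    }
}
```

In the real aggregation, the count corresponds to each bucket's doc count and the average to the nested avg_aggsName value.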

2. Suppose a new requirement is added: only the data of the last month should be counted.

In that case add a range filter, so that the documents are filtered first and aggregated afterwards.

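        // Date range limit. timeRange comes from getTime(...) in step 3.
        // rangeQuery expects the name of the date field; the variable aggsName
        // is assumed to hold that field name here.
        QueryBuilder rangeBuilder = QueryBuilders
                .rangeQuery(aggsName)
                .format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
                .gte(timeRange.get("startTime").toString())
                .lte(timeRange.get("endTime").toString());

        // Filter first, then aggregate
        QueryBuilder queryBuilder = QueryBuilders
                .boolQuery()
                .filter(rangeBuilder);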
                
3. Get the start and end time of the range (today / last week / last month)
	private static Map<String, String> getTime(String period) {

        Map<String, String> timeRange = new HashMap<>();
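        SimpleDateFormat simpleDateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        // 'Z' is a literal in this pattern, so format in UTC to make the suffix correct
        simpleDateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));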
        Calendar calendar = Calendar.getInstance();
        timeRange.put("endTime", simpleDateFormat.format(calendar.getTime()));

        switch (period) {

            case "week": {
                calendar.add(Calendar.DATE, -7);
            }
            break;

            case "month": {
                calendar.add(Calendar.MONTH, -1);
            }
            break;

            default: break;

        }

        calendar.set(Calendar.HOUR_OF_DAY, 0);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        timeRange.put("startTime", simpleDateFormat.format(calendar.getTime()));
        return timeRange;

    }
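
The same range calculation can also be sketched with java.time, which avoids the mutable Calendar and makes the time zone explicit. This is an alternative sketch, not the original method: the extra Clock parameter is an addition for testability, and the start of day is taken in UTC by assumption, whereas the Calendar version uses the JVM's default time zone.

```java
import java.time.Clock;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;

/** java.time variant of getTime; the Clock parameter is an assumption added for testing. */
final class TimeRangeSketch {

    // The literal 'Z' is only correct because the formatter is pinned to UTC.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    static Map<String, String> getTime(String period, Clock clock) {
        ZonedDateTime now = ZonedDateTime.now(clock).withZoneSameInstant(ZoneOffset.UTC);

        // Step back by the requested period; any other value means "today"
        ZonedDateTime start;
        switch (period) {
            case "week":
                start = now.minusDays(7);
                break;
            case "month":
                start = now.minusMonths(1);
                break;
            default:
                start = now;
                break;
        }

        Map<String, String> timeRange = new HashMap<>();
        timeRange.put("endTime", FORMAT.format(now));
        // Truncate to midnight so the range starts at the beginning of that day
        timeRange.put("startTime", FORMAT.format(start.truncatedTo(ChronoUnit.DAYS)));
        return timeRange;
    }

    public static void main(String[] args) {
        System.out.println(getTime("month", Clock.systemUTC()));
    }
}
```

Passing a fixed Clock in tests makes the returned strings deterministic.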

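4. Combine the query and the aggregation
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder
                .query(queryBuilder)
                .aggregation(dateHistogramAggregationBuilder);

        // Serialize to the JSON request body
        String query = searchSourceBuilder.toString();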
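
For orientation, the serialized query should have roughly the shape of the request body below. This is a hand-written approximation in query DSL, not the builder's exact output (the real serialization may add defaults and order keys differently). The class and method names are hypothetical, the arguments are placeholders, and "interval" is the key used by Elasticsearch 5.x/6.x; from 7.2 on the key is "calendar_interval".

```java
/** Hand-assembled approximation of the request body built in steps 1 to 4. */
final class RequestBodySketch {

    /** All arguments are placeholders; nothing here talks to Elasticsearch. */
    static String requestBody(String dateField, String avgField, String start, String end) {
        return "{"
                // Step 2: bool query with a range filter on the date field
                + "\"query\":{\"bool\":{\"filter\":[{\"range\":{\"" + dateField + "\":{"
                + "\"gte\":\"" + start + "\",\"lte\":\"" + end + "\","
                + "\"format\":\"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'\"}}}]}},"
                // Step 1: date histogram by day ...
                + "\"aggs\":{\"aggsName\":{"
                + "\"date_histogram\":{\"field\":\"" + dateField + "\",\"interval\":\"day\","
                + "\"format\":\"yyyy-MM-dd\",\"min_doc_count\":0},"
                // ... with the average nested inside each bucket
                + "\"aggs\":{\"avg_aggsName\":{\"avg\":{\"field\":\"" + avgField + "\"}}}"
                + "}}"
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(requestBody("time", "fieldName",
                "2019-11-09T00:00:00.000Z", "2019-12-09T18:09:52.123Z"));
    }
}
```

Comparing this against the actual output of searchSourceBuilder.toString() is a quick way to check that the filter and the nested average ended up where you expect.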
        

Requirement: get the latest record under type/index

First match all documents with matchAllQuery, then sort by time in descending order. Finally call getFirstHit on the search result to get the latest record. (This requires the documents to contain a time field to sort on.)

		QueryBuilder queryBuilder = QueryBuilders
                .matchAllQuery();
        
        searchSourceBuilder
                .query(queryBuilder)
                .sort("time", SortOrder.DESC);

Output result

1. Get hits data:
		List<SearchResult.Hit<TESTCLASS, Void>> hits = result.getHits(TESTCLASS.class);
        List<TESTCLASS> userList = new ArrayList<>();
        for (SearchResult.Hit<TESTCLASS, Void> hit : hits) {
            userList.add(hit.source);
        }

2. Get the buckets data in aggs:
		MetricAggregation jsonAggs = searchResult.getAggregations();
		

Date histogram

		Map<String,Object> map = new HashMap<>();
        DateHistogramAggregation histogram = jsonAggs.getDateHistogramAggregation("aggsName");
        for (DateHistogramAggregation.DateHistogram entry : histogram.getBuckets()) {
            map.put(entry.getTimeAsString(), entry.getCount());
        } 
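
One caveat with collecting buckets into a HashMap: it does not preserve the order in which the buckets were added, so the days may come out shuffled. A minimal sketch of two alternatives follows, using plain arrays as hypothetical stand-ins for the bucket list (the class and method names are illustrative only).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Keeps histogram buckets in a predictable order; the input arrays stand in for real buckets. */
final class BucketOrderSketch {

    /** Copies (day, count) pairs into a map that iterates in the order they were added. */
    static Map<String, Long> inBucketOrder(String[] days, long[] counts) {
        Map<String, Long> map = new LinkedHashMap<>();
        for (int i = 0; i < days.length; i++) {
            map.put(days[i], counts[i]);
        }
        return map;
    }

    /** Re-sorts by key; "yyyy-MM-dd" strings sort chronologically as plain text. */
    static Map<String, Long> byDate(Map<String, Long> buckets) {
        return new TreeMap<>(buckets);
    }

    public static void main(String[] args) {
        // Buckets as they might arrive when sorted by count, descending
        String[] days = {"2019-12-09", "2019-12-07", "2019-12-08"};
        long[] counts = {3L, 1L, 2L};

        Map<String, Long> ordered = inBucketOrder(days, counts);
        System.out.println(ordered);         // same order as the input
        System.out.println(byDate(ordered)); // chronological order
    }
}
```

LinkedHashMap fits when the response is already sorted the way you want (for example by count); TreeMap fits when you always want chronological order.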
   

Average value

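        // This reads a top-level avg. For the avg nested under the date histogram in step 1,
        // call entry.getAvgAggregation("avg_aggsName") on each bucket inside the loop above.
        AvgAggregation avg = jsonAggs.getAvgAggregation("avg_aggsName");
        map.put("avg_aggsName", Math.ceil(avg.getAvg()));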

Posted by EvilCoatHanger on Mon, 09 Dec 2019 10:09:52 -0800