Elasticsearch Terms Aggregation by Entire Field

ElasticSearch terms aggregation by entire field

You should fix this in your mapping by adding a not_analyzed field. You can make it a multi-field if you also need the analyzed version. (This is the pre-5.x syntax; from 5.x onward the equivalent is a keyword field.)

"city": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}

Now run your aggregation on city.raw.

How can I aggregate the whole field value in Elasticsearch

In order to get what you expect you need to change your mapping to this:

    "logGroup" : {
"type" : "keyword"
},

Otherwise, your log groups are analyzed by the standard analyzer, which splits each string into separate tokens, and you won't be able to aggregate by full log group.
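To make the effect concrete, here is a minimal Python sketch of how term buckets differ between an analyzed field and a keyword field. The tokenizer is only a rough stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and the log group values are hypothetical.

```python
import re
from collections import Counter

def standard_like_tokens(value):
    # Rough approximation of the standard analyzer (an assumption, not the
    # real implementation): lowercase and split on non-alphanumeric characters.
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]

def terms_counts(values, analyzed):
    counts = Counter()
    for value in values:
        # A terms aggregation counts each distinct term once per document.
        terms = set(standard_like_tokens(value)) if analyzed else {value}
        counts.update(terms)
    return counts

# Hypothetical logGroup values, one per document.
log_groups = ["/aws/lambda/orders", "/aws/lambda/orders", "/aws/lambda/billing"]

analyzed_counts = terms_counts(log_groups, analyzed=True)
keyword_counts = terms_counts(log_groups, analyzed=False)

print(dict(analyzed_counts))  # buckets per token: aws, lambda, orders, billing
print(dict(keyword_counts))   # buckets per full log group
```

With the analyzed field the buckets are individual tokens shared across log groups; with the keyword field each bucket is one complete log group.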

If you don't want to, or can't, change the mapping and reindex everything, you can do the following:

First, add a keyword sub-field to your mapping, like this:

PUT /my-index/_mapping
{
"properties": {
"logGroup" : {
"type" : "text",
"fields": {
"keyword": {
"type" : "keyword"
}
}
}
}
}

And then run the following so that all existing documents pick up this new field:

POST my-index/_update_by_query?wait_for_completion=false

Finally, you'll be able to achieve what you want with the following query:

GET /my-index/_search
{
"size": 0,
"aggs": {
"types_count": {
"terms": {
"field": "logGroup.keyword",
"size": 10000
}
}
}
}
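The three steps can also be written down as plain request descriptions, which may help when scripting them. This is only a sketch: it builds the method, path, and body for each step and prints them, without sending anything. The helper name `keyword_subfield_steps` is hypothetical, and the index and field names are taken from the example above.

```python
import json

def keyword_subfield_steps(index, field):
    # Step 1: add a keyword sub-field to the existing text field.
    mapping = {
        "properties": {
            field: {"type": "text", "fields": {"keyword": {"type": "keyword"}}}
        }
    }
    # Step 3: aggregate on the new sub-field.
    search = {
        "size": 0,
        "aggs": {
            "types_count": {"terms": {"field": field + ".keyword", "size": 10000}}
        },
    }
    return [
        ("PUT", f"/{index}/_mapping", mapping),
        # Step 2: re-process existing documents so they pick up the sub-field.
        ("POST", f"/{index}/_update_by_query?wait_for_completion=false", None),
        ("GET", f"/{index}/_search", search),
    ]

steps = keyword_subfield_steps("my-index", "logGroup")
for method, path, body in steps:
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

Because the update runs with wait_for_completion=false, it executes as a background task, so the aggregation only covers every document once that task has finished.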

ElasticSearch : Group by field with Terms aggregation and aggregate on Min price

Yes, that's because aggregations are applied only to the documents matched by the query. Use post_filter instead of a normal query: the aggregations then run on all documents, and only at the very end are the returned hits narrowed down to the red Tshirt documents.

{
"aggs": {
"variation_groups": {
"terms": {
"field": "variationGroup",
"size": 0
},
"aggs": {
"min_price": {
"min": {
"field": "price"
}
},
"max_price": {
"max": {
"field": "price"
}
},
"top_article": {
"top_hits": {
"size": 1
}
}
}
}
},
"post_filter": { <---- move your query in a post_filter
"query": {
"match": {
"name": "red Tshirt"
}
}
}
}
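The difference between a query and a post_filter can be simulated in memory. The sketch below uses hypothetical documents and replaces the match query with a plain equality check (a real match query analyzes the text and, with default settings, matches documents containing any of the terms), so it only illustrates the order of operations.

```python
# Hypothetical documents; field names follow the example above.
docs = [
    {"name": "red Tshirt", "variationGroup": "g1", "price": 20},
    {"name": "blue Tshirt", "variationGroup": "g1", "price": 15},
    {"name": "red Tshirt", "variationGroup": "g2", "price": 30},
    {"name": "green Tshirt", "variationGroup": "g2", "price": 25},
]

def matches(doc):
    # Simplified stand-in for the match query: exact name equality.
    return doc["name"] == "red Tshirt"

def min_price_by_group(candidates):
    result = {}
    for doc in candidates:
        group = doc["variationGroup"]
        result[group] = min(result.get(group, doc["price"]), doc["price"])
    return result

# Normal query: aggregations only see the matching documents.
with_query = min_price_by_group([d for d in docs if matches(d)])

# post_filter: aggregations see every document; only the hits are filtered.
with_post_filter = min_price_by_group(docs)
hits = [d for d in docs if matches(d)]

print(with_query)
print(with_post_filter)
print(len(hits))
```

In the post_filter case the minimum price of each group reflects all of its variations, while the hits still contain only the red Tshirt documents.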

UPDATE

Based on your comment, I would do it like this:

{
"size": 0,
"aggs": {
"variation_groups": {
"terms": {
"field": "variationGroup",
"size": 0
},
"aggs": {
"min_price": {
"min": {
"field": "price"
}
},
"max_price": {
"max": {
"field": "price"
}
},
"top_article": {
"filter": {
"query": {
"match": {
"name": "red Tshirt"
}
}
},
"aggs": {
"top_article": {
"top_hits": {
"size": 1
}
}
}
}
}
}
}
}
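As a rough in-memory illustration of this second query, the sketch below computes min and max price over every document in a group, but picks the top article only from the documents that pass the filter. The documents are hypothetical, the filter is simplified to name equality, and "top" is simply the first match rather than the highest-scoring hit.

```python
from collections import defaultdict

# Hypothetical documents; g3 has no red Tshirt at all.
docs = [
    {"name": "red Tshirt", "variationGroup": "g1", "price": 20},
    {"name": "blue Tshirt", "variationGroup": "g1", "price": 15},
    {"name": "red Tshirt", "variationGroup": "g2", "price": 30},
    {"name": "green Tshirt", "variationGroup": "g2", "price": 25},
    {"name": "blue Tshirt", "variationGroup": "g3", "price": 10},
]

def summarize(docs, name):
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["variationGroup"]].append(doc)

    summary = {}
    for group, members in groups.items():
        prices = [d["price"] for d in members]
        # Filter sub-aggregation: only matching documents feed top_article.
        filtered = [d for d in members if d["name"] == name]
        summary[group] = {
            "min_price": min(prices),
            "max_price": max(prices),
            "top_article": filtered[0] if filtered else None,
        }
    return summary

summary = summarize(docs, "red Tshirt")
for group, values in sorted(summary.items()):
    print(group, values)
```

Note that a group without any matching document still gets a bucket with its price range; only its top_article part is empty.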

Elasticsearch aggregations: Always return a field in term aggregation

You can use the missing setting of the terms aggregation. It specifies the key of the bucket that collects all documents that have no value in the specified field:

{
"aggs" : {
"cities" : {
"terms" : {
"field" : "city.name",
"missing": "Unspecified" <--- add this
}
}
}
}
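The behaviour can be pictured with a few lines of Python. This is only an in-memory sketch over hypothetical documents: every document without a city name is counted under the key given as `missing`.

```python
from collections import Counter

# Hypothetical documents; two of them have no city name.
docs = [
    {"city": {"name": "London"}},
    {"city": {"name": "Rome"}},
    {"city": {}},
    {},
]

def city_buckets(docs, missing="Unspecified"):
    counts = Counter()
    for doc in docs:
        name = doc.get("city", {}).get("name")
        # Documents without a value fall into the "missing" bucket.
        counts[name if name is not None else missing] += 1
    return counts

buckets = city_buckets(docs)
print(dict(buckets))
```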

Elasticsearch - Terms Aggregation nested field

The nested constraint in the query part will only select all documents that do have a nested field satisfying the constraint. You also need to add that same constraint in the aggregation part, otherwise you're going to aggregate all nested fields of all the selected documents, which is what you're seeing. Proceed like this instead:

// 1. terms aggregation on the desired nested field
TermsAggregationBuilder nestedField = AggregationBuilders.terms("fieldBAgg").field("list.fieldC.keyword");

// 2. filter aggregation on the desired nested field value
TermQueryBuilder onlyBQuery = QueryBuilders.termQuery("list.fieldB.keyword", "ABC");
FilterAggregationBuilder onlyBFilter = AggregationBuilders.filter("onlyFieldB", onlyBQuery).subAggregation(nestedField);

// 3. parent nested aggregation
NestedAggregationBuilder nested = AggregationBuilders.nested("listAgg", "list").subAggregation(onlyBFilter);

// 4. main query/aggregation (sourceBuilder and matchQueryBuilder come from the question's code)
sourceBuilder.query(matchQueryBuilder).aggregation(nested);
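Why the extra filter matters can be seen in a small in-memory model. The sketch below uses hypothetical documents with a nested `list` field and assumes the query selects documents that have at least one entry with fieldB equal to "ABC"; it then counts fieldC values with and without the same constraint inside the aggregation.

```python
from collections import Counter

# Hypothetical documents with a nested "list" field.
docs = [
    {"list": [{"fieldB": "ABC", "fieldC": "x"}, {"fieldB": "DEF", "fieldC": "y"}]},
    {"list": [{"fieldB": "ABC", "fieldC": "z"}]},
    {"list": [{"fieldB": "DEF", "fieldC": "y"}]},
]

def select(docs, field_b):
    # Query part: keep documents with at least one matching nested entry.
    return [d for d in docs if any(e["fieldB"] == field_b for e in d["list"])]

def field_c_counts(docs, only_b=None):
    counts = Counter()
    for doc in docs:
        for entry in doc["list"]:
            # Aggregation part: optionally keep only the matching entries.
            if only_b is None or entry["fieldB"] == only_b:
                counts[entry["fieldC"]] += 1
    return counts

selected = select(docs, "ABC")
unfiltered = field_c_counts(selected)
filtered = field_c_counts(selected, only_b="ABC")

print(dict(unfiltered))
print(dict(filtered))
```

Without the filter, the "y" entry of the first document is counted even though its fieldB is "DEF"; with the filter, only entries whose fieldB is "ABC" contribute.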

ElasticSearch - how to get aggregation of aggregation

You can use the avg_bucket aggregation. You give it a buckets_path pointing at a metric inside another aggregation, and it calculates the average of that metric across all of that aggregation's buckets.

Below is a sample query:

{
"size": 0,
"aggs": {
"bystate": {
"terms": {
"field": "state",
"size": 59
},
"aggs": {
"group-by-index": {
"terms": {
"field": "_index"
}
},
"min_date": {
"min": {
"field": "signedUpAt"
}
},
"avg_date": {
"avg": {
"field": "signedUpAt"
}
}
}
},
"avg_all_state": {
"avg_bucket": {
"buckets_path": "bystate>avg_date"
}
}
}
}
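In plain arithmetic, avg_bucket takes one value per bucket and averages those values. A minimal sketch with hypothetical per-state values (stand-ins for signedUpAt timestamps) shows that this is an average of averages, which can differ from the average over all documents.

```python
from statistics import mean

# Hypothetical signedUpAt values per state (small numbers for readability).
signups = {"CA": [100, 200, 300], "NY": [400]}

# Inner "avg_date" metric: one average per state bucket.
avg_date = {state: mean(values) for state, values in signups.items()}

# "avg_all_state": average of the per-bucket averages.
avg_all_state = mean(avg_date.values())

# For comparison: average over every document, ignoring buckets.
overall = mean(v for values in signups.values() for v in values)

print(avg_date)
print(avg_all_state)
print(overall)
```

Each state contributes equally to avg_all_state regardless of how many documents it holds, so a state with a single document weighs as much as a state with thousands.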

aggregation query and return all fields in elasticsearch

You can do that with a top_hits aggregation.

Try this:

{
"size": 0,
"aggs": {
"by_date": {
"date_histogram": {
"field": "date",
"interval": "day"
},
"aggs": {
"genders": {
"terms": {
"field": "ip",
"size": 100000,
"order": {
"_count": "asc"
}
},
"aggs": {
"cpu_usage": {
"max": {
"field": "cpu_usage"
}
},
"include_source": {
"top_hits": {
"size": 1,
"_source": {
"include": [
"date", "ip", "dev_type", "env", "cpu_usage"
]
}
}
}
}
}
}
}
}
}

Does this help?

Multi-field terms aggregation approach

One approach would be to use top_hits with source filtering to return only the city_id, as shown in the example below.
I don't think this would be prohibitively less performant.
You could try it on your indexes to see the impact before trying the approach with the city_name_id field described in the question.

Example:

POST <index>/_search
{
"size" : 0,
"aggs": {
"city": {
"terms": {
"field": "city"
},
"aggs" : {
"id" : {
"top_hits" : {
"_source": {
"include": [
"city_id"
]
},
"size" : 1
}
}
}
}
}
}

Results:

 {
"key": "London",
"doc_count": 2,
"id": {
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "country",
"_type": "city",
"_id": "2",
"_score": 1,
"_source": {
"city_id": 46
}
}
]
}
}
},
{
"key": "New York",
"doc_count": 1,
"id": {
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "country",
"_type": "city",
"_id": "3",
"_score": 1,
"_source": {
"city_id": 47
}
}
]
}
}
},
{
"key": "Rome",
"doc_count": 1,
"id": {
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "country",
"_type": "city",
"_id": "1",
"_score": 1,
"_source": {
"city_id": 45
}
}
]
}
}
}
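On the client side, the city_id can then be read from the single hit in each bucket. The following Python sketch assumes the buckets have already been parsed from the response; the data is abbreviated from the results above, and the helper name `city_ids` is hypothetical.

```python
# Buckets abbreviated from the results above (only the fields that are read).
buckets = [
    {"key": "London", "doc_count": 2,
     "id": {"hits": {"hits": [{"_id": "2", "_source": {"city_id": 46}}]}}},
    {"key": "New York", "doc_count": 1,
     "id": {"hits": {"hits": [{"_id": "3", "_source": {"city_id": 47}}]}}},
    {"key": "Rome", "doc_count": 1,
     "id": {"hits": {"hits": [{"_id": "1", "_source": {"city_id": 45}}]}}},
]

def city_ids(buckets):
    ids = {}
    for bucket in buckets:
        hits = bucket["id"]["hits"]["hits"]
        if hits:
            # top_hits was requested with size 1, so take the only hit.
            ids[bucket["key"]] = hits[0]["_source"]["city_id"]
    return ids

mapping = city_ids(buckets)
print(mapping)
```

This relies on every document in a bucket sharing the same city_id. For London, doc_count is 2 but only one hit is returned, so a second document with a different city_id would go unnoticed.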

ElasticSearch - terms aggregation split by whitespace

I'm inferring that your mapping type is keyword because you aggregated on a field called "attributes.Title.keyword". The keyword mapping will not tokenize your string so during aggregation time, it will treat the entire string as a unique key.

You want to update your mapping to type: "text" for the title field. I wouldn't call it title.keyword but something like title.analyzed. If you don't specify an analyzer, Elasticsearch will apply the standard analyzer, which should be enough to get you started. You can also use the whitespace analyzer if you only want your titles to be broken down by whitespace (the standard analyzer additionally lowercases tokens and strips most punctuation). You will get a lot of other words in your aggregation, but I'm assuming that you're looking for these shared experience modifier tokens and that, based on frequency, they will rise to the top.

If you're using 5.x or later, make sure to set "fielddata": true, since text fields aren't available for aggregation by default.

mapping:

"properties" : {
"attributes" : {
"properties" : {
"title" : {
"type" : "text",
"fields" : {
"keyword" : { "type" : "keyword" },
"analyzed" : { "type" : "text", "analyzer" : "whitespace", "fielddata" : true }
}
}
}
}
}
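To get a feel for the buckets such an analyzed sub-field would produce, here is a small Python sketch. It assumes whitespace-only splitting with no lowercasing, and the titles are hypothetical. Each token is counted once per document, as a terms aggregation does.

```python
from collections import Counter

# Hypothetical title values, one per document.
titles = [
    "Senior Software Engineer",
    "Senior Data Analyst",
    "Software Engineer",
]

def whitespace_terms(titles):
    counts = Counter()
    for title in titles:
        # Split on whitespace only; case and punctuation are left untouched.
        counts.update(set(title.split()))
    return counts

counts = whitespace_terms(titles)
for term, doc_count in counts.most_common():
    print(term, doc_count)
```

Tokens shared by many titles, such as "Senior" here, end up with the highest counts, which is what lets the common modifier words rise to the top.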

