GROUP BY and Memory limit exceeded
GROUP BY and Memory limit exceeded
Method 1
Simply increase max_memory_usage
Method 2
Enable external memory aggregation:
max_bytes_before_external_group_by
distributed_aggregation_memory_efficient
Method 3
Two-pass aggregation:
– in a subquery, calculate the set of keys
that will be included in the result;
– only for them calculate all aggregate functions.
Method 3, better
Two-pass aggregation:
– in a subquery, calculate the set of keys
that will be included in the result;
– only for them calculate all aggregate functions.
To save memory, we will calculate
a set of hashes instead of a set of strings.
Method 4
Magic options:
max_rows_to_group_by,
group_by_overflow_mode = 'any'
Method 6
Export data for a subset of keys.
Make multiple passes.
Method 7
Calculate data for keys
that occur frequently.
Method 7, better
Calculate data for keys
that occur frequently.
Conclusion
If a query doesn't fit within the limits,
you can still execute it.