You can instantly copy 1TB+ datasets via Neon branches, e.g. for ephemeral environments - Learn how they compare to other "branching" features
Docs/Neon Postgres guides/Functions/Aggregate functions

Postgres array_agg() function

Aggregate values into an array

The Postges array_agg() function collects values from multiple rows into a single array.

It's particularly useful for denormalizing data, creating comma-separated lists, or preparing data for JSON output. For example, you can use it to list all products in a category from a products catalog table or all orders for a customer from an orders table.

Try it on Neon!

Neon is Serverless Postgres built for the cloud. Explore Postgres features and functions in our user-friendly SQL editor. Sign up for a free account to get started.

Sign Up

Function signature

The array_agg() function has this simple form:

array_agg(expression) -> anyarray
  • expression: The value to be aggregated into an array. This can be a column or expression of any data type.
array_agg(expression ORDER BY sort_expression [ASC | DESC] [NULLS { FIRST | LAST }]) -> anyarray
  • expression: The value to be aggregated into an array.
  • ORDER BY: Specifies the order in which the values should be aggregated.
  • sort_expression: The expression to sort by.
  • ASC | DESC: Specifies ascending or descending order (default is ASC).
  • NULLS { FIRST | LAST }: Specifies whether nulls should be first or last in the ordering (default depends on ASC or DESC).

Example usage

Consider an orders table with columns order_id, product_id, and quantity. You can use array_agg() to list all the product IDs for each order.

WITH orders AS (
  SELECT 1 AS order_id, 101 AS product_id, 2 AS quantity
  UNION ALL SELECT 1, 102, 1
  UNION ALL SELECT 2, 103, 3
  UNION ALL SELECT 2, 104, 1
  UNION ALL SELECT 3, 101, 1
)
SELECT
  order_id,
  array_agg(product_id) AS products
FROM orders
GROUP BY order_id
ORDER BY order_id;

This query groups the orders by order_id and aggregates the product_id values into an array for each order.

order_id | products
----------+-----------
        1 | {101,102}
        2 | {103,104}
        3 | {101}
(3 rows)

Advanced examples

Ordered array aggregation

You can specify an order for the elements in the resulting array:

WITH employees AS (
  SELECT 1 AS emp_id, 'John' AS name, 'SQL' AS skill
  UNION ALL SELECT 1, 'John', 'Python'
  UNION ALL SELECT 1, 'John', 'Java'
  UNION ALL SELECT 2, 'Jane', 'C++'
  UNION ALL SELECT 2, 'Jane', 'Ruby'
)
SELECT
  emp_id,
  name,
  array_agg(skill ORDER BY skill) AS skills
FROM employees
GROUP BY emp_id, name
ORDER BY emp_id;

This query aggregates the listed skills for each employee into an alphabetically ordered array.

emp_id | name |      skills
--------+------+-------------------
      1 | John | {Java,Python,SQL}
      2 | Jane | {C++,Ruby}
(2 rows)

Combining with other aggregate functions

array_agg() can be used in combination with other aggregate functions:

WITH sales(category, product, price, sale_date) AS (
  VALUES
    ('Electronics', 'Laptop', 1200, '2023-01-15'::date),
    ('Electronics', 'Smartphone', 800, '2023-01-20'::date),
    ('Electronics', 'Tablet', 500, '2023-02-10'::date),
    ('Books', 'Novel', 20, '2023-02-05'::date),
    ('Books', 'Textbook', 100, '2023-02-15'::date),
    ('Books', 'Cookbook', 30, '2023-03-01'::date)
)
SELECT
  category,
  array_agg(
    (SELECT product || ': ' || SUM(price)::text
     FROM sales s2
     WHERE s2.category = s1.category AND s2.product = s1.product
     GROUP BY s2.product)
  ) AS product_sales
FROM sales s1
GROUP BY category;

This query aggregates products into an array with their total sales, for each category.

category   |                  product_sales
-------------+--------------------------------------------------
 Electronics | {"Laptop: 1200","Smartphone: 800","Tablet: 500"}
 Books       | {"Novel: 20","Textbook: 100","Cookbook: 30"}
(2 rows)

Using array_agg() with DISTINCT

You can use DISTINCT with array_agg() to remove duplicates from the output array:

WITH user_logins AS (
  SELECT 1 AS user_id, 'Chrome' AS browser
  UNION ALL SELECT 1, 'Firefox'
  UNION ALL SELECT 1, 'Chrome'
  UNION ALL SELECT 2, 'Safari'
  UNION ALL SELECT 2, 'Chrome'
)
SELECT
  user_id,
  array_agg(DISTINCT browser ORDER BY browser) AS browsers_used
FROM user_logins
GROUP BY user_id;

This query creates an array of the browsers used by each user, without duplicates and in alphabetical order.

user_id |  browsers_used
---------+------------------
       1 | {Chrome,Firefox}
       2 | {Chrome,Safari}
(2 rows)

Additional considerations

Performance implications

While array_agg() is powerful, it can be memory-intensive for large datasets. The function needs to hold all the aggregated values in memory before creating the final array. For very large result sets, consider using pagination or limiting the number of rows before aggregating.

NULL handling

By default, array_agg() includes NULL values in the resulting array. If you want to exclude NULL values, you can use it in combination with FILTER:

SELECT array_agg(column_name) FILTER (WHERE column_name IS NOT NULL)
FROM table_name;

Alternative functions

  • string_agg(): Concatenates string values into a single string, separated by a delimiter.
  • json_agg(): Aggregates values into a JSON array.

Resources

Last updated on

Was this page helpful?