0%
Reading Settings
Font Size
18px
Line Height
1.5
Letter Spacing
0.01em
Font Family
Table of contents

Solving the "must appear in the GROUP BY clause" Error in Rails and PostgreSQL
Software Engineer
Software Engineer
Ruby on Rails
Ruby on Rails

This blog will give an example of how to resolve this common error when using group_by in Active Record
1. Example problem
First, take a look at this SQL problem. Imagine you're working on a booking application where users can have multiple sessions. You have two models:
// language: ruby # User # email: string class User < ApplicationRecord has_many :sessions end # Session # tag: string class Session < ApplicationRecord belongs_to :user end
Now, you need to find users that having the same email address and joined sessions with a specific tag. This might be necessary for identifying potential duplicate accounts or for a data cleanup task.
2. The wrong approach
Your first attempt might look something like this:
// language: ruby User .joins(:sessions) .where(sessions: { tag: Session::SPECIFIC_TAG }) .group(:email) .having('COUNT(DISTINCT users.id) > 1')
The condition COUNT(DISTINCT users.id) > 1 is checking if each email group has more than 1 distinct user
The generated SQL:
The generated SQL:
// language: sql SELECT "users".* FROM "users" INNER JOIN "sessions" ON "sessions"."user_id" = "users"."id" WHERE "sessions"."tag" = 'SPECIFIC_TAG' GROUP BY "users"."email" HAVING COUNT(DISTINCT users.id) > 1
However, running this query results in an error:
// language: bash PG::GroupingError: ERROR: column "users.id" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT "users".* FROM "users" INNER JOIN "sessions" ON "sessions"...
This error occurs because of how SQL handles grouping and non-aggregated columns. When using a
- Be included in the
GROUP BY clause, or - Be used in an aggregate function (like
COUNT ,SUM ,AVG , etc.)
In our query, we're grouping by email, but Rails' default
3. Solution
To solve this problem and efficiently retrieve the data we need, we can use PostgreSQL's array_agg function along with pluck. Here's the working query:
// language: ruby User .joins(:sessions) .where(sessions: { tag: Session::SPECIFIC_TAG }) .group(:email) .having('COUNT(DISTINCT users.id) > 1') .pluck("users.email, array_agg(users.id)")
It will return an array of arrays, where each inner array contains:
- The shared email
- An array of user IDs that share that email (using PostgreSQL's
array_agg function)
// language: ruby [["[email protected]", [1, 99, 543]]
You might wonder why we use pluck instead of select in our solution. Let's explore what happens when we try to use select:
// language: ruby User .joins(:sessions) .where(sessions: { tag: Session::SPECIFIC_TAG }) .group(:email) .having('COUNT(DISTINCT users.id) > 1') .select('users.email, array_agg(users.id) as user_ids')
This query still works, but it comes with some caveats:
- Partial Objects: This query returns
User objects, but they're partial objects. They only have the email attribute populated, plus a custom user_ids attribute that doesn't correspond to any column in the users table. - Type Casting: The user_ids attribute will be a string representation of the PostgreSQL array, not a Ruby array. You'd need to parse it manually.
- ActiveRecord Expectations: Some ActiveRecord methods might not work as expected on these partial objects, potentially leading to subtle bugs.
- Performance: Creating ActiveRecord objects is more resource-intensive than just returning the raw data.
Remember, when dealing with complex queries involving grouping and aggregation, it's often helpful to think about exactly what data you need and how you can retrieve it most efficiently from the database.
Related blogs

I just made a serious mistake with Rails destroy_all
Rails Active Record is convenient and human-readable for interacting with SQL models. But not understanding the generated SQL behind these elegant methods can cause serious data loss that might go unnoticed until it's too late.1. Setting the SceneThi...
Ruby on Rails
Ruby on Rails
Software Engineer
Software Engineer


Modern JavaScript OOP Features
When I first started learning JavaScript, its object-oriented features felt like an afterthought. The language had objects and prototypal inheritance, but it was missing key features like true private fields and methods. That’s why many developers tu...
Software Engineer
Software Engineer
