
    Integer and String Enum in Rails

    Software Engineer
    Ruby on Rails

    TL;DR

    For General Use: I would lean towards using strings for enums due to their readability and maintainability benefits.
    For Performance-Critical Applications: I would opt for integers, especially when dealing with large datasets or strict performance requirements.


    What is an Enum and Why Use It?

    Enums (short for enumerations) in Rails are a way to define a set of predefined values for a column in a database table. They provide built-in validation and are highly readable and maintainable. Here's a simple example:
    // language: ruby
    class User < ApplicationRecord  
      enum status: { active: 0, inactive: 1, cancelled: 2 }
    end

    Declaring the enum gives you built-in helper methods:
    // language: ruby
    User.active # list all active users
    user = User.create status: :active
    user.active? # true
    user.inactive! # set status to inactive

    Enums also guard the column: assigning a value outside the predefined set raises an ArgumentError, so only predefined values are accepted.
    // language: ruby
    user = User.create status: :trial # raises ArgumentError ('trial' is not a valid status)
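
To make the mapping and the guard concrete, here is a minimal plain-Ruby model of the behaviour. It is a sketch, not how Rails implements enums: the `StatusColumn` class and its `stored_value` reader are hypothetical names used only for illustration.

```ruby
# Simplified model of an integer-backed enum; not Rails' actual implementation.
class StatusColumn
  MAPPING = { active: 0, inactive: 1, cancelled: 2 }.freeze

  # What would be written to the database column.
  attr_reader :stored_value

  # Accepts a name, rejects anything outside the mapping.
  def status=(name)
    key = name.to_sym
    raise ArgumentError, "'#{name}' is not a valid status" unless MAPPING.key?(key)

    @stored_value = MAPPING.fetch(key)
  end

  # Translates the stored integer back to its name (nil if nothing is set).
  def status
    MAPPING.key(@stored_value)&.to_s
  end
end

record = StatusColumn.new
record.status = :inactive
puts "#{record.status} is stored as #{record.stored_value}"
```

The application only ever sees the name, while the column holds the integer. That split is what the pros and cons in the following sections come down to.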

    Integer Enum

    You can define an integer enum like this:
    // language: ruby
    class User < ApplicationRecord  
      enum status: { active: 0, inactive: 1, cancelled: 2 }
    end

    Pros
    • Performance: Integer enums are generally faster for database operations, including queries and indexing.
    // language: ruby
    User.active.average(:balance)
    • Storage Efficiency: Integers take up less space in the database compared to strings.
    • Consistency: Less prone to human error such as typos.

    Cons
    • Readability: Integer values are not self-explanatory. If you query the database directly, or read it from another framework, it is unclear what each number represents.
    // language: sql
    SELECT * FROM users WHERE status = 0;
    • Migration Complexity: Changing enum values requires careful database migrations to avoid breaking the application.
    For example, suppose you start with:
    // language: ruby
    enum program: { summer_2020: 0, winter_2020: 1 }
    Next year, you want to add programs for the year 2021. Your updated enum might look like this:
    // language: ruby
    enum program: { summer_2020: 0, winter_2020: 1, summer_2021: 2, winter_2021: 3 }
    Now, suppose you want to group programs by season rather than by year. You might want the enum to look like this:
    // language: ruby
    enum program: { summer_2020: 0, summer_2021: 2, winter_2020: 1, winter_2021: 3 }
    However, this reordering leaves the integer values out of sequence (0, 2, 1, 3), which is confusing to read. To renumber them so that they follow the new grouping, you would need a migration that updates the existing records.
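
One way to build such an update is sketched below in plain Ruby. The `enrollments` table name and the `remap_sql` helper are hypothetical, since the post does not say which model owns `program`. The renumbered mapping (0, 1, 2, 3 in season order) is an assumption about what "properly grouped" means here. A single UPDATE with a CASE expression is used so that values that swap places (1 and 2) are not overwritten halfway, which can happen when running one UPDATE per value.

```ruby
# Integer mappings before and after grouping the programs by season.
OLD_PROGRAMS = { summer_2020: 0, winter_2020: 1, summer_2021: 2, winter_2021: 3 }.freeze
NEW_PROGRAMS = { summer_2020: 0, summer_2021: 1, winter_2020: 2, winter_2021: 3 }.freeze

# Hypothetical helper: builds one UPDATE that moves every changed value at once.
# Returns nil when both mappings store the same integers.
def remap_sql(table, column, old_map, new_map)
  changes = old_map.filter_map do |name, old_value|
    new_value = new_map.fetch(name) # raises KeyError if a name was dropped
    [old_value, new_value] unless old_value == new_value
  end
  return nil if changes.empty?

  whens = changes.map { |old_value, new_value| "WHEN #{old_value} THEN #{new_value}" }.join(" ")
  old_values = changes.map(&:first).join(", ")
  "UPDATE #{table} SET #{column} = CASE #{column} #{whens} END " \
    "WHERE #{column} IN (#{old_values})"
end

puts remap_sql("enrollments", "program", OLD_PROGRAMS, NEW_PROGRAMS)
```

In a Rails migration, the resulting string could be passed to `execute` in `up`, and calling the helper with the two mappings swapped gives the statement for `down`. The enum declaration in the model has to change together with the data, otherwise existing rows are read back under the wrong names.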

    String Enum

    You can define a string enum like this:
    // language: ruby
    class User < ApplicationRecord  
      enum status: { active: 'active', inactive: 'inactive', cancelled: 'cancelled' }
    end
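
Writing every name twice invites typos, so the mapping can also be generated from a list of names. This is a plain-Ruby sketch, and the constant names are illustrative only.

```ruby
# Each name maps to itself as a string, so it only has to be typed once.
STATUS_NAMES = %w[active inactive cancelled].freeze
STATUSES = STATUS_NAMES.to_h { |name| [name.to_sym, name] }.freeze

p STATUSES

# In the model (Rails, not executed in this sketch):
#   enum status: STATUSES
```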

    Pros
    • Readability: String values are human-readable and self-explanatory when inspecting the database.
    // language: sql
    SELECT * FROM users WHERE status = 'active';
    • Maintainability: It is easier to add new values or change existing ones without worrying about data loss or complex migrations.
    • Human-Friendly: It is beneficial for teams that frequently read data directly from the database or when multiple systems need to access the enum values.

    Cons
    • Performance: String enums can be slower for database operations compared to integers.
    • Storage: Strings take up more space in the database, especially with longer names.
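
As a rough illustration of the storage point, the sketch below compares the payload size of each status name with a fixed-width integer. The 4-byte integer width is an assumption about the column type. The database adds its own per-value overhead on top, so these are lower bounds rather than measured sizes.

```ruby
# Assumption: the integer enum lives in a 4-byte integer column.
INTEGER_BYTES = 4
STATUS_LABELS = %w[active inactive cancelled].freeze

# Payload bytes of each label when stored as a string.
LABEL_BYTES = STATUS_LABELS.to_h { |label| [label, label.bytesize] }.freeze

LABEL_BYTES.each do |label, bytes|
  puts format("%-10s %d bytes as a string vs %d as an integer", label, bytes, INTEGER_BYTES)
end
```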

    Space and Speed

    For details, see the post "Rails enums integer vs string way" by Michał Rudzki on Medium.
    Its main points are summarized here.

    The test data sets were:
    • 4 kinds of pets and 11 million records
    • 8 kinds of pets and 28.5 million records
    • 8 kinds of pets with long names and 10 million records

    Query speed
    Without an index on the column, the integer column is faster by about 14% on the first data set and 12% on the second.
    With an index, the gap shrinks to 7% in favour of strings on the first data set and 2% in favour of integers on the second. Indexes take up a lot of space, but depending on usage the absolute speed-up can be well worth it.
    With longer pet names the differences are more visible: 22% for index-less columns and 43% for indexed columns.

    Storage
    With short names, the string column takes up 2297 MB and the integer column 2292 MB.
    The strings here are very short, so the difference is only 5 MB; longer strings take noticeably more space.
    With longer names on 10 million records, the whole table takes 802 MB with an integer enum and 888 MB with a string enum, a difference of 86 MB, or about 10%.
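
The arithmetic behind the long-name figures can be checked with a few lines of Ruby. The table sizes and row count are the ones quoted from the post. Treating 1 MB as 1024 × 1024 bytes is an assumption, so the per-row figure is approximate.

```ruby
# Figures quoted from the benchmark with longer names.
INTEGER_TABLE_MB = 802
STRING_TABLE_MB  = 888
ROWS             = 10_000_000

EXTRA_MB      = STRING_TABLE_MB - INTEGER_TABLE_MB
EXTRA_PERCENT = (EXTRA_MB.to_f / INTEGER_TABLE_MB * 100).round(1)

# Assumption: 1 MB = 1024 * 1024 bytes.
EXTRA_BYTES_PER_ROW = (EXTRA_MB * 1024 * 1024).to_f / ROWS

puts "extra: #{EXTRA_MB} MB (#{EXTRA_PERCENT}%), about #{EXTRA_BYTES_PER_ROW.round(1)} bytes per row"
```

Under that assumption the string enum costs roughly 9 extra bytes per row, which matters only once the table reaches many millions of rows.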


    Ultimately, the choice depends on your specific use case and the needs of your application and team. By understanding the trade-offs, you can make an informed decision that best suits your project.
    Happy coding!
    Created at 2024-09-05 23:53:58 +0700
