A polymorphic relationship is a type of relationship where a single record can be associated with records from different tables through a single universal structure. In simpler terms, one table can reference different tables, not just one.
Let’s look at an example
Imagine we’re building a social network and we have entities like posts, photos, and videos. We need to implement the ability to add comments to these entities. The most obvious ways to implement this are:
- Create a separate table for each content type, for example
post_comments,photo_comments,video_comments, and so on. - Create a universal
commentstable and link it to the others via foreign keys - for example,postId,photoId,videoId, and so on.
These approaches work, but when we add new content types, we’ll have to either create new tables in the database or add new foreign keys, which will complicate the API’s business logic and the database schema.
A popular solution in such cases is a polymorphic relationship. The idea is that we have one universal comments table with entity_id and entity_type columns.
- entity_id - the identifier of the record the comment was added to. It’s important to understand that this is not a foreign key, but just a number or a string (for example, if you use UUIDs or a similar type of ID).
- entity_type - the type of content the comment belongs to, for example post or photo. We need this column to know which table to query. For instance, if we want to fetch all comments for the post with
ID=1, we filter only those comments whereentity_type=post.
The main thing to understand is that there are no perfect solutions. If you use polymorphic relationships, you gain flexibility and avoid duplication, but you trade off the following:
- Referential integrity - since we don’t have foreign keys, the database can’t guarantee the integrity of the data and the relationships between them. This can be partially addressed by adding
CHECKconstraints andPartial Indexes, but that’s more of a workaround than a real solution. - JOIN - database queries become more complex and slower due to the lack of foreign keys. This is mainly an issue for systems with large amounts of data and heavy load, so in such cases it’s better to use the approaches I mentioned earlier.
- CASCADE - since it isn’t possible without foreign keys, we have to implement it at the business logic level. This directly affects how transactions work, because it can break data consistency
(Consistency, in ACID terms).