In-depth Interview Question Breakdowns
A collection of questions and answers for technical interview preparation. Includes JavaScript/Node.js, System Design, databases, and cloud technologies.
What are polymorphic relationships?
A polymorphic relationship is a type of relationship where a single record can be associated with records from different tables through a single universal structure. In simpler terms, one table can reference different tables, not just one.
Let’s look at an example
Imagine we’re building a social network and we have entities like posts, photos, and videos. We need to implement the ability to add comments to these entities. The most obvious ways to implement this are:
- Create a separate table for each content type - for example, post_comments, photo_comments, video_comments, and so on.
- Create a universal comments table and link it to the others via foreign keys - for example, postId, photoId, videoId, and so on.
These approaches work, but when we add new content types, we’ll have to either create new tables in the database or add new foreign keys, which will complicate the API’s business logic and the database schema.
A popular solution in such cases is a polymorphic relationship. The idea is that we have one universal comments table with entity_id and entity_type columns.
- entity_id - the identifier of the record the comment was added to. It’s important to understand that this is not a foreign key, but just a number or a string (for example, if you use UUIDs or a similar type of ID).
- entity_type - the type of content the comment belongs to, for example post or photo. We need this column to know which table to query. For instance, if we want to fetch all comments for the post with ID = 1, we filter the comments where entity_type = post and entity_id = 1.
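As a sketch, the schema and the query could look like this (table and column names are illustrative):

```sql
-- A single comments table serving posts, photos, videos, etc.
-- Note: entity_id is NOT a foreign key - just a plain value.
CREATE TABLE comments (
  id          BIGSERIAL    PRIMARY KEY,
  entity_id   BIGINT       NOT NULL,
  entity_type VARCHAR(32)  NOT NULL, -- 'post', 'photo', 'video', ...
  body        TEXT         NOT NULL
);

-- All comments for the post with ID = 1:
SELECT *
FROM comments
WHERE entity_type = 'post' AND entity_id = 1;
```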
The main thing to understand is that there are no perfect solutions. If you use polymorphic relationships, you gain flexibility and avoid duplication, but you trade off the following:
- Referential integrity - since we don’t have foreign keys, the database can’t guarantee the integrity of the data and the relationships between them. This can be partially addressed by adding CHECK constraints and partial indexes, but that’s more of a workaround than a real solution.
- JOIN - database queries become more complex and slower due to the lack of foreign keys. This is mainly an issue for systems with large amounts of data and heavy load, so in such cases it’s better to use the approaches I mentioned earlier.
- CASCADE - cascading deletes and updates aren’t possible without foreign keys, so we have to implement them at the business-logic level. This directly affects how transactions work, because it can break data consistency (Consistency, in ACID terms).
What is a foreign key in SQL?
A foreign key is a column in a table that references a primary key in another table. The most important thing to understand is that a foreign key is not just an ID value pointing to a specific record in another table - it’s a mechanism that enforces referential integrity.
Simple example
We have users and orders tables. A user can create orders, so a record in the orders table must be linked to a specific record in users. To connect them, we use the orders.user_id field, which indicates who created the order.
From a practical standpoint, we could implement this relationship without a foreign key by simply storing the ID as a number in the user_id column. But this approach has several significant downsides:
- The database doesn’t enforce the relationship between the tables. In this case we have a one-to-many relationship (one user has many orders), but the database has no idea about it - user_id is just a number, not a reference to another table.
- There’s no protection against non-existent data. If we try to write the value 100 into user_id, we won’t get any error telling us that such a user doesn’t exist.
- There’s no control over what happens when a user is deleted or updated. If, when deleting a user, we also need to delete all of their orders, then without a foreign key we’ll have to handle it in business logic. A foreign key, on the other hand, provides a referential actions mechanism that lets you control this process. You can read more about this mechanism here - https://devs-hive.tech/interview-qa/referential-actions-sql.
All these problems are solved by a foreign key, which creates a strong, database-enforced relationship between the tables.
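A minimal sketch of this relationship (PostgreSQL-style syntax; names follow the example above):

```sql
CREATE TABLE users (
  id   BIGSERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

CREATE TABLE orders (
  id      BIGSERIAL PRIMARY KEY,
  -- The foreign key: the database now rejects user_id values
  -- that don't exist in users.id.
  user_id BIGINT  NOT NULL REFERENCES users (id),
  total   NUMERIC NOT NULL
);
```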
What are referential actions in SQL?
One of those questions that can leave a Node.js developer slightly stunned. For some reason, SQL and an understanding of relational databases have become the Achilles’ heel of backend development in the JavaScript world.
Referential actions are rules that define what should happen to rows in a child table when a row in the parent table, referenced by a foreign key, is updated or deleted.
There are three main ways to control this, and now we’ll look at them with an example. Let’s imagine we have users and orders tables. In the orders table, there is a user_id field, which is a foreign key that links each order to a specific user.
If we try to delete a user with id = 1, and that user is still being referenced by rows in the orders table, the referential integrity mechanism will come into play. It determines whether the deletion is allowed and, if so, under what rules. Next, we’ll go through those rules one by one, and you’ll see that it’s actually very simple and easy to understand.
RESTRICT
If a row in users is referenced by at least one row in orders, we won’t be able to delete it. The only way to delete such a record is to first delete the related rows from orders, and only then will referential integrity allow us to delete the user.
CASCADE
The most dangerous rule, as it means that when you delete a row from the users table, all rows in orders that reference that user will also be deleted.
SET NULL
When a user is deleted from the users table, all orders that reference that user will have their foreign key (user_id) set to NULL. The same rules apply to UPDATE operations as well, but since they are used relatively rarely, interviewers will usually ask specifically about DELETE.
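The three rules above are declared on the foreign key itself. A hedged sketch, keeping the users/orders example:

```sql
-- Pick one of the three when defining the foreign key:
CREATE TABLE orders (
  id      BIGSERIAL PRIMARY KEY,
  user_id BIGINT REFERENCES users (id)
    ON DELETE RESTRICT   -- forbid deleting a user that still has orders
    -- ON DELETE CASCADE  -- delete the user's orders together with the user
    -- ON DELETE SET NULL -- keep the orders, set user_id to NULL
);
```

Note that SET NULL only works if the column is nullable, which is why user_id has no NOT NULL constraint in this sketch.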
What is the difference between Process and Thread in Node.js?
This question is more related to computer science, but it’s asked quite often in Node.js interviews, so it makes sense to cover it in this category. First, let’s define a process and a thread.
A process is a separately running program for which the operating system allocates resources. The key idea behind a process is isolation: issues in one process don’t directly affect others, and communication happens via IPC (inter-process communication) - for example, through pipes, sockets, and so on.
A thread is a sequence of instructions executed within a process, with its own execution context (Thread Execution Context). Since threads exist within a process, they share its memory, which makes them a great fit for parallel tasks. And because there’s no need for IPC, communication between threads is much faster than between processes.
In a typical Node.js API that uses multiple processes, threads live inside processes, and each process is an isolated container. It’s also worth noting that libuv runs both in the main process and in worker threads: whenever you create a thread or a process, a separate Event Loop is created for it.
Let’s summarize the key differences:
- A process is a top-level container.
- Each process has its own memory, while threads share memory.
- A process has an independent lifecycle, while a thread depends on the process.
- Communication between processes is slower and less efficient than communication between threads.
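To make the shared-memory difference concrete, here is a hedged Node.js sketch: a worker thread writes directly into memory shared with its parent, something separate processes could only do via IPC.

```typescript
import { Worker } from 'worker_threads';

// Memory shared between the main thread and the worker - no IPC needed.
const shared = new SharedArrayBuffer(4);
const view = new Int32Array(shared);

// The worker runs inside the same process, with its own event loop.
const worker = new Worker(
  `const { workerData } = require('worker_threads');
   new Int32Array(workerData)[0] = 42;`,
  { eval: true, workerData: shared }
);

worker.on('exit', () => {
  console.log(view[0]); // the value written by the worker: 42
});
```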
What is the CAP Theorem?
This is one of the must-know questions in Senior interviews for projects involving distributed systems. It’s a theoretical question, but it shows how deeply you’ve gone into system design and microservices architecture.
The CAP theorem describes a fundamental trade-off in distributed data storage systems where data is replicated across multiple nodes. It states that such a system cannot simultaneously guarantee all three properties - Consistency, Availability, and Partition Tolerance - and that during a network partition it has to sacrifice either consistency or availability.
- Consistency: every read request returns the most up-to-date value, meaning all nodes behave as a single source of truth.
- Availability: every request receives a response - the system remains available and doesn’t completely block.
- Partition tolerance: the system continues to operate even if communication between cluster components is lost or significantly delayed.
The practical point of the theorem is that in a distributed system, network failures are inevitable, so we treat partition tolerance as a given. Under these conditions, we have to choose between Consistency + Partition Tolerance (CP) and Availability + Partition Tolerance (AP).
CP: system preserves data correctness and freshness, but it may temporarily block operations to avoid returning inconsistent data. As a result, the system may sometimes be unavailable and return an error or a timeout in order to maintain data consistency.
AP: the system continues to respond at all times, but it allows that during a partition, different nodes may return stale values. With this approach, consistency is achieved later - this is also known as Eventual Consistency.
It’s important to understand that CAP forces a trade-off only when problems occur. Under normal conditions, a system should have both high availability and acceptable data consistency.
What is a Circuit Breaker?
One of those questions that helps determine whether you’ve worked with complex systems. The first thing to understand is that a Circuit Breaker is a pattern that stops a system from attempting an operation that is very likely to fail. This, in turn, saves resources and stops the endless flow of identical logs.
Example usage
We have Service A that sends a request to Service B. In turn, it calls a third-party API and stores something in a database. A Circuit Breaker can act as a proxy between Services A and B. This proxy works like a fuse and, if a critical error occurs, it can block traffic to B. The conditions under which the breaker trips can be different. The most common approach is a threshold based on the number of errors of a certain type (for example, 500/502 responses) within a specific time window (for example, 60 seconds).
After that, it will check whether the issue has been resolved, and once the service is working again, it will restore traffic to it. Technically, a Circuit Breaker has several states that determine how requests to the service are handled.
Closed
This state means that everything is fine with the service and requests are being routed where they should be. If critical errors occur, it starts counting them, and once their number exceeds the specified threshold, the proxy transitions to the Open state. In addition, a timer starts, and once it expires, the proxy moves to the Half-Open state, where the service’s health is checked.
Open
In this state, the Circuit Breaker will not route requests to the service. Instead, it will return an error or a fallback response. It depends on the implementation, but the main purpose of this state is to prevent traffic from reaching the service.
Half-Open
In this state, we allow a limited number of requests to reach the service. If all of them complete successfully, we move the Circuit Breaker to the Closed state and resume traffic to the service. If at least one request fails, we move it back to the Open state and restart the timer.
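The three states above can be sketched in a few dozen lines. This is a minimal illustration, not a production implementation - thresholds, names, and the single-function API are all assumptions:

```typescript
type State = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker<T> {
  private failures = 0;
  private openedAt = 0;
  state: State = 'CLOSED';

  constructor(
    private fn: () => Promise<T>,
    private failureThreshold = 3,
    private resetTimeoutMs = 60_000,
  ) {}

  async call(): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('Circuit is open - failing fast');
      }
      this.state = 'HALF_OPEN'; // timer expired: let a trial request through
    }
    try {
      const result = await this.fn();
      this.failures = 0;
      this.state = 'CLOSED'; // success closes the breaker again
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === 'HALF_OPEN' || this.failures >= this.failureThreshold) {
        this.state = 'OPEN'; // trip the breaker and start the timer
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}
```

In the Service A / Service B example, `fn` would be the HTTP call from A to B; real libraries add things like error-type filters and sliding time windows on top of this skeleton.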
Do You Follow the OWASP Top 10?
This is a fairly common question in Senior-level interviews, and it is a standard way to check whether a candidate keeps up with security trends. Usually, you will first be asked about the essence of the project itself, after which you may be asked about the most critical vulnerabilities at the moment.
OWASP Top 10 is a list of the 10 most common and most dangerous security risks in the web, published by OWASP (Open Worldwide Application Security Project). You need to follow it in order to understand how to protect your applications or APIs.
Also, most people do not follow it and have probably never gone beyond XSS, so you can use the OWASP Top 10 as a trump card in an interview that can significantly increase your chances of success.
What problem do Hooks solve in React?
This question is often asked in Junior/Middle interviews, and it shows how deeply you’ve understood why React Hooks were introduced in the first place.
If we go back to the era of class components, we can see an obvious problem: state is tightly coupled to a specific component, and we can’t reuse it. Of course, people found workarounds - using inheritance or mixins - but that usually made things worse rather than solving the problem.
Let’s look at an example
Imagine we have a so-called Toggle that switches a button’s state. Right now, we use it in one component and it doesn’t cause any issues.
But later, we need to add the same functionality for modal windows. And this is exactly where the main problem with class components shows up. Since this.state is tied to a specific class, we can’t reuse that logic in another component.
So we end up writing workarounds. The most common ones are render props, inheritance, and HOCs. The last one is the most reasonable option on that list and often helped, but it made the codebase more complex.
With Hooks, the situation improved: we can extract all the toggle logic into a custom hook and reuse it in both a button and a modal window.
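A framework-free sketch of that idea (the createToggle factory is hypothetical; in real React this would be a custom useToggle hook built on useState and a callback):

```typescript
// The toggle logic lives in one place and can be reused by a button,
// a modal window, or anything else - the core promise of custom hooks.
function createToggle(initial = false) {
  let on = initial;
  return {
    get on() { return on; },
    toggle() { on = !on; },
  };
}

const buttonToggle = createToggle();     // used by a button
const modalToggle = createToggle(false); // the same logic reused by a modal
buttonToggle.toggle();
console.log(buttonToggle.on, modalToggle.on); // true false
```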
What is a dead-letter queue?
This question often comes up in backend or full-stack interviews, especially for projects involving distributed systems. Each specific technology (SQS, RabbitMQ, Kafka, etc.) has its own approach to dead-letter queues (DLQs), so it’s better to focus on understanding how it works and what problems it solves.
From a system design perspective, it’s a pattern in message-driven architectures that isolates failed messages into a separate queue. This prevents the main flow from being blocked, avoids data loss, and makes it possible to investigate the root cause and reprocess the messages.
It’s important to understand that a DLQ doesn’t solve the root cause of the failure. Its purpose is to store failed messages in a controlled way, analyze them, and decide how to handle them going forward.
Let’s look at an example
We have a Payments microservice that publishes an event after a successful payment: PaymentSucceeded { paymentId, userId, planId, amount, occurredAt }. Another microservice, Subscriptions, listens to it and, after receiving it, creates a subscription in the database and sends a welcome email via a third-party provider.
The simplest scenario where a DLQ is useful is when a third-party email-sending API fails. After we get a 502 error from the third-party service, we retry three more times with a 30-second interval (retry + backoff). If we keep getting 502s, we send the message to the DLQ. Once the provider is back up, we try to process those messages again - this is called a DLQ re-drive.
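A hedged sketch of that consumer-side logic; the broker-specific delivery and acknowledgement details are omitted, and the function names are illustrative:

```typescript
// Retry with a fixed delay, then route the message to the DLQ.
async function handleWithDlq<M>(
  message: M,
  process: (m: M) => Promise<void>,
  sendToDlq: (m: M, error: string) => Promise<void>,
  maxRetries = 3,
  delayMs = 30_000,
): Promise<void> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      await process(message);
      return; // processed successfully
    } catch (err) {
      if (attempt === maxRetries) {
        // Retries exhausted: isolate the message instead of blocking the flow.
        await sendToDlq(message, String(err));
        return;
      }
      await new Promise((r) => setTimeout(r, delayMs)); // wait before retrying
    }
  }
}
```

A DLQ re-drive would then read messages back from the DLQ and feed them through the same handler once the provider recovers.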
What are private and public keys?
One of the common interview questions for a Node.js developer position. Most often, it comes up in discussions about the difference between HTTP and HTTPS or in the context of authorization, especially JWT.
First, you need to understand what private and public keys are. A public key is not secret and can be shared freely, while a private key must be stored securely and never shown to anyone.
Encryption
In this example, we’ll explain in simple terms how encryption works. Imagine you have a friend and want them to send you some secret information in encrypted form so you can decrypt it after receiving it.
To do this, you need to generate a key pair (a public key and a private key). You keep the private key secret while giving your friend the public key. Using the public key, your friend can encrypt the message and send it to you. But only the person who has the private key can decrypt it.
Signature
Continuing with our message example, imagine that now you want to send a secret message to your friend. Your friend generates a key pair and gives you their public key. Then you encrypt the message with their public key so that only they can decrypt it. But if they have shared their public key with several people, the question arises: how can they be sure that the message is really from you and that no one has forged it or changed it along the way?
This is exactly the problem that a digital signature solves. You sign the message with your private key, and your friend verifies the signature using your public key. That way, even if many people have your public key and can encrypt messages for you, they still cannot forge your signature, because that would require your private key.
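In Node.js, signing and verifying can be sketched with the built-in crypto module:

```typescript
import { generateKeyPairSync, sign, verify } from 'crypto';

// Your key pair: the public key is shared, the private key never leaves you.
const { publicKey, privateKey } = generateKeyPairSync('rsa', {
  modulusLength: 2048,
});

const message = Buffer.from('it is really me');

// You sign with YOUR private key...
const signature = sign('sha256', message, privateKey);

// ...and anyone with your public key can verify the signature.
const isAuthentic = verify('sha256', message, publicKey, signature);
console.log(isAuthentic); // true

// A tampered message fails verification.
const forged = verify('sha256', Buffer.from('forged text'), publicKey, signature);
console.log(forged); // false
```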
If you’re now trying to remember where else you’ve seen this, the same principle is used in authorization with JWT tokens.
What are Noisy Tenants?
Noisy tenants are a common problem in multi-tenant systems: one “noisy” tenant (customer) starts consuming a disproportionate amount of resources - CPU, memory, or queue capacity - or exceeds API limits. As a result, other customers are affected: latency increases, timeouts appear, and session problems or downtime occur.
How does this look in practice?
For example, you use a message broker (SQS, RabbitMQ, etc.), and one tenant generates a large number of events. Since the broker and event consumers are shared, they mostly process messages from the noisy tenant and other customers suffer from increased latency.
Another example is related to third-party integrations. You haven’t set clear per-customer rate limits and a noisy tenant calls the third-party API so frequently that it very quickly hits the rate limit and other clients can’t use the integration properly.
Main causes:
- Shared infrastructure: message brokers, databases, servers, etc.
- Shared limits of external services or a lack of rate limiting
- Uneven traffic distribution
Ways to solve the problems:
- Limits and fair resource allocation
- Quotas for heavy and expensive operations
- Dedicated resources for high-volume clients
What is throttling?
Throttling is a programming technique that limits how often something can run. Put simply, we specify that the code can’t be executed more often than once every N milliseconds. The best way to understand throttling is to look at an example.
Throttling in APIs
For example, you’re building a system that works with many tenants (customers), who can choose pricing plans based on their needs for your API.
A typical problem in such systems is noisy tenants. Imagine a customer chose the highest-tier plan, but the rate of requests they make is so high that they end up consuming a significant portion of the resources. As a result, other customers experience response latency issues or slow data loading.
In this case, by using throttling you can specify that the maximum request rate is 1,000 per second. This can potentially reduce the load on your resources and allow all customers to use the system properly.
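A minimal sketch of throttling as a function wrapper:

```typescript
// fn runs at most once every waitMs milliseconds; extra calls are dropped.
function throttle<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let lastCall = 0;
  return (...args: A) => {
    const now = Date.now();
    if (now - lastCall >= waitMs) {
      lastCall = now;
      fn(...args);
    }
  };
}

let handled = 0;
const handleRequest = throttle(() => { handled += 1; }, 1000);

handleRequest(); // runs
handleRequest(); // dropped - less than 1000 ms since the last run
handleRequest(); // dropped
console.log(handled); // 1
```

On the server side, the same idea is usually implemented per tenant with algorithms like token bucket or sliding window rather than a simple timestamp check.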
How can you determine an algorithm’s complexity?
First, let’s clarify what algorithmic complexity is. It’s a way to estimate how many resources an algorithm uses as the input size grows. There are two ways to measure an algorithm’s complexity.
Time Complexity
The time complexity of an algorithm is a measure of how much it slows down as the amount of data grows. Big O notation is used here, and it provides an upper bound.
Let’s break it down in more detail.
O(1) - A good example of constant time complexity is an assignment operation or accessing an object property by key.
O(log n) - With this complexity, execution time grows very slowly, even as the input size increases significantly. There are many examples of this, and the simplest one is binary search.
O(n) - With this complexity, the execution time grows in direct proportion to the number of elements. For example, imagine iterating over an array of 100 elements. With a logarithmic-time algorithm, increasing the number of elements to 100,000 would barely affect the runtime, whereas an O(n) algorithm would slow down significantly.
O(n log n) - In simple terms, this is a combination of linear and logarithmic complexity. For example, an algorithm may use a divide-and-conquer strategy O(log n) to split an array and then use iteration O(n) to process all elements.
O(n²) - This happens when, for each element, the algorithm checks all other elements - most commonly when there are two nested loops.
O(2ⁿ) - Exponential complexity is one of the slowest. Put simply, it doubles the number of operations with each additional element. For example, with 10 elements, you’d need roughly 1,024 operations.
A simple example of determining an algorithm’s complexity
We have a function that returns the minimum value in an array. Essentially, we have two operations: assigning let min = arr[0] and iterating over the array arr. The assignment operation always takes constant time, O(1), because it doesn’t depend on the number of elements, while iterating over the array depends on how many elements it contains, so the complexity is O(n). The overall complexity of an algorithm is determined by the most expensive operation. In our case, that’s iterating over the array, so the time complexity of the findMin function is O(n).
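A sketch of the findMin function described above:

```typescript
function findMin(arr: number[]): number {
  let min = arr[0];          // O(1) - doesn't depend on the input size
  for (const value of arr) { // O(n) - grows with the number of elements
    if (value < min) min = value;
  }
  return min;                // overall: O(1) + O(n) → O(n)
}

console.log(findMin([5, 2, 8, 1, 9])); // 1
```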
Space Complexity
This is the amount of additional memory an algorithm needs, beyond the input data. For example, creating extra variables, arrays, objects, and so on. To determine space complexity, we use the same Big O notation, but instead of counting operations, we focus on how much extra memory (data structures) is being used.
After answering the question, it’s a big plus to emphasize the importance of understanding complexity, since it directly relates to understanding data structures. And data structures are a foundation for understanding modern distributed systems, databases, caching, and many other important topics.
How does a LEFT JOIN work?
LEFT JOIN is a type of join that returns all rows from the left table, even if there are no matching rows in the right table. If a match exists in the right table, its data is included. If there is no match, the columns from the right table will contain NULL.
Let’s look at an example
We have two tables: users and orders. Our goal is to get all users along with their orders. If a user has no orders, we still want to include them in the result, so we use a LEFT JOIN.
In SQL, we select from users and LEFT JOIN orders on orders.user_id = users.id. Every user appears in the result, and for users without orders, the columns from orders contain NULL.
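A hedged sketch of such a query and its result (column names and data are illustrative):

```sql
SELECT u.id, u.name, o.id AS order_id
FROM users u
LEFT JOIN orders o ON o.user_id = u.id;

-- id | name  | order_id
-- ---+-------+---------
--  1 | Alice |       10
--  1 | Alice |       11
--  2 | Bob   |     NULL   <- Bob has no orders but is still returned
```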
What is idempotency?
So, let’s start with the definition of idempotency. It’s a concept that guarantees that repeating an operation multiple times has the same effect as performing it once. Most often, this question comes up in the context of REST API design, so it makes sense to discuss this concept in terms of HTTP methods.
For example, let’s look at the PUT method
According to REST, it performs a full update of a resource. And if we send it 10 times with the same body, all those calls will produce the same result as a single call.
The same applies to GET and DELETE. PATCH, strictly speaking, is not guaranteed to be idempotent - it depends on how the patch is applied. And if we look at the POST method, according to the REST concept, each call creates a new record in the database, so repeating the call will NOT have the same effect - therefore, the method is not idempotent. Additionally, I’d recommend reading about idempotency in distributed systems and talking about it in an interview. It’s also good practice to mention an idempotency key and give an example of how it’s used.
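A hedged sketch of an idempotency key on the server side (the in-memory Map stands in for a database or Redis, and the names are illustrative):

```typescript
const processed = new Map<string, { id: number }>();

// The client sends the same key when retrying; the server returns the
// stored result instead of creating a duplicate record.
function createOrder(idempotencyKey: string, payload: Record<string, unknown>) {
  const existing = processed.get(idempotencyKey);
  if (existing) return existing; // repeat call: same effect as one call

  const order = { id: processed.size + 1, ...payload };
  processed.set(idempotencyKey, order);
  return order;
}

const first = createOrder('key-123', { amount: 50 });
const retry = createOrder('key-123', { amount: 50 });
console.log(first.id === retry.id); // true - no duplicate was created
```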
What are composite indexes?
Most likely, you’ll be asked this question in the context of discussing indexes, and they’ll expect you to understand that the order in which an index is created and used matters.
Let’s start with creation
When creating a composite index, you should be guided by selectivity. The most selective column (the one that narrows down results the most) should come first - for example, customer_id. Then you take a less selective column, such as status, and the last column, which should have the lowest selectivity, for example, created_at.
The reason for this order is the index structure. Simply put, it’s nested, and we can only use status efficiently if we already have customer_id.
Let’s look at an example
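For instance (index and table names are illustrative):

```sql
CREATE INDEX idx_orders_customer_status_created
  ON orders (customer_id, status, created_at);
```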
The index will be created correctly, but you need to explain the rule: when querying using this index, it will be efficient only if the values are provided from left to right.
So, what rules will the query planner follow when using this index:
- Obviously, if we provide customer_id, status, and created_at, the index will work as expected.
- If we provide only customer_id and status, the index will still be used and will search by those two columns.
- If we provide customer_id and created_at, the index can still be used, but the query planner won’t take created_at into account because we skipped status and thus broke the index prefix. You can see this by running EXPLAIN.
- If we provide only created_at, the index won’t be used and a full scan will be performed, because we didn’t provide customer_id, which is the first column (the index prefix).
What is the difference between type and interface in TypeScript?
This is a common interview question, especially for Junior and Middle positions. In general, what they usually want to hear in an interview is interface merging.
Let’s break it down with an example:
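A minimal sketch of interface (declaration) merging:

```typescript
// Two declarations of the same interface are merged into one.
interface User {
  id: number;
}
interface User {
  name: string;
}

// User now requires both fields.
const user: User = { id: 1, name: 'Alice' };
```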
If you try the same approach with a type, you will get an error. When answering this question, it would be a bonus to mention when it’s better to use type vs interface:
- Interface is better used to describe the structure of objects and classes, as intended in OOP.
- For everything else, it’s more appropriate to use type, especially in the context of building complex types: unions, utility types, function signatures, and so on.