JDBC, Hibernate & Database Interview Questions

1. What is SQL Injection? How to solve it?

SQL Injection is a code injection technique where malicious SQL statements are inserted into an entry field for execution (e.g., to dump the database contents to the attacker). This occurs when user input is not properly sanitized and is concatenated directly into a dynamic SQL query.

How to solve it? 1. Use PreparedStatement: This is the most effective way. It uses parameterized queries where inputs are treated as data, not executable code. 2. Input Validation: Sanitize and validate all user inputs (e.g., allow only alphanumeric characters). 3. ORM Frameworks: Use frameworks like Hibernate/JPA which internally use prepared statements. 4. Least Privilege: Ensure the database user has minimum required permissions.

2. Difference between Statement and PreparedStatement

Feature	Statement	PreparedStatement
Compilation	Compiled every time it is executed.	Pre-compiled on the database side (compiled once, executed multiple times).
Performance	Slower for repeated execution.	Faster for repeated execution (due to caching).
Security	Vulnerable to SQL Injection.	Prevents SQL Injection (using placeholders `?`).
Usage	Static SQL queries.	Dynamic SQL queries with parameters.

3. What are JDBC statements? List the types of JDBC statements and their usage.

JDBC Statements are interfaces used to send SQL commands to the database.

Types: 1. Statement: Used for general-purpose access to the database. Useful when using static SQL statements at runtime. The Statement interface cannot accept parameters. 2. PreparedStatement: Used when you plan to use the SQL statement many times. The PreparedStatement interface accepts input parameters at runtime. 3. CallableStatement: Used to access database stored procedures. The CallableStatement interface can also accept runtime input parameters.

4. What is JdbcTemplate? And what are some of the advantages it has over standard JDBC?

JdbcTemplate is a central class in the Spring Framework's JDBC support. It simplifies the use of JDBC and helps to avoid common errors.

Advantages over standard JDBC: 1. Boilerplate Code Reduction: It handles the creation and release of resources (Connections, Statements, ResultSets) automatically. You don't need try-catch-finally blocks for closing resources. 2. Exception Handling: It translates standard SQLException (checked exception) into Spring's DataAccessException hierarchy (unchecked exceptions), which is more informative. 3. Simplified Querying: Provides convenient methods for querying and mapping results to objects (RowMapper).

5. How to handle transactions manually in JDBC?

By default, JDBC is in auto-commit mode (each SQL statement is treated as a transaction). To handle transactions manually:

Disable Auto-commit: connection.setAutoCommit(false);
Perform Operations: Execute multiple SQL statements.
Commit: If all succeed, call connection.commit();
Rollback: If any fail (catch exception), call connection.rollback();

6. What is ORM and what are its benefits?

ORM (Object-Relational Mapping) is a programming technique for converting data between incompatible type systems using object-oriented programming languages. It maps Java objects (POJOs) to Database tables.

Benefits: 1. Productivity: Eliminates boilerplate JDBC code. 2. Maintainability: Code is cleaner and easier to understand. 3. Database Independence: The ORM abstracts the SQL dialect; changing databases often requires only configuration changes. 4. Caching: Supports caching (L1, L2) to improve performance.

7. What is Hibernate? How to configure Hibernate?

Hibernate is an open-source, lightweight ORM framework for Java. It simplifies database interaction by handling object-table mapping.

Configuration: Typically done via hibernate.cfg.xml or persistence.xml (JPA), or application.properties (Spring Boot). Key properties include: * hibernate.connection.driver_class * hibernate.connection.url * hibernate.connection.username / password * hibernate.dialect * hibernate.hbm2ddl.auto (validate/update/create)

a. Follow Up: If we want to migrate from MySQL to OracleDB, what is the one configuration property that we need to change other than the database url, driver, username and password? * You need to change the hibernate.dialect property (e.g., from org.hibernate.dialect.MySQLDialect to org.hibernate.dialect.OracleDialect). This tells Hibernate how to generate the specific SQL syntax for the target database.

b. What is the difference between dialect and driver class? * Driver Class: The JDBC driver acts as the bridge/translator between the Java application and the database connection protocol. It knows how to talk to the DB. * Dialect: Tells Hibernate the specific SQL syntax features of the database (e.g., how to do pagination, data types). It knows what language to speak.

8. What is Session and SessionFactory in Hibernate?

SessionFactory: A heavyweight, thread-safe, immutable object. It is created once during application startup and serves as a factory for Session objects. It holds the database configuration and mapping metadata. (Scope: Application)
Session: A lightweight, non-thread-safe object that represents a conversation between the application and the database. It is used to perform CRUD operations. (Scope: Request/Transaction)

a. Is session in hibernate thread safe? * No, the Hibernate Session is not thread-safe. It should not be shared between threads. A new session should be obtained for each transaction or unit of work.

9. Difference between getCurrentSession vs. OpenSession

Feature	`getCurrentSession()`	`openSession()`
Creation	Returns the session bound to the current context (e.g., Spring Transaction). If none exists, creates one.	Always creates and opens a new session.
Closing	Automatically closed when the transaction commits/rollbacks.	Must be manually closed (`session.close()`).
Thread Context	Scoped to the current thread/transaction.	Not scoped; independent.

10. What are the three Hibernate entity states?

Transient: The object is created (new) but not associated with any Hibernate Session and has no representation in the database (no ID).
Persistent: The object is associated with a Session and has a representation in the database. Changes to it are tracked and saved automatically.
Detached: The object was persistent but its Session has been closed. It has an ID in the DB, but changes are not tracked by Hibernate anymore.

11. Difference between get() vs. load()

Feature	`session.get(Class, id)`	`session.load(Class, id)`
Database Hit	Hits the database immediately.	Returns a proxy (lazy loaded). Hits DB only when a non-ID method is called.
Return Value	Returns `null` if the object doesn't exist.	Throws `ObjectNotFoundException` if the object doesn't exist when accessed.
Usage	Use if you are unsure if the object exists.	Use if you are sure the object exists (performance optimization).

12. Difference between update, merge, saveOrUpdate

update(): Reattaches a detached object to the current session. Throws an exception if a persistent object with the same ID is already in the session.
merge(): Copies the state of the given object onto the persistent object with the same identifier. If there is no persistent instance currently in the session, it loads one from the database. It returns a new reference (the persistent one). Safe to use if the session might already contain the entity.
saveOrUpdate(): Checks the object's ID. If null (transient) -> save(); if not null (detached) -> update().

13. Why do we use flush(), clear(), and commit()?

flush(): Forces Hibernate to synchronize the in-memory state of the Session with the database (execute pending SQL INSERT/UPDATE/DELETE). It does not commit the transaction.
commit(): Commits the database transaction. It automatically calls flush() first, then makes changes permanent in the DB.
clear(): Completely clears the Session. It detaches all objects from the session. Used to free up memory during batch processing to avoid OutOfMemoryError.

14. How to do Many-to-Many mapping in hibernate

A Many-to-Many relationship requires a third table (Join Table) in the database. In Hibernate, use @ManyToMany and @JoinTable.

Example (Student <-> Course):

@Entity
public class Student {
    @ManyToMany
    @JoinTable(
        name = "student_course",
        joinColumns = @JoinColumn(name = "student_id"),
        inverseJoinColumns = @JoinColumn(name = "course_id")
    )
    private Set<Course> courses;
}

@Entity
public class Course {
    @ManyToMany(mappedBy = "courses") // Inverse side
    private Set<Student> students;
}

15. Give an example with @ManyToOne and @OneToMany

Example (Department 1 <-> * Employee):

Department (One Side):

@Entity
public class Department {
    @Id
    private Long id;

    @OneToMany(mappedBy = "department", cascade = CascadeType.ALL)
    private List<Employee> employees;
}

Employee (Many Side - Owning Side):

@Entity
public class Employee {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "department_id") // Foreign Key
    private Department department;
}

16. Fetching strategy in Hibernate

a. What is lazy fetching? * Lazy fetching means data is loaded from the database only when it is explicitly requested. It is the default for collections and associations (@OneToMany, @ManyToMany). It improves performance by avoiding loading unnecessary data.

b. What is LazyInitializationException? * It is a runtime exception thrown when you try to access a lazily loaded object or collection (proxy) after the Hibernate Session has been closed.

c. How to prevent LazyInitializationException? 1. Initialize inside the transaction: Access the getter method while the session is open. 2. JOIN FETCH (HQL): Write a query to fetch the related data eagerly (e.g., SELECT d FROM Department d JOIN FETCH d.employees). 3. Entity Graphs: Use JPA @EntityGraph to define fetch plans. 4. Open Session In View (OSIV): (Anti-pattern but common) Keep session open until view renders.

17. Why is Hibernate often preferred over JDBC?

Cleaner Code: Removes SQL boilerplate (PreparedStatements, ResultSets).
Object-Oriented: Allows working with Objects instead of Tables.
Caching: Built-in L1 and L2 caching support.
Association Management: Handles complex relationships (OneToMany, etc.) easily.
Database Independence: Easy to switch databases via dialect.
Automatic Transaction Management & Lazy Loading.

18. What is HQL and what is Criteria? What is type safe?

HQL (Hibernate Query Language): An object-oriented query language similar to SQL, but it operates on persistent objects/classes and their properties, not database tables.
Criteria API: A programmatic, type-safe way to create queries. You build the query using Java objects/methods instead of string manipulation.
Type Safe: Means the compiler validates types at compile-time. HQL (String-based) can fail at runtime due to typos. Criteria queries (using Metamodel) are checked by the compiler, reducing runtime errors.

19. Caching in Hibernate/What are first-level cache and second-level cache and how are they accessed?

First-Level Cache (L1): Associated with the Session. It is enabled by default and mandatory. Caches objects within the current transaction.
Second-Level Cache (L2): Associated with the SessionFactory. It is optional and pluggable (e.g., EhCache, Redis, Infinispan). Caches objects across sessions/transactions.

a. Follow up: If the second-level cache is enabled, how are the caches accessed (the order) when trying to fetch an entity? 1. First-Level Cache: Hibernate checks the current Session. 2. Second-Level Cache: If not found in L1, it checks the L2 Cache (if configured). 3. Database: If not found in L2, it executes the SQL query to the Database.

20. Explain ACID

ACID properties ensure reliable processing of database transactions. * Atomicity: All or nothing. The entire transaction succeeds, or it rolls back completely. * Consistency: The database transforms from one valid state to another, maintaining constraints/integrity. * Isolation: Concurrent transactions do not interfere with each other (Visible effects depend on Isolation Level). * Durability: Once committed, the changes are permanent, even in case of power loss.

21. Given an Employee table with ID, Department, and Salary, write a SQL query for the following:

a. Find the number of employees in each department.

SELECT Department, COUNT(*)
FROM Employee
GROUP BY Department;

b. Get the highest salary per department group.

SELECT Department, MAX(Salary)
FROM Employee
GROUP BY Department;

c. Find the employees who have the top salary in each department.

-- Using Subquery
SELECT * FROM Employee e
WHERE Salary = (SELECT MAX(Salary) FROM Employee WHERE Department = e.Department);

-- OR using Window Function
SELECT * FROM (
    SELECT *, RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) as rnk
    FROM Employee
) WHERE rnk = 1;

d. Find all employees with the 3rd highest salary.

-- Using DENSE_RANK to handle ties
SELECT * FROM (
    SELECT *, DENSE_RANK() OVER (ORDER BY Salary DESC) as rnk
    FROM Employee
) WHERE rnk = 3;

-- Standard Limit/Offset (MySQL/Postgres) - distinct salaries
SELECT * FROM Employee
WHERE Salary = (SELECT DISTINCT Salary FROM Employee ORDER BY Salary DESC LIMIT 1 OFFSET 2);

22. Explain SQL vs NoSQL databases

Feature	SQL (Relational)	NoSQL (Non-Relational)
Structure	Table-based, strict schema.	Document, Key-Value, Graph, Wide-column. Dynamic schema.
Scaling	Vertical (Scale-up: better CPU/RAM).	Horizontal (Scale-out: more servers).
Transactions	ACID compliant (Strong consistency).	Often BASE (Eventual consistency).
Examples	MySQL, PostgreSQL, Oracle.	MongoDB, Redis, Cassandra.

a. When do you choose to use Relational DB and when to use Non-Relational DB? * Use SQL when: You have structured data, need strict ACID compliance (e.g., banking), and complex joins/relationships. * Use NoSQL when: You have unstructured/semi-structured data, need high throughput/scalability (Big Data), rapid prototyping (flexible schema), or specific data models (e.g., graph).

23. What is inner join, left join, right join?

Inner Join: Returns records that have matching values in both tables.
Left (Outer) Join: Returns all records from the left table, and the matched records from the right table (NULL if no match).
Right (Outer) Join: Returns all records from the right table, and the matched records from the left table (NULL if no match).

24. What is CTE?

CTE (Common Table Expression) is a temporary result set defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or MERGE statement. It is defined using the WITH clause. * It improves readability and allows recursive queries (e.g., hierarchical data).

25. Explain the Stored Procedure and Trigger.

Stored Procedure: A prepared SQL code that you can save, so the code can be reused over and over again. It is explicitly called by the application. Can accept parameters and return values. Business logic can be encapsulated here.
Trigger: A special type of stored procedure that automatically executes (fires) when an event occurs in the database server (e.g., BEFORE INSERT, AFTER UPDATE). Used for audit trails, enforcing constraints, etc.

26. What are the differences between a clustered index and a non-clustered index?

Feature	Clustered Index	Non-Clustered Index
Data Storage	Stores the actual data rows in the leaf nodes. Physically sorts the table data.	Stores a separate structure with pointers/keys to the actual data.
Count	Only one per table (because data can be sorted only one way).	Multiple allowed per table.
Performance	Faster for retrieval (no extra lookup).	Slightly slower (requires lookup in the main table).

27. What does the window function do? What are the differences between rank() and dense_rank()?

Window Functions perform a calculation across a set of table rows that are somehow related to the current row (e.g., OVER (PARTITION BY ...)). Unlike GROUP BY, they do not collapse rows; the result is added to each row.

Difference between rank() and dense_rank(): * RANK(): Assigns ranks with gaps if there are ties. (e.g., 1, 1, 3, 4). * DENSE_RANK(): Assigns ranks without gaps. (e.g., 1, 1, 2, 3).

28. Based on insights from our monitoring tools indicating slow query performance, what strategies could be implemented to optimize it?

Analyze Execution Plan: Use EXPLAIN or EXPLAIN ANALYZE to see how the DB executes the query (full table scans vs index usage).
Indexing: Add indexes on columns used in WHERE, JOIN, and ORDER BY clauses.
Optimize Query: Avoid SELECT *, use proper Joins, avoid functions on indexed columns in WHERE clause.
Database Tuning: Check buffer pool size, cache settings.
Archiving/Partitioning: If the table is huge, partition it or move old data to an archive.
Caching: Implement caching (Redis/Memcached) for frequent read-heavy queries.

29. Explain CAP

The CAP Theorem states that a distributed system can only provide two of the following three guarantees: 1. Consistency (C): Every read receives the most recent write or an error. 2. Availability (A): Every request receives a (non-error) response, without the guarantee that it contains the most recent write. 3. Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.

a. Is MongoDB CP or AP? * MongoDB is typically CP (Consistency + Partition Tolerance). By default, it prioritizes consistency (reads go to primary). If a network partition occurs and the primary is lost, the system becomes unavailable for writes until a new primary is elected. However, it can be tuned to be more AP (using eventual consistency settings like reading from secondaries).

30. What are the two locking types in databases?

Optimistic Locking: Assumes no conflict will occur. Does not lock the record when reading. Checks for version/timestamp mismatch upon updating. Better for read-heavy systems.
Pessimistic Locking: Locks the record immediately when reading/accessing it (Shared Lock for read, Exclusive Lock for write). Prevents others from modifying it until the transaction ends. Better for write-heavy systems with high contention.