Skip to content

SQL

Database Icon

Table of Contents

Part 1: The SELECT Statement

1. SELECT and FROM: The Basics

You use SELECT to query the database and get data that matches your criteria.

  • SELECT tells the database which columns you want.
  • FROM tells it which table to get them from.
-- Select all columns from the 'users' table
SELECT * FROM users;

-- Select only specific columns
SELECT username, email FROM users;

2. WHERE: Filtering Your Results

Use WHERE to only get records that match your condition. It's how you filter results.

-- Find the user with the id of 1337
SELECT * FROM users WHERE id = 1337;

-- Find all products that are out of stock
SELECT * FROM products WHERE stock_quantity = 0;

3. AND, OR, IN, BETWEEN: Combining Filters

  • AND & OR: Combine multiple conditions.
  • IN: Specify a list of possible values.
  • BETWEEN: Select values within a given range.
-- Find active admin users
SELECT * FROM users WHERE is_active = TRUE AND is_admin = TRUE;

-- Find products in the 'electronics' or 'books' category
SELECT * FROM products WHERE category = 'electronics' OR category = 'books';

-- Find users from specific countries
SELECT * FROM users WHERE country IN ('USA', 'Canada', 'UK');

-- Find orders placed within a date range
SELECT * FROM orders WHERE order_date BETWEEN '2023-01-01' AND '2023-01-31';

4. LIKE: Pattern Matching

Use LIKE in a WHERE clause to search for patterns in a column. Handy for partial matches.

  • %: Represents zero, one, or multiple characters.
  • _: Represents a single character.
-- Find all users whose email is a gmail address
SELECT * FROM users WHERE email LIKE '%@gmail.com';

-- Find all products with a 4-letter name starting with 'c'
SELECT * FROM products WHERE product_name LIKE 'c___';

5. ORDER BY: Sorting Results

Use ORDER BY to sort your results. You can go ascending or descending.

  • ASC: Ascending order (default).
  • DESC: Descending order.
-- Get all products, sorted by price from highest to lowest
SELECT name, price FROM products ORDER BY price DESC;

-- Get all users, sorted by country, then by username
SELECT username, country FROM users ORDER BY country ASC, username ASC;

6. LIMIT: Restricting the Number of Results

Use LIMIT to cap how many records you get back. Essential for performance and pagination.

-- Get the 10 most recently registered users
SELECT username, registration_date FROM users ORDER BY registration_date DESC LIMIT 10;

Part 2: Joining Multiple Tables

Real world data lives in multiple tables. Use JOIN clauses to combine them into something useful.

7. Primary and Foreign Keys

  • A Primary Key uniquely identifies each row (like users.id).
  • A Foreign Key points to another table's Primary Key (like orders.user_id pointing to users.id). That's how you link tables together.

8. INNER JOIN: The Workhorse

INNER JOIN gets records that match in both tables. It's the join you'll use most often.

-- Get a list of users and the dates of the orders they placed
SELECT
    users.username,
    orders.order_date,
    orders.total_amount
FROM users
INNER JOIN orders ON users.id = orders.user_id;

9. LEFT JOIN: Finding What's Missing

LEFT JOIN gets all records from the left table (users) plus matching ones from the right (orders). No match? The right side is NULL. Perfect for finding things that don't have matches.

-- Find all users who have NEVER placed an order
SELECT
    users.username
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE orders.id IS NULL;

10. RIGHT JOIN and FULL OUTER JOIN

  • RIGHT JOIN: The reverse of a LEFT JOIN. Returns all records from the right table.
  • FULL OUTER JOIN: Returns all records when there is a match in either the left or right table.

Part 3: Aggregating and Grouping Data

11. Aggregate Functions: COUNT, SUM, AVG, MIN, MAX

Aggregate functions perform a calculation on a set of values and return a single value.

-- How many users are in the database?
SELECT COUNT(*) FROM users;

-- What is the total revenue from all orders?
SELECT SUM(total_amount) FROM orders;

-- What is the average price of a product?
SELECT AVG(price) FROM products;

-- What are the highest and lowest prices?
SELECT MAX(price), MIN(price) FROM products;

12. GROUP BY: Summarizing Data

The GROUP BY statement groups rows that have the same values into summary rows. It's almost always used with aggregate functions.

-- Count the number of users in each country
SELECT country, COUNT(*) AS number_of_users
FROM users
GROUP BY country;

-- Calculate the total sales amount for each customer
SELECT user_id, SUM(total_amount) AS total_spent
FROM orders
GROUP BY user_id;

13. HAVING: Filtering Groups

The WHERE clause filters rows before they are grouped. The HAVING clause filters groups after they have been created by the GROUP BY clause.

-- Find all countries with more than 10 users
SELECT country, COUNT(*) AS number_of_users
FROM users
GROUP BY country
HAVING COUNT(*) > 10;

Part 4: DML

Data Manipulation Language (DML) is used to manage data within tables.

14. INSERT INTO: Adding New Data

INSERT INTO users (username, email, password_hash)
VALUES ('newuser', 'new@example.com', 'a_very_long_and_secure_hash');

15. UPDATE: Modifying Existing Data

Warning: Always use a WHERE clause with UPDATE. If you forget, you will update every single row in the table.

UPDATE users
SET email = 'new.email@example.com'
WHERE username = 'newuser';

16. DELETE FROM: Removing Data

Warning: Always use a WHERE clause with DELETE. If you forget, you will delete every single row in the table.

DELETE FROM users WHERE username = 'newuser';

Part 5: DDL

Data Definition Language (DDL) is used to create and manage the structure of your database.

17. CREATE TABLE: Building Your Database

This command creates a new table, where you define the columns and their data types.

CREATE TABLE users (
    id INT PRIMARY KEY AUTO_INCREMENT, -- In PostgreSQL, use SERIAL PRIMARY KEY
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(255) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

18. Common Data Types

  • INT or INTEGER: Whole numbers.
  • VARCHAR(n): A variable length string with a maximum size of n.
  • TEXT: A variable length string with a very large maximum size.
  • DATE: Stores a date (year, month, day).
  • TIMESTAMP: Stores a date and time.
  • BOOLEAN: TRUE or FALSE.
  • DECIMAL(p, s) or NUMERIC(p, s): An exact number with p total digits and s digits after the decimal point. Use this for currency.

19. Constraints: PRIMARY KEY, FOREIGN KEY, etc.

Constraints are rules enforced on the data in a table.

  • NOT NULL: Ensures a column cannot have a NULL value.
  • UNIQUE: Ensures all values in a column are different.
  • PRIMARY KEY: A combination of NOT NULL and UNIQUE. Uniquely identifies each row.
  • FOREIGN KEY: Uniquely identifies a row in another table.
  • DEFAULT: Sets a default value for a column when no value is specified.

20. ALTER TABLE and DROP TABLE

  • ALTER TABLE: Modifies an existing table (adds, deletes, or modifies columns).
  • DROP TABLE: Deletes a table completely.
-- Add a new column to the users table
ALTER TABLE users ADD COLUMN last_login_ip VARCHAR(45);

-- Delete the users table (be careful!)
DROP TABLE users;

Part 6: Advanced Querying

21. Subqueries

A subquery (or inner query) is a query nested inside another query.

-- Find the names of all users who have placed an order with an amount greater than 100
SELECT username FROM users
WHERE id IN (SELECT user_id FROM orders WHERE total_amount > 100);

22. Common Table Expressions-(CTEs)

A CTE, defined with the WITH clause, creates a temporary, named result set. It makes complex queries much more readable than subqueries.

WITH TopCustomers AS (
    -- First, find the user_id of customers who have spent more than 500 total
    SELECT user_id
    FROM orders
    GROUP BY user_id
    HAVING SUM(total_amount) > 500
)
-- Then, get the names of those users
SELECT username
FROM users
WHERE id IN (SELECT user_id FROM TopCustomers);

Part 7: Transactions

23. ACID Properties

A transaction is a sequence of operations performed as a single logical unit of work. They guarantee ACID properties:

  • Atomicity: All operations in the transaction complete successfully, or none of them do.
  • Consistency: The database remains in a consistent state before and after the transaction.
  • Isolation: Concurrent transactions do not interfere with each other.
  • Durability: Once a transaction is committed, it will remain so, even in the event of a power loss or crash.

24. BEGIN, COMMIT, ROLLBACK

Transactions are essential for maintaining data integrity.

-- Imagine transferring money from one account to another
BEGIN TRANSACTION;

-- 1. Debit money from Account A
UPDATE accounts SET balance = balance 100 WHERE id = 1;

-- 2. Credit money to Account B
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- If both operations succeed, make the changes permanent
COMMIT;

-- If something had gone wrong, you would issue a ROLLBACK
-- to undo all changes since the BEGIN TRANSACTION.
-- ROLLBACK;

Part 8: The #1 Security Risk

25. What is SQL Injection?

SQL Injection (SQLi) is a vulnerability that occurs when an attacker can interfere with the queries that an application makes to its database. It generally allows an attacker to view data they are not normally able to retrieve, and can sometimes lead to modifying or deleting data, and even full remote code execution.

It happens when application code builds SQL queries by concatenating strings with user input.

Conceptual Vulnerable Code (in Python):

# DO NOT DO THIS
user_id = request.args.get("id") # User provides: 105 OR 1=1
query = "SELECT * FROM products WHERE id = " + user_id

# The final query becomes:
# SELECT * FROM products WHERE id = 105 OR 1=1
# The `OR 1=1` is always true, so the query returns ALL products!

26. The UNION Attack

The UNION operator is a powerful tool for attackers. It allows them to combine the results of the vulnerable query with a query of their own choosing.

Attack Scenario: * Vulnerable URL: example.com/products.php?id=1 * Original Query: SELECT name, description FROM products WHERE id = 1 * Attacker's Input: 1' UNION SELECT username, password FROM users -- * Final Malicious Query:

SELECT name, description FROM products WHERE id = '1' UNION SELECT username, password FROM users -- '
The database will execute both SELECT statements and return the results together. The attacker has now dumped the users table.

27. Blind SQL Injection

In some cases, the results of the query are not returned to the user. An attacker can still exfiltrate data by asking a series of true/false questions, often using time delays.

  • Boolean based: ... AND SUBSTRING(@@version, 1, 1) = '5' (Does the page look different?)
  • Time based: ... AND IF(SUBSTRING(@@version, 1, 1) = '5', SLEEP(5), 0) (Does the page take 5 seconds longer to load?)

28. The Ultimate Defense: Parameterized Queries

To prevent SQL injection, you must separate the SQL code from the data. This is done using parameterized queries (also known as prepared statements).

The database receives the query template first, with placeholders (? or :name) for the data. Then, the application sends the data separately. The database treats this data only as data, never as executable code.

Conceptual Secure Code (in different languages):

Python (sqlite3 library):

user_id = "105 OR 1=1"
# The `?` is a placeholder. The driver safely inserts the data.
cursor.execute("SELECT * FROM products WHERE id = ?", (user_id,))

Java (JDBC):

String userId = "105 OR 1=1";
String sql = "SELECT * FROM products WHERE id = ?";
PreparedStatement pstmt = connection.prepareStatement(sql);
pstmt.setString(1, userId);
ResultSet rs = pstmt.executeQuery();

PHP (PDO):

$userId = "105 OR 1=1";
$stmt = $pdo->prepare('SELECT * FROM products WHERE id = :id');
$stmt->execute(['id' => $userId]);

The principle is the same everywhere: Never build queries by concatenating strings!

Part 9: Cookbook

29. Find Duplicate Email Addresses

This is useful for data cleaning. It groups by email and then uses HAVING to find groups with more than one member.

SELECT
    email,
    COUNT(email) AS occurrences
FROM
    users
GROUP BY
    email
HAVING
    COUNT(email) > 1;

30. Calculate Monthly Active Users

Assuming you have a user_logins table with a login_timestamp.

-- For PostgreSQL
SELECT
    DATE_TRUNC('month', login_timestamp) AS login_month,
    COUNT(DISTINCT user_id) AS monthly_active_users
FROM
    user_logins
WHERE
    login_timestamp >= NOW() INTERVAL '1 year'
GROUP BY
    login_month
ORDER BY
    login_month;

31. Find Customers Who Haven't Ordered

This is a classic use case for a LEFT JOIN.

SELECT
    c.customer_name
FROM
    customers c
LEFT JOIN
    orders o ON c.id = o.customer_id
WHERE
    o.id IS NULL;