SQL Basics for Developers: SELECT, JOINs, Aggregates, and Indexes
Read and shape relational data with core SQL—filters, joins, GROUP BY, and why indexes matter for performance.
Databases & SQL Beginner 7 min read
·
SQL statements declare what data you want; the database chooses how to scan indexes and join tables. You write WHERE to filter early (less I/O), JOIN to relate tables by keys, and GROUP BY only when you need aggregates—each clause has a cost, which is why order and indexes matter.
These patterns apply to PostgreSQL, MySQL, SQLite, and others. Running a database locally is easy with Docker Compose.
SELECT and WHERE
Why limit columns in SELECT: Narrower rows mean less memory and network; listing only needed fields also documents intent.
SELECT id, email FROM users WHERE active = 1 LIMIT 10;
What this does: Scans or seeks rows matching active = 1, returns first ten after filtering. Add ORDER BY when “first ten” must be deterministic.
Filter rows with WHERE; sort with ORDER BY; paginate with LIMIT/OFFSET or keyset pagination for scale.
JOINs
INNER JOIN
When to use: You only want rows that exist on both sides of a relationship—orphan rows disappear, which is correct for required foreign keys.
SELECT o.id, c.name
FROM orders o
INNER JOIN customers c ON c.id = o.customer_id;
LEFT JOIN
When to use: You must keep every row from the driving table (customers with zero orders, for example) and show NULL where no match exists.
Keeps all rows from the left table and fills right columns with NULL when there is no match—useful for optional relations or reports.
Aggregates and GROUP BY
Why GROUP BY: Aggregate functions (COUNT, SUM) collapse many rows into one value per group; SQL requires you to name the grouping keys so results are well-defined.
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;
Why HAVING instead of WHERE here: WHERE runs before aggregation; HAVING filters after counts exist.
Indexes (high level)
Why indexes exist: Without them, the engine scans whole tables (slow at scale). B-tree indexes help equality and range predicates on leading columns. Too many indexes slow writes—each insert updates every index on that table.
Use EXPLAIN (or EXPLAIN ANALYZE on Postgres) to see sequential scans, join types, and row estimates.
Parameterized queries
Why never concatenate user input into SQL: Attackers can inject syntax that changes query meaning. Placeholders let the driver send values separately from the command so special characters cannot break out of string literals.
Your ORM or driver documentation shows the exact syntax.
Related guides
Expose data safely over HTTP with REST APIs and curl.
Frequently asked questions
INNER vs LEFT JOIN in one sentence?
INNER keeps only matches; LEFT keeps all left rows and NULL-pads missing right-side matches.
When do I need a transaction?
When multiple statements must succeed or fail together (transferring money, multi-table inserts). Use BEGIN/COMMIT/ROLLBACK (syntax varies slightly by engine).
Is NoSQL replacing SQL?
They solve different problems. Relational SQL remains the default for structured business data and reporting.
How do I practice safely?
Use a disposable local database or a Docker volume you can wipe; avoid production credentials on a laptop.