Hoja de referencia SQL
Referencia completa SQL: SELECT, JOINs, GROUP BY, funciones de ventana, CTEs, INSERT/UPDATE/DELETE, índices y diferencias entre PostgreSQL/MySQL/SQLite.
Basic SELECT structure
SELECT column_a, column_b, AGG(column_c) AS total FROM table_a JOIN table_b ON table_a.id = table_b.a_id WHERE table_a.status = 'active' GROUP BY column_a, column_b HAVING AGG(column_c) > 100 ORDER BY total DESC LIMIT 50;
JOIN types
| Type | Returns | Use case |
|---|---|---|
INNER JOIN | Rows that match in both tables | Customers who have placed orders |
LEFT JOIN | All left rows + matching right (NULL otherwise) | All customers, with orders if any |
RIGHT JOIN | All right rows + matching left | Rare — usually rewritten as LEFT JOIN |
FULL OUTER JOIN | All rows from both, NULL where no match | Reconciling two datasets |
CROSS JOIN | Cartesian product | Rare — combinations, calendar dimensions |
SELF JOIN | Same table joined to itself | Hierarchies, employee→manager |
Aggregate functions
| Function | Notes |
|---|---|
COUNT(*) | All rows including NULLs |
COUNT(col) | Non-NULL values in col |
COUNT(DISTINCT col) | Unique non-NULL values |
SUM / AVG | Ignore NULLs |
MIN / MAX | Smallest / largest non-NULL |
STRING_AGG / GROUP_CONCAT | Concatenate strings (PostgreSQL / MySQL) |
ARRAY_AGG | Collect into array (PostgreSQL) |
Window functions
-- Running total SELECT date, amount, SUM(amount) OVER (ORDER BY date) AS running_total FROM transactions; -- Rank within partitions SELECT department, name, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank FROM employees; -- Previous row's value SELECT date, amount, LAG(amount) OVER (ORDER BY date) AS prev_amount, amount - LAG(amount) OVER (ORDER BY date) AS day_over_day_change FROM transactions;
| Function | Returns |
|---|---|
ROW_NUMBER() | 1, 2, 3, … no ties |
RANK() | 1, 2, 2, 4 — ties get same rank, gap after |
DENSE_RANK() | 1, 2, 2, 3 — ties same rank, no gap |
NTILE(n) | Bucket into n groups (quartiles, deciles) |
LAG(col) / LEAD(col) | Previous / next row's value |
FIRST_VALUE / LAST_VALUE | First / last in window |
CTEs (Common Table Expressions)
WITH active_customers AS ( SELECT id, name FROM customers WHERE status = 'active' ), recent_orders AS ( SELECT customer_id, SUM(total) AS total FROM orders WHERE created_at >= NOW() - INTERVAL '30 days' GROUP BY customer_id ) SELECT ac.name, COALESCE(ro.total, 0) AS spent_30d FROM active_customers ac LEFT JOIN recent_orders ro ON ro.customer_id = ac.id ORDER BY spent_30d DESC; -- Recursive CTE: organisation hierarchy WITH RECURSIVE tree AS ( SELECT id, name, manager_id, 1 AS level FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.manager_id, t.level + 1 FROM employees e JOIN tree t ON e.manager_id = t.id ) SELECT * FROM tree;
INSERT / UPDATE / DELETE
-- Insert single
INSERT INTO users (name, email) VALUES ('Alice', 'a@x.com');
-- Insert from query
INSERT INTO users_archive (id, name)
SELECT id, name FROM users WHERE created_at < '2024-01-01';
-- Upsert (PostgreSQL)
INSERT INTO users (id, email) VALUES (1, 'a@x.com')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email;
-- Update with JOIN-like logic
UPDATE orders SET status = 'shipped'
WHERE id IN (SELECT order_id FROM shipments WHERE shipped_at IS NOT NULL);
-- Delete with safety
DELETE FROM logs WHERE created_at < NOW() - INTERVAL '90 days';
-- Always SELECT first to verify what will be deleted!Indexes
| Type | When to use |
|---|---|
| B-tree (default) | Equality, range, sorting — covers 95% of cases |
| Composite | WHERE on multiple columns. Order matters: most-selective first |
| Unique | Email, slug, business keys — enforces uniqueness |
| Partial | Index only WHERE active = true — smaller, faster |
| Expression / functional | WHERE LOWER(email) = ? → index on LOWER(email) |
| GIN (PostgreSQL) | Full-text search, JSONB, arrays |
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC); CREATE UNIQUE INDEX idx_users_email ON users (email); CREATE INDEX idx_active ON users (created_at) WHERE status = 'active';
Dialect quick reference
| Operation | PostgreSQL | MySQL | SQLite |
|---|---|---|---|
| String concat | a || b or CONCAT | CONCAT(a, b) | a || b |
| Pagination | LIMIT 10 OFFSET 20 | LIMIT 20, 10 | LIMIT 10 OFFSET 20 |
| Auto-increment | SERIAL or GENERATED | AUTO_INCREMENT | INTEGER PRIMARY KEY |
| Now() | NOW() or CURRENT_TIMESTAMP | NOW() | CURRENT_TIMESTAMP |
| Upsert | ON CONFLICT | ON DUPLICATE KEY UPDATE | ON CONFLICT |
| Boolean | Native | TINYINT(1) | INTEGER 0/1 |
Performance hints
- Always run
EXPLAIN(orEXPLAIN ANALYZEin PostgreSQL) on slow queries — it shows the actual plan and timing. - SELECT only the columns you need — avoid
SELECT *in production code. - WHERE before JOIN where possible — filtering early reduces the rows that need joining.
- Use
EXISTSorINfor membership tests rather than counting all matches. - Don't wrap indexed columns in functions — that disables the index.
- Run
ANALYZE(or its equivalent) after large data loads — keeps stats fresh for the planner.
The SQL execution order (the one mental model that matters)
SQL is written in this order: SELECT … FROM … WHERE … GROUP BY … HAVING … ORDER BY … LIMIT. But it's executed in a different order: FROM & JOIN first (the rows exist), WHERE filters, GROUP BY collapses, HAVING filters groups, SELECT picks columns, ORDER BY sorts, LIMIT cuts. This explains why you can't reference an alias from SELECT in WHERE — WHERE runs before SELECT exists. You can in ORDER BY because it runs after.
Dialect differences
The core (SELECT/JOIN/GROUP BY) is universal. But pagination, string concatenation, date functions, and DDL diverge. PostgreSQL: LIMIT 10 OFFSET 20, || for concat. MySQL: LIMIT 20, 10, CONCAT(). SQL Server: OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY, + concat. SQLite is mostly PostgreSQL-compatible. When portability matters, prefer LIMIT/OFFSET and CONCAT().
Preguntas frecuentes
- What's the difference between INNER JOIN and LEFT JOIN?
- INNER JOIN returns only rows where the join condition matches in both tables — drops anything unmatched. LEFT JOIN keeps all rows from the left table, fills NULLs where the right table has no match. Practical rule: 'I want all customers, plus their orders if any' → LEFT JOIN customers→orders. 'I want only customers who have orders' → INNER JOIN. Most reporting queries you write are LEFT JOINs.
- When should I use a CTE instead of a subquery?
- CTEs (WITH clauses) make complex queries readable. Use a CTE when you reference the same subquery multiple times, when nesting depth would exceed 2 levels, or when you want named intermediate steps for debugging. Modern engines (PostgreSQL 12+, MySQL 8+) inline CTEs unless you use WITH ... MATERIALIZED, so there's no performance penalty. For one-shot derived tables, plain subqueries are fine.
- What does GROUP BY actually do?
- GROUP BY collapses rows that share the same value(s) in the grouped columns into one row, applying aggregate functions (COUNT, SUM, AVG, MIN, MAX) across the collapsed rows. Every column in SELECT must either be in GROUP BY or be inside an aggregate. Common mistake: SELECT name, COUNT(*) FROM users — fails because name isn't grouped. Either GROUP BY name or wrap it in an aggregate.
- How are window functions different from GROUP BY?
- GROUP BY collapses rows. Window functions (OVER clause) keep all rows but compute aggregates across a 'window' of related rows. Use window functions for running totals, ranks, percentiles, lag/lead — anything where you need to know about other rows but want to keep the original row in the output. Example: SELECT name, salary, AVG(salary) OVER (PARTITION BY dept) FROM employees gives every employee with their department's average salary.
- Why is my query slow even with an index?
- Common reasons: (1) Function on indexed column — WHERE LOWER(email) = 'x' kills the index; use citext or store lowercased values. (2) Leading wildcard — WHERE name LIKE '%foo' can't use a B-tree. (3) OR conditions across columns — engine often picks one index, sequential-scans the rest. (4) Wrong column order in compound index — index on (a,b) doesn't help WHERE b = 1. (5) Out-of-date statistics — run ANALYZE. EXPLAIN is your friend; always check the actual plan.
Herramientas relacionadas
- Formateador JSON
Formatea, embellece, minifica y valida JSON en tu navegador
- Generador de Código QR
Genera códigos QR para URLs, texto, Wi-Fi y más. Descarga como PNG.
- Generador de Contraseñas
Genera contraseñas fuertes y aleatorias con longitud y conjuntos de caracteres personalizados.
- Codificador / Decodificador Base64
Codifica texto a Base64 o decodifica Base64 de vuelta a texto plano.
- Codificador / Decodificador de URL
Codifica o decodifica URLs y cadenas de consulta con codificación por porcentaje.