SQL Cheat Sheet
Comprehensive SQL reference: SELECT, JOINs, GROUP BY, window functions, CTEs, INSERT/UPDATE/DELETE, indexes, and PostgreSQL/MySQL/SQLite differences.
Basic SELECT structure
SELECT column_a, column_b, AGG(column_c) AS total FROM table_a JOIN table_b ON table_a.id = table_b.a_id WHERE table_a.status = 'active' GROUP BY column_a, column_b HAVING AGG(column_c) > 100 ORDER BY total DESC LIMIT 50;
JOIN types
| Type | Returns | Use case |
|---|---|---|
INNER JOIN | Rows that match in both tables | Customers who have placed orders |
LEFT JOIN | All left rows + matching right (NULL otherwise) | All customers, with orders if any |
RIGHT JOIN | All right rows + matching left | Rare — usually rewritten as LEFT JOIN |
FULL OUTER JOIN | All rows from both, NULL where no match | Reconciling two datasets |
CROSS JOIN | Cartesian product | Rare — combinations, calendar dimensions |
SELF JOIN | Same table joined to itself | Hierarchies, employee→manager |
Aggregate functions
| Function | Notes |
|---|---|
COUNT(*) | All rows including NULLs |
COUNT(col) | Non-NULL values in col |
COUNT(DISTINCT col) | Unique non-NULL values |
SUM / AVG | Ignore NULLs |
MIN / MAX | Smallest / largest non-NULL |
STRING_AGG / GROUP_CONCAT | Concatenate strings (PostgreSQL / MySQL) |
ARRAY_AGG | Collect into array (PostgreSQL) |
Window functions
-- Running total SELECT date, amount, SUM(amount) OVER (ORDER BY date) AS running_total FROM transactions; -- Rank within partitions SELECT department, name, salary, RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank FROM employees; -- Previous row's value SELECT date, amount, LAG(amount) OVER (ORDER BY date) AS prev_amount, amount - LAG(amount) OVER (ORDER BY date) AS day_over_day_change FROM transactions;
| Function | Returns |
|---|---|
ROW_NUMBER() | 1, 2, 3, … no ties |
RANK() | 1, 2, 2, 4 — ties get same rank, gap after |
DENSE_RANK() | 1, 2, 2, 3 — ties same rank, no gap |
NTILE(n) | Bucket into n groups (quartiles, deciles) |
LAG(col) / LEAD(col) | Previous / next row's value |
FIRST_VALUE / LAST_VALUE | First / last in window |
CTEs (Common Table Expressions)
WITH active_customers AS ( SELECT id, name FROM customers WHERE status = 'active' ), recent_orders AS ( SELECT customer_id, SUM(total) AS total FROM orders WHERE created_at >= NOW() - INTERVAL '30 days' GROUP BY customer_id ) SELECT ac.name, COALESCE(ro.total, 0) AS spent_30d FROM active_customers ac LEFT JOIN recent_orders ro ON ro.customer_id = ac.id ORDER BY spent_30d DESC; -- Recursive CTE: organisation hierarchy WITH RECURSIVE tree AS ( SELECT id, name, manager_id, 1 AS level FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.manager_id, t.level + 1 FROM employees e JOIN tree t ON e.manager_id = t.id ) SELECT * FROM tree;
INSERT / UPDATE / DELETE
-- Insert single
INSERT INTO users (name, email) VALUES ('Alice', 'a@x.com');
-- Insert from query
INSERT INTO users_archive (id, name)
SELECT id, name FROM users WHERE created_at < '2024-01-01';
-- Upsert (PostgreSQL)
INSERT INTO users (id, email) VALUES (1, 'a@x.com')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email;
-- Update with JOIN-like logic
UPDATE orders SET status = 'shipped'
WHERE id IN (SELECT order_id FROM shipments WHERE shipped_at IS NOT NULL);
-- Delete with safety
DELETE FROM logs WHERE created_at < NOW() - INTERVAL '90 days';
-- Always SELECT first to verify what will be deleted!Indexes
| Type | When to use |
|---|---|
| B-tree (default) | Equality, range, sorting — covers 95% of cases |
| Composite | WHERE on multiple columns. Order matters: most-selective first |
| Unique | Email, slug, business keys — enforces uniqueness |
| Partial | Index only WHERE active = true — smaller, faster |
| Expression / functional | WHERE LOWER(email) = ? → index on LOWER(email) |
| GIN (PostgreSQL) | Full-text search, JSONB, arrays |
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC); CREATE UNIQUE INDEX idx_users_email ON users (email); CREATE INDEX idx_active ON users (created_at) WHERE status = 'active';
Dialect quick reference
| Operation | PostgreSQL | MySQL | SQLite |
|---|---|---|---|
| String concat | a || b or CONCAT | CONCAT(a, b) | a || b |
| Pagination | LIMIT 10 OFFSET 20 | LIMIT 20, 10 | LIMIT 10 OFFSET 20 |
| Auto-increment | SERIAL or GENERATED | AUTO_INCREMENT | INTEGER PRIMARY KEY |
| Now() | NOW() or CURRENT_TIMESTAMP | NOW() | CURRENT_TIMESTAMP |
| Upsert | ON CONFLICT | ON DUPLICATE KEY UPDATE | ON CONFLICT |
| Boolean | Native | TINYINT(1) | INTEGER 0/1 |
Performance hints
- Always run
EXPLAIN(orEXPLAIN ANALYZEin PostgreSQL) on slow queries — it shows the actual plan and timing. - SELECT only the columns you need — avoid
SELECT *in production code. - WHERE before JOIN where possible — filtering early reduces the rows that need joining.
- Use
EXISTSorINfor membership tests rather than counting all matches. - Don't wrap indexed columns in functions — that disables the index.
- Run
ANALYZE(or its equivalent) after large data loads — keeps stats fresh for the planner.
SQLの実行順序(最も重要なメンタルモデル)
方言の違い
SQLはこの順序で書きます:SELECT … FROM … WHERE … GROUP BY … HAVING … ORDER BY … LIMIT。しかし実行される順序は異なります:まずFROM & JOIN(行が存在する)、WHEREでフィルタリング、GROUP BYで集約、HAVINGでグループをフィルタリング、SELECTで列を選択、ORDER BYで並び替え、LIMITで切り取り。これがWHERE句でSELECTのエイリアスを参照できない理由です——WHEREはSELECTが存在する前に実行されます。ORDER BYでは参照できます——SELECTの後に実行されるからです。
コア(SELECT/JOIN/GROUP BY)は共通です。しかしページネーション、文字列連結、日付関数、DDLは異なります。PostgreSQL:LIMIT 10 OFFSET 20、連結には||。MySQL:LIMIT 20, 10、CONCAT()。SQL Server:OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY、+で連結。SQLiteはほぼPostgreSQL互換です。移植性が重要な場合はLIMIT/OFFSETとCONCAT()を使用してください。
バックエンドエンジニアは、ページネーションされたユーザーレコードの取得、分析イベントの集計、レポート用のユーザーと注文テーブルの結合にクエリを書きます。データアナリストはウィンドウ関数とCTEを使ってコホート分析、ファネルドロップオフレポート、7日間のローリング平均を構築します。QAエンジニアはテスト実行後のデータベース状態を確認するアドホッククエリを書きます。基本的なSQLを持つプロダクトマネージャーは、データリクエストを待たずにシンプルなSELECT … WHERE … GROUP BYクエリでビジネスの質問に答えます。DevOpsエンジニアはエラー率やレイテンシのパーセンタイルのためにログテーブルをクエリします。
本番にクエリをデプロイする前に必ずEXPLAIN ANALYZE(PostgreSQL)またはEXPLAIN(MySQL)を実行してください——プランにはインデックスが必要な大きなテーブルへのシーケンシャルスキャンが現れることがよくあります。外部キー列にインデックスを付けてください:JOIN条件でほぼ常に使用され、よく見落とされます。アプリケーションクエリではSELECT *を避けてください——未使用の列のフェッチはI/Oを無駄にし、スキーマ変更時に壊れます。大きな分析クエリでは、5レベルのサブクエリをネストするより中間結果を一時テーブルに展開してください——デバッグしやすく、プランナーが一時テーブルにインデックスを作成できるため多くの場合より速くなります。
SQLデータベース(PostgreSQL、MySQL、SQLite)は、データが固定された明確なスキーマを持つ場合、エンティティ間の関係(ユーザー→注文→製品)を柔軟にクエリする必要がある場合、ACID トランザクション保証が重要な場合(財務記録、在庫)に優れています。NoSQLデータベース(MongoDB、DynamoDB、Cassandra)は、スキーマが流動的な場合、書き込みスループットが読み取りの複雑さをはるかに超える場合、数千ノードでの水平スケーリングが必要な場合に有利です。1億行までのほとんどのウェブアプリケーションでは、適切にインデックスが設定されたPostgreSQLデータベースがNoSQLの複雑さなしにすべてを処理します。
よくある質問
- What's the difference between INNER JOIN and LEFT JOIN?
- INNER JOIN returns only rows where the join condition matches in both tables — drops anything unmatched. LEFT JOIN keeps all rows from the left table, fills NULLs where the right table has no match. Practical rule: 'I want all customers, plus their orders if any' → LEFT JOIN customers→orders. 'I want only customers who have orders' → INNER JOIN. Most reporting queries you write are LEFT JOINs.
- When should I use a CTE instead of a subquery?
- CTEs (WITH clauses) make complex queries readable. Use a CTE when you reference the same subquery multiple times, when nesting depth would exceed 2 levels, or when you want named intermediate steps for debugging. Modern engines (PostgreSQL 12+, MySQL 8+) inline CTEs unless you use WITH ... MATERIALIZED, so there's no performance penalty. For one-shot derived tables, plain subqueries are fine.
- What does GROUP BY actually do?
- GROUP BY collapses rows that share the same value(s) in the grouped columns into one row, applying aggregate functions (COUNT, SUM, AVG, MIN, MAX) across the collapsed rows. Every column in SELECT must either be in GROUP BY or be inside an aggregate. Common mistake: SELECT name, COUNT(*) FROM users — fails because name isn't grouped. Either GROUP BY name or wrap it in an aggregate.
- How are window functions different from GROUP BY?
- GROUP BY collapses rows. Window functions (OVER clause) keep all rows but compute aggregates across a 'window' of related rows. Use window functions for running totals, ranks, percentiles, lag/lead — anything where you need to know about other rows but want to keep the original row in the output. Example: SELECT name, salary, AVG(salary) OVER (PARTITION BY dept) FROM employees gives every employee with their department's average salary.
- Why is my query slow even with an index?
- Common reasons: (1) Function on indexed column — WHERE LOWER(email) = 'x' kills the index; use citext or store lowercased values. (2) Leading wildcard — WHERE name LIKE '%foo' can't use a B-tree. (3) OR conditions across columns — engine often picks one index, sequential-scans the rest. (4) Wrong column order in compound index — index on (a,b) doesn't help WHERE b = 1. (5) Out-of-date statistics — run ANALYZE. EXPLAIN is your friend; always check the actual plan.
関連ツール
- JSON Formatter, Validator & Repair Tool
Format, minify, validate, and repair JSON instantly in your browser. Sort keys alphabetically, auto-format on paste, download as file, escape/unescape strings — free, no sign-up, 100% client-side.
- QRコード生成
URL、テキスト、Wi-FiなどのQRコードを生成。PNGでダウンロード可能。
- パスワード生成
カスタム長と文字セットで強力なランダムパスワードを生成。
- Base64 エンコーダー / デコーダー
テキストをBase64にエンコード、または逆にデコード。
- URLエンコーダー / デコーダー
パーセントエンコーディングでURLとクエリ文字列をエンコードまたはデコード。