

# Introduction
Data analysts need to work with large amounts of information stored in databases. Before they can create reports or find insights, they must first pull the right data and prepare it for use. This is where SQL (Structured Query Language) comes in. SQL is a tool that helps analysts retrieve data, clean it and organize it in the desired format.
In this article, we’ll look at the most important SQL queries that every data analyst should know.
# 1. Selecting data with SELECT
The `SELECT` statement is the foundation of SQL. You can list specific columns or use `*` to return all available fields.

```sql
SELECT name, age, salary FROM employees;
```

This query retrieves only the `name`, `age`, and `salary` columns from the `employees` table.
# 2. Filtering data with WHERE
The `WHERE` clause keeps only the rows that match your conditions. It supports comparison and logical operators to create precise filters.

```sql
SELECT * FROM employees WHERE department = 'Finance';
```

The `WHERE` clause returns only the employees who belong to the Finance department.
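Comparison and logical operators can be combined in one filter. The following is a minimal sketch, assuming SQLite through Python's built-in `sqlite3` module and a made-up `employees` table; the names and salaries are illustrative only.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Finance', 72000), ('Ben', 'Finance', 48000),
        ('Cara', 'Sales', 65000), ('Dev', 'IT', 91000);
""")

# IN matches any listed department; AND adds a salary threshold.
rows = conn.execute("""
    SELECT name FROM employees
    WHERE department IN ('Finance', 'Sales') AND salary >= 60000
    ORDER BY name
""").fetchall()
print(rows)
```

Ben is in a listed department but falls below the salary threshold, and Dev meets the threshold but is in IT, so neither is returned.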
# 3. Sorting results with ORDER BY
The `ORDER BY` clause sorts the query results in ascending or descending order. It is used to order records by numeric, text, or date values.

```sql
SELECT name, salary FROM employees ORDER BY salary DESC;
```

This query sorts the employees by salary in descending order, so the highest-paid employees appear first.
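Sorting can also use more than one column, each with its own direction. This sketch assumes SQLite via Python's `sqlite3` module and an invented `employees` table; it sorts by department alphabetically and, within each department, by salary from highest to lowest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Finance', 72000), ('Ben', 'Finance', 48000),
        ('Cara', 'Sales', 65000), ('Dev', 'IT', 91000);
""")

# First sort key: department (ascending). Second key: salary (descending).
rows = conn.execute("""
    SELECT department, name, salary FROM employees
    ORDER BY department ASC, salary DESC
""").fetchall()
print(rows)
```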
# 4. Removing duplicates with DISTINCT
The `DISTINCT` keyword returns only the unique values from a column. This is useful when creating clean lists of categories or attributes.

```sql
SELECT DISTINCT department FROM employees;
```

`DISTINCT` removes duplicate entries, returning each department name only once.
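When several columns are listed, uniqueness applies to the combination of values rather than to each column separately. A small sketch under assumed data (SQLite via `sqlite3`; the `city` column is hypothetical and not part of the article's table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, city TEXT);
    INSERT INTO employees VALUES
        ('Ana', 'Finance', 'Leeds'), ('Ben', 'Finance', 'Leeds'),
        ('Cara', 'Finance', 'York'), ('Dev', 'IT', 'York');
""")

# Unique (department, city) pairs: the repeated Finance/Leeds pair collapses to one row.
pairs = conn.execute("""
    SELECT DISTINCT department, city FROM employees
    ORDER BY department, city
""").fetchall()

# COUNT(DISTINCT ...) counts unique values without listing them.
dept_count = conn.execute(
    "SELECT COUNT(DISTINCT department) FROM employees"
).fetchone()[0]
print(pairs, dept_count)
```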
# 5. Limiting results with LIMIT
The `LIMIT` clause restricts the number of rows returned by a query. It is often paired with `ORDER BY` to display top results or to sample data from large tables.

```sql
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
```

This query retrieves the five highest-paid employees by combining `ORDER BY` with `LIMIT`.
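`LIMIT` can be combined with `OFFSET` to skip rows, which is one way to page through sorted results. The sketch below assumes SQLite via `sqlite3` and invented salaries. Note that `LIMIT` is not universal: some databases use `TOP` or `FETCH FIRST` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 90000), ('Ben', 80000), ('Cara', 70000),
        ('Dev', 60000), ('Eli', 50000);
""")

# Skip the two highest salaries, then return the next two rows.
page_two = conn.execute("""
    SELECT name, salary FROM employees
    ORDER BY salary DESC
    LIMIT 2 OFFSET 2
""").fetchall()
print(page_two)
```

Without an `ORDER BY`, the rows that `LIMIT` returns are not guaranteed to be the same from one run to the next.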
# 6. Aggregating with GROUP BY
The `GROUP BY` clause groups rows that share the same values in the specified columns. It is used with aggregate functions such as `SUM()`, `AVG()`, or `COUNT()` to produce summaries.

```sql
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
```

`GROUP BY` organizes the rows by department, and `AVG(salary)` calculates the average salary for each group.
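Several aggregates can be computed in the same grouped query. A minimal sketch, assuming SQLite via `sqlite3` and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Finance', 72000), ('Ben', 'Finance', 48000),
        ('Cara', 'Sales', 65000),
        ('Dev', 'IT', 91000), ('Eli', 'IT', 89000);
""")

# One output row per department, with headcount, total, and average salary.
summary = conn.execute("""
    SELECT department,
           COUNT(*)    AS headcount,
           SUM(salary) AS total_salary,
           AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()
print(summary)
```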
# 7. Filtering groups with HAVING
The `HAVING` clause filters grouped results after aggregation. It is used when conditions depend on aggregate values, such as sums or averages.

```sql
SELECT department, COUNT(*) AS num_employees
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
```

The query counts the number of employees in each department and then keeps only the departments with more than 10 employees.
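`WHERE` and `HAVING` can appear in the same query: `WHERE` removes rows before grouping, and `HAVING` removes groups after aggregation. The sketch below illustrates the order of evaluation, assuming SQLite via `sqlite3` and invented data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Finance', 72000), ('Ben', 'Finance', 48000),
        ('Cara', 'Sales', 65000),
        ('Dev', 'IT', 91000), ('Eli', 'IT', 89000);
""")

# WHERE drops Ben's row first; the averages are computed on what remains,
# and HAVING then keeps only departments averaging above 70000.
rows = conn.execute("""
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    WHERE salary > 50000
    GROUP BY department
    HAVING AVG(salary) > 70000
    ORDER BY department
""").fetchall()
print(rows)
```

Finance passes the `HAVING` test only because the `WHERE` clause excluded its lowest salary before the average was calculated.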
# 8. Joining tables with JOIN
The `JOIN` clause combines rows from two or more tables based on a related column. It helps retrieve associated data, such as employees with their departments.

```sql
SELECT e.name, d.name AS department
FROM employees e
JOIN departments d ON e.dept_id = d.id;
```

Here, the join matches each employee with the name of their department.
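A plain `JOIN` is an inner join, so rows without a match are dropped. A `LEFT JOIN` keeps every row from the left table and fills the missing side with `NULL`. The comparison below is a sketch on assumed data (SQLite via `sqlite3`), where one hypothetical employee has no department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Finance'), (2, 'IT'), (3, 'Legal');
    INSERT INTO employees VALUES ('Ana', 1), ('Ben', 2), ('Cara', NULL);
""")

# Inner join: only employees with a matching department.
inner = conn.execute("""
    SELECT e.name, d.name AS department
    FROM employees e
    JOIN departments d ON e.dept_id = d.id
    ORDER BY e.name
""").fetchall()

# Left join: every employee, with None where no department matches.
left = conn.execute("""
    SELECT e.name, d.name AS department
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.id
    ORDER BY e.name
""").fetchall()
print(inner, left)
```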
# 9. Combining results with UNION
`UNION` combines the results of two or more queries into a single dataset. It automatically removes duplicates unless you use `UNION ALL`, which keeps them.

```sql
SELECT name FROM employees UNION SELECT name FROM customers;
```

This query combines the names from the `employees` and `customers` tables into a single list.
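The difference between `UNION` and `UNION ALL` is easiest to see when the same value appears in both tables. A minimal sketch, assuming SQLite via `sqlite3` and made-up names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT);
    CREATE TABLE customers (name TEXT);
    INSERT INTO employees VALUES ('Ana'), ('Ben');
    INSERT INTO customers VALUES ('Ben'), ('Cara');
""")

# UNION removes the duplicate 'Ben'.
unique_names = conn.execute("""
    SELECT name FROM employees
    UNION
    SELECT name FROM customers
    ORDER BY name
""").fetchall()

# UNION ALL keeps every row from both queries.
all_names = conn.execute("""
    SELECT name FROM employees
    UNION ALL
    SELECT name FROM customers
    ORDER BY name
""").fetchall()
print(unique_names, all_names)
```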
# 10. String functions
String functions in SQL are used to manipulate and transform text data. They help with things like concatenating names, changing case, trimming spaces, or extracting parts of a string.

```sql
SELECT CONCAT(first_name, ' ', last_name) AS full_name, LENGTH(first_name) AS name_length FROM employees;
```

This query creates a full name by concatenating the first and last names and calculates the length of the first name.
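The other operations mentioned above (changing case, trimming spaces, extracting part of a string) can be sketched as follows. This assumes SQLite via Python's `sqlite3` module, so concatenation uses the `||` operator rather than `CONCAT()`, which older SQLite versions do not provide; function names differ somewhat between databases. The sample row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (first_name TEXT, last_name TEXT);
    INSERT INTO employees VALUES ('  ana ', 'lopez');
""")

row = conn.execute("""
    SELECT TRIM(first_name) || ' ' || last_name AS full_name,  -- trim, then concatenate
           UPPER(last_name)                     AS upper_last, -- change case
           SUBSTR(last_name, 1, 3)              AS prefix,     -- first three characters
           LENGTH(TRIM(first_name))             AS name_length
    FROM employees
""").fetchone()
print(row)
```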
# 11. Date and time functions
The date and time functions in SQL let you work with temporal data for analysis and reporting. They can calculate differences, extract components such as years or months, and adjust dates by adding or subtracting intervals. For example, `DATEDIFF()` combined with `CURRENT_DATE` can measure duration. The name and argument order of date functions vary by database; the form below follows MySQL.

```sql
SELECT name, hire_date, DATEDIFF(CURRENT_DATE, hire_date) AS days_at_company FROM employees;
```

This query calculates how many days each employee has been with the company as of today.
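Date functions are among the least portable parts of SQL. As a hedged illustration, the sketch below performs the same three kinds of operation (difference, component extraction, interval arithmetic) in SQLite via `sqlite3`, which has no `DATEDIFF()` and uses `julianday()`, `strftime()`, and `date()` instead. The hire dates, the fixed reference date `2024-03-01` (used in place of the current date so the output is reproducible), and the 90-day interval are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, hire_date TEXT);
    INSERT INTO employees VALUES ('Ana', '2024-01-01'), ('Ben', '2023-03-01');
""")

rows = conn.execute("""
    SELECT name,
           -- difference in days between a fixed reference date and hire_date
           CAST(julianday('2024-03-01') - julianday(hire_date) AS INTEGER) AS days_at_company,
           -- extract the year component (returned as text)
           strftime('%Y', hire_date) AS hire_year,
           -- add an interval to a date
           date(hire_date, '+90 days') AS ninety_days_later
    FROM employees
    ORDER BY name
""").fetchall()
print(rows)
```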
# 12. Creating new columns with CASE
`CASE` expressions create new columns with conditional logic, similar to if-else statements. They let you dynamically categorize or transform data within your queries.

```sql
SELECT name,
       CASE
           WHEN age < 30 THEN 'Junior'
           WHEN age BETWEEN 30 AND 50 THEN 'Mid-level'
           ELSE 'Senior'
       END AS experience_level
FROM employees;
```

The `CASE` expression creates a new column called `experience_level` based on age ranges.
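Running this query on a few boundary ages shows how the branches are evaluated: conditions are checked in order, and `BETWEEN` includes both endpoints. The sketch assumes SQLite via `sqlite3` and invented ages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, age INTEGER);
    INSERT INTO employees VALUES ('Ana', 29), ('Ben', 30), ('Cara', 50), ('Dev', 51);
""")

# The first matching WHEN wins; ELSE covers everything left over.
rows = conn.execute("""
    SELECT name,
           CASE
               WHEN age < 30 THEN 'Junior'
               WHEN age BETWEEN 30 AND 50 THEN 'Mid-level'
               ELSE 'Senior'
           END AS experience_level
    FROM employees
    ORDER BY age
""").fetchall()
print(rows)
```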
# 13. Handling missing values with COALESCE
`COALESCE` handles missing values by returning the first non-null value from a list of arguments. It is commonly used to replace `NULL` fields with default values, such as `'N/A'`.

```sql
SELECT name, COALESCE(phone, 'N/A') AS contact_number FROM customers;
```

Here, `COALESCE` replaces missing phone numbers with `'N/A'`.
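Because `COALESCE` accepts any number of arguments, it can express a chain of fallbacks. In the sketch below, the `mobile` and `landline` columns are hypothetical (the article's table has a single `phone` column), and the database is SQLite via `sqlite3`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, mobile TEXT, landline TEXT);
    INSERT INTO customers VALUES
        ('Ana', '555-0101', NULL),
        ('Ben', NULL, '555-0202'),
        ('Cara', NULL, NULL);
""")

# Try mobile first, then landline, then fall back to a default label.
rows = conn.execute("""
    SELECT name, COALESCE(mobile, landline, 'N/A') AS contact_number
    FROM customers
    ORDER BY name
""").fetchall()
print(rows)
```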
# 14. Subqueries
Subqueries are queries nested inside another query to provide intermediate results. They are used in `WHERE`, `FROM`, or `SELECT` clauses to dynamically filter, compare, or create datasets.

```sql
SELECT name, salary FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);
```

This query compares each employee's salary to the company's average salary using a nested subquery.
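A subquery can also refer to the row currently being examined by the outer query (a correlated subquery). The sketch below, on assumed data in SQLite via `sqlite3`, contrasts the company-wide comparison with a per-department one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 'Finance', 72000), ('Ben', 'Finance', 48000),
        ('Cara', 'Sales', 65000),
        ('Dev', 'IT', 91000), ('Eli', 'IT', 89000);
""")

# Plain subquery: evaluated once, compares against the company average (73000).
above_company = conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
""").fetchall()

# Correlated subquery: re-evaluated per row, using that row's department.
above_department = conn.execute("""
    SELECT name FROM employees e
    WHERE salary > (SELECT AVG(salary) FROM employees
                    WHERE department = e.department)
    ORDER BY name
""").fetchall()
print(above_company, above_department)
```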
# 15. Window functions
Window functions perform calculations across a set of rows while still returning individual row details. They are commonly used for ranking, running totals, and comparing values between rows.

```sql
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employees;
```

The `RANK()` function assigns each employee a rank based on salary without collapsing the rows into groups.
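Ties are where ranking functions differ: `RANK()` leaves a gap after tied rows, while `DENSE_RANK()` does not. A small sketch with a deliberate tie, assuming SQLite 3.25 or later (the first version with window functions) via `sqlite3`, and invented salaries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 90000), ('Ben', 90000), ('Cara', 80000), ('Dev', 70000);
""")

# Both functions rank by salary; every input row is still returned.
rows = conn.execute("""
    SELECT name, salary,
           RANK()       OVER (ORDER BY salary DESC) AS salary_rank,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_salary_rank
    FROM employees
    ORDER BY salary DESC, name
""").fetchall()
print(rows)
```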
# Conclusion
SQL is an extremely valuable skill for any data analyst because it provides the foundation for extracting, transforming, and interpreting data. From filtering and aggregation to joining and combining datasets, SQL lets analysts turn raw data into meaningful insights that drive decision-making. By mastering these essential queries, analysts not only streamline their workflow but also improve the accuracy and scalability of their analyses.
Jayta Gulti is a machine learning enthusiast and technical writer driven by his passion for building machine learning models. He holds a Master's degree in Computer Science from the University of Liverpool.