Ch1 SQL data query

Keywords: MySQL


This article focuses on the most commonly used query data SELECT statement in SQL language.


1. SELECT * FROM data_ table; --> from clause

FROM declares that the data source for retrieving data is the data table

SELECT keyword, followed by the column name to be retrieved, * represents all columns, and the number of columns can be one or more. Multiple columns are separated by commas, and the last column name should not be followed by commas

2. De duplication, comment and limit the number of result lines

DISTINCT precedes the column name and is valid for all columns

# inline comments, / * * / multiline comments

LIMIT is used to LIMIT the number of result columns in MYSQL


SELECT DISTINCT prod_name, prod_id
FROM products

3. Sorting and sorting direction

ORDER BY column1 can be sorted according to a column or multiple rows. First, the first one is the same, and then the second one

ORDER BY is ascending by default. Descending sorting can be realized through DESC statement. It should be noted that DESC is different from DISTINCT. It can only act on the column name directly in front of it. If you want to order descending on multiple columns, you need to specify DESC for each column

ORDER BY should be written at the end of the statement, close to;, But more importantly, the ORDER BY sorting operation is the last of all operations. It operates on the data set of the result. You may not feel this now, as will be mentioned later

4. WHERE clause of filtered data

Set filter conditions in the WHERE clause to extract a subset of table data. The following are optional clause operators

(1) Clause operator:

(2) Judgment of null value is / is not

select * from [Table name] where Field name operator is null
select * from [Table name] where Field name operator is not null

(3) Combination condition AND OR NOT ()

The meanings of AND, not AND OR need not be explained. AND has higher priority than OR. Use parentheses to determine the priority. Eliminate ambiguity

SELECT prod_name,prod_price,quantity
FROM products
WHERE prod_price <= 100 AND quantity >= 1000;

(4) IN operator

Specify the condition range. Each condition IN the range can be matched. The IN operator can realize the nesting of SELECT statements, which will be described later

(5) LIKE operators and wildcards

Use wildcards to search text fields (strings) for fuzzy matching of strings

%Wildcards represent any number of occurrences of any character

select * from [Table name] where [Listing] like pattern;
select * from [Table name] where [Listing] not like pattern;
pattern ; Matching patterns, such as 'abc' '%abc' 'abc%' '%abc%' Single quotation marks must be added
# Find products starting with 'Fish'
SELECT prod_id, prod_name
FROM products
WHERE prod_name LIKE 'Fish%';

As for case sensitivity, windows is case insensitive, while linux is case sensitive

_ The underscore wildcard can only match a single character

It should be noted that wildcard search takes longer. Don't overuse wildcards. It's best to use others

5. Calculated fields and aliases

Field basically has the same meaning as column. The calculated field here mostly refers to the column obtained by calculation or other methods

(1) obtained by splicing

MYSQL needs to use the Concat function, which will be described later

​ eg:

SELECT Concat(vend_name,' (', vend_country,')')
FROM vendors
ORDER BY vend_name;

(2) obtained by arithmetic calculation

Perform arithmetic calculation on the retrieved data, support basic addition, subtraction, multiplication and division, and parentheses can also distinguish the operation order

SELECT prod_id, quantity, item_price
FROM orderitems
WHERE order_num = 20008;


SQL supports aliasing. The result of the calculated field here has no name but a value. However, an unnamed column cannot be used in the client application because the client cannot reference it

Aliases are given with the keyword AS

Aliasing is not only used for calculating fields, but also for column names and table names. It is more useful for the convenience of subsequent operations, such as self join, which will be discussed later


SELECT prod_id, quantity, item_price, quantity*item_price AS total_price
FROM orderitems

6. SQL function

SQL functions vary according to different database management systems. They are generally divided into three parts: text, date and arithmetic;

7. Aggregation function used to summarize data

In actual queries, we often need to summarize data without actually retrieving them, so SQL provides aggregation functions

Aggregation functions operate on columns, including the following:

AVG()Returns the average value of a column (ignoring NULL)
COUNT()Returns the number of rows in a column
MAX()Returns the maximum value of a column (ignoring NULL)
MIN()Returns the minimum value of a column (ignoring NULL)
SUM()Returns the sum of the values of a column (ignoring NULL)

COUNT(*) does not ignore NULL, and count (column) specifies that the column ignores NULL

SELECT COUNT(*) AS num_items,
		MIN(prod_price) AS price_min,
		MAX(prod_price) AS price_max,
		AVG(prod_price) AS price_avg
FROM products;

8. Grouped data

Grouping is established with the GROUP BY clause of the SELECT statement. WHERE is to filter the data table to produce a subset, while GROUP BY is to generate multiple logical groups (sets) through a column. Each GROUP has common characteristics and variable quantity. GROUP BY must appear after the WHERE clause and before the ORDER BY clause


SELECT vend_id, COUNT(*) AS num_prods
FROM products
GROUP BY vend_id;

HAVING filters the specified rows according to the filter grouping, WHERE. HAVING filters after the data grouping, WHERE filters before the data grouping

9. Syntax order and execution order of select clause (gray is often important)

Grammatical order


  2. FROM

  3. WHERE




Execution sequence

  1. FROM determine data source

  2. WHERE filter rows


  4. HAVING filter grouping

  5. SELECT columns that are not retrieved

  6. ORDER BY sort

Note that the aggregation function is not fixed

Alias problem here, the alias FROM the time of SELECT cannot be used earlier than its execution, and can be used later. The same is true for FROM

10. Subquery - nesting of queries

Every time a SELECT statement is retrieved, a data table is returned. Similarly, this data table can also be used as the data source of another SELECT statement. The IN operator is used


SELECT cust_id
FROM orders
WHERE order_num IN (SELECT order_num 
					FROM orderitems
					WHERE prod_id = 'RGAN01')

Subqueries are always processed from inside to outside. Due to the IN operator (guess), the SELECT statement as a subquery can only query a single column

Subqueries can also be calculated fields


SELECT cust_name,
	   	FROM orders
	   	WHERE orders.cust_id = customers.cust_id) AS orders
FROM customers
ORDER BY cust_name;

The reason why column names need to be fully qualified here is that when processing multiple data tables, multiple tables have the same column name, which needs to be distinguished to avoid ambiguity. Therefore, column names need to be fully qualified, which is more obvious in the later connection

11. Connection table

I believe you are tired or left early. What I wrote is really not so easy to understand. It is mainly used for reference in the future

Join tables are one of the most powerful features of SQL

As mentioned in the previous article, it is not a good thing for the same data to appear many times in a database. According to the three paradigms of relational database design, relational tables often appear, one table for one kind of data, and each table is related to each other through some common values

Therefore, SQL must provide multi data table join query, that is, join table

(1) Inner connection

SELECT vend_name, prod_name, prod_price
FROM vendors, products
WHERE vendors.vend_id  = products.vend_id;
SELECT vend_name, prod_name, prod_price
FROM vendors 
INNER JOIN products ON vendors.vend_id = products.vend_id;

Fully qualified column names are used here. Both vendors and products have a column called vend_id must be distinguished

FROM here, there are two data tables as the data source. In fact, the two tables do Cartesian product to produce n*m rows of data, and then filter certain rows through WHERE

Two different ways of writing are OK

Don't join unnecessary tables. The more tables you join, the worse the performance degradation. You should clarify which tables are the same and completely limit them

(2) Self coupling

In fact, this is to connect yourself with yourself (do Cartesian product) by aliasing table names

SELECT c1.cust_id, c1.cust_name, c1.cust_contact
FROM customers AS c1, customers AS c2
WHERE c1.cust_name = c2.cust_name 
	AND c2.cust_contact = 'jim jones';

(3) Natural connection

Whenever a table is joined, at least one column should appear in more than one table (the joined column). The standard join returns all data, and the same column even appears multiple times. The natural join excludes multiple occurrences, so that each column returns only once. This is done by using wildcards (SELECT *) for one table and explicit subsets for the columns of other tables

SELECT C.*, O.order_num, O.order_date,
		OI.prod_id, OI.quantity, OI.item_price
FROM customers AS C, orders AS O,
	 orderitems AS OI
WHERE C.cust_id = O.cust_id
	AND OI.order_num = O.order_num
	AND prod_id = 'RGAN01';

(4) Extraneous junction

The outer join contains rows that have no associated rows in the related table;

For example, if a customer keeps a table and an order keeps a table, not all customers have orders, nor do customers have only one order. There may be multiple orders. Count the number of orders for each customer

If you use an inline connection, customers without orders will be filtered out, while an external connection will not

SELECT customers.cust_id,
	   COUNT(orders.order_num) AS num_ord
FROM customers
  LEFT OUTER JOIN orders ON customers.cust_id = orders.cust_id
GROUP BY customers.cust_id;

12. Combined query

Through UNION, the results of multiple SELECT statements and queries are combined into one result set for return. Each query must contain the same column, expression or aggregation function. The data types of columns must be compatible


The SELECT statement is broad and profound. I'm so tired. I've studied it all day, 886~~

Posted by elecktricity on Sun, 21 Nov 2021 12:04:25 -0800