T-SQL -- pivoting and inverse pivoting

Keywords: Database SQL

0. Test data set and description

0.1 preparation of test data

[test data 1]

WITH Temp1 AS 
(
SELECT 'Zhang San' AS Name, 'chinese' AS Subject, one hundred AS Scores
UNION ALL
SELECT 'Zhang San' AS Name, 'mathematics' AS Subject, ninety AS Scores
UNION ALL
SELECT 'Zhang San' AS Name, 'English' AS Subject, eighty AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'chinese' AS Subject, ninety AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'mathematics' AS Subject, seventy AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'English' AS Subject, sixty AS Scores
)
SELECT * FROM  Temp1 

result:
Name Subject Scores
---- ------- -----------
Zhang San Chinese      one hundred
 Zhang San mathematics      ninety
 Zhang San English      eighty
 Li Si Chinese      ninety
 Li Si mathematics      seventy
 Li Si English      sixty

[test data 2]

WITH Temp2 AS 
(
 SELECT 'Zhang San' AS Name, 'Class 2, Grade 8' AS Class, NULL AS English, NULL AS mathematics, one hundred AS chinese
 UNION ALL
 SELECT 'Li Si' AS Name, 'Class 2, grade 3' AS Class, sixty AS English, seventy AS mathematics, ninety AS chinese
 UNION ALL
 SELECT 'Zhang San' AS Name, 'Class 2, grade 3' AS Class, eighty AS English, ninety AS mathematics, NULL AS chinese
 )
SELECT * FROM Temp2 

result:
Name Class      English          mathematics          chinese
---- ---------- ----------- ----------- -----------
Zhang, class 2, grade 38      NULL        NULL        one hundred
 Class 2, grade 3, Li Si      sixty          seventy          ninety
 Zhang San, class 2, grade 3      eighty          ninety          NULL

0.2 understanding of one-dimensional table and two-dimensional table

Test data 1:

  • Called: one dimensional table
  • Characteristics: duplicate values may occur in specific fields of each record
  • A one-dimensional table is the most primitive form of data recording.
  • A one-dimensional table contains a column of fields representing the properties of each record. For example, the Subject column in test data 1
  • The field name of a one-dimensional table represents the actual meaning of the data under the field. For example, the data under the Scores column in test data 1 represents the score

Test data 2:

  • Called: 2D table
  • Feature: the column field name is the type information, and the name of each record
  • The two-dimensional table displays the data information after classification and summary
  • If there is no explanation for the two-dimensional table, it is impossible to clarify the meaning of the data, such as test data 2. If you don't tell you that this is the result, you can't understand the true meaning of the data value

Comparison between one-dimensional table and two-dimensional table:

  • According to the first database paradigm: an entity should not contain duplicate types. The corresponding table is that similar values cannot be repeated on multiple columns of a table

    • The creation table should be created in the format of one-dimensional table to record the generated information.
    • If you create a two-dimensional table when creating a table and use all categories as fields, there may be many fields in the table. When categories are added, you need to continue to add fields later.
  • One dimensional table is convenient for storing data and also for later query

  • The two-dimensional table can intuitively represent the information of data. It is a report and report for display

1. Perspective conversion

1.1 pivoting using standard SQL

--Note that the business scenario is implemented here CASE WHEN Statement
--We should use the default ELSE NULL ,instead of ELSE 0. No data is no data, 0 is 0
SELECT Name,
	   SUM(CASE WHEN Subject ='chinese' THEN Temp1.Scores ElSE NULL END) AS chinese ,
	   SUM(CASE WHEN Subject ='mathematics' THEN Temp1.Scores ELSE NULL END) AS mathematics ,
	   SUM(CASE WHEN Subject ='English' THEN Temp1.Scores ELSE NULL END) AS English 
FROM Temp1  GROUP BY Temp1.Name

result:
Name     English          mathematics          chinese
----   ----------- ----------- -----------
Li Si      sixty           seventy            ninety
 Zhang San      eighty           ninety            one hundred

[note]: using standard SQL statements for data perspective conversion, we can find that perspective conversion is divided into three stages:

Grouping -- > extension -- > aggregation

Take the above example to illustrate:

  • Grouping: we group according to the Name field so that each user generates a record
    • Grouping using GROUP BY
  • Extension: extension is the extension of columns. Use case statement to expand three subjects (Chinese, mathematics and English) into columns
    • Use SELECT in conjunction with CASE WHEN
  • Aggregate: aggregate by grouping basis. Here, group by Name field and aggregate Scores field by SUM function
    • Aggregate using any aggregate function

Compare the above three logical steps:

  • Name field: grouping by field (grouping element). Specifies the left column to be used for the pivot report, also known as the set key
  • Subject field: field to be extended (element to be extended). The column where the column head in the pivot result table is located is also called the pivot key, and "Chinese, mathematics, English" is the set of column names of the target column, also known as the pivot output column
  • Scores field: aggregate field (aggregate element). The data mainly displayed in the pivot result report is the value from this field, also known as pivot value

1.2 pivoting using pivot function in T-SQL

[function]: pivot
[function]: row to column function, SQL Server 2005
[syntax]: pivot (column name in (value to be converted to column name) of any aggregate function for value to be converted to column name)
[usage]: select... from table pivot(...) as T

[example 1]:
There is a transcript, which records the scores of each student in each subject, as follows.
Now we need to take the subjects as the column head to count the scores of each student.

WITH Temp1 AS 
(
SELECT 'Zhang San' AS Name, 'chinese' AS Subject, 100 AS Scores
UNION ALL
SELECT 'Zhang San' AS Name, 'mathematics' AS Subject, ninety AS Scores
UNION ALL
SELECT 'Zhang San' AS Name, 'English' AS Subject, 80 AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'chinese' AS Subject, ninety AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'mathematics' AS Subject, seventy AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'English' AS Subject, 60 AS Scores
)
SELECT T.Name, T.English, T.mathematics, T.chinese FROM Temp1 PIVOT(SUM(Scores) FOR [Subject] IN (chinese,mathematics,English)) T;

result:
Name  English          mathematics          chinese
---- ----------- ----------- -----------
Li Si60            seventy          ninety
 Zhang San 80            ninety          100

[Note 1]:
We are in the following format:
select... From table pivot (any aggregate function (aggregate field) for field name to be extended in (element set to be extended)) as T
Pivoting a table using pivot

If the field in table is neither an aggregate field nor a field to be extended, it will default to the field of grouping basis in pivot function.

Therefore, the pivoting result set using pivot function is generally not the original table in the database, but the result set of a specific field queried

[Note 2]:
If the element set to be extended is an unconventional identifier, such as other types of data starting with date or number, we need to use square brackets to enclose each element to be extended.

[example 2]:

Now add a Class field in test data 1 and pivot the data with pivot function

WITH Temp1 AS 
(
SELECT 'Zhang San' AS Name, 'chinese' AS Subject, 100 AS Scores,'Class 1, Grade 8' AS Class
UNION ALL
SELECT 'Zhang San' AS Name, 'mathematics' AS Subject, 90 AS Scores,'Class 2, grade 3' AS Class
UNION ALL
SELECT 'Zhang San' AS Name, 'English' AS Subject, 80 AS Scores,'Class 2, grade 3' AS Class
UNION ALL
SELECT 'Li Si' AS Name, 'chinese' AS Subject, 90 AS Scores,'Class 2, grade 3' AS Class
UNION ALL
SELECT 'Li Si' AS Name, 'mathematics' AS Subject, 70 AS Scores,'Class 2, grade 3' AS Class
 UNION ALL
 SELECT 'Li Si' AS Name, 'English' AS Subject, 60 AS Scores,'Class 2, grade 3' AS Class
)
SELECT T.Name,T.Class, T.English, T.mathematics, T.chinese FROM Temp1 PIVOT(SUM(Scores) FOR [Subject] IN (chinese,mathematics,English)) T;

result:

Name      Class      English          mathematics          chinese
---- ---------- ----------- ----------- -----------
Zhang, class 1, grade 38      NULL        NULL        100
 Class 2, grade 3, Li Si      60          70          90
 Zhang San, class 2, grade 3      80          90          NULL

[description]: the class in the result set Temp1 is neither an aggregation field nor a field to be extended, so it will be the grouping basis by default!

Often, the grouping basis we expect is only the Name field,

Therefore, it is generally recommended not to directly pivot the basic table in the database, but to query specific fields as a table result set, and then pivot the result set.

1.3 how to obtain the element set to be extended

2. Inverse perspective conversion

Inverse perspective: inverse perspective conversion expands the values from multiple columns in a single record to multiple records with the same value in a single column, so that non-standard data can be integrated into a more standard version.

2.1 reverse pivoting using standard SQL

The inverse perspective of data is also divided into three steps: treat, beheading, accept and be a dog

Ah spit, no, it should be: generate a copy -- > extract elements -- > delete irrelevant intersections

WITH Temp AS 
(
	--Test data set
 	SELECT 'Zhang San' AS Name, 'Class 2, Grade 8' AS Class, NULL AS English, NULL AS mathematics, 100 AS chinese
 	UNION ALL
 	SELECT 'Li Si' AS Name, 'Class 2, grade 3' AS Class, 60 AS English, 70 AS mathematics, 90 AS chinese
 	UNION ALL
 	SELECT 'Zhang San' AS Name, 'Class 2, grade 3' AS Class, 80 AS English, 90 AS mathematics, NULL AS chinese
 )
 ,Temp1 AS 
 (
	--Make a copy for each row
	--Cross connect with the column requiring inverse perspective to obtain Cartesian product
 	SELECT * FROM  Temp CROSS JOIN( VALUES('English'),('mathematics'),('chinese')) AS T(SubjectName)
	--result
	--Name	Class	   English	   mathematics	   chinese	    SubjectName
	--Zhang San	Class 2, Grade 8	NULL	NULL	100	    English
	--Zhang San	Class 2, Grade 8	NULL	NULL	100	    mathematics
	--Zhang San	Class 2, Grade 8	NULL	NULL	100	    chinese
	--Li Si	Class 2, grade 3	60	    70	    90	    English
	--Li Si	Class 2, grade 3	60	    70	    90	    mathematics
	--Li Si	Class 2, grade 3	60	    70	    90	    chinese
	--Zhang San	Class 2, grade 3	80	    90   	NULL	English
	--Zhang San	Class 2, grade 3	80	    90  	NULL	mathematics
	--Zhang San	Class 2, grade 3	80	    90  	NULL	chinese
 )
 ,Temp2 AS
 (
	 --Current row SubjectName If it is English, take the English column
	SELECT Name,Class,Temp1.SubjectName,
        CASE Temp1.SubjectName
           WHEN 'English' THEN
               English
           WHEN 'mathematics' THEN
               mathematics
           WHEN 'chinese' THEN
               chinese
        END AS Scores
    FROM Temp1
	--result
	--Name	Class	SubjectName	Scores
	--Zhang San	Class 2, Grade 8	English	NULL
	--Zhang San	Class 2, Grade 8	mathematics	NULL
	--Zhang San	Class 2, Grade 8	chinese	100
	--Li Si	Class 2, grade 3	English	60
	--Li Si	Class 2, grade 3	mathematics	70
	--Li Si	Class 2, grade 3	chinese	90
	--Zhang San	Class 2, grade 3	English	80
	--Zhang San	Class 2, grade 3	mathematics	90
	--Zhang San	Class 2, grade 3	chinese	NULL
)
--Delete irrelevant cross rows (exclude) NULL Value (row)
SELECT * FROM Temp2 WHERE  Temp2.Scores IS NOT NULL

--result
--Name	Class	SubjectName	Scores
--Zhang San	Class 2, Grade 8	chinese	100
--Li Si	Class 2, grade 3	English	60
--Li Si	Class 2, grade 3	mathematics	70
--Li Si	Class 2, grade 3	chinese	90
--Zhang San	Class 2, grade 3	English	80
--Zhang San	Class 2, grade 3	mathematics	90

2.2 inverse pivoting using the unpivot function in T-SQL

Function: unpivot
[function]: column to row function, SQL Server 2005
[syntax]: unpivot (column name as attribute value column for attribute name in (column name of the same attribute))
[usage]: select... from table unpivot(...) as T

[example 1]

WITH Temp2 AS 
(
 SELECT 'Zhang San' AS Name, 'Class 2, Grade 8' AS Class, NULL AS English, NULL AS mathematics, 100 AS chinese
 UNION ALL
 SELECT 'Li Si' AS Name, 'Class 2, grade 3' AS Class, 60 AS English, 70 AS mathematics, 90 AS chinese
 UNION ALL
 SELECT 'Zhang San' AS Name, 'Class 2, grade 3' AS Class, 80 AS English, 90 AS mathematics, NULL AS chinese
 )
SELECT * FROM Temp2 UNPIVOT(Socore FOR Subject IN(English, mathematics, chinese)) T;

result:
Name    Class        Socore      Subject
----   ----------   --------    ----------
Zhang, class 2, grade 38      100         chinese
 Class 2, grade 3, Li Si      60          English
 Class 2, grade 3, Li Si      70          mathematics
 Class 2, grade 3, Li Si      90          chinese
 Zhang San, class 2, grade 3      80          English
 Zhang San, class 2, grade 3      90          mathematics

3. After pivoting, reverse pivoting reduces data information

In the above example data, pivot the original data (test data 1) into test data 2. The inverse perspective result of test data 2 is changed back to test data 1.

This is just a coincidence, because there are no multiple records with the same attribute (the same person and the same subject) in our test data 1:
For example, if Li Si has two English scores as follows:

WITH Temp1 AS 
(
SELECT 'Zhang San' AS Name, 'chinese' AS Subject, 100 AS Scores
UNION ALL
SELECT 'Zhang San' AS Name, 'mathematics' AS Subject, 90 AS Scores
UNION ALL
SELECT 'Zhang San' AS Name, 'English' AS Subject, NULL AS Scores--Zhang San has no record in English
UNION ALL
SELECT 'Li Si' AS Name, 'chinese' AS Subject, 90 AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'mathematics' AS Subject, 70 AS Scores
UNION ALL
SELECT 'Li Si' AS Name, 'English' AS Subject, 60 AS Scores--Li Si has two English achievements
UNION ALL
SELECT 'Li Si' AS Name, 'English' AS Subject, 60 AS Scores--Li Si has two English achievements
)
,Temp2 AS --Pivoting
(
SELECT * FROM  Temp1  PIVOT(SUM(Scores) FOR Subject IN (chinese,mathematics,English))AS T
)
SELECT * FROM Temp2 UNPIVOT(Scores FOR Subject IN (chinese,mathematics,English))AS T

Perspective results:

Name    chinese          mathematics          English
----   ----------- ----------- -----------
Li Si     90           70            one hundred and twenty
 Zhang San     100          90            NULL

Reverse perspective of perspective results:

Name Scores      Subject
---- ----------- -------------
Li si90          chinese
 Li Si 70          mathematics
 Li Si 120         English--Here, the inverse perspective cannot be restored to the original data (here, it is aggregated into one piece of data)
Zhang San 100         chinese
 Zhang San 90          mathematics
					   --Zhang San English NULL Records exist in the original data, but not after the perspective and reverse perspective

As detailed earlier, perspective consists of three logical steps, grouping -- > extension -- > aggregation.

Because pivoting contains aggregation, the amount of information in the table is reduced, so it is impossible to restore the original data in reverse pivoting.

Simply put:

UNPIVOT is not entirely the reverse operation of PIVOT.
PIVOT performs aggregation and merges multiple possible rows into one row in the output.
UNPIVOT does not reproduce the result of the original table valued expression because the rows have been merged.

In addition, the NULL value in the UNPIVOT input also disappears in the output.
If the value disappears, it indicates that the original NULL value may already exist in the input before the PIVOT operation.

Posted by thesmartone on Thu, 07 Oct 2021 00:51:55 -0700