# Handle missing and abnormal values in the data

Keywords: Python data visualization

# preface

In order to cope with my homework, I wanted to find a code on the Internet and handed it in. I found that I didn't have what I wanted, so I wrote it in tears for two days. In order to study hard, I rolled it up with my classmates 🤣
If you need to hand in your homework, just look at the source code directly. In the CleaningData () method, you must change it to the data you need. I've written everything else for you

# target

Deal with missing and abnormal values in data (find out and re assign values)

# process

## Blank value (missing value)

For the processing of blank value, I use the average value of the two lines before and after the blank value to assign a value;
If the first and second lines are blank values, the position will always be found;
If a column has only one data, I will use this data to assign a value to the whole column ψ (｀∇´) ψ
(at least I hope the last situation doesn't happen)

## Outliers

Exception handling, I use Three point Lagrange interpolation
Why use this? Because the teacher asked ƪ ( ˘ ⌣ ˘)ʃ
Use three points to find an unknown equation, and use this equation to predict the value of abnormal value position.
This equation can be worked out. You can try to output it. (shown below)

# harvest

Harvest 1:

```# Show all columns
pd.set_option('display.max_columns', None)
# Show all rows
pd.set_option('display.max_rows', None)
# Align column names
pd.set_option('display.unicode.ambiguous_as_wide', True)
pd.set_option('display.unicode.east_asian_width', True)
# Cancel scientific counting output
np.set_printoptions(suppress=True)
# In order to display the numbers intuitively, the scientific counting method is not used
pd.set_option('display.float_format', lambda x: '%.2f' % x)
# Used to display Chinese labels normally
plt.rcParams['font.sans-serif'] = ['SimHei']
# Used to display negative signs normally
plt.rcParams['axes.unicode_minus'] = False
```

Harvest 2:
np.where (data = = '1'): find the position of 1

```print(np.where(data == ''))  # Find the coordinates of the blank value
```

Harvest 3:
A valid multidimensional iterator object that can traverse an array.
Here I convert string data into floating-point numbers. Two can be implemented

```np.nditer(row)
or
[float(num) for num in arr[:, 0]]
```

Harvest 4:
Three point Lagrange interpolation

```from scipy.interpolate import lagrange
iloc_point_of_time = [float(num) for num in arr[:, 0]][4:7:1]
iloc_temperature = [float(num) for num in arr[:, 1]][4:7:1]
a1 = lagrange(iloc_point_of_time, iloc_temperature)
print(a1)
```

Harvest 5:
If you want to make the points on the line chart highlight, you only need to use plot () and scatter () at the same time
It is the combination of line chart and scatter chart # Core source code

## Process blank values

Get the coordinate array of the blank array, and pass the elements of the array to row and col in fill_value (arrow, row, Col)

```def Create_Fill_Value_Arr(arr):  # If you need coordinates, the number of abscissa and ordinate should be the same. Create a 2-dimensional zero array with len (row) length
# row and col are sets of abscissa and ordinate, respectively
# print(np.where(data == ''))  # Find the coordinates of the blank value
row = np.where(arr == '')
col = np.where(arr == '')
CoordinateArray = np.zeros((len(row), 2))  # Blank value coordinate array
row_count = 0
for i in np.nditer(row):
CoordinateArray[row_count] = i
row_count += 1
col_count = 0
for i in np.nditer(col):
CoordinateArray[col_count] = i
col_count += 1
return CoordinateArray
```

arr: array to be processed
row: abscissa of the element to be processed
col: ordinate of the element to be processed

```def Fill_Value(arr, row, col):
# The two-dimensional array is filled with blank values
# There is a precondition here. This assignment is based on the upstream and downstream values of the same column, so the column must not 1 change, only the number of rows
current_row = row  # Record the number of lines entered
current_col = col  # Record the number of columns entered
next_row = 0  # Record the number of lines in the next line
next_col = col  # Record the number of columns in the next column
pre_row = 0  # Record the number of lines in the previous line
pre_col = col  # Record the number of columns in the previous column

# Assignment of the first and last lines
if row == 0 or row == len(arr) - 1:
if row == 0:  # If the position of the input element is in the first row, the non empty number of the next row in the same column is assigned to it
flag = True  # Tag, if the assignment ends, the loop ends
while flag:
current_row += 1
if arr[current_row][current_col] != '':
arr[row][col] = arr[current_row][current_col]  # assignment
flag = False
elif arr[current_row][current_col] == '':  # Judge if the current number is null
current_row += 1  # Then find the value of the next row in the same column
else:
print("Fill_Value()method==first line==Something went wrong!!")
elif row == len(arr) - 1:  # If the position of the input element is in the last row, the non empty number of the row on the same column is assigned to it
flag = True  # Tag, if the assignment ends, the loop ends
current_row -= 1
while flag:
if arr[current_row][current_col] != '':
arr[row][col] = arr[current_row][current_col]
flag = False
elif arr[current_row][current_col] == '':
current_row -= 1
else:
print("Fill_Value()Methodical==Last line==Something went wrong!!")
else:
print("If this occurs, the condition judgment of the first and last lines is wrong")

# Assignment of non first and last lines
elif row != 0 and row != len(arr) - 1:  # Assign the average value of the upper and lower rows of the same column of the value to it
# There is an arithmetic pit here, that is, when you get to this line, the previous numbers must be non empty values, because the above algorithm fills the previous values,
# Therefore, for this number, the first n lines have values, but the latter ones do not necessarily have values. It is possible that many lines have null values,
# Therefore, we need to make some judgment when selecting the next value
# The practice here is to find a value to average no matter how many lines are separated
# If all rows in this row are blank values, the algorithm here assigns the previous value of this row to all subsequent rows
flag = True  # Tag, if the assignment ends, the loop ends
next_row = current_row + 1
pre_row = current_row - 1
while flag:
if arr[next_row][next_col] != '':  # Here we judge whether the next line is null
arr[row][col] = str((float(arr[next_row][next_col]) +  # Since the numerical calculation of two numbers is involved here, it needs to be converted to numerical type, while the above only involves the assignment behavior of a single line
float(arr[pre_row][pre_col])) / 2)  # Floating point numbers are required here, not integers
flag = False
elif arr[next_row][next_col] == '':
next_row += 1
else:
print("Fill_Value()Methodical==Assignment of non first and last lines==Something went wrong!!")
else:
print('How could this be wrong?')
return arr
```

## Handling outliers

You need to modify some data here, because when you write it, you find that if you keep improving, it won't be enough for me to write in a week, so I can only be lazy secretly. Forgive me (´▽ ʃ ♡ ƪ) (≧∇≦)ﾉ(❁´◡❁)

arr: enter the array after processing the blank value

```# Abnormal data can be obtained through the box diagram
def CleaningData(arr):
# The data of arr is all strings
# If a data is very different from all the data in the table, it is necessary to replace the normal data with the abnormal data
point_of_time = [float(num) for num in arr[:, 0]]  # This sampling time point should not be used
temperature = [float(num) for num in arr[:, 1]]
relative_humidity = [float(num) for num in arr[:, 2]]
Gas_concentration = [float(num) for num in arr[:, 3]]
Carbon_monoxide_concentration = [float(num) for num in arr[:, 4]]
# The data of data is all float
data = {
'temperature': temperature,
'relative_humidity': relative_humidity,
'Gas_concentration': Gas_concentration,
'Carbon_monoxide_concentration': Carbon_monoxide_concentration,
}
df = pd.DataFrame(data)
df.boxplot()  # Here, pandas has its own process
plt.ylabel("ylabel")
plt.xlabel("xlabel")  # We set the title of the abscissa and ordinate.
plt.savefig('Box diagram.jpg')
plt.show()
# The des data here comes from df.describe (), because
des = {
'temperature': {
'count': 44.00,
'mean': 53.63,
'std': 146.07,
'min': 20.50,
'25%': 29.06,
'50%': 31.73,
'75%': 35.93,
'max': 999.99,
},
'relative_humidity': {
'count': 44.00,
'mean': 118.39,
'std': 194.72,
'min': 60.00,
'25%': 67.00,
'50%': 76.00,
'75%': 88.25,
'max': 999.99,
},
'Gas_concentration': {
'count': 44.00,
'mean': 25.23,
'std': 150.37,
'min': 1.08,
'25%': 1.90,
'50%': 2.65,
'75%': 3.39,
'max': 999.99,
},
'Carbon_monoxide_concentration': {
'count': 44.00,
'mean': 49.15,
'std': 209.90,
'min': 1.08,
'25%': 2.47,
'50%': 3.94,
'75%': 5.23,
'max': 999.99,
}
}
columns_list = ['temperature', 'relative_humidity', 'Gas_concentration', 'Carbon_monoxide_concentration']
# If any value exceeds the average value, the value is regarded as an abnormal value
arrays_list = []  # This list is used to record the coordinates of outliers
col = 0
row = 0
for col_list in columns_list:
for i in range(len(arr)):
if data[col_list][row] >= des[col_list]['std']:
arrays_list.append([row, col])
row += 1
elif data[col_list][row] < des[col_list]['std']:
row += 1
else:
print("This can't happen once")
col += 1
row = 0
# Three values select (4, 5, 6) rows
iloc_point_of_time = [float(num) for num in arr[:, 0]][4:7:1]  # This sample collection time
iloc_temperature = [float(num) for num in arr[:, 1]][4:7:1]
iloc_relative_humidity = [float(num) for num in arr[:, 2]][4:7:1]
iloc_Gas_concentration = [float(num) for num in arr[:, 3]][4:7:1]
iloc_Carbon_monoxide_concentration = [float(num) for num in arr[:, 4]][4:7:1]
# Here, the lagrange three value interpolation method is used to call the lagrange () function
a1 = lagrange(iloc_point_of_time, iloc_temperature)
a2 = lagrange(iloc_point_of_time, iloc_relative_humidity)
a3 = lagrange(iloc_point_of_time, iloc_Gas_concentration)
a4 = lagrange(iloc_point_of_time, iloc_Carbon_monoxide_concentration)
# Assign outliers
for i in range(len(arrays_list)):
if arrays_list[i] == 0:
arr[arrays_list[i]] = a1(arrays_list[i])
elif arrays_list[i] == 1:
arr[arrays_list[i]] = a2(arrays_list[i])
elif arrays_list[i] == 2:
arr[arrays_list[i]] = a3(arrays_list[i])
elif arrays_list[i] == 3:
arr[arrays_list[i]] = a4(arrays_list[i])
print(arr)
return arr
```

# Write at the end

Because the quality of the code you write is not high, it will be a headache for you to read the whole blog.
If you really need data processing code, I'll try to extract what I think is the key content.
Wait a little further to modify the source code.

# source file

## data

I think you should be able to input these data into the csv file. It's good to read them anyway (～￣▽￣)～

```1,30.22,69,2.9,3.6
2,37.68,,2.86,3.64
3,29.32,66,,1.66
4,37.44,68,1.18,6.49
5,29.46,75,3.81,4.78
6,30.12,77,1.93,4.2
7,26.3,83,2.07,3.67
8,,66,1.46,
9,34.38,62,3.36,4.39
10,30.79,75,2.4,5.79
11,25.17,69,3.34,6.28
12,20.5,80,2.95,6.22
13,37.21,,999.99,4.75
14,31.88,94,2.41,1.49
15,39.94,77,1.98,1.08
16,33.65,67,2.03,
17,27.21,79,1.41,2.5
18,27.57,999,3.39,1.7
19,31.59,94,3.27,5.02
20,,93,,4.69
21,34.88,92,1.58,3.01
22,29.65,92,2.69,5.16
23,26.05,80,2.61,2.8
24,33.5,79,1.26,6.81
25,34.71,72,1.82,2.49
26,999.99,,3.77,999.99
27,37.06,99,1.94,3.85
28,28.57,64,3.93,
29,28,66,1.08,2.4
30,25.67,60,2.33,3.84
31,20.59,67,3.88,5.19
32,39.84,999,3.4,3.77
33,29.22,74,1.09,999.99
34,33.19,61,3.22,1.74
35,38.7,62,3.61,6.52
36,,62,,6.83
37,21.56,93,3.63,4.43
38,38.06,67,3.21,2.41
39,37.95,68,1.12,4.56
40,35.8,96,1.52,5.54
41,29.75,80,1.3,2.23
42,32.69,69,3.26,1.43
43,33.94,96,3.42,1.4
44,36.31,78,3.69,5.34
```

## Source code

```import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import lagrange

# Show all columns
pd.set_option('display.max_columns', None)
# Show all rows
pd.set_option('display.max_rows', None)
# Align column names
pd.set_option('display.unicode.ambiguous_as_wide', True)
pd.set_option('display.unicode.east_asian_width', True)
# Cancel scientific counting output
np.set_printoptions(suppress=True)
# In order to display the numbers intuitively, the scientific counting method is not used
pd.set_option('display.float_format', lambda x: '%.2f' % x)

'''
param :
arr: target array
row: that 's ok
col: column
'''
def Fill_Value(arr, row, col):
# The two-dimensional array is filled with blank values
# There is a precondition here. This assignment is based on the upstream and downstream values of the same column, so the column must not 1 change, only the number of rows
current_row = row  # Record the number of lines entered
current_col = col  # Record the number of columns entered
next_row = 0  # Record the number of lines in the next line
next_col = col  # Record the number of columns in the next column
pre_row = 0  # Record the number of lines in the previous line
pre_col = col  # Record the number of columns in the previous column

# Assignment of the first and last lines
if row == 0 or row == len(arr) - 1:
if row == 0:  # If the position of the input element is in the first row, the non empty number of the next row in the same column is assigned to it
flag = True  # Tag, if the assignment ends, the loop ends
while flag:
current_row += 1
if arr[current_row][current_col] != '':
arr[row][col] = arr[current_row][current_col]  # assignment
flag = False
elif arr[current_row][current_col] == '':  # Judge if the current number is null
current_row += 1  # Then find the value of the next row in the same column
else:
print("Fill_Value()method==first line==Something went wrong!!")
elif row == len(arr) - 1:  # If the position of the input element is in the last row, the non empty number of the row on the same column is assigned to it
flag = True  # Tag, if the assignment ends, the loop ends
current_row -= 1
while flag:
if arr[current_row][current_col] != '':
arr[row][col] = arr[current_row][current_col]
flag = False
elif arr[current_row][current_col] == '':
current_row -= 1
else:
print("Fill_Value()Methodical==Last line==Something went wrong!!")
else:
print("If this occurs, the condition judgment of the first and last lines is wrong")

# Assignment of non first and last lines
elif row != 0 and row != len(arr) - 1:  # Assign the average value of the upper and lower rows of the same column of the value to it
# There is an arithmetic pit here, that is, when you get to this line, the previous numbers must be non empty values, because the above algorithm fills the previous values,
# Therefore, for this number, the first n lines have values, but the latter ones do not necessarily have values. It is possible that many lines have null values,
# Therefore, we need to make some judgment when selecting the next value
# The practice here is to find a value to average no matter how many lines are separated
# If all rows in this row are blank values, the algorithm here assigns the previous value of this row to all subsequent rows
flag = True  # Tag, if the assignment ends, the loop ends
next_row = current_row + 1
pre_row = current_row - 1
while flag:
if arr[next_row][next_col] != '':  # Here we judge whether the next line is null
arr[row][col] = str((float(arr[next_row][next_col]) +  # Since the numerical calculation of two numbers is involved here, it needs to be converted to numerical type, while the above only involves the assignment behavior of a single line
float(arr[pre_row][pre_col])) / 2)  # Floating point numbers are required here, not integers
flag = False
elif arr[next_row][next_col] == '':
next_row += 1
else:
print("Fill_Value()Methodical==Assignment of non first and last lines==Something went wrong!!")
else:
print('How could this be wrong?')
return arr

def Create_Fill_Value_Arr(arr):  # If you need coordinates, the number of abscissa and ordinate should be the same. Create a 2-dimensional zero array with len (row) length
# row and col are sets of abscissa and ordinate, respectively
# print(np.where(data == ''))  # Find the coordinates of the blank value
row = np.where(arr == '')
col = np.where(arr == '')
CoordinateArray = np.zeros((len(row), 2))  # Blank value coordinate array
row_count = 0
for i in np.nditer(row):
CoordinateArray[row_count] = i
row_count += 1
col_count = 0
for i in np.nditer(col):
CoordinateArray[col_count] = i
col_count += 1
return CoordinateArray

# Abnormal data can be obtained through the box diagram
def CleaningData(arr):
# The data of arr is all strings
# If a data is very different from all the data in the table, it is necessary to replace the normal data with the abnormal data
point_of_time = [float(num) for num in arr[:, 0]]  # This sampling time point should not be used
temperature = [float(num) for num in arr[:, 1]]
relative_humidity = [float(num) for num in arr[:, 2]]
Gas_concentration = [float(num) for num in arr[:, 3]]
Carbon_monoxide_concentration = [float(num) for num in arr[:, 4]]
# The data of data is all float
data = {
'temperature': temperature,
'relative_humidity': relative_humidity,
'Gas_concentration': Gas_concentration,
'Carbon_monoxide_concentration': Carbon_monoxide_concentration,
}
df = pd.DataFrame(data)
df.boxplot()  # Here, pandas has its own process
plt.ylabel("ylabel")
plt.xlabel("xlabel")  # We set the title of the abscissa and ordinate.
plt.savefig('Box diagram.jpg')
plt.show()
# The des data here comes from df.describe (), because
des = {
'temperature': {
'count': 44.00,
'mean': 53.63,
'std': 146.07,
'min': 20.50,
'25%': 29.06,
'50%': 31.73,
'75%': 35.93,
'max': 999.99,
},
'relative_humidity': {
'count': 44.00,
'mean': 118.39,
'std': 194.72,
'min': 60.00,
'25%': 67.00,
'50%': 76.00,
'75%': 88.25,
'max': 999.99,
},
'Gas_concentration': {
'count': 44.00,
'mean': 25.23,
'std': 150.37,
'min': 1.08,
'25%': 1.90,
'50%': 2.65,
'75%': 3.39,
'max': 999.99,
},
'Carbon_monoxide_concentration': {
'count': 44.00,
'mean': 49.15,
'std': 209.90,
'min': 1.08,
'25%': 2.47,
'50%': 3.94,
'75%': 5.23,
'max': 999.99,
}
}
columns_list = ['temperature', 'relative_humidity', 'Gas_concentration', 'Carbon_monoxide_concentration']
# If any value exceeds the average value, the value is regarded as an abnormal value
arrays_list = []  # This list is used to record the coordinates of outliers
col = 0
row = 0
for col_list in columns_list:
for i in range(len(arr)):
if data[col_list][row] >= des[col_list]['std']:
arrays_list.append([row, col])
row += 1
elif data[col_list][row] < des[col_list]['std']:
row += 1
else:
print("This can't happen once")
col += 1
row = 0
# Three values select (4, 5, 6) rows
iloc_point_of_time = [float(num) for num in arr[:, 0]][4:7:1]  # This sample collection time
iloc_temperature = [float(num) for num in arr[:, 1]][4:7:1]
iloc_relative_humidity = [float(num) for num in arr[:, 2]][4:7:1]
iloc_Gas_concentration = [float(num) for num in arr[:, 3]][4:7:1]
iloc_Carbon_monoxide_concentration = [float(num) for num in arr[:, 4]][4:7:1]
# Here, the lagrange three value interpolation method is used to call the lagrange () function
a1 = lagrange(iloc_point_of_time, iloc_temperature)
a2 = lagrange(iloc_point_of_time, iloc_relative_humidity)
a3 = lagrange(iloc_point_of_time, iloc_Gas_concentration)
a4 = lagrange(iloc_point_of_time, iloc_Carbon_monoxide_concentration)
# Assign outliers
for i in range(len(arrays_list)):
if arrays_list[i] == 0:
arr[arrays_list[i]] = a1(arrays_list[i])
elif arrays_list[i] == 1:
arr[arrays_list[i]] = a2(arrays_list[i])
elif arrays_list[i] == 2:
arr[arrays_list[i]] = a3(arrays_list[i])
elif arrays_list[i] == 3:
arr[arrays_list[i]] = a4(arrays_list[i])
# print(arr)
return arr

# Represent the data in the form of images
def CreatePicture(arr):
# Here, corresponding variables are created for files, which may not be applicable to other files
# Because the initial value is a string, you need to convert the string in the list to floating point, otherwise an error will be reported
point_of_time = [float(num) for num in arr[:, 0]]
temperature = [float(num) for num in arr[:, 1]]
relative_humidity = [float(num) for num in arr[:, 2]]
Gas_concentration = [float(num) for num in arr[:, 3]]
Carbon_monoxide_concentration = [float(num) for num in arr[:, 4]]
# Connect each point into a line segment
plt.plot(point_of_time, temperature, label="Temperature(℃)", linestyle=":")
plt.plot(point_of_time, relative_humidity, label="relative humidity", linestyle="--")
plt.plot(point_of_time, Gas_concentration, label="Gas concentration", linestyle="-.")
plt.plot(point_of_time, Carbon_monoxide_concentration,
label="Carbon monoxide concentration", linestyle="-")
# Show each point
plt.scatter(point_of_time, temperature, s=10)
plt.scatter(point_of_time, relative_humidity, s=10)
plt.scatter(point_of_time, Gas_concentration, s=10)
plt.scatter(point_of_time, Carbon_monoxide_concentration, s=10)
plt.rcParams['font.sans-serif'] = ['SimHei']  # Used to display Chinese labels normally
plt.rcParams['axes.unicode_minus'] = False  # Used to display negative signs normally
plt.title("Coal mine sampling data sheet", fontsize=18)
plt.xlabel("Sampling time point", fontsize=12)
plt.ylabel("numerical value", fontsize=12)
plt.legend(fontsize=10)
plt.savefig('Line chart.jpg')
plt.show()

# Encapsulate all required functions in the main () method
def main(file):
# The type here is converted to string type, because float or int is not suitable for null and numeric values
data = np.loadtxt(file, skiprows=1, dtype=str, delimiter=',')
CoordinateArray = Create_Fill_Value_Arr(data)  # Returns an array of coordinates containing all null values
for i in range(len(CoordinateArray)):
arr_final = Fill_Value(
data, int(CoordinateArray[i]), int(CoordinateArray[i]))  # Although the floating-point type is converted to integer type, the coordinates here should be integers, so there is no problem with this forced conversion
arr_finally = CleaningData(arr_final)
CreatePicture(arr_finally)
return arr_finally

# It's best to use absolute addresses here, because some software seems to be inaccurate in locating relative addresses
print(main(file))
```

## Operation results

This is the modified result. The previous outliers and blank values have been re assigned
There is a small problem: the temperature has an abnormal value after calculation, which may remind me not to take it lightly. The code that feels super good will also be abnormal

```[['1' '30.22' '69' '2.9' '3.6']
['2' '37.68' '67.5' '2.86' '3.64']
['3' '29.32' '66' '2.02' '1.66']
['4' '37.44' '68' '1.18' '6.49']
['5' '29.46' '75' '3.81' '4.78']
['6' '30.12' '77' '1.93' '4.2']
['7' '26.3' '83' '2.07' '3.67']
['8' '30.340' '66' '1.46' '4.0299']
['9' '34.38' '62' '3.36' '4.39']
['10' '30.79' '75' '2.4' '5.79']
['11' '25.17' '69' '3.34' '6.28']
['12' '20.5' '80' '2.95' '6.22']
['13' '37.21' '87.0' '21.569' '4.75']
['14' '31.88' '94' '2.41' '1.49']
['15' '39.94' '77' '1.98' '1.08']
['16' '33.65' '67' '2.03' '1.79']
['17' '27.21' '79' '1.41' '2.5']
['18' '27.57' '107.0' '3.39' '1.7']
['19' '31.59' '94' '3.27' '5.02']
['20' '33.235' '93' '2.425' '4.69']
['21' '34.88' '92' '1.58' '3.01']
['22' '29.65' '92' '2.69' '5.16']
['23' '26.05' '80' '2.61' '2.8']
['24' '33.5' '79' '1.26' '6.81']
['25' '34.71' '72' '1.82' '2.49']
['26' '-41.04' '85.5' '3.77' '6.0900']
['27' '37.06' '99' '1.94' '3.85']
['28' '28.57' '64' '3.93' '3.125']
['29' '28' '66' '1.08' '2.4']
['30' '25.67' '60' '2.33' '3.84']
['31' '20.59' '67' '3.88' '5.19']
['32' '39.84' '107.0' '3.4' '3.77']
['33' '29.22' '74' '1.09' '6.0900']
['34' '33.19' '61' '3.22' '1.74']
['35' '38.7' '62' '3.61' '6.52']
['36' '30.130' '62' '3.62' '6.83']
['37' '21.56' '93' '3.63' '4.43']
['38' '38.06' '67' '3.21' '2.41']
['39' '37.95' '68' '1.12' '4.56']
['40' '35.8' '96' '1.52' '5.54']
['41' '29.75' '80' '1.3' '2.23']
['42' '32.69' '69' '3.26' '1.43']
['43' '33.94' '96' '3.42' '1.4']
['44' '36.31' '78' '3.69' '5.34']]
[['1' '30.22' '69' '2.9' '3.6']
['2' '37.68' '67.5' '2.86' '3.64']
['3' '29.32' '66' '2.02' '1.66']
['4' '37.44' '68' '1.18' '6.49']
['5' '29.46' '75' '3.81' '4.78']
['6' '30.12' '77' '1.93' '4.2']
['7' '26.3' '83' '2.07' '3.67']
['8' '30.340' '66' '1.46' '4.0299']
['9' '34.38' '62' '3.36' '4.39']
['10' '30.79' '75' '2.4' '5.79']
['11' '25.17' '69' '3.34' '6.28']
['12' '20.5' '80' '2.95' '6.22']
['13' '37.21' '87.0' '21.569' '4.75']
['14' '31.88' '94' '2.41' '1.49']
['15' '39.94' '77' '1.98' '1.08']
['16' '33.65' '67' '2.03' '1.79']
['17' '27.21' '79' '1.41' '2.5']
['18' '27.57' '107.0' '3.39' '1.7']
['19' '31.59' '94' '3.27' '5.02']
['20' '33.235' '93' '2.425' '4.69']
['21' '34.88' '92' '1.58' '3.01']
['22' '29.65' '92' '2.69' '5.16']
['23' '26.05' '80' '2.61' '2.8']
['24' '33.5' '79' '1.26' '6.81']
['25' '34.71' '72' '1.82' '2.49']
['26' '-41.04' '85.5' '3.77' '6.0900']
['27' '37.06' '99' '1.94' '3.85']
['28' '28.57' '64' '3.93' '3.125']
['29' '28' '66' '1.08' '2.4']
['30' '25.67' '60' '2.33' '3.84']
['31' '20.59' '67' '3.88' '5.19']
['32' '39.84' '107.0' '3.4' '3.77']
['33' '29.22' '74' '1.09' '6.0900']
['34' '33.19' '61' '3.22' '1.74']
['35' '38.7' '62' '3.61' '6.52']
['36' '30.130' '62' '3.62' '6.83']
['37' '21.56' '93' '3.63' '4.43']
['38' '38.06' '67' '3.21' '2.41']
['39' '37.95' '68' '1.12' '4.56']
['40' '35.8' '96' '1.52' '5.54']
['41' '29.75' '80' '1.3' '2.23']
['42' '32.69' '69' '3.26' '1.43']
['43' '33.94' '96' '3.42' '1.4']
['44' '36.31' '78' '3.69' '5.34']]
```

The box chart is to see whether there are abnormal values and how many abnormal values there are
But look at the output pictures, you can see that you have only learned a little fur The following is a combination of line chart and scatter chart Posted by katlis on Sat, 23 Oct 2021 07:36:05 -0700