# Data analysis with matplotlib

Keywords: Python Machine Learning matplotlib

## Data analysis with matplotlib

### Introduction to matplotlib

matplotlib official website: https://matplotlib.org/

Data analysis: compiling statistics on a large amount of data and organizing it in order to draw conclusions and provide data support for subsequent decisions.

Why learn matplotlib?

• It visualizes data and presents it more intuitively
• It makes the data more objective and persuasive

matplotlib: the most popular low-level plotting library for Python, used mainly to produce data visualization charts. Its name comes from MATLAB, whose plotting interface it imitates.

### matplotlib basics

Simple example

```python
from matplotlib import pyplot as plt
import random

# Use a font that can render Chinese characters (only needed for Chinese labels)
plt.rcParams['font.sans-serif'] = ['KaiTi']  # sets the default sans-serif font

x = range(0, 120)
y = [random.randint(20, 35) for i in range(0, 120)]

# Set the figure size (if the image looks blurry, raise the dpi to make it sharper)
fig = plt.figure(figsize=(11, 8), dpi=80)

# Plot
plt.plot(x, y)

# One label every 10 minutes: 10:00 ... 10:50, then 11:00 ... 11:50
_xtick_labels = ["10:{:02d}".format(i * 10) for i in range(6)]
_xtick_labels += ["11:{:02d}".format(i * 10) for i in range(6)]

# rotation is the angle, in degrees, by which the tick labels are rotated
plt.xticks(list(x)[::10], _xtick_labels, rotation=45)

plt.xlabel("time")
plt.ylabel("temperature/℃")
plt.title("Temperature change per minute from 10:00 to 12:00")

# Save the figure (before show(), so the saved image is not empty)
plt.savefig("./Simple example t1.png")
plt.show()
```
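The relationship between `figsize`, `dpi` and the size of the resulting image can be checked with a short sketch. It assumes the non-interactive `Agg` backend and saves to an in-memory buffer rather than a file; both choices are only there so the sketch runs without a display and are not part of the example above.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt

# figsize is given in inches, dpi in dots per inch
fig = plt.figure(figsize=(11, 8), dpi=80)
plt.plot(range(10), [v * v for v in range(10)])

# Pixel size of the figure = size in inches * dpi
width_px, height_px = fig.get_size_inches() * fig.dpi

# Saving with a higher dpi produces a sharper image of the same figure
buffer = io.BytesIO()  # in-memory file, used only for this sketch
fig.savefig(buffer, format="png", dpi=160)
png_bytes = buffer.getvalue()
plt.close(fig)
```

Passing `dpi` to `savefig` changes only the saved image; the figure on screen keeps the dpi it was created with.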

### Drawing multiple lines on one figure

Suppose you want to draw, on the same figure, the number of gatherings per year for you and your deskmate between the ages of 11 and 30.

```python
from matplotlib import pyplot as plt

# Use a font that can render Chinese characters (only needed for Chinese labels)
plt.rcParams['font.sans-serif'] = ['KaiTi']  # sets the default sans-serif font

y_1 = [1, 0, 1, 1, 2, 4, 3, 2, 3, 4, 4, 5, 6, 5, 4, 3, 3, 1, 1, 1]
y_2 = [1, 0, 3, 1, 2, 2, 3, 3, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1]
x = range(11, 31)

# Set the figure size
fig = plt.figure(figsize=(12, 5), dpi=80)

# Plot
"""
Options that can be passed when plotting:
color = 'r'        line color
linestyle = '--'   line style
linewidth = 5      line thickness
alpha = 0.5        transparency
"""
plt.plot(x, y_1, label="me", color="orange")
plt.plot(x, y_2, label="deskmate", color="cyan")

# Set the ticks
_xtick_labels = ["{} years".format(i) for i in x]
plt.xticks(x, _xtick_labels)
plt.yticks(range(0, 9))

# Draw the grid
plt.grid(alpha=0.5, linestyle=":")
plt.legend(loc="upper right")

plt.xlabel("Age")
plt.ylabel("gatherings (count)")
plt.title("Gatherings per year between the ages of 11 and 30")

plt.show()
```
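The styling options listed in the comment of the example (`color`, `linestyle`, `linewidth`, `alpha`) can be tried with a minimal sketch like the one below. The `Agg` backend is an assumption made so that it runs without a display; the data is the `y_1` series from above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt

x = range(11, 31)
y = [1, 0, 1, 1, 2, 4, 3, 2, 3, 4, 4, 5, 6, 5, 4, 3, 3, 1, 1, 1]

fig = plt.figure(figsize=(12, 5), dpi=80)

# plot() returns a list of Line2D objects; unpack the single line
(line,) = plt.plot(
    x, y,
    color="r",        # line color
    linestyle="--",   # dashed line
    linewidth=5,      # line thickness in points
    alpha=0.5,        # 0 is fully transparent, 1 is opaque
    label="me",
)
plt.legend(loc="upper left")
```

The returned `Line2D` object exposes the same settings through getters such as `line.get_linestyle()`, which is handy for checking what was actually applied.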

### Comparison of common statistical charts

• Line chart: a statistical chart that shows the increase or decrease of a quantity through the rise or fall of a line.

Features: shows the trend of the data and reflects how things change (change)

• Histogram: a series of vertical bars or line segments of different heights that represents the distribution of the data. Generally, the horizontal axis represents the data range and the vertical axis represents the distribution.

Features: plots continuous data and shows the distribution of one or more groups of data (statistics)

• Bar chart: data arranged in the rows or columns of a worksheet can be drawn as a bar chart.

Features: plots discrete data, shows the size of each value at a glance, and makes differences between values easy to compare (statistics)

• Scatter plot: uses two groups of data to form coordinate points and examines how the points are distributed, in order to judge whether the two variables are correlated or to summarize the distribution pattern of the points.

Features: shows whether there is a quantitative trend between variables and exposes outliers (distribution pattern)
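To see the four chart types next to each other, the following sketch draws one of each in a 2x2 grid. The data is made up for illustration, and the `Agg` backend is an assumption so that the code runs without a display.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt

random.seed(0)  # made-up data, fixed seed for reproducibility
x = list(range(30))
values = [random.randint(20, 35) for _ in x]

fig, axes = plt.subplots(2, 2, figsize=(10, 7), dpi=80)

axes[0, 0].plot(x, values)                 # line chart: trend over x
axes[0, 0].set_title("Line chart")

axes[0, 1].hist(values, bins=5)            # histogram: distribution of values
axes[0, 1].set_title("Histogram")

axes[1, 0].bar(["a", "b", "c"], [3, 7, 5])  # bar chart: discrete categories
axes[1, 0].set_title("Bar chart")

axes[1, 1].scatter(x, values)              # scatter plot: relation between x and values
axes[1, 1].set_title("Scatter plot")

fig.tight_layout()
```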

### Scatter plot

Code example

```python
from matplotlib import pyplot as plt

# Use a font that can render Chinese characters (only needed for Chinese labels)
plt.rcParams['font.sans-serif'] = ['KaiTi']  # sets the default sans-serif font

y_3 = [11, 17, 16, 11, 12, 11, 12, 6, 6, 7, 8, 9, 12, 15, 14, 17, 18,
       21, 16, 17, 20, 14, 15, 15, 15, 19, 21, 22, 22, 22, 23]
y_10 = [26, 26, 28, 19, 21, 17, 16, 19, 18, 20, 20, 19, 22, 23, 17,
        20, 21, 20, 22, 15, 11, 15, 5, 13, 17, 10, 11, 13, 12, 13, 6]

x_3 = range(1, 32)
x_10 = range(51, 82)  # shifted by 50 so October is drawn to the right of March

# Set the figure size
plt.figure(figsize=(11, 7), dpi=80)

# Calling scatter() instead of plot() is the only difference from the line charts above
plt.scatter(x_3, y_3, label='March')
plt.scatter(x_10, y_10, label='October')

_x = list(x_3) + list(x_10)
_xtick_labels = ["Mar {}".format(i) for i in x_3]
_xtick_labels += ["Oct {}".format(i - 50) for i in x_10]
plt.xticks(_x[::5], _xtick_labels[::5], rotation=45)

plt.legend(loc="upper right")

plt.xlabel("time")
plt.ylabel("temperature")
plt.title("Daily temperature in March and October")

# Show the figure
plt.show()
```

### Drawing bar charts

#### Drawing a basic bar chart

Code example

```python
# Draw a horizontal bar chart
from matplotlib import pyplot as plt

# Use a font that can render Chinese characters (only needed for Chinese labels)
plt.rcParams['font.sans-serif'] = ['KaiTi']  # sets the default sans-serif font

a = ["Vulgar novel", "Speed and passion 8", "Burning years", "Schindler's list", "Brave heart", "Gone with the wind",
     "Brilliant life", "Beautiful life", "How beautiful life is", "godfather", "The Hobbit", "Titanic"]
b = [36.01, 25.90, 17.53, 29.60, 42.40, 33.53, 37.80, 40.52, 60.43, 57.33, 43.90, 50.99]

# Set the figure size
plt.figure(figsize=(10, 7), dpi=80)
# Draw the bar chart
plt.barh(range(len(a)), b, height=0.4, color='cyan')
# Use the strings as tick labels on the y axis
plt.yticks(range(len(a)), a)

# For a vertical bar chart, the corresponding code would be:
# plt.bar(range(len(a)), b, width=0.3)
# plt.xticks(range(len(a)), a, rotation=40)

plt.grid(alpha=0.4, color='pink')

plt.ylabel("film")
plt.xlabel("box office (hundred million)")
plt.title("Box office statistics of films in xxx")

plt.show()
```

#### Drawing grouped bar charts

Code example

```python
from matplotlib import pyplot as plt

# Use a font that can render Chinese characters (only needed for Chinese labels)
plt.rcParams['font.sans-serif'] = ['KaiTi']  # sets the default sans-serif font

a = ["Pride and Prejudice", "Camellia woman", "Gone with the wind", "Reason and emotion"]
b_16 = [1746, 324, 4466, 389]
b_15 = [1247, 158, 2039, 189]
b_14 = [2490, 389, 3900, 289]

bar_width = 0.2

# Shift each series by one bar width so the bars of a group sit side by side
x_14 = list(range(len(a)))
x_15 = [i + bar_width for i in x_14]
x_16 = [i + bar_width * 2 for i in x_14]

# Set the figure size
plt.figure(figsize=(10, 7), dpi=80)

plt.bar(x_14, b_14, width=bar_width, label="Sep 14")
plt.bar(x_15, b_15, width=bar_width, label="Sep 15")
plt.bar(x_16, b_16, width=bar_width, label="Sep 16")

# Set the legend
plt.legend(loc="upper right")

# Put the x ticks under the middle bar of each group
plt.xticks(x_15, a)

plt.show()
```
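The offsets `x_14`, `x_15`, `x_16` follow a simple pattern, so they can be computed for any number of series. The helper `grouped_positions` below is a hypothetical name introduced for this sketch, not a matplotlib function, and the `Agg` backend is assumed so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt


def grouped_positions(n_groups, n_series, bar_width):
    """Hypothetical helper: x positions per series, plus the centre of each group."""
    base = range(n_groups)
    positions = [[i + s * bar_width for i in base] for s in range(n_series)]
    centres = [i + bar_width * (n_series - 1) / 2 for i in base]
    return positions, centres


a = ["Pride and Prejudice", "Camellia woman", "Gone with the wind", "Reason and emotion"]
series = {
    "Sep 14": [2490, 389, 3900, 289],
    "Sep 15": [1247, 158, 2039, 189],
    "Sep 16": [1746, 324, 4466, 389],
}
bar_width = 0.2
positions, centres = grouped_positions(len(a), len(series), bar_width)

fig, ax = plt.subplots(figsize=(10, 7), dpi=80)
for xs, (label, heights) in zip(positions, series.items()):
    ax.bar(xs, heights, width=bar_width, label=label)

# Tick labels go under the centre of each group
ax.set_xticks(centres)
ax.set_xticklabels(a)
ax.legend(loc="upper right")
```

With three series the group centres coincide with the positions of the middle series, which is why the example above can pass `x_15` to `plt.xticks`.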

### Drawing a histogram

Simple example

```python
# Durations of 180 films
from matplotlib import pyplot as plt
import random

# Use a font that can render Chinese characters (only needed for Chinese labels)
plt.rcParams['font.sans-serif'] = ['KaiTi']  # sets the default sans-serif font

a = [random.randint(0, 100) + 90 for i in range(1, 181)]

# Compute the number of bins
d = 5  # bin width
num_bins = (max(a) - min(a)) // d

# Set the figure size
plt.figure(figsize=(11, 6), dpi=80)

# density=True normalizes the histogram; it replaces the older normed
# argument, which newer matplotlib releases no longer accept
plt.hist(a, num_bins, density=True)

# Set the ticks of the x axis
plt.xticks(range(min(a), max(a) + d, d))

plt.grid(alpha=0.4, color='cyan')
plt.show()
```
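When `max(a) - min(a)` is not a multiple of the bin width `d`, the bins computed from `num_bins` do not line up with the ticks. One way around this, sketched below under the assumption of the `Agg` backend and a fixed random seed, is to pass the bin edges themselves to `hist` and reuse them as ticks.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt

random.seed(1)  # fixed seed so the sketch is reproducible
a = [random.randint(0, 100) + 90 for _ in range(180)]

d = 5  # bin width
# Explicit bin edges: start at min(a), step by d, last edge >= max(a)
edges = list(range(min(a), max(a) + d, d))

fig = plt.figure(figsize=(11, 6), dpi=80)
# With density=True the bar heights are scaled so the total area is 1
counts, bin_edges, _ = plt.hist(a, bins=edges, density=True)

# Ticks fall exactly on the bin edges
plt.xticks(edges)
plt.grid(alpha=0.4, color='cyan')
```

Because every bin has width `d`, the heights returned in `counts` multiplied by `d` add up to 1.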

### Property settings

#### 1. Legend location (loc)

loc='center left' is equivalent to loc=6

```
'best':         0,
'upper right':  1,
'upper left':   2,
'lower left':   3,
'lower right':  4,
'right':        5,
'center left':  6,
'center right': 7,
'lower center': 8,
'upper center': 9,
'center':       10,
```
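The equivalence between the string and the integer form can be checked with a short sketch. It relies on `Legend.codes`, the class attribute in `matplotlib.legend` that, as far as I know, holds this table; treat that attribute and the `Agg` backend as assumptions of the sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt
from matplotlib.legend import Legend

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="demo")

# Place the legend by name ...
legend_by_name = ax.legend(loc="center left")

# ... or by the matching integer code (this call replaces the legend above)
code = Legend.codes["center left"]
legend_by_code = ax.legend(loc=code)
```

The string form is usually easier to read, so the integer codes are mostly useful for understanding older code.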

#### 2. Line style

```
-	Solid line
--	Dashed line
-.	Dash-dot line
:	Dotted line
```

#### 3. Line chart point markers

```
s--square
h--hexagon1
H--hexagon2
*--star
+--plus
x--x shape
d--thin diamond
D--diamond
p--pentagon
```
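Line styles and markers are passed to `plot` through the `linestyle` and `marker` keyword arguments. The sketch below combines a few of the values from the two tables; the data is made up and the `Agg` backend is assumed so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
from matplotlib import pyplot as plt

x = list(range(10))
# (linestyle, marker) pairs taken from the tables above
styles = [("-", "s"), ("--", "h"), ("-.", "*"), (":", "D")]

fig, ax = plt.subplots(figsize=(8, 5), dpi=80)
lines = []
for offset, (linestyle, marker) in enumerate(styles):
    # Shift each line upwards so the styles are easy to tell apart
    (line,) = ax.plot(
        x, [v + offset * 2 for v in x],
        linestyle=linestyle,
        marker=marker,
        label="linestyle={!r}, marker={!r}".format(linestyle, marker),
    )
    lines.append(line)

ax.legend(loc="upper left")
```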

### Drawing other kinds of charts

Examples on the matplotlib official website

Click any example in the gallery that interests you to see its complete drawing code. You can then swap in your own data, for example:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Ellipse

# Fixing random state for reproducibility
np.random.seed(19680801)

NUM = 200

# xy is the (x, y) centre of each ellipse, so it needs two random values
ells = [Ellipse(xy=np.random.rand(2) * 10,
                width=np.random.rand(), height=np.random.rand(),
                angle=np.random.rand() * 360)
        for i in range(NUM)]

fig, ax = plt.subplots(subplot_kw={'aspect': 'equal'})
for e in ells:
    ax.add_artist(e)  # without this the ellipses are never drawn
    e.set_clip_box(ax.bbox)
    e.set_alpha(np.random.rand())
    e.set_facecolor(np.random.rand(3))

ax.set_xlim(0, 10)
ax.set_ylim(0, 10)

plt.show()
```

### References

https://matplotlib.org/

[python tutorial] data analysis -- numpy, pandas, matplotlib

matplotlib can easily solve the problem of Chinese garbled code

legend setting of python drawing.

Python data analysis: drawing of line chart and scatter chart

Posted by grudz on Tue, 28 Sep 2021 03:17:13 -0700