Notes on machine learning -- iris classification

Keywords: Anaconda

Iris classification is a classic machine learning application.

Suppose a plant enthusiast has observed many iris flowers and recorded four measurements for each one: sepal length, sepal width, petal length and petal width. Every flower belongs to one of three species: setosa, versicolor or virginica. The task is to predict the species of a flower from its recorded measurements.

Because this is a classic data set, it ships with the datasets module of scikit-learn. We can call the load_iris function to load it:

from sklearn.datasets import load_iris 
iris_dataset = load_iris()
print("Keys of iris_dataset: \n{}".format(iris_dataset.keys()))

Results:

Keys of iris_dataset: 
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

We print the data under each of these keys in turn. First, data. Because the array is long, we only print the first five rows:

print("first 5 data: \n{}".format(iris_dataset['data'][:5]))

The results are as follows. Each row is one flower, and the four columns are sepal length, sepal width, petal length and petal width, in that order:

first 5 data: 
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
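
As a quick check on the layout, the following sketch (not part of the original notes) prints the shape of the data array. It assumes each row is one flower and each column is one measurement; the names n_samples and n_features are introduced here only for illustration.

```python
from sklearn.datasets import load_iris

iris_dataset = load_iris()
data = iris_dataset['data']

# data is a 2-D NumPy array: rows are flowers, columns are measurements
n_samples, n_features = data.shape
print("shape of data: {}".format(data.shape))
```

The first number is how many flowers were recorded and the second is how many measurements were taken per flower.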

Then target, and target_names:

print("target: \n{}".format(iris_dataset['target']))
print("target_names: \n{}".format(iris_dataset['target_names']))

The results are as follows. The integers 0, 1 and 2 in target encode the three species, in the order given by target_names:

target: 
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
 
target_names: 
['setosa' 'versicolor' 'virginica']
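
To make the link between the integer labels and the species names explicit, here is a minimal sketch. It relies on NumPy index-array lookup, so label i is translated to target_names[i]. The variables species and counts are illustrative names that do not appear in the original notes.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_dataset = load_iris()
target = iris_dataset['target']
target_names = iris_dataset['target_names']

# Indexing the names array with the label array turns each integer
# label into the matching species name
species = target_names[target]
print("first label: {} -> {}".format(target[0], species[0]))

# Count how many flowers carry each label
counts = np.bincount(target)
for name, count in zip(target_names, counts):
    print("{}: {} samples".format(name, count))
```

Each species has the same number of samples, which is consistent with the three equal runs of 0, 1 and 2 in the printed target array.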

DESCR is a text description of the data set, so I will not show it here. Finally, look at feature_names and filename:

print("feature_names: \n{}".format(iris_dataset['feature_names']))
print("filename: \n{}".format(iris_dataset['filename']))

These tell us what the four measured attributes are and where the CSV file holding the data is stored (the exact path depends on the installation). The results are as follows:

feature_names: 
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

filename: 
C:\Anaconda\lib\site-packages\sklearn\datasets\data\iris.csv
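
Since feature_names lists the columns of data in order, the two can be paired to label a single row. The sketch below does this for the first flower; first_sample is a hypothetical name used only in this example.

```python
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# Pair each feature name with the matching column of the first row
first_sample = dict(zip(iris_dataset['feature_names'],
                        iris_dataset['data'][0]))
for name, value in first_sample.items():
    print("{}: {}".format(name, value))
```

Read this way, the first printed row [5.1 3.5 1.4 0.2] is a flower with a sepal length of 5.1 cm, a sepal width of 3.5 cm, a petal length of 1.4 cm and a petal width of 0.2 cm.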

Posted by Virii on Fri, 22 Nov 2019 07:08:31 -0800