Using the PyYAML module in Python

Keywords: Python encoding JSON pip

I. What is YAML

YAML is a data serialization language that is widely used for configuration files. Many people find it more convenient to write by hand than JSON.

The design goal of YAML is to be easy for humans to read and write.

Compared with XML and JSON, YAML has less syntactic noise (no closing tags, few brackets or quotation marks), and it expresses structure through indentation. Does that sound like a good match for Python?

A document written in YAML is called a YAML document. PyYAML is a Python module for reading and writing YAML, and it is very simple to use.

Install it with pip install pyyaml. Note that the package is named pyyaml, while the module you import is yaml.

II. Simple use of PyYAML

It is very simple to use. Just as with json and pickle, load and dump cover most needs.

load() example: returns a Python object

import yaml

yaml_str = """
name: A big river
age: 1956
job: Singer
"""

y = yaml.load(yaml_str, Loader=yaml.SafeLoader)
print(y)

Output:

{'name': 'A big river', 'age': 1956, 'job': 'Singer'}
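
PyYAML also provides yaml.safe_load(), which is shorthand for yaml.load() with Loader=yaml.SafeLoader. A minimal sketch, reusing the same YAML text as above:

```python
import yaml

yaml_str = """
name: A big river
age: 1956
job: Singer
"""

# safe_load() is equivalent to load(..., Loader=yaml.SafeLoader)
data = yaml.safe_load(yaml_str)
same = yaml.load(yaml_str, Loader=yaml.SafeLoader)

print(data == same)               # both calls build the same dict
print(type(data["age"]).__name__) # unquoted digits are loaded as int
```

The safe loader only builds plain Python types (dict, list, str, int and so on), which is usually what you want for configuration files.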

load_all() example: returns a generator

If a string or file contains several YAML documents, you can use yaml.load_all to parse all of them.

Content of yaml_test.yaml:

---
name: qiyu
age: 20 year
---
name: qingqing
age: 19 year

The test.py script that reads the YAML file:

import yaml

with open("./yaml_test.yaml", 'r', encoding='utf-8') as ymlfile:
    cfg = yaml.load_all(ymlfile, Loader=yaml.SafeLoader)
    for data in cfg:
        print(data)

Output:

{'name': 'qiyu', 'age': '20 year'}
{'name': 'qingqing', 'age': '19 year'}
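
load_all() also accepts a string. The sketch below parses the same two documents from a string (the variable names are only illustrative) and uses list() to force the generator to parse everything:

```python
import yaml

docs_text = """\
---
name: qiyu
age: 20 year
---
name: qingqing
age: 19 year
"""

# load_all() returns a lazy generator; list() parses every document
docs = list(yaml.load_all(docs_text, Loader=yaml.SafeLoader))

print(len(docs))
for doc in docs:
    print(doc["name"], "->", doc["age"])
```

Because the generator is lazy, when you read from a file you need to consume it while the file is still open, as the loop inside the with block above does.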

dump() example: serialize a Python object into a YAML document

import yaml

json_data = {'name': 'A big river',
             'age': 1956,
             'job': ['Singer','Dancer']}

# allow_unicode=True keeps non-ASCII text readable instead of escaping it
y = yaml.dump(json_data, default_flow_style=False, allow_unicode=True)
print(y)

Output:

age: 1956
job:
- Singer
- Dancer
name: A big river
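
Two dump() options affect how this output looks: allow_unicode and sort_keys. The sketch below compares them with the defaults. The Chinese name is made-up sample data, and sort_keys requires PyYAML 5.1 or later:

```python
import yaml

data = {"name": "大河", "age": 1956, "job": ["Singer", "Dancer"]}

# Default: keys are sorted alphabetically and non-ASCII text is escaped
default_text = yaml.dump(data)

# allow_unicode keeps non-ASCII characters readable;
# sort_keys=False keeps the dict's insertion order
readable_text = yaml.dump(data, allow_unicode=True, sort_keys=False)

print(default_text)
print(readable_text)
```

Both strings load back to the same dictionary. Only the presentation differs.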

If you pass a file object as the second argument of dump(), it writes the YAML directly to that file:

import yaml

json_data = {'name': 'A big river',
             'age': 1956,
             'job': ['Singer', 'Dancer']}
with open('./yaml_write.yaml', 'w') as f:
    y = yaml.dump(json_data, f)  # writes to f; returns None when a stream is given
    print(y)                     # prints None

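Content of yaml_write.yaml after writing (with PyYAML 5.1 or later, which uses block style by default):

age: 1956
job:
- Singer
- Dancer
name: A big river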

dump() with a list: output multiple objects to a file

import yaml

obj1 = {"name": "river", "age": 2019}
obj2 = ["Lily", 1956]
obj3 = {"gang": "ben", "age": 1963}
obj4 = ["Zhuqiyu", 1994]

with open('./yaml_write_all.yaml', 'w', encoding='utf-8') as f:
    y = yaml.dump([obj1, obj2, obj3, obj4], f)
    print(y)

with open('./yaml_write_all.yaml', 'r') as r:
    y1 = yaml.load(r, Loader=yaml.SafeLoader)
    print(y1)

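Content of yaml_write_all.yaml after writing (with PyYAML 5.1 or later):

- age: 2019
  name: river
- - Lily
  - 1956
- age: 1963
  gang: ben
- - Zhuqiyu
  - 1994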

Why do some entries in the written file start with one "-" and others with two?

Why is the data read back from the YAML file a list?

III. Grammar rules and data structures of YAML

After reading the simple examples above, let's summarize the basic syntax of YAML.

The basic syntax rules of YAML are as follows:

1. Case sensitive.
2. Indentation expresses hierarchy.
3. Tabs are not allowed for indentation; only spaces are.
4. The number of spaces does not matter, as long as elements at the same level are left-aligned.
5. # starts a comment. Everything from this character to the end of the line is ignored by the parser, the same as in Python.
6. List items are marked with "-", and key-value pairs in a dictionary are separated by ":".
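
The rules above can be checked with a small sketch. The keys and values here are invented for illustration:

```python
import yaml

text = """
# A comment line: ignored by the parser
Server:          # keys are case sensitive
  host: example.local
  ports:         # nested level, indented with spaces
    - 8080
    - 8443
server: other    # a different key from "Server"
"""

config = yaml.safe_load(text)
print(sorted(config))
print(config["Server"]["ports"])

# A tab used for indentation is rejected by the parser
tab_rejected = False
try:
    yaml.safe_load("key:\n\tchild: 1")
except yaml.YAMLError as exc:
    tab_rejected = True
    print("tab rejected:", type(exc).__name__)
```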

Now that you know the grammar rules, let's answer the two questions above:

1. One "-" marks an item of the top-level list, that is, one of the objects we dumped. Two "-" appear when that item is itself a list: the first "-" belongs to the outer list and the second to the first element of the inner list.

2. Because we dumped a Python list holding several objects, the file contains a single YAML sequence, and load() returns it as a list.

3. If the YAML file contains only one dictionary, the data read back is also a dictionary.
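
For comparison, yaml.dump_all() writes each object as a separate document instead of one list. A minimal sketch that reuses two of the objects above and keeps everything in strings rather than files:

```python
import yaml

obj1 = {"name": "river", "age": 2019}
obj2 = ["Lily", 1956]

# dump() on a list: one document that holds one sequence
as_list = yaml.dump([obj1, obj2])

# dump_all(): one document per object, separated by "---"
as_docs = yaml.dump_all([obj1, obj2])

print(as_list)
print(as_docs)

# load_all() reads the documents back one by one
restored = list(yaml.load_all(as_docs, Loader=yaml.SafeLoader))
print(restored)
```

With dump_all(), the "-" markers appear only for obj2, because it is the only object that is itself a list.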

YAML supports three data structures:

1. Object: a set of key-value pairs
2. Array: a set of values in order, also called a sequence or list
3. Scalar: a single, indivisible value, such as a string, Boolean value, integer, floating-point number, null, time, or date
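
A short sketch of how the three structures map to Python types when loaded. The sample data is made up:

```python
import yaml

text = """
person:            # object (mapping) -> dict
  name: qiyu
  hobbies:         # array (sequence) -> list
    - reading
    - chess
  age: 20          # scalar -> int
"""

data = yaml.safe_load(text)
person = data["person"]

print(type(data).__name__)               # dict
print(type(person["hobbies"]).__name__)  # list
print(type(person["age"]).__name__)      # int
```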

Examples of the supported data types:

Content of yaml_test_data.yaml:

str: "Big River"                               # string
int: 1548                                      # integer
float: 3.14                                    # floating-point number
boolean: true                                  # Boolean value
None: null                                     # null; ~ can also be used
time: '2019-11-20T08:47:46.576701+00:00'       # time, ISO 8601 format
date: 2019-11-20 16:47:46.576702               # date and time
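
Loading this content shows which Python type each scalar becomes. The sketch below inlines the same values as a string, without the comments:

```python
import datetime
import yaml

text = """
str: "Big River"
int: 1548
float: 3.14
boolean: true
None: ~
time: '2019-11-20T08:47:46.576701+00:00'
date: 2019-11-20 16:47:46.576702
"""

data = yaml.safe_load(text)

# Print each key with the Python type and value it was loaded as
for key, value in data.items():
    print(f"{key:8} {type(value).__name__:10} {value!r}")
```

The quoted time stays a plain string, while the unquoted date is recognized as a timestamp and becomes a datetime.datetime object.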

Code:

import yaml
import datetime

yaml_data = {
    "str": "Big River",
    "int": 1548,
    "float": 3.14,
    'boolean': True,
    "None": None,
    # datetime.timezone.utc avoids the extra pytz dependency
    'time': datetime.datetime.now(tz=datetime.timezone.utc).isoformat(),
    'date': datetime.datetime.today()
}

with open('./yaml_test_data.yaml', 'w') as f:
    y = yaml.dump(yaml_data, f)  # returns None because a stream is given
    print(y)

with open('./yaml_test_data.yaml', 'r') as r:
    y1 = yaml.load(r, Loader=yaml.SafeLoader)
    print(y1)

Console output:

Other grammar rules

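1. Strings usually do not need quotation marks, even when they contain spaces (see "A big river" in the load() example). Quotation marks are needed when the string contains characters that have special meaning in YAML, such as ": " or " #", or when the text would otherwise be read as another type, such as true or 123.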
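
A short sketch of how quoting changes what PyYAML loads. The keys are invented for illustration:

```python
import yaml

text = """
plain: A big river        # spaces inside a plain string are fine
flag: true                # unquoted -> bool
flag_text: "true"         # quoted -> str
number_text: '123'        # quoted -> str
with_colon: "key: value"  # ": " would otherwise start a mapping
"""

data = yaml.safe_load(text)
for key, value in data.items():
    print(key, repr(value))
```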

2. References

Use & to define an anchor and * to reference it:

name: &name SKP
tester: *name

Output:

{'name': 'SKP', 'tester': 'SKP'}
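
Anchors also work on whole lists and mappings, not only on scalars. In the sketch below (key names are hypothetical), each alias resolves to the same Python object as the anchored node:

```python
import yaml

text = """
defaults: &defaults
  - Singer
  - Dancer
alice_jobs: *defaults
bob_jobs: *defaults
"""

data = yaml.safe_load(text)

print(data["alice_jobs"])
# The alias is the very same list object as the anchored value
print(data["alice_jobs"] is data["defaults"])
```

Since the objects are shared, changing data["defaults"] in place would also change what data["alice_jobs"] shows.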

3. Explicit type conversion

Use !! tags to force a type:

str: !!str 3.14
int: !!int "123"

Output:

{'str': '3.14', 'int': 123}
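
The same tag syntax works for other built-in types. A small sketch with illustrative keys:

```python
import yaml

text = """
as_float: !!float "1"
as_str: !!str 2019
as_bool: !!bool "yes"
"""

data = yaml.safe_load(text)
for key, value in data.items():
    print(key, type(value).__name__, repr(value))
```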

4. Document separators

In the same YAML file, you can use "---" (three hyphens) to separate documents, so that multiple documents can be written in one file.

For an example, see the load_all() section above.

IV. Generating YAML documents from Python objects

1. The yaml.dump() method

import yaml
import os

def generate_yaml_doc(yaml_file):
    py_object = {'school': 'zhu',
                 'students': ['a', 'b']}
    file = open(yaml_file, 'w', encoding='utf-8')
    yaml.dump(py_object, file)
    file.close()

current_path = os.path.abspath(".")
yaml_path = os.path.join(current_path, "generate.yaml")
generate_yaml_doc(yaml_path)
"""Result
school: zhu
students:
- a
- b
"""

2. Use the yaml module from the ruamel package to generate standard YAML documents

import os
from ruamel import yaml  # pip3 install ruamel.yaml

def generate_yaml_doc_ruamel(yaml_file):
    py_object = {'school': 'zhu',
                 'students': ['a', 'b']}
    file = open(yaml_file, 'w', encoding='utf-8')
    yaml.dump(py_object, file, Dumper=yaml.RoundTripDumper)
    file.close()

current_path = os.path.abspath(".")
yaml_path = os.path.join(current_path, "generate.yaml")
generate_yaml_doc_ruamel(yaml_path)
"""Result
school: zhu
students:
- a
- b
"""

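Reading a YAML document with the yaml module from the ruamel package works the same way as with the plain yaml module. Note that these module-level load() and dump() functions are the older ruamel.yaml interface; as far as I know, recent releases (0.18 and later) dropped them in favor of the ruamel.yaml.YAML class, so the code in this section needs an older release.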

import os
from ruamel import yaml

def get_yaml_data_ruamel(yaml_file):
    file = open(yaml_file, 'r', encoding='utf-8')
    data = yaml.load(file, Loader=yaml.Loader)
    file.close()
    print(data)

current_path = os.path.abspath(".")
yaml_path = os.path.join(current_path, "generate.yaml")
get_yaml_data_ruamel(yaml_path)
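
The write and read steps in this section can also be written with context managers, so the file is closed even if an error occurs. A minimal round-trip sketch with PyYAML's safe_dump() and safe_load(); the temporary directory is only there to keep the example from leaving files behind:

```python
import os
import tempfile
import yaml

py_object = {"school": "zhu", "students": ["a", "b"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    yaml_path = os.path.join(tmp_dir, "generate.yaml")

    # "with" closes the file even if safe_dump() raises
    with open(yaml_path, "w", encoding="utf-8") as f:
        yaml.safe_dump(py_object, f)

    with open(yaml_path, "r", encoding="utf-8") as f:
        restored = yaml.safe_load(f)

print(restored == py_object)
```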

Posted by agnalleo on Wed, 20 Nov 2019 04:46:16 -0800