Experience Recall (Manual) => peewee: CRUD Complete Resolution of Python-ORM

Keywords: Python SQL MySQL PHP

statement

Last address: https://segmentfault.com/a/11...
Although I explained in the previous article, I would like to emphasize that peewee is python-ORM (MySQL,Sqlite,postgresql only)
Although ORM is seamless and compatible with many databases, some fine grammars are not common to databases.
I use MySQL, so the following is based on MySQL (the other two databases are not much different, 99% are the same)
Chief Officer's Address: http://docs.peewee-orm.com/en...
Official Github address: https://github.com/coleifer/p...

Adding data

Mode 1: (Recommendation)
    zhang = Owner.create(name='Zhang', age=20)
    
//Mode 2:
    zhang = Owner(name='Zhang1', age=18)
    zhang.save() 
    # As you can see, it needs to use save(), so it's recommended to use one way.

//Mode 3: (Recommendation)
    cython = Owner.insert(name='Cython', age=15).execute()
    # In Mode 1 and Mode 2, the returned results are all model instances "(which means creating an instance)"
    # In this way, the result is the newly inserted primary key value "(meaning that no instance will be created)"

If there are foreign key associations, and if there are Owner keys referenced by the Pet class, there are two ways to insert data:

Mode 1: Pass values with new object variables:
    lin = Owner.create(name='lin', age=20)            
    tom1 = Pet.create(name='Tom', age=1, owner=lin) #Note owner = Lin
    
Mode 2: Maintain the primary key ID manually and pass values through the primary key (or query id):
    Lin = Owner. create (id = 100, name ='lin', age = 20)  ID gives itself a value of 100
    tom1 = Pet.create(name='Tom', age=1, owner=100) #Note owner=100

Insert multiple data: (Official files have several methods, I only say the most advocated, the fastest method (the advantage is one-time submission, no cycle).)

Mode 1:
    """
        Note the format [{}, {}, {}, {}]
        Each dictionary corresponds to a record.
    """
    data = [
        {'name': 'Alice', 'age': 18},
        {'name': 'Jack', 'age': 17},
    ]
    Owner.insert_many(data).execute()
    
Way 2: It's not necessary to specify keys in the data. It's more convenient.
    """
        Note the format [(), (), ()]
        Each tuple corresponds to a record.
    """
    data = [
        ('Alice', 18),
        ('Jack', 17),
    ]
    User.insert_many(data, fields=[Owner.name, Owner.age]).execute()
Note: You have to bring an execute() to the tail.

If the amount of data is too large, problems such as OOM may arise. You can batch manually, but peewee gives us the api for the finished product

from peewee import chunked
with mysql_db.atomic():    # Officials recommend business parcels
    for batch in chunked(data, 100):    # One hundred at a time, chunked() returns an iterative object
        Owner.insert_many(batch).execute()

Two ways to prevent data duplicate insertion (or to prevent the setting of the primary key, duplicate insertion throws an exception, causing the program to fail to run):

Method 1: INGORE Keyword (which is automatically ignored if conflicting)
    SQL:
        insert ignore  into owner (name,age) values ('lin',30);
    peewee:
        Owner.insert(name='lin', age=30).on_conflict_ignore()

//Method 2: Use ON DUPLICATE KEY UPDATE (this way, if there is a conflict, you can also do some operations)
    SQL: 
        insert into owner (name,age) values ('lin',30) 
            ON DUPLICATE KEY 
                UPDATE name='lin', age=30;             # If the conflict occurs, you can reset the value           
    peewee:
        Owner.insert(name='lin', age=30).on_conflict(
            preserve=[Owner.name, Owner.age],          # If conflict occurs, you want to keep the same fields.
            update={Owner.name: 'lin', Owner.age: 30}  # If conflict, what do you want to update?
        ).execute()
        # Note: Preservation and update are used according to the situation. Generally, one is enough.

Delete data

Method 1:
    php = Owner.get(name='PHP')
    php.delete_instance()
//Method 2:
    Owner.delete().where(Owner.name == 'lin').execute()
    # Note that this approach is similar to adding, and that there must be an execute()

Modify data

Mode 1: (not recommended)
    owner = queries for individual object results
    owner.name = 'Pack'
    owner.name = 50
    owner.save()# As you can see, we also need to call save() manually.
    
Mode 2: (Recommendation)
    query = Owner.update(name='Pack', age=50).where(Owner.name == 'Zhang')
    query.execute()

Query data

Query single data (especially, if you have more than one, it will only return you the first one)

"""Existence returns the original object and nonexistence throws it error"""
one_owner = Owner.get(name='Zhang2') 
print(one_woner.age)

//Extension 1: get_or_create
    """Existence returns the original object. If nonexistent, insert data and return new objects"""
    obj, status = Owner.get_or_create(name='Zhang23213',age=3123)
    print(obj.name, status)    
        # obj is the new object returned
        # status indicates whether the insertion was successful True or False
        
//Extension 2: get_or_none
    """Existence returns the original object, and nonexistence returns the original object. None  (Not throwing error)"""
    Owner.get_or_none(name='abc')

Query multiple data

Normally query all data

owners = Owner.select()        # Returning the result owners is a collection of objects that need to be traversed
for owner in owners:           # owner is each object (corresponding to each record)
    print(woner.name) 

Of course, you can change to python dict format after querying:

owners = Owner.select().dicts()    # Returns the result owners as a collection of class dictionary objects
for owner in owners:               # owner is the object of every dictionary. (It corresponds to every record)
    print(owner['name'])           # Dictionary grammar value, understand, let's not say more.

The above query may result in OOM if there is a large amount of data, so it can be converted to iteration:

"""Add at the end of each query .iterator() that will do"""
eg:
    owners = Owner.select().iterator()
    owners = Owner.select().dicts().iterator()

Conditional Query:

First of all, I would like to emphasize that MySQL is case-sensitive.

MySQL 5.7+, case-sensitive; (MySQL 8, and MariDB I did not try, should be the same as 5.7)
But this case-sensitive is only for the table name "" quotation mark of the SQL statement (that is, non-string grammar).
For instance:
    There is a table called owner.
        desc owner  correct
        desc OWNER # error, table does not exist
    In this case, the case is strictly distinguished because the quotation mark operation of the string is not involved.
            
The data grammar in the quotation mark (actually involving strings) is case-insensitive.
    For example (because the following examples all have "" string operations, so they are case-insensitive):
        SQL:
            Examples of queries:
                select * from owner where name='zHang'
                select * from owner where name='ZHANG'
                They both queried the same data.
            Insert examples:
                insert into owner values("zhaNg")
                insert into owner values("zhang")
                They insert the same data.                    
        peewee:
            Examples of queries:
                ...where(name="zhang")  
                ...where(name="ZHaNg")
                They both queried the same data.
            Insert examples:
                ...insert({'name':'Zhang')
                ...insert({'name': 'zhANG')
                They insert the same data.

Official-Conditional Operator: http://docs.peewee-orm.com/en...
The connection above is the official file operator, below I will extract some of the commonly used.

Common operators

And or not:

With:&
    Model class. Where ((User. is_active = True) & (User. is_admin = True)
Or:|
    Model class. Where ((User. is_admin) | (User. is_superuser)
Non:~
    Model class. Where (~ (User. username. contains ('admin'))

I will say two sentences to facilitate memory:
    1. In the SQL statement, "and or not" is the "and or not" grammar. Why does peewee not follow it?
        Answer: Because "python primitive grammar" is also the three... Conflict, so peewee changed.
    2. Look at the example above. The code for each conditional operator is enclosed in "()".

Range:

# Query data aged 18 to 20 (before and after closure)
for owner in Owner.select().where(Owner.age.between(18,20)): 
    print(owner.age)

Contains & does not include:

Does not include: not_in  (Ibid.)
//Does not include:in_

# Find out the records containing Alice and Tom's names
for owner in Owner.select().where(Owner.name.in_(['Alice', 'Tom'])): 
    print(owner.name)

Is it null:

# True means that all the records with null name are checked out.
# False means that all records with non-null name s are checked out.

for owner in Owner.select().where( Owner.name.is_null(True) ):
    print(owner.name)

Start with... And end with...

with..Start: startswith
//End with...: endswith

# Look up everything that starts with ali
for owner in Owner.select().where(Owner.name.startswith('ali')):
    print(owner.name)

Fuzzy Query:

# Query out data containing li strings
for owner in Owner.select().where(Owner.name.contains('li')):
    print(owner.name)

Regular Query:

That's interesting. As we emphasized earlier, MySQL quoted strings are case-insensitive.
Regularization provides us with case-sensitive API s. (This is a special case where only regular case-sensitive functionality exists. Remember)

Example conditions:
    Suppose we have a data name Alice
    
regexp: A strict case-sensitive rule
    # regexp is used, case-sensitive, conditional is al lowercase, so of course, can not find out, return to empty.
    for owner in Owner.select().where(Owner.name.regexp('al*')):
        print(owner.name)
iregexp: Case-insensitive rule
    # iregexp is used, case-insensitive. So even if you lower-case al, you can find Alice.
    for owner in Owner.select().where(Owner.name.iregexp('al*')):
        print(owner.name)

sort

# Default asc()
for owner in Owner.select().order_by(Owner.age):
    print(owner.age)

# Descending desc()
for owner in Owner.select().order_by(Owner.age.desc()):
    print(owner.age)

Grouping

# Grouping by name, counting all records whose number of heads is greater than 1, descending query  
query = Owner.select(Owner.name, fn.count(Owner.name).alias('total_num')) \
    .group_by(Owner.name) \
    .having(fn.count(Owner.name) > 1) \
    .order_by(SQL('total_num').desc())
    
for owner in query:
    Print (the number of people whose name is {owner.name} is {owner.total_num}

What should we pay attention to in grouping?
    1. Grouping operations, like group by in SQL, what fields are written after group by, and the previous select ion must also include
    2. alias ('statistical result field name') is a new field name for the statistical result. 
    3. The function of SQL ('total_num') is to give temporarily named query strings, which are used as temporary fields to support API s such as desc().
    4. The API of peewee is written in high imitation SQL, which is convenient for users. So we'd better synchronize the syntax specification of SQL in the following order:
         where > group_by > having > order_by

Polymerization Principle

If you don't understand peewee's principle of fn aggregation, you can read my previous article.
https://segmentfault.com/a/11...

The principle of aggregation is as follows: (Take fn.count() of the above grouping as an example)
    fn was imported in advance by me (I said at the beginning from peewee import *) and everything was imported.
    fn can use aggregation operations, I looked at the source code: explain ideas (not necessarily particularly correct):
        fn is the object of an instance of the Function class
        Function() defines the _getattr_ method, (_getattr_ at the beginning I have given the link, do not understand can be transmitted)
        
    When you use fn.xx():
        xx is passed as a string to _getattr_.
        _ getattr_ uses the decorator mode to put your xx string.
            After a series of operations, mapping to the same name of the SQL statement (these operations include case conversion, etc.)
            So you use fn.CoUNt as well as fn.CoUNt.
        In the final analysis, fn.xx() means that FN maps xx as a string to an SQL statement, which can be mapped to execute.

Common fn aggregation functions

fn.count()
    Total head count:
        for owner in Owner.select(fn.count(Owner.name).alias('total_num')):
            print(owner.total_num)
fn.lower() / fn.upper()
    Change the name to lowercase/capitalize (note that it's temporary, not real), and query it out:
        for owner in Owner.select(fn.Upper(Owner.name).alias('lower_name')):
            print(owner.lower_name)
fn.sum()
    Age summation:
        for owner in Owner.select(fn.sum(Owner.age).alias('sum_age')):
            print(owner.sum_age)
fn.avg()
    Average age age age:
        for owner in Owner.select(fn.avg(Owner.age).alias('avg_age')):
            print(owner.avg_age)
fn.min() / fn.max()
    Find out the minimum/maximum age:
        for owner in Owner.select(fn.max(Owner.age).alias('max_age')):
            print(owner.max_age)
fn.rand()    
    Usually used for disorderly queries (the default is ascending): 
        for owner in  Owner.select().order_by()
            print(owner.name)   

Prerequisite data preparation for association query

from peewee import *

mysql_db = MySQLDatabase('Your database name', user='username', password='Your password',
                         host='Your IP', port=3306, charset='utf8mb4')
class BaseModel(Model):
    class Meta:
        database = mysql_db

class Teacher(BaseModel):
    teacher_name = CharField()

class Student(BaseModel):
    student_name = CharField()
    teacher = ForeignKeyField(Teacher, backref='student')

class Course(BaseModel):
    course_name = CharField()
    teacher = ForeignKeyField(Teacher, backref='course')
    student = ForeignKeyField(Student, backref='course')

mysql_db.create_tables([Teacher, Student, Course])
data = (
    ('Tom', ('stu1', 'stu2'), ('Chinese',)),
    ('Jerry', ('stu3', 'stu4'), ('English',)),
)

for teacher_name, stu_obj, course_obj in data:
    teacher = Teacher.create(teacher_name=teacher_name)
    for student_name in stu_obj:
        student = Student.create(student_name=student_name, teacher=teacher)
        for course_name in course_obj:
            Course.create(teacher=teacher, student=student, course_name=course_name)

Association Query

Mode 1: join (connection order Teacer - > Student, Student - > Course)

# Note: You don't have to write on, because peewee will automatically pair you up.
query = Teacher.select(Teacher.teacher_name, Student.student_name, Course.course_name) \
    .join(Student, JOIN.LEFT_OUTER). \       #  Teacer -> Student
    join(Course, JOIN.LEFT_OUTER) \          #  Student -> Course
    .dicts()
for obj in query:
    print(f"Teacher:{obj['teacher_name']},Student:{obj['student_name']},curriculum:{obj['course_name']}")

Mode 2: switch (connection order Teacer - > Student, Teacher - > Course)

# It shows that the data examples I gave may not apply to the semantics of this way, but simply throw out the grammar.
query = Teacher.select(Teacher.teacher_name, Student.student_name, Course.course_name) \
    .join(Student) \                    # Teacher -> Student
    .switch(Student) \                  # Notice here that the join context power is returned to Teacher
    .join(Course, JOIN.LEFT_OUTER) \    # Teacher -> Course
    .dicts()
for obj in query:
    print(f"Teacher:{obj['teacher_name']},Student:{obj['student_name']},curriculum:{obj['course_name']}")

Mode 3: join_from (the same effect as mode 2, except for some changes in grammatical writing)

query = Teacher.select(Teacher.teacher_name, Student.student_name, Course.course_name) \
    .join_from(Teacher, Student) \                    # Notice here that the connection end and end objects are specified directly.
    .join_from(Teacher, Course, JOIN.LEFT_OUTER) \    # Notice here that the connection end and end objects are specified directly.
    .dicts()
for obj in query:
    print(f"Teacher:{obj['teacher_name']},Student:{obj['student_name']},curriculum:{obj['course_name']}")

Mode 4: Association subquery
(Note: Association sub-query means that we joined a table before, and now joins are not followed by tables, but sub-queries.)
The SQL version is as follows:

SELECT `t1`.`id`, `t1`.`student_name`, `t1`.`teacher_id`, `t2`.`stu_count` 
FROM `student` AS `t1` 
INNER JOIN (
    SELECT `t1`.`teacher_id` AS `new_teacher`, count(`t1`.`student_name`) AS `stu_count` 
    FROM `student` AS `t1` GROUP BY `t1`.`teacher_id`
) AS `t2` 
ON (`t2`.`new_teacher` = `t1`.`teacher_id`

The peewee version is as follows:

# Subqueries (grouped by students'teachers' foreign keys, counting the number of students per teacher)
temp_query = Student.select(
    Student.teacher.alias('new_teacher'),             # Remember this renaming
    fn.count(Student.student_name).alias('stu_count') # Statistical students, remember the alias, refer to the following. c grammar
).group_by(Student.teacher)    # Grouping by the teacher's foreign key in the student table
# Main query
query = Student.select(
    Student,                 # select passes on the whole class representative and queries
    temp_query.c.stu_count   # Specify the query field as the field of the sub-query, so you need to specify it in. c grammar
).join(
    temp_query,              # Association subquery
    on=(temp_query.c.new_teacher == Student.teacher) # Relevant Conditions
).dicts()

for obj in query:
    print(obj)

Way 5: No foreign key Association query (no foreign key can join oh, just specify on your own)
Rebuild a table without foreign keys and insert data

class Teacher1(BaseModel):
    teacher_name = CharField()

class Student1(BaseModel):
    student_name = CharField()
    teacher_id = IntegerField()
    
mysql_db.create_tables([Teacher1, Student1])
data = (
    ('Tom', ('zhang1', 1)),
    ('Jerry', ('zhang2', 2)),
)
for teacher_name, student_obj in data:
    Teacher1.create(teacher_name=teacher_name)
    student_name, teacher_id = student_obj
    Student1.create(student_name=student_name, teacher_id=teacher_id)

Now we implement a foreign key-free Association query:

"""Query the name of the student's teacher"""
query = Student1.select(
    Student1,     # As mentioned above, if you pass a field in select, you will look up a field, and if you pass a class, you will look up all fields.
    Teacher1      # Because the latter is join ed, but by default peewee does not list the appearance of Teacher1.
                  # So you need to specify Teacher1 manually (if we want to look up Teacher1 table information, this must be specified)
).join(
    Teacher1,     # Although there is no foreign key association, it is still join able (as is native SQL)
    on=(Student1.teacher_id==Teacher1.id)  #  This on must be specified manually
                  # Emphasize that peewee will automatically do on for us when we have foreign keys, so we don't need to specify
                  # However, this is a case where there is no foreign key association, so you must specify on manually, otherwise you can't find it.
).dicts()
for obj in query:
    print(obj)    

Mode 6: Self-associative query

# New Definition Table
class Category(Model):
    name = CharField()
    parent = ForeignKeyField('self', backref='children')  
    # Note that the foreign key quotation here is "self", which is a fixed string; backref is a reverse quotation, said.
# Create tables
mysql_db.create_tables([Category])

# insert data
data = ("son", ("father", ("grandfather", None)))
def insert_self(data):
    if data[1]:
        parent = insert_self(data[1])
        return Category.create(name=data[0], parent=parent)
    return Category.create(name=data[0])
insert_self(data)    # This is a recursive insertion method I defined myself. Maybe a little low.

# Maybe it's a little winding. Let me paste the insertion directly.
mysql> select * from category;
    +----+-------------+-----------+
    | id | name        | parent_id |
    +----+-------------+-----------+
    |  1 | grandfather |      NULL |
    |  2 | father      |         1 |
    |  3 | son         |         2 |
    +----+-------------+-----------+

# Start the query
Parent = Category.alias()   # This is the renaming operation of the table (temporary query). The acceptance parameter Parent is the table name
                            # Because self-related, self and self, copy a copy (renaming is equivalent to temporary self-copy)
query = Category.select(
    Category,
    Parent
).join(
    Parent,
    join_type=JOIN.LEFT_OUTER,    # Because the top class is empty and the default connection is inner
                                  # So the top data (grandfather) is not available.
                                  # So to look up all the data, you need to use the ==> left connection.
    # on=(Parent.id == Category.parent)    # Officials said that on needs to be specified, but I tried, not to write can also be related.
).dicts()

At this point, the introduction of related query operations is over!
Next, I will emphasize and explain the six ways mentioned above.

You can see that I used dicts() in the previous six ways, and returned a dictionary-like format. (The field names in this way conform to the SQL specification)

Of course, you can also return it in the form of class objects. (It's a bit cumbersome, I recommend dicts())
If you want to return a class object, see the following code (a little more in this way below):
query = Teacher.select(Teacher.teacher_name, Student.student_name, Course.course_name) \
    .join_from(Teacher, Student) \
    join_from(Teacher, Course, JOIN.LEFT_OUTER) # Note, I didn't use dicts()
    
for obj in query:
    print(obj.teacher_name) # This line should be all right. Teacher itself has the teacher_name field
    # Note that, according to the principle of SQL, since a join query has been made, the query results should have fields of all tables directly.
    # That's true, but peewee, we need to specify the name of the multi-table first. In the fields following the multi-table, the correct way to write it is as follows
    print(obj.student.student_name) # instead of calling obj.student_name directly
    print(obj.course.course_name) # instead of calling obj.course_name directly 
    
# Embedding a point first, if you see the example code of the N+1 query problem below, it's a bit like this.
# But as I said directly, this is pre-join(), so when it comes to look-up queries, no additional look-up queries are triggered.
# Naturally, there will be no N+1. 
# But if you don't have a join, but the query involves the appearance, then an additional appearance query will be triggered and N+1 will occur.

Associated N+1 Query Query Question:

What is N+1 query? Look at the following examples:

# There is nothing special about the data. Let's assume that the relationship between teachers and students is one-to-many (note that we use foreign keys).
class Teacher(BaseModel):
    teacher_name = CharField()

class Student(BaseModel):
    student_name = CharField()
    teacher_id = ForeignKeyField(Teacher, backref='student')

# query
teachers = Teacher.select()            # This is one time. Find out N data.
for teacher_obj in teachers:
    for student in teacher_obj.student:  # This is N cycles (N represents the data queried)
        print(student.student_name)    
        # For each appearance attribute involved, additional queries are required for the appearance, with an additional N times.
# So you can see that we query 1+N times in total, which is N+1 query. 
# In fact, we first make a table connection, query once can solve the problem. This N+1 way is a real brother.
# Here are two ways to avoid N+1

There are two ways peewee can solve the N+1 problem:
Mode 1: (join)

Connect well with join first, then query (6 ways of join mentioned above, there is always one that meets your needs)
Because peewee supports user display calling join grammar, join is a particularly good solution to the N+1 problem.

Mode 2: (prefetch of peewee)

# Of course, in addition to join, you can also use the following method provided by peewee
 # At first glance, you will find that our example of n+1 query is similar, different, you have a closer look.
teacher = Teacher.select()# Check the main table beforehand
 student = Student.select()# Look up the table beforehand
 teacher_and_student = prefetch(teacher, student) # Use prefetch method (key)
for teacher in teacher_and_student:  The following is the same as N+1.
    print(teacher.teacher_name)
    for student in teacher.student:
        print(student.student_name)
Explain:
    0. Preetch, the principle is that the master-slave table of foreign key relationship will be implicitly "one-time" removed. "When needed" can be allocated on demand.
    1. Use prefetch to find out the master-slave table with foreign key association first (note that "foreign key must exist, otherwise it will not work well")
    2. Preetch # is passed in, and peewee will automatically help us find relationships based on foreign keys.
    3. Then look up the information of other tables for the bridge by using the key fields normally.
    4. (
            As an aside, djnago also has a similar prefetch function. (Anyway, it avoids n+1 and optimizes ORM queries.) 
            It seems to set select_related() and prefetch_related() attributes to foreign key fields
        )

Unconcluding remarks

This article mainly talks about CRUD, especially for queries.
I'll also cover the extended functions of peewee in the next article.
Last portal: https://segmentfault.com/a/11...
Next portal:

Posted by mgm_03 on Mon, 02 Sep 2019 07:02:40 -0700