Contact with django has been a long time, but the way of using QuerySet query set has been relatively low-end, only using filter/Q function/exclude to query, when the amount of data is relatively small, it is OK, but if the amount of data is large, and the query is more complex, then the efficiency of using multiple filters to query will be very low. Take advantage of the Qingming vacation time, come to the company to do some private work. Output into this document, one is to deepen the impression and improve proficiency; the other is to share it for the benefit of everyone.
There are two ways to improve the efficiency of query database:
The first is to use native SQL statements to query, which has the advantage of being able to execute in full accordance with the developer's intentions. The efficiency will be very high, but the shortcomings are also obvious: 1. Developers need to be very familiar with SQL statements, increase the workload of developers, which is not conducive to the maintenance of future programs, and increase the coupling degree of programs. 2. If the query condition is dynamic, it will make the development more difficult.
To solve this problem, django provides aggregate (similar to COUNT(*) and annotate(GROUP BY).
Next, I will introduce the second method.
Today, under the direction of colleagues, I have carefully looked at the use of annotate in django, and will dynamically generate SQL statements according to query conditions, so as to improve the efficiency of combined queries.
First, suppose there is such a model:
# python:2.7.9
# django:1.7.8
class MessageTab(models.Model):
msg_sn = models.CharField(max_lenth=20, verbose_name=u'number')
msg_name = models.CharField(max_length=50, verbose_name=u'Message name')
message_time = models.DateTimeField(verbose_name=u'Time of message appearance')
msg_status = models.CharField(max_length=50, default='Untreated', verbose_name=u'Message state')
class Meta:
db_table = 'message_tab'
If in the development process, there is a requirement to query the number of message states. So we often use filter(... .count(...) To query. Now we can use:
msgS = MessageTab.objects.values_list('msg_status').annotate(Count('id'))
Where id is a self-increasing field automatically generated by the database. The use of values_list is Google, or print.
At this point, the code actually executed by the database can be passed through:
print msgS.query
Print out. As you can see:
SELECT `message_tab`.`msg_status`, COUNT(`message_tab`.`id`) AS `id__count` FROM `message_tab` GROUP BY `message_tab`.`msg_status` ORDER BY NULL
It's very intuitive. group by msg_status. If you want to customize id_count, such as specifying msg_num, you can use: annotate(msg_num=Count('id'))
When there are multiple query conditions, such as the last 7 days of query, message_name belongs to a message within a group, Q function can be used:
date_end = now().date() + timedelta(days=1)
date_start = date_end - timedelta(days=7)
messageTimeRange = (date_start, date_end)
GroupList = getGroupIdLis(request.user) # Returns a list of group s that the current user can query. For reference only
qQueryList = [Q(message_time__range=messageTimeRange), Q(message_name__in=GroupList)] # There can be multiple Q function query conditions
msgS = MessageTab.objects.filter(reduce(operator.and_, qQueryList)).values_list('msg_status').annotate(msg_num=Count('id'))
Call print msgS.query again to see the SQL statement:
SELECT `message_tab`.`msg_status`, COUNT(`message_tab`.`id`) AS `msg_num` FROM `message_tab` WHERE (`message_tab`.`message_time` BETWEEN 2017-03-27 00:00:00 AND 2017-04-03 00:00:00 AND `message_tab`.`message_name` IN (1785785, 78757, 285889, 2727333, 7272957, 786767)) GROUP BY
`message_tab`.`msg_status` ORDER BY NULL
Is it perfect!!?
Okay, the first half is over. It's time for dinner. I'm going to have dinner. Later, when you have time, add the specific usage of aggregate.
ayocross 2017.04.02
Links to the original text: http://blog.csdn.net/AyoCross/article/details/68951413 Reprinted please note, thank you!