Aggregation operation of MongoDB learning

Keywords: Database JSON MongoDB Programmer nosql

Basic operation of MongoDB learning

The aggregation operation of mongo is similar to the query of mysql

SQL operations / functionsmongodb aggregation operation
where$match
group by$group
having$match
select$project
order by$sort
limit$limit
sum()$sum
count()$sum
join$lookup

The following examples are compared with sql

The following is a basic structure for using a database

Data link (extraction code: gqh2)

{ 
  _id: ObjectId("5dbe7a545368f69de2b4d36e"),
  street: '493 Hilll Curve',
  city: 'Champlinberg',
  state: 'Texas',
  country: 'Malaysia',
  zip: '24344-1715',
  phone: '425.956.7743 x4621',
  name: 'Destinee Schneider',
  userId: 3573,
  orderDate: 2019-03-26T03:20:08.805Z,
  status: 'created',
  shippingFee: Decimal128("8.00"),
  orderLines: [ 
     { 
       product: 'Refined Fresh Tuna',
       sku: '2057',
       qty: 25,
       price: Decimal128("56.00"),
       cost: Decimal128("46.48") },
     {
       product: 'Refined Concrete Ball',
       sku: '1732',
       qty: 61,
       price: Decimal128("47.00"),
       cost: Decimal128("47") 
     },
  ],
  total: Decimal128("407") 
}

Let's start with some operation cases

select sum(total) from orders

db.orders.aggregate({$group:{_id:null,total:{$sum:"$total"}}})
result:{ _id: null, 'total': Decimal128("44019609") }
select count(1) from orders

db.orders.aggregate({$group:{_id:null,total:{$sum:1}}})
result:{ _id: null, total: 100000 }
select count(1) from orders group by status

db.orders.aggregate({$group:{_id:"$status",total:{$sum:1}}})
result:
{ _id: 'created', total: 20087 }
{ _id: 'shipping', total: 20017 }
{ _id: 'cancelled', total: 19978 }
{ _id: 'completed', total: 20015 }
{ _id: 'fulfilled', total: 19903 }
select count(1) from orders group by status having count(1) > 20000

db.orders.aggregate([
    {$group:{_id:{status:'$status'},total:{$sum:1}}},
    {$match:{total:{$gte:20000}}}
 ])
result:
{ _id: { status: 'created' }, total: 20087 }
{ _id: { status: 'shipping' }, total: 20017 }
{ _id: { status: 'completed' }, total: 20015 }
select count(1) total
from orders 
group by status,year(orderDate),month(orderDate)
order by year(orderDate),month(orderDate)

db.orders.aggregate([
  {
    $group:{
      _id:{
         status:'$status',
         orderDate:{
           year:{$year:"$orderDate"},
           month:{$month:"$orderDate"}
         }
       },
       total:{$sum:1}
     }
  },{
    $sort:{"_id.orderDate.year":1,"_id.orderDate.month":1}
  }
])
result:
{ _id: { status: 'cancelled', orderDate: { year: 2019, month: 1 } }, total: 2066 }
{ _id: { status: 'shipping', orderDate: { year: 2019, month: 1 } }, total: 2058 }
{ _id: { status: 'completed', orderDate: { year: 2019, month: 1 } }, total: 2068 }
{ _id: { status: 'created', orderDate: { year: 2019, month: 1 } }, total: 2047 }
{ _id: { status: 'fulfilled', orderDate: { year: 2019, month: 1 } }, total: 2076 }
{ _id: { status: 'cancelled', orderDate: { year: 2019, month: 2 } }, total: 1816 }
{ _id: { status: 'created', orderDate: { year: 2019, month: 2 } }, total: 1817 }
{ _id: { status: 'shipping', orderDate: { year: 2019, month: 2 } }, total: 1844 }
{ _id: { status: 'completed', orderDate: { year: 2019, month: 2 } }, total: 1813 }
{ _id: { status: 'fulfilled', orderDate: { year: 2019, month: 2 } }, total: 1913 }
......
select *
from(
    select month(orderDate) month,name,status
    from orders
) order
where month = 2

db.orders.aggregate([{$project:{month:{$month:"$orderDate"},name:1,status:1}},{$match:{month:2}}]) 
result:
{ _id: ObjectId("5dbe7a542411dc9de6429190"),name: 'Kris Hansen',status: 'cancelled',month: 2 }
{ _id: ObjectId("5dbe7a542411dc9de6429191"),name: 'Constantin Wuckert',status: 'completed',month: 2 }
{ _id: ObjectId("5dbe7a545368f69de2b4d375"),name: 'Reed Jerde',status: 'fulfilled',month: 2 }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1d2"),name: 'Lyric Hodkiewicz',status: 'cancelled',month: 2 }
.....
select count(*) from orders where month(orderDate) >= 3 group by month(orderDate)

db.orders.aggregate([
    {$match:{$expr:{$gte:[{$month:"$orderDate"},3]}}},
    {$group:{_id:{month:{$month:"$orderDate"}},count:{$sum:1}}}
 ]);
#result
{ _id: { month: 6 }, count: 9915 }
{ _id: { month: 4 }, count: 10083 }
{ _id: { month: 10 }, count: 9928 }
{ _id: { month: 5 }, count: 10142 }
{ _id: { month: 8 }, count: 10194 }
{ _id: { month: 9 }, count: 9779 }
{ _id: { month: 7 }, count: 10240 }
{ _id: { month: 3 }, count: 10201 }

MongoDB aggregate

Aggregation in MongoDB is mainly used to process data (such as statistical average, summation, etc.) and return calculated data results

An aggregate consists of multiple stages. The result of the previous stage will be used as the input of the next stage, so it will also be vividly called pipeline.

Expressions: process input documents and output. The expression is stateless. It can only be used to calculate the document of the current aggregation pipeline and cannot process other documents.

Here we introduce some common operations in the aggregation framework:

  • $project: modify the structure of the input document. It can be used to rename, add, or delete fields, or to create calculation results and nested documents.
  • $match: used to filter data and output only qualified documents$ Match uses MongoDB's standard query operations.
  • $limit: used to limit the number of documents returned by the MongoDB aggregation pipeline.
  • $skip: skip the specified number of documents in the aggregation pipeline and return the remaining documents.
  • $unwind: split an array type field in the document into multiple pieces, each containing a value in the array.
  • $group: groups documents in the collection, which can be used to count results.
  • $sort: sort the input documents and output them.
  • $geoNear: output ordered documents close to a geographic location.

The following is a basic process flow of aggregate

  • db.collection.aggregate() can create a pipeline with multiple components to process a series of documents. These components include: match of filtering operation, project of mapping operation, group of grouping operation, sort of sorting operation, limit of limiting operation, and skip of skipping operation.
  • db.collection.aggregate() uses MongoDB's built-in native operation, which has high aggregation efficiency and supports functions similar to SQL Group By operation.
  • The pipeline is limited to 100MB of memory per stage. If a node pipeline exceeds this limit, MongoDB will generate an error. In order to process large data sets, you can set allowDiskUse to true to write data to temporary files at the aggregation pipeline node. In this way, the memory limit of 100MB can be solved.
  • db.collection.aggregate() can act on the fragment set, but the result cannot be input in the fragment set. MapReduce can act on the fragment set, and the result can also be input in the fragment set.
  • The db.collection.aggregate() method can return a cursor. The data is placed in memory for direct operation. Pointer operation is the same as Mongo shell.
  • The output of db.collection.aggregate() can only be saved in one document, and the size of BSON Document is limited to 16M. It can be solved by returning pointers. In version 2.6, the db.collect.aggregate() method returns a pointer, which can return the size of any result set.

$count

Return document statistics

Let's first look at some count usage methods in non aggregation operations

#The corresponding query is the sum of all data in the orders collection
db.orders.count();
#Result: {"result": 100000}
#Find out the total number of orders from Malaysia
db.orders.find({country:"Malaysia"}).count()
#Result: {"result": 392}

Use $count in the aggregation operation to summarize the number of rows

#Use aggregation to find the total of orders from Malaysia and return it to the counts field
db.orders.aggregate([
    {$match:{country:"Malaysia"}},
    {$count:"counts"}
])
#Result: {"counts": 392}
#The following are two different ways to write, but they are different when $match (you can experience the following first)
db.orders.aggregate([
    {$match:{country:{$eq:"Malaysia"}}},
    {$count:"counts"}
])
db.orders.aggregate([
    {$match:{$expr:{$eq:["$country","Malaysia"]}}},
    {$count:"counts"}
])
#Result: {"counts": 392}

In addition, you can flexibly use group+$sum to implement $count

#The corresponding query is the sum of all data in the orders collection and is returned to the counts field
db.orders.aggregate({$group:{_id:null,counts:{$sum:1}}})
#Result: {"_id": null,"counts": 100000}
#Use aggregation to find the total of orders from Malaysia and return it to the counts field
db.orders.aggregate([
    {$match:{country:{$eq:"Malaysia"}}},
    {$group:{_id:null,counts:{$sum:1}}}
])
#Result: {"_id": null,"counts": 392}

$group

Groups documents according to the specified expression

Basic syntax used by $group:

{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }
  1. _ id + expressions are used as grouping conditions, that is_ The content after id is the same as the expression after group by in sql
  2. _ The field + accumulator operator after id is the same as the aggregate function after select after group by in sql, for example: sum(), avg(), max(), min()

For example:

db.orders.aggregate({$group:{_id:"$country",total:{$sum:"$total"}}})
#result
{ _id: 'Guam', total: Decimal128("182335") }
{ _id: 'El Salvador', total: Decimal128("159475") }
{ _id: 'Saint Martin', total: Decimal128("163267") }
{ _id: 'Botswana', total: Decimal128("189330") }
{ _id: 'San Marino', total: Decimal128("174200") }
{ _id: 'Czech Republic', total: Decimal128("178602") }
{ _id: 'Estonia', total: Decimal128("172816") }
.......
#The mql above is equivalent to the mql in sql
select sum(total) from orders group by country

The memory limit for the $group phase is 100M. By default, $group will generate an error if the stage exceeds this limit. However, to allow processing of large datasets, set the allowDiskUse option to true to enable the $group operation to write to temporary files.

namedescribeAnalogy sql
$avgCalculated meanavg
$firstReturn the first document of each group. If there is sorting, sort it according to. If there is no first document in the default storage order.limit 0,1
$lastReturn the last document of each group. If there is sorting, sort it according to. If there is no last document in the default storage order.-
$maxAccording to the grouping, get the maximum corresponding value of all documents in the collection.max
$minAccording to the grouping, get the minimum value of the corresponding value of all documents in the collection.min
$pushAdds the value of the specified expression to an array.-
$addToSetAdds the value of an expression to a collection (no duplicate values, unordered).-
$sumCalculate sumsum
$stdDevPopReturns the population standard deviation of the input value-
$stdDevSampReturns the sample standard deviation of the input value-

Let's use each expression in turn according to the above documents

  1. $avg calculate average

    --Calculate the average consumption per order for each country
    db.orders.aggregate({$group:{
            _id:"$country",
            avgMoney:{$avg:"$total"}
        }})
    --result    
    { _id: 'Saudi Arabia',avgMoney: Decimal128("433.4898419864559819413092550790068") }
    { _id: 'New Caledonia',avgMoney: Decimal128("441.9833729216152019002375296912114") }
    { _id: 'Congo',avgMoney: Decimal128("451.8834951456310679611650485436893") }
    { _id: 'Turkey',avgMoney: Decimal128("425.7422434367541766109785202863962") }
    { _id: 'Cuba',avgMoney: Decimal128("437.2074074074074074074074074074074") }
    { _id: 'Uruguay',avgMoney: Decimal128("434.1564792176039119804400977995110") }
    ......
  2. $first returns the first document

    --List of order items for the first order of each group according to country grouping
    db.orders.aggregate({$group:{
        _id:"$country",
        firstOrderLines:{$first:"$orderLines"}
    }})
    --result
    { _id: 'Malta',firstOrderLines: [ 
      { product: 'Sleek Frozen Table',sku: '368',qty: 30,price: Decimal128("24.00"),cost:Decimal128("19.44") },
      { product: 'Intelligent Metal',sku: '179',qty: 62,price: Decimal128("91.00"),cost: Decimal128("90.09") },
      { product: 'Intelligent Granite',sku: '9',qty: 31,price: Decimal128("68.00"),cost: Decimal128("61.88") },
      { product: 'Licensed Cotton',sku: '6846',qty: 9,price: Decimal128("16.00"),cost: Decimal128("15.68") } 
    ] }
    { _id: 'Papua New Guinea',firstOrderLines: [ 
      { product: 'Fantastic Wooden',sku: '4333',qty: 32,price: Decimal128("58.00"),cost: Decimal128("57.42") }
      ...
    ] }
    ......
    
    --According to the country grouping, the first item information in the order item list of the first order of each group
    db.orders.aggregate({$group:{
        _id:"$country",
        firstOrder:{$first:{$first:"$orderLines"}}
    }})
    ---result
    { _id: 'Malta',firstOrder:  
     { product: 'Sleek Frozen Table',sku: '368',qty: 30,price: Decimal128("24.00"),cost:Decimal128("19.44") }
    }
    { _id: 'Papua New Guinea',firstOrder:
      { product: 'Fantastic Wooden',sku: '4333',qty: 32,price: Decimal128("58.00"),cost: Decimal128("57.42") }
    }
    ......
  3. $last returns the last document

    --According to the grouping of each country, the price of each last order orderDate
    db.orders.aggregate([{$group:{
        _id:"$country",
        lastOrderDate:{$last:"$orderDate"}
    }}])
    --result
    { _id: 'Micronesia', lastOrderDate: 2019-01-15T07:23:18.002Z }
    { _id: 'Malaysia', lastOrderDate: 2019-05-15T20:16:56.644Z }
    { _id: 'San Marino', lastOrderDate: 2019-09-29T06:10:07.292Z }
  4. $max and $min: maximum and minimum

    --Find out the first order time and the last order time of each group according to the grouping of month and year
    db.orders.aggregate({$group:{
        _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
        maxDate:{$max:"$orderDate"},
        minDate:{$min:"$orderDate"}
    }})
    --result
    { _id: { year: 2019, month: 1 }, maxDate: 2019-01-31T23:53:57.308Z, minDate: 2019-01-01T00:03:59.661Z }
    { _id: { year: 2019, month: 4 }, maxDate: 2019-04-30T23:57:03.352Z, minDate: 2019-04-01T00:02:12.224Z }
    { _id: { year: 2019, month: 3 }, maxDate: 2019-03-31T23:55:10.312Z, minDate: 2019-03-01T00:13:53.761Z }
    { _id: { year: 2019, month: 7 }, maxDate: 2019-07-31T23:55:51.718Z, minDate: 2019-07-01T00:00:07.540Z }
  5. $push adds the specified value to an array, which can be pushed to an existing array. If it does not exist, such an array will be created

    --According to the city, year and month, the order time of each group will be divided into groups push To a new orderDates In the array
    db.orders.aggregate({$group:{
        _id:{city:"$city",year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
        orderDates:{$push:"$orderDate"},
    }})
    --result
    { _id: { city: 'Kennedifurt', year: 2019, month: 9 }, orderDates: [ 2019-09-30T10:25:19.763Z ] }
    { _id: { city: 'South Jewelstad', year: 2019, month: 1 }, orderDates: [ 2019-01-06T19:59:03.380Z ] }
    { _id: { city: 'Germanmouth', year: 2019, month: 9 }, orderDates: [ 2019-09-25T07:45:54.260Z ] }
    { _id: { city: 'Fayebury', year: 2019, month: 8 }, orderDates: [ 2019-08-12T11:08:37.815Z ] }
    { _id: { city: 'New Lailaport', year: 2019, month: 1 }, orderDates: [ 2019-01-19T12:28:56.978Z ] }
    { _id: { city: 'Port Bennyside', year: 2019, month: 2 }, orderDates: [ 2019-02-25T01:18:21.657Z ] }
    { _id: { city: 'Abernathymouth', year: 2019, month: 6 }, orderDates: 
       [ 2019-06-03T18:03:21.149Z,
         2019-06-13T23:35:32.994Z,
         2019-06-18T11:32:22.229Z ] 
    }
  6. $addToSet adds the specified value to a collection. The collection is unordered and will be de duplicated

    --Countries that have placed orders every month are added to the group by month countrySet Middle go
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            countrySet:{$addToSet:"$country"}
        }
    })
    --result  
    {
        "_id": {
          "year": 2019,
          "month": 1
        },
        "countrySet": ["French Guiana", "Germany", "Poland", "Comoros", "Portugal", "Fiji", "France", "Benin", "Greece", "Belarus", "Vietnam", "Ireland", "Vanuatu", "Netherlands Antilles", "Iceland", "Palestinian Territory", "Malawi", "Brazil", "Libyan Arab Jamahiriya", "Kuwait", "Liechtenstein", "Suriname", "Uganda", "New Caledonia", "Bolivia", "Nicaragua", "Burundi", "Uzbekistan", "Jamaica", "South Georgia and the South Sandwich Islands", "Tajikistan", "Mexico", "Singapore", "Sri Lanka", "Antarctica (the territory South of 60 deg S)", "Myanmar", "Tonga", "Slovenia", "Latvia", "Ukraine", "Oman", "Saint Helena", "Bosnia and Herzegovina", "Hungary", "Aruba", "Jordan", "Solomon Islands", "Mozambique", "Svalbard & Jan Mayen Islands", "Taiwan", "Cyprus", "Thailand", "Equatorial Guinea", "Belize", "Niger", "Israel", "Hong Kong", "Senegal", "Costa Rica", "Sierra Leone", "Kiribati", "Lesotho", "Nepal", "Serbia", "Barbados", "Spain", "Czech Republic", "Saint Martin", "Saint Pierre and Miquelon", "Togo", "Somalia", "Northern Mariana Islands", "Maldives", "British Indian Ocean Territory (Chagos Archipelago)", "Montenegro", "Cote d'Ivoire", "United Arab Emirates", "Guernsey", "Bulgaria", "Netherlands", "Greenland", "Niue", "Colombia", "Egypt", "Madagascar", "Brunei Darussalam", "Iraq", "Mauritius", "French Polynesia", "Jersey", "Canada", "Grenada", "Honduras", "New Zealand", "Cocos (Keeling) Islands", "Mayotte", "Virgin Islands, British", "Finland", "Macedonia", "Cook Islands", "Micronesia", "Christmas Island", "Turks and Caicos Islands", "Falkland Islands (Malvinas)", "El Salvador", "Estonia", "Eritrea", "Afghanistan", "San Marino", "Malaysia", "Cambodia", "Anguilla", "Philippines", "Zambia", "Republic of Korea", "Mauritania", "Yemen", "South Africa", "Gambia", "Namibia", "Peru", "Samoa", "Qatar", "Guinea", "Monaco", "Mongolia", "Cayman Islands", "Bouvet Island (Bouvetoya)", "Romania", "Sweden", "Guam", "Guyana", "Japan", "Bangladesh", "Djibouti", "Reunion", "Central African Republic", "Martinique", "Sudan", "Norway", "Guadeloupe", "Malta", "Papua New Guinea", "Macao", "Tunisia", "Iran", "Ghana", "Trinidad and Tobago", "Syrian Arab Republic", "French Southern Territories", "Russian Federation", "Botswana", "Pakistan", "Luxembourg", "Ethiopia", "Austria", "Rwanda", "Holy See (Vatican City State)", "American Samoa", "Tanzania", "Morocco", "Lao People's Democratic Republic", "Faroe Islands", "Bahrain", "China", "Indonesia", "Ecuador", "Tuvalu", "Panama", "Algeria", "Gibraltar", "Nigeria", "Kyrgyz Republic", "Chile", "Cape Verde", "Palau", "Armenia", "Dominican Republic", "Bhutan", "Liberia", "India", "Mali", "Switzerland", "Isle of Man", "Argentina", "Virgin Islands, U.S.", "Swaziland", "Timor-Leste", "Azerbaijan", "Bahamas", "Guatemala", "Saint Lucia", "Sao Tome and Principe", "United States Minor Outlying Islands", "Australia", "Italy", "Paraguay", "Tokelau", "Gabon", "Wallis and Futuna", "Cameroon", "Norfolk Island", "Guinea-Bissau", "Chad", "Zimbabwe", "Nauru", "Pitcairn Islands", "Georgia", "Kenya", "Bermuda", "Kazakhstan", "Democratic People's Republic of Korea", "Puerto Rico", "Croatia", "Antigua and Barbuda", "Seychelles", "Marshall Islands", "Burkina Faso", "Denmark", "United Kingdom", "Dominica", "Albania", "Angola", "Slovakia (Slovak Republic)", "Western Sahara", "Belgium", "Saudi Arabia", "Turkey", "Congo", "Cuba", "Uruguay", "Montserrat", "United States of America", "Lebanon", "Saint Vincent and the Grenadines", "Saint Kitts and Nevis", "Saint Barthelemy", "Haiti", "Moldova", "Heard Island and McDonald Islands", "Lithuania", "Turkmenistan", "Venezuela", "Andorra"]
      },
      {
        "_id": {
          "year": 2019,
          "month": 9
        },
        "countrySet": ["Germany", "Poland", "French Guiana", "Fiji", "France", "Comoros", "Portugal", "Benin", "Greece", "Belarus", "Ireland", "Vietnam", "Brazil", "Malawi", "Vanuatu", "Netherlands Antilles", "Palestinian Territory", "Iceland", "Kuwait", "Libyan Arab Jamahiriya", "Liechtenstein", "New Caledonia", "Suriname", "Uganda", "Bolivia", "Uzbekistan", "Burundi", "Nicaragua", "Tajikistan", "Jamaica", "South Georgia and the South Sandwich Islands", "Sri Lanka", "Mexico", "Singapore", "Antarctica (the territory South of 60 deg S)", "Tonga", "Myanmar", "Slovenia", "Latvia", "Oman", "Saint Helena", "Ukraine", "Bosnia and Herzegovina", "Aruba", "Jordan", "Hungary", "Mozambique", "Solomon Islands", "Svalbard & Jan Mayen Islands", "Thailand", "Taiwan", "Cyprus", "Equatorial Guinea", "Belize", "Niger", "Senegal", "Hong Kong", "Israel", "Kiribati", "Costa Rica", "Sierra Leone", "Lesotho", "Saint Martin", "Spain", "Barbados", "Nepal", "Togo", "Maldives", "Czech Republic", "Somalia", "Saint Pierre and Miquelon", "Serbia", "Northern Mariana Islands", "Montenegro", "British Indian Ocean Territory (Chagos Archipelago)", "Cote d'Ivoire", "United Arab Emirates", "Guernsey", "Niue", "Bulgaria", "Netherlands", "Egypt", "Colombia", "Greenland", "Brunei Darussalam", "Madagascar", "Mauritius", "Iraq", "Canada", "French Polynesia", "Jersey", "Grenada", "Cocos (Keeling) Islands", "New Zealand", "Honduras", "Virgin Islands, British", "Mayotte", "Cook Islands", "Finland", "Macedonia", "Micronesia", "Turks and Caicos Islands", "Christmas Island", "Estonia", "Falkland Islands (Malvinas)", "El Salvador", "Eritrea", "Malaysia", "San Marino", "Afghanistan", "Anguilla", "Cambodia", "Zambia", "Republic of Korea", "Mauritania", "Philippines", "South Africa", "Gambia", "Yemen", "Qatar", "Peru", "Namibia", "Guinea", "Samoa", "Cayman Islands", "Monaco", "Mongolia", "Bouvet Island (Bouvetoya)", "Romania", "Sweden", "Guam", "Guyana", "Djibouti", "Japan", "Bangladesh", "Reunion", "Central African Republic", "Sudan", "Norway", "Martinique", "Guadeloupe", "Papua New Guinea", "Malta", "Tunisia", "Macao", "Iran", "Ghana", "Syrian Arab Republic", "Trinidad and Tobago", "French Southern Territories", "Botswana", "Luxembourg", "Russian Federation", "Pakistan", "Ethiopia", "Holy See (Vatican City State)", "Panama", "Austria", "Rwanda", "American Samoa", "Faroe Islands", "Tanzania", "Morocco", "Lao People's Democratic Republic", "Ecuador", "China", "Indonesia", "Bahrain", "Algeria", "Tuvalu", "Gibraltar", "Nigeria", "Kyrgyz Republic", "Chile", "Palau", "Cape Verde", "Bhutan", "Dominican Republic", "Armenia", "Mali", "Isle of Man", "Liberia", "India", "Switzerland", "Argentina", "Virgin Islands, U.S.", "Timor-Leste", "Swaziland", "Azerbaijan", "United States Minor Outlying Islands", "Saint Lucia", "Bahamas", "Guatemala", "Australia", "Sao Tome and Principe", "Tokelau", "Paraguay", "Italy", "Wallis and Futuna", "Gabon", "Cameroon", "Guinea-Bissau", "Chad", "Norfolk Island", "Zimbabwe", "Nauru", "Georgia", "Kenya", "Pitcairn Islands", "Bermuda", "Kazakhstan", "Democratic People's Republic of Korea", "Croatia", "Puerto Rico", "Antigua and Barbuda", "Seychelles", "Marshall Islands", "Burkina Faso", "Dominica", "Denmark", "Albania", "United Kingdom", "Angola", "Slovakia (Slovak Republic)", "Western Sahara", "Belgium", "Turkey", "Congo", "Saudi Arabia", "Uruguay", "Cuba", "United States of America", "Montserrat", "Lebanon", "Saint Kitts and Nevis", "Saint Vincent and the Grenadines", "Saint Barthelemy", "Haiti", "Moldova", "Lithuania", "Heard Island and McDonald Islands", "Turkmenistan", "Venezuela", "Andorra"]
      }
    .......
  7. $sum calculate sum

    --Get the total income of each group by month grouping sumTotal
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            sumTotal:{$sum:"$total"}
        }
    })
    --result
    { _id: { year: 2019, month: 2 }, sumTotal: Decimal128("4072808") }
    { _id: { year: 2019, month: 10 }, sumTotal: Decimal128("4356471") }
    { _id: { year: 2019, month: 5 }, sumTotal: Decimal128("4460433") }
    ......
  8. $stdDevPop returns the overall standard deviation of the input value

    --Group by month,Calculation of overall standard deviation
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            popTotal:{$stdDevPop:"$total"}
        }
    })
    --result
    { _id: { year: 2019, month: 2 }, popTotal: 189.3064965965138 }
    { _id: { year: 2019, month: 10 }, popTotal: 187.19676293125292 }
    { _id: { year: 2019, month: 5 }, popTotal: 189.54277980510432 }
    { _id: { year: 2019, month: 8 }, popTotal: 189.52305549485735 }
    { _id: { year: 2019, month: 6 }, popTotal: 189.99641948294692 }
    { _id: { year: 2019, month: 1 }, popTotal: 188.89723701416594 }
    { _id: { year: 2019, month: 4 }, popTotal: 189.33635941008336 }
    { _id: { year: 2019, month: 3 }, popTotal: 190.39465578257668 }
    { _id: { year: 2019, month: 7 }, popTotal: 189.01641050584374 }
    { _id: { year: 2019, month: 9 }, popTotal: 188.10379143822877 }
  9. $stdDevSamp returns the sample standard deviation of the input value

    --Group by month,Calculation of sample standard deviation
    db.orders.aggregate({
        $group:{
            _id:{year:{$year:"$orderDate"},month:{$month:"$orderDate"}},
            sampTotal:{$stdDevSamp:"$total"}
        }
    })
    --result
    { _id: { year: 2019, month: 2 }, sampTotal: 189.31678247750685 }
    { _id: { year: 2019, month: 9 }, sampTotal: 188.1134099175866 }
    { _id: { year: 2019, month: 7 }, sampTotal: 189.02564049879336 }
    { _id: { year: 2019, month: 3 }, sampTotal: 190.40398862519802 }
    { _id: { year: 2019, month: 5 }, sampTotal: 189.55212494401323 }
    { _id: { year: 2019, month: 4 }, sampTotal: 189.34574899869335 }
    { _id: { year: 2019, month: 1 }, sampTotal: 188.90639411415503 }
    { _id: { year: 2019, month: 8 }, sampTotal: 189.53235199281477 }
    { _id: { year: 2019, month: 6 }, sampTotal: 190.00600146946147 }
    { _id: { year: 2019, month: 10 }, sampTotal: 187.20619136123352 }

$match

Accept a document with specified query criteria. The query syntax is the same as the read operation query syntax.

Basic syntax {$match: {< query >}}

In practical application, put $match in front of the pipe as much as possible. This has two benefits:

  1. You can quickly filter out unwanted documents to reduce the workload of the pipeline
  2. If $match is executed before re projection and grouping, the query can use the index.
--be similar to in query
db.orders.aggregate({
    $match:{
        country:{$in:["Romania", "Sweden", "Guam", "Guyana"]}
    }
})
--Results: orders from these countries were found
--Range query
db.orders.aggregate({
    $match:{
        orderDate:{$gte:ISODate("2019-02-01"),$lt:ISODate("2019-02-04")}
    }
})
--result:Find out 2019-02-01 By 2019-02-03 All orders for these three days

$expr use aggregate expression

--Query data for March and beyond - Group by month - sum Number of orders
db.orders.aggregate([
    {$match:{$expr:{$gte:[{$month:"$orderDate"},3]}}},
    {$group:{_id:{month:{$month:"$orderDate"}},count:{$sum:1}}}
 ]);
--result
{ _id: { month: 7 }, count: 10240 }
{ _id: { month: 5 }, count: 10142 }
{ _id: { month: 6 }, count: 9915 }
{ _id: { month: 4 }, count: 10083 }
{ _id: { month: 10 }, count: 9928 }
{ _id: { month: 9 }, count: 9779 }
{ _id: { month: 3 }, count: 10201 }
{ _id: { month: 8 }, count: 10194 }

$mod uses modulo operator

--query total The property is followed by an order ending in 00
db.orders.aggregate([
    {$match:{total:{$mod:[100,0]}}}
 ]);
--result
{ _id: ObjectId("5dbe7a575368f69de2b4d4db"),
  street: '5929 Elroy Points',
  city: 'Retaberg',
  state: 'Utah',
  country: 'Cote d\'Ivoire',
  zip: '73722-0034',
  phone: '113.509.1520',
  name: 'Sanford Runte',
  userId: 7843,
  orderDate: 2019-02-21T20:26:32.458Z,
  status: 'completed',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Steel Shoes',
       sku: '1374',
       qty: 82,
       price: Decimal128("15.00"),
       cost: Decimal128("13.35") },
     { product: 'Sleek Frozen Salad',
       sku: '2698',
       qty: 79,
       price: Decimal128("41.00"),
       cost: Decimal128("33.21") },
     { product: 'Intelligent Granite Mouse',
       sku: '17',
       qty: 55,
       price: Decimal128("54.00"),
       cost: Decimal128("50.76") },
     { product: 'Handcrafted Wooden Chicken',
       sku: '2079',
       qty: 4,
       price: Decimal128("17.00"),
       cost: Decimal128("17") } ],
  total: Decimal128("500") }
{ _id: ObjectId("5dbe7a575368f69de2b4d50c"),
  street: '6159 Vandervort Camp',
  city: 'South Bobby',
  state: 'Montana',
  country: 'Guernsey',
  zip: '55141',
  phone: '173.672.8440 x661',
  name: 'Jovan Rice',
  userId: 3526,
  orderDate: 2019-09-14T21:05:45.049Z,
  status: 'shipping',
  shippingFee: Decimal128("9.00"),
  orderLines: 
   [ { product: 'Small Metal Sausages',
       sku: '8130',
       qty: 11,
       price: Decimal128("80.00"),
       cost: Decimal128("67.2") },
     { product: 'Intelligent Rubber Chicken',
       sku: '3775',
       qty: 61,
       price: Decimal128("10.00"),
       cost: Decimal128("8") },
     { product: 'Generic Rubber Table',
       sku: '7102',
       qty: 36,
       price: Decimal128("10.00"),
       cost: Decimal128("8.5") } ],
  total: Decimal128("100") }
......

$regex uses regular expressions to match

--Order quantity of mobile phone numbers starting with 184
db.orders.aggregate([
    {$match:{ phone: { $regex: /^184/ }}},
    {$count:"counts"}
 ]);
--result
{"counts": 55}

$unwind

Split the array into separate documents

format

{
  $unwind:
  {
      path: <field path>,
      includeArrayIndex: <string>,  
      preserveNullAndEmptyArrays: <boolean> 
  } 
}

includeArrayIndex: optional. The name of a new field is used to store the array index of the element. The name cannot start with $.

preserveNullAndEmptyArrays: optional. The default value is false. If it is true, if the path has no corresponding field or the corresponding array size is 0, then $unwind will output the document. The default value is false and will not output.

--Filter a piece of data and split the array
db.orders.aggregate([
    {$match:{_id:ObjectId("5dbe7aa650fc769de3e1b551")}},
    {$unwind:"$orderLines"},
 ]);
--result
{ _id: ObjectId("5dbe7aa650fc769de3e1b551"),
  street: '3340 Marielle Manors',
  city: 'New Maymie',
  state: 'Connecticut',
  country: 'Malawi',
  zip: '22434-3104',
  phone: '184.544.4826 x4858',
  name: 'Annette Langworth',
  userId: 9830,
  orderDate: 2019-01-23T11:56:14.972Z,
  status: 'shipping',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   { product: 'Sleek Granite Gloves',
     sku: '6176',
     qty: 31,
     price: Decimal128("74.00"),
     cost: Decimal128("71.04") },
  total: Decimal128("313") }
{ _id: ObjectId("5dbe7aa650fc769de3e1b551"),
  street: '3340 Marielle Manors',
  city: 'New Maymie',
  state: 'Connecticut',
  country: 'Malawi',
  zip: '22434-3104',
  phone: '184.544.4826 x4858',
  name: 'Annette Langworth',
  userId: 9830,
  orderDate: 2019-01-23T11:56:14.972Z,
  status: 'shipping',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   { product: 'Licensed Soft Cheese',
     sku: '2702',
     qty: 70,
     price: Decimal128("55.00"),
     cost: Decimal128("53.9") },
  total: Decimal128("313") }
......

$project

Specify the desired and unwanted fields from the document

format

{ $project: { <specification(s)> } }

specifications There are the following forms:

<field>: <1 or true> Include this field
<field>: <0 or false> This field is not included

_id: <0 or false>        Specify_id field
--If one or more attributes are 1, only one or more attributes are displayed + _id
db.orders.aggregate({
    $project:{name:1}
})
--result
{ _id: ObjectId("5dbe7a545368f69de2b4d36e"), name: 'Destinee Schneider' }
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"), name: 'Ashlynn Sipes' }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1cc"), name: 'Genoveva Bauch' }
{ _id: ObjectId("5dbe7a542411dc9de6429190"), name: 'Kris Hansen' }

--If one or several properties are 0, all properties except this one or several properties will be displayed
db.orders.aggregate({
    $project:{orderLines:0}
})
--result
{
    "_id": {"$oid": "5dbe7a545368f69de2b4d36e"},
    "city": "Champlinberg",
    "country": "Malaysia",
    "name": "Destinee Schneider",
    "orderDate": {"$date": "2019-03-26T03:20:08.805Z"},
    "phone": "425.956.7743 x4621",
    "shippingFee": {"$numberDecimal": 8.00},
    "state": "Texas",
    "status": "created",
    "street": "493 Hilll Curve",
    "total": {"$numberDecimal": 407},
    "userId": 3573,
    "zip": "24344-1715"
},
{
    "_id": {"$oid": "5dbe7a545368f69de2b4d36f"},
    "city": "Linwoodburgh",
    "country": "United States of America",
    "name": "Ashlynn Sipes",
    "orderDate": {"$date": "2019-07-18T07:21:53.530Z"},
    "phone": "508.326.5494 x1218",
    "shippingFee": {"$numberDecimal": 7.00},
    "state": "Indiana",
    "status": "shipping",
    "street": "39476 Lacey Harbor",
    "total": {"$numberDecimal": 439},
    "userId": 2500,
    "zip": "84551"
}
......

--Show only nested properties
db.orders.aggregate({
    $project:{"orderLines.price":1}
})
perhaps
db.orders.aggregate({
    $project:{orderLines:{price:1}}
})
--result
{ _id: ObjectId("5dbe7a542411dc9de6429193"),
  orderLines: 
   [ { price: Decimal128("75.00") },
     { price: Decimal128("64.00") },
     { price: Decimal128("34.00") },
     { price: Decimal128("98.00") },
     { price: Decimal128("88.00") },
     { price: Decimal128("20.00") },
     { price: Decimal128("59.00") },
     { price: Decimal128("20.00") },
     { price: Decimal128("90.00") },
     { price: Decimal128("45.00") },
     { price: Decimal128("42.00") },
     { price: Decimal128("28.00") } ] }
{ _id: ObjectId("5dbe7a5450fc769de3e19d20"),
  orderLines: 
   [ { price: Decimal128("51.00") },
     { price: Decimal128("10.00") },
     { price: Decimal128("63.00") },
     { price: Decimal128("12.00") },
     { price: Decimal128("37.00") },
     { price: Decimal128("43.00") },
     { price: Decimal128("39.00") },
     { price: Decimal128("68.00") },
     { price: Decimal128("21.00") } ] }
......

The use of $cond if then else is equivalent to case when then else in SQL

$$REMOVE removes this attribute when this condition is met

--Not a July document, remove this attribute
db.orders.aggregate({
    $project:{
        name:1,
        orderDate:{
            $cond: {
                   if: { $ne: [ {"$month":"$orderDate"}, 7 ] },
                   then: "$$REMOVE",
                   else: "$orderDate"
            }
        }
    }
})
--result
{ _id: ObjectId("5dbe7a545368f69de2b4d36e"), name: 'Destinee Schneider' }
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"), name: 'Ashlynn Sipes', orderDate: 2019-07-18T07:21:53.530Z }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1cc"), name: 'Genoveva Bauch' }
{ _id: ObjectId("5dbe7a542411dc9de6429190"), name: 'Kris Hansen' }

Mapping to a property contains multiple properties

--use substr Intercept the first letter and use strLenCP take name Length of
db.orders.aggregate({
    $project:{
        _id: 0,
        nm:{
            name:"$name",
            firstLetter:{$substr:["$name",0,1]},
            nameLenth:{$strLenCP:"$name"}
        }
    }
})
--result
{ nm: { name: 'Destinee Schneider', firstLetter: 'D', nameLenth: 18 } }
{ nm: { name: 'Ashlynn Sipes', firstLetter: 'A', nameLenth: 13 } }
{ nm: { name: 'Genoveva Bauch', firstLetter: 'G', nameLenth: 14 } }
{ nm: { name: 'Kris Hansen', firstLetter: 'K', nameLenth: 11 } }
{ nm: { name: 'Dudley Kertzmann', firstLetter: 'D', nameLenth: 16 } }
......

Map the values of multiple attributes to an array

db.orders.aggregate({
    $project:{
        _id: 0,
        msg:[
            "$name","$orderDate","$orderLines.price"
        ]
    }
})
--result
{msg: 
   [ 
     'Gudrun Stamm',
     2019-09-10T01:00:00.679Z,
     [ 
       Decimal128("17.00"),
       Decimal128("91.00"),
       Decimal128("51.00"),
       Decimal128("10.00"),
       Decimal128("18.00"),
       Decimal128("46.00"),
       Decimal128("69.00"),
       Decimal128("18.00"),
       Decimal128("89.00"),
       Decimal128("99.00") 
    ] 
  ] 
}
{ msg: 
   [ 
     'Jalon Erdman',
     2019-03-06T08:30:55.042Z,
     [ 
       Decimal128("37.00"),
       Decimal128("91.00"),
       Decimal128("88.00"),
       Decimal128("20.00"),
       Decimal128("75.00"),
       Decimal128("46.00") 
    ] 
  ] 
}
{ msg: 
   [ 
     'Mossie Ankunding',
     2019-05-25T09:40:13.662Z,
     [ 
       Decimal128("14.00"),
       Decimal128("49.00"),
       Decimal128("38.00"),
       Decimal128("55.00"),
       Decimal128("20.00") 
     ] 
   ] 
}
{ msg: 
   [ 
     'Jorge Toy',
     2019-09-28T23:07:35.137Z,
     [ 
       Decimal128("71.00"),
       Decimal128("62.00"),
       Decimal128("59.00"),
       Decimal128("43.00"),
       Decimal128("55.00"),
       Decimal128("65.00"),
       Decimal128("57.00") 
     ] 
   ] 
}
......

$limit

Limit the number of pieces and obtain the first n pieces of data

db.orders.aggregate({
    $limit:2
})
--result
{ _id: ObjectId("5dbe7a545368f69de2b4d36e"),
  street: '493 Hilll Curve',
  city: 'Champlinberg',
  state: 'Texas',
  country: 'Malaysia',
  zip: '24344-1715',
  phone: '425.956.7743 x4621',
  name: 'Destinee Schneider',
  userId: 3573,
  orderDate: 2019-03-26T03:20:08.805Z,
  status: 'created',
  shippingFee: Decimal128("8.00"),
  orderLines: 
   [ { product: 'Refined Fresh Tuna',
       sku: '2057',
       qty: 25,
       price: Decimal128("56.00"),
       cost: Decimal128("46.48") },
     { product: 'Intelligent Wooden Towels',
       sku: '5674',
       qty: 72,
       price: Decimal128("84.00"),
       cost: Decimal128("68.88") },
     { product: 'Refined Steel Bacon',
       sku: '5009',
       qty: 8,
       price: Decimal128("53.00"),
       cost: Decimal128("50.35") } ],
  total: Decimal128("407") }
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"),
  street: '39476 Lacey Harbor',
  city: 'Linwoodburgh',
  state: 'Indiana',
  country: 'United States of America',
  zip: '84551',
  phone: '508.326.5494 x1218',
  name: 'Ashlynn Sipes',
  userId: 2500,
  orderDate: 2019-07-18T07:21:53.530Z,
  status: 'shipping',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Soft Soap',
       sku: '6274',
       qty: 71,
       price: Decimal128("91.00"),
       cost: Decimal128("89.18") },
     { product: 'Intelligent Steel Chair',
       sku: '8278',
       qty: 13,
       price: Decimal128("67.00"),
       cost: Decimal128("62.31") },
     { product: 'Small Rubber Shoes',
       sku: '3534',
       qty: 60,
       price: Decimal128("76.00"),
       cost: Decimal128("71.44") } ],
  total: Decimal128("439") }
......

$skip

Skip the first n rows of data query

--Query articles 2 and 3
db.orders.aggregate([{
        $skip:1
    },{
        $limit:2
}])
--result
{ _id: ObjectId("5dbe7a545368f69de2b4d36f"),
  street: '39476 Lacey Harbor',
  city: 'Linwoodburgh',
  state: 'Indiana',
  country: 'United States of America',
  zip: '84551',
  phone: '508.326.5494 x1218',
  name: 'Ashlynn Sipes',
  userId: 2500,
  orderDate: 2019-07-18T07:21:53.530Z,
  status: 'shipping',
  shippingFee: Decimal128("7.00"),
  orderLines: 
   [ { product: 'Fantastic Soft Soap',
       sku: '6274',
       qty: 71,
       price: Decimal128("91.00"),
       cost: Decimal128("89.18") },
     { product: 'Incredible Concrete Chips',
       sku: '3756',
       qty: 6,
       price: Decimal128("18.00"),
       cost: Decimal128("15.12") },
     { product: 'Intelligent Steel Chair',
       sku: '8278',
       qty: 13,
       price: Decimal128("67.00"),
       cost: Decimal128("62.31") },
     { product: 'Small Rubber Shoes',
       sku: '3534',
       qty: 60,
       price: Decimal128("76.00"),
       cost: Decimal128("71.44") } ],
  total: Decimal128("439") }
{ _id: ObjectId("5dbe7a54cd023b9de4efc1cc"),
  street: '699 Harvey Row',
  city: 'Electamouth',
  state: 'South Dakota',
  country: 'Burundi',
  zip: '61826',
  phone: '(936) 449-4255 x58095',
  name: 'Genoveva Bauch',
  userId: 8302,
  orderDate: 2019-03-15T13:53:48.925Z,
  status: 'shipping',
  shippingFee: Decimal128("5.00"),
  orderLines: 
   [ { product: 'Intelligent Soft Salad',
       sku: '3711',
       qty: 85,
       price: Decimal128("86.00"),
       cost: Decimal128("76.54") },
     { product: 'Generic Cotton Ball',
       sku: '2112',
       qty: 44,
       price: Decimal128("21.00"),
       cost: Decimal128("19.32") },
     { product: 'Rustic Plastic Keyboard',
       sku: '6451',
       qty: 19,
       price: Decimal128("81.00"),
       cost: Decimal128("77.76") } ],
  total: Decimal128("341") }

$sort

Sort documents ascending: 1 descending: - 1

--Sort by name
db.orders.aggregate([{
        $sort:{name:1}
    },{
        $project:{_id:0,name:1}
    }
])
--result
{ name: 'Aaliyah Bruen' }
{ name: 'Aaliyah Erdman' }
{ name: 'Aaliyah Fahey' }
{ name: 'Aaliyah Gerhold' }
{ name: 'Aaliyah Graham' }
{ name: 'Aaliyah Greenfelder' }
{ name: 'Aaliyah Konopelski' }
{ name: 'Aaliyah Kovacek' }
{ name: 'Aaliyah Kuphal' }
{ name: 'Aaliyah Lueilwitz' }
{ name: 'Aaliyah Maggio' }
......

$sortByCount

Calculate the count value according to a field grouping, and then sort it in descending order according to this value

db.orders.aggregate({
    $sortByCount:"$status"
})
--result
{ _id: 'created', count: 20087 }
{ _id: 'shipping', count: 20017 }
{ _id: 'completed', count: 20015 }
{ _id: 'cancelled', count: 19978 }
{ _id: 'fulfilled', count: 19903 }

Posted by asfaw on Mon, 29 Nov 2021 17:44:57 -0800