MongoDB $avg aggregation on array elements

Time: 2022-03-25 17:43:53

I have the following collection structure:

{
   "_id": {
     "d_timestamp": NumberLong(1429949699),
     "d_isostamp": ISODate("2015-04-25T08:14:59.0Z")
   },
   "XBT-USD-cpx-okc": [
   {
       "buySpread": -1.80081
   }

I run the following aggregation:

$spreadName ='XBT-USD-stp-nex';
$pipe = array(
    array(
        '$match' => array(
            '_id.d_isostamp' => array(
                '$gt' => $start, '$lt' => $end
            )
        )
    ),
    array(
        '$project' => array(
            'sellSpread' =>'$'.$spreadName.'.sellSpread',
        )
    ),
    array(
        '$group' => array(
            '_id' => array(
                'isodate' => array(
                    '$minute' => '$_id.d_isostamp'
                )
            ),
            'rsell_spread' => array(
                '$avg' => '$sellSpread'
            ),
        )
    ),
);

$out = $collection->aggregate($pipe ,$options);

As a result I get the value 0 for rsell_spread, whereas if I use $max instead of $avg in the $group stage, I get an accurate value for rsell_spread, with the following structure:

{
  "_id": {
     "isodate": ISODate("2015-04-25T08:00:58.0Z")
  },
  "rsell_spread": [
     -4.49996
  ]
}

So I have two questions:

1/ Why does the $avg operator not work?

2/ How can I get a result that is not in an array (just a regular number) when I use $max, for example?

1 Solution

#1


  1. The $avg group accumulator operator does work. In your case, however, the projected sellSpread field is an array rather than a number, and inside $group $avg ignores non-numeric values, which gives the "incorrect" result of 0.

  2. The $max group accumulator operator returns the highest value that results from applying an expression to each document in a group of documents. Because that expression yields an array for every document in your example, $max returned the maximum array.
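
The first point can be illustrated outside the database. The following is a plain JavaScript sketch, not MongoDB code: avgNumericOnly is a hypothetical helper that models an averaging accumulator which skips non-numeric inputs. It returns null when there is nothing numeric to average; that is a choice made for this sketch, whereas the server output shown further down reports 0.

```javascript
// Plain-JavaScript model (not MongoDB code) of an averaging accumulator
// that skips non-numeric inputs, as $avg does with arrays inside $group.
function avgNumericOnly(values) {
    const numbers = values.filter((v) => typeof v === "number");
    if (numbers.length === 0) {
        return null; // nothing numeric to average (choice made for this sketch)
    }
    return numbers.reduce((sum, v) => sum + v, 0) / numbers.length;
}

// What $project hands to $group without $unwind: one array per document.
const projected = [[-1.80079], [-1.80083], [-1.80087]];
// The same values once each one-element array is replaced by its element.
const unwound = projected.map((arr) => arr[0]);

console.log(avgNumericOnly(projected)); // null: every input is an array
console.log(avgNumericOnly(unwound));   // approximately -1.80083
```

The values are the same in both cases; only their shape differs, and that is what decides whether the accumulator has anything to average.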

To demonstrate this, add a few sample documents to a test collection in the mongo shell:

db.test.insert([
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949699),
        "d_isostamp" : ISODate("2015-04-25T08:14:59.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80081
        }
    ]
},
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949710),
        "d_isostamp" : ISODate("2015-04-25T08:15:10.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80079
        }
    ]
},
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949720),
        "d_isostamp" : ISODate("2015-04-25T08:15:20.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80083
        }
    ]
},
{
    "_id" : {
        "d_timestamp" : NumberLong(1429949730),
        "d_isostamp" : ISODate("2015-04-25T08:15:30.000Z")
    },
    "XBT-USD-stp-nex" : [ 
        {
            "sellSpread" : -1.80087
        }
    ]
}
])

Now replicate the same operation in the mongo shell:

var spreadName = "XBT-USD-stp-nex",
    start = new Date(2015, 3, 25),
    end = new Date(2015, 3, 26);
db.test.aggregate([
    {
        "$match": {
            "_id.d_isostamp": { "$gte": start, "$lte": end }
        }
    },
    {
        "$project": {
            "sellSpread": "$"+spreadName+".sellSpread"
        }
    },
    // $unwind is deliberately left out here to replicate the pipeline from the question:
    // { "$unwind": "$sellSpread" },
    {
        "$group": {
            "_id": {
                "isodate": { "$minute": "$_id.d_isostamp"}
            },
            "rsell_spread": {
                "$avg": "$sellSpread"
            }
        }
    }
])

Output:

/* 0 */
{
    "result" : [ 
        {
            "_id" : {
                "isodate" : 15
            },
            "rsell_spread" : 0
        }, 
        {
            "_id" : {
                "isodate" : 14
            },
            "rsell_spread" : 0
        }
    ],
    "ok" : 1
}

The solution is to add an $unwind pipeline stage after the $project step. It deconstructs the projected sellSpread array (taken from the XBT-USD-stp-nex field) and outputs one document per element, with the array replaced by that element's value. The $avg group accumulator then receives plain numbers and works as expected.
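
As a rough illustration of what the two stages do, here is a plain JavaScript model of $unwind followed by $group. It is not MongoDB code: unwind, groupByMinute and the minute field are names made up for this sketch, and minute stands in for the result of { $minute: "$_id.d_isostamp" }. The values are the sample documents from this answer.

```javascript
// Plain-JavaScript model (not MongoDB code) of $unwind followed by $group.

// Model of $unwind: emit one copy of the document per array element,
// with the array field replaced by that single element.
function unwind(docs, field) {
    return docs.flatMap((doc) =>
        doc[field].map((element) => ({ ...doc, [field]: element }))
    );
}

// Model of $group keyed on the minute, with $avg and $max accumulators.
function groupByMinute(docs) {
    const groups = new Map();
    for (const { minute, sellSpread } of docs) {
        const g = groups.get(minute) || { sum: 0, count: 0, max: -Infinity };
        g.sum += sellSpread;
        g.count += 1;
        g.max = Math.max(g.max, sellSpread);
        groups.set(minute, g);
    }
    return [...groups].map(([minute, g]) => ({
        minute,
        avg: g.sum / g.count,
        max: g.max,
    }));
}

// Shape of the documents after $project (hypothetical `minute` field added).
const afterProject = [
    { minute: 14, sellSpread: [-1.80081] },
    { minute: 15, sellSpread: [-1.80079] },
    { minute: 15, sellSpread: [-1.80083] },
    { minute: 15, sellSpread: [-1.80087] },
];

const grouped = groupByMinute(unwind(afterProject, "sellSpread"));
console.log(grouped);
```

Once the arrays are flattened, both the average and the maximum come out as plain numbers rather than arrays, which also addresses the second question.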

With the $unwind stage included, the aggregation returns:

/* 0 */
{
    "result" : [ 
        {
            "_id" : {
                "isodate" : 15
            },
            "rsell_spread" : -1.80083
        }, 
        {
            "_id" : {
                "isodate" : 14
            },
            "rsell_spread" : -1.80081
        }
    ],
    "ok" : 1
}

Your final working aggregation in PHP should therefore be:

$spreadName ='XBT-USD-stp-nex';
$pipe = array(
    array(
        '$match' => array(
            '_id.d_isostamp' => array(
                '$gt' => $start, '$lt' => $end
            )
        )
    ),
    array(
        '$project' => array(
            'sellSpread' =>'$'.$spreadName.'.sellSpread',
        )
    ),
    array('$unwind' => '$sellSpread'),
    array(
        '$group' => array(
            '_id' => array(
                'isodate' => array(
                    '$minute' => '$_id.d_isostamp'
                )
            ),
            'rsell_spread' => array(
                '$avg' => '$sellSpread'
            ),
        )
    ),
);

$out = $collection->aggregate($pipe ,$options);
