MongoDB 索引相关知识

时间:2022-01-03 01:59:43

背景:

MongoDB和MySQL一样,都会产生慢查询,所以都需要对其进行优化:包括创建索引、重构查询等。现在就说明在MongoDB下的索引相关知识点,可以通过这篇文章MongoDB 查询优化分析了解MongoDB慢查询的一些特点。

执行计划分析:

因为MongoDB也是BTree索引,所以使用上和MySQL大致一样。通过explain查看一个query的执行计划,来判断如何加索引,explain在3.0版本的时候做了一些改进,现在针对这2个版本进行分析:

3.0之前:

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain()
{
"cursor" : "BtreeCursor b_1_date_1", #游标类型:BasicCursor(全表扫描)、BtreeCursor(BTree索引扫描)、GeoSearchCursor(地理空间索引扫描)。
"isMultiKey" : false,
"n" : , #返回的结果数,count()。
"nscannedObjects" : , #扫描的对象
"nscanned" : , #扫描的索引数
"nscannedObjectsAllPlans" : , #代表所有尝试执行的计划所扫描的对象
"nscannedAllPlans" : , #代表所有尝试执行的计划所扫描的索引
"scanAndOrder" : false, #True:对文档进行排序,false:对索引进行排序
"indexOnly" : false, #对查询的结果进行排序不需要搜索其他文档,查询和返回字段使用同一索引
"nYields" : , #为了让写操作执行而让出读锁的次数
"nChunkSkips" : , #忽略文档数
"millis" : , #执行查询消耗的时间
"indexBounds" : { #索引扫描中使用的最大/小值。
"b" : [
[
"CYHS1301942",
"CYHS1301942"
]
],
"date" : [
[
{
"$minElement" :
},
{
"$maxElement" :
}
]
]
},
"server" : "db-mongo1:27017"
}

3.0之后:在explain()里有三个参数:"queryPlanner", "executionStats", and "allPlansExecution",默认是:queryPlanner。具体的含义见官方文档

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain()
{
"queryPlanner" : {
"plannerVersion" : ,
"namespace" : "cde.newtask", #集合
"indexFilterSet" : false,
"parsedQuery" : {
"b" : {
"$eq" : "CYHS1301942"
}
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN", #索引扫描,COLLSCAN表示全表扫描。
"keyPattern" : {
"b" : ,
"date" :
},
"indexName" : "b_1_date_1", #索引名
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"CYHS1301942\", \"CYHS1301942\"]"
],
"date" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "mongo1",
"port" : ,
"version" : "3.0.4",
"gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
},
"ok" :
}

3.0要是查看更详细的执行计划请看其他2个参数:

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain("allPlansExecution")
{
"queryPlanner" : {
"plannerVersion" : ,
"namespace" : "cde.newtask",
"indexFilterSet" : false,
"parsedQuery" : {
"b" : {
"$eq" : "CYHS1301942"
}
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"b" : ,
"date" :
},
"indexName" : "b_1_date_1",
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"CYHS1301942\", \"CYHS1301942\"]"
],
"date" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : ,
"executionTimeMillis" : ,
"totalKeysExamined" : ,
"totalDocsExamined" : ,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : ,
"executionTimeMillisEstimate" : ,
"works" : ,
"advanced" : ,
"needTime" : ,
"needFetch" : ,
"saveState" : ,
"restoreState" : ,
"isEOF" : ,
"invalidates" : ,
"docsExamined" : ,
"alreadyHasObj" : ,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : ,
"executionTimeMillisEstimate" : ,
"works" : ,
"advanced" : ,
"needTime" : ,
"needFetch" : ,
"saveState" : ,
"restoreState" : ,
"isEOF" : ,
"invalidates" : ,
"keyPattern" : {
"b" : ,
"date" :
},
"indexName" : "b_1_date_1",
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"CYHS1301942\", \"CYHS1301942\"]"
],
"date" : [
"[MinKey, MaxKey]"
]
},
"keysExamined" : ,
"dupsTested" : ,
"dupsDropped" : ,
"seenInvalidated" : ,
"matchTested" :
}
},
"allPlansExecution" : [ ]
},
"serverInfo" : {
"host" : "mongo1",
"port" : ,
"version" : "3.0.4",
"gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
},
"ok" :
}
zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain("executionStats")
{
"queryPlanner" : {
"plannerVersion" : ,
"namespace" : "cde.newtask",
"indexFilterSet" : false,
"parsedQuery" : {
"b" : {
"$eq" : "CYHS1301942"
}
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"b" : ,
"date" :
},
"indexName" : "b_1_date_1",
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"CYHS1301942\", \"CYHS1301942\"]"
],
"date" : [
"[MinKey, MaxKey]"
]
}
}
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : ,
"executionTimeMillis" : ,
"totalKeysExamined" : ,
"totalDocsExamined" : ,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : ,
"executionTimeMillisEstimate" : ,
"works" : ,
"advanced" : ,
"needTime" : ,
"needFetch" : ,
"saveState" : ,
"restoreState" : ,
"isEOF" : ,
"invalidates" : ,
"docsExamined" : ,
"alreadyHasObj" : ,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : ,
"executionTimeMillisEstimate" : ,
"works" : ,
"advanced" : ,
"needTime" : ,
"needFetch" : ,
"saveState" : ,
"restoreState" : ,
"isEOF" : ,
"invalidates" : ,
"keyPattern" : {
"b" : ,
"date" :
},
"indexName" : "b_1_date_1",
"isMultiKey" : false,
"direction" : "forward",
"indexBounds" : {
"b" : [
"[\"CYHS1301942\", \"CYHS1301942\"]"
],
"date" : [
"[MinKey, MaxKey]"
]
},
"keysExamined" : ,
"dupsTested" : ,
"dupsDropped" : ,
"seenInvalidated" : ,
"matchTested" :
}
}
},
"serverInfo" : {
"host" : "mongo1",
"port" : ,
"version" : "3.0.4",
"gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
},
"ok" :
}

上面介绍了如何查看执行计划,那么下面介绍下如何管理索引。

索引管理具体请看[权威指南第5章]

1)查看/显示集合的索引:db.collectionName.getIndexes()或则db.system.indexes.find()

zjy:PRIMARY> db.data.getIndexes()
[
{
"v" : ,
"key" : {
"_id" :
},
"name" : "_id_", #索引名
"ns" : "survey.data" #集合名
},
{
"v" : ,
"unique" : true, #唯一索引
"key" : {
"sid" : ,
"user" :
},
"name" : "sid_1_user_1",
"ns" : "survey.data"
},
{
"v" : ,
"key" : {
"sid" : ,
"cdate" : -
},
"name" : "sid_1_cdate_-1",
"ns" : "survey.data"
},
{
"v" : ,
"key" : {
"sid" : ,
"created" : -
},
"name" : "sid_1_created_-1",
"ns" : "survey.data"
},
{
"v" : ,
"key" : {
"sid" : ,
"user" : ,
"modified" :
},
"name" : "sid_1_user_1_modified_1",
"ns" : "survey.data"
}
]
zjy:PRIMARY> db.system.indexes.find({"ns":"survey.data"})
{ "v" : , "key" : { "_id" : }, "name" : "_id_", "ns" : "survey.data" }
{ "v" : , "unique" : true, "key" : { "sid" : , "user" : }, "name" : "sid_1_user_1", "ns" : "survey.data" }
{ "v" : , "key" : { "sid" : , "cdate" : - }, "name" : "sid_1_cdate_-1", "ns" : "survey.data" }
{ "v" : , "key" : { "sid" : , "created" : - }, "name" : "sid_1_created_-1", "ns" : "survey.data" }
{ "v" : , "key" : { "sid" : , "user" : , "modified" : }, "name" : "sid_1_user_1_modified_1", "ns" : "survey.data" }

2)创建索引:db.collections.ensureIndex({...})

普通索引

zjy:PRIMARY> db.comments.ensureIndex({"name":})  #name字段上创建索引,升序。倒序为-。
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
} zjy:PRIMARY> db.comments.ensureIndex({"account.name":}) #内嵌文档上创建索引。
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
} zjy:PRIMARY> db.comments.ensureIndex({"age":},{"name":"idx_name"}) #指定索引名称
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
} zjy:PRIMARY> db.comments.ensureIndex({"name":,"age":},{"name":"idx_name_age","background":true}) #后台创建复合索引
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
} zjy:PRIMARY> db.comments.ensureIndex({"name":,"age":},{"name":"uk_name_age","background":true,"unique":true}) #后台创建唯一索引
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
}
zjy:PRIMARY> db.comments.ensureIndex({"name":,"age":},{"unique":true,"dropDups":true,"name":"uk_name_age"}) #删除重复数据创建唯一索引,dropDups在3.0里废弃。
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
}

哈希索引hashed

zjy:PRIMARY> db.abc.ensureIndex({"a":"hashed"})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
}
zjy:PRIMARY> db.abc.getIndexes()
[
{
"v" : ,
"key" : {
"_id" :
},
"name" : "_id_",
"ns" : "test.abc"
},
{
"v" : ,
"key" : {
"a" : "hashed"
},
"name" : "a_hashed",
"ns" : "test.abc"
}
]

这里还有2个比较特殊的索引:稀疏索引(sparse)和TTL索引(expireAfterSeconds)

TTL索引是一种特定的数据块,请求赋予时间范围的方式,它指定一个时间点,超过该时间点数据变成无效。

zjy:PRIMARY> db.comments.find()
{ "_id" : ObjectId("55ae6b99313fd7b879b5296c"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:09.651Z") }
{ "_id" : ObjectId("55ae6b9a313fd7b879b5296d"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:10.739Z") }
{ "_id" : ObjectId("55ae6b9b313fd7b879b5296e"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:11.555Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b5296f"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:12.267Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b52970"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:12.899Z") }
zjy:PRIMARY> db.comments.ensureIndex({"ts":},{expireAfterSeconds:}) #创建TTL索引,过期时间60秒,即60秒时间生成的数据会被删除。
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
}
zjy:PRIMARY> db.comments.find()
{ "_id" : ObjectId("55ae6b99313fd7b879b5296c"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:09.651Z") }
{ "_id" : ObjectId("55ae6b9a313fd7b879b5296d"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:10.739Z") }
{ "_id" : ObjectId("55ae6b9b313fd7b879b5296e"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:11.555Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b5296f"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:12.267Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b52970"), "name" : "zhoujy", "age" : , "ts" : ISODate("2015-07-21T15:56:12.899Z") } zjy:PRIMARY> db.comments.getIndexes()
[
{
"v" : ,
"key" : {
"_id" :
},
"name" : "_id_",
"ns" : "test.comments"
},
{
"v" : ,
"key" : {
"ts" :
},
"name" : "ts_1",
"ns" : "test.comments",
"expireAfterSeconds" :
}
] zjy:PRIMARY> db.comments.find() #60秒之后查看,数据已经没有

最后有一类索引是text index 文本索引:更多的信息见 [MongoDB大数据处理权威指南第八章]和这里

测试数据:

db.comments.insert({"name":"abc","mem":"You can create a text index on the field or fields whose value is a string or an array of string elements","ts":new Date()})

db.comments.insert({"name":"def","mem":"When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)","ts":new Date()})

db.comments.insert({"name":"ghi","mem":"This text index catalogs all string data in the subject field and the content field, where the field value is either a string or an array of string elements.","ts":new Date()})

db.comments.insert({"name":"jkl","mem":"To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.","ts":new Date()})

db.comments.insert({"name":"mno","mem":"The following example indexes any string value in the data of every field of every document in collection and names the index TextIndex:","ts":new Date()})

创建:

> db.comments.ensureIndex({"mem":"text"})   #创建text索引
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : ,
"numIndexesAfter" : ,
"ok" :
}

使用:$text 操作符

> db.comments.find({$text:{$search:"specifier"}}).pretty()
{
"_id" : ObjectId("55aee886a782f35b366926ef"),
"name" : "jkl",
"mem" : "To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.",
"ts" : ISODate("2015-07-22T00:49:10.350Z")
}
{
"_id" : ObjectId("55aee886a782f35b366926ed"),
"name" : "def",
"mem" : "When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)",
"ts" : ISODate("2015-07-22T00:49:10.346Z")
} > db.comments.runCommand("text",{search:"specifier"}) #3.0之前可以使用,之后无效。
{
"results" : [
{
"score" : 0.8653846153846153,
"obj" : {
"_id" : ObjectId("55aee886a782f35b366926ed"),
"name" : "def",
"mem" : "When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)",
"ts" : ISODate("2015-07-22T00:49:10.346Z")
}
},
{
"score" : 0.5357142857142857,
"obj" : {
"_id" : ObjectId("55aee886a782f35b366926ef"),
"name" : "jkl",
"mem" : "To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.",
"ts" : ISODate("2015-07-22T00:49:10.350Z")
}
}
],
"stats" : {
"nscanned" : NumberLong(),
"nscannedObjects" : NumberLong(),
"n" : ,
"timeMicros" :
},
"ok" :
}

上面大致介绍了各类索引的介绍和使用,具体的信息和注意事项可以找官方文档里查看,特别是要注意text和ttl索引的使用。

3)删除索引:dropIndex

zjy:PRIMARY> db.abc.getIndexes()    #查看索引
[
{
"v" : ,
"key" : {
"_id" :
},
"name" : "_id_",
"ns" : "test.abc"
},
{
"v" : ,
"key" : { #索引字段
"a" : "hashed"
},
"name" : "a_hashed", #索引名
"ns" : "test.abc"
},
{
"v" : ,
"key" : {
"b" :
},
"name" : "b_1",
"ns" : "test.abc"
},
{
"v" : ,
"key" : {
"c" :
},
"name" : "idx_c",
"ns" : "test.abc"
}
]
zjy:PRIMARY> db.abc.dropIndex({"a" : "hashed"}) #删除索引,指定"key"
{ "nIndexesWas" : , "ok" : }
zjy:PRIMARY> db.abc.dropIndex({"b" : }) #删除索引,指定"key"
{ "nIndexesWas" : , "ok" : }
zjy:PRIMARY> db.abc.dropIndex("idx_c") #删除索引,指定"name"
{ "nIndexesWas" : , "ok" : }
zjy:PRIMARY> db.abc.getIndexes()
[
{
"v" : ,
"key" : {
"_id" :
},
"name" : "_id_",
"ns" : "test.abc"
}
] zjy:PRIMARY> db.abc.dropIndex("*") #删除索引,删除集合的全部索引
{
"nIndexesWas" : ,
"msg" : "non-_id indexes dropped for collection",
"ok" :
}

4)重建索引:索引出现损坏需要重建。reindex

zjy:PRIMARY> db.abc.reIndex()   #执行
{
"nIndexesWas" : ,
"nIndexes" : ,
"indexes" : [
{
"key" : {
"_id" :
},
"name" : "_id_",
"ns" : "test.abc"
}
],
"ok" :
}

5)强制使用指定索引。hint

db.abc.find({"c":,"b":}).hint("b_1")  #hint里面是"索引字段"或则"索引名"

总结:

索引可以加快检索、排序等操作的效率,但是对于增删改的操作却有一定的开销,所以不要一味的加索引,在必要的字段上加合适的索引才是需要的。更多的信息请参考官方文档

MongoDB 索引相关知识的更多相关文章

  1. mongoDB索引相关

    参考链接:MongoDB索引管理-索引的创建.查看.删除 索引 db.集合名.ensureIndex({"key":1}) 使用了ensureIndex在name上建立了索引.”1 ...

  2. MongoDB索引相关文章-摘自网络

    索引类型 虽然MongoDB的索引在存储结构上都是一样的,但是根据不同的应用层需求,还是分成了唯一索引(unique).稀疏索引(sparse).多值索引(multikey)等几种类型. 唯一索引 唯 ...

  3. mysql 索引相关知识

    由where 1 =1 引发的思考 最近工作上被说了 说代码中不能用 where 1=1,当时觉得是应该可以用的,但是找不到什么理据, 而且mysql 语句优化这方面确实很薄弱   感觉自己mysql ...

  4. 【Python五篇慢慢弹(5)】类的继承案例解析,python相关知识延伸

    类的继承案例解析,python相关知识延伸 作者:白宁超 2016年10月10日22:36:57 摘要:继<快速上手学python>一文之后,笔者又将python官方文档认真学习下.官方给 ...

  5. MongoDB的基础知识

    本人只是软件开发的一个菜鸟,在学习MongoDB,总结了一点自己学习的知识,监督自己学习. 如果文章中有不足的地方,还请大神指点迷津,纠正改错,谢谢. 一.MongoDB简介 MongoDB是一个基于 ...

  6. 电路相关知识--读&lt&semi;&lt&semi;继电器是如何成为CPU的&gt&semi;&gt&semi;

    电路相关知识–读<<继电器是如何成为CPU的>> */--> *///--> *///--> 电路相关知识–读<<继电器是如何成为CPU的> ...

  7. HTML入门基础教程相关知识

    HTML入门基础教程 html是什么,什么是html通俗解答: html是hypertext markup language的缩写,即超文本标记语言.html是用于创建可从一个平台移植到另一平台的超文 ...

  8. MongoDB&lpar;索引及C&num;如何操作MongoDB&rpar;&lpar;转载&rpar;

    MongoDB(索引及C如何操作MongoDB) 索引总概况 db.test.ensureIndex({"username":1})//创建索引 db.test.ensureInd ...

  9. Python 数据分析(二 本实验将学习利用 Python 数据聚合与分组运算,时间序列,金融与经济数据应用等相关知识

    Python 数据分析(二) 本实验将学习利用 Python 数据聚合与分组运算,时间序列,金融与经济数据应用等相关知识 第1节 groupby 技术 第2节 数据聚合 第3节 分组级运算和转换 第4 ...

随机推荐

  1. ScriptableObject本地序列化后重启Unity后报The associated script can not be loaded&period;Please fix any compile errors and assign a valid script的坑

    踩坑 做编辑器一些设置序列化存在本地的时候,继承自ScriptableObject的类通过 创建的asset文件. 在重启Unity后查看这个asset发现上面的所有序列化属性丢失,报的错就是 在不存 ...

  2. Chrome 调试动态加载的js

    今天有个同事问到我用chrome调试动态加载js的问题,这个问题之前遇到过,只是时间有点长了,有些忘记.在这里做一下记录: 在要调试的源码的后面加上 //@ sourceURL= debug.js 注 ...

  3. nginx源码编译安装

    安装编译所需的包: [root@xaiofan ~]# yum install -y gcc gcc-c++ autoconf automake 安装nginx使用某些功能需要的包: [root@xa ...

  4. VirtualBox 给虚拟机绑定IP

    在VirtualBox中,有时候打开虚拟机,会出现Waiting for 60 seconds more for network configuration这种情况,从而使得开机变得很慢. 通过以下操 ...

  5. GitHub具体教程

    GitHub具体教程 Table of Contents 1 Git具体教程 1.1 Git简单介绍 1.1.1 Git是何方神圣? 1.1.2 重要的术语 1.1.3 索引 1.2 Git安装 1. ...

  6. linux 第二天

    文件权限和用户组 groupadd 组名 groupadd policeman 查看组 vi /etc/group cat /etc/group 创建用户,并同时指定将该用户分配到哪个组 userad ...

  7. 面向对象(this的问题二)

    <!DOCTYPE HTML><html><head><meta http-equiv="Content-Type" content=&q ...

  8. restful的特点

    1. 资源(Resources) REST的名称”表现层状态转化”中,省略了主语.”表现层”其实指的是”资源”(Resources)的”表现层”.                所谓”资源”,就是网络 ...

  9. Drozer快速使用指南

    1.简介: Drozer是一款用于测试android应用程序漏洞的安全评估工具,能够发现多种类型的安全的漏洞,免费版本的相关资源下载地址: https://www.mwrinfosecurity.co ...

  10. &lbrack;ZZ&rsqb;c&plus;&plus;&&num;160&semi;cout&&num;160&semi;格式化输出浮点数、整数及格式化方法

    C语言里可以用printf(),%f来实现浮点数的格式化输出,用cout呢...?下面的方法是在网上找到的,如果各位有别的办法谢谢留下... iomanip.h是I/O流控制头文件,就像C里面的格式化 ...