[elasticsearch] Using Elasticsearch from Python

Useful links:

Most useful: http://es.xiaoleilu.com/054_Query_DSL/70_Important_clauses.html

A good blog post: http://www.cnblogs.com/letong/p/4749234.html

Other 1: http://www.jianshu.com/p/14aa8b09c789

Other 2: http://xiaorui.cc/

The links above are somewhat dated. Newer links:

http://elasticsearch-dsl.readthedocs.io/en/latest/

https://elasticsearch.cn/book/elasticsearch_definitive_guide_2.x/_search_lite.html

1. Query everything in an index

# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"

# match_all matches every document; only the first page (10 hits by default) is returned
query = {"query": {"match_all": {}}}
resp = es.search(index=index, body=query)

resp_docs = resp["hits"]["hits"]
total = resp['hits']['total']

print(total)  # total number of matching documents
print(resp_docs[0]['_source']['@timestamp'])  # print one field of the first hit
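The snippet above reads `resp['hits']['total']` as a plain number and indexes `resp_docs[0]` without checking that anything matched. A small defensive sketch follows; the helper names `hit_total` and `first_field` are made up for illustration. It assumes only the response shape used above, plus the fact that Elasticsearch 7.0 and later report the total as an object with a `value` key rather than an integer.

```python
def hit_total(resp):
    """Return the total hit count as an int for either response shape."""
    total = resp["hits"]["total"]
    if isinstance(total, dict):
        # Elasticsearch 7.0+ shape: {"value": 1234, "relation": "eq"}
        return total["value"]
    # Older clusters return a plain integer
    return total


def first_field(resp, field, default=None):
    """Return `field` from the first hit's _source, or `default` if absent."""
    hits = resp["hits"]["hits"]
    if not hits:
        return default
    return hits[0]["_source"].get(field, default)
```

With these, `print(hit_total(resp))` and `print(first_field(resp, '@timestamp'))` would stand in for the two print statements, and an empty index yields `None` instead of an `IndexError`.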

2. Fetch everything in batches with scroll, plus a compound condition

Filter condition: field A is not empty, field B is not empty, and the timestamp falls between 10 days ago and 2 days ago.
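For reference, the "10 days ago to 2 days ago" window can also be computed on the client instead of using Elasticsearch date math such as `now-10d`. The sketch below is only an illustration: `time_window` and `range_filter` are hypothetical helpers, and it assumes the `@timestamp` field is mapped as a date that accepts ISO 8601 strings.

```python
from datetime import datetime, timedelta, timezone


def time_window(now=None, oldest_days=10, newest_days=2):
    """Return (start, end) datetimes: now - oldest_days and now - newest_days."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=oldest_days), now - timedelta(days=newest_days)


def range_filter(field="@timestamp", now=None):
    """Build a range clause equivalent to gte now-10d, lt now-2d."""
    start, end = time_window(now)
    return {"range": {field: {"gte": start.isoformat(), "lt": end.isoformat()}}}
```

Computing the bounds in Python makes the window easy to log and to reproduce later, at the cost of depending on the client's clock rather than the cluster's.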

# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"

# Note: the "filtered" query only exists up to Elasticsearch 2.x; it was removed in 5.0.
query = {
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    # must_not takes a list; repeating the "must_not" key in a
                    # dict literal silently keeps only the last entry
                    "must_not": [
                        {"term": {"A": ""}},
                        {"term": {"B": ""}},
                    ]
                }
            },
            "filter": {
                "range": {"@timestamp": {"gte": "now-10d", "lt": "now-2d"}}
            }
        }
    }
}

# the first request opens the scroll context and returns the first page
resp = es.search(index=index, body=query, scroll="1m", size=100)
total = resp['hits']['total']
resp_docs = resp["hits"]["hits"]
datas = list(resp_docs)  # all hits collected so far

# keep fetching pages until one comes back empty or everything has been collected
while resp_docs and len(datas) < total:
    resp = es.scroll(scroll_id=resp['_scroll_id'], scroll="1m")
    resp_docs = resp["hits"]["hits"]
    datas.extend(resp_docs)

print(len(datas))
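The script above targets old clusters. As a hedged sketch for Elasticsearch 2.0 and later, the same condition can be written as a single `bool` query with a `filter` clause, and the scroll loop can be wrapped in a generator that also releases the scroll context when done. `build_query` and `scroll_all` are hypothetical names; the keyword arguments follow the elasticsearch-py 7.x-and-earlier style used in this post (`body=`), which newer client versions may flag as deprecated.

```python
def build_query(field_a="A", field_b="B"):
    """Neither field equals "", and @timestamp is 10 to 2 days old."""
    return {
        "query": {
            "bool": {
                "must_not": [
                    {"term": {field_a: ""}},
                    {"term": {field_b: ""}},
                ],
                "filter": {
                    "range": {"@timestamp": {"gte": "now-10d", "lt": "now-2d"}}
                },
            }
        }
    }


def scroll_all(es, index, query, page_size=100, keep_alive="1m"):
    """Yield every hit for `query`, fetching one scroll page at a time."""
    resp = es.search(index=index, body=query, scroll=keep_alive, size=page_size)
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            # always continue from the most recent scroll id
            scroll_id = resp["_scroll_id"]
            resp = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # free the server-side scroll context instead of waiting for the timeout
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
```

Usage would look like `datas = list(scroll_all(es, "test", build_query()))`. The client library also ships `elasticsearch.helpers.scan`, which wraps the same scroll pattern and is usually the simpler choice.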

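3. Aggregations

List the distinct values of the @timestamp field and how many documents have each one (a terms aggregation returns only the 10 most frequent values by default)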

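# coding=utf8
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'x.x.x.x', 'port': 9200}])
index = "test"

# terms aggregation: one bucket per distinct @timestamp value
query = {"aggs": {"all_times": {"terms": {"field": "@timestamp"}}}}
resp = es.search(index=index, body=query)

total = resp['hits']['total']
print(total)  # number of documents the aggregation ran over
print(resp["aggregations"])  # buckets are under ["all_times"]["buckets"]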
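If the goal is the number of distinct values rather than the values themselves, a `cardinality` aggregation is the more direct tool. Its result is approximate on large data sets. The sketch below uses hypothetical helper names and the aggregation name `distinct_values` purely for illustration. It also shows one way to flatten the `terms` buckets from the response above.

```python
def distinct_count_query(field):
    """Body for counting distinct values of `field`; size 0 skips the hits."""
    return {
        "size": 0,
        "aggs": {"distinct_values": {"cardinality": {"field": field}}},
    }


def distinct_count(resp, agg_name="distinct_values"):
    """Read the (approximate) distinct count out of a cardinality response."""
    return resp["aggregations"][agg_name]["value"]


def bucket_counts(resp, agg_name="all_times"):
    """Turn terms-aggregation buckets into a {key: doc_count} dict."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}
```

For example, `distinct_count(es.search(index="test", body=distinct_count_query("@timestamp")))` would give the count, while `bucket_counts(resp)` applies to the response of the terms query shown earlier.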
