可以执行全文搜索的原因 Elasticsearch full-text search Kibana RESTful API with JSON over HTTP elasticsearch_action es 模糊查询

时间:2022-08-27 09:50:29

https://www.elastic.co/guide/en/elasticsearch/guide/current/getting-started.html

Elasticsearch is a real-time distributed search and analytics engine. It allows you to explore your data at a speed and at a scale never before possible. It is used for full-text search, structured search, analytics, and all three in combination:

  • Wikipedia uses Elasticsearch to provide full-text search with highlighted search snippets, and search-as-you-type and did-you-mean suggestions.
  • The Guardian uses Elasticsearch to combine visitor logs with social -network data to provide real-time feedback to its editors about the public’s response to new articles.
  • Stack Overflow combines full-text search with geolocation queries and uses more-like-this to find related questions and answers.
  • GitHub uses Elasticsearch to query 130 billion lines of code.

But Elasticsearch is not just for mega-corporations. It has enabled many startups like Datadog and Klout to prototype ideas and to turn them into scalable solutions. Elasticsearch can run on your laptop, or scale out to hundreds of servers and petabytes of data.

No individual part of Elasticsearch is new or revolutionary. Full-text search has been done before, as have analytics systems and distributed databases. The revolution is the combination of these individually useful parts into a single, coherent, real-time application. It has a low barrier to entry for the new user, but can keep pace with you as your skills and needs grow.

If you are picking up this book, it is because you have data, and there is no point in having data unless you plan to do something with it.

Unfortunately, most databases are astonishingly inept at extracting actionable knowledge from your data. Sure, they can filter by timestamp or exact values, but can they perform full-text search, handle synonyms, and score documents by relevance? Can they generate analytics and aggregations from the same data? Most important, can they do this in real time without big batch-processing jobs?

This is what sets Elasticsearch apart: Elasticsearch encourages you to explore and utilize your data, rather than letting it rot in a warehouse because it is too difficult to query.

【Lucene】

Elasticsearch is an open-source search engine built on top of Apache Lucene™, a full-text search-engine library. Lucene is arguably the most advanced, high-performance, and fully featured search engine library in existence today—both open source and proprietary.

But Lucene is just a library. To leverage its power, you need to work in Java and to integrate Lucene directly with your application. Worse, you will likely require a degree in information retrieval to understand how it works. Lucene is very complex.

Elasticsearch is also written in Java and uses Lucene internally for all of its indexing and searching, but it aims to make full-text search easy by hiding the complexities of Lucene behind a simple, coherent, RESTful API.

However, Elasticsearch is much more than just Lucene and much more than “just” full-text search. It can also be described as follows:

  • A distributed real-time document store where every field is indexed and searchable
  • A distributed search engine with real-time analytics
  • Capable of scaling to hundreds of servers and petabytes of structured and unstructured data

And it packages up all this functionality into a standalone server that your application can talk to via a simple RESTful API, using a web client from your favorite programming language, or even from the command line.

It is easy to get started with Elasticsearch. It ships with sensible defaults and hides complicated search theory away from beginners. It just works, right out of the box. With minimal understanding, you can soon become productive.

As your knowledge grows, you can leverage more of Elasticsearch’s advanced features. The entire engine is configurable and flexible. Pick and choose from the advanced features to tailor Elasticsearch to your problem domain.

【work on an abstraction layer to make Lucene easier for Java programmers to add search to their applications】

The Mists of Time

Many years ago, a newly married unemployed developer called Shay Banon followed his wife to London, where she was studying to be a chef. While looking for gainful employment, he started playing with an early version of Lucene, with the intent of building his wife a recipe search engine.

Working directly with Lucene can be tricky, so Shay started work on an abstraction layer to make it easier for Java programmers to add search to their applications. He released this as his first open source project, called Compass.

Later Shay took a job working in a high-performance, distributed environment with in-memory data grids. The need for a high-performance, real-time, distributed search engine was obvious, and he decided to rewrite the Compass libraries as a standalone server called Elasticsearch.

The first public release came out in February 2010. Since then, Elasticsearch has become one of the most popular projects on GitHub with commits from over 300 contributors. A company has formed around Elasticsearch to provide commercial support and to develop new features, but Elasticsearch is, and forever will be, open source and available to all.

Shay’s wife is still waiting for the recipe search…

Once you’ve extracted the archive file, Elasticsearch is ready to run. To start it up in the foreground:

cd elasticsearch-<version>
./bin/elasticsearch

可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

Add -d if you want to run it in the background as a daemon.

可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

If you’re running Elasticsearch on Windows, simply run bin\elasticsearch.bat instead.

Test it out by opening another terminal window and running the following:

curl 'http://localhost:9200/?pretty'

You should see a response like this:

{
"name" : "Tom Foster",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "2.1.0",
"build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
"build_timestamp" : "2015-11-18T22:40:03Z",
"build_snapshot" : false,
"lucene_version" : "5.3.1"
},
"tagline" : "You Know, for Search"
}

This means that you have an Elasticsearch node up and running, and you can start experimenting with it. A node is a running instance of Elasticsearch. A cluster is a group of nodes with the same cluster.name that are working together to share data and to provide failover and scale. (A single node, however, can form a cluster all by itself.) You can change the cluster.name in the elasticsearch.yml configuration file that’s loaded when you start a node.

Elasticsearch is running in the foreground.

Installing Sense

Sense is a Kibana app that provides an interactive console for submitting requests to Elasticsearch directly from your browser. Many of the code examples in the online version of this book include a View in Sense link. When clicked, it opens up a working example of the code in the Sense console. You do not have to install Sense, but it will make this book much more interactive by allowing you to experiment with the code samples on your local Elasticsearch cluster.

To install and run Sense:

To install and run Sense:

  1. Run the following command in the Kibana directory to download and install the Sense app:

    ./bin/kibana plugin --install elastic/sense 

    可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

    可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

    Windows: bin\kibana.bat plugin --install elastic/sense.

  2. Start Kibana.

    ./bin/kibana 

    可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

    可以执行全文搜索的原因 Elasticsearch full-text search  Kibana  RESTful API with JSON over HTTP  elasticsearch_action  es 模糊查询

    Windows: bin\kibana.bat.

  3. Open Sense your web browser by going to http://localhost:5601/app/sense.

https://www.elastic.co/guide/en/elasticsearch/guide/current/_talking_to_elasticsearch.html#_restful_api_with_json_over_http

Java APIedit

If you are using Java, Elasticsearch comes with two built-in clients that you can use in your code:

【数据在集群的哪个节点】

Node client
The node client joins a local cluster as a non data node. In other words, it doesn’t hold any data itself, but it knows what data lives on which node in the cluster, and can forward requests directly to the correct node.
【转发请求】
Transport client
The lighter-weight transport client can be used to send requests to a remote cluster. It doesn’t join the cluster itself, but simply forwards requests to a node in the cluster.

Both Java clients talk to the cluster over port 9300, using the native Elasticsearch transport protocol. The nodes in the cluster also communicate with each other over port 9300. If this port is not open, your nodes will not be able to form a cluster.

节点客户端(Node client)节点客户端作为一个非数据节点加入到本地集群中。换句话说,它本身不保存任何数据,但是它知道数据在集群中的哪个节点中,并且可以把请求转发到正确的节点。

传输客户端(Transport client)轻量级的传输客户端可以可以将请求发送到远程集群。它本身不加入集群,但是它可以将请求转发到集群中的一个节点上。

A request to Elasticsearch consists of the same parts as any HTTP request:

curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'

The parts marked with < > above are:

VERB

The appropriate HTTP method or verbGETPOSTPUTHEAD, or DELETE.

PROTOCOL

Either http or https (if you have an https proxy in front of Elasticsearch.)

HOST

The hostname of any node in your Elasticsearch cluster, or localhost for a node on your local machine.

PORT

The port running the Elasticsearch HTTP service, which defaults to 9200.

PATH

API Endpoint (for example _count will return the number of documents in the cluster). Path may contain multiple components, such as _cluster/stats or _nodes/stats/jvm

QUERY_STRING

Any optional query-string parameters (for example ?pretty will pretty-print the JSON response to make it easier to read.)

BODY

A JSON-encoded request body (if the request needs one.)

Document Oriented

Objects in an application are seldom just a simple list of keys and values. More often than not, they are complex data structures that may contain dates, geo locations, other objects, or arrays of values.

Sooner or later you’re going to want to store these objects in a database. Trying to do this with the rows and columns of a relational database is the equivalent of trying to squeeze your rich, expressive objects into a very big spreadsheet: you have to flatten the object to fit the table schema—usually one field per column—and then have to reconstruct it every time you retrieve it.

Elasticsearch is document oriented, meaning that it stores entire objects or documents. It not only stores them, but also indexes the contents of each document in order to make them searchable. In Elasticsearch, you index, search, sort, and filter documents—not rows of columnar data. This is a fundamentally different way of thinking about data and is one of the reasons Elasticsearch can perform complex full-text search.

应用中的一个个对象,不仅仅是键值对的数据结构。

【document oriented】

Elasticsearch是以文档document为中心的,来存储整个应用中的对象、文档。Elasticsearch为每一个文档的内容(contents of each documents)建立索引,使得他们便于被检索到。

【可以执行全文搜索的原因 full-text search】

索引、搜索、排序、过滤的操作对象是文档,而不是传统的列式数据

基础入门 | Elasticsearch: 权威指南 | Elastic https://elasticsearch.cn/book/elasticsearch_definitive_guide_2.x/getting-started.html

Elasticsearch 中没有一个单独的组件是全新的或者是革命性的。全文搜索很久之前就已经可以做到了, 就像早就出现了的分析系统和分布式数据库。 革命性的成果在于将这些单独的,有用的组件融合到一个单一的、一致的、实时的应用中。它对于初学者而言有一个较低的门槛, 而当你的技能提升或需求增加时,它也始终能满足你的需求。

信息检索的概念、分布式系统原理、Query DSL

ElasticSearch 模糊匹配查询 - CSDN博客 http://blog.csdn.net/hemuxiao/article/details/72869105

目前的需求输入:王? 女 济南 20-30

  • 能够查询以王开头的人的名字
  • 性别为女性
  • 地址为济南
  • *年龄为20-30
{
"from": 0,
"query": {
"bool": {
"minimum_should_match": 1,
"must": [
{
"match": {
"XB": "2 "
}
},
{
"wildcard": {
"XM": "王*"
}
}
],
"should": [
{
"range": {
"CSRQ": {
"gte": "2010-09-01",
"lte": "2014-09-01"
}
}
}
]
}
},
"size": 10
}

  

模糊查询

curl '1.2.21.10 : 9200/dvote/kwaddress/_search?pretty=true'  -d '
{
"from": ,
"query": {
"bool": {
"minimum_should_match": ,
"must": [
{
"wildcard": {
"title": "*网*"
}
}
]
}
},
"size":
}
'
https://elasticsearch.cn/book/elasticsearch_definitive_guide_2.x/_distributed_nature.html

分布式特性

在本章开头,我们提到过 Elasticsearch 可以横向扩展至数百(甚至数千)的服务器节点,同时可以处理PB级数据。我们的教程给出了一些使用 Elasticsearch 的示例,但并不涉及任何内部机制。Elasticsearch 天生就是分布式的,并且在设计时屏蔽了分布式的复杂性。

Elasticsearch 在分布式方面几乎是透明的。教程中并不要求了解分布式系统、分片、集群发现或其他的各种分布式概念。可以使用笔记本上的单节点轻松地运行教程里的程序,但如果你想要在 100 个节点的集群上运行程序,一切依然顺畅。

Elasticsearch 尽可能地屏蔽了分布式系统的复杂性。这里列举了一些在后台自动执行的操作:

  • 分配文档到不同的容器 或 分片 中,文档可以储存在一个或多个节点中
  • 按集群节点来均衡分配这些分片,从而对索引和搜索过程进行负载均衡
  • 复制每个分片以支持数据冗余,从而防止硬件故障导致的数据丢失
  • 将集群中任一节点的请求路由到存有相关数据的节点
  • 集群扩容时无缝整合新节点,重新分配分片以便从离群节点恢复

当阅读本书时,将会遇到有关 Elasticsearch 分布式特性的补充章节。这些章节将介绍有关集群扩容、故障转移(集群内的原理) 、应对文档存储(分布式文档存储) 、执行分布式搜索(执行分布式检索) ,以及分区(shard)及其工作原理(分片内部原理) 。

Distributed Nature | Elasticsearch: The Definitive Guide [2.x] | Elastic https://www.elastic.co/guide/en/elasticsearch/guide/current/_distributed_nature.html

Distributed Nature

At the beginning of this chapter, we said that Elasticsearch can scale out to hundreds (or even thousands) of servers and handle petabytes of data. While our tutorial gave examples of how to use Elasticsearch, it didn’t touch on the mechanics at all. Elasticsearch is distributed by nature, and it is designed to hide the complexity that comes with being distributed.

The distributed aspect of Elasticsearch is largely transparent. Nothing in the tutorial required you to know about distributed systems, sharding, cluster discovery, or dozens of other distributed concepts. It happily ran the tutorial on a single node living inside your laptop, but if you were to run the tutorial on a cluster containing 100 nodes, everything would work in exactly the same way.

Elasticsearch tries hard to hide the complexity of distributed systems. Here are some of the operations happening automatically under the hood:

  • Partitioning your documents into different containers or shards, which can be stored on a single node or on multiple nodes
  • Balancing these shards across the nodes in your cluster to spread the indexing and search load
  • Duplicating each shard to provide redundant copies of your data, to prevent data loss in case of hardware failure
  • Routing requests from any node in the cluster to the nodes that hold the data you’re interested in
  • Seamlessly integrating new nodes as your cluster grows or redistributing shards to recover from node loss

As you read through this book, you’ll encounter supplemental chapters about the distributed nature of Elasticsearch. These chapters will teach you about how the cluster scales and deals with failover (Life Inside a Cluster), handles document storage (Distributed Document Store), executes distributed search (Distributed Search Execution), and what a shard is and how it works (Inside a Shard).

【面向对象编程语言如此流行的原因】

面向对象编程语言如此流行的原因之一是对象帮我们表示和处理现实世界具有潜在的复杂的数据结构的实体,到目前为止,一切都很完美!

但是当我们需要存储这些实体时问题来了,传统上,我们以行和列的形式存储数据到关系型数据库中,相当于使用电子表格。 正因为我们使用了这种不灵活的存储媒介导致所有我们使用对象的灵活性都丢失了。

但是否我们可以将我们的对象按对象的方式来存储? 这样我们就能更加专注于 使用 数据,而不是在电子表格的局限性下对我们的应用建模。 我们可以重新利用对象的灵活性。

一个 对象 是基于特定语言的内存的数据结构。 为了通过网络发送或者存储它,我们需要将它表示成某种标准的格式。 JSON 是一种以人可读的文本表示对象的方法。 它已经变成 NoSQL 世界交换数据的事实标准。当一个对象被序列化成为 JSON,它被称为一个 JSON 文档 。

可以执行全文搜索的原因 Elasticsearch full-text search Kibana RESTful API with JSON over HTTP elasticsearch_action es 模糊查询的更多相关文章

  1. Elasticsearch入坑指南之RESTful API

    Elasticsearch入坑指南之RESTful API Tags:Elasticsearch ES为开发者提供了非常丰富的基于Http协议的Rest API,通过简单的Rest请求,就可以实现非常 ...

  2. 使用ElasticSearch服务从MySQL同步数据实现搜索即时提示与全文搜索功能

    最近用了几天时间为公司项目集成了全文搜索引擎,项目初步目标是用于搜索框的即时提示.数据需要从MySQL中同步过来,因为数据不小,因此需要考虑初次同步后进行持续的增量同步.这里用到的开源服务就是Elas ...

  3. 使用ElasticSearch实现搜索时即时提示与全文搜索功能

    <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8&quo ...

  4. MySQL全文搜索

    http://www.yiibai.com/mysql/full-text-search.html 在本节中,您将学习如何使用MySQL全文搜索功能. MySQL全文搜索提供了一种实现各种高级搜索技术 ...

  5. java使用elasticsearch进行模糊查询-已在项目中实际应用

    java使用elasticsearch进行模糊查询 使用环境上篇文章本人已书写过,需要maven坐标,ES连接工具类的请看上一篇文章,以下是内容是笔者在真实项目中运用总结而产生,并写的是主要方法和思路 ...

  6. 一种安全云存储方案设计(下)——基于Lucene的云端搜索与密文基础上的模糊查询

    一种安全的云存储方案设计(未完整理中) 一篇老文了,现在看看错漏颇多,提到的一些技术已经跟不上了.仅对部分内容重新做了一些修正,增加了一些机器学习的内容,然并卵. 这几年来,云产品层出不穷,但其安全性 ...

  7. ElasticSearch 2 &lpar;14&rpar; - 深入搜索系列之全文搜索

    ElasticSearch 2 (14) - 深入搜索系列之全文搜索 摘要 在看过结构化搜索之后,我们看看怎样在全文字段中查找相关度最高的文档. 全文搜索两个最重要的方面是: 相关(relevance ...

  8. Elasticsearch系列---深入全文搜索

    概要 本篇介绍怎样在全文字段中搜索到最相关的文档,包含手动控制搜索的精准度,搜索条件权重控制. 手动控制搜索的精准度 搜索的两个重要维度:相关性(Relevance)和分析(Analysis). 相关 ...

  9. 全文搜索之 Elasticsearch

    概述 Elasticsearch (ES)是一个基于 Lucene 的开源搜索引擎,它不但稳定.可靠.快速,而且也具有良好的水平扩展能力,是专门为分布式环境设计的. 特性 安装方便:没有其他依赖,下载 ...

随机推荐

  1. ruby中tes-unitt数据初始化方法整理

    在用ruby做测试时,很多时候需要一些数据初始化以及事后的数据恢复还原之类的操作,下面整理了这些方法.require "test/unit" class TestAnion &lt ...

  2. iOS字体换算 PS的字体大小 &lt&semi;&equals;&gt&semi;iOS上字体大小

  3. C&num;记录程序耗时的方法

    用于准确测量运行时间的方法类: System.Diagnostics.Stopwatch 具体使用方法: using System.Diagnostics; Stopwatch stopwatch = ...

  4. Xcode编译错误集锦

    1.在将ios项目进行Archive打包时,Xcode提示以下错误: [BEROR]CodeSign error: Certificate identity ‘iPhone Distribution: ...

  5. LeetCode题解-----Majority Element II 摩尔投票法

    题目描述: Given an integer array of size n, find all elements that appear more than ⌊ n/3 ⌋ times. The a ...

  6. asp&period;net 中json字符串转换

    List<ATTVal> Replys = JsonParser.FromJson<List<ATTVal>>(attrValueStr);

  7. Configuring HTTP and HTTPS

    Configuring HTTP and HTTPS .NET Framework (current version)   Other Versions   WCF services and clie ...

  8. 06-IOSCore - KVC、CoreData

    一. KVC 1. KVC 使用前:黯淡无光 if ([keyPath isEqualToString:@"name"]) { self.labelName.text = self ...

  9. &period;net ef core 领域设计代码转换(上篇)

    一.前言 .net core 2.0正式版已经发布几个月了,经过研究,决定把项目转移过来,新手的话可以先看一些官方介绍 传送门:https://docs.microsoft.com/zh-cn/dot ...

  10. C&num; 消息队列-Microsoft Azure service bus 服务总线

    先决条件 Visual Studio 2015或更高版本.本教程中的示例使用Visual Studio 2015. Azure订阅. 注意 要完成本教程,您需要一个Azure帐户.您可以激活MSDN订 ...