ElasticSearch & Tire: Using Mapping and to_indexed_json

Date: 2022-05-01 12:00:42

While reading the Tire docs, I was under the impression that you should use either the mapping method or the to_indexed_json method, since (as I understood it) the mapping is used to feed to_indexed_json.

The problem is that I found some tutorials where both are used. Why?

Basically, my app currently works with to_indexed_json, but I can't figure out how to set the boost value of some of the attributes (which is why I started looking at mapping), and I was wondering whether using both would create conflicts.

1 solution

#1


While the mapping and to_indexed_json methods are related, they in fact serve two different purposes.

The purpose of the mapping method is to define the mapping for the document properties within an index. You may want to define a certain property as "not_analyzed", so that it is not broken into tokens, or set a specific analyzer for the property, or (as you mention) set an index-time boost factor. You may also define multi-field properties, custom formats for date types, etc.

This mapping is then used, e.g., when Tire automatically creates an index for your model.
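
To make the mapping options above concrete, here is a minimal sketch of the kind of property mapping they describe, written as a plain Ruby Hash rather than through Tire itself. The `article` type and the `id`, `title`, and `published_on` fields are hypothetical, and the option names follow the older elasticsearch mapping format that Tire targeted (`index: 'not_analyzed'`, `analyzer`, `boost`); newer elasticsearch versions have changed or removed some of them.

```ruby
require 'json'

# Hypothetical properties for an "article" document type.
def article_properties
  {
    id:           { type: 'string', index: 'not_analyzed' },              # kept as a single token
    title:        { type: 'string', analyzer: 'snowball', boost: 10.0 },  # index-time boost
    published_on: { type: 'date', format: 'yyyy-MM-dd' }                  # custom date format
  }
end

# Wrap the properties in the layout of a mapping body for one document type.
def mapping_body(type, properties)
  { type => { properties: properties } }
end

puts JSON.pretty_generate(mapping_body('article', article_properties))
```

In a Tire model, roughly the same definition would be written inside a `mapping do ... end` block, with one line such as `indexes :title, analyzer: 'snowball', boost: 10` per property.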

The purpose of the to_indexed_json method is to define a JSON serialization for your documents/models.

The default to_indexed_json method does use your mapping definition: it serializes only the properties defined in the mapping, on the basis that if you care enough to define a mapping, Tire should by default index only the properties that have one.
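
The filtering described above can be sketched in plain Ruby. This is not Tire's actual source, only an illustration of the idea for the case where a mapping has been defined; `mapped_properties`, `default_indexed_json`, and the sample attributes are hypothetical names.

```ruby
require 'json'

# Names of the properties that have a mapping (hypothetical model).
def mapped_properties
  %w[id title published_on]
end

# Keep only the attributes whose names appear in the mapping, then serialize.
def default_indexed_json(attributes, mapped = mapped_properties)
  attributes.select { |key, _| mapped.include?(key.to_s) }.to_json
end

article = { id: 1, title: 'Hello', internal_notes: 'do not index', published_on: '2012-01-01' }
puts default_indexed_json(article)
# prints {"id":1,"title":"Hello","published_on":"2012-01-01"}
```

The unmapped `internal_notes` attribute never reaches the index, which is the behaviour you get for free when you define a mapping and leave to_indexed_json alone.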

Now, when you want tight control over how your model is actually serialized into JSON for elasticsearch, you just define your own to_indexed_json method (as the README instructs).

This custom MyModel#to_indexed_json method usually does not care about the mapping definition and builds the JSON serialization from scratch (by leveraging ActiveRecord's to_json, using a JSON builder such as jbuilder, or just building a plain old Hash and calling Hash#to_json).
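
As an illustration of the "plain old Hash" variant, here is a minimal sketch. The `Article` class is a hypothetical stand-in for a model (a real app would use its ActiveRecord class with Tire included), and the field names are made up for the example.

```ruby
require 'json'

# Hypothetical stand-in for a model class.
class Article
  attr_reader :title, :content, :author_name

  def initialize(title:, content:, author_name:)
    @title = title
    @content = content
    @author_name = author_name
  end

  # Build the document from scratch, exactly as the search engine should see it.
  def to_indexed_json
    {
      title: title,
      content: content,
      author: author_name,        # exposed under a different name in the index
      title_length: title.length  # computed value that is not a model attribute
    }.to_json
  end
end

article = Article.new(title: 'Hello', content: 'First post', author_name: 'Jane')
puts article.to_indexed_json
# prints {"title":"Hello","content":"First post","author":"Jane","title_length":5}
```

Because a method like this ignores the mapping, it is worth keeping the field names in the JSON in line with the names used in the mapping; a boost or analyzer defined for `title` applies only if the serialized document actually contains a `title` field.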

So, to answer the last part of your question, using both mapping and to_indexed_json will absolutely not create any conflicts, and is in fact required to use advanced features in elasticsearch.

To sum up:

  1. You use the mapping method to define the mapping for your models for the search engine.
  2. You use a custom to_indexed_json method to define how the search engine sees your documents/models.
