Applying Lucene wildcards to indexed field names

Time: 2022-09-12 23:39:58

I have a set of indexed fields such as these:

submitted_form_2200FA17-AF7A-4E44-9749-79D3A391A1AF:true

submitted_form_2398389-2-32-43242423:true

submitted_form_54543-32SDf-3242340-32422:true

I understand that wildcard queries on a field's value are possible, such as:

submitted_form_2398389-2-32-43242423:t*e

What I'm trying to do is get "any" submitted form via something like:

submitted_form_*:true

Is this possible? Or will I have to chain "OR" clauses over all the known forms (which seems quite heavy)?

1 solution

#1


1  

That's not the intended use of fields, I think. Field names aren't supposed to be the searchable values, field values are. Field names are supposed to be known a priori.

My suggestion is (if possible) to store the second part of the name as the field value, for instance: submitted_form:2398389-2-32-43242423. submitted_form would then be the field known a priori, and the value could be searched with a PrefixQuery if needed.
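
To illustrate the remodeling, here is a minimal sketch in plain Java (no Lucene dependency) that splits a dynamic field name into the fixed field name and the value to index. The class and method names (SubmittedFormFields, toFieldAndValue) are hypothetical, and the sketch assumes every dynamic name starts with the prefix submitted_form_:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper: turns "submitted_form_<id>" into ("submitted_form", "<id>").
class SubmittedFormFields {
    static final String PREFIX = "submitted_form_"; // assumed naming scheme from the question
    static final String FIELD = "submitted_form";   // the single field known a priori

    static Map.Entry<String, String> toFieldAndValue(String dynamicName) {
        if (!dynamicName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a submitted_form field: " + dynamicName);
        }
        // Everything after the prefix becomes the field value.
        return new SimpleImmutableEntry<>(FIELD, dynamicName.substring(PREFIX.length()));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> e = toFieldAndValue("submitted_form_2398389-2-32-43242423");
        System.out.println(e.getKey() + ":" + e.getValue());
    }
}
```

The resulting pair would be indexed as one field per submitted form, so a single field name covers every form and only the value varies.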

Alternatively, you could retrieve the collection of field names with IndexReader.getFieldNames() in Lucene 3.x, or through the equivalent field-info API in Lucene 4.x, and build the query from those names. I wouldn't expect good search performance from that approach.
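
If the index cannot be restructured, the "OR" fallback from the question could be assembled from the field names roughly as follows. This is only a sketch in plain Java: the field names are passed in as an ordinary collection (how they are read from the index depends on the Lucene version), the class and method names are hypothetical, and special characters in field names may need escaping before the string is handed to a query parser.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.stream.Collectors;

// Hypothetical helper: builds "f1:value OR f2:value ..." for every field name with a given prefix.
class AnySubmittedFormQuery {
    static String build(Collection<String> fieldNames, String prefix, String value) {
        return fieldNames.stream()
                .filter(name -> name.startsWith(prefix)) // keep only the dynamic form fields
                .sorted()                                // deterministic clause order
                .map(name -> name + ":" + value)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        Collection<String> names = Arrays.asList(
                "submitted_form_54543-32SDf-3242340-32422",
                "title",
                "submitted_form_2398389-2-32-43242423");
        System.out.println(build(names, "submitted_form_", "true"));
        // prints: submitted_form_2398389-2-32-43242423:true OR submitted_form_54543-32SDf-3242340-32422:true
    }
}
```

The query grows by one clause per form, which is why storing the identifier as a field value is the lighter option.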

