How do I index all the terms in a Lucene document?

Time: 2021-05-02 04:15:46

The documents that I am indexing are very large. By default, Lucene indexes only the first 10,000 terms of a document to avoid OutOfMemory errors, so I am getting incorrect hits when searching the index. How can I index all the terms in the document?

1 Solution

#1


0  

IndexWriter.MaxFieldLength specifies the maximum field length (in number of tokens/terms) and is passed to the IndexWriter constructors.

To lift the limit, pass the maximum value to the IndexWriter constructor: MAX_VALUE in recent Lucene versions, or UNLIMITED in older versions.

You could also use IndexWriter.setMaxFieldLength(int) to override the value set by the constructor.
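
To make the effect of the limit concrete, here is a minimal sketch of a toy inverted index that caps the number of terms indexed per document. It is not Lucene code: the class TruncatingIndex and its methods are hypothetical names invented for this illustration, and it only mimics the behaviour described in the question (terms past the cap are silently dropped, so searches for them miss the document).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index, NOT Lucene: it only mimics a per-document term cap.
class TruncatingIndex {
    private final int maxFieldLength; // maximum number of terms indexed per document
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    TruncatingIndex(int maxFieldLength) {
        this.maxFieldLength = maxFieldLength;
    }

    // Split on whitespace and index at most maxFieldLength terms;
    // anything after that is silently dropped.
    void addDocument(int docId, String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        String[] terms = trimmed.split("\\s+");
        int limit = Math.min(terms.length, maxFieldLength);
        for (int i = 0; i < limit; i++) {
            postings.computeIfAbsent(terms[i].toLowerCase(), k -> new TreeSet<>()).add(docId);
        }
    }

    // Return the ids of documents whose indexed terms include the given term.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeSet<>());
    }

    public static void main(String[] args) {
        // Build a document with 10,000 filler terms followed by one rare term.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("filler ");
        }
        sb.append("needle");
        String doc = sb.toString();

        TruncatingIndex capped = new TruncatingIndex(10_000);
        capped.addDocument(1, doc);

        TruncatingIndex uncapped = new TruncatingIndex(Integer.MAX_VALUE);
        uncapped.addDocument(1, doc);

        System.out.println("capped:   " + capped.search("needle"));   // prints "capped:   []"
        System.out.println("uncapped: " + uncapped.search("needle")); // prints "uncapped: [1]"
    }
}
```

In a Lucene 3.0-era setup, the uncapped case should correspond to something like new IndexWriter(dir, analyzer, IndexWriter.MaxFieldLength.UNLIMITED) or writer.setMaxFieldLength(Integer.MAX_VALUE), where dir and analyzer are whatever Directory and Analyzer you already use. Check the Javadoc for your exact version, since this setting was deprecated and removed in later releases.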
