Notes on some Solr configuration parameters

Date: 2021-09-29 08:22:21

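I keep forgetting what some of the parameters in schema.xml (and solrconfig.xml) do, so I am writing them down on the blog. As a colleague put it: a good memory is no match for a shabby blog.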

 

 

   <requestParsers enableRemoteStreaming="true"
                    multipartUploadLimitInKB="2048000" />

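This element lives in solrconfig.xml. enableRemoteStreaming controls whether Solr accepts remote streams, that is, the stream.file and stream.url request parameters. In my setup it also decided whether Solr's configuration files could be fetched remotely, for example http://localhost:8080/solr/admin/file?file=schema.xml (or file=solrconfig.xml). multipartUploadLimitInKB caps the size of multipart uploads, in KB.

With it set to false, those files could not be accessed.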

 

 

 

<field name="Subject" type="text" indexed="true" stored="true" omitNorms="true"/>

 

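omitNorms is related to document length: when it is true, Solr does not store norms for the field, which disables length normalization (and index-time boosts) for that field.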
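To make the link to document length concrete, here is a minimal sketch of the length normalization used by Lucene's classic TF-IDF similarity, assuming the default formula 1/sqrt(number of terms in the field). The function name and the omit_norms flag are illustrative only and not part of any Solr API; real norms are also rounded when they are stored in a single byte, which this sketch ignores.

```python
import math

def field_norm(num_terms, doc_boost=1.0, field_boost=1.0, omit_norms=False):
    """Approximate index-time norm for one field of one document (hypothetical helper)."""
    if omit_norms:
        # With omitNorms="true" no norm is stored, so field length and
        # index-time boosts have no effect on the score.
        return 1.0
    # Classic length normalization: shorter fields get a larger norm.
    length_norm = 1.0 / math.sqrt(num_terms)
    return doc_boost * field_boost * length_norm

print(field_norm(1))                   # one-term field
print(field_norm(4))                   # four-term field scores lower
print(field_norm(4, omit_norms=True))  # length no longer matters
```

With norms enabled, a match in a short Subject outranks the same match in a long one; with omitNorms="true" both are treated alike.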

 

What some of the search-time parameters do

 

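bf (boost functions) sets the document boost: one or more function queries whose values are added to the document's score. (A document boost can be set at index time, or computed dynamically at search time as bf does.)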

 

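qf (query fields) sets the scoring weight of each field being searched. (Field boosts can be set at index time, or applied dynamically at search time as qf does.) qf only takes effect when the dismax query parser is used.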
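As a rough illustration of how dismax combines the per-field scores that qf produces, the sketch below follows the usual DisjunctionMaxQuery rule: take the best field score and add the other fields scaled by the tie parameter. The function name and the sample scores are made up for illustration.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores the way a disjunction-max query does (sketch)."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    # tie=0.0 keeps only the best field; tie=1.0 degenerates to a plain sum.
    return best + tie * (sum(field_scores) - best)

# Hypothetical scores for one term matched in Subject and in tag.
print(dismax_score([3.0, 1.0]))
print(dismax_score([3.0, 1.0], tie=0.5))
```

This is why a higher weight on Subject in qf tends to dominate the ranking: the best-scoring field decides most of the score unless tie is raised.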

 

 

mm (minimum should match): how many of the query terms a document must match before it is returned.
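The simple forms of mm can be sketched as follows. This is a simplified reading of the documented semantics for a plain integer ("2"), a negative integer ("-1"), a percentage ("75%") and a negative percentage ("-25%"); the conditional forms such as "2<-25%" are deliberately left out, and the function name is my own.

```python
def min_should_match(spec, optional_clauses):
    """Number of optional clauses that must match for a given mm spec (sketch)."""
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        if percent < 0:
            # Negative percentage: this share of clauses may be missing (rounded down).
            allowed_missing = optional_clauses * -percent // 100
            required = optional_clauses - allowed_missing
        else:
            # Positive percentage: this share of clauses must match (rounded down).
            required = optional_clauses * percent // 100
    else:
        value = int(spec)
        # Negative integer: that many clauses may be missing.
        required = optional_clauses + value if value < 0 else value
    # Never require more clauses than exist, never fewer than zero.
    return max(0, min(required, optional_clauses))

for spec in ["2", "-1", "75%", "-25%"]:
    print(spec, min_should_match(spec, 3))
```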

 

To be continued...

 

Searching with the dismax handler:

http://localhost:8080/solr/select/?q=美女&q.op=AND&start=0&rows=20&fl=*&qt=dismax&bf=sum(recip(rord(public_time),1,56,7),recip(rord(public_time),1,112,14),recip(rord(public_time),1,180,30),recip(rord(public_time),1,720,180),recip(rord(public_time),1,720,360))^7+div(log(times),log(4))^30+map(hd,1,1,15,0)^4+div(log(totaltime),log(4))^30&qf=Subject^1+tag^0.3
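To see what the bf expression in this URL computes, here is a small sketch that evaluates it for one document. It assumes the documented definitions recip(x,m,a,b) = a/(m*x+b), map(x,min,max,target,default) and a base-10 log, and it treats each ^n as a plain multiplier; the real score also passes through query normalization, so only the relative sizes are meaningful. The argument values are invented, and rord(public_time) is passed in as a number (1 = most recent document).

```python
import math

def recip(x, m, a, b):
    # Solr's recip(x, m, a, b) = a / (m*x + b)
    return a / (m * x + b)

def map_range(x, lo, hi, target, default):
    # Solr's map(x, min, max, target, default): target inside [min, max], else default
    return target if lo <= x <= hi else default

def log4(x):
    # div(log(x), log(4)) is a base-4 logarithm; x must be positive
    return math.log10(x) / math.log10(4)

def bf_boost(rord_public_time, times, hd, totaltime):
    recency = sum(
        recip(rord_public_time, 1, a, b)
        for a, b in [(56, 7), (112, 14), (180, 30), (720, 180), (720, 360)]
    )
    return (recency * 7
            + log4(times) * 30
            + map_range(hd, 1, 1, 15, 0) * 4
            + log4(totaltime) * 30)

# Hypothetical documents: same counters, different recency rank.
print(bf_boost(1, 16, 1, 4))
print(bf_boost(100, 16, 1, 4))
```

The recip terms reward recent documents, the two log terms grow slowly with times and totaltime, and the map term adds a fixed bonus when hd equals 1.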

 

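In this request the default search field is text (text = Subject + tag), so q=美女 on its own is equivalent to searching text:美女. Because the request also passes qf=Subject^1+tag^0.3, the search actually runs as Subject:美女 OR tag:美女, with each field given its corresponding weight.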
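A request like this is easier to maintain when the parameters are kept in a dictionary and encoded by a library. The sketch below uses Python's standard urllib; the host and handler path are taken from the URL above, and the bf value is shortened to two of its terms purely to keep the example readable. Note that a literal + in the hand-written URL is an encoded space, so the values contain spaces here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://localhost:8080/solr/select/"

params = {
    "q": "美女",
    "q.op": "AND",
    "start": 0,
    "rows": 20,
    "fl": "*",
    "qt": "dismax",
    # Shortened for illustration; the full expression is in the URL above.
    "bf": "div(log(times),log(4))^30 map(hd,1,1,15,0)^4",
    "qf": "Subject^1 tag^0.3",
}

# urlencode percent-encodes non-ASCII text and '^', and turns spaces into '+'.
url = BASE + "?" + urlencode(params)
print(url)

# Decoding the query string gives back the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["qf"][0])
```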

 

 

Below is the debug output used to check the document boost and the field boosts:

 

<lst name="params">
<str name="debugQuery">on</str>
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">Subject:mm OR tag:mm</str>
<str name="version">2.2</str>
<str name="rows">10</str>
</lst>
</lst>

<result name="response" numFound="3" start="0">

<doc>
<str name="Subject">mm</str>
<str name="id">15</str>

<arr name="tag">
<str>mm</str>
</arr>
</doc>

<doc>
<str name="Subject">mm</str>
<str name="id">13</str>

<arr name="tag">
<str>love you haha</str>
</arr>
</doc>

<doc>
<str name="Subject">love you haha</str>
<str name="id">14</str>

<arr name="tag">
<str>mm</str>
</arr>
</doc>
</result>
<lst name="explain">

<str name="15">

13.277615 = (MATCH) sum of:
  12.204243 = (MATCH) weight(Subject:mm in 0), product of:
    0.78980696 = queryWeight(Subject:mm), product of:
      1.287682 = idf(docFreq=2, maxDocs=4)
      0.6133556 = queryNorm
    15.452185 = (MATCH) fieldWeight(Subject:mm in 0), product of:
      1.0 = tf(termFreq(Subject:mm)=1)
      1.287682 = idf(docFreq=2, maxDocs=4)
      12.0 = fieldNorm(field=Subject, doc=0)
  1.0733722 = (MATCH) weight(tag:mm in 0), product of:
    0.6133556 = queryWeight(tag:mm), product of:
      1.0 = idf(docFreq=3, maxDocs=4)
      0.6133556 = queryNorm
    1.75 = (MATCH) fieldWeight(tag:mm in 0), product of:
      1.0 = tf(termFreq(tag:mm)=1)
      1.0 = idf(docFreq=3, maxDocs=4)
      1.75 = fieldNorm(field=tag, doc=0)
</str>

<str name="13">

6.1021214 = (MATCH) product of:
  12.204243 = (MATCH) sum of:
    12.204243 = (MATCH) weight(Subject:mm in 0), product of:
      0.78980696 = queryWeight(Subject:mm), product of:
        1.287682 = idf(docFreq=2, maxDocs=4)
        0.6133556 = queryNorm
      15.452185 = (MATCH) fieldWeight(Subject:mm in 0), product of:
        1.0 = tf(termFreq(Subject:mm)=1)
        1.287682 = idf(docFreq=2, maxDocs=4)
        12.0 = fieldNorm(field=Subject, doc=0)
  0.5 = coord(1/2)
</str>

<str name="14">

0.5366861 = (MATCH) product of:
  1.0733722 = (MATCH) sum of:
    1.0733722 = (MATCH) weight(tag:mm in 1), product of:
      0.6133556 = queryWeight(tag:mm), product of:
        1.0 = idf(docFreq=3, maxDocs=4)
        0.6133556 = queryNorm
      1.75 = (MATCH) fieldWeight(tag:mm in 1), product of:
        1.0 = tf(termFreq(tag:mm)=1)
        1.0 = idf(docFreq=3, maxDocs=4)
        1.75 = fieldNorm(field=tag, doc=1)
  0.5 = coord(1/2)
</str>
</lst>

 

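All three documents were given a document boost of 6 when they were submitted, with a field boost of 2 on Subject and 0.3 on tag. Document 15 matches in both fields, so it gets the highest score.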

 

Document 13 matches only Subject and document 14 matches only tag, so document 13 scores higher than document 14.
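The numbers in the explain output can be reproduced by hand. The sketch below assumes Lucene's classic TF-IDF formulas (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(termFreq), queryNorm = 1 / sqrt(sum of squared idf)) and takes the fieldNorm values straight from the output: 12.0 for Subject, which is 6 × 2, and 1.75 for tag, which looks like 6 × 0.3 = 1.8 after the lossy single-byte norm encoding. Variable and function names are my own.

```python
import math

MAX_DOCS = 4

def idf(doc_freq, max_docs=MAX_DOCS):
    # Classic similarity: rarer terms get a higher idf.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

idf_subject = idf(2)  # Subject:mm, docFreq=2
idf_tag = idf(3)      # tag:mm, docFreq=3

# queryNorm makes scores comparable across queries; it is the same for every document.
query_norm = 1.0 / math.sqrt(idf_subject ** 2 + idf_tag ** 2)

def clause_score(term_idf, field_norm, term_freq=1):
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(term_freq) * term_idf * field_norm
    return query_weight * field_weight

subject = clause_score(idf_subject, 12.0)
tag = clause_score(idf_tag, 1.75)

doc15 = subject + tag   # both clauses match, coord = 2/2
doc13 = subject * 0.5   # only Subject matches, coord(1/2)
doc14 = tag * 0.5       # only tag matches, coord(1/2)

print(round(doc15, 4), round(doc13, 4), round(doc14, 4))
```

The Subject clause is worth about eleven times the tag clause, almost entirely because of the fieldNorm (12.0 versus 1.75), which is where the document boost and the field boosts end up.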