Notes on some Solr configuration parameters

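I keep forgetting what some of the settings in schema.xml do, so I am writing them down on the blog. As a colleague of mine put it, a good memory is no match for a scrappy blog post.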

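One setting controls whether Solr's configuration files can be fetched remotely, for example http://localhost:8080/solr/admin/file?file=schema.xml (or file=solrconfig.xml).

If it is set to false, those files cannot be accessed.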

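Among the field attributes, omitNorms has to do with document length: a field's norms carry its length normalization (and any index-time boost), so setting omitNorms to true leaves both out of scoring for that field.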

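What some of the search-time parameters do

bf: the boost applied to the document as a whole. A document boost can be set when the document is indexed; bf itself is a boost function that is computed dynamically at search time and added to the score.

qf: the fields to search and the weight each field gets in scoring (a field boost can be set at index time, or supplied dynamically at search time as done here). qf only takes effect with the dismax query handler.

mm: minimum should match, i.e. how many of the query terms a document has to match before it is returned.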
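To make mm a bit more concrete, here is a minimal Python sketch of how a simple mm value could be turned into a required number of matching terms. It only covers the plain forms (a positive integer, a negative integer, or a percentage) and ignores conditional expressions; the function name min_should_match is made up for this note and is not part of Solr.

```python
def min_should_match(spec: str, optional_clauses: int) -> int:
    """Return how many optional clauses must match for a simple mm spec.

    Hypothetical helper for illustration only: handles "2", "-1", "75%",
    "-25%" and nothing more elaborate.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # int() truncates toward zero, so the computed count is rounded down
        calc = int(optional_clauses * percent / 100)
    else:
        calc = int(spec)
    # A negative value means "this many clauses are allowed to be missing"
    required = optional_clauses + calc if calc < 0 else calc
    # Never require fewer than 0 or more clauses than the query contains
    return max(0, min(required, optional_clauses))


if __name__ == "__main__":
    for spec in ("2", "-1", "75%"):
        print(spec, "->", min_should_match(spec, 4))
```

With four query terms, mm=2 requires two of them, mm=-1 lets one be missing, and mm=75% requires three.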

To be continued...

Using the dismax search component:

http://localhost:8080/solr/select/?q=美女&q.op=AND&start=0&rows=20&fl=*&qt=dismax&bf=sum(recip(rord(public_time),1,56,7),recip(rord(public_time),1,112,14),recip(rord(public_time),1,180,30),recip(rord(public_time),1,720,180),recip(rord(public_time),1,720,360))^7+div(log(times),log(4))^30+map(hd,1,1,15,0)^4+div(log(totaltime),log(4))^30&qf=Subject^1+tag^0.3

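The default search field here is text (text = Subject + tag), so q=美女 on its own is equivalent to searching text:美女. Because the request also passes qf=Subject^1+tag^0.3, the query that actually runs is Subject:美女 OR tag:美女, with the corresponding weight applied to each field.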
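A request like the one above is easier to put together in code than by hand. The following is a small Python sketch, assuming the same host, port, and field names as in the URL above; build_dismax_url is a hypothetical helper written for this note, not something Solr ships.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_dismax_url(base_url, q, qf, bf=None, mm=None, start=0, rows=20):
    """Build a dismax select URL (hypothetical helper for illustration).

    qf is a dict of field name -> boost, e.g. {"Subject": 1, "tag": 0.3}.
    """
    params = {"q": q, "qt": "dismax", "start": start, "rows": rows, "fl": "*"}
    # qf is a space-separated list of field^boost pairs;
    # urlencode turns the spaces into '+' as in the hand-written URL
    params["qf"] = " ".join(f"{field}^{boost}" for field, boost in qf.items())
    if bf:
        params["bf"] = bf
    if mm:
        params["mm"] = mm
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_dismax_url(
        "http://localhost:8080/solr/select/",
        "美女",
        {"Subject": 1, "tag": 0.3},
    )
    print(url)
    # Decode the query string again to check what Solr would receive
    print(parse_qs(urlsplit(url).query))
```

Letting urlencode do the escaping avoids mistakes with characters such as ^ and with non-ASCII query terms.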

Below is the debug output used to check the document boost and the field boosts:

<lst name="params">
<str name="debugQuery">on</str>
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">Subject:mm OR tag:mm</str>
<str name="version">2.2</str>
<str name="rows">10</str>
</lst>
</lst>
<result name="response" numFound="3" start="0">
<doc>
<str name="Subject">mm</str>
<str name="id">15</str>
<arr name="tag">
<str>mm</str>
</arr>
</doc>
<doc>
<str name="Subject">mm</str>
<str name="id">13</str>
<arr name="tag">
<str>love you haha</str>
</arr>
</doc>
<doc>
<str name="Subject">love you haha</str>
<str name="id">14</str>
<arr name="tag">
<str>mm</str>
</arr>
</doc>
</result>
<lst name="explain">
<str name="15">

13.277615 = (MATCH) sum of:
12.204243 = (MATCH) weight(Subject:mm in 0), product of:
    0.78980696 = queryWeight(Subject:mm), product of:
      1.287682 = idf(docFreq=2, maxDocs=4)
      0.6133556 = queryNorm
    15.452185 = (MATCH) fieldWeight(Subject:mm in 0), product of:
      1.0 = tf(termFreq(Subject:mm)=1)
      1.287682 = idf(docFreq=2, maxDocs=4)
      12.0 = fieldNorm(field=Subject, doc=0)
1.0733722 = (MATCH) weight(tag:mm in 0), product of:
    0.6133556 = queryWeight(tag:mm), product of:
      1.0 = idf(docFreq=3, maxDocs=4)
      0.6133556 = queryNorm
    1.75 = (MATCH) fieldWeight(tag:mm in 0), product of:
      1.0 = tf(termFreq(tag:mm)=1)
      1.0 = idf(docFreq=3, maxDocs=4)
      1.75 = fieldNorm(field=tag, doc=0)
</str>
<str name="13">

6.1021214 = (MATCH) product of:
12.204243 = (MATCH) sum of:
    12.204243 = (MATCH) weight(Subject:mm in 0), product of:
      0.78980696 = queryWeight(Subject:mm), product of:
        1.287682 = idf(docFreq=2, maxDocs=4)
        0.6133556 = queryNorm
      15.452185 = (MATCH) fieldWeight(Subject:mm in 0), product of:
        1.0 = tf(termFreq(Subject:mm)=1)
        1.287682 = idf(docFreq=2, maxDocs=4)
        12.0 = fieldNorm(field=Subject, doc=0)
0.5 = coord(1/2)
</str>
<str name="14">

0.5366861 = (MATCH) product of:
1.0733722 = (MATCH) sum of:
    1.0733722 = (MATCH) weight(tag:mm in 1), product of:
      0.6133556 = queryWeight(tag:mm), product of:
        1.0 = idf(docFreq=3, maxDocs=4)
        0.6133556 = queryNorm
      1.75 = (MATCH) fieldWeight(tag:mm in 1), product of:
        1.0 = tf(termFreq(tag:mm)=1)
        1.0 = idf(docFreq=3, maxDocs=4)
        1.75 = fieldNorm(field=tag, doc=1)
0.5 = coord(1/2)
</str>
</lst>

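All three documents were submitted with a document boost of 6, a Subject field boost of 2, and a tag field boost of 0.3. These index-time boosts show up in fieldNorm: 12.0 for Subject is 6 × 2, and 1.75 for tag is presumably 6 × 0.3 = 1.8 after the lossy one-byte encoding Lucene uses to store norms. Document 15 matches in both fields, so it scores highest.

Document 13 matches only Subject and document 14 matches only tag, so both are scaled by coord(1/2) = 0.5. Document 13 ranks above document 14 because the Subject match carries the much larger weight.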
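The numbers in the explain output can be recomputed by hand. The Python sketch below assumes the classic Lucene TF-IDF formulas that the output appears to follow (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(termFreq), queryNorm = 1 / sqrt(sum of squared idf values)); the fieldNorm values 12.0 and 1.75 are copied from the output rather than derived, and the function names are made up for this note.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed classic Lucene formula: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(term_freq, doc_freq, max_docs, field_norm, query_norm):
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                         # queryWeight
    field_weight = math.sqrt(term_freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


MAX_DOCS = 4
idf_subject = idf(2, MAX_DOCS)  # Subject:mm has docFreq=2
idf_tag = idf(3, MAX_DOCS)      # tag:mm has docFreq=3

# queryNorm normalizes over both query terms (both query boosts are 1 here)
query_norm = 1.0 / math.sqrt(idf_subject ** 2 + idf_tag ** 2)

# fieldNorm values taken directly from the explain output
subject = term_weight(1, 2, MAX_DOCS, 12.0, query_norm)
tag = term_weight(1, 3, MAX_DOCS, 1.75, query_norm)

scores = {
    "15": subject + tag,  # both clauses match, coord(2/2) = 1
    "13": subject * 0.5,  # only Subject matches, coord(1/2)
    "14": tag * 0.5,      # only tag matches, coord(1/2)
}

if __name__ == "__main__":
    print("queryNorm", query_norm)
    for doc_id, score in scores.items():
        print(doc_id, score)
```

The results should agree with the explain output up to float rounding, which makes it easy to see that nearly all of document 15's score comes from the Subject field.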

秒客网
