Query Rescoring in Solr


By: Vijay Mhaskar | June 19, 2015

Introduction

Sometimes relevance requirements are very complex and create performance issues during query execution. Solr 4.9 introduced a feature called “Query Re-ranking/Rescoring” (SOLR-6088) that addresses this. It lets us first run a simple query with an inexpensive relevance algorithm, and then re-rank only the top N documents with a more complex, more expensive query. Because the expensive ranking algorithm is applied to the top N documents only, the impact on search performance is much smaller. The first query must be chosen carefully, though: documents that score very low on the simple query may never reach the re-ranking phase, even if they would score very highly on the complex query.

Re-ranking is therefore a two-stage scoring mechanism in which the “expensive” scoring calculations are applied in the second step, where only a limited number of documents are involved.

Query Re-ranking syntax

We use the “rq” request parameter to specify a ranking query. The “rq” parameter should contain a query string that produces a RankQuery. The easiest way to do this is the “rerank” parser provided with Solr. It wraps a query specified by a local parameter, along with additional parameters indicating how many documents should be re-ranked (reRankDocs) and how the final scores should be computed (reRankWeight).

Parameter | Default | Description
reRankDocs | 200 | Number of documents whose score will be recalculated in the second step
reRankQuery | (mandatory) | Solr query that is used for the re-ranking; in most cases a variable will be used to refer to another request parameter
reRankWeight | 2.0 | A multiplicative factor that will be applied to the score from the reRankQuery for each of the top matching documents, before that score is added to the original score
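The reRankWeight description amounts to a simple piece of arithmetic, sketched below. The function name `combined_score` is hypothetical, and the sketch assumes that a document in the top reRankDocs that does not match the reRankQuery simply keeps its original score.

```python
def combined_score(original: float, rerank: float, matched: bool,
                   re_rank_weight: float = 2.0) -> float:
    """Final score for one document inside the top reRankDocs window."""
    # Assumption: a document that does not match the reRankQuery
    # keeps its first-pass score unchanged.
    if not matched:
        return original
    # The reRankQuery score is scaled by reRankWeight, then added.
    return original + re_rank_weight * rerank


# Original score 1.5, reRankQuery score 0.5, reRankWeight 3:
print(combined_score(1.5, 0.5, matched=True, re_rank_weight=3.0))  # 3.0
```

With the default weight of 2.0 the same document would end up at 1.5 + 2.0 * 0.5 = 2.5.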

Example

q=Digital+group&rq={!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=3}&rqq=solrcloud

The above example finds all documents that match the query “Digital group” and re-ranks the top 1000 of them. For each of those documents that also matches “solrcloud”, the score from the “solrcloud” query is multiplied by 3 and added to the original score.
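When the request is sent from a client program, the local-params syntax in “rq” has to be URL-encoded. The following sketch builds the same request with Python's standard library. The host, port and collection name (`collection1`) are assumptions for illustration and need to be adjusted to the actual Solr installation.

```python
from urllib.parse import urlencode, parse_qs

# Assumed for illustration: a local Solr node and a collection named "collection1".
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"

params = {
    "q": "Digital group",
    # $rqq refers to the separate "rqq" request parameter below.
    "rq": "{!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=3}",
    "rqq": "solrcloud",
}

# urlencode escapes the braces, "!", "=", "$" and spaces in the rq value.
query_string = urlencode(params)
request_url = f"{SOLR_SELECT_URL}?{query_string}"
print(request_url)
```

Keeping the re-rank query in its own “rqq” parameter, as the example does, avoids nested quoting inside the local params.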

Conclusion

Re-ranking is a good way to score documents dynamically when the full ranking is too expensive to apply to every matching document. It is not a general-purpose tool for adjusting scores, but it will be very helpful in many situations.
