Faceted Search using Solr


By: Dattatraya Patil | July 3, 2015

 

Faceting:

Faceted search (also called faceted navigation, guided navigation, or parametric search) breaks up search results into multiple categories, typically showing counts for each category, and allows the user to “drill down” or further restrict their search results based on those facets.

Solr supports the following six types of faceting:

  1. Field faceting
  2. Range faceting
  3. Date faceting  (Date faceting is deprecated, so use Range faceting instead.)
  4. Query faceting
  5. Interval faceting
  6. Pivot faceting

When you execute a facet query on Solr, the response has a facet-related section appended at the end, as shown below:

<lst name="facet_counts">
   <lst name="facet_fields"/>
   <lst name="facet_ranges"/>
   <lst name="facet_dates"/>
   <lst name="facet_intervals"/>
   <lst name="facet_queries"/>
   <lst name="facet_pivot"/>
</lst>

Now we will look at each faceting type supported by Solr in detail.

1. Field faceting:

Field faceting retrieves counts of all terms, or just the top terms in any given field. The field must be indexed.

Example: The query below returns the count of each term in the field city.

q=*:*&facet=true&facet.field=city

Output:

<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
     <lst name="city">
       <int name="New York">2</int>
       <int name="New Orleans">1</int>
     </lst>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>
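
As a rough illustration of how a client might read these counts, here is a minimal Python sketch that parses the XML fragment shown above. The helper name field_facet_counts and the SAMPLE constant are made up for this example; they are not part of Solr or of any Solr client library.

```python
import xml.etree.ElementTree as ET

# The facet_counts fragment from the example above, used as sample input.
SAMPLE = """<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="city">
      <int name="New York">2</int>
      <int name="New Orleans">1</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>"""


def field_facet_counts(facet_counts_xml, field):
    """Return {term: count} for one field of a facet_counts XML fragment."""
    root = ET.fromstring(facet_counts_xml)
    # The root is the facet_counts element; descend to the requested field.
    path = "./lst[@name='facet_fields']/lst[@name='%s']/int" % field
    return {node.get("name"): int(node.text) for node in root.findall(path)}


city_counts = field_facet_counts(SAMPLE, "city")
print(city_counts)
```

A real response wraps facet_counts inside a larger response element, so the path would need adjusting for a full document.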

2. Range faceting:

Range faceting is applied on any date field or any numeric field that supports range queries.

Example: The query below returns the number of items with a price in the ranges [0-100], [100-200], [200-300] and [300-400].

http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.range=price&facet.range.start=0&facet.range.end=400&facet.range.gap=100

Output:

<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields"/>
  <lst name="facet_ranges">
    <lst name="price">
      <lst name="counts">
        <int name="0.0">2</int>
        <int name="100.0">0</int>
        <int name="200.0">2</int>
        <int name="300.0">0</int>
      </lst>
      <float name="gap">100.0</float>
      <float name="start">0.0</float>
      <float name="end">400.0</float>
    </lst>
  </lst>
</lst>

Parameters of Range Faceting:

Parameter: Description

facet.range: Specifies the field to facet by range. Example: facet.range=price&facet.range=age

facet.range.start: Specifies the start of the facet range. Example: facet.range.start=0 or f.price.facet.range.start=0.0&f.age.facet.range.start=10

facet.range.end: Specifies the end of the facet range. Example: facet.range.end=1000 or f.price.facet.range.end=1000.0&f.age.facet.range.end=99

facet.range.gap: Specifies the span of each range as a value to be added to the lower bound. Example: f.price.facet.range.gap=100&f.age.facet.range.gap=10

facet.range.hardend: A boolean parameter that specifies how Solr handles a range gap that does not divide evenly into the span between the range start and end values. If true, the last range has the facet.range.end value as its upper bound. If false, the last range has the smallest possible upper bound greater than facet.range.end such that the range is exactly as wide as the specified gap. The default value is false.
For example, with facet.range.start=0, facet.range.end=50 and facet.range.gap=15:
Case 1: facet.range.hardend=true, the effective range is start=0 and end=50, so the ranges are [0-15], [15-30], [30-45], [45-50].
Case 2: facet.range.hardend=false, the effective range is start=0 and end=60, so the ranges are [0-15], [15-30], [30-45], [45-60].

facet.range.include: Specifies inclusion and exclusion preferences for the upper and lower bounds of the range.
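
To make the facet.range.hardend behaviour concrete, the following Python sketch computes the bucket boundaries from start, end and gap as described above. It is only a model of the description, not Solr's implementation, and the function name range_buckets is hypothetical.

```python
def range_buckets(start, end, gap, hardend=False):
    """Return (lower, upper) bounds following the hardend description above."""
    if gap <= 0:
        raise ValueError("gap must be positive")
    buckets = []
    lower = start
    while lower < end:
        upper = lower + gap
        # With hardend=true the last bucket is cut off at the end value;
        # otherwise it keeps the full gap width and may extend past end.
        if hardend and upper > end:
            upper = end
        buckets.append((lower, upper))
        lower = upper
    return buckets


print(range_buckets(0, 50, 15, hardend=True))
print(range_buckets(0, 50, 15, hardend=False))
```

With start=0, end=50 and gap=15 the two calls differ only in the last bucket, which ends at 50 when hardend is true and at 60 when it is false.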

 

 

3. Date faceting:

Date faceting returns the number of documents that fall within certain date ranges.

Note: Date faceting is deprecated, so use Range faceting instead.

4. Query faceting:

Query faceting returns the number of documents in the current search results that also match the given facet query.

Example: The query below returns the count of cars whose price is in the range [10 TO 80] and the count of cars whose price is in the range [90 TO 300].

http://localhost:8983/solr/select?q=name:car&facet=true&facet.query=price:[10 TO 80]&facet.query=price:[90 TO 300]

<lst name="facet_counts">
   <lst name="facet_queries">
      <int name="price:[10 TO 80]">1</int>
      <int name="price:[90 TO 300]">3</int>
   </lst>
   <lst name="facet_fields" />
   <lst name="facet_dates" />
</lst>
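
The facet queries above contain spaces, colons and square brackets, which should be URL-encoded when the request is sent from code. Here is a small Python sketch, using only the standard library, that builds such a URL with repeated facet.query parameters. The function name facet_query_url is made up for this example, and the host and path are simply the ones used in this post.

```python
from urllib.parse import parse_qsl, urlencode


def facet_query_url(base_url, q, facet_queries):
    """Build a select URL with one facet.query parameter per facet query."""
    # A list of pairs (rather than a dict) lets facet.query repeat.
    params = [("q", q), ("facet", "true")]
    params += [("facet.query", fq) for fq in facet_queries]
    return base_url + "?" + urlencode(params)


url = facet_query_url(
    "http://localhost:8983/solr/select",
    "name:car",
    ["price:[10 TO 80]", "price:[90 TO 300]"],
)
print(url)

# Decoding the query string gives back the original parameter values.
decoded = parse_qsl(url.split("?", 1)[1])
print(decoded)
```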

5. Interval faceting:

Interval faceting allows you to set variable intervals and count the number of documents that have values within those intervals in the specified field. To use interval faceting on a field, the field must have "docValues" enabled.

Even though the same functionality can be achieved by using facet queries with range queries, the two methods are implemented very differently and will perform differently depending on the context. If you are concerned about the performance of your searches, you should test both options. Interval faceting tends to be better with multiple intervals for the same field, while facet queries tend to be better in environments where the cache is more effective (static indexes, for example).

Example: The query below returns the count of cars whose price falls in the intervals [10,80] and (90,300]. A square bracket includes the bound and a parenthesis excludes it.

http://localhost:8983/solr/select?q=name:car&facet=true&facet.interval=price&f.price.facet.interval.set=[10,80]&f.price.facet.interval.set=(90,300]

<lst name="facet_counts">
   <lst name="facet_queries" />
   <lst name="facet_fields" />
   <lst name="facet_dates" />
   <lst name="facet_intervals">
      <lst name="price">
         <int name="[10,80]">1</int>
         <int name="(90,300]">3</int>
      </lst>
   </lst>
</lst>
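
The bracket notation in facet.interval.set decides whether each bound is included. The Python sketch below models that counting on a plain list of numbers, purely to illustrate the semantics; it is not how Solr computes interval facets. The helper names and the sample prices are invented for this example, and the sketch does not handle unbounded intervals written with *.

```python
import re

# Matches specs such as "[10,80]" or "(90,300]".
INTERVAL_RE = re.compile(r"^([\[\(])\s*([^,]+?)\s*,\s*([^,]+?)\s*([\]\)])$")


def parse_interval(spec):
    """Return (low, high, low_inclusive, high_inclusive) for an interval spec."""
    match = INTERVAL_RE.match(spec.strip())
    if match is None:
        raise ValueError("not a valid interval: %r" % spec)
    open_bracket, low, high, close_bracket = match.groups()
    return float(low), float(high), open_bracket == "[", close_bracket == "]"


def count_intervals(values, specs):
    """Count how many values fall into each interval, keyed by the spec."""
    counts = {}
    for spec in specs:
        low, high, low_inc, high_inc = parse_interval(spec)
        counts[spec] = sum(
            1
            for value in values
            if (value >= low if low_inc else value > low)
            and (value <= high if high_inc else value < high)
        )
    return counts


prices = [10, 50, 90, 100, 300]  # hypothetical price values
counts = count_intervals(prices, ["[10,80]", "(90,300]"])
print(counts)
```

Note that the price 90 is not counted in (90,300] because the opening parenthesis excludes the lower bound.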

6. Pivot faceting:

Pivot faceting allows you to break down the values by category with additional sub-categories, so you can analyze your data in multiple dimensions. Imagine that in our store we have products divided into categories. In addition to that, we store information about the stock of the items. Now we want to show how many of the products in each category are in stock and how many are missing. This calculation can be done using pivot faceting.

Now let’s index the following example data:

<add>
<doc>
<field name="id">1</field>
<field name="name">Book 1</field>
<field name="category">books</field>
<field name="stock">true</field>
</doc>
<doc>
<field name="id">2</field>
<field name="name">Book 2</field>
<field name="category">books</field>
<field name="stock">true</field>
</doc>
<doc>
<field name="id">3</field>
<field name="name">Workbook 1</field>
<field name="category">workbooks</field>
<field name="stock">false</field>
</doc>
<doc>
<field name="id">4</field>
<field name="name">Workbook 2</field>
<field name="category">workbooks</field>
<field name="stock">true</field>
</doc>
</add>

Execute the following query (wt=json requests the JSON response format shown in the output):

http://localhost:8983/solr/select?q=*:*&rows=0&wt=json&facet=true&facet.pivot=category,stock

Output:

"facet_counts":{
    "facet_queries":{},
    "facet_fields":{},
    "facet_dates":{},
    "facet_ranges":{},
    "facet_pivot":{
      "category,stock":[{
          "field":"category",
          "value":"books",
          "count":2,
          "pivot":[{
              "field":"stock",
              "value":true,
              "count":2}
              ]},
          {"field":"category",
          "value":"workbooks",
          "count":2,
          "pivot":[{
              "field":"stock",
              "value":true,
              "count":1},
             {"field":"stock",
              "value":false,
              "count":1}
              ]}]}}
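
To see where these numbers come from, here is a short Python sketch that reproduces the pivot counts from the four example documents indexed above. It only mimics the counting in memory and is not how Solr computes pivots; the DOCS list and the pivot_counts helper are written just for this illustration.

```python
from collections import Counter, defaultdict

# The four example documents from this section, reduced to the pivot fields.
DOCS = [
    {"id": "1", "category": "books", "stock": True},
    {"id": "2", "category": "books", "stock": True},
    {"id": "3", "category": "workbooks", "stock": False},
    {"id": "4", "category": "workbooks", "stock": True},
]


def pivot_counts(docs, outer, inner):
    """Count inner-field values separately for each outer-field value."""
    pivot = defaultdict(Counter)
    for doc in docs:
        pivot[doc[outer]][doc[inner]] += 1
    return {key: dict(counter) for key, counter in pivot.items()}


result = pivot_counts(DOCS, "category", "stock")
print(result)
```

Both books are in stock, while the workbooks split into one in stock and one out of stock, which matches the nested counts in the pivot output.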

 
