Atomic Updates in Solr

Introduction

In Solr, once data has been indexed, changes to the schema or configuration files require a complete re-index of the data. Likewise, adding new fields to the index requires recreating the Solr documents for all input data and indexing them again. To overcome this, Apache Solr 4.0 introduced atomic updates, a feature that lets you update individual fields of a document. A major motivation for atomic updates is being able to change part of a document without regenerating the entire document.


Pre-requisites
  • You must be using Solr 4.0 or later. Older versions do not support atomic updates.
  • All the fields in your schema.xml file must be stored (stored="true"), for example:


<field name="tm_text" type="text" indexed="true" stored="true" multiValued="false" />
<field name="rd_date" type="datetime" indexed="true" stored="true" multiValued="false" />
<field name="ai_indicator" type="lowercase" indexed="true" stored="true" multiValued="false" />


  • All data must already be indexed. Atomic updates only modify documents that were indexed previously.
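
The stored-field prerequisite can be checked before relying on atomic updates. The following is a minimal sketch, not part of Solr: the class name StoredFieldCheck and the field hypothetical_unstored are made up for illustration. It only inspects the explicit stored attribute on <field> elements, so it ignores defaults inherited from the field type, dynamicField entries, and copyField destinations.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class StoredFieldCheck {

    /** Returns the names of <field> elements whose stored attribute is not "true". */
    static List<String> nonStoredFields(String schemaXml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = dom.getElementsByTagName("field");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                // A missing attribute comes back as "", so it is reported as well.
                if (!"true".equals(field.getAttribute("stored"))) {
                    result.add(field.getAttribute("name"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
    }

    public static void main(String[] args) {
        // Two fields from the schema above plus one hypothetical non-stored field.
        String schema = "<schema>"
                + "<field name='tm_text' type='text' indexed='true' stored='true'/>"
                + "<field name='rd_date' type='datetime' indexed='true' stored='true'/>"
                + "<field name='hypothetical_unstored' type='text' indexed='true' stored='false'/>"
                + "</schema>";
        System.out.println("Fields that are not stored: " + nonStoredFields(schema));
    }
}
```

Any field this reports would lose its value when another field of the same document is atomically updated, so it is worth fixing the schema and re-indexing first.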


SolrJ Modifiers:

Once Solr is up and running, you can use several modifiers that atomically update the values of a document:

  • set – sets or replaces a particular value, or removes the value if null is specified as the new value.
  • add – adds an additional value to a list.
  • remove – removes a value from a list.
  • inc – increments a numeric value by a specific amount.
  • removeregex – removes from a list the values that match the given Java regular expression.
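
To make the effect of each modifier concrete, here is a small in-memory model in plain Java. It is a simplified sketch and not Solr's implementation: the class ModifierModel and its methods are hypothetical, the document is just a Map, multi-valued fields are assumed to be lists, inc is limited to whole numbers, and the regular expression is assumed to match the whole value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class ModifierModel {

    /** Applies one modifier to one field of an in-memory document. */
    static void apply(Map<String, Object> doc, String field, String modifier, Object value) {
        switch (modifier) {
            case "set":
                // A null value removes the field; anything else replaces it.
                if (value == null) doc.remove(field); else doc.put(field, value);
                break;
            case "add":
                values(doc, field).add(value);
                break;
            case "remove":
                values(doc, field).removeIf(v -> v.equals(value));
                break;
            case "removeregex":
                Pattern pattern = Pattern.compile((String) value);
                values(doc, field).removeIf(v -> pattern.matcher(v.toString()).matches());
                break;
            case "inc":
                long current = ((Number) doc.getOrDefault(field, 0L)).longValue();
                doc.put(field, current + ((Number) value).longValue());
                break;
            default:
                throw new IllegalArgumentException("Unknown modifier: " + modifier);
        }
    }

    /** Returns the field's value list, creating it when the field is absent. */
    @SuppressWarnings("unchecked")
    static List<Object> values(Map<String, Object> doc, String field) {
        return (List<Object>) doc.computeIfAbsent(field, k -> new ArrayList<Object>());
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "doc1");
        apply(doc, "tags", "add", "solr");
        apply(doc, "tags", "add", "tmp_1");
        apply(doc, "tags", "removeregex", "tmp_.*"); // drops "tmp_1", keeps "solr"
        apply(doc, "views", "inc", 5);               // absent field starts from 0 in this model
        apply(doc, "title", "set", "Atomic Updates");
        apply(doc, "title", "set", null);            // removes the title again
        System.out.println(doc);
    }
}
```

In a real update the modifier name and value travel to Solr as a single-entry map attached to the field, and Solr applies the change on the server side.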

Atomic Updates using Solr’s Java client


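import java.util.HashMap;
import java.util.Map;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Create the SolrJ client. The URL normally also names the core or collection.
// On newer SolrJ versions, prefer new HttpSolrClient.Builder(url).build().
HttpSolrClient solrServer = new HttpSolrClient("http://localhost:8983/solr");

// solrDocument is a previously retrieved document; fileModDate holds the new value.
String solrID = (String) solrDocument.getFieldValue("id");
SolrInputDocument solrDocToIndex = new SolrInputDocument(); // carries only the partial update
solrDocToIndex.addField("id", solrID); // the unique key identifies the document to update
Map<String, Object> partialUpdate = new HashMap<>();
partialUpdate.put("set", fileModDate); // "set" replaces the current field value
solrDocToIndex.addField("srcfile_modification_date", partialUpdate);
solrServer.add(solrDocToIndex); // send the update to Solr
solrServer.commit(); // make the change visible (not needed if autoCommit is configured)
solrServer.close();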
 
Conclusion:

Using atomic updates, we can skip data parsing and full Solr document creation, which reduces processing time considerably. When rules change or new fields are added, only the affected fields need to be updated in the Solr index instead of re-indexing everything.
