Dikshant Shahi's Blog / Share Your Thoughts
  • Solr vs ElasticSearch

    By: Dikshant Shahi | July 17, 2015

    “Which one should I choose, Solr or ElasticSearch?” This question is asked frequently by anyone who is building a search engine or learning one of the two. And why shouldn’t they ask? After all, both Solr

  • Solr 5.3: Execute SQL queries

    By: Dikshant Shahi | June 19, 2015

    SQL is the most widely used language for querying data and is the natural choice of data analysts. Its acceptance is so wide that projects like Apache Hive were born for the purpose, which

  • Solr HyperLogLog

    By: Dikshant Shahi | June 12, 2015

    Solr 5.2 introduces HyperLogLog, a probabilistic approach for counting distinct values. Solr already had a provision to count distinct values using the unique facet function or the countDistinct LocalParam in the stats component. But this approach doesn’t scale well, as

  • Solr: Search using JSON Request API

    By: Dikshant Shahi | June 5, 2015

    Solr 5.1 introduces the new JSON request API! Solr supports multiple query response formats such as XML, JSON, CSV, Velocity UI, etc., but to make a request you always had to provide HTTP request parameters. Below

  • Solr Core Discovery

    By: Dikshant Shahi | May 15, 2015

    How does Solr discover a core? If you are a seasoned Solr developer and have yet to migrate to the latest releases, your answer might be ‘by registering the core by adding the <cores> element in solr.xml’. Yes, this

  • Solr: Backed up? Now you can restore soon!

    By: Dikshant Shahi | May 8, 2015

    One of the lesser-known but cool features of ReplicationHandler is support for index backup. You must have used ReplicationHandler in your project for replicating the index from master to slave instances. If you want to

  • Solr Optimistic Concurrency Unlocked!

    By: Dikshant Shahi | April 30, 2015

    If you have multiple clients updating documents, it’s really critical to ensure that a newer version of a document is never overwritten by an older version. To address this problem, what you need is concurrency control,

  • Apache Solr: data processing pipeline

    By: Dikshant Shahi | April 23, 2015

    Index-time processing of data is crucial for developing a search engine, and how critical it is depends on how sophisticated your search engine is. Even if your search engine is very basic, you still need to perform