Shalin Says...

A couple of weeks back, Michael McCandless, an Apache Lucene committer and PMC member, started a discussion on factoring out a shared, standalone analysis package for Lucene, Solr and Nutch. During that discussion, Yonik Seeley, the creator of Solr, proposed merging the development of Lucene and Solr. After intense discussions and multiple rounds of voting, the following changes are being put into effect:

  • Merging the developer mailing lists into a single list.
  • Merging the set of committers.
  • When any change is committed (to a module that “belongs to” Solr or to Lucene), all tests must pass.
  • Release details will be decided by the dev community, but Lucene may release without Solr.
  • Modularize the sources: pull things out of Lucene’s core (break out query parser, move all core queries & analyzers under their contrib counterparts), pull things out of Solr’s core (analyzers, queries).

The following things do not change:

  • Besides modularizing (above), the source code would remain factored into separate dirs/modules the way it is now.
  • Issue tracking remains separate (SOLR-XXX and LUCENE-XXX issues).
  • The user mailing lists remain separate.
  • Web sites remain separate.
  • Release artifacts/jars remain separate.

So what does this mean for Lucene/Solr users? Nothing much, really, except that you should see tighter coordination between Lucene and Solr development. New Lucene features should reach Solr faster, and releases should be more frequent. Solr features may also be made available to Lucene users who do not want to set up Solr or use its RESTy APIs.

Solr has already been upgraded to use Lucene trunk (in branches/solr), and that branch should soon become the new Solr trunk. There is talk of reorganizing the source structure to better fit the new model. Things are moving fast!

Personally, I feel that this merge is a good thing for both Lucene and Solr:

  • Solr users get the latest Lucene improvements faster and releases get streamlined.
  • Lucene users get access to Solr features such as faceting.
  • The in-sync trunk allows new features to make their way into the right place (Lucene vs Solr) more easily and duplication is minimized.
  • Bugs are caught earlier by the huge combined test suite.
  • A larger pool of committers means more ideas and more hands available to both projects.
  • Other Lucene based projects can benefit too because many Solr features will be made available through Java APIs.

There are a couple of things still to be worked out. For example, we need to decide where the integrated sources should live and whether or not to sync Solr’s version number with Lucene’s. All this will take some time, but I am confident that our combined community will manage the transition well.

Posted by shalinmangar