Monday, August 29, 2011

Indexing MongoDB Data for Solr

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.
MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented data store. 

I was happy using MongoDB and my very own search engine written using/extending lucene, until the trunks for Solr and Lucene were merged. This merge translated to Solr using the same release of lucene that I was using, unlike the past when there was some disconnect between the two. I realized that a lot of what I was trying to build was available through Solr.
Though Solr is used by a lot of organizations (which can be found here) and I'm sure that at least a few of them using Mongo, for some reason there was/is no straight forward out of the box import handler for data stored in MongoDB.

This made me search for a framework/module/plug to do the same, but in vain.

All said and done, here's a way that I finally was able to index my mongodb data into Solr.
I've used SolrJ to access my Solr instance and a mongo connector to connect to Mongo. Having written my own sweet layer that has access to both the elements of the app, I have been able to inject data as required.

--snip--
public SolrServer getSolrServer(String solrHost, String solrPort) throws MalformedURLException {
    String urlString = "http://"+solrHost+":"+solrPort+"/solr";
    return new CommonsHttpSolrServer(urlString);
}

--/snip--

Fire the mongo query, iterate and add to the index

--snip--

SolrServer server = getSolrServer(..); //Get a server instance

DBCursor curr = ..; //Fire query @ mongo, get the cursor

while (curr.hasNext()) { //iterate over the result set
        BasicDBObject record = (BasicDBObject) curr.next();
        //Do some magic, get a document bean       
        server.addBean(doc);
}
server.commit();
--/snip--

This will get you started on your track to index mongo data into a running Solr instance.
Also, remember to configure Solr correctly for this to run smooth.

Download Resources:

1 comment:

Nethra Muniraju said...

Hi,

It will be very helpful if you can post the entire code here.

Thanks,
Neha