solr - an R client for Apache Solr

Scott Chamberlain (@recology_ / @ropensci)

UC Berkeley / rOpenSci

Supported by:

License: CC-BY 4.0 - You are free to copy, share, adapt, remix, photograph, film, broadcast, blog, live-blog, or post video of this presentation, provided that you attribute the work to its author and respect the rights and licenses associated with its components.


A perfect combination


Data is increasingly on the web

API: Application Programming Interface


Reproducibly plug data from the web into research workflows



The Solr R client:

Solr in R

  • Use cases for Solr in R

  • solr R client - Search

  • solr R client - Server management

Solr v5

We're developing against Solr v5

So some things may not work with older versions

Use case: Data exploration

R has all the tools you'll need for data manipulation, visualization, and statistics

Access to large volumes of data via Solr makes this a powerful combination

Use case: Data exploration

The data.frame is the most common data structure in R, and the easiest to work with


Use case: Data exploration

In the solr R client, a data.frame is the default output from search

Easy downstream use for:

  • visualization
  • statistics
  • modelling
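
To make the data.frame point concrete, here is a minimal base R sketch. It assumes the documents in a Solr response have already been parsed into R lists; the field names and the docs_to_df() helper are hypothetical and are not the solr package's own code.

```r
# Hypothetical documents, shaped like the "docs" array of a Solr JSON
# response after parsing into R lists (field names are made up).
docs <- list(
  list(id = "doc1", title = "Solr in R", year = 2015),
  list(id = "doc2", title = "Data exploration", year = 2014),
  list(id = "doc3", title = "Server management", year = 2015)
)

# Turn each document into a one-row data.frame, then stack the rows.
# Assumes every document has the same single-valued fields.
docs_to_df <- function(docs) {
  rows <- lapply(docs, function(d) as.data.frame(d, stringsAsFactors = FALSE))
  do.call(rbind, rows)
}

df <- docs_to_df(docs)

# Ordinary data.frame tools now apply directly.
recent <- df[df$year == 2015, ]   # subset rows
counts <- table(df$year)          # count documents per year
```

Once results are in this shape, they can go straight into plotting, summary, or modelling functions without further conversion.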

Use case: Easy R client libraries

Many public web APIs use Solr

Building an R client for those APIs is easy with the solr R client
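
Part of why this is easy: a Solr search is an HTTP request to a select endpoint with query parameters. The base R sketch below only builds such a URL. solr_select_url() is a hypothetical helper for illustration, not a function in the solr package, and the host and collection name are assumed.

```r
# Build a Solr select URL from a base URL, a collection name, and
# query parameters. Hypothetical helper, for illustration only.
solr_select_url <- function(base, collection, ...) {
  # Always ask for JSON output in addition to the caller's parameters.
  params <- list(..., wt = "json")
  # Percent-encode each value so characters like ":" are safe in a URL.
  encoded <- vapply(params, function(v) {
    utils::URLencode(as.character(v), reserved = TRUE)
  }, character(1))
  query <- paste(names(encoded), encoded, sep = "=", collapse = "&")
  sprintf("%s/solr/%s/select?%s", base, collection, query)
}

# Assumed local server and collection name.
url <- solr_select_url("http://localhost:8983", "gettingstarted",
                       q = "title:ecology", rows = 5)
```

A client library for a Solr-backed API mostly needs to fill in the base URL and collection, then pass the user's parameters through.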


Use case: Solr Server Management

You probably don't want to do all server management in R, but tasks such as

  • creating/deleting a collection/core

  • adding/deleting/updating documents from files and R objects

are, or will be, easy in the solr R client

First, let's connect - solr_connect()

You can also set:

  • error verbosity
  • whether URLs are printed
  • a proxy

Additional search functions

  • solr_mlt() - more like this search

  • solr_group() - group search

  • solr_highlight() - highlight search

  • solr_stats() - stats search

Server management

Server management functions

  • core_*() - manage cores

  • collection_*() - manage collections

  • add_*() - add documents from R objects

  • solr_get() - get documents by id

  • update_*() - add documents from files

  • delete_*() - delete documents

  • config_*() - set/unset config params

Three update_*() functions:

  • update_json()
  • update_xml()
  • update_csv()

- Input: files

- JSON and XML versions can include add & delete for specific documents
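
As an illustration of add and delete in one file, here is a base R sketch that writes a JSON file using Solr's JSON update command format. The document ids and fields are hypothetical, and the update_json() call itself is left out because its arguments are not shown here.

```r
# A Solr JSON update body with one add command and one delete command.
# Ids and field names are hypothetical.
update_body <- '{
  "add":    {"doc": {"id": "doc4", "title": "A new document"}},
  "delete": {"id": "doc1"}
}'

# Write the body to a temporary .json file, the kind of file input
# that the update_*() functions are described as taking.
json_file <- tempfile(fileext = ".json")
writeLines(update_body, json_file)
```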

In the works...

  • Inspect configuration

  • Write configuration

  • Compatibility with older Solr versions

  • Support spatial search

  • Plugin handler (if possible)

In closing...

Would love your feedback
kick the tires
let me know what could be better

solr R client on GitHub:


let's talk


I'll be around tomorrow if you want to meet

rOpenSci on the web:

rOpenSci discussion forum:

This talk on the web:

Made w/ reveal.js

Icons by: FontAwesome v4.4.0