solr - an R client for Apache Solr
				
				Scott Chamberlain (@recology_ / @ropensci)
					
UC Berkeley / rOpenSci
				
				
Supported by:
				 
		
		
		
			
License: CC-BY 4.0 - You are free to copy, share, adapt, or remix, photograph, film, or broadcast, blog, live-blog, or post video of this presentation, provided that you attribute the work to its author and respect the rights and licenses associated with its components.
		https://creativecommons.org/licenses/by/4.0/
		
		
		
		
		
			A perfect combination
			
			
				 
				  
				 
			
			
			Data is increasingly on the web 
		
		
			
			API: Application Programming Interface
			
			  
		
		
			
			 Reproducibly plug data from the web into research workflows
		
		
		
		
		
			Solr in R
			
			
				- Use cases for Solr in R
				- solr R client - Search
				- solr R client - Server management
		
		
			Solr v5
			
			We're developing against Solr v5
			
			So some things may not work with older versions
		
		
			Use case: Data exploration
			
			R has all the tools you'll need for data manipulation, visualization, and statistics
			
			Fast access to large datasets via Solr makes this a powerful combination
		
		
			Use case: Data exploration
			
			The data.frame is the most common data structure in R, and the easiest to work with
			
			
				 
			
		
		
			Use case: Data exploration
			
			In the solr R client, search returns a data.frame by default
			
			This makes downstream use easy for:
			
				
					- visualization
					- statistics
					- modelling
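
			Why a data.frame helps downstream: a minimal base R sketch, not the package's internals. The documents, their field names, and the helper docs_to_df() are all made up for illustration, and the sketch assumes every document has the same single-valued fields.

```r
# Hypothetical documents, shaped like the "docs" array of a Solr JSON response
# after it has been parsed into R lists (the field names are made up).
docs <- list(
  list(id = "doc-1", title = "Ecology of soil fungi", year = 2012),
  list(id = "doc-2", title = "Pollinator networks", year = 2014),
  list(id = "doc-3", title = "Marine food webs", year = 2014)
)

# Turn each document into a one-row data.frame, then stack the rows.
# Assumes every document carries the same single-valued fields.
docs_to_df <- function(docs) {
  rows <- lapply(docs, function(d) as.data.frame(d, stringsAsFactors = FALSE))
  do.call(rbind, rows)
}

df <- docs_to_df(docs)

# Once the results are a data.frame, ordinary R tools apply directly.
per_year <- table(df$year)        # number of documents per year
recent <- df[df$year >= 2014, ]   # subset with base indexing
```

			From here, df can go straight into plotting, summary statistics, or model fitting.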
		
		
		
		
			Use case: Easy R client libraries
			
			Many public web APIs are built on Solr
			Writing an R client for them is easy with the solr R client
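
			What such a client wraps: a rough base R sketch of building a Solr search request. The helper solr_select_url() and the core URL are hypothetical and not part of the solr R client; q, fl, rows, and wt are standard Solr query parameters.

```r
# Build a Solr /select URL from a named list of query parameters.
# Hypothetical helper for illustration only.
solr_select_url <- function(base, params) {
  # Percent-encode each value so characters like ":" and "," are safe in a URL.
  values <- vapply(
    params,
    function(p) utils::URLencode(as.character(p), reserved = TRUE),
    character(1)
  )
  query <- paste(names(params), values, sep = "=", collapse = "&")
  paste0(base, "/select?", query)
}

# Hypothetical local core; the field names in q and fl are made up.
url <- solr_select_url(
  "http://localhost:8983/solr/mycore",
  list(q = "title:ecology", fl = "id,title", rows = 5, wt = "json")
)
```

			An API-specific R client mostly needs to supply the base URL and sensible defaults for these parameters.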
		
		
			Use case: Easy R client libraries
			
			Examples:
			
			
				
			
		
		
		
			Use case: Solr Server Management
			
			You probably don't want to do all server management in R, but tasks such as
			
			
				- creating or deleting a collection/core
				- adding, deleting, or updating documents from files and R objects
			
			are, or will be, easy in the solr R client
		
		
		
			First, let's connect - solr_connect()
			
			You can also toggle:
			
			
				- error verbosity
				- whether URLs are printed
				- whether a proxy is used
		
		
		
		
		
			Additional search functions
			
			
				- solr_mlt() - more like this search
				- solr_group() - group search
				- solr_highlight() - highlight search
				- solr_stats() - stats search
		
		
			Server management functions
			
				- core_*() - manage cores
				- collection_*() - manage collections
				- add_*() - add documents from R objects
				- solr_get() - get documents by id
				- update_*() - add documents from files
				- delete_*() - delete documents
				- config_*() - set/unset config params
		
		
			Three update_*() functions:
		
		
		
			- update_json()
- update_xml()
- update_csv()
		
		- Input: files
		- JSON and XML versions can include add & delete for specific documents
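
		A file mixing add and delete might look like this: a small base R sketch that writes a Solr JSON update message to a temporary file. The ids and field names are made up, and the sketch only prepares the input file; it does not call the solr R client.

```r
# A Solr JSON update message that adds one document and deletes another by id.
# The ids and field names are made up for illustration.
update_body <- c(
  '{',
  '  "add": {"doc": {"id": "doc-1", "title": "Ecology of soil fungi"}},',
  '  "delete": {"id": "doc-2"},',
  '  "commit": {}',
  '}'
)

# Write the message to a temporary .json file that an update function could read.
update_file <- tempfile(fileext = ".json")
writeLines(update_body, update_file)
```

		A file like this is the kind of input update_json() is described as taking; the exact arguments of that call are not shown here.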
		
		
		
			In the works...
			
			
				
					- Inspect configuration
					- Write configuration
					- Compatibility with older Solr versions
					- Support spatial search
					- Plugin handler (if possible)
		
		
			In closing...
			
			I'd love your feedback: kick the tires and let me know what could be better
			
			
		
		
		
			Let's talk
			
			
			
			I'll be around tomorrow if you want to meet