Working with Scholarly Text w/ rOpenSci Tools
Scott Chamberlain (@sckottie)
UC Berkeley / rOpenSci
What kinds of questions can we ask?
Does the number of authors per article increase through time?
Do p-values on average differ by impact factor?
How does the length of methods sections change through time?
How does the use of the word ___ vary through time?
How does code sharing vary by journal/discipline/etc.?
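A rough sketch of how the word-use question above might be approached once full text is in hand. This is plain Python, not an rOpenSci API; the function name `word_rate_by_year`, the `(year, text)` input shape, and the two-article toy corpus are all assumptions made up for illustration.

```python
import re
from collections import defaultdict


def word_rate_by_year(articles, word):
    """Occurrences of `word` per 1,000 tokens, grouped by publication year.

    `articles` is an iterable of (year, text) pairs (hypothetical input shape).
    """
    hits = defaultdict(int)    # year -> count of the target word
    totals = defaultdict(int)  # year -> total token count
    target = word.lower()
    for year, text in articles:
        # Naive tokenizer: lowercase runs of letters, digits, apostrophes.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        totals[year] += len(tokens)
        hits[year] += sum(1 for t in tokens if t == target)
    # Normalize by corpus size so years with more text are comparable.
    return {y: 1000 * hits[y] / totals[y] for y in sorted(totals) if totals[y] > 0}


# Toy corpus, invented for this sketch (not real article text).
articles = [
    (2005, "We fit a linear model to the data."),
    (2015, "We fit a Bayesian model and a second Bayesian model to the data."),
]
rates = word_rate_by_year(articles, "bayesian")
```

Normalizing by token count matters here: raw counts would mostly track how much text was published each year rather than how often the word is used.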
scholarly text data flow
how open will publishers be moving forward?
metadata, including references
OCC: Open Citations Corpus
As of March 12, 2018, the OCC has ingested the references from 302,758 citing bibliographic resources
and contains information about 12.8 million citation links to 6.5 million cited resources.