University of Texas Libraries

Linguistics

Important to Know: Text & Data Mining of Library Subscription Databases

For researchers interested in mining the Libraries' e-resources, the following should help with planning.

Seek help; don't try to scrape the content yourself! Many publishers and vendors restrict automated data scraping and large-scale access to their content, and provide access only in ways that preserve their copyright. We can help you contact publishers in order to gain access. Email me to get started!

It takes time. We negotiate on a case-by-case basis, so build time into your research schedule for acquiring the target data!

Negotiations with the publisher may be the responsibility of the researcher. Librarians can help you make initial contact with vendors, but sometimes the responsibility to communicate with the vendor and acquire the data falls to the individual researcher. In some cases vendors might ask that an addendum to the Libraries' license be executed, and we are happy to work with them in these cases.

It might cost you $$ and resources. A few of the Libraries' contracts include text or data mining (TDM) access, along with a cost-recovery fee for data preparation and delivery on physical media. We will advocate on researchers' behalf, wherever possible, to include TDM access in our licenses. Depending on the vendor, access can cost thousands of dollars. Writing expected costs into your grant proposals is a good way to ensure access.

For some researchers, Open Access journals and repositories can be a good alternative. For a great list of free sources, as well as tools for analysis, see this TDM guide from Carnegie Mellon Libraries.

This work is licensed under a Creative Commons Attribution-NonCommercial 2.0 Generic License.