University of Texas Libraries

Text Data Mining

Licensed Library Datasets & Resources

If you want to mine content from a library database, you must follow the license agreement that the library negotiated with the database owner. Many library resources do not allow automated or systematic downloading of articles or creation of corpuses for textual analysis. Using scripts or software to download content can result in a loss of access for the entire campus, as well as suspension of individual accounts. Please contact your subject librarian before beginning a project so we can make sure we're complying with our contracts. Below are a few resources that allow text data mining. There are others.

A note about NexisUni

NexisUni (which has news and legal information) has access limits:

  • daily limit of 2,500 documents for each individual at UT with a max of 100 at a time
  • there is an API that allows for bulk data downloads, but this requires an additional charge (data likely available in JSON)
  • it is possible to purchase API access to specific subsets of data to reduce overall cost
  • price is $10,000 to $15,000 annually for full API access to all NexisUni products
  • subset API access might be ~$5,000 to $6,000
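To get a rough sense of what the document limits above mean for a project timeline, the arithmetic can be sketched in a few lines of Python. This is only a planning aid, not a download tool: the function name plan_downloads and the example corpus size of 12,000 documents are illustrative assumptions, and the sketch assumes the 2,500-per-day and 100-at-a-time limits apply exactly as stated.

```python
import math

# Limits as stated in this guide; treat them as assumptions and confirm with
# your subject librarian before planning a project around them.
DAILY_LIMIT = 2500  # documents per individual per day
BATCH_LIMIT = 100   # documents per download

def plan_downloads(total_documents: int) -> dict:
    """Estimate how many batches and days a corpus of a given size needs.

    Hypothetical helper for planning only; it performs no downloading.
    """
    if total_documents < 0:
        raise ValueError("total_documents must be non-negative")
    return {
        # Each download holds at most BATCH_LIMIT documents.
        "batches": math.ceil(total_documents / BATCH_LIMIT),
        # Full-size batches that fit inside one day's allowance.
        "batches_per_day": DAILY_LIMIT // BATCH_LIMIT,
        # Minimum number of days at the daily cap.
        "days": math.ceil(total_documents / DAILY_LIMIT),
    }

# Example: a hypothetical corpus of 12,000 documents.
plan = plan_downloads(12000)
print(plan)
```

Under these assumptions, a corpus in the tens of thousands of documents takes weeks of daily sessions, which is the point at which the paid API options may be worth discussing.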

Researchers who have funding must speak with our Relationship Manager to arrange their access.

Librarians can contact SRD to find out who that person is.

 

Questions?

For help, please contact the librarian for your subject area. We have a guide to library specialists by subject.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.