You may wish to check whether a dataset or corpus is already available before creating your own. This can be useful if you don't have time to extract your own data, or if you'd like to practice with a real dataset. What follows is a list of free and open datasets.
If you are interested in licensed content, please check UT's Linguistics Research Guide. If you want to use library databases for text mining, please go to this page. You will have to apply and possibly pay fees to a publisher to access published content.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.