University of Texas Libraries

Physical & Thermodynamic Properties

Sources and Quality of Data

The Challenge of Data

Searching for thermodynamic and physical property data of chemical substances can be time-consuming and frustrating. It truly is like searching for a very specific needle in a huge and utterly disorganized mountain of needles -- and the one you're looking for may not even exist. One of the reasons why people tend to take shortcuts and make potentially serious mistakes with data is that the recommended mechanisms for finding reliable data are imperfect, never comprehensive, and often mysterious. Defaulting to more familiar web search engines can be a waste of time as well as risky, because they are not really appropriate tools for this task.

It is usually not difficult to locate reliable data for standard gases, small organic molecules, and common inorganic substances in pure form, along with their aqueous solutions and well-known binary systems. Standard reference tools like the CRC Handbook, the NIST WebBook, and DIPPR can answer many of these basic questions. But if you are looking for a complex or proprietary material -- things like polymers, drugs, biological molecules, exotic molecules, composites, newly synthesized compounds, or commercial products -- published data often don't exist. It can also be difficult to find data covering non-standard conditions such as extreme temperatures and pressures; extrapolation of known data to such conditions may not be reliable. Engineers often rely on property values calculated by estimation programs. These are useful within their stated limits, but are outside the scope of these pages, which focus on finding published literature values.

High-quality data can be found in certain online databases, but these almost always require a subscription or a fee to use. Many printed secondary data compilations have been published over the years, of varying quality and scope. They are all arranged differently, sometimes incomprehensibly. They cover different types of compounds and properties, and they tend to be scattered in a library, making them hard to remember and locate. Data reported in the primary journal and technical report literature can be even more elusive.

About Data Quality

While people often refer to scientific data points as "facts," one scientist has stated that "there are no facts - just measurements embedded within assumptions." (1)

The accuracy of data published in the primary literature (e.g., in peer-reviewed journals) should not be assumed. Reviewers rarely examine such data closely. Experimental and measurement errors can occur. Authors can be sloppy in their use of units and symbols, and errors and typos creep in during the editing process. The pressure to keep articles short and omit "unnecessary" tables and graphs means that some useful data do not appear in published articles at all. If that's not bad enough, errors that make it into the literature can be propagated elsewhere almost indefinitely, creating confusion and uncertainty about even basic properties.

Most but not all data "handbooks" are secondary sources, meaning they are compilations of data previously reported in the primary literature. The reliability of a secondary source depends both on the quality of the original data and on the care taken in compiling and evaluating them. Most compilations provide literature references for the data. Those that don't include such references should be used with caution. The age of the data is also relevant. The enthalpy of a compound is the same today as it was in 1905 - what has changed is the precision of measurement and estimation methods. Older data may be perfectly valid, but they should be compared to more recent values if they can be found.

The same caveats apply to data you might find on the Internet. A value found on a college lab course web page, in an MSDS, or in Wikipedia (2) cannot be treated the same as a value contained in a NIST database. The bottom line is that all sources of data should be viewed with a critical eye. Ask these questions: Is the source cited? When was this work done, and by whom? Were the data determined experimentally or derived by calculation (estimated)? What methods, experimental parameters, or special conditions applied? If you can't answer these questions, the data probably should not be trusted.
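If you keep collected values in a script or notebook, the questions above can be recorded alongside each number. The following is only a minimal sketch in Python: the `PropertyValue` record, its field names, and the `unanswered_questions` helper are hypothetical names chosen for illustration, not part of any standard tool, and the sample value is a placeholder rather than a real measurement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PropertyValue:
    """One reported value plus the provenance the checklist asks about."""
    name: str
    value: float
    units: str
    source_citation: Optional[str] = None  # Is the source cited?
    year: Optional[int] = None             # When was this work done?
    authors: Optional[str] = None          # ...and by whom?
    method: Optional[str] = None           # "experimental" or "estimated"
    conditions: Optional[str] = None       # methods, parameters, special conditions


def unanswered_questions(record: PropertyValue) -> List[str]:
    """Return the checklist questions this record cannot answer."""
    checks = [
        (record.source_citation, "Is the source cited?"),
        (record.year, "When was this work done?"),
        (record.authors, "By whom was this work done?"),
        (record.method, "Was the value measured or estimated?"),
        (record.conditions, "What methods or conditions applied?"),
    ]
    # A field counts as unanswered if it was left empty.
    return [question for answer, question in checks if answer in (None, "")]


# Hypothetical record: a bare number copied from an uncited web page.
# The numeric value is a placeholder, not a real measurement.
uncited = PropertyValue(name="example property", value=1.0, units="kJ/mol")
missing = unanswered_questions(uncited)
print(f"{len(missing)} of 5 questions unanswered")
```

A record with every question unanswered, like the one above, is the kind of value that probably should not be trusted. Filling in the fields does not make a value correct; it only makes it possible to judge.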

The term critically evaluated - while occasionally overused - is a useful one to look for in secondary sources. This usually implies that someone has evaluated the data and procedures for internal consistency, and, in cases where conflicting values have been reported, established a set of recommended values. It does not mean that experts have repeated and verified the measurements themselves. Touloukian provides a useful overview of critical evaluation, stating that "while 'critical analysis' always sets a 'level of confidence' for the recommended values, there is no implication whatsoever of high accuracy or precision in these values." (3) Most primary literature and secondary compilations are uncritical, however.
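As a rough illustration of what "conflicting values" means in practice, the sketch below flags a property whose reported values disagree by more than a chosen tolerance. It is not critical evaluation, which requires expert judgment about methods and procedures; it only shows a first screening step. The function names, the 1% default tolerance, and the sample numbers are all assumptions made for illustration.

```python
from typing import Dict


def relative_spread(values: Dict[str, float]) -> float:
    """Spread (max - min) of the reported values as a fraction of their mean."""
    if not values:
        raise ValueError("at least one reported value is required")
    numbers = list(values.values())
    mean = sum(numbers) / len(numbers)
    if mean == 0:
        raise ValueError("relative spread is undefined for a zero mean")
    return (max(numbers) - min(numbers)) / abs(mean)


def sources_conflict(values: Dict[str, float], tolerance: float = 0.01) -> bool:
    """True if the reported values differ by more than the tolerance."""
    return relative_spread(values) > tolerance


# Placeholder numbers for one property from three hypothetical sources,
# all in the same units. These are not real data.
reported = {"source A": 100.0, "source B": 100.2, "source C": 104.0}

if sources_conflict(reported):
    print("Reported values conflict; look for a critically evaluated source.")
```

A flagged conflict says nothing about which value is right. That is the question a critically evaluated compilation tries to answer with a recommended value.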



  1. Bradley, J.C. Blog post 6/22/2011.
  2. Walker, M.A. "Wikipedia as a resource for chemistry." ACS Symp. Ser. 1060, 2011, 79-92.
    [Wikipedia chembox: Wikipedia pages on common chemical compounds usually contain an infobox sidebar called a chembox, which provides data values for properties that are not likely to change. Data marked with a checkmark have been "verified" by WikiProject Chemicals. However, any data lacking a specific literature citation, and any value lacking the verification checkmark, should be verified independently before being trusted.]
  3. Touloukian, Y.S. "Twenty-five years of pioneering accomplishments by CINDAS--a retrospective review." Int. J. Thermophysics 2, 1981, 205-222.


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 2.0 Generic License.