Here are answers to some in-depth questions about using SciFinder. If you don't see your question here, contact the Chemistry Librarian.
(This page refers only to the "classic" SciFinder web interface, not to the newer SciFinder-n. See SciFinder-n Help Pages.)
The Explore by Research Topic option presents a single text box that looks like Google's, but SciFinder's search algorithm functions very differently from those of standard web search-engines. SciFinder uses a proprietary, complex (and somewhat mysterious) natural language query algorithm that breaks your query into a set of discrete concepts, searches them against the database indexes, and then presents you with a selection of result options.
SciFinder's topic search was designed to maximize retrieval, so some of its imprecision is intentional. It is not designed to be a highly precise search tool for expert searchers. This can sometimes cause frustration when you're trying to isolate records very carefully.
Here is a short list of important searching points unique to SciFinder.
No, but there are some workarounds. SciFinder's stemming algorithm is not obvious or particularly predictable, and it does not use truncation symbols such as an asterisk. Searches for particular words are "polluted" by unrelated and unwanted words that share the same stem. For example, a search for "reactivation" will also retrieve records containing "reaction", "reactivity", "reacts", and so forth, making the desired word essentially unfindable. One makeshift way around this problem is to do your initial search on the single problematic word or phrase, and select the "as entered" option from the results table. Then refine or analyze that set with additional words or searches. A more sophisticated method is to do the "as entered" search, save the results with the Save Set feature, and then combine it with further searches also saved as sets. (However, a set must contain fewer than 20,000 hits before it can be combined.)
The default sort for references is by the database accession number, which is essentially the same as reverse-date: newest documents are on top, the oldest at the bottom. (Records from Medline are always below records from CAPLUS, also sorted in reverse chronological order.) You can re-sort a results set by author, document title, citing references, or reverse publication year.
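If it helps to picture the default ordering, here is a minimal Python sketch. The records, the field names, and the integer accession numbers are all made up for illustration; it only mimics the behavior described above (CAPLUS before Medline, newest first within each group) and is not SciFinder's actual implementation.

```python
# Hypothetical reference records; accession numbers are made-up integers.
refs = [
    {"db": "MEDLINE", "accession": 300},
    {"db": "CAPLUS", "accession": 120},
    {"db": "CAPLUS", "accession": 450},
    {"db": "MEDLINE", "accession": 310},
]

# Assumed ranking: CAPLUS records are listed before Medline records.
DB_RANK = {"CAPLUS": 0, "MEDLINE": 1}

def default_sort(records):
    """Sort by database group, then by descending accession number."""
    return sorted(records, key=lambda r: (DB_RANK[r["db"]], -r["accession"]))

ordered = default_sort(refs)
print([(r["db"], r["accession"]) for r in ordered])
# [('CAPLUS', 450), ('CAPLUS', 120), ('MEDLINE', 310), ('MEDLINE', 300)]
```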
SciFinder does not sort reference answers by "relevance" algorithms. To increase relevance you can select the "closely associated with each other" set from the results histogram before displaying your answers, or you can use the Refine, Analyze, and Categorize functions to narrow and focus a results set.
Substance records are default-sorted by relevance, which should bring the closest matches to the top. It is also useful to re-sort by Number of References, which will bring the best-known and most-cited substances to the top. You can re-sort by reverse Registry Number, which means that the most recently reported substances - which usually have few or no references - are on top.
Not directly. SciFinder's retrieval is often very large because of the sheer size of the database and the number and variety of sources indexed. CAS makes no judgments about what the "best" literature might be. A couple of tricks to help you focus results:
Analyze is a tool that allows you to evaluate and review an answer set without having to browse all the entries one by one. You can view histograms of your set based on author name, organization/company (only the first author's organization is indexed), language, document type, journal title, date, CAS Registry Number, and so on. This can help you quickly identify individuals or organizations doing work or getting patents on your topic, the chemical substances most frequently mentioned, key publications, and trends over time. You can click on a histogram bar and view just those records.
Another way to analyze a reference result set is to use the Categorize feature. Choose a broad subject category from the list on the left, then select one or more CA index terms from the right side and view just those records. This technique utilizes the power of CAS' controlled indexing vocabulary and works very nicely for filtering large answer sets.
You can analyze a substance answer set by real-atom attachments, variable group and R-group, precision, and stereochemical precision. This helps narrow down a large substance answer set to zero in on structures of particular interest.
If you are searching by drawing a chemical reaction, you can analyze a reaction result set by catalyst, solvent, number of steps, product yield, as well as bibliographic data such as author, journal, year, document type, etc. You can also group reactions by transformation type, and re-sort by frequency.
Yes, but here's an important tip. Don't use the language limits on the main search page. Do the search without a language limit first, then from your results list open the Analyze tool (not the Refine tool). Pull up the histogram of languages for the set, and click the Show More button to see them all. Choose the language(s) you want AND also the Unavailable option, if present. Many pre-1967 CAPLUS records did not specify a language, so they show as "Unavailable" in the language field, and you'll inadvertently exclude them if you choose only specific languages. Records with LA=Unavailable are predominantly in English.
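To see why the Unavailable option matters, here is a small Python sketch. The records and field names are invented for illustration; the point is only that filtering on "English" alone silently drops the older records whose language was never specified.

```python
# Hypothetical records; the field names and values are illustrative only.
records = [
    {"id": 1, "year": 1985, "language": "English"},
    {"id": 2, "year": 1952, "language": "Unavailable"},
    {"id": 3, "year": 1990, "language": "German"},
    {"id": 4, "year": 1948, "language": "Unavailable"},
]

def filter_by_language(recs, wanted):
    """Keep only records whose language value is in the wanted set."""
    return [r for r in recs if r["language"] in wanted]

english_only = filter_by_language(records, {"English"})
english_plus_unavailable = filter_by_language(records, {"English", "Unavailable"})

print([r["id"] for r in english_only])              # [1]
print([r["id"] for r in english_plus_unavailable])  # [1, 2, 4]
```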
Under Explore Substances, select the "Substance Identifier" option. Chemical nomenclature in general is very complex, and follows different sets of rules. CAS uses its own nomenclature rules to assign systematic names to chemical substances, and their rules have changed substantially over the years. The Registry database indexes the current official CA Index Name for all substances, along with any former CA Index Names and various synonyms and trade names that have been used in the literature. While Registry is the largest source of chemical names and trade names available, it is not totally comprehensive.
It's straightforward to search by well-known common names (ex. acetic acid, cyclohexane, acetaminophen), familiar trade names (ex. Taxol), and common abbreviations (ex. MTBE). Searching systematic names is less reliable because of the many possible variations in a name string. In general, the longer the systematic name the less likely you'll find it by typing it exactly.
SciFinder looks first for an exact match to the name as you type it. If it finds an exact match, it displays only that compound, and no others. For example, if you search for "Gallopamil" it will retrieve the one compound that has that exact name, but it will NOT retrieve compounds where "Gallopamil" is a segment of a longer name, such as "Gallopamil hydrochloride" (or any salts or multicomponent compounds).
If it doesn't find an exact match, it next looks for the string you entered as a segment within a name. It will retrieve all such partial matches. For long names, you'll have a better chance of getting a hit if you break it up into discrete segments than if you type it all as a single unbroken string. SciFinder will retrieve all the compounds that have names including all the segments, and you can browse these for the one you want. If you get too many hits, add locants to some segments to narrow the possibilities. For example, to search for
type some of the identifiable functional group segments, in any order, separated by spaces:
3-buten-1-yl 2,3,4,9-tetrahydro 1H-pyrido 1-carboxylic
and you'll get a table of matches to browse.
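The two-stage lookup described above (exact name first, then all segments anywhere in the name) can be sketched roughly as follows. This is a simplified illustration, not SciFinder's real matching code: `match_names` is a hypothetical helper, and plain case-insensitive substring matching is an assumption.

```python
def match_names(query, names):
    """Two-stage lookup: an exact name match wins; otherwise match all segments."""
    q = query.strip().lower()
    exact = [n for n in names if n.lower() == q]
    if exact:
        return exact  # an exact hit suppresses all partial matches
    segments = q.split()
    if not segments:
        return []
    # Segments may appear in any order, but every one must be present.
    return [n for n in names if all(seg in n.lower() for seg in segments)]

names = ["Gallopamil", "Gallopamil hydrochloride"]
print(match_names("Gallopamil", names))           # ['Gallopamil']
print(match_names("hydrochloride gallo", names))  # ['Gallopamil hydrochloride']
```

Note how the exact hit on "Gallopamil" hides the hydrochloride salt, which is why a name search alone can miss related substances.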
As with all chemical database tools, the chemical name is not the ideal way to search for a compound because of the complexity and inconsistency of chemical nomenclature and the diversity of synonyms and trade names used in the literature. Never rely on a name search when doing a comprehensive search for a compound. The rule of thumb is, when in doubt, draw it!
Yes. Generally, the results you'll get are the same as if you entered the exact substance name/synonym because the search algorithm maps names to corresponding RNs. Using the RN is more straightforward if you already have it. However, there are exceptions to this. Older documents (pre-1967) were not as thoroughly indexed for tangential substance information. If an RN was not originally assigned to an article's indexing, you won't find that article by entering that RN; but you would find it if the substance name appeared in the original abstract (i.e., as a keyword). This particularly affects substances such as reagents, by-products, intermediates, and other compounds that may have been secondary to the article's main focus. When in doubt, try the search both ways.
There are two ways to do this. If you already know the Registry Number (or SMILES or InChI), just click the add-to-editor button in the structure drawing window, and enter the RN or other identifier directly. In addition, you can do a search for a compound and view the record with the structure in question. Click on the structure diagram and choose "Explore by chemical substance" (or reaction). The structure will be imported into the drawing module and can be modified for further searching. This is a useful shortcut to drawing a complex structure.
Molecular formulas, while imprecise, can give you searching options not possible any other way. But you do have to understand how CAS derives and indexes the MF in Registry records.
Molecular formula searches often retrieve large numbers of hits. You can use the Analyze/Refine tool and draw part of the structure to narrow them down, or try another kind of search.
Since SciFinder doesn't allow you to search for partial or variable molecular formulas, you have to use the structure drawing tool. Just "draw" all the atoms you want present as unconnected points, then apply filters to reduce the retrieval. A substructure search will retrieve substances containing all of those elements plus any others. An exact search will retrieve substances containing only the elements drawn. For the search to work, it needs to contain some less common elements: if you just search for C, H, O, N, S, for example, it will obviously retrieve many millions of substances and fail.
Salts are often best searched by molecular formula, and CAS treats most of them as multicomponent substances composed of a free acid and a base. Simple salts like NaCl (= ClNa) are straightforward. The formulas of more complex inorganic, organometallic and organic salts are indexed as dot-disconnect compounds under the following scheme:
F(c) . N F(a)
where F(c) is the molecular formula of the cation (or acid), F(a) is the anion (or base) formula, and N is the number of anions - which can be a whole number or a fraction. Note that the acid's hydrogen(s) are retained in its component formula. Examples:
This unique format of formula parsing is based on the sorting in the old CA printed formula indexes, where all salts would be grouped under the parent acid's formula. This policy doesn't make as much sense in the digital environment, but it is still the operating principle.
You can also search for salts by drawing the exact structure of either the free acid or the base, or both together as separate fragments. Since MF is an exact search, a search for the simple salt will not also find any hydrates. A hydrate can be searched by adding the . N H2O as a third component separated by the dot.
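The NaCl = ClNa reordering mentioned above is consistent with the Hill convention for writing formulas: carbon first and hydrogen second when carbon is present, then everything else alphabetically; fully alphabetical when there is no carbon. Here is a minimal Python sketch of that ordering. The helper names are hypothetical, and it handles only simple single-component formulas (no parentheses, charges, or dot-separated components).

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a simple formula such as 'NaCl' or 'C5H7N'."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def hill_order(formula):
    """Rewrite a simple formula in Hill order."""
    counts = parse_formula(formula)
    if "C" in counts:
        rest = sorted(s for s in counts if s not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        order = sorted(counts)
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in order)

print(hill_order("NaCl"))   # ClNa
print(hill_order("NC5H7"))  # C5H7N
```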
For more information, see a useful overview of searching for inorganic substances.
Doing structure searches for organometallic and coordination compounds in SciFinder is a challenge. Due to the complexity of the nomenclature, exact name searches are rarely successful. If you're looking for an exact structure, it's sometimes easier to use the molecular formula option instead.
If you're drawing or modifying a substructure, always select "Show precision analysis" before running the search. Then select the "conventional structure" subset of the results. (Note: precision analysis is not available if there is stereochemistry, or for similarity searches.) There's also a limiter for coordination compounds (hidden under Advanced options) that you can check before doing the search.
Another tip: Turn off valency analysis in the drawing applet's preference pane so that the system doesn't keep bothering you about nonstandard valencies as you draw. It will still ask you to confirm the nonstandard valency before doing the search; just click OK to proceed as drawn. If there are many results, sort them by number of references to bring the best-described complexes to the top of your list. If you start with just the organic ligand structure, or draw a complex with an M (any metal) shortcut, you can use the Analyze by elements tool to sift a results set based on the coordinating metal.
For more information, see a useful overview of searching for organometallic substances.
CAS registers most polymers as multicomponent substances composed of one or more monomers. You can search by name, monomer structure or formula and limit retrieval to polymers in the Class(es) menu. Search for general classes of polymers as a research topic.
Molecular formulas of polymers are indexed as monomer formula(s) within parentheses, followed by an x: (C8H8)x. Copolymers are expressed as dot-disconnect formulas with monomers separated by a period: (C8H8 . C4H6)x etc.
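As a quick illustration of the layout just described, this short Python sketch assembles a polymer formula string from its monomer formulas. `polymer_formula` is a hypothetical helper; it only reproduces the two patterns shown above and does not check that the monomer formulas themselves are valid.

```python
def polymer_formula(monomers):
    """Join monomer formulas with ' . ' inside parentheses and append 'x'."""
    if not monomers:
        raise ValueError("at least one monomer formula is required")
    return "(" + " . ".join(monomers) + ")x"

print(polymer_formula(["C8H8"]))          # (C8H8)x
print(polymer_formula(["C8H8", "C4H6"]))  # (C8H8 . C4H6)x
```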
Structure searching for copolymers in SciFinder involves executing a search for one monomer as an exact structure, then refining the results by adding another monomer. It's better to start with the least common monomer. Polymers with undefined structures obviously can't be searched by structure, but must be found with name and class terms instead.
By structure: Search for all labeled analogs of a given (sub)structure by drawing and searching the structure, then using Refine to limit to isotope-containing substances. It's not possible in SciFinder to search for a specific label at a specific position, however.
By molecular formula: Deuterated or tritiated analogs can be searched as D or T within the formula; these isotopes are cross-posted with H in the MF field. For example, C5 H D6 N will also be found as C5 H7 N.
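The cross-posting of D and T with H amounts to folding the isotope counts back into the hydrogen count. A minimal Python sketch of that arithmetic, using the example above (the function name and the dictionary representation are assumptions made for illustration):

```python
def fold_hydrogen_isotopes(counts):
    """Return element counts with D and T added into the H count."""
    folded = {el: n for el, n in counts.items() if el not in ("D", "T")}
    extra = counts.get("D", 0) + counts.get("T", 0)
    if extra:
        folded["H"] = folded.get("H", 0) + extra
    return folded

labeled = {"C": 5, "H": 1, "D": 6, "N": 1}  # C5 H D6 N
print(fold_hydrogen_isotopes(labeled))      # {'C': 5, 'H': 7, 'N': 1}
```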
When you draw a substructure, you should get in the habit of checking the "Show Precision Analysis" box before running the search. (It's not the default, nor is it currently possible to analyze by precision after running the search.) When you check the box and click Search, SciFinder will show you a pop-up box with a selection of "candidates" to choose from, to make your results more precise. Precision analysis is not available if there is stereochemistry, or for similarity searches. But it's important when searching for metal-containing organic compounds.
The Refine and Analyze tools allow you to narrow your substance results by a number of criteria, including additional structure component, metal- or isotope-content, property data or commercial supplier information, etc. If your results contain many multicomponent substances, you can remove them. Under the Refine tab, choose Chemical Structure, then check the single component box below. You can also specify "Single component" from the Characteristics menu before you run the search.
Registry contains records for millions of biosequences, and will display the sequences up to a certain length, but they are not directly searchable in SciFinder. Use the Substance Identifier search to enter names of organisms, GenBank IDs, etc. For sequence searching, try the NCBI databases such as GenBank.
SciFinder offers a few ways to do this.
There are two approaches in SciFinder to search for documents that might contain specific kinds of property data for a compound. The most straightforward way is to use the compound's Registry Number in a research topic query, e.g. "vapor pressure of 104-76-7", and select the results where the concepts are "closely associated with one another."
If you don't already know the RN, use Explore Substances to find the compound record, click Get (all) References, and then use the Refine/Topic feature to enter the name of the desired property to narrow the results further. (TIP: Don't select the Properties role when getting references unless you're fairly sure the data was reported after 1967. The substance roles have not been applied retroactively to the pre-1967 segment of the CAPLUS file.) See the Properties guide for more details.
In addition to searching for literature with property data, SciFinder's Registry database contains substantial property data within the substance records themselves. Experimental property data posted in Registry derive mostly from the literature. Predicted properties are generated by algorithms from ACD Labs, and are related primarily to pharmaceutical discovery.
Yes, but only for a limited number of properties. Many substance records in the Registry file contain calculated or experimental property data. Select "Property" in the Explore Substances tab and then choose a property from the menu and enter a specific value or range of values (closed or open-ended). You can also refine a substance set by experimental property values. See the help pages for more information.
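The difference between closed and open-ended ranges can be pictured with this small Python sketch. The substance labels and values are invented, and treating the range limits as inclusive is an assumption; check the help pages for how SciFinder actually handles boundary values.

```python
def in_range(value, low=None, high=None):
    """True if value lies within [low, high]; a missing bound is open-ended."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True

# Hypothetical substances with a made-up numeric property value.
substances = {"A": 78.0, "B": 120.5, "C": 201.3}

def refine(subs, low=None, high=None):
    """Keep the substances whose property value falls in the requested range."""
    return [name for name, value in subs.items() if in_range(value, low, high)]

print(refine(substances, low=100, high=150))  # ['B']       closed range
print(refine(substances, low=100))            # ['B', 'C']  open-ended above
print(refine(substances, high=100))           # ['A']       open-ended below
```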
There are two ways to look for spectral information in SciFinder.
Search for a compound by name, structure, formula, Registry number, etc. View the record for that compound. Click on the "Get Commercial Sources" button to view a sortable table of commercial suppliers and addresses. Some prices may be given. You can save your preferred suppliers so they'll always be at the top of your list.
Search for the substance and locate its Registry record. Then click on the "Get Regulatory Info" link or the button to pull up that compound's CHEMLIST record. CHEMLIST provides inventory information about 350,000 chemical substances that are regulated in key markets across the globe.
Although most Registry substance records come from CAS' indexing of the literature, some compounds are registered from other sources and are not necessarily represented by any indexed literature. Third-party chemical libraries, catalogs, and databases from various external agencies, as well as hypothetical ring parents are also included in the database. You can exclude compounds with zero references from your results using the Refine by Chemical Structure tool, or you can re-sort your set by number of references to move them to the bottom.
Observant power-searchers may notice that the level of substance indexing for older literature is not as detailed as it is for more recent literature. There's a reason for this. Substance indexing in the pre-1967 segment of the CAPLUS bibliographic file was created retrospectively. Indexing data for chemical substances from the printed 1st through 7th Collective Subject Indexes (1907-66) were added and algorithmically matched with Registry Numbers.
If you start from a chemical substance record, click Get References, and then select from the Roles menu (Adverse effect, Analytical study, etc.), you will retrieve results only from the 1967+ file segment. This is because Chemical Roles for Registry Numbers have NOT been retroactively applied to pre-1967 CA records. The only exception is the Preparation role, which has been added back to 1907. The interface and the help pages do not make these important distinctions clear.
Select "Explore Reactions" from the task menu. Draw a reaction scheme including one or more reactants/reagents, a reaction arrow, and a product (sub)structure. You can focus your substructure search more narrowly and avoid error messages or too many hits by using the locking tools, mapping atoms, and defining reaction sites (bonds broken or formed) in the drawing module. You can also apply pre-limits such as solvent, number of steps, classification, year, etc. Click "Get Reactions" to run the search in the CASREACT database. Results are sorted by relevance.
The CASREACT file primarily contains reaction information derived from journals indexed in the Organic sections (which include organometallics) of Chemical Abstracts since 1985 and patents since 1991. This content is augmented by a selection of smaller third-party reaction files stretching back to 1840: VINITI/ZIC; INPI; Wiley reference works, etc. Reaxys is superior for its reaction coverage before 1985, but SciFinder is better thereafter. For thoroughness you should use both sources.
Cited references (the works appearing in an article's bibliography) are listed in CAPLUS bibliographic records for most indexed Latin-alphabet journals and basic patents since 1997. Click the "Get Cited" button to see them. Most references are hyperlinked to their corresponding SciFinder records -- just click on the citation to go to that record.
To find citing references (later documents that cite a specific work or group of works, such as by author), select one or more or all items from your results set, and click "Get Citing" in the task bar. This will pull up a set of post-1997 documents that cited the selected original(s).
Abstract records from 1967 to present can be searched by abstract number (e.g. 101:59753) in the Document Identifier tab. Abstract numbers from 1907-1966 are not searchable or displayable in SciFinder. These are the numbers that look like: 53:2185a, indicating that the original abstract appeared in volume 53 of printed Chemical Abstracts, in column number 2185, position "a" (top of the column). These positional numbers did not necessarily correspond to a specific abstract and thus one-to-one matches aren't possible.
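If you have a list of old citations and want to sort out which numbers are worth trying in the Document Identifier tab, a rough Python sketch like this can separate the two shapes. The patterns are inferred only from the two examples above (101:59753 and 53:2185a), so treat them as assumptions rather than a full specification of CA numbering.

```python
import re

ABSTRACT_NUMBER = re.compile(r"^(\d+):(\d+)$")         # e.g. 101:59753
COLUMN_POSITION = re.compile(r"^(\d+):(\d+)([a-z])$")  # e.g. 53:2185a

def classify_ca_number(text):
    """Classify a CA number string by shape; returns None if unrecognized."""
    s = text.strip()
    m = COLUMN_POSITION.match(s)
    if m:
        return {"kind": "column-position", "volume": int(m.group(1)),
                "column": int(m.group(2)), "position": m.group(3),
                "searchable": False}
    m = ABSTRACT_NUMBER.match(s)
    if m:
        return {"kind": "abstract-number", "volume": int(m.group(1)),
                "abstract": int(m.group(2)), "searchable": True}
    return None

print(classify_ca_number("101:59753")["searchable"])  # True
print(classify_ca_number("53:2185a")["position"])     # a
```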
The "CAN" number displayed in the CAPLUS file for records prior to 1967 is a computer-generated accession number which does not correspond to the printed CA. If you need to look up or identify an old CA abstract number, ask the librarian for assistance.
Up to about 1995 CAS indexed only the original Russian journals, rather than their often-delayed English translation versions. Starting in 1995 CAS began to switch to selected translations when they were apparently simultaneous and complete. If you need to locate the English translation of a Russian SciFinder reference, note that the page numbers will be different. Searching the full Russian journal title in the Library Catalog will pull up the English translation if the library owns it.
Generally not. The CAS definition of "books" is non-intuitive, though. CAS does index the chapters of some review serials/annuals such as Organic Reactions and major reference works, on a selective basis, but these are assigned the Document Type (DT) of "Review" or "Journal." Beyond that, CAS covers relatively little monographic content. Worth noting: The Document Type field in the Refine/Analyze tools menu is not very useful in this case: "Book" is used only for entire books (and book announcements); while book chapters are assigned the "Conference" DT - even those that obviously aren't from conferences.
Google Books is a useful alternative to search full text content from many historical and recent books. Most publisher platforms also allow free full text searching of their e-book content - sometimes via their own platforms and sometimes via Google's main search. However, if you want to see the actual content you'll usually need to locate either a copy of the e-book licensed by the library or a print copy.
Worldwide chemical patents and applications are thoroughly indexed and cross-referenced in SciFinder. CAS policy is to index the first published patent document (usually a non-U.S. application) in a family, and subsequent granted patents and other applications are listed in the Patent Family table in the full record. For coverage details, see the CAS Patent Coverage page.
The SciFinder interface is not intended for comprehensive patentability (prior art) searching, which should be done by experienced patent searchers using specialized databases.
To remove patents from your results, click the Refine tab, then choose Document Type, and select from the menu only those types of documents you wish to see.
Markush structures are generic chemical structures drawn according to patent claims conventions and found in chemical patents worldwide. They are distinct from the precise structures found in the Registry database, and are searched separately in the MARPAT file using the Markush structure option under the Explore Substances tab. An example:
The MARPAT file contains more than 1 million searchable Markush structures from patents covered by CAS from 1961 to the present (records from 1961-87 are derived from French INPI data), and is updated daily. Markush structures include organic and organometallic molecules reported in patents from countries covered by CA, except Korea. Not included are alloys, metal oxides, inorganic salts, intermetallics, and polymers.
Drawing a Markush structure uses the same applet, but some options are not available. Unlike a normal structure search, results from a Markush search in SciFinder are CAPLUS bibliographic records for patents, rather than actual structure hits, which you never actually see. (You don't see the source MARPAT records either.) Sometimes it will not be obvious why a particular CAPLUS patent record was retrieved based on the structure you entered. Registry numbers are not highlighted in the indexing. You may or may not see a matching structure graphic embedded in the abstract. You may have to refer to the full text of the patent itself to determine its relevance. When you do a Markush structure search, you are NOT searching the Registry file; to do a complete novelty search, you must search a structure using both options. While Markush structure conventions are easy for chemists to understand, the computer algorithms that match them to specific target structures are very complex.
Remember that professional patentability searching should be carried out by patent experts using databases designed for that purpose.
Medline records are sorted separately in results sets, and come after the hits from CAPLUS. After you get a list of references, click the Refine tab, then select the Database button and select CAPLUS. Medline records are often duplicates of CAPLUS records in the same set. SciFinder allows you to remove duplicates from your results set. You can also set your Preferences to automatically remove Medline duplicates.
Not directly - SciFinder does not have an email function. First you must export selected records to a file on your computer, then email the file as an attachment. RTF is a good format to save in. Or you can just copy and paste text directly into the body of an email.
Yes. You can save a results set (references, substances, or reactions) on the CAS server, or export them to a local disk, then combine a future results set with that saved set if you wish. By employing various options of Combine, Intersect, or Remove, you can manipulate and customize the information contained in these combined answer sets.
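Conceptually, these options behave like ordinary set operations on record identifiers. The Python sketch below shows that reading; the identifiers are made up, and the mapping of Combine, Intersect, and Remove onto union, intersection, and difference is an assumption about the general idea rather than a description of SciFinder's internals.

```python
# Hypothetical record identifiers in a saved set and a new results set.
saved_set = {"rec1", "rec2", "rec3"}
new_set = {"rec2", "rec3", "rec4"}

combined = saved_set | new_set     # Combine: everything in either set
intersected = saved_set & new_set  # Intersect: only records in both sets
removed = saved_set - new_set      # Remove: saved records not in the new set

print(sorted(combined))     # ['rec1', 'rec2', 'rec3', 'rec4']
print(sorted(intersected))  # ['rec2', 'rec3']
print(sorted(removed))      # ['rec1']
```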
Yes. Select records from your results and click the Export link, and select the "Citation Export Format" for the .ris output file. Then open your EndNote library and import the file. The SciFinder (CAS) filter is not required for this format.
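If you want to check an exported .ris file before importing it, the format is plain text with one tagged field per line (a two-character tag, two spaces, a hyphen, then the value) and an ER tag closing each record. Here is a minimal Python sketch of a reader. The sample record is a placeholder, not real SciFinder output, and which tags SciFinder actually writes is not something this sketch assumes.

```python
import re

RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

def parse_ris(text):
    """Parse RIS text into a list of {tag: [values]} dictionaries."""
    records, current = [], {}
    for line in text.splitlines():
        m = RIS_LINE.match(line)
        if not m:
            continue  # skip blank or unrecognized lines
        tag, value = m.group(1), m.group(2).strip()
        if tag == "ER":  # end-of-record marker
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records

# Placeholder record for illustration only.
sample = "\n".join([
    "TY  - JOUR",
    "AU  - Placeholder, A.",
    "TI  - Placeholder title",
    "PY  - 2000",
    "ER  - ",
])

parsed = parse_ris(sample)
print(len(parsed))      # 1
print(parsed[0]["TI"])  # ['Placeholder title']
```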
It has been reported that EndNote's "Search for Full Text" feature does not always work properly with many types of imported records, including SciFinder's. It's usually a better idea to retrieve full text manually rather than rely on functions that must retransmit messy metadata across multiple layers of link resolvers.
The full text link does not mean that an electronic version of that document exists -- it just begins a search for one. In practice the link is functional only for journal articles and patents, and generally won't lead to other types of materials such as books, conferences, dissertations, preprints, tech reports, etc., even if they happen to exist digitally somewhere. Furthermore, the existence of electronic full text does not guarantee that you (as a UT-Austin patron) will have access to it.
A link appears with every record on the results summary ("Other Sources") and full record displays ("Link to Other Sources"). This takes you to the actual article or, in some cases, to a menu of local access options for journals. When we do not have access to electronic full text, the menu includes an option to search for the journal in the library catalog.
In order to view the full text, the article must either be:
If neither of these is true, then you will not have direct access to a digital copy via SciFinder. You can search for a print version in the Library Catalog, or submit a Get a Scan request via Interlibrary Services, and we'll get you a copy.
To retrieve patent documents, click the Link to Other Sources button in the Full Record for the patent, and you'll be taken directly to the indexed document in the USPTO or Espacenet systems. Other documents listed in the Patent Family box must be retrieved manually. The PatentPak button in the brief record is not functional because UT-Austin has not added this full-text manipulation feature to our subscription due to cost.
SciFinder's links will reach a dead end for almost everything else. This is because some sources (preprints, databases, repositories, theses, etc.) are not typically included in our local Library Catalog system. Try searching for them instead in Google Scholar or another web search engine. Other sources (books, conference proceedings, tech reports, etc.) might be in our Catalog but the OpenURL linking process isn't sophisticated enough to find them. All of these will require additional effort or assistance to locate.
SciFinder's internal editors (non-Java and Java) are the only options for drawing a structure/reaction query inside SciFinder. Users of ChemDraw version 14+ can draw a structure in that software and then click a SciFinder search button to initiate a search directly in SciFinder. You can also import .cxf files created in other tools.
This work is licensed under a Creative Commons Attribution-NonCommercial 2.0 Generic License.