Government of Canada
Core Subject Thesaurus
Indexing Guidelines
Introduction
The CST is designed to be a broad, high-level post-coordinated thesaurus.
Post-coordination means that most concepts are represented by single words;
compound terms are used only when there is no alternative or when that
phrase is commonly used to represent that concept. Subjects to be identified
and indexed are represented by one or more terms as appropriate.
For example, “Reports” can be combined with any other appropriate
term ‘X’ in order to represent a report on subject ‘X’.
Post-coordination restricts the size of the vocabulary, thereby facilitating
maintenance and use. Boolean searching by either the searcher or the search
engine will combine (“post-coordinate”) terms at the time
of search.
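As an illustration only, post-coordination at search time can be sketched as a Boolean AND over the descriptors assigned during indexing. The record identifiers, the sample data and the function names below are assumptions made for this sketch; they are not part of the CST or of any particular search engine.

```python
from collections import defaultdict


def build_index(records):
    """Map each descriptor to the set of document ids indexed with it."""
    index = defaultdict(set)
    for doc_id, descriptors in records.items():
        for descriptor in descriptors:
            index[descriptor].add(doc_id)
    return index


def search_all(index, *descriptors):
    """Boolean AND: return ids of documents that carry every given descriptor."""
    if not descriptors:
        return set()
    postings = [index.get(descriptor, set()) for descriptor in descriptors]
    return set.intersection(*postings)


# Hypothetical records: document id -> descriptors assigned at indexing time.
records = {
    "doc-1": {"Reports", "Libraries"},
    "doc-2": {"Reports", "Airports"},
    "doc-3": {"Libraries"},
}
index = build_index(records)

# "Reports" combined with "Libraries" at search time finds reports on libraries.
reports_on_libraries = search_all(index, "Reports", "Libraries")
```

Because the combination happens at search time, no compound term such as "Library reports" has to exist in the vocabulary.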
Under TBITS 39.2, the Government of Canada Core Subject Thesaurus is the default
controlled vocabulary for federal departments. “Default” means
that if an alternate, authorized vocabulary is not available, then the
Core Subject Thesaurus (CST) must be used as a source of indexing terms.
In any case, it is highly recommended that one or more terms from the
CST be applied to <dc.subject>. The reason for this is that the
CST has been developed on the basis of a number of indexing projects that
extracted broad, high-level terminology used in GOC publications. Therefore,
it represents the language generally used in information resources across
government. By design, it does not include specialized terminology used
in specific and limited disciplines.
Basic rules
Users of the GoC Core Subject Thesaurus should apply the following basic
rules to ensure consistency of resource content representation within
and across organizations.
- Use the thesaurus structure to find descriptors
- Choose the most specific descriptor available
- Choose as many descriptors as needed
- Enter descriptors exactly as they appear in the thesaurus
- Contact the Thesaurus Manager if you have any queries
- Use the Thesaurus Relational Structure to Find Descriptors
Browse through the alphabetical display of “lead-in” terms
(labelled USE: in the full term display) and indexing terms. Then consult
the Broader terms (BT:), Narrower terms (NT:), and Related terms (RT:) attached
to each descriptor (indexing term) that appears to represent, exactly or closely,
the concept to be indexed or described in metadata. These relationships
define the meaning of a descriptor, and they suggest other index terms
that may be relevant. In addition, the full record for indexing terms
(descriptors) may include Scope notes (SN:) that further elucidate or
restrict the meaning of the term for indexing purposes.
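The relational structure described above can be pictured as a small data structure. The following is a minimal sketch, not the thesaurus's actual storage format: the class name, the field names and the `resolve` helper are hypothetical, and the lead-in entry "Toxic waste" is invented here purely to show how a USE: reference points to an authorized descriptor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TermRecord:
    """One entry in the alphabetical display (hypothetical structure)."""
    term: str
    use: Optional[str] = None                          # USE: set only on lead-in terms
    broader: List[str] = field(default_factory=list)   # BT:
    narrower: List[str] = field(default_factory=list)  # NT:
    related: List[str] = field(default_factory=list)   # RT:
    scope_note: str = ""                               # SN:


def resolve(thesaurus: Dict[str, TermRecord], term: str) -> TermRecord:
    """Follow a USE: reference so that an authorized descriptor is returned."""
    record = thesaurus[term]
    if record.use is not None:
        return thesaurus[record.use]
    return record


# Illustrative entries only; the lead-in term is an assumption for this sketch.
thesaurus = {
    "Libraries": TermRecord(
        "Libraries",
        broader=["Cultural institutions"],
        narrower=["National libraries"],
    ),
    "Hazardous waste": TermRecord("Hazardous waste"),
    "Toxic waste": TermRecord("Toxic waste", use="Hazardous waste"),
}

descriptor = resolve(thesaurus, "Toxic waste")
```

Looking up a lead-in term yields the record of the descriptor it points to, whose BT:, NT: and RT: lists can then be consulted for better candidates.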
- Choose the Most Specific Descriptor Available
Using “Libraries” as an example, suppose that descriptors
(indexing terms) were required for a document dealing with the activities
of the National Library of Canada (such as an annual report). In that
case, the narrower term “National libraries” is more specific
than “Libraries” and should be used instead of “Libraries”.
Note that proper names such as “National Library of Canada”
are not included in the thesaurus and may not be used to populate <dc.subject>.
If the document were “about” national museums, libraries,
art galleries and the like, then the Broader term “Cultural institutions”
might be a better choice. Whichever term is chosen, the full term record
for that term should be consulted to ensure that Broader, Narrower or
Related terms within those records are not more appropriate.
It is important to bear in mind that a given term record shows only
the terms one level up or down. It does not display the narrower
terms of its narrower terms.
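Since each record shows only one level, seeing everything beneath a term means opening each narrower term's record in turn. The sketch below illustrates that walk under stated assumptions: the `narrower_of` mapping and the function name are hypothetical, and the "Museums" entry is invented for the example.

```python
def all_narrower(narrower_of, term):
    """Collect every descendant of a term by following NT: links level by level."""
    found = []
    seen = set()
    stack = list(narrower_of.get(term, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against a term reachable by more than one path
        seen.add(current)
        found.append(current)
        # Open the record of the narrower term to reach the next level down.
        stack.extend(narrower_of.get(current, []))
    return sorted(found)


# Hypothetical one-level NT: lists, as each term record would display them.
narrower_of = {
    "Cultural institutions": ["Libraries", "Museums"],
    "Libraries": ["National libraries"],
}

one_level = narrower_of["Cultural institutions"]
whole_branch = all_narrower(narrower_of, "Cultural institutions")
```

In this sample, "National libraries" appears only in the full walk, not in the record for "Cultural institutions" itself, which is why each candidate term's own record should be checked.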
- Choose as Many Descriptors as Needed
Use as many authorized descriptors as needed to fully describe the contents
of an information resource. Subject descriptors are not mutually exclusive.
More than one subject descriptor will be needed to describe most resources.
For example, an information resource concerned with the transportation
by rail or by truck of toxic waste and other dangerous products would
be represented by the following set of descriptors:
Dangerous products
Hazardous waste
Rail transport
Road transport
- Entering Descriptors
Enter the authorized descriptor exactly as it appears in the Core Subject
Thesaurus. Generally, this means that only the first letter of the first
word is capitalized. In English, concrete (countable) nouns are presented
in the plural form (e.g. Airports), while collective or abstract nouns
are displayed in the singular form. In French, most nouns are presented
in the singular form (e.g. Aéroport). These are accepted thesaurus
conventions and must be followed to ensure consistent use of terms.
In GOC metadata, separate descriptors with a semi-colon (a semi-colon
is used as a separator rather than a comma because a controlled subject
term may include other punctuation marks within it). For more information
on GOC metadata standards, consult TBS and the Government of Canada
Metadata Implementation Guide for Web Resources, 2nd ed. http://www.nlc-bnc.ca/6/37/s37-4016-e.html
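The separator rule can be sketched as a pair of helper functions that build and read back the value placed in <dc.subject>. This is an assumption-laden illustration: the function names are hypothetical, and the single space after each semicolon is a readability choice made here, not something the guidelines prescribe.

```python
def join_descriptors(descriptors):
    """Build one subject value, separating descriptors with a semicolon."""
    cleaned = [d.strip() for d in descriptors if d.strip()]
    for descriptor in cleaned:
        if ";" in descriptor:
            # A semicolon inside a term would be misread as a separator.
            raise ValueError(f"descriptor contains a semicolon: {descriptor!r}")
    return "; ".join(cleaned)


def split_descriptors(value):
    """Recover the individual descriptors from a semicolon-separated value."""
    return [part.strip() for part in value.split(";") if part.strip()]


subject_value = join_descriptors(
    ["Dangerous products", "Hazardous waste", "Rail transport", "Road transport"]
)
round_trip = split_descriptors(subject_value)
```

Splitting on the semicolon alone leaves any comma inside a descriptor untouched, which is the reason the guidelines give for preferring it over a comma.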
One may index using either the English thesaurus (CST) or the French
thesaurus (TSB). However, English terms must be applied to English language
documents, French terms to French language documents and both English
and French terms to bilingual documents. In some cases, there is more
than one equivalent term in the other language. For example, the English
term “Education” has two equivalents in French, “Éducation”
and “Enseignement”, which are displayed in the term record
separated by a slash (“/”) as follows:
Education
FRENCH: Éducation / Enseignement
The indexer must be careful to select the correct and appropriate term
for the French document.
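The language rule and the one-to-many case can be sketched as follows. The function names, the language codes ("en", "fr", "bilingual") and the lookup table are assumptions for illustration; only the Education entry comes from the example above.

```python
# English descriptor -> French equivalents shown in its term record.
FRENCH_EQUIVALENTS = {
    "Education": ["Éducation", "Enseignement"],
}


def subject_terms(doc_language, english_terms, french_terms):
    """Pick which descriptors to apply, based on the language of the document."""
    if doc_language == "en":
        return list(english_terms)
    if doc_language == "fr":
        return list(french_terms)
    if doc_language == "bilingual":
        return list(english_terms) + list(french_terms)
    raise ValueError(f"unknown document language: {doc_language!r}")


def needs_indexer_choice(english_term):
    """True when the record lists several French equivalents to choose from."""
    return len(FRENCH_EQUIVALENTS.get(english_term, [])) > 1


bilingual_terms = subject_terms("bilingual", ["Education"], ["Enseignement"])
must_choose = needs_indexer_choice("Education")
```

The sketch only flags that a choice exists; deciding between the equivalents remains a judgement for the indexer, based on the content of the French document.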
- Contact the Thesaurus Manager
If you have any questions about the thesaurus, please contact
the Thesaurus Manager. This includes general queries, requests for additional
information about thesauri and their use, and suggestions either for
new terms or for changes to existing terms.