National Research Council Canada Government of Canada
NRC-IIT - Institute for Information Technology

Information Analysis and Retrieval

Uqausiit: Inuktitut Language Technologies


Inuktitut is the major language of the circumpolar region stretching from Alaska to Greenland, and the main language of Nunavut.

The Government of Nunavut is committed to making Inuktitut the language of work and to having it taught throughout the primary and secondary curricula. For the Government to reach this objective, however, basic computer programs must work with Inuktitut. For example, there are currently no spell checkers, grammar checkers or state-of-the-art search engines for the language, and the telephone book is still sorted manually.

The challenge lies in the fact that Inuktitut is a polysynthetic, agglutinative language: words can be very long and are built by gluing meaningful fragments (morphemes) together. Other agglutinative languages include Turkish, Hungarian and Finnish.

To date, most Natural Language Processing (NLP) work has concentrated on inflected languages. As a result, there are still many unanswered questions about non-Indo-European languages.

The NRC Institute for Information Technology (NRC-IIT) is initially targeting specific, key language tools that are not currently available for Inuktitut. Researchers intend to build these tools and thereby provide a foundation for an Inuktitut language industry.

Given the polysynthetic, agglutinative nature of the language, however, building such tools will not mean simply translating words from one language to another. Because Inuktitut is fundamentally different from English and French, spell-checker and search-engine technology built for those languages cannot be carried over directly.

NRC-IIT’s first tool is a morphological analyzer, capable of adapting to a particular language and of learning the differences among a variety of dialects. The analyzer, in turn, will support spell checkers and search engines. Currently, the Inuktitut Morphological Analyzer covers the Inuktitut dialects of Eastern Nunavut and offers the following functionality:

  1. Produces English and French descriptions of an Inuktitut word
    Enter an Inuktitut word into the processing system and the word is automatically decomposed into its component root, infixes, and endings. The tool also provides a meaning for each component, so that an overall meaning for the word can be assembled from its parts.
  2. Generates lists outlining all possible forms that appear for each category of word component (i.e., roots, demonstratives, suffixes, noun endings, verb endings, demonstrative endings)
    From such lists, users can select individual components and obtain related information, including meanings and the rules governing how components combine (morpho-phonological behaviours).
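
As a rough illustration of the decomposition and category lists described above, the following Python sketch splits a word into a root, zero or more suffixes and an ending, using a tiny hand-made lexicon. The lexicon, its approximate English glosses, and the names `LEXICON`, `decompose` and `forms` are assumptions made for this example; they are not taken from the NRC-IIT analyzer. The sketch matches surface forms only and ignores the morpho-phonological rules that a real analyzer must apply.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (illustration only): category -> {form: gloss}.
# Glosses are approximate; real entries would also carry combination rules.
LEXICON: Dict[str, Dict[str, str]] = {
    "root": {"niri": "to eat", "taku": "to see"},
    "suffix": {"nngit": "negation"},
    "ending": {
        "junga": "I (verb ending used after a vowel)",
        "tunga": "I (verb ending used after a consonant)",
    },
}

Morpheme = Tuple[str, str, str]  # (category, form, gloss)


def forms(category: str) -> List[str]:
    """List every known form in one category, sorted alphabetically."""
    return sorted(LEXICON.get(category, {}))


def decompose(word: str) -> List[List[Morpheme]]:
    """Return every root + suffix* + ending split the toy lexicon allows."""
    analyses: List[List[Morpheme]] = []

    def extend(rest: str, parts: List[Morpheme]) -> None:
        # Case 1: the remainder is exactly a known ending, so the word is complete.
        if rest in LEXICON["ending"]:
            analyses.append(parts + [("ending", rest, LEXICON["ending"][rest])])
        # Case 2: the remainder starts with a known suffix; consume it and recurse.
        for form, gloss in LEXICON["suffix"].items():
            if rest.startswith(form):
                extend(rest[len(form):], parts + [("suffix", form, gloss)])

    for form, gloss in LEXICON["root"].items():
        if word.startswith(form):
            extend(word[len(form):], [("root", form, gloss)])
    return analyses


if __name__ == "__main__":
    for analysis in decompose("nirinngittunga"):
        for category, form, gloss in analysis:
            print(f"{category:7} {form:7} {gloss}")
```

With this toy lexicon, `decompose("nirinngittunga")` yields a single analysis (niri + nngit + tunga), and a string with no matching root yields an empty list. Because the sketch does not check which ending is allowed after a vowel or a consonant, it would also accept some forms that a real analyzer should reject.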

Other tools for studying the Inuktitut language:

  1. Perform a query for a specific word component
    If you search for a selected word component, the system automatically tags every word in which that component occurs. These queries will assist in the statistical study of the Inuktitut language.
  2. Offer access to Government of Nunavut Legislative Proceedings
    Includes the Nunavut Hansard dating from 1999 to 2002.
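
The component query described above can be pictured as a lookup in an inverted index from components to words. The Python sketch below assumes the words have already been decomposed; the sample data and the names `ANALYSED`, `build_index`, `words_with` and `component_counts` are hypothetical and do not describe how the NRC-IIT system is built.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set

# Hypothetical pre-analysed words (illustration only): word -> component forms.
ANALYSED: Dict[str, List[str]] = {
    "nirijunga": ["niri", "junga"],
    "nirinngittunga": ["niri", "nngit", "tunga"],
    "takujunga": ["taku", "junga"],
}


def build_index(analysed: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each component to the set of words that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for word, components in analysed.items():
        for component in components:
            index[component].add(word)
    return index


def words_with(index: Dict[str, Set[str]], component: str) -> List[str]:
    """Return, in sorted order, every word tagged with the given component."""
    return sorted(index.get(component, set()))


def component_counts(tokens: Iterable[str],
                     analysed: Dict[str, List[str]]) -> Counter:
    """Count component occurrences over running text (unknown tokens are skipped)."""
    counts: Counter = Counter()
    for token in tokens:
        counts.update(analysed.get(token, []))
    return counts


if __name__ == "__main__":
    index = build_index(ANALYSED)
    print(words_with(index, "niri"))
    print(component_counts(["nirijunga", "takujunga", "nirijunga"], ANALYSED))
```

Counting components over running text, as `component_counts` does, is one simple kind of statistic such queries could feed; applying it to a large corpus would only require replacing the sample data with real analyses.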

Researchers are also developing a suite of simple text tools to help teachers create classroom materials. Please visit the InuktitutComputing.ca web site for more information.

While the tools and techniques that NRC-IIT researchers develop will be immediately applicable to Inuktitut, Nunavut and Canada, they will also be useful for other agglutinative languages.

Research Contact

Dr. Joel Martin
Group Leader
Interactive Information

NRC Institute for Information Technology
1200 Montreal Road
Building M-50, Room C-335
Ottawa, ON K1A 0R6
Telephone: +1 (613) 990-0113
Fax: +1 (613) 952-7151
E-mail: Joel.Martin@nrc-cnrc.gc.ca

Business Contact

Randall Milburn
Business Development Officer
Business Development Office, NCR

NRC Institute for Information Technology
1200 Montreal Road
Building M-50, Room 201
Ottawa, ON K1A 0R6
Telephone: +1 (613) 990-6590
Fax: +1 (613) 952-0074
E-mail: Randall.Milburn@nrc-cnrc.gc.ca


Date Modified: 2006-05-19