National Research Council Canada Government of Canada
NRC-IIT - Institute for Information Technology

Interactive Language Technologies

PORTAGE: Machine Learning for Translation

Project Summary

The PORTAGE project aims to develop technology that enables a computer to translate from one human language to another and that provides a rough assessment of the quality of translations produced by human beings. The project started in September 2004 and is expected to end in April 2008.

As explained in more detail in the technical overview, there are two main approaches to machine translation (MT): an older approach, in which human experts write translation rules for the computer based on their knowledge of how to translate from one language to another, and a newer approach, in which the computer itself learns such rules from a huge bilingual corpus. The PORTAGE technology is based on the newer approach, often called "statistical machine translation". Provided that a bilingual corpus is available for the two languages involved, that is, the language one wishes to translate from (source) and the language one wishes to translate into (target), the statistical approach makes it possible to build a translator between them much more quickly and economically than the older approach does. Thus, although our research has focused on English, French, Arabic, and Chinese, the PORTAGE technology is applicable to any human languages for which there is interest and for which the necessary bilingual corpus, from which the technology 'learns' how to translate, exists.

To ensure that PORTAGE is competitive with the world's best translation systems, we participate in several international competitive evaluations of MT performance, including:

  • The US National Institute of Standards and Technology (NIST) MT evaluations in 2005 and 2006;
  • The NAACL Workshop on Building and Using Parallel Text (WPT) in 2005, and the NAACL Workshop on Machine Translation (WMT) in 2006;
  • The TC-STAR Workshop in 2006 (sponsored by the European Community).

The international visibility of the PORTAGE technology has been heightened by our participation, starting in October 2005, in the multimillion-dollar GALE project sponsored by the US Government's Defense Advanced Research Projects Agency (DARPA). The goal of GALE (Global Autonomous Language Exploitation) is to make foreign-language (Arabic and Chinese) speech and text accessible to monolingual English speakers, particularly in military settings. We are members of the Nightingale consortium, one of the three consortia participating in the project, and our role is to supply MT technology for translation from Arabic and Chinese into English. See the Nightingale consortium announcement for more details.

PORTAGE's state-of-the-art MT software (executable and source code) will soon be made available to Canadian academic institutions interested in carrying out research in statistical MT. For the announcement of this new initiative, see PORTAGEshared.

Possible applications of the project include:

  • Tools for increasing the productivity of human translators;
  • Tools for multilingual education;
  • Web-based software for multilingual browsing;
  • Software for checking that texts in different languages on a multilingual website remain “in sync”;
  • Multilingual, interactive e-mail composition.

Thus, the project is expected to have an impact on several sectors: translation, second-language education, and e-business.

In terms of technology transfer, we welcome discussion with potential industrial partners interested in any of the possible application areas listed above (productivity tools for translators, tools for multilingual education, and others).

Related NRC-IIT Publications

For additional information, please consult the technical overview of this project.

Research Contacts

Dr. Roland Kuhn
Research Officer
Interactive Language Technologies

NRC Institute for Information Technology
University of Quebec en Outaouais, Lucien Brault Pavilion
101 St-Jean-Bosco Street
Gatineau, QC K1A 0R6
Telephone: +1 (819) 934-4222
E-mail: Roland.Kuhn@cnrc-nrc.gc.ca

Dr. George Foster
Research Officer
Interactive Language Technologies

NRC Institute for Information Technology
University of Quebec en Outaouais, Lucien Brault Pavilion
101 St-Jean-Bosco Street
Gatineau, QC K1A 0R6
Telephone: +1 (819) 934-3275
Fax: +1 (819) 934-2607
E-mail: George.Foster@cnrc-nrc.gc.ca

Business Contact

Michel Mellinger
Business Development Officer
Business Development Office, NCR

NRC Institute for Information Technology
University of Quebec en Outaouais, Lucien Brault Pavilion
101 St-Jean-Bosco Street
Gatineau, QC K1A 0R6
Telephone: +1 (819) 934-2602
Fax: +1 (819) 934-2607
E-mail: Michel.Mellinger@cnrc-nrc.gc.ca


Date Modified: 2006-08-14