Information Analysis and Retrieval
Lexical Semantics from Web Mining
With some limited understanding of word meaning – lexical semantics – computers will be able to perform many tasks that are not yet within their capabilities.
Since almost every human activity involves knowledge of word meaning in some way, the more semantic information computers are able to manage usefully, the more they will be able to assist people in their daily activities.
Using algorithms from the fields of machine learning, computational linguistics, natural language processing and statistics, it is becoming possible to extract information about aspects of the meaning of words by the computational analysis of huge quantities of text – web mining.
The NRC Institute for Information Technology (NRC-IIT) has successfully developed algorithms for extracting the following semantic information:
- Synonym recognition – for example, "levied" is synonymous with "imposed"
- Semantic orientation – for example, "integrity" is a positive, praising word, but "disturb" is a negative, criticizing word
- Analogy and metaphor – for example, "traffic in the street" is analogous to "water in the river," in that both "flow"
- Lexical cohesion – for example, the terms "math" and "statistics" go together naturally (they "cohere"), but "math" and "food" do not
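To make the idea of extracting word meaning from text concrete, here is a minimal sketch of one widely used approach: words that appear in similar contexts tend to have similar meanings. This page does not say which algorithms NRC-IIT uses, so the technique (cosine similarity of context-word counts), the toy sentences, and the function names below are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; real web mining would use vastly more text.
SENTENCES = [
    "the government levied a new tax on imports",
    "the government imposed a new tax on imports",
    "a tax was imposed on imports by the state",
    "a tax was levied on imports by the state",
    "the chef believed the soup needed salt",
    "the chef requested more salt for the soup",
]

def context_vectors(sentences):
    """Map each word to counts of the other words sharing its sentences."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            vectors[word].update(tokens[:i] + tokens[i + 1:])
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def closest_synonym(target, candidates, vectors):
    """Pick the candidate whose contexts most resemble the target's."""
    return max(candidates, key=lambda c: cosine(vectors[target], vectors[c]))

vectors = context_vectors(SENTENCES)
best = closest_synonym("levied", ["imposed", "believed", "requested"], vectors)
print(best)
```

On this toy corpus the sketch selects "imposed" for "levied", because the two words occur in identical contexts. With real text the counts are far noisier, which is why very large quantities of text are needed.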
Applications
Semantic processing has a wide range of potential applications. Below is a sample drawn from the areas in which NRC-IIT has already developed semantic algorithms:
- Synonym recognition can lead to improved search engines.
  - For example, a query for "cars" will also return a document that mentions only "automobiles".
- Semantic orientation can lead to tracking public opinion by analyzing online discussions. For example:
  - politicians could gauge public reaction to policy changes
  - investors could track public opinion about stocks
  - consumers could evaluate reaction to new products
- Analogy and metaphor can lead to better online help systems.
  - For example, "I was in Word and the fonts went crazy" does not literally mean that the user was inside the computer, nor that fonts have mental states. Since metaphors are ubiquitous, help systems will be more useful if they are not limited to literal meanings.
- Lexical cohesion can lead to better automatic text summarization.
  - For example, automatically generated summaries can be improved by filtering out incoherent phrases and sentences.
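As a rough illustration of the opinion-tracking application above, the sketch below scores a piece of text by counting words with known positive or negative orientation. The word lists are a hypothetical hand-made lexicon (only "integrity" and "disturb" come from this page), and the counting scheme is an assumption for illustration, not NRC-IIT's algorithm, which would derive orientation from large-scale text analysis rather than a fixed list.

```python
# Hypothetical mini-lexicon; a real system would learn orientation from text.
POSITIVE = {"integrity", "excellent", "reliable"}
NEGATIVE = {"disturb", "poor", "broken"}

def orientation(text):
    """Return (#positive words - #negative words) for a piece of text."""
    tokens = [t.strip('.,!?;:"').lower() for t in text.split()]
    positive = sum(t in POSITIVE for t in tokens)
    negative = sum(t in NEGATIVE for t in tokens)
    return positive - negative

def average_orientation(posts):
    """Average orientation over a collection of posts (0.0 if empty)."""
    return sum(orientation(p) for p in posts) / len(posts) if posts else 0.0

posts = [
    "The new policy shows integrity.",
    "Poor service, broken app.",
]
print([orientation(p) for p in posts])
```

A score above zero suggests praise and a score below zero suggests criticism. Averaging over many posts gives a crude signal of overall reaction to a policy, stock, or product.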
Research Contact
Dr. Peter Turney
Research Officer
Interactive Information
NRC Institute for Information Technology
1200 Montreal Road
Building M-50, Room C-339
Ottawa, ON K1A 0R6
Telephone: +1 (613) 993-8564
Fax: +1 (613) 952-7151
E-mail: Peter Turney
Business Contact
Randall Milburn
Business Development Officer
Business Development Office, NCR
NRC Institute for Information Technology
1200 Montreal Road
Building M-50, Room 201
Ottawa, ON K1A 0R6
Telephone: +1 (613) 990-6590
Fax: +1 (613) 952-0074
E-mail: Randall.Milburn@nrc-cnrc.gc.ca