
Conference Paper

RECENT ADVANCES IN NEAR-INFRARED APPLICATIONS
FOR THE
AGRICULTURE AND FOOD INDUSTRIES

Presented at the International Japanese Conference
on Near-Infrared Reflectance,
Nov. 20-21, 1996

Phil Williams
Canadian Grain Commission
Grain Research Laboratory


Brief Contents

Introduction
What has helped?
What has not helped?
What is needed?
What is new?
What of the future?

RECENT ADVANCES IN NEAR-INFRARED
APPLICATIONS FOR THE AGRICULTURE AND FOOD INDUSTRIES.

Phil Williams, Canadian Grain Commission, Grain Research Laboratory, 1404 - 303, Main Street, Winnipeg, Manitoba, R3C 3G8.

Introduction

Where to begin? The Sleeping Giant is well and truly awake, and by now it is running. What strides has it taken? Many of the most recent were described at NIR-95 in Montréal last year, but a lot of the modern advances have been made in industries other than ours. The technology has more than "Come of Age", since it is more than 21 years since the first "Real World" applications were documented.

During its life NIR technology has reached several "plateaux". The first was the appearance of the "stand-alone" instruments, which were the original workhorses - the Neotec Model 31, the DICKEY-john GAC III, the Technicon Models 300 and 400, and the Percon Inframatic Model 8100. These gained almost instant acceptance in the grain industries of North America, Australia and Europe, and gave NIR technology its early credibility. Next came the era of the computerized scanning NIR spectrophotometer, introduced by the Neotec/Pacific Scientific Model 6350, and closely followed by the LT "Quantum" series and the Technicon Model 500. The third stage saw the diversification of software, mainly to operate scanning spectrophotometers. The fourth "dimension" was the introduction of whole-grain analyzers, by the ever-innovative Trebor company, followed several years later by Tecator and Foss Electric, both European companies. All three whole-grain analyzers used different systems, but all of them operated over the wavelength range of 850 - 1100 nanometers (nm). Throughout the last three segments of the NIR saga, software has constantly been improved, with the adoption of the Scandinavian-born Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression, and more recently Neural Network software.

This paper is mainly concerned with the next plateaux, those of the immediate present and the future. It will discuss factors which have worked for and against the acceptance of NIR and its expansion as an analytical tool, then consider some features which would assist its acceptance. Some recent permutations in application will be reviewed, followed finally by some "over the horizon" considerations.

But first, perhaps we should consider what the technology has already achieved. Following its discovery and original development nearly 30 years ago in Karl Norris' laboratory at the USDA in Beltsville, Maryland, the practical application of the technology had very humble beginnings, at first simply the prediction of protein and moisture in wheat. Since then NIR technology has pervaded all of the most important industries on the planet. Probably the most important achievement has been the degree to which the technology has been accepted in the Grain Industry, world-wide. In Canada practically all wheat and barley is now tested for protein content by NIR, at colossal savings in time and expense, and with considerable benefit to the environment. Meat analysis by NIR is expanding, and because of the high moisture content of meat it presents different problems to overcome than grain does. Applications in the food industry have usually involved measurements of "simple" constituents such as water, protein, fat or sugar. Many of these have incorporated unique sample presentation systems for making measurements "at-line" (continuously during product manufacture or packing), and much of this diverse hardware and software is protected as proprietary.

More recently the technology has found favour in less familiar areas, such as clinical investigations, and monitoring features of the environment, including water quality, and changes in lake, river and estuarine sediments. Most of these new applications have been introduced as a result of the innovativeness of investigators anxious to test the effectiveness of the technique in replacing time-honoured, but often slower and more expensive, reference methods.


What has helped?

What has helped, and is helping, these pioneers? Probably the main feature affecting the development of new applications has been improvements in software, and in the speed and storage capacity of computers. The early days of the computerized spectrophotometer were plagued by very long (30 - 48 hour) computing times, sometimes with unreliable software and hardware, which could fail in the middle of a lengthy computation of even a four-wavelength calibration equation. Advances in computer technology and software now enable simultaneous computation of calibrations for several options of optical data treatment, with up to nine wavelength points per equation, all in a few seconds. Table 1 illustrates the influence of optimizing the first derivative treatment for the NIR prediction of protein content in wheat flour.

Table 1. Optimization of First Derivative Treatment for Prediction of Flour Protein Content (standard error of prediction, % protein).

                     Segment
Gap        2        6       10       20       40
  2    0.370    0.140    0.127    0.120    0.094
  6    0.173    0.131    0.120    0.101    0.100
 10    0.139    0.118    0.109    0.105    0.113
 20    0.119    0.105    0.107    0.101    0.111
 40    0.109    0.106    0.111    0.114    0.113
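
To make the "segment" and "gap" parameters of Table 1 concrete, the following is a minimal present-day sketch (in Python, which is not the software used for this work) of a gap-segment first derivative. The exact definition used by the original software may differ slightly, and the wavelength axis and synthetic spectrum are purely illustrative.

```python
import numpy as np

def gap_segment_first_derivative(spectrum, segment, gap):
    """First derivative of a log(1/R) spectrum: the difference between the
    means of two windows of width `segment` (in data points), separated by
    `gap` data points.  Edge points that cannot be computed are left as NaN."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = gap // 2 + segment
    deriv = np.full(spectrum.shape, np.nan)
    for i in range(half, len(spectrum) - half):
        upper = spectrum[i + gap // 2 : i + gap // 2 + segment].mean()
        lower = spectrum[i - gap // 2 - segment : i - gap // 2].mean()
        deriv[i] = upper - lower
    return deriv

# Illustration on a synthetic "water band" recorded every 2 nm
wavelengths = np.arange(1100, 2500, 2)
log_1_over_R = np.exp(-((wavelengths - 1940) / 60.0) ** 2)
d1 = gap_segment_first_derivative(log_1_over_R, segment=10, gap=20)
```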

Computation of Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression, which took several hours at the time of their introduction to NIR technology, can now be completed in a minute or so. There are now at least five different software packages available to operators of computerized spectrophotometers. Some of these incorporate very comprehensive graphics, while others have excellent systems for scatter correction. All have capabilities for computing multiple linear regression (MLR), PCA and PLS. Some of them, such as the GRAMS series, are generic, while others, such as NSAS, are dedicated to a particular series of instruments. Neural Networks are the latest addition to NIR software, and will likely find more applications in the future as the technology probes more complicated spheres. Software for at-line applications has often been developed for specific situations, and is not applicable elsewhere.

Developments in hardware include improvements in grating monochromators, and in sample presentation systems, which have become very diversified. Several options are now available, either for presenting the sample to the instrument, or the instrument to the sample. At-line applications, in which composition is monitored continuously during production, are increasing, using fibre optics and other methods to scan the analyte (the material to be analyzed). Fibre-optic cables can also be used to protect the instrument from extremes of temperature, or from hazardous materials. The diversity of applications has caused the evolution of a plethora of different sample presentation systems, many of which have been custom-tailored to specific undertakings. Samples include powders, slurries, liquids, whole grains and seeds, fresh fruits and vegetables, forages, plastics in various forms, textiles, and many other commodities and materials.

Another feature which combines hardware and software is the operation and control of instruments via networks. A company can operate a large number of instruments in different, even remote, locations from a single personal computer (PC) in a convenient central laboratory (or office). One instrument is designated as the "Master" instrument, and is calibrated by as comprehensive a method as possible. All of the other ("Satellite", or "Slave") instruments are calibrated from the Master instrument via the PC. This virtually assures that all of the instruments will give the same results for individual samples, and that all will be as accurate as the Master instrument allows. The technique was pioneered in Denmark, in collaboration with the Swedish company Perstorp Analytical, but was soon adopted in Australia, and later in the U.S. and Canada.


What has not helped?

Perhaps a paper on "Recent Advances" should include some comments on recent advances in the philosophy of NIR technology, as well as on the more practical aspects, and a word about our non-achievements should be included. Despite the obvious advantages of this very rapid, flexible, chemical-free and non-invasive technique, NIR technology has only penetrated to about 2% of the total amount of analytical work carried out in all industries. Classical visible and UV spectrophotometry, Gas-Liquid Chromatography (GLC) and High Performance Liquid Chromatography (HPLC) are generally more widely accepted than NIR. For moisture testing of grains, dielectric meters are preferred, despite their sensitivity to the temperature and bulk density of the commodity.

The most important reasons underlying non-acceptance of NIR are summarized in Table 2.

Table 2. Factors that have Not Helped the Acceptance of NIR Technology.

  1. The need for calibration for all parameters
  2. Instrument instability
  3. Sample preparation
  4. Reference analysis
  5. Lack of knowledge of the technology
  6. Skepticism

Calibration. The need for separate calibrations for every parameter of every commodity has been the most important obstacle to acceptance of NIR technology ever since its introduction. Early comparisons between NIR and the Kjeldahl reference test for protein emphasized that whereas separate calibrations were necessary for commodities as different as barley grain and straw, the same Kjeldahl test could be applied to both. As well, NIR calibrations required updating every season, and samples from all areas had to be included in the calibration and validation sets. The Kjeldahl test did not vary, and recognized no differences between samples from different growing areas or seasons. This aspect of NIR technology became increasingly apparent as operators wished to apply it to increasingly complex analyses, such as fibre components. Dielectric moisture meter calibrations, by contrast, are stable from year to year, and are not noticeably affected by factors such as growing location or grain variety. For example, there are over 100 varieties of canola seed grown in western Canada, but all of them use the same calibration chart for the Labtronics Model 919 capacitance moisture meter.

Instrument instability. No two NIR instruments are exactly alike. Minor differences in filters and monochromator gratings cause correspondingly minor differences among instruments. A fraction of a nanometer is sufficient to cause a significant difference if that wavelength is involved in an equation. Differences among instruments are minimized by the manufacturers, but they do occur. Also some of the components of instruments age and change in output with time, so that the instruments require monitoring, and occasional corrections have to be applied to calibrations. Operators tend to overlook the fact that reference methods also require regular checking. Usually the standard error of NIR testing is lower than that of the reference tests, but there remains an overall tendency to regard NIR instruments as requiring excessive monitoring.

Sample preparation. This is an area in which NIR has been effective in causing a gradual improvement. Earlier workers pointed out the importance of paying attention to routine operations such as removal of foreign material and grinding, and there has been a general trend toward improvement in sample preparation. The introduction of whole-seed NIT and NIR analyzers, which eliminated the need for grinding, was welcomed by all operators in the grain industry.

Reference methods. The above statement also applies to reference analysis. Early in the saga, the erratic results obtained on interposing blind duplicates in the verification of NIR analysis caused numerous laboratories to re-evaluate the accuracy and reproducibility of their time-honoured reference methods, which they had always regarded as virtually infallible. The reference method is still fundamental in establishing reliable calibrations. Surprisingly, absolute accuracy is not essential to the development of the calibration equation itself, provided the results are reasonably close (within one standard deviation) to the true result. But unfailing accuracy is essential to the evaluation of the calibration after its development, since it is upon these validation and monitoring results that the integrity of the NIR method is based.

Lack of knowledge of the technology. Near-infrared technology has become a household word in the grain and food industries, and has attracted the attention of multifarious would-be users. Many of these experience surprise, disappointment and disillusionment early in their attempts to apply the technology, and may revert to their trusted methods without further exploration. In most cases these frustrations can be traced to mediocre reference analysis, or to unfamiliarity with the basic principles of the technology. Seasoned workers in the field have usually spent several years working with and studying the technology, and the factors which affect its accuracy. Neophytes often expect to become experts after a few days, or even hours, of training, during which they learn the rudiments of operation and calibration, and possibly some of the most basic aspects of the physics and chemistry (to which they are less likely to pay attention). Texts by Osborne et al., Burns and Ciurczak, Shenk and Westerhaus, and Williams and Norris have been prepared in attempts to explain the technology. Getting people to read such literature is more difficult than preparing it!

Skepticism. A lot of laboratories still doubt that the technique works. They are discouraged by the high price of equipment, and further by the knowledge that they face a lot of work and additional expense in assembling samples and carrying out reference analysis before they can use it in their operation. Some companies acquire second-hand instruments to save the expense of buying new equipment, not realizing that their purchases are instruments that have been discarded by previous owners, who have usually updated their NIR equipment. The new owners often obtain only mediocre performance from their instrument, and conclude that the technology is at fault. Still another source of discomfort may arise from attendance at NIR conferences, where several people talk learnedly about the advantages of new versions of software. This can be confusing to potential users, who are left uncertain as to which version to buy, but well aware that each of the packages costs several thousand dollars. Rarely does anyone enlighten them that most of the packages would work equally well with their instrument.


What is needed?

Features which would be welcomed are tabulated in Table 3. These are not listed in order of importance, since all of them are important. All of them are feasible, and several involve software, rather than the more expensive engineering changes.

Table 3. Features that would assist NIR Technology.

  1. Simpler calibration
  2. Calibration transfer
  3. More User-friendly software
  4. Fully-automated optimization of optical signals and wavelength range
  5. More reliable reference methods
  6. Education in NIR technology
  7. Dedication - it takes a long time to be really good at any technique.

Simpler calibration. Modern instrument manufacturers recommend the use of hundreds of samples in calibration development, particularly for whole-grain instruments. This is time-consuming and expensive, and a calibration procedure which involved fewer samples would be welcomed. Several years ago an applications specialist at Pacific Scientific (formerly Neotec, and now NIRSystems) developed an algorithm which allowed calibration with only two samples of grain, which represented the range of composition and spectral variance to be anticipated in future samples. Far-fetched? Not really. The system worked for the simple instruments for which it was designed. The procedure required preparation of two samples which combined a wide range of variance in factors such as growing location and season, together with the highest and lowest composition levels.

The main factor which creates the requirement for large numbers of samples is the need to accommodate all variations in the interactions between the instrument and the optical signals generated from the sample. These are affected by differences in the absorption coefficient and the scatter coefficient. At least one instrument company is working on developing a system for resolving these differences and building the algorithm into the instrument. When this is achieved it should greatly simplify calibration, and should also improve the efficiency of networking.

Calibration transfer. At present calibration transfer can be achieved by the network system referred to earlier. Even the best-engineered instruments differ slightly from one another, and small corrections in slope and bias, determined by testing the same reference samples on each newly-calibrated networked instrument, are applied as required. This is not a major problem, and despite the criticisms of some purists, slope/bias corrections usually work well, provided they are not abused by operators over-reacting to small daily fluctuations in accuracy (for example +/- 0.1% protein). Further improvements in instrument design may be expected to reduce the need for slope/bias correction even further.
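
As an illustration of the slope/bias correction itself, the sketch below (Python, with invented protein values) fits a simple linear correction from a set of check samples analysed on both the master and a satellite instrument, and applies it to a routine satellite result. It is a sketch of the general idea, not of any particular vendor's networking software.

```python
import numpy as np

# Check samples analysed on both instruments (% protein; values are invented)
master    = np.array([11.2, 12.8, 13.5, 14.1, 15.6])
satellite = np.array([11.5, 13.0, 13.9, 14.3, 16.0])

# Fit: master result is approximated by slope * satellite result + bias
slope, bias = np.polyfit(satellite, master, deg=1)

def corrected(satellite_result):
    """Apply the slope/bias correction to a routine satellite result."""
    return slope * satellite_result + bias

print(f"slope = {slope:.3f}, bias = {bias:.3f}, 13.2 -> {corrected(13.2):.2f}")
```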

More user-friendly software. Modern NIR software is becoming increasingly comprehensive, but as it improves it tends to become increasingly complicated for the user, who is compelled to spend considerable time learning all of its features. Teaching the software is equally arduous. Software manuals may run to several hundred pages. Furthermore, some software calls for the creation of separate files of optical data, transformed from the original spectra, which is another source of confusion. A useful adjunct to a software ensemble would be a short stepwise summary of what has to be done to set up and edit files, change mathematical treatments, optimize wavelength ranges, perform regressions, and carry out other frequently required operations. The text could explain these, and the underlying principles, in greater detail later in the manual, but the operator would be able to carry out the most important steps in developing calibrations quickly and without the frustrations which accompany some software.

From the point of view of the software developers, their job is constantly being affected by changes in computer technology, such as the introduction of Windows 95, which introduced many changes in the way PCs work. All of these innovations are aimed at improving the overall efficiency of any operation which requires a computer, but when a company is informed that it will need to replace all of its PCs to run new software, it is often discouraged from updating by the extra expense, particularly when its analytical systems are working well.

Fully-automated optimization of optical signals and wavelength range: Table 1 presented the results of optimizing the first derivative of the optical signals of a set of wheat flour samples for the prediction of protein content. This exercise occupied about one hour, including prediction of the validation set with all combinations of the "segment" and "gap". The same exercise for optimization of the second derivative would take about the same time, while optimization of the log 1/R signals would take less, since no gap is involved. During development of calibrations for computerized spectrophotometers this step is essential. The next step is to optimize the wavelength range, and the final step is to determine whether the application of scatter correction will improve the calibration.

The total operation can occupy more than a working day, much of which is keyboard work and documentation. Provided all of the options are available in the software - for example, ISI software incorporates all of the required options - a relatively simple addition could enable the computer to carry out all of the optimizing steps and output the best combination. The operator would only need to prepare calibration and validation sample sets and instruct the computer as to the methods to use; the computer would then carry out the optimization. A modern Pentium PC could probably achieve this in one or two hours, leaving the operator free to do something else.
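
What such an automated step might look like is sketched below, again as a present-day Python illustration rather than a description of ISI or any other package. It loops over candidate segment/gap combinations for the first derivative, builds a PLS calibration on the calibration set (MLR could equally be used), and reports the combination giving the lowest SEP on the validation set; the spectra and reference values are assumed to be supplied by the operator.

```python
import itertools
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def gap_segment_d1(spectrum, segment, gap):
    """Gap-segment first derivative (same idea as the sketch following Table 1)."""
    spectrum = np.asarray(spectrum, dtype=float)
    out = np.full(spectrum.shape, np.nan)
    for i in range(gap // 2 + segment, len(spectrum) - gap // 2 - segment):
        out[i] = (spectrum[i + gap // 2 : i + gap // 2 + segment].mean()
                  - spectrum[i - gap // 2 - segment : i - gap // 2].mean())
    return out

def sep(y_true, y_pred):
    """Bias-corrected standard error of prediction."""
    d = np.asarray(y_true) - np.asarray(y_pred)
    return np.sqrt(np.sum((d - d.mean()) ** 2) / (len(d) - 1))

def optimize_first_derivative(cal_spectra, cal_ref, val_spectra, val_ref,
                              segments=(2, 6, 10, 20, 40), gaps=(2, 6, 10, 20, 40),
                              n_factors=9):
    """Return (lowest validation SEP, segment, gap) over all candidate treatments."""
    best = None
    for segment, gap in itertools.product(segments, gaps):
        X_cal = np.array([gap_segment_d1(s, segment, gap) for s in cal_spectra])
        X_val = np.array([gap_segment_d1(s, segment, gap) for s in val_spectra])
        keep = ~np.isnan(X_cal[0])          # drop edge points lost to the derivative
        model = PLSRegression(n_components=n_factors).fit(X_cal[:, keep], cal_ref)
        error = sep(val_ref, model.predict(X_val[:, keep]).ravel())
        if best is None or error < best[0]:
            best = (error, segment, gap)
    return best
```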

More reliable reference methods: The "wet chemist" is still needed! All NIR applications have the objective of predicting the results of time-consuming and expensive reference methods. Even the Kjeldahl test for protein takes two hours between the initial weighing and the determination of the end result by titration. The dietary fibre test takes nearly two days, and is of little practical value to an operation which needs to know the dietary fibre level during preparation of a food or feed. Improvements are continually being made to many of the reference methods. Some can be related with confidence to the original procedure, whereas others cannot. For example, there is a gradual world-wide move away from the Kjeldahl test to the Dumas, or Combustion Nitrogen Analysis (CNA), method as a reference for protein and total nitrogen content in foods and feeds, and their components. Dumas instruments are calibrated with pure chemicals of known nitrogen content, so the future accuracy of protein testing is assured. On the other hand the dietary fibre test is constantly being changed, and cannot be directly related to what it is supposed to represent, which is the influence of non-digestible food/feed constituents on digestibility and animal metabolism. Moisture is another area of concern, since there is no consensus as to the most reliable reference method. In some cases NIR is probably more reliable than the reference methods against which the instruments are calibrated, but for legal and regulatory reasons agreement has to be reached on the reference methods. Often conflicts induced by politics and personalities result in disagreements on reference methods which are difficult to resolve.

Education in NIR technology: The need for education in NIR technology is increasing, as more companies, government agencies and universities seek to use NIR. Education is needed to explain the options offered by new software, such as Neural Networks. There is a lot of confusion as to where features like PCA, PLS and Neural Networks belong, together with more confusion as to the suitability of the much-maligned Multiple Linear Regression (MLR). The foundation stone upon which NIR was originally based, MLR still has many applications, and in many cases is equal or slightly superior to PCA/PLS. There is a growing need for a formal course in NIR technology at universities or other institutions of post-secondary education. The Council for Near-infrared Reflectance offers a taped course in NIR technology. This course was developed at Kansas State University, but to date no other educational institution appears to have taken on the responsibility. NIR News, an excellent English newsletter, is establishing an Internet website with the objective of answering questions on any aspect of the technology. This website will become functional early in 1997.

Dedication: Success in NIR application cannot be achieved overnight. The operator and sponsoring organizations must be prepared to assign the right people to develop their applications, and allocate them sufficient time to devote to becoming expert in their use.


What is new?

This section will discuss some new tools, applications, and methods of evaluation of NIR testing. New tools include new hardware and software. The introduction of PCA/PLS to the development of NIR calibrations is not new, having occurred about 15 years ago; the software originated in Sweden long before it was applied to NIR analysis. The theory was that by utilizing all available wavelengths, the risk of over-fitting the data, inherent in MLR calibration development, is eliminated. While this is true in itself, a different type of hazard associated with PCA/PLS has emerged. The PCA/PLS software usually enables the computing of equations using the scores derived from 15 or more principal components. Most of the variance is usually accounted for by the first five, but the most successful equations (based on prediction of validation samples) often call for 8-10 factors. If the "weights", or scores, of the individual components are plotted, the later components show increasing system noise, and using them may cause the calibration to become sample-sensitive. Equations using the smallest number of factors are usually the most reliable, which makes PCA/PLS calibrations analogous to MLR calibrations, where equations based on the fewest wavelength points are preferred. Table 4 illustrates the degree to which variance is accounted for in PCA/PLS calibrations for predicting ash and protein in wheat flour.
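
One hedged illustration of the point about the number of factors: the sketch below (Python; not the procedure used for the Table 4 calibrations) predicts an independent validation set with an increasing number of PLS factors and prefers the smallest model whose SEP is close to the best one found. The 5% tolerance is an arbitrary choice for illustration only.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def choose_n_factors(X_cal, y_cal, X_val, y_val, max_factors=15, tolerance=0.05):
    """Return (n_factors, SEP) for the smallest model whose validation SEP is
    within `tolerance` of the lowest SEP observed."""
    seps = []
    for n in range(1, max_factors + 1):
        model = PLSRegression(n_components=n).fit(X_cal, y_cal)
        d = y_val - model.predict(X_val).ravel()
        seps.append(np.sqrt(np.sum((d - d.mean()) ** 2) / (len(d) - 1)))
    best = min(seps)
    for n, s in enumerate(seps, start=1):
        if s <= best * (1 + tolerance):
            return n, s
```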

Over 98% of the spectral variance was accounted for by only three components in the protein calibration, whereas this degree of variance was not accounted for in the ash calibration even by all 15 components. On the other hand, the calibration for protein with the lowest SEP and highest r2 values required 9 factors, while the best ash calibration used 12.

Table 4. Flour Protein and Ash Prediction by PCA/PLS Regression: Proportion of Spectral Variance Accounted for by Principal Components.

            Cumulative proportion (%)               Cumulative proportion (%)
Component      Protein       Ash       Component       Protein       Ash
    1            37.1        0.7           9             99.4       81.6
    2            80.1        7.3          10             99.5       85.9
    3            98.1       15.0          11             99.5       86.8
    4            98.4       32.6          12             99.6       87.9
    5            99.3       57.0          13             99.6       88.7
    6            99.4       65.0          14             99.7       91.1
    7            99.4       79.2          15             99.7       92.3
    8            99.4       81.1

Most new applications involve the application of NIR or NIT (Near-infrared Transmittance) technology to different commodities and processing conditions, rather than to new parameters for a familiar commodity. The new uses often consist of the prediction of sugar, starch, moisture, protein or fat in a different medium. The novel features of such an application may involve engineering a unique system of sample presentation, a new system for incorporating the NIR data into the process, or something else unique to the operation. Over the past eight years several presentations have described the merits of new instrument designs, including Acousto-optical Tunable Filter (AOTF) and Liquid Crystal Tunable Filter (LCTF) instruments, Fourier-Transform NIR (FTNIR), and Diode Array instruments. Of these, FTNIR and AOTF instruments are now available, with most applications in transmittance. These instruments operate with a small, high-energy beam focused onto a small area, and are not well adapted to reflectance measurements; operated in reflectance mode they would risk over-heating the sample.

A Diode Array device, the DA-7000, has been introduced by Perten Instruments (North America). The DA-7000 has been engineered to operate in reflectance mode, although it can also operate in transmittance, and can accept both large (over 100 g) and small samples. Tests are complete in about one second, during which the instrument scans a wavelength range of 400 - 1700 nm 600 times. The speed can be improved by scanning over a selected wavelength range, which enables the instrument to scan a moving belt and "stop" it, in a manner analogous to a high-speed camera. Material on the belt can be subjected to continuous analysis. The instrument itself can be protected from dust, extremes of temperature, etc. by using fibre-optic probes to gather the optical signals, with the optical data transferred for processing to a PC located as far from the moving belt as necessary. The DA-7000 can also be used as a scanning visible-NIR spectrophotometer, in which role it could serve as a research instrument for a wide range of applications. Some results for the analysis of canola seed are presented in Table 5.

Table 5. Preliminary Data for Prediction of Canola Parameters by the Perten DA-7000 Diode Array Visible-NIR Instrument.

Statistic Oil Protein Chlorophyll
r2 0.96 0.96 0.92
SEP 0.56 0.39 2.92
RPD 8.1 9.0 4.7
N* 74 74 74

An RPD value (the ratio of the standard deviation of the reference data to the SEP) of 2.5 - 3.0 is regarded as adequate for rough screening. A value above 3.0 is regarded as satisfactory for screening (for example in plant breeding), values of 5 and upward are suitable for quality control analysis, and values above 8 are excellent and can be used in any analytical situation. These results were achieved by PCA/PLS regression, without further optimization.
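
Expressed as a one-line calculation (a sketch only, using the definition given in the legend to Table 7), the RPD is simply the reference standard deviation of the prediction set divided by the SEP; the implied reference standard deviation for the Table 5 oil data is worked out below as a check.

```python
import numpy as np

def rpd(reference_values, sep):
    """Ratio of the standard deviation of the reference data to the SEP."""
    return np.std(reference_values, ddof=1) / sep

# From Table 5: SEP = 0.56 and RPD = 8.1 for oil imply a reference standard
# deviation of roughly 0.56 * 8.1 = 4.5% oil in the prediction set.
```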

Probably the most dramatic of the recent new applications have been in fields other than food and agriculture. The non-invasive capability of NIR technology is finding a number of applications in clinical diagnostics, and in the direct analysis of body fluids without the need to take samples. The other area which is creating a lot of interest is the application of NIR to the analysis of environmental materials, including water from lakes, rivers and the oceans, and the sediments underlying them. Earlier applications led to substantial reductions in the time and cost of testing for nitrogen, carbon, phosphorus and total organic matter. But the most revolutionary discovery was that it was possible to predict heavy metals in the sediments with satisfactory accuracy. Using Principal Components, the workers were able to conclude that their measurements were based on associations between the metals and organic material in the sediments.

As well as prediction of composition, in recent years more attention has been paid to prediction of the functionality of raw and processed materials, such as flour and feed grains. The familiar physical dough characteristics, as measured by the Brabender Farinograph and Extensigraph and the Tripette/Renaud (formerly Chopin) Alveograph, can be predicted with reasonable success. Fibre components, in vitro digestibility and metabolizable energy have been predicted in feed grains. Two of the main difficulties with NIR prediction of nutritional factors are (a) the tendency for reference methods to change, and (b) a lack of consensus among nutritionists as to which dietary fibre method is of most value in characterizing a prepared food or feed, or their ingredients. A significant change in a reference method may mean repeating the whole calibration exercise, including the accumulation of samples, while disagreement as to which fibre test is the most appropriate may render a calibration practically useless. Table 6 summarizes some functionality parameters for which NIR calibrations have been developed.

Table 6. NIR Prediction of some Functionality Factors:

Commodity Parameter r2* SEP RPD
CWRS Wheat Far. Stability 0.74 2.3 2.9
CWRS Wheat Loaf Volume 0.88 30 3.5
CWRS Wheat Water Absorption 0.76 0.9 2.7
Soft Wheat Alveograph "W" 0.79 21 3.8
Soft Wheat AWRC 0.71 0.4 13.6
Soft Wheat Cookie Spread 0.18 2.1 1.1

Another useful and recent application of NIR concerns prediction of damage to grains due to weathering or fungal infestation. The objective is to provide a system for screening grain deliveries into material which does not require further testing, because it is clearly too high or too low in the parameter for which it is being screened, and material whose composition calls for verification by the reference method. At the Canadian Grain Commission NIR has been applied to the prediction of Fusarium Head Blight (FHB) and sprouting damage. The respective reference tests are the determination of deoxynivalenol (DON) by GLC, and the Falling Number (FN) test. Deoxynivalenol is a mycotoxin produced by Fusarium species. Here the criterion which must be equaled by NIR is the degree to which visual grain inspection can predict the presence of damaged kernels. Table 7 summarizes some recent results for predicting FHB and FN, together with the corresponding coefficients of correlation between visual inspection and DON or FN.

Table 7. NIR prediction of Grain Damage to CWRS Wheat by Fusarium sp. and Sprouting.

DON (ppm)* FN (secs).
Statistic Visual NIR Visual NIR
r2 0.74 0.80 0.90
SEP NA 1.49 NA 17
RPD NA 2.7 NA 2.0
High 30.4 10.6 420
Low 0 0 0 270

* Legend: DON = deoxynivalenol; FN = Falling Number; r2 = coefficient of determination; SEP = standard error of prediction; RPD = ratio of standard deviation of reference data of prediction sample set to SEP.

Recent progress in the interpretation of NIR calibrations includes an assessment of the value of cross-validation, particularly for use with small sample sets, and examination of the "scores", or "weights", generated during the development of PCA/PLS calibrations, as a means of studying factors which affect calibrations. Early in the evolution of NIR it was fashionable for scientists to publish papers describing the results of calibrations derived by MLR from 20-30 samples, with 10-12 further samples being used for validation. Recent research has indicated that this approach may be misleading. The samples used in validation have been selected from the total population to represent the full range of composition. This practice tends to "build in" correlation between the composition factor and the optical NIR data. As a result the SEP and r2 statistics appear to be better than they really are. A more reliable method for use with sample sets of up to 50 or 60 is cross-validation. In this method all of the samples are used, both in calibration and in validation. The first sample is removed, and a calibration is developed with the remaining samples. The first sample is then restored to the file and its result predicted, using the calibration developed in its absence. The second sample is removed, and the process repeated until all samples have been used in calibration and validation, but with the predictions always based on calibrations developed in the absence of the predicted sample.
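
The cross-validation loop described above can be sketched as follows (present-day Python, purely illustrative). X is assumed to hold only the few wavelength points chosen for an MLR equation, so that a least-squares fit is sensible for 20-30 samples; the same loop applies unchanged to PCA/PLS calibrations.

```python
import numpy as np

def cross_validate(X, y):
    """Leave-one-out cross-validation of an MLR calibration.
    Each sample is predicted from a calibration developed in its absence."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    predictions = np.empty_like(y)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i                        # leave sample i out
        A = np.column_stack([X[mask], np.ones(mask.sum())])  # add intercept term
        coeffs, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        predictions[i] = np.append(X[i], 1.0) @ coeffs
    d = y - predictions
    sep = np.sqrt(np.sum((d - d.mean()) ** 2) / (len(y) - 1))
    return predictions, sep
```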

Table 8 illustrates this concept using a "real world" study carried out to determine the efficiency with which a series of flour additives could be predicted using NIR.

Table 8. Comparison of Cross-validation and MLR Calibrations using Small Sample Sets.

                             Constituent
Statistic         1      2      3      4      5      6      7
MLR
  r2           0.92   0.96  0.99+   0.95   0.98  0.99+   0.79
  SEP          0.01   0.01   0.11   0.03   0.11   0.19   0.02
  RPD           5.4    8.6   19.1    7.8   10.1   13.1    2.7
Cross-validation
  r2           0.76   0.78   0.99   0.91   0.95   0.98   0.12
  SEP          0.02   0.03   0.16   0.05   0.17   0.24   0.04
  RPD           2.7    3.0   13.4    4.5    6.6   10.2    1.2

The cross-validation results were compared with those obtained using MLR. For the MLR calibrations 24 samples were used in calibration and 6 for validation. The MLR exercise was used to optimize the mathematical treatment of the log 1/R signals, and the cross-validation calibrations were developed using the mathematical treatment which gave the best MLR result. The MLR results implied that calibrations could be developed for all constituents. While the cross-validation results were satisfactory for 6 of the constituents, the data were not satisfactory for constituent No. 7. The implication was that more samples would be necessary to determine whether an acceptable calibration could be developed for this constituent. Cross-validation is laborious unless software is available to automate the operation; otherwise it is necessary to repeat the calibration development for every sample, which is time-consuming.

A feature of modern NIR technology which is becoming useful in interpreting the factors which may influence the way in which calibrations are developed is the display of the weights derived during the development of PCA/PLS equations. Figures 1 - 3 illustrate the results of plotting these weights for some functionality parameters of grains and derived products. The weights indicate the wavelength regions where the variance used in computing the equations was most significant. Both positive and negative influences can be determined. For log 1/R PCA/PLS calibrations, "peaks" in the display indicate positive influences and "valleys" negative influences on the computation of the calibration equation. The reverse is true for the weights derived from PCA/PLS calibrations developed using the second derivative of the log 1/R data.
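
As a hedged illustration of this kind of display (not the software used to produce Figures 1 - 3), the sketch below fits a PLS calibration and plots the weights of the first two components against wavelength; the spectra and reference values here are random placeholders to be replaced by real log 1/R data.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
wavelengths = np.arange(1100, 2500, 2)
X = rng.normal(size=(60, wavelengths.size))   # placeholder for real spectra
y = rng.normal(size=60)                       # placeholder for reference values

model = PLSRegression(n_components=3).fit(X, y)
for k in range(2):                            # first and second components
    plt.plot(wavelengths, model.x_weights_[:, k], label=f"Component {k + 1}")
plt.axhline(0.0, linewidth=0.5)
plt.xlabel("Wavelength (nm)")
plt.ylabel("PLS weight")
plt.legend()
plt.show()
```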

The magnitude of the displayed weights should be related to the degree to which each component accounted for the overall variance in the population. For example, in the case of the flour protein content data of Table 4, the first, second and third components accounted for 37%, 43% and 18% of the total variance respectively, whereas the fourth component accounted for only an extra 0.3%. Accordingly, less significance should be attributed to the positive and negative influences of the fourth and subsequent components (despite the fact that, when displayed, their peaks and valleys fill the screen).

Figure 1 shows the weights for the first and second components of an equation for prediction of wheat kernel hardness by the Particle Size Index (PSI) grinding/sieving method. The first component looks like an "upside-down" spectrum of wheat, and was interpreted to indicate a primary influence of particle size on the prediction of PSI. The second component showed strong positive influences at 1430 and 1930 nm, and confirmed the strong influence of moisture on the determination of wheat hardness. Figure 2 shows the first and second components of a PCA/PLS calibration developed for the prediction of Farinograph Mixing Tolerance Index, using the second derivative of the log 1/R data. Here the influence of water was again very important in the development of the equation. Positive influences of protein (1690 and 1734 nm) and oil (2306 and 2346 nm) are also indicated. The influences of some constituents on the development of equations for the prediction of barley True Metabolizable Energy (TME) are indicated in Figure 3, which represents the third component of a second-derivative PCA/PLS calibration. Here a strong "peak" at 2270 nm points to a negative influence of cellulose on TME, while valleys at 2306 and 2346 nm again suggest a positive influence of oil. This interpretative work is continuing at the GRL, with particular reference to improving our understanding of factors which affect functionality.


What of the future?

The future of NIR technology is assured, and full of promise. The main reasons are its flexibility, its virtual freedom from chemicals, its non-invasiveness and its ease of sample preparation. New instruments will appear, using improved engineering, with new software to drive them and interpret the results. The technique will expand into fields of research which use it to improve our understanding of functionality. Methods for electronic classification and grading of grains and other commodities will also utilize NIR, mainly because of its speed, its ability to process many samples in a short time, and its applicability to at-line processes. Other techniques, such as FTNIR, will expand, but NIR can be applied to the analysis of practically anything, and can be used in transmittance or reflectance mode with materials ranging from liquids through slurries to powders and whole grains, pellets or tablets. The limitations which apply to NIR, such as its lack of applicability to things like trace elements, apply equally to FTNIR (for example).

An important question facing instrument manufacturers is which direction to take when considering the development of a new instrument, or a new feature of an instrument. Development is very expensive, and the investment can only be justified if sales are assured. For example, it may not be good policy to invest in improvements to monochromator-type instruments if an alternative principle, such as the diode-array approach, promises to be superior over the long term. The same applies to research institutions which embark on long-term research (up to 3 years). If they invest time and money in assembling samples and perfecting and carrying out reference analysis for use with monochromator-based technology, they may, when their research is complete, find that it has been made obsolete by the rise of another technique. One solution is for the research institute to purchase both types of instrument and carry out the research simultaneously, but it is often difficult to justify the extra expense for an instrument which has not been widely accepted at the time the research commences.

"There is nothing more difficult to take in hand, more hazardous in its undertaking, and more uncertain in its success than to take the lead in the development of a new way of things".

N. Machiavelli. Il Principe.


