SIAM ACTIVITY GROUP ON IMAGING SCIENCE
SIAGIS NEWSLETTER
Volume 1, Issue 1, April 2002
Editor: Bernard Mair
Department of Mathematics
University of Florida
Gainesville, FL
Please send submissions to: [email protected]
More information on this activity group can be found at
SIAGIS Home Page
Table of Contents
- Introduction
- An Approach to the Task-based Assessment of Image Quality
- ONR Research Areas related to Imaging Science
- Introduction
Bernard Mair, Editor
This is the first issue of the newsletter for the SIAM Activity Group
on Imaging Science. The aim of this newsletter is not only to provide
information on activities and publications, but also to be a forum for a discussion of issues and views related to imaging science. Please submit items that may
be relevant to the imaging science community to [email protected]
The SIAM Activity Group on Imaging Science (SIAGIS) began operations in January 2000 and organized Minisymposia at
the SIAM 2000 Annual Meeting in Puerto Rico. Its first full scale conference,
Imaging Science 2002, was held March 4 - 6, 2002 in
Boston, Massachusetts. In an effort to facilitate the
interplay between imaging and other areas of research, this conference
overlapped one day with Life Sciences 2002.
For information on the conference program see
Imaging Science 2002
Pictures from the conference are at Imaging Science 2002 Pictures
Over 220 scientists participated in Imaging Science 2002.
The organizing committee would like to thank all the speakers, organizers,
and participants for their excellent contributions to this meeting.
We have heard many encouraging reports on the quality of the presentations.
Look for a "report" on this conference in the May issue of SIAM News.
As a result of discussions at the funding panel at this conference, Dr. Wen Masters has submitted a list of research areas in imaging science
that are of interest to the Office of Naval Research; see the third article in this issue.
This issue features the following interesting article on the thorny issue
of assessing image quality.
- An Approach to the Task-based Assessment of Image Quality
Eric Clarkson and Harrison H. Barrett
Department of Radiology/Optical Sciences Center
University of Arizona
Tucson, Arizona
Email: [email protected]
As medical imaging systems and reconstruction algorithms proliferate,
the objective measurement of image quality becomes more and more important.
How can we determine when imaging system and/or reconstruction algorithm A is
producing better images than imaging system and/or reconstruction algorithm B?
One commonly used method is simply to look at some images produced by A and B,
and decide which ones "look better". This comparison is usually performed by
researchers who have invented a new system or algorithm, and they usually
decide that their new system is better by this criterion.
A slightly more objective approach is to show the images from A and B to
some radiologists and ask them which ones "look better". One would hope that a
radiologist would base this determination on the usefulness of the image for the
detection of abnormalities or the estimation of clinically important parameters,
as opposed to the simple visual appearance of the image, but there is no
guarantee that this is the case. This approach does have the virtue of focusing
our attention on the observers of the images, the radiologists, and on the
tasks that they are interested in performing. It is then a small step to require
that the performance of the observers on the relevant tasks
be measured in some objective manner in order to compare the quality of the
images produced by A and B.
This line of reasoning leads to the concept of task-based measures of image
quality. In order to assess the quality of the images produced by a given
imaging system we specify three things: the task, the observer and the
statistics of the output of the system. We then measure how well the observer
performs the task on average. To specify the task we must state clearly what
information we wish to extract from the images produced by the system. To
specify the observer we must describe the process by which this information will
be extracted from the images. To specify the statistics, we must provide a
statistical description of the ensemble of objects that are being imaged and the
measurement noise in the imaging system itself. Finally, we need a figure of
merit that objectively measures the average performance of the observer on the
given task or tasks.
Tasks of interest in medical imaging can be roughly divided into two categories.
Many times the observer is interested in detecting an abnormality. An example of
this kind of detection or classification task is a mammographer looking for
evidence of a tumor in a mammogram. At other times the observer is trying to
estimate some numerical parameter that is correlated with the health status of
the patient. An example of such an estimation task is the measurement of the
cardiac ejection fraction, the fraction of the blood in the left ventricle that
is ejected on contraction, for a heart patient. There are also times when the
task of interest is a combination of detection and estimation. Often with
cancer, for example, we want to detect a tumor and estimate its size, or some
other quantity related to malignancy. Of course, many medical imaging systems
are used for more than one task, in which case the performance of the observer
on all of the tasks must be taken into account. It is also true, however, that
one current trend in medical imaging is the design of specialized systems with
more narrowly defined missions. Mammography is an example of this approach.
The imaging system hardware together with the reconstruction algorithm produces
images for humans to view. Thus, in medical imaging at
least, the most important observer for determining image quality for the whole
system is the human observer. This is especially true for
detection and classification tasks, since many estimation tasks are now
performed automatically by programs that work with the digital
images stored on the computer. One way to measure the performance of human
observers on a detection task is to compute their performance on two-alternative
forced-choice tests (2AFC). In a 2AFC test the observer is presented with many
pairs of images, one of which is drawn from the ensemble of patients that has
the abnormality of interest (the signal present hypothesis), while the other is
drawn from an ensemble of patients without the abnormality (the signal absent
hypothesis). All images are created using a single imaging system. For each
image pair, the observer must decide which image corresponds to the signal
present class. The fraction of correct decisions is then a number between zero
and one that measures the performance of the observer-system combination on the
detection task. This type of measurement is called an observer study.
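The 2AFC fraction-correct measure can be illustrated with a small simulation, assuming a toy scalar test statistic that is Gaussian under each hypothesis; the function name, means, and standard deviation below are hypothetical choices, not drawn from any particular observer study:

```python
import random

random.seed(0)

def fraction_correct_2afc(n_pairs, signal_mean, absent_mean, sigma):
    """Simulate a 2AFC test: each trial presents one signal-present and
    one signal-absent image, and the observer picks the image whose test
    statistic is larger. Returns the fraction of correct decisions."""
    correct = 0
    for _ in range(n_pairs):
        t_signal = random.gauss(signal_mean, sigma)  # signal-present statistic
        t_absent = random.gauss(absent_mean, sigma)  # signal-absent statistic
        if t_signal > t_absent:
            correct += 1
    return correct / n_pairs

# Toy parameters: well-separated means, unit-variance statistic.
pc = fraction_correct_2afc(10000, signal_mean=1.5, absent_mean=0.0, sigma=1.0)
```

The resulting number lies between one half (guessing) and one (perfect discrimination), exactly as described above.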
Observer studies are usually expensive and time consuming. For this reason it is
useful to have a machine observer, a computer program, whose performance on
detection tasks matches that of human observers. We can then let the machine
observer perform the 2AFC test in order to arrive at our measure of image
quality. In general, such a machine observer computes a test statistic, a
real-valued function of the digital image that results from the reconstruction
algorithm, and compares this number to a threshold. We may regard a digitized
image as a vector in a large-dimensional vector space, and a machine observer is
then a real-valued function on this space. If this function is linear, we say
that it determines a linear observer. A figure of merit that often correlates
well with performance on a 2AFC test for linear observers is the signal-to-noise
ratio (SNR). This is the absolute value of the difference in the mean value of
the test statistic under the two hypotheses, divided by the square root of the
average of the corresponding variances. There is some evidence that the human
visual system uses frequency channels to reduce the dimension of the image as
part of the detection process. Indeed, for some detection tasks, if we filter
the reconstruction through a relatively small number of channels and then find
the linear test statistic on the low-dimensional channel space that maximizes
SNR, we get a machine observer whose performance correlates well with human
performance on the same tasks. This "channelized Hotelling observer" with some
independent "internal noise" added can then be used as a model human observer.
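The SNR figure of merit just described can be sketched directly, assuming we already have samples of a linear observer's test statistic under the two hypotheses; the function name and the toy sample values are illustrative only:

```python
import statistics

def linear_observer_snr(t_signal, t_absent):
    """Signal-to-noise ratio for a linear observer: the absolute value
    of the difference of the mean test statistic under the two
    hypotheses, divided by the square root of the average of the
    corresponding (sample) variances."""
    delta = abs(statistics.mean(t_signal) - statistics.mean(t_absent))
    avg_var = 0.5 * (statistics.variance(t_signal)
                     + statistics.variance(t_absent))
    return delta / avg_var ** 0.5

# Toy test-statistic samples under the two hypotheses.
snr = linear_observer_snr([2.0, 4.0], [0.0, 2.0])
```

Here the means differ by 2 and both sample variances are 2, so the SNR is 2 divided by the square root of 2.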
For the optimization of the hardware component of the imaging system we would
like a figure of merit that is independent of the reconstruction algorithm. This
precludes the use of human or model-human observers and, for detection tasks,
leads us to consider the ideal observer. The test statistic for the ideal
observer is the likelihood ratio, the ratio of the probability density for the
raw data under the signal-present hypothesis to the corresponding density under
the signal-absent hypothesis. This observer has the property that it performs
better on the 2AFC test than any other observer, and therefore it represents the
maximum performance we could expect from the imaging system on the given
detection task. Since the ideal observer uses the raw data, the reconstruction
algorithm has no bearing on its performance. The main problem with the ideal
observer is that the likelihood ratio is difficult to calculate for any
realistic medical imaging task.
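In the special case of a known signal added to independent Gaussian noise of known variance, the likelihood ratio does have a simple closed form, which the following sketch computes; the function name and parameters are illustrative, and, as discussed next, realistic object variability destroys this simplicity:

```python
import math

def gaussian_likelihood_ratio(g, signal, sigma):
    """Likelihood ratio p(g | signal present) / p(g | signal absent)
    for a data vector g equal to a known signal plus i.i.d. Gaussian
    noise of standard deviation sigma. The quadratic terms in g cancel,
    leaving a log-likelihood ratio linear in the data."""
    log_lr = 0.0
    for gi, si in zip(g, signal):
        log_lr += (gi * si - 0.5 * si * si) / (sigma * sigma)
    return math.exp(log_lr)

# One-pixel toy example: data exactly equals the signal.
lr = gaussian_likelihood_ratio([1.0], [1.0], sigma=1.0)
```

Because the log-likelihood ratio is linear in the data here, the ideal observer for this special case is itself a linear observer.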
The reason why the likelihood ratio is difficult to compute is related to the
complexity of the statistics of the raw data. In general the normal anatomy
creates a randomly varying background for the signal. There are structures in
this background, such as bones, veins and major organs, as well as fine-scale
textures. Then the signal itself has random variations. A tumor, for example,
may have random size, location and shape. The statistics of these random
variations in background and signal, the object statistics, are not well
understood. Fortunately, it is only the statistics of the resulting data that we
need for the likelihood ratio. This at least reduces the object statistics
problem to a density estimation problem on a finite-dimensional space. Finally,
there is the noise from the imaging system itself, which is often well
understood. In single photon emission computerized tomography (SPECT), for
example, the detector outputs are, to a good approximation, independent Poisson
random variables when conditioned on a fixed object. Computing the likelihood
ratio then comes down to computing two large-dimensional integrals, one for the
numerator and one for the denominator, where part of the integrand must be
estimated from an ensemble of objects. At this time, combining Markov chain
Monte Carlo methods with multivariate density estimation seems to be the most
promising approach for the computation of the likelihood ratio in this
situation.
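A bare-bones sketch of the plain Monte Carlo version of this computation follows; the full approach described above combines Markov chain Monte Carlo with density estimation, and the function names, object sampler, and forward model here are hypothetical stand-ins:

```python
import math

def poisson_loglike(g, mean):
    """Log-likelihood of a vector of counts g under independent
    Poisson random variables with the given mean vector."""
    return sum(gm * math.log(mm) - mm - math.lgamma(gm + 1)
               for gm, mm in zip(g, mean))

def mc_marginal_likelihood(g, sample_object, forward, n_samples=1000):
    """Monte Carlo estimate of the marginal data density: average the
    conditional Poisson likelihood of the data g over random objects f
    drawn from the ensemble. The likelihood ratio is the ratio of two
    such estimates, one per hypothesis."""
    total = 0.0
    for _ in range(n_samples):
        f = sample_object()                 # draw an object from the ensemble
        total += math.exp(poisson_loglike(g, forward(f)))
    return total / n_samples

# Degenerate check: a one-object "ensemble" reduces to the plain likelihood.
p = mc_marginal_likelihood([3, 1], lambda: None,
                           lambda f: [2.0, 1.5], n_samples=10)
```

With a genuinely random object ensemble, each hypothesis requires its own high-dimensional average of this kind, which is why the computation is so demanding in practice.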
In summary, task-based assessment of image quality offers an objective means for
comparing medical, and other, imaging systems. The figures of merit we have
discussed are all relevant to the actual tasks for which the systems are
designed. For hardware comparisons on detection tasks we may use the performance
of the ideal observer on 2AFC tests as the criterion. For software or total
system comparisons we may use the performance of model human observers on 2AFC
tests. We have not discussed estimation tasks in detail here, but similar
considerations apply. If we have a reliable model for the object statistics,
then computing the average performance of an estimator, as measured by the mean
squared error (MSE), for example, is a reasonable approach. Unfortunately, the
MSE is usually computed for a fixed object, which raises questions of
estimability that compromise its significance. We will leave the discussion of
this issue for another time.
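The ensemble-averaged MSE for an estimation task can be sketched as follows; the key point is that the average is taken over both the object ensemble and the measurement noise, rather than for a single fixed object, and the function names and toy estimator are illustrative:

```python
def ensemble_mse(estimator, sample_object, simulate_data, n_trials=2000):
    """Mean squared error of an estimator, averaged over both the
    ensemble of objects (true parameter values) and the measurement
    noise, rather than computed for one fixed object."""
    total = 0.0
    for _ in range(n_trials):
        theta = sample_object()      # draw a true parameter value
        g = simulate_data(theta)     # simulate a (noisy) measurement
        err = estimator(g) - theta
        total += err * err
    return total / n_trials

# Sanity check: a perfect estimator on noiseless data has zero MSE.
mse = ensemble_mse(lambda g: g, lambda: 2.0, lambda t: t, n_trials=10)
```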
- ONR Math and Computer Science Research Interest in Areas Related to Imaging
Science
Wen Masters, ONR Program Manager
ONR's math and computer sciences programs sponsor basic research to develop
rigorous mathematical underpinnings for anticipated
future Navy applications. Related to imaging science, we are interested in
the following areas of research:
- Imaging as an inverse problem
- Object location in 3D inhomogeneous media
- Object classification from scattering data
- Object detection, including new waveforms, new geometry, etc.
- Image processing and analysis
- Data representation
- Fundamental framework for image understanding
- Image enhancement, segmentation, feature extraction, target recognition, registration, compression, etc.
- Steganography and detection of steganography
- Object similarity measure, integration of shape, function, and other factors
- Computational theory for segmentation and perceptual grouping
- Reconstruction and visualization
For contact information and a top-level program description,
please see the ONR web site
http://www.onr.navy.mil/sci_tech/information/311_math/default.htm