Full paper |
1 Klinikum Benjamin Franklin, Freie Universität Berlin, Department of Radiology and Nuclear Medicine, Hindenburgdamm 30, 12200 Berlin and 2 Institute of Medical Physics (IMP), University of Erlangen, Krankenhausstr. 12, 91054 Erlangen, Germany
| Abstract |
|---|
|
|
|---|
Kappa (κ) scores used to report diagnostic agreement were calculated for tertiles and "equivalent T-scores". The tertiles divide the cohort on both scanners into the same number of subjects above and below a given T-score. Diagnostic agreement using tertiles was poor to moderate (κ≤0.51). Diagnostic agreement using equivalent T-scores was, again, poor to moderate for BUA but fair to good for SOS and stiffness/QUI (0.59≤κ≤0.73). We conclude that diagnostic agreement between the two devices is at best comparable to the agreement of a dual X-ray absorptiometry measurement using the same densitometer at two different skeletal sites. It is therefore insufficient to compare directly two measurements of an individual patient on both ultrasound devices. Standardization of quantitative ultrasound is very much needed.
| Introduction |
|---|
|
|
|---|
Two prospective studies using water-based calcaneal quantitative ultrasound (QUS) devices [1, 2] showed that broadband ultrasound attenuation (BUA) and speed of sound (SOS) can be used to predict future hip fracture risk independently of hip bone mineral density (BMD). However, there is considerable diagnostic disagreement when comparing BUA or SOS values from different devices, even at the same skeletal site [3]. As stated by the International Quantitative Ultrasound Consensus Group [4], studies comparing the performance of different QUS devices in the assessment of skeletal status are needed. In this cross-sectional study we investigated the diagnostic agreement of two different ultrasound systems in determining BUA and SOS at the calcaneus. Diagnostic agreement was determined with regard to the same diagnosis for the same patients irrespective of diagnostic route.
| Materials and methods |
|---|
|
|
|---|
Ultrasound scans
At both study centres a waterless QUS system (Sahara bone sonometer; Hologic Inc., Bedford, MA) and a water based system (Achilles+; Lunar, Madison, WI) were used. Both systems measure BUA and SOS. In addition the Achilles+ calculates a stiffness index and the Sahara a quantitative ultrasound index (QUI), which are parameters derived from linear combinations of BUA and SOS. Stiffness is not equivalent to biomechanical stiffness.
In both study locations, in vivo short-term precision was assessed in young healthy volunteers (Erlangen: right foot; Berlin: dominant foot, which was mainly the right foot). In Berlin, five volunteers were each measured five times consecutively, and in Erlangen, 10 volunteers were measured six times each. After each scan the foot was repositioned. Precision was calculated in two ways: (1) by taking only the results of the first two measurements into account; and (2) by taking all five or six measurements, respectively, into account. The first calculation method was included to avoid a potential deterioration of precision owing to temperature changes in the foot during the five or six repetitions.
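The two ways of calculating precision can be illustrated with a short sketch. The text does not state which precision formula was used, so the root-mean-square coefficient of variation below is an assumption, and the function name `rms_cv_percent` and the numbers are purely illustrative.

```python
import numpy as np

def rms_cv_percent(measurements):
    """Root-mean-square CV (%) over subjects.

    `measurements` has shape (subjects, repeats). The RMS-CV formula is an
    assumed, commonly used choice; the paper does not specify its formula.
    """
    m = np.asarray(measurements, dtype=float)
    cv = m.std(axis=1, ddof=1) / m.mean(axis=1)  # per-subject CV
    return float(np.sqrt(np.mean(cv ** 2)) * 100.0)

# Hypothetical BUA readings: 2 volunteers, 5 repositioned scans each
scans = np.array([
    [110.0, 112.0, 109.0, 113.0, 108.0],
    [ 98.0,  97.0, 100.0, 101.0,  96.0],
])
cv_first_two = rms_cv_percent(scans[:, :2])  # method (1): first two scans only
cv_all = rms_cv_percent(scans)               # method (2): all repeated scans
```

Comparing `cv_first_two` with `cv_all` mirrors the comparison made in the text between the first two and all five or six measurements.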
Daily quality assurance tests of both ultrasound systems, as recommended by the manufacturers, were performed to monitor system stability at both centres. At each location both devices remained in the same room throughout all measurements. For the cross-sectional study, one foot per volunteer was measured on both ultrasound devices. For each volunteer, measurements on both devices were carried out on the same day.
Shoe and calf size
Foot size was assumed to be equivalent to shoe size and calf size was determined as the 50% value of the length measured between the medial malleolus and the protuberantia tibiae of the lower leg. Calf and foot size were measured in Berlin only.
| Statistics |
|---|
|
|
|---|
|
|
Both response rates must be given in the same units, typically the per cent loss per year is used. If the CV is given as a percentage then the CVS is also given as a percentage.
CVS is a relative quantity because it depends on the choice of the reference method. So far there is no consensus on which method should be used. In this contribution we are primarily interested in comparing the two ultrasound devices; we therefore simply assume that the response rate of the reference method is 1% per year. The response rates of the Achilles+ and Sahara devices were calculated from the linear regressions y = (a × age) + b according to: response rate = a / y(age = 30) × 100, where a and b are given in Figure 1.
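The response-rate calculation can be sketched as follows. The function `response_rate` follows the formula given in the text; the scaling in `standardized_cv` is an assumed form (CV multiplied by the ratio of the reference response rate to the method's response rate), because the defining equation for CVS is not reproduced here. The regression coefficients in the example are invented for illustration and are not the values from Figure 1.

```python
def response_rate(a, b, ref_age=30.0):
    """Per cent change per year, relative to the fitted value at ref_age.

    Implements: response rate = a / y(age = ref_age) * 100,
    with y = a * age + b.
    """
    y_ref = a * ref_age + b
    return a / y_ref * 100.0

def standardized_cv(cv_percent, rate_method, rate_reference=1.0):
    """ASSUMED form of CVS: CV scaled by the ratio of response rates.

    Both rates must be in the same units (here % per year); the default
    reference rate of 1% per year follows the assumption in the text.
    """
    return cv_percent * abs(rate_reference) / abs(rate_method)

# Hypothetical regression y = -0.5 * age + 125 (not the paper's coefficients)
rate = response_rate(a=-0.5, b=125.0)   # about -0.45 % per year
cvs = standardized_cv(2.0, rate)        # a 2% CV rescaled by that rate
```

Under this assumed form, a parameter that changes slowly with age receives a larger CVS for the same raw CV, which is the effect the standardization is meant to capture.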
|
The dependence of the BUA and SOS results on body height, body weight, calf size and foot size was analyzed by multiple regression and stepwise multivariate regression analysis. For analysis of diagnostic agreement between the two QUS devices, we used Kappa (κ) scores applied to two different scenarios: tertiles and T-scores. In both scenarios, diagnostic agreement was calculated separately for SOS, BUA and stiffness/QUI. We used T-scores defined relative to the young normal (YN) population of our study (age range 25–35; n=58):
$$T = \frac{x - \mathrm{mean}_{\mathrm{YN}}}{\mathrm{SD}_{\mathrm{YN}}}$$
where SD denotes the standard deviation.
To use the same absolute T-value to compare the two ultrasound devices, the age-related change, the YN mean and SD_YN would have to be identical. As this is not the case, "equivalent T-scores" were calculated. A T-value was first selected for the Achilles+, and the equivalent T-value for the Sahara bone sonometer was then calculated under the condition that it separates the total group into the same number of patients above and below it as the corresponding T-value does for the Achilles+. Equivalent T-values were calculated separately for SOS, BUA and stiffness.
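The equivalent T-value condition described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the choice of the midpoint between neighbouring sorted values, and the handling of edge cases are all assumptions, and the example values are invented.

```python
import numpy as np

def t_scores(values, young_normal):
    """T-score relative to the young normal (YN) mean and SD."""
    yn = np.asarray(young_normal, dtype=float)
    return (np.asarray(values, dtype=float) - yn.mean()) / yn.std(ddof=1)

def equivalent_t(t_device_a, t_device_b, t_cut_a):
    """Cut-off on device B leaving as many subjects below it as
    `t_cut_a` leaves below on device A (same cohort on both devices).

    Assumption: the cut-off is placed midway between two neighbouring
    sorted values; ties at that boundary would break the exact match.
    """
    t_a = np.asarray(t_device_a, dtype=float)
    t_b = np.sort(np.asarray(t_device_b, dtype=float))
    n_below = int(np.sum(t_a < t_cut_a))
    if n_below == 0:
        return float(t_b[0])          # nobody lies strictly below the minimum
    if n_below >= t_b.size:
        return float("inf")           # everybody lies below
    return float((t_b[n_below - 1] + t_b[n_below]) / 2.0)

# Hypothetical T-scores of five subjects on the two devices
t_a = [-2.0, -1.5, -0.5, 0.0, 1.0]
t_b = [-3.0, -0.2, -1.0, 0.5, -2.2]
t_eq = equivalent_t(t_a, t_b, t_cut_a=-1.0)
```

Note that the condition only matches group sizes; it does not guarantee that the same individuals fall below the cut-off on both devices, which is what the κ scores then quantify.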
| Results |
|---|
|
|
|---|
The measured short-term in vivo precision values are shown in Table 1. Overall they varied from 1% to 4% for BUA and from 0.3% to 0.8% for SOS. Precision for some parameters differed between Erlangen and Berlin. In particular, the precision error for Achilles+ BUA was twice as high in Erlangen as in Berlin, whereas the opposite result was observed for the Sahara SOS. Results are shown separately for the first two of the five or six consecutive measurements. For the Achilles+, in vivo precision was comparable for BUA but higher for SOS and stiffness when using only the first two measurements. For the Sahara, the difference between the first two measurements and the total five or six measurements was less obvious. For a comparison of the two devices, the CVS values should be used as they also consider rates of change of a given parameter. This influences SOS in particular.
|
|
|
|
κ scores evaluating diagnostic agreement of both devices with respect to BUA, SOS and stiffness/QUI are shown in Tables 4 and 5. In Table 4, κ scores for tertiles of the measured data are given. Results for the equivalent T-scores are displayed in Table 5. A κ score of 0.81–1.0 indicates nearly total diagnostic agreement, 0.61–0.8 indicates strong, 0.41–0.6 moderate and 0.1–0.4 low agreement. A κ score <0.1 indicates no diagnostic agreement.
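As an illustration of how such agreement scores can be obtained, the sketch below assigns tertile labels per device and computes an unweighted Cohen's kappa. The text only speaks of "Kappa scores", so the unweighted Cohen form is an assumption; the function names and example labels are hypothetical, and `interpret` simply encodes the agreement bands listed above.

```python
import numpy as np

def tertile_labels(values):
    """Label each subject 0, 1 or 2 by tertile of the measured values."""
    v = np.asarray(values, dtype=float)
    cuts = np.quantile(v, [1 / 3, 2 / 3])
    return np.digitize(v, cuts)

def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa (assumed to be the kappa meant in the text)."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    p_obs = np.mean(a == b)                      # observed agreement
    p_exp = sum(np.mean(a == c) * np.mean(b == c)
                for c in np.union1d(a, b))       # chance agreement
    if p_exp == 1.0:
        return 1.0                               # both raters use one category
    return float((p_obs - p_exp) / (1.0 - p_exp))

def interpret(kappa):
    """Agreement bands as listed in the text."""
    if kappa > 0.8:
        return "nearly total"
    if kappa > 0.6:
        return "strong"
    if kappa > 0.4:
        return "moderate"
    if kappa >= 0.1:
        return "low"
    return "none"

# Hypothetical tertile labels for six subjects on the two devices
device_a = [0, 0, 1, 1, 2, 2]
device_b = [0, 1, 1, 1, 2, 0]
k = cohen_kappa(device_a, device_b)   # 0.5 for these labels
```

With real data, `tertile_labels` would be applied separately to the BUA, SOS and stiffness/QUI values of each device before calling `cohen_kappa`.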
|
|
| Discussion |
|---|
|
|
|---|
The interactions of a sound wave with real bone, with its cortical and trabecular components, and with the additional soft tissue components are not yet fully understood. The existing theoretical models cannot incorporate the complexity encountered in vivo. However, there is plenty of empirical evidence from in vitro studies that BUA and SOS are associated with bone structure and material properties [6–9], and consequently with bone strength and failure load [10, 11]. Therefore QUS in conjunction with bone density measurements might provide a more comprehensive way to assess skeletal status and fracture risk, but this remains to be demonstrated in vivo. In the two prospective studies mentioned above, a combination of QUS and BMD measurements did not improve the hip fracture prediction achieved by a BMD measurement alone.
Currently, a variety of QUS devices is commercially available using different technologies and measuring at different skeletal sites. From the clinical point of view it is highly relevant whether these different scanners and the parameters they measure are comparable in their diagnostic power in the assessment of osteoporosis. In this investigation, data from two different centres were pooled to compare age-related changes and diagnostic agreement of the osteoporotic status using the Lunar Achilles+ and the Hologic Sahara bone sonometer. Although the study protocols used in Berlin and Erlangen differed slightly with respect to recruitment, determination of short-term precision and the additional measurement of foot and calf size carried out in Berlin only, we showed that there were no significant differences between the two populations and that pooling of the data was appropriate.
Short-term precision in younger volunteers varied between 0.3% and 4.0%. Data are summarized in Table 1 and are comparable to previously published results [12, 13]. Precision data may differ in elderly patients but these were not available for repeated measurement. As expected, larger differences in precision between the first two measurements and all five or six consecutive measurements were observed for the water-based Achilles+ system. The constant water bath apparently causes more rapid temperature changes within the foot than the gel-based system. Moreover, the difference in precision data between the two centres can be mainly explained by the variance of the data obtained in the individuals selected for precision measurements. There was no difference between the performance of the devices at the two centres.
The coefficients of determination (r²) between the two devices were 0.47 for BUA, 0.78 for SOS and 0.69 for stiffness and QUI. Grampp et al [14] reported similar values for a comparison between the Lunar Achilles and the Walker Sonics ultrasound devices. In comparison, DXA measurements at the same skeletal site using equipment from different manufacturers typically result in r² values greater than 0.9 at the spine and the femur [15, 16].
For analysis of diagnostic agreement, the population was divided by equivalent T-values and tertiles. We are well aware that the World Health Organization definition of osteoporosis as a T-score of below -2.5 is not valid for QUS measurements. Moreover, the same absolute T-scores may result in very different discriminations of two devices, in particular if they use different measuring technologies. However, it is still desirable that a measurement of a patient on different devices should give comparable results. Therefore we used equivalent T-values.
By calculating equivalent Sahara T-values we ensured that the separation of our total population based on a given Achilles+ T-value was identical to the Sahara in terms of numbers above and below the selected value.
Obviously this approach can be criticised: (1) the equivalent T-value is specific to our population; (2) it is further specific to a given parameter; (3) the criterion used to derive the equivalent T-value, which is based on identical relative numbers of patients in the groups above and below the selected T-value for both scanners, is somewhat arbitrary and may artificially optimize diagnostic agreement; and (4) the function of equivalent Sahara T-value vs the Achilles+ T-value may not be continuous. Therefore we do not propose equivalent T-values as a fundamentally new concept; rather, they serve as an ad hoc approach to indicate what diagnostic agreement can be expected by using comparable T-values.
An additional difficulty inherent in the T-score concept is the dependence on the population variance of a YN population. In this case the population variance of the Sahara was comparable to the Achilles+ for SOS but considerably larger for BUA and QUI. Furthermore, T-scores should perhaps be calculated from a narrower young normal range, e.g. 25–30 years. For some parameters, Table 2 indicates a varying population variance in the two age decades between 20 years and 40 years. However, larger datasets are required to analyze the effect statistically.
As an alternative to T-scores we therefore used tertiles to calculate κ scores (see Table 4). We found a poor to moderate diagnostic agreement between the two different ultrasound devices (κ<0.51). With the equivalent T-value concept we found stronger agreements yielding κ scores between 0.59 and 0.73 for QUI/stiffness and SOS (see Table 5). Again for BUA, the diagnostic agreement was only moderate (κ<0.53), except for the smallest T-value of -0.5. However, at T-values between 0 and -1.0 the fracture risk is low and therefore the lower T-values are more clinically useful. Here diagnostic agreement is more important and κ scores between 0.6 and 0.7 were obtained.
Both the tertile and equivalent T-value techniques show a trend towards lower BUA agreement with higher diagnostic relevance, while the SOS agreement is more consistent. κ scores were, in general, lower with the tertile method. As discussed above, our equivalent T-value approach can overestimate agreement. Conversely, the tertile method, which divides the population into three groups, underestimates agreement compared with a method dividing the population into only two groups.
Similar κ scores are observed in X-ray based densitometry when comparing different techniques at the same skeletal site or identical techniques at different sites [14]. The κ scores for quantitative CT (QCT) measurements at the lumbar spine and measurements with the Lunar Achilles in the same study ranged from 0.03 (BUA calcaneus vs QCT lumbar spine) in osteoporotic patients to 0.41 (SOS calcaneus vs QCT lumbar spine) in osteopenic patients. Our results indicate that BUA parameters and, to a lesser degree, SOS and QUI/stiffness parameters of the two QUS devices are not identical quantities. In part this difference can be explained by the different technology. The Achilles+ always uses a fixed transducer–receiver distance, whereas in the Sahara the sound wave travels only the distance of the heel width.
The relatively moderate diagnostic agreement can also be influenced by the different age dependence of the measured parameters. As shown in Table 3, the per cent decrements per age group are very similar for the two instruments for SOS and QUI/stiffness, but varied more substantially for BUA, in particular for the older subjects showing low BUA values. Interestingly, diagnostic agreement for BUA was lowest for the lowest T-values.
Finally, the investigated regions of interest (ROIs) are different. For example, the ROI in the Achilles+ is located approximately 1 cm superior to the ROI measured by the Sahara. This may be particularly important for BUA, which seems more susceptible to repositioning errors. From in vitro studies it is known that BUA depends on trabecular orientation [7] and reflects aspects of bone structure such as trabecular separation and connectivity [6]. Therefore, different ROI locations probably characterize different parts of the anisotropic bone. Also, the correlation between the two ultrasound devices was lower for BUA than for SOS. Apparently BUA is more sensitive to small changes in structure and/or density than SOS. This may explain why there is a greater age-related change for BUA than for SOS (Figure 2a–d) and why BUA was superior to SOS in predicting fracture risk in the two prospective studies mentioned above [1, 2]. Further studies investigating structural inhomogeneities of the calcaneus and their impact on quantitative ultrasound measures should be carried out. For example, the areal distribution of ultrasound parameters can be measured with calcaneal imaging ultrasound devices [17].
From a clinical perspective, it is important to note that the κ scores reported in Table 5 indicate the potential for agreement. However, in practice lower values will be observed because a currently usable concept similar to our equivalent T-values has not yet been developed. Obviously, standardization as well as a better understanding of the observed differences is required. Using tertiles will result in poor diagnostic comparability but our results show that there is room for improvement. However, it remains questionable whether diagnostic agreement reported in DXA can be achieved in the ultrasound field. From a practical point of view, at present measurements should only be compared if acquired on identical ultrasound devices.
Received for publication September 5, 2001. Revision received April 8, 2002. Accepted for publication May 8, 2002.
| References |
|---|
|
|
|---|