Full Paper |
1 Department of Surgery, 2 Department of Clinical Epidemiology & Medical Technology Assessment, 3 Department of Radiology, Maastricht University Hospital, P. Debyelaan 25, NL-6229 HX Maastricht and 4 Department of Radiology, St. Maartens Gasthuis, PO Box 1926, NL-5900 BX Venlo, The Netherlands
Correspondence: A M Bosch, Maastricht University Hospital, Dept. of Surgery, PO Box 5800, NL-6202 AZ Maastricht, The Netherlands
| Abstract |
|---|
| Introduction |
|---|
The aim of this study was to assess the interexamination variation of the complete US procedure.
| Patients and methods |
|---|
Between March and August 2000, patients were selected from those referred for mammography to the Department of Radiology of the University Hospital Maastricht. After giving informed consent, the patients consecutively underwent a standard physical examination of the breasts, mammography and three US examinations of the breasts by three different sonographers.
Selection was made to increase the number of abnormalities at breast US and to obtain a more even distribution of the different US results compared with the normal population. Patients were selected when they were referred by a surgeon, or when the request form for mammography stated the presence of a palpable lump or local pain, or an increased risk of breast cancer.
A resident with special interest in sonology performed the physical examination in the Department of Radiology. The presence of a mass, as well as its location, consistency, size and adherence to the surrounding tissues, was recorded. A final probability of malignancy was assigned on a 5-point scale: no abnormalities; benign finding; probably benign finding; malignancy suspected; and malignant finding.
Mammography was performed using craniocaudal and mediolateral oblique projections (Bennet Contour Plus, Oldelft-Benelux, Delft, The Netherlands and Kodak Min-R film screen combination). When indicated, coned-down views, magnification views or views in a third direction were added. Mammography was interpreted by a radiologist with special interest and experience in breast imaging. The mammography images were scored for the density of the breast tissue (4-point scale as described in the BI-RADS reporting system [12]), the presence of masses, the location, size, type of the masses, and presence of microcalcifications. On the basis of this evaluation the probability of malignancy was scored on a 5-point scale (based on the BI-RADS lexicon for mammography) [12]: (1) no abnormalities; (2) benign finding; (3) probably benign finding; (4) malignancy suspected; and (5) malignant finding.
US examination was carried out using an ATL ultrasound scanner (HDI 5000, ATL, Bothell, Washington, USA) and a 12–5 MHz linear array transducer. There was no time limit for performing the whole-breast US examination. The three sonographers were two experienced radiologists and one resident. The radiologists had 3 years and over 5 years of experience, respectively. The radiologist with over 5 years of experience also participated in the National Breast Cancer Screening Program. The resident had carried out 500 breast US examinations prior to this study. Each sonographer was informed of the results of the physical examination and had access to the mammogram report. The US examinations were performed in random order by the three sonographers, who were blinded to the US results of their colleagues. Each sonographer noted the presence of any lesion, as well as its location, size, margins, posterior echoes and echogenicity. The US diagnosis was recorded, and the final probability of malignancy was scored on a 5-point scale (based on the BI-RADS score under development for ultrasound [13]): no abnormalities; benign finding; probably benign finding; malignancy suspected; and malignant finding.
In order to determine whether US adds consistency to mammography and physical examination, the three sonographers independently interpreted the mammograms with knowledge of the physical information about the breasts. This assessment was made 3 months after the initial diagnostic procedure. The same items as described above were scored.
Finally, we studied the interexamination agreement in subgroups of patients based on: (1) the presence or absence of a palpable mass; (2) the presence or absence of a lesion on mammography; (3) the density of the breast on mammography (75% dense breasts compared with less than 50% dense breasts); and (4) the presence or absence of an accepted indication for breast ultrasound. Breast ultrasound was considered indicated when there was a palpable lesion, and/or a mammographic lesion with a BI-RADS score of 3 or higher, and/or inconclusive mammography because of high-density breasts.
The final diagnosis for each breast was established by histology and follow-up for 12 months. Pathology results were retrieved from the hospital department of pathology and the Dutch Network and National Database for Pathology (PALGA). As all national hospital pathology departments are linked to this database, complete coverage of the study population was assured, including patients who were diagnosed elsewhere. The final diagnoses were divided into: (1) no abnormalities; (2) benign cystic findings; (3) benign solid lesions; and (4) malignant findings.
Statistics
To measure the extent of agreement between the examinations, linearly weighted kappa values were calculated. The kappa statistic measures the proportion of decisions in which observers agree while accounting for the possibility of agreement based on chance. Perfect agreement results in a kappa value of 1.0, and a kappa value of 0 indicates the level of agreement expected based on chance alone. Landis [14] classified kappa values of 0.20 or less as slight agreement, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1.00 as almost perfect agreement between observers. Other researchers consider kappa values of 0.50 or less as poor and values of 0.75 or more as excellent reproducibility [15]. Differences in kappas between the mammographic and US results were tested using the jack-knife method [16].
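As an illustration of the statistic described above, the following is a minimal sketch of a linearly weighted kappa for two observers scoring the same breasts on the 5-point scale, together with the verbal labels attributed to Landis [14]. The function names, the weight formula 1 − |i − j|/(k − 1) and the example scores are assumptions made for illustration; this is not the software or the data used in the study.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, n_categories=5):
    """Linearly weighted kappa for two raters using categories 1..n_categories.

    Illustrative sketch only: weights are w_ij = 1 - |i - j| / (k - 1),
    so full credit is given for exact agreement and partial credit for
    near-misses on the ordinal scale.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(ratings_a)
    k = n_categories
    joint = Counter(zip(ratings_a, ratings_b))  # observed pairs of scores
    row = Counter(ratings_a)                    # marginal counts, rater A
    col = Counter(ratings_b)                    # marginal counts, rater B

    observed = 0.0  # weighted observed agreement
    expected = 0.0  # weighted agreement expected by chance
    for i in range(1, k + 1):
        for j in range(1, k + 1):
            weight = 1.0 - abs(i - j) / (k - 1)
            observed += weight * joint[(i, j)] / n
            expected += weight * (row[i] / n) * (col[j] / n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def landis_label(kappa):
    """Verbal label for a kappa value, following the bands quoted in the text."""
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical scores from two sonographers (not study data).
sonographer_1 = [1, 1, 2, 3, 4, 5, 2, 1]
sonographer_2 = [1, 2, 2, 3, 5, 5, 2, 1]
kappa = linear_weighted_kappa(sonographer_1, sonographer_2)
print(round(kappa, 2), landis_label(kappa))
```

Linear weights treat a disagreement of one category (for example, benign versus probably benign) as less serious than a disagreement of several categories, which suits an ordinal probability-of-malignancy scale.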
Diagnostic performance was evaluated using the five levels of suspicion categories in receiver operating characteristic (ROC) analysis. ROC analysis was carried out for the combined result of the physical examination, mammography and US of each sonographer. The area under the ROC curves (AUC-ROC) was used as a measure of diagnostic performance. The differences between the areas under two ROC-curves were compared, taking into account that both curves were derived from the same cases [17].
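The area under an ROC curve built from five ordered suspicion categories can be sketched as below. The sketch assumes the empirical (trapezoidal) area, which equals the probability that a randomly chosen malignant breast receives a higher score than a randomly chosen non-malignant one, with ties counted as one half. The function name and the example data are hypothetical, and the comparison of two correlated curves cited as [17] is not reproduced here.

```python
def auc_from_ratings(ratings, is_malignant):
    """Empirical area under the ROC curve for ordinal suspicion scores.

    Illustrative sketch only: counts, over all malignant/non-malignant
    pairs, how often the malignant case has the higher score
    (ties contribute 0.5), then divides by the number of pairs.
    """
    positives = [r for r, m in zip(ratings, is_malignant) if m]
    negatives = [r for r, m in zip(ratings, is_malignant) if not m]
    if not positives or not negatives:
        raise ValueError("need at least one malignant and one non-malignant case")

    concordant = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                concordant += 1.0
            elif pos_score == neg_score:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical 5-point scores and final diagnoses (not study data).
scores = [4, 2, 3, 1]
malignant = [True, True, False, False]
print(auc_from_ratings(scores, malignant))
```

An area of 1.0 corresponds to scores that separate malignant from non-malignant breasts perfectly, and 0.5 to scores that carry no diagnostic information.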
A p-value of ≤0.05 was considered statistically significant.
| Results |
|---|
Sixty-eight breasts (60%) contained one or more lesions, of which 11 were malignant. The mean radiological size of the lesions was 14 mm and the mean histological size was 15 mm. Normal or benign findings were confirmed by histology in 13 cases, by repeated mammography in 27 cases and by a follow-up of 12 months in 62 cases.
The interexamination agreement (kappa value) between the three sonographers in diagnosing the probability of malignancy for the US examination of the breasts ranged from 0.72 to 0.75 (all three with a standard deviation (sd) of 0.04). The interobserver agreement for reading the mammography images of the same patients by the same observers ranged from 0.63 to 0.65 (sd=0.06) (Table 1). The total consistency (US with physical and mammographic information) increased compared with physical and mammographic information only. These differences were statistically significant for sonographers 2 and 3 (p=0.008).
For the subgroups based on clinical information the mean kappas of the three sonographers were calculated (Table 2). There was a significant difference in kappa value for the subgroups based on the density of the breast on mammography.
| Discussion |
|---|
Studies examining the total interexamination variability of whole-breast US examination, that is, the US scanning procedure combined with the interpretation of the images, have to our knowledge not been reported in the literature. For consistency in reading breast US images alone, reported kappa values range from 0.32 to 0.62 [6, 8, 10, 11]. Table 3 compares some features of these studies. Those studies determined interobserver agreement retrospectively, from images with known lesions and a high cancer prevalence. These images had been obtained previously by a sonographer who was not the interpreting observer. The observers retrospectively determined the characteristics of the lesions and the probability of malignancy. In our study, we not only determined the lesion characteristics, but also examined whether or not a lesion in a breast was detected. Two possible sources of disagreement, obtaining images and interpreting images, were therefore included in our study. Despite the greater chance of disagreement introduced by the extra source, we obtained substantial agreement. An explanation might be the proportion of normal breasts (40%) in our study compared with other studies (0%). Excluding the normal breasts did not affect the mammographic interexamination agreement and slightly decreased the sonographic interexamination agreement.
Despite the differences in experience, the kappa values did not differ significantly between the three sonographers. The resident sonographer seems to have reached the required plateau of breast US experience after 500 US examinations.
Except for the 24 breasts for which we had a pathological diagnosis and the 27 cases that underwent radiological follow-up examination, the diagnosis after a clinical follow-up period of 12 months was taken as the reference test (n=62). No false-negatives were found in this group, but an observation period of 12 months is short. However, our main goal was to study the interexamination agreement and not the diagnostic accuracy.
Table 2, which lists the kappa values for the subgroups, shows no significant differences within the subgroups, except for the dense/non-dense breast tissue group. Non-dense breasts showed high interobserver agreement on the mammography findings and therefore, after US, an increased interexamination agreement. Dense breasts often yield inconclusive mammograms. US might be expected to increase the sensitivity and diagnostic accuracy of the radiological imaging [2, 3]. Our results showed a substantially lower agreement after US in the group of patients with dense breasts on mammography.
| Conclusion |
|---|
| Acknowledgments |
|---|
| Footnotes |
|---|
Received for publication August 29, 2002. Revision received January 2, 2003. Accepted for publication February 13, 2003.
| References |
|---|