British Journal of Radiology (2003) 76, 885-890
© 2003 British Institute of Radiology
doi: 10.1259/bjr/57437508
Evaluation of the Commission of the European Communities quality criteria for the paediatric lateral spine
A C Offiah, BSc, MBBS, FRCR1,2 and
C M Hall, DMRD, FRCR2
1 Institute of Child Health, 30 Guilford Street, London WC1N 1EH and 2 Department of Radiology, Great Ormond Street Hospital for Children, London WC1N 3JH, UK
Abstract
The study aimed to evaluate the Commission of the European Communities (CEC) quality criteria for paediatric lateral spine radiographs, and to use these to assess and compare the quality of film-screen and digital images. 286 paediatric lateral spine radiographs (89 film-screen and 197 digital) were independently analysed by two observers according to the CEC criteria. Based on fulfilment of criteria, images were assigned two scores, an image criteria score and a visual grading analysis score. Sensitivity values (S) on digital radiographs were recorded and correlated with image quality. Variability between observers in the assignment of scores was lower for the image criteria than for the visual grading analysis technique. Analysis of variance was performed for fulfilment of criteria between techniques and, for digital images, against age and sensitivity values. Film-screen did significantly better (p<0.05) than digital imaging for Criterion 6 (visually sharp reproduction of the cortex and trabecular markings consistent with age), but significantly worse for Criterion 7 (reproduction of the adjacent soft tissues). There was a significant difference in mean S values for each age group when Criterion 6 was or was not met. Results show that although interpretation of the criteria differed between the two observers, the CEC criteria were able to detect differences in quality between film-screen and digital images. It is also possible to use them when optimizing target S values.
Introduction
Every radiology department, be it film-screen or digital, hard copy or filmless, optimizes and maintains the quality of the radiographs it produces. When we ask what is the quality of a given radiograph we are asking what degree of excellence that radiograph has attained. Unavoidably there is a subjective element to the assignment of image quality.
To standardize image quality throughout Europe, the Commission of the European Communities (CEC) published guidelines on quality criteria for diagnostic radiographs in adult [1] and paediatric practice [2]. These criteria were developed by a panel of expert European radiologists and are based on the visualization of certain anatomical structures. Studies have been performed to evaluate these criteria [3–7]. Most of these studies have been in the adult population. Some have involved members of the original panel. One paediatric study that did not found that modification of the criteria was required in order to meet the authors' purposes [7].
The quality of a radiograph may be influenced by a number of factors, not least the radiation dose incurred by the patient. Studies have been performed assessing the relationship between dose and image quality [8–11]. It is recognized that a degree of compromise is required. Some loss of quality is acceptable in order to limit radiation exposure. In the case of digital radiography the relationship between image quality and dose is further confounded. This is because of the lack of a direct correlation between film density and exposure. To overcome this, manufacturers have defined "exposure indices", and their relationship to plate exposure [12]. When (as in usual practice) the read mode of the system is set at "auto", the system reader optimizes the exposure index and latitude values. This produces radiographs of almost constant density regardless of plate exposure [13]. The latitude and more significantly the exposure index appear on both hard and soft copy images of the radiograph. This gives an idea of the radiation exposure to the patient.
Manufacturers suggest reference ranges for exposure indices for each examination. Fuji Film Co. Ltd, Japan, has called their exposure index "sensitivity" (S). Its relationship to plate exposure is given by the following equation:
A rise in S of 33% is equivalent to a 25% decrease in radiation dose.
A fall in S of 33% is equivalent to a 50% increase in radiation dose [12].
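The two statements above are what one would expect if S is inversely proportional to plate exposure. The following is a minimal sketch of that arithmetic, assuming S = k/E for some system constant k (the inverse proportionality and the function name `relative_dose_change` are assumptions made for illustration, not the manufacturer's published formula).

```python
def relative_dose_change(s_change: float) -> float:
    """Fractional change in plate exposure for a fractional change in S.

    Assumes S = k / E (inverse proportionality), so that
    E_new / E_old = S_old / S_new = 1 / (1 + s_change).
    """
    if s_change <= -1.0:
        raise ValueError("S cannot fall by 100% or more")
    return 1.0 / (1.0 + s_change) - 1.0


if __name__ == "__main__":
    rise = relative_dose_change(1 / 3)    # S rises by one third
    fall = relative_dose_change(-1 / 3)   # S falls by one third
    print(f"S up by a third:   dose change {rise:+.0%}")
    print(f"S down by a third: dose change {fall:+.0%}")
```

Under this assumption a rise in S of one third corresponds to a dose reduction of one quarter, and a fall in S of one third to a dose increase of one half, in line with the figures quoted from [12].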
The S range recommended by Fuji to the authors' Department for the paediatric lateral spine (entire or segmental) is 50–600. Given that patients may range from preterm to 16 years of age, such a wide range is not helpful for the individual case.
The purpose of this study was to (a) evaluate the applicability of the CEC criteria with reference to the paediatric lateral spine by applying them to digital and film-screen radiographs and (b) evaluate potential relationships between S and the CEC quality criteria.
Materials and methods
The study involved a retrospective analysis of 286 paediatric lateral spine radiographs.
Patients
125 patients from each of 4 years were randomly selected from a computer printout of over 1000 patients. All patients had a skeletal survey performed between January 1998 and December 2001. Of these, 286 lateral spine radiographs were available from the patients' film packets for inclusion in the study. Reasons for the unavailability of 214 radiographs included no lateral spine as part of survey (n=98), lateral spine missing from packet (n=16), film packet not located for various reasons (n=89), and exclusion of radiograph from study (n=11) because (a) patient greater than 16 years of age at time of examination (n=7) or (b) severe pathology (osteoporosis, osteosclerosis or scoliosis) in patient (n=4).
Mean age at time of the examination was 4 years (range <1 month to 15 years). Radiographs were subdivided into 3 groups based on patient's age as follows: Group 1=<1 year (n=100), Group 2=1–5 years (n=97), Group 3=6–15 years (n=89). Skeletal surveys were performed for the exclusion of a wide range of constitutional bone disorders as well as for suspected non-accidental injury (NAI).
32 post mortem radiographs were included in the study, with age distribution Group 1 (n=22), Group 2 (n=7) and Group 3 (n=3). Indication for all patients in Groups 1 and 2, and one patient in Group 3 (age=5 years) was for the exclusion of NAI with or without a history of sudden infant death. The indication for two patients in Group 3 (aged 9 and 10 years) was road traffic accident.
In a minority of patients (n=15) the indication for the survey was a rheumatological condition. The vast majority of rheumatology patients belonged to the group (n=98) in which a lateral spine was not performed as part of the survey.
Radiographs
The 4 years from which radiographs were selected were divided into two groups based on imaging modality: 1998 (FS=last year of film-screen in the authors' Department) and 1999–2001 (DR=first 3 years of digital radiography). Numbers of radiographs within the two groups were: FS (n=89; Age Group 1=22, Group 2=36, Group 3=31); DR (n=197; Age Group 1=78, Group 2=61, Group 3=58).
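The sample accounting described in the Patients and Radiographs subsections can be cross-checked with a few lines of arithmetic. The sketch below only restates counts given in the text; the variable and function names are illustrative.

```python
from typing import Dict

# Counts transcribed from the Patients and Radiographs subsections.
SELECTED = 4 * 125   # 125 patients from each of 4 years
INCLUDED = 286       # lateral spine radiographs analysed

UNAVAILABLE = {
    "no lateral spine in survey": 98,
    "lateral spine missing from packet": 16,
    "film packet not located": 89,
    "excluded from study": 11,
}
EXCLUSIONS = {"older than 16 years": 7, "severe pathology": 4}
BY_MODALITY = {
    "FS": {1: 22, 2: 36, 3: 31},   # film-screen, by age group
    "DR": {1: 78, 2: 61, 3: 58},   # digital, by age group
}
BY_AGE_GROUP = {1: 100, 2: 97, 3: 89}


def count_checks() -> Dict[str, bool]:
    """Return a named pass/fail result for each consistency check."""
    fs, dr = BY_MODALITY["FS"], BY_MODALITY["DR"]
    return {
        "selected minus unavailable equals included":
            SELECTED - sum(UNAVAILABLE.values()) == INCLUDED,
        "exclusion reasons sum to excluded":
            sum(EXCLUSIONS.values()) == UNAVAILABLE["excluded from study"],
        "modalities sum to included":
            sum(fs.values()) + sum(dr.values()) == INCLUDED,
        "age groups sum to included":
            sum(BY_AGE_GROUP.values()) == INCLUDED,
        "age groups match modality split":
            all(fs[g] + dr[g] == BY_AGE_GROUP[g] for g in BY_AGE_GROUP),
    }


if __name__ == "__main__":
    for name, ok in count_checks().items():
        print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

All five checks hold for the figures reported: 500 selected, 214 unavailable, 286 included, split as 89 film-screen and 197 digital radiographs.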
Film-screen images were obtained using film of medium speed (400), and digital images with a Fuji 5000R CR system. Imaging parameters are shown in Table 1. Images were obtained in one of two rooms, Room 1 (Siemens Optilix (Siemens, UK); nominal focal spot size fine/broad=0.6/1 mm, inherent tube filtration 1.5 mm Al, additional filtration 0.1 mm Cu) and Room 2 (Wolverson Comet (Wolverson, UK); nominal focal spot size fine/broad=0.6/1 mm, inherent tube filtration 1 mm Al, additional filtration 1.5 mm Al).
Image analysis
Two observers (a paediatric clinical radiology research fellow and a consultant of over 20 years' experience in paediatric radiology) assessed each image independently. Assessment of images was based on the CEC quality criteria for the paediatric lateral spine radiograph (Table 2, "criterion" column). Prior to the study, the observers discussed in detail their understanding of the criteria. A consensus opinion for the interpretation of each was then reached (Table 2, "comments on interpretation").
Images (within their film packets) were shuffled in an attempt to achieve some randomization in reading order between imaging modality (FS and DR) and age groups. Observer 1 read the radiographs in reverse order to Observer 2. This was to reduce any effect on image quality scores of a possible learning curve in the application of the criteria. Images were read under standardized conditions as recommended by the CEC guidelines. A Wardray viewing light box with a maximum film illuminance of 4000 cd m⁻² was used. The illumination colour was white. Illumination was restricted to the area of the radiograph using cardboard sheets. A magnifying glass of magnification factor ×3 was available. Overexposed areas on the image were viewed with an additional spotlight. Low levels of ambient light were achieved.
Each image was assigned two scores, an image criteria score (ICS) and a visual grading analysis score (VGAS). For the ICS, each image was assigned a score of 1 if a given criterion was fulfilled, and 0 if it was not. The ICS was the number of criteria fulfilled divided by the total number of criteria (7 for the lateral spine). For the VGAS, each image was compared with the reference image, and for a given criterion scores ranged from +2 (clearly better than) to -2 (clearly worse than). The VGAS was the sum of scores divided by the total number of criteria. See Almén et al [6].
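The two scores defined above can be written out as short functions. This is a minimal sketch of the definitions as stated in the text (7 criteria, ICS as the fraction fulfilled, VGAS as the mean of per-criterion grades from -2 to +2); the function names and the use of integer grades are illustrative choices, and the sketch is not taken from Almén et al [6].

```python
from typing import Sequence

N_CRITERIA = 7  # number of CEC criteria for the paediatric lateral spine


def image_criteria_score(fulfilled: Sequence[bool]) -> float:
    """ICS: number of criteria fulfilled divided by the number of criteria."""
    if len(fulfilled) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} criteria, got {len(fulfilled)}")
    return sum(1 for f in fulfilled if f) / N_CRITERIA


def visual_grading_analysis_score(grades: Sequence[int]) -> float:
    """VGAS: sum of per-criterion grades divided by the number of criteria.

    Each grade compares the image with the reference image and runs from
    -2 (clearly worse than) to +2 (clearly better than).
    """
    if len(grades) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} grades, got {len(grades)}")
    if any(g not in (-2, -1, 0, 1, 2) for g in grades):
        raise ValueError("each grade must be an integer from -2 to +2")
    return sum(grades) / N_CRITERIA


if __name__ == "__main__":
    # Hypothetical image: Criteria 6 and 7 not fulfilled.
    print(image_criteria_score([True, True, True, True, True, False, False]))
    # Hypothetical grades against the reference image.
    print(visual_grading_analysis_score([1, 0, 0, 1, -1, 0, 2]))
```

Because both scores are averages over the seven criteria, two images with the same score may have failed different criteria, which is why per-criterion results are also worth reporting.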
For the purposes of the VGAS, the reference image was a film-screen lateral spine radiograph of a 3 year old chosen at random from the original computer printout. Over-collimation causing the skin surface to be excluded was recorded. This occurred in 33 instances, distributed by technique as FS (n=24) and DR (n=9), and by age group as Group 1 (n=11), Group 2 (n=11) and Group 3 (n=11).
Observer 1 documented S for all digital images (n=197). In order to reduce observer bias, Observer 2 was unaware of this aspect of the study.
Statistical analysis
Statistical analysis was performed using SPSS 10.1 (SPSS, Chicago, IL) for Windows.
Interobserver variability was calculated for each criterion, ICS and VGAS using Cohen's kappa. Analysis of variance (ANOVA) was performed between Criteria 1 to 7 and imaging modality. ANOVA was also performed between S, age group and Criteria 6 and 7. All analyses were performed for both observers individually. When analysing Criterion 7, those cases (n=33) in which the skin surface was omitted due to over-collimation were excluded. The results of statistical analyses concerning S and fulfilment of Criteria 6 and 7 by Observer 2 were given more weight. This was because at the time of image assessment Observer 2 was not aware that S values were being recorded. Levels of significance were set at p<0.05.
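For readers unfamiliar with Cohen's kappa, the statistic compares the observed agreement between two observers with the agreement expected by chance from their marginal rating frequencies. The following is a minimal, unweighted sketch in plain Python; it is not the SPSS procedure used in the study, and the ratings in the usage example are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical fulfilment of one criterion (1 = fulfilled) on ten images.
    observer_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
    observer_2 = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
    print(round(cohens_kappa(observer_1, observer_2), 3))
```

In the hypothetical example the observers agree on 8 of 10 images, but kappa is lower than 0.8 because some of that agreement would be expected by chance alone.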
Results
Evaluation of the CEC criteria
The percentage of radiographs fulfilling individual criteria is shown in Figure 1. Note that this figure illustrates the mean values for both observers. Figures 2 and 3 demonstrate, respectively, the ICS and VGAS for Observers 1 and 2. There was no significant difference in the means for each observer. Two of the 286 radiographs (1%) in the study scored zero by at least one observer. The lumbar spine in one image with a score of zero was obscured by contrast in a child who had undergone a barium study in the preceding 24 h, highlighting the need to rationalize radiographic investigations. The other radiograph with a score of zero was associated with poor collimation and movement artefact. Neither image was of diagnostic quality.

Figure 1. Percentage of radiographs fulfilling Commission of the European Communities (CEC) criteria (mean of Observers 1 and 2). Criterion 1 was the criterion most frequently fulfilled: 74 out of 89 (83%) for film-screen (FS) and 189 out of 197 (96%) for digital images (DR). The least fulfilled criteria were Criterion 7, 49 out of 89 (55%) for film-screen, and Criterion 6, 120 out of 197 (61%) for digital radiographs.
Figure 2. Image criteria scores. There was no significant difference in the mean image criteria scores for Observers 1 and 2. SD, standard deviation.
Figure 3. Visual grading analysis scores. As regards visual grading analysis, the majority of images were equal to or slightly better than the reference image (scores 0–1). There was no significant difference in mean visual grading analysis scores between the observers. SD, standard deviation.
Table 3 illustrates that interobserver reliability was fair to moderate for the majority of criteria. Interobserver reliability tended to be better for the ICS than the VGAS.
Digital compared with filmscreen radiographs
Figures 1 and 4 compare film-screen with digital radiographs. For both observers there was a significant relationship between the fulfilment of Criteria 6 (visually sharp reproduction of the cortex and trabecular markings consistent with age) and 7 (reproduction of the adjacent soft tissues) on the one hand and imaging modality on the other. There were no significant relationships between fulfilment of Criteria 1–5 and imaging modality. Digital images scored better for Criterion 7 and worse for Criterion 6 than film-screen radiographs.

Figure 4. Visual grading analysis score (VGAS) (Observer) film-screen (FS) versus digital radiographs (DR). This figure depicts clearly how the Commission of the European Communities criteria may be used to detect differences in image quality based on imaging technique. Note particularly the differences in fulfilment of Criteria 6 and 7 between film-screen and digital radiographs.
Digital image quality and sensitivity values
13 out of 197 radiographs (6.6%) had an S value less than 50 (age Group 1 n=8, age Group 2 n=5) and 17 out of 197 radiographs (8.6%) had a value greater than 600 (age Group 1 n=1, age Group 2 n=2, age Group 3 n=14). Mean S values for each age group were significantly related to the fulfilment of Criterion 6 (visually sharp reproduction of the cortex and trabecular markings consistent with age). Although there was some overlap, for each age group the standard deviation of S was smaller when Criterion 6 was met compared with when it was not (Figure 5). Means, standard errors of the means, standard deviations and quartile values for both groups (Criterion 6 fulfilled and Criterion 6 not fulfilled) are shown in Table 4.

Figure 5. Sensitivity values by age group and fulfilment of Criterion 6. Selecting the 25th and 75th percentile S values (for digital images) for each age group when Criterion 6 was fulfilled (image criteria technique, Observer 2, who was blinded to this aspect of the study) allowed narrower target ranges to be set.
There was no significant relationship between S and fulfilment of Criterion 7 (reproduction of the adjacent soft tissues).
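To illustrate the kind of comparison reported in this subsection, the sketch below computes a one-way ANOVA F statistic for S values grouped by whether a criterion was fulfilled. It is a simplification under stated assumptions: the study's analysis was run in SPSS and also involved age group, the S values shown are hypothetical, and the function name `one_way_anova_f` is illustrative. Converting F to a p value requires the F distribution and is left out.

```python
from typing import Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic for a one-way ANOVA across two or more groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


if __name__ == "__main__":
    # Hypothetical S values for one age group.
    s_criterion_met = [95.0, 110.0, 120.0, 140.0]
    s_criterion_not_met = [180.0, 210.0, 260.0, 300.0]
    print(one_way_anova_f([s_criterion_met, s_criterion_not_met]))
```

A large F indicates that the difference between group means is large relative to the spread within groups. As the text notes for Criterion 6, a significant difference in means can coexist with substantial overlap between the two groups.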
Discussion
There is a subjective element to the assessment of image quality. The CEC has published guidelines [1, 2] aimed at standardizing image quality throughout Europe at acceptable radiation doses. Previous studies [3, 5, 7] have shown that over 90% of films fulfil the CEC criteria, and advise their stricter application. A strict approach was attempted in this study. The two observers involved reached a consensus regarding the interpretation of each criterion. This approach yielded a fulfilment rate for 6 or 7 criteria of only 50% for film-screen radiographs and 56% for digital radiographs. Despite this low fulfilment rate, 99% (284 out of 286) of radiographs were of diagnostic quality. These results highlight the fact that while image quality scores may be useful for audit purposes and optimization of radiographic parameters, when strictly applied they do not necessarily impact on patient diagnosis.
The mean image criteria and visual grading analysis scores masked differences between groups and between observers. Furthermore they did not indicate which particular criterion had not been fulfilled. Presently it is advisable to present results for individual criteria.
Despite discussion between the observers regarding interpretation of the criteria, overall interobserver reliability was moderate or better in only 6 out of 14 comparisons (Table 3). This suggests that there is considerable room for interpretation of these criteria. The different levels of experience of the two observers may have also contributed. A second reading of a proportion of films to evaluate intraobserver reliability might have helped to define the source of the low kappa scores.
There was a tendency towards higher interobserver reliability for the image criteria compared with the visual grading analysis technique. Subjectively however the latter was felt by both observers to be the easier to apply. Despite this, both scoring methods showed similar relationships to patient age, imaging modality and S. If a department uses the visual grading analysis technique, then it is advisable that the same reference image should be used in any future studies. This will allow direct comparison of results between studies. Clearly, different departments will use different reference images. It is therefore uncertain if visual grading analysis results between departments can be directly compared. For this reason, and for improved interobserver reliability, it is suggested that the image criteria technique is that of choice.
If the CEC criteria are to be used as a measure of clinical image quality then some modifications are required. Currently they do not allow for the presence of artefact as a reason for failing to fulfil a criterion. This may confound relationships between age, imaging modality etc. These relationships may also be masked by over-collimation, which would be a cause of failing to fulfil Criterion 7 (reproduction of the adjacent soft tissues) not related to exposure parameters or imaging modality. Such cases were eliminated from statistical analysis in this study. The presence of severe pathology in the patient may be another cause of failure to fulfil a criterion. Such patients were also excluded from this study (see "Patients" in the Materials and methods section). Finally, the guidelines do not state the number of vertebral bodies that should meet a given criterion. In this study the authors agreed that all vertebral levels had to meet each criterion (except for Criterion 1) in order to consider that criterion fulfilled. This may explain the relatively low fulfilment rate of all criteria demonstrated. It also explains the high incidence of films of diagnostic quality, as the majority of radiographs were performed for the diagnosis of constitutional bone disorders. These conditions can be diagnosed even if one or two vertebral bodies are obscured or exposure is less than adequate.
Cook et al [7] developed their own scoring system for the assessment and optimization of clinical image quality. The experience from this study also suggests that modification of the criteria is required when clinical quality is being assessed. However it should be noted that the CEC intend the criteria to be used for the optimization of radiographic technique and reduction of patient dose. In this regard they have previously been shown to be useful [13].
Compared with digital radiography, traditional film-screen radiography has better spatial resolution [14]. In this study this was reflected in the significantly greater number of film-screen radiographs fulfilling Criterion 6 (visually sharp reproduction of the cortex and trabecular markings consistent with age) compared with digital radiographs. Conversely, digital techniques have better contrast resolution than traditional film-screen techniques [14], as demonstrated by the significantly greater number of digital radiographs fulfilling Criterion 7 (reproduction of the adjacent soft tissues) compared with film-screen radiographs. It is relevant to note that the American College of Radiology (ACR) guideline for the limiting spatial resolution in the investigation of suspected non-accidental injury (NAI) is 10 line pairs per millimetre for all anatomical sites [15]. This degree of spatial resolution is not achievable by digital radiography [14]. For the diagnosis of constitutional bone disorders these differences are probably of no clinical significance. However, careful investigation is required to determine the full implications of the reduced spatial resolution of digital imaging in the diagnosis of NAI. It should be mentioned that digital systems might compensate for their reduced spatial resolution compared with film-screen systems by an improved detective quantum efficiency (DQE), which leads to a reduction in noise and improved contrast.
Given that it is a proxy measure of radiation dose [12], the potential relationship between S and digital image quality as assessed by the CEC criteria was evaluated. CEC Criteria 6 (visually sharp reproduction of the cortex and trabecular markings consistent with age) and 7 (reproduction of the adjacent soft tissues) are also related to radiation dose. The lack of a significant relationship between S and fulfilment of Criterion 7 at first glance appears surprising. Perhaps S is related to gradations of soft tissue visualization, which was masked by the use of a bright light for overexposed radiographs.
Fuji have suggested that the authors' department aim for S values within the range of 50–600 for the lateral spine radiograph over the entire paediatric age group. However, S is significantly related to patient age, as confirmed by this study. The results also indicate a significant relationship between mean S values and CEC Criterion 6 (visually sharp reproduction of the cortex and trabecular markings consistent with age) within individual age groups. However, despite the significant difference between mean S levels when Criterion 6 was or was not fulfilled, there was a large standard deviation with overlap between the two groups. This renders the S value, when taken in isolation, an insensitive measure of clinical image quality. However, selecting the 25th and 75th percentile values for each age group when Criterion 6 was fulfilled (see Table 4 and Figure 5) allowed the department to set narrower S ranges for each age group for the lateral paediatric spine as follows:
- <1 year: 70–153
- 1–5 years: 80–245
- 6–15 years: 142–348
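The approach of taking the interquartile range of S among images that fulfilled Criterion 6 can be sketched as follows. The S values in the example are hypothetical, the function name `target_s_range` is illustrative, and the quartile method (`"inclusive"`) is an assumption, since different statistical packages interpolate quartiles in slightly different ways.

```python
import statistics
from typing import Sequence, Tuple


def target_s_range(s_values: Sequence[float]) -> Tuple[float, float]:
    """Lower and upper quartile of S for images that fulfilled the criterion."""
    if len(s_values) < 2:
        raise ValueError("need at least two S values to estimate quartiles")
    # quantiles(n=4) returns the three cut points: Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(s_values, n=4, method="inclusive")
    return q1, q3


if __name__ == "__main__":
    # Hypothetical S values for one age group, Criterion 6 fulfilled.
    s_fulfilled = [62, 70, 85, 98, 110, 127, 153, 171, 240]
    low, high = target_s_range(s_fulfilled)
    print(f"target S range: {low:.0f} to {high:.0f}")
```

Applied separately to each age group, a calculation of this kind yields one target range per group, each narrower than the single 50–600 range suggested for all paediatric patients.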
There is a trade-off between image quality and radiation dose [8–11]. Lower S values for a given patient age and size imply higher radiation exposure. Radiation doses incurred by patients undergoing lateral spine radiographs in the authors' department have previously been found to be well within diagnostic reference levels. The indication for the radiograph also affects what is deemed an acceptable level of quality [16, 17]. A skeletal survey performed for NAI should of necessity be of the highest possible quality even at the risk of increased exposure [15]. For most indications, an upper limit for S (lower radiation dose) does not need to be strictly adhered to, unless a level of dose reduction is reached at which pathology becomes obscured by increased quantum mottle. The constraints of a retrospective study are such that the target S ranges set by the department are somewhat arbitrary. Prospective studies relating S values directly to radiation dose and quality criteria are required.
Received for publication January 24, 2003.
Revision received June 17, 2003.
Accepted for publication July 31, 2003.
References
1. Commission of the European Communities. European guidelines on quality criteria for diagnostic radiographic images, EUR 16260 EN. Brussels: CEC, 1996.
2. European Commission. European guidelines on quality criteria for diagnostic radiographic images in paediatrics, Report EUR 16261. Luxembourg: Office for Official Publications of the European Communities, 1996.
3. Maccia A, Ariche-Cohen M, Severo C, Nadeau X. The 1991 CEC trial on quality criteria for diagnostic radiographic images. Radiat Prot Dosim 1995;57:111–7.
4. McNeil EA, Peach DE, Temperton DH. Comparison of entrance surface doses and radiographic techniques in the West Midlands (UK) with the CEC criteria, specifically for lateral spine images. Radiat Prot Dosim 1995;57:437–40.
5. Vañó E, Guibelalde E, Morillo A, Alvarez-Pedrosa CS, Fernández JM. Evaluation of the European image quality criteria for chest examinations. Br J Radiol 1995;68:1349–55.
6. Almén A, Tingberg A, Mattsson S, et al. The influence of different technique factors on image quality of lumbar spine radiographs as evaluated by established CEC image criteria. Br J Radiol 2000;73:1192–9.
7. Cook JV, Kyriou JC, Pettet A, Fitzgerald MC, Shah K, Pablot SM. Key factors in the optimization of paediatric X-ray practice. Br J Radiol 2001;74:1032–40.
8. Vañó E, Oliete S, González L, Guibelalde E, Velasco A, Fernández JM. Image quality and dose in lumbar spine examinations: results of a 5 year quality control programme following the European quality criteria trial. Br J Radiol 1995;68:1332–5.
9. Jonsson A, Herrlin K, Jonsson K, Lundin B, Sanfridsson J, Pettersson H. Radiation dose reduction in computed skeletal radiography. Effect on image quality. Acta Radiol 1996;37:128–33.
10. Almén A, Loof M, Mattsson S. Examination technique, image quality and patient dose in paediatric radiology. A survey including 19 Swedish hospitals. Acta Radiol 1996;37:337–42.
11. Hufton AP, Doyle SM, Carty HM. Digital radiography in paediatrics: radiation dose considerations and magnitude of possible reduction. Br J Radiol 1998;71:186–99.
12. British Institute of Radiology. Assurance of quality in the diagnostic imaging department (2nd Edn). London: BIR, 2001.
13. Mooney R, Thomas PS. Dose reduction in a paediatric X-ray department following optimization of radiographic technique. Br J Radiol 1998;71:852–60.
14. Cowen AR, Workman A, Price JS. Physical aspects of photostimulable phosphor computed radiography. Br J Radiol 1993;66:332–45.
15. American College of Radiology. Standards for skeletal surveys in children. ACR (Res. 22) 1997.
16. Lams PM, Cocklin ML. Spatial resolution requirements for digital chest radiographs: an ROC study of observer performance in selected cases. Radiology 1996;158:11–9.
17. Murphey MD, Bramble JM, Cook LT, Martin NL, Dwyer SJ. Non-displaced fractures: spatial resolution requirements for detection with digital skeletal imaging. Radiology 1990;174:865–70.