• Following up HPV positive, cytology negative individuals with two tests leads to the best balance between sensitivity and specificity.
• Sending HPV positive, cytology negative individuals back to routine screening lowers sensitivity for CIN II+.
• All HPV screening algorithms with a cytology triage increased colposcopy volume relative to the cytology algorithm.
• The cytology algorithm had a low sensitivity compared to the HPV screening algorithms with a cytology triage.
Abstract
Objective
Primary HPV screening programmes for cervical cancer have been implemented in many European countries using a cytology triage. Nonetheless, the optimal cytology triage strategy for minimizing the harms and maximizing the benefits remains unclear. We identified key characteristics of different algorithms for HPV screening with cytology triage.
Methods
Using the Finnish randomized HPV screening trial data, we formulated five post-hoc algorithms for HPV screening with a cytology triage, one for HPV screening without a triage and one for cytology screening. Sensitivity, specificity, positive predictive value, colposcopy referral rate and cumulative sensitivity for CIN II+ lesions detected during the first and second screening rounds of the trial were calculated for all algorithms.
Results
In the first screening round, direct referral of HPV positives to colposcopy led to the highest sensitivity (94%) accompanied by the lowest specificity (93%). Following HPV positives up with one repeat screen showed 86% sensitivity and 97% specificity. The corresponding figures with two repeat screens were 84% and 98%. In HPV algorithms, where cytology negative HPV positive individuals had no follow-up, the sensitivities were 65–82% and the specificities 98–99%. The Cytology algorithm had a low sensitivity (69%) with a high specificity (99%). Compared to the first round, the second-round sensitivities were lower and specificities similar or higher.
Conclusions
The best balance between sensitivity and specificity was achieved by an HPV algorithm with two repeated follow-up tests. However, all HPV algorithms with cytology triage increased colposcopy volume more than the cytology algorithm and thus provoked overdiagnosis.
]. For decades, a cytological Papanicolaou (Pap) test was the primary screening test. The pursuit to develop a screening test based on human papillomavirus (HPV) detection started in the 1990s [
]. In Europe, seven countries have switched to primary HPV screening either fully or regionally, and the introduction of primary HPV screening is ongoing in several countries [
Compared to screening with the Pap test, the main advantage of HPV testing is its higher sensitivity for cervical intraepithelial neoplasia (CIN) grades II+ leading to further prevention of cervical cancer [
Efficacy of human papillomavirus testing for the detection of invasive cervical cancers and cervical intraepithelial neoplasia: a randomised controlled trial.
Human papillomavirus testing for the detection of high-grade cervical intraepithelial neoplasia and cancer: final results of the POBASCAM randomised controlled trial.
]. Primary HPV screening also has a higher negative predictive value (NPV) for CIN III+ than primary cytology screening and thus enables longer screening intervals [
]. Pre-cancer treatments can cause adverse effects for the examined individuals, including pain, bleeding, discharge and psychological distress, and an increased risk of adverse obstetric outcomes [
]. The increased colposcopy referral rates also increase the demand for health care resources.
The most effective way to minimize the harms of primary HPV screening is to optimize triage procedures for HPV positive individuals. The European Guidelines for Cervical Cancer Screening recommend a cytology triage for those who test HPV positive [
]. Nonetheless, the optimal triage strategy for minimizing the harms and maximizing the benefits remains unclear. A long-term randomized HPV screening implementation trial with various post-hoc formulated triage algorithms can, however, provide insights into their benefits and harms.
In this study, we compared five post-hoc formulated algorithms for primary HPV screening with a cytology triage. Specifically, we aimed to assess whether HPV positive individuals without cytological abnormalities should be followed up, referred to colposcopy, or sent back to routine screening. We used cytology screening and HPV screening without triage as references. Our aim was to compare different algorithms for HPV screening with cytology triage and assess their sensitivities and specificities for CIN II+. We therefore estimated the real-life accuracy of different algorithms using the large Finnish randomized HPV screening implementation trial data [
Detection rates of precancerous and cancerous cervical lesions within one screening round of primary human papillomavirus DNA testing: prospective randomised trial in Finland.
In 2003, a large-scale randomized HPV screening implementation trial started in Southern Finland. The trial is registered as an International Standard Randomised Controlled Trial (number ISRCTN23885553). During 2003–2008, over 236,000 individuals aged 25 to 65 years were randomized in a 1:1 ratio to an HPV test with a cytology triage (HPV arm) or to a conventional cytology test (cytology arm). The HPV test was Hybrid Capture 2 (HC2). The trial ran for two 5-year screening rounds up to 2012, for a total duration of 10 years. After the first screening round, the largest municipality in Finland dropped out, and only a subset of 102,150 of the 236,727 randomized individuals remained in the trial and were invited to the second screening round (Fig. 1, Fig. 2). The trial was implemented within the organized cervical cancer screening programme, in which all women in the target age range are invited to screening every five years irrespective of their previous findings. The randomization and management procedures of this trial are described in previous publications [
In the HPV arm, the study population (shown in light blue) for the first screening round consists of women who attended the index test and were tested with an HPV test. In the cytology arm, the study population (shown in light blue) consists of women who attended the index test. Incident CIN II+ cases were collected from the Care Register for Health Care (HILMO), the Finnish Cancer Registry (FCR), and the Mass Screening Registry (MSR).
The individuals invited in the second round are a subset of the individuals randomized in the first round. In the HPV arm, the study population (shown in light blue) for the second screening round consists of women who attended the index test and were tested with an HPV test. In the cytology arm, the study population (shown in light blue) consists of women who attended the index test. Incident CIN II+ cases were collected from the Care Register for Health Care (HILMO), the Finnish Cancer Registry (FCR), and the Mass Screening Registry (MSR).
Data on the individuals randomized in the HPV trial were gathered from the Mass Screening Registry (MSR). Our study population consists of those who attended their index screen in the first or the second screening round (Fig. 1, Fig. 2). In the HPV arm of both screening rounds, we restricted the study population further to those actually tested with HPV in the index screen.
We defined index screen as the first screening test containing both the primary and the possible cytology triage test.
The CIN II+ cases in the study population comprised cases diagnosed within a 4.5-year period after the index screen. For both screening rounds, only the first CIN II+ case of each woman was considered. CIN II+ cases detected in the screening were obtained from the MSR. CIN II+ cases detected through opportunistic testing or diagnostic visits during the trial were obtained from the Care Register for Health Care (HILMO) and the Finnish Cancer Registry (FCR). Data on cervical cancer were derived from the FCR.
We formulated five alternative post-hoc algorithms for HPV screening with a cytology triage, one reference algorithm for HPV screening without a triage and one reference algorithm for cytology screening. The algorithms with a triage comprised different combinations of a primary HPV test with a cytology triage and follow-up tests. In all algorithms, the recommended interval for follow-up tests was 12–24 months. The chosen algorithms were those used in European screening programmes, those presented in previous studies, or those otherwise easy to implement in screening [
]. Please note that since the HPV algorithms were created post-hoc and were applied to the same study population, they are not mutually independent.
The five algorithms for HPV screening with a cytology triage were classified into two HPV Persistence Algorithms (Fig. 3) and three Decisive Cytology Algorithms (Fig. 4). The reference algorithms, HPV screening without cytological triage (HPV Stand-alone) and cytology screening (Cytology Algorithm), are presented in Fig. 3. The HPV Persistence Algorithm 2 and the Cytology Algorithm were those used in the original trial.
Fig. 3. HPV Persistence algorithms and reference algorithms.
(A) HPV Stand-alone Algorithm. All HPV positive women are referred to colposcopy at the index test and HPV negative women are sent back to routine screening. (B) Cytology Algorithm. The algorithm used in the cytology arm of the trial. Some laboratories sent ASCUS individuals directly to colposcopy after the first follow-up screen and some sent them to a second follow-up screen. (C) HPV Persistence Algorithm 1. All women who have persistent HPV positivity in the first follow-up test are referred to colposcopy. (D) HPV Persistence Algorithm 2. All women who have persistent HPV positivity in the second follow-up test are referred to colposcopy.
(E) Decisive Cytology Algorithm 1. All HPV positive, cytology negative individuals are sent back to routine screening already at the index screen. (F) Decisive Cytology Algorithm 2. All HPV positive, cytology negative individuals are sent back to routine screening at the first follow-up screen. (G) Decisive Cytology Algorithm 3. All HPV positive, cytology negative individuals are sent back to routine screening at the second follow-up screen.
In the HPV Persistence Algorithms, all HPV positive, cytology negative individuals were followed up with one or two tests and, if HPV persisted in the final follow-up test, sent to colposcopy. In the Decisive Cytology Algorithms, HPV positive, cytology negative individuals were sent back to routine screening either after the index or the follow-up tests. The name Decisive Cytology refers to the fact that in these algorithms, the follow-up and referral decisions were based on results from the cytology trial rather than solely the persistence of HPV.
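The referral logic described above can be sketched in code. The following Python is a simplified illustration, not the trial's actual management protocol: it assumes that any cytological abnormality in an HPV positive individual leads to referral (ignoring grade-specific rules such as ASCUS handling), and the names `Visit`, `RULES` and `episode_positive` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Visit:
    """Result of one screening visit (index screen or a follow-up screen)."""
    hpv_positive: bool
    cytology_abnormal: bool  # ASSUMPTION: a single yes/no flag, no grading


# Per algorithm: (number of visits used, refer HPV+/cytology- at the last visit?)
RULES = {
    "hpv_stand_alone": (1, True),
    "hpv_persistence_1": (2, True),
    "hpv_persistence_2": (3, True),
    "decisive_cytology_1": (1, False),
    "decisive_cytology_2": (2, False),
    "decisive_cytology_3": (3, False),
}


def episode_positive(visits: Sequence[Visit], algorithm: str) -> bool:
    """Return True if the episode ends in a colposcopy referral."""
    max_visits, refer_if_persistent = RULES[algorithm]
    for step, visit in enumerate(visits[:max_visits], start=1):
        if not visit.hpv_positive:
            return False  # HPV negative: back to routine screening
        if visit.cytology_abnormal:
            return True  # ASSUMPTION: any abnormal triage cytology is referred
        if step == max_visits:
            # HPV positive, cytology negative at the algorithm's final visit
            return refer_if_persistent
    return False  # no further visits recorded: no referral in this episode


# Persistent HPV with normal cytology at all three visits
persistent = [Visit(True, False)] * 3
results = {name: episode_positive(persistent, name) for name in RULES}
```

For this hypothetical individual, the three referring algorithms (HPV Stand-alone and both HPV Persistence algorithms) give a positive episode, while all three Decisive Cytology algorithms return her to routine screening. This is the mechanism behind the sensitivity differences between the two groups.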
The HPV Persistence Algorithm 2 was the algorithm used in the Finnish HPV screening implementation trial and is currently used in England [
We calculated episode sensitivity, specificity, positive predictive value (PPV) for CIN II+, and colposcopy referral rates for all seven screening algorithms in the first and the second screening rounds. We also calculated CIN II+ cases detected per colposcopies performed for each algorithm. Both screening rounds started at the index test and continued for 4.5 years. A finding leading to referral at any point of the algorithm (at the index test or at the first or second follow-up test) was considered a positive episode result; otherwise the woman was episode negative. The estimates for episode sensitivity, specificity and PPVs and their confidence intervals were calculated using the epiR package [
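As a minimal illustration of these definitions, the Python sketch below computes the three point estimates from a 2×2 table of episode result against CIN II+ status. It is not the authors' code (they used the epiR package in R, which also provides confidence intervals), and the function name and counts are hypothetical.

```python
def episode_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Point estimates from a 2x2 table of episode result vs CIN II+ status.

    tp: episode positive, CIN II+ diagnosed within the round
    fp: episode positive, no CIN II+
    fn: episode negative, CIN II+ diagnosed within the round
    tn: episode negative, no CIN II+
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of CIN II+ cases flagged
        "specificity": tn / (tn + fp),  # share of non-cases not flagged
        "ppv": tp / (tp + fp),          # share of flagged episodes with CIN II+
    }


# Hypothetical counts, chosen only for illustration (not trial data)
estimates = episode_accuracy(tp=84, fp=200, fn=16, tn=9700)
```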
The colposcopy referral rates were created post-hoc according to the referral criteria in each algorithm. They were calculated by dividing the number of individuals referred to colposcopy by the number of screened individuals. The confidence intervals for the colposcopy referral rates were calculated using the Wilson score method [
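The Wilson score interval for a proportion can be sketched as follows. This is a plain Python implementation of the standard formula under an assumed 95% confidence level, not the authors' code; the function name and the example counts are hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple:
    """Wilson score interval for a binomial proportion (default z: 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 250 individuals referred out of 10,000 screened
referred, screened = 250, 10_000
rate = referred / screened
lower, upper = wilson_interval(referred, screened)
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and behaves well for small proportions such as referral rates of a few percent.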
To examine sensitivity changes throughout the screening round, we also calculated cumulative estimates in the first screening round using the method described in Nygård et al. 2014 [
In the first screening round, 65.2% (77,148/118,324) of the individuals who were randomized to the HPV arm of the trial attended the index test, and 93.6% of these (72,238/77,148) were tested with an HPV test (Fig. 1). In the cytology arm, 64.7% (76,654/118,403) of those randomized attended the index test. The cumulative detection rate of CIN II+ at 4.5 years was 0.93% (670/72,238) and 0.60% (457/76,654) for the HPV and the cytology arms, respectively (Supplement 1). The proportion of cervical cancers among CIN II+ cases was the same, 3.7%, in both arms (25/670 and 17/457). In the first screening round, 64% of the first follow-up HPV tests were done between 12 and 18 months from the index test (shown as a detection peak in Supplement 1).
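The first-round percentages above follow directly from the reported counts, as this short Python check shows. The variable names are illustrative; the numbers are the ones quoted in the text.

```python
def pct(numerator: int, denominator: int) -> float:
    """Proportion expressed as a percentage."""
    return 100 * numerator / denominator


# First screening round, counts as reported in the text
hpv_attendance = pct(77_148, 118_324)   # attended index test, HPV arm
hpv_tested = pct(72_238, 77_148)        # attenders actually HPV tested
cyt_attendance = pct(76_654, 118_403)   # attended index test, cytology arm

hpv_detection = pct(670, 72_238)        # cumulative CIN II+ detection, HPV arm
cyt_detection = pct(457, 76_654)        # cumulative CIN II+ detection, cytology arm

hpv_cancer_share = pct(25, 670)         # cancers among CIN II+, HPV arm
cyt_cancer_share = pct(17, 457)         # cancers among CIN II+, cytology arm

for label, value in [
    ("HPV arm attendance", hpv_attendance),
    ("HPV arm tested", hpv_tested),
    ("Cytology arm attendance", cyt_attendance),
    ("HPV arm CIN II+ detection", hpv_detection),
    ("Cytology arm CIN II+ detection", cyt_detection),
]:
    print(f"{label}: {value:.2f}%")
```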
In the second screening round, 60.4% (30,849/51,101) of those invited in the HPV arm attended the index screen and 80.8% (24,937/30,849) of them were tested with an HPV test (Fig. 2). In the cytology arm, 60.2% (30,720/51,049) of those invited attended the index screen.
The cumulative detection rate of CIN II+ at 4.5 years was 0.52% (129/24,937) and 0.45% (138/30,720) for the HPV and the cytology arms, respectively (Supplement 1). The proportion of cervical cancers among CIN II+ cases was higher in the cytology arm, 5.1% (7/138), than in the HPV arm, 3.1% (4/129). In the second screening round, 67% of the first follow-up HPV tests were done between 12 and 18 months from the index test (again shown as a detection peak in Supplement 1).
The episode sensitivity (vertical axis) and specificity (horizontal axis) estimates, the PPVs for CIN II+, and the colposcopy referral rates of each algorithm are summarized by screening round in Fig. 5. Based on this receiver operating characteristic (ROC)-like curve, the best balance between sensitivity and specificity is achieved by the HPV Persistence 2 algorithm in both screening rounds. HPV Persistence Algorithm 1 has a similar sensitivity to HPV Persistence 2 in both rounds but a lower specificity and a higher colposcopy rate. The HPV Stand-alone Algorithm has clearly the highest sensitivity but the lowest specificity, although its specificity improves in the second screening round. In both rounds, the highest specificities are found in the two algorithms with the lowest sensitivities, Decisive Cytology 1 and the Cytology Algorithm. However, the differences in specificity are quite marginal across all HPV Persistence and Decisive Cytology algorithms.
Fig. 5. ROC-like curve presenting the algorithm characteristics by screening round.
The x-axis is 1-specificity and the y-axis the sensitivity for CIN II+. The color of the point represents the magnitude of the PPV, and the size of the point the colposcopy referral rate.
The colposcopy referral rate is inversely related to specificity, being highest in the HPV Stand-alone Algorithm and lowest in the Cytology Algorithm. Compared to the first screening round, the colposcopy referral rates are clearly lower in the HPV Stand-alone, HPV Persistence 1, and HPV Persistence 2 algorithms in the second screening round (see Supplement 2 for exact numbers). The ratio of CIN II+ cases detected to colposcopies performed was highest in the algorithms with the highest specificity (see Supplement 2). In the HPV algorithms, this ratio is much higher in the first screening round than in the second screening round.
The PPVs of the HPV algorithms are better in the first than in the second screening round (Fig. 5, Supplement 2). In the first screening round, all PPVs, except that of the HPV Stand-alone Algorithm, are far over 20%. In the second screening round, all PPVs of the HPV algorithms are 17% or below.
3.1 The cumulative sensitivities
The HPV Stand-alone Algorithm shows the highest (>93%) cumulative sensitivity throughout the first screening round (Fig. 6). The HPV Persistence 1, HPV Persistence 2, and Decisive Cytology 3 algorithms also show high cumulative sensitivity (>80%).
Fig. 6. Cumulative sensitivities for algorithms in the first screening round.
Algorithms are presented in decreasing order in cumulative sensitivity. The x-axis is the time in years from the index test to the end of the first screening round (4.5 years). The y-axis is the cumulative sensitivity. The upper and lower limits of confidence intervals are the dashed lines.
The cumulative sensitivity curve of the Decisive Cytology Algorithm 1 differs from the curves of the other HPV algorithms. The cumulative sensitivity of Decisive Cytology 1 starts to decrease already after one year, at the time of the first follow-up test, and at 4.5 years it is only 65%. The cumulative sensitivity curve of Cytology Algorithm decreases quite steadily after one year, dropping to 69% at 4.5 years.
4. Discussion
We compared the sensitivity, specificity, PPV for CIN II+, and colposcopy referral rates of five different HPV algorithms with cytology triage, an HPV algorithm without triage, and a cytology algorithm within two rounds of an HPV implementation trial. Based on our analyses, the best balance between episode sensitivity and episode specificity was achieved by the HPV Persistence 2 Algorithm in both screening rounds. In this algorithm, all HPV positive, cytology negative individuals were followed up with two tests without sending them back to routine screening. In both screening rounds, HPV Persistence 1 had quite a similar episode sensitivity, but a lower specificity and thus a clearly higher colposcopy referral rate than HPV Persistence 2.
The Decisive Cytology Algorithms, where HPV positive cytology negative individuals were sent back to the routine screening either at the index test or at the follow-up tests, had lower colposcopy referral rates and lower episode sensitivities for the CIN II+ detection compared to the other HPV algorithms. From the cumulative sensitivity curves, we can see that the drops in sensitivity happened when HPV positive cytology negative individuals were sent back to the routine screening. Additionally, the episode specificity estimates of these algorithms were not essentially better than that of the HPV Persistence Algorithm 2.
The HPV Stand-alone Algorithm has the highest episode sensitivity in both rounds but is not a sensible algorithm for population-based cervical cancer screening due to its low specificity. When HPV Stand-alone is compared with the Cytology Algorithm, the HPV test finds radically more CIN II+ cases than the cytological test. However, Finnish primary cytology screening has had a high impact and a low probability of false negative cytological diagnoses that would have led to cervical cancer diagnosed after the screening visits [
Large performance variation does not affect outcome in the Finnish cervical cancer screening programme: performance and outcome in cervical cancer screening.
]. This may be of relevance also considering that all the HPV algorithms with cytology triage, except for Decisive Cytology Algorithm 1, would lead to a higher overall CIN II+ detection probability than the Cytology Algorithm.
Feasible triage algorithms have been studied previously for the first HPV screening round [
], on the other hand, considered this algorithm infeasible due to a high colposcopy referral rate, which was the main constraint in their analysis. Instead, they considered a strategy comparable to our Decisive Cytology Algorithm 2 the most suitable. We cannot rule out that differential referral rate as well as detection rate of CIN lesions in cytology between Sweden and the Netherlands could have affected the above interpretations. Nonetheless, we did not find distinct differences in the colposcopy referral rates between these two algorithms in our study.
Based on our results, the algorithm characteristics differ only slightly by screening round. The sensitivities for CIN II+ are slightly lower in the second screening round than in the first in all the algorithms. Similarly, the colposcopy referral rates are also lower, especially in the HPV Stand-alone Algorithm. The ratio of CIN II+ detected to colposcopies performed and the PPVs are lower in the HPV algorithms in the second screening round than in the first round. The second screening round may more closely resemble the real-life situation after the onset of routine HPV screening and is therefore important to assess. For the HPV algorithms, these differences might be due to the fact that many prevalent HPV infections were treated during the first screening round. Further, the trial cohort was 5 years older in the second screening round, which may have slightly lowered the number of infections and lesions, affecting the algorithm characteristics.
The HPV screening algorithms suffer from substantial overdiagnosis of non-progressive CIN II+ lesions, because HPV screening cannot differentiate between progressive and non-progressive CINs. To examine the potential for overdiagnosis, we calculated the number of CIN II+ lesions needed to treat (NNT) [
] to prevent one additional cancer in the HPV arm compared to the cytology arm. This was calculated for individuals who attended both screening rounds (see Supplement 3 for details). The NNT was 53, which indicates that many of the CIN II+ cases in the HPV arm in the first screening round represent overdiagnosis. The incidence of cervical cancer is low in Finland, mostly due to the high effectiveness of cytological screening [
]. This may elevate the NNT, since there are only a few additional cancers to be prevented. Therefore, in a country where the cervical cancer incidence is clearly higher [
] or cervical screening has not yet been developed, the balance between benefits and harms could be different.
Of the 25 cancers in the HPV arm in the first screening round, seven were HPV negative in the index test. In the second round, one of the four cancers in the HPV arm was HPV negative in the index test. There were only minor differences in the detection of these cancers in both rounds among the HPV algorithms. However, Decisive Cytology 1 clearly detected the fewest cancers among the HPV algorithms, and it was the only HPV algorithm with a lower cancer detection percentage than the Cytology Algorithm.
Because the more sensitive HPV algorithms have clearly higher colposcopy rates than the Cytology Algorithm, triage options other than cytology are still needed. Further triage options for HPV positive women, such as HPV genotyping or other clinically validated triage tests, might be considered [
Our data were based on one of the largest and earliest HPV screening trials which was executed as a part of the routine screening programme. Due to a long follow-up (2003–2012) we were able to study the algorithm characteristics over two screening rounds, making the results more applicable to routine screening. We also had individual information on the randomized individuals, which allowed us to use comprehensive data from three nationwide registers.
The algorithms were created post-hoc and applied to the same study population. This means that the HPV algorithms are not mutually independent, which may affect the results. During the trial, individuals were managed according to the HPV Persistence Algorithm 2, and because not all HPV positive individuals were immediately sent to colposcopy, we might have missed a few (regressive) CIN II+ cases only detectable with the most sensitive algorithms. However, because our study material also included all CIN II+ cases diagnosed outside the screening programme, and since opportunistic testing is highly common in Finland [
], we can assume that the results are close to a real-life scenario.
Adherence to follow-up tests can affect the sensitivities of algorithms with one or two follow-up tests. However, the inclusion of the CIN II+ cases diagnosed outside the organized screening clearly reduces the potential effect of this phenomenon on our results. It is also possible, but similarly unlikely, that individuals sought treatment only long after the second screening round, and that we have thus missed lesions.
One major limitation is the current histological classification using high-grade squamous intraepithelial lesion (HSIL) grading, encompassing both CIN II and CIN III. Therefore, we could only consider CIN II+ as the outcome [
]. Furthermore, in the early years of the trial our cytology laboratories used a modified Papanicolaou classification; the conversion from Papanicolaou groups to Bethesda diagnoses is not precise. Also, the management guidelines for cervical precancers have changed after the onset of the trial. Primary HPV screening is currently not recommended for individuals under 30 years old [
]. However, our data also included individuals younger than this. If the HPV algorithms had been limited to individuals aged 35 and over, the specificity of HPV testing would have been slightly higher, the prevalence of precancers lower and the progression probability of precancers larger [
Comparison of the Digene HC2 assay and the Roche AMPLICOR Human Papillomavirus (HPV) Test for detection of high-risk HPV genotypes in cervical samples.
Based on our results, the best balance between sensitivity and specificity is achieved by the HPV Persistence Algorithm with two repeated follow-up tests. Screening algorithms in which HPV positive, cytology negative individuals are sent back to routine screening yield a lower sensitivity for CIN II+ and only a slightly higher specificity. Furthermore, even the HPV Persistence Algorithms require more colposcopy referrals than cytology screening and therefore contribute to increased overdiagnosis. Triage options based on HPV genotyping should be considered for HPV positive individuals.
Ethics statement
Permit for this register-based study has been granted by the Finnish Institute for Health and Welfare (THL/276/5.05.00/2018).
Data availability
Data cannot be shared publicly due to their confidentiality. Data could be shared with individuals who have also received a permit for their use from the Finnish Institute for Health.
Ahti Anttila is a member of the author group for the current Finnish clinical guidelines for precancers of the cervix, vagina and vulva, and of the editorial board and author group of the European guidelines for cervical cancer screening. Tytti Sarkeala is a member of the Finnish national screening board.
Acknowledgement
We thank Maiju Pankakoski and Aapeli Nevala for help. We also thank all the people at the Finnish Cancer Registry who have assisted us during this process. We thank Cancer Foundation Finland for the funding (grant reference number: 190101).
Cumulative detection rate of CIN II+ by study arm. This was calculated by dividing the cumulative number of CIN II+ cases detected in each study arm by the size (N) of the study population in the corresponding study arm.
Algorithm characteristics by screening round. Sensitivities, specificities, PPVs, colposcopy referral rates and amounts, and the ratio of detected CIN II+ cases to colposcopies are presented for each algorithm separately for both screening rounds.