Next-Generation Sequencing of Influenza Viruses in a Household Cohort Accurately Identifies Transmission Pairs and Reveals a Bottleneck Size of Close to One

Abstract

Background. A detailed understanding of influenza virus transmission is central to the evaluation of public health interventions. Household transmission is a major driver of influenza epidemics. However, individuals remain at risk for community-acquired infection even while they are exposed to an infected household contact. We hypothesized that sequence data could be used to differentiate community and household-acquired viruses, improving the specificity of transmission studies.

Methods. We used the Illumina platform to sequence 192 influenza A-positive throat and nasal swabs collected within 7 days of illness onset over 4 influenza seasons in the HIVE study, a prospective cohort of households with ≥2 children that each year includes 250–350 families. We developed and robustly benchmarked a variant calling pipeline that can identify intrahost single nucleotide variants (iSNV) as rare as 0.5% in patient-derived influenza populations with >99.95% specificity.

Results. Intrahost viral diversity was quite low. The average number of iSNV per sample at >0.5% frequency was 10 (range 1–135, IQR 4–9) and did not differ appreciably with day of sampling or vaccination status. Pearson correlation of shared iSNV accurately distinguished 23 H1N1 and 34 H3N2 household transmission pairs from community samples. The pairwise genetic distances of influenza populations from these transmission pairs were sufficiently low relative to a null distribution of distances generated each season from the entire cohort that we could identify transmission pairs with >95% confidence. Across all transmission pairs, 424 iSNV were polymorphic, and 30.2% of iSNV were found in both members of a pair. Maximum likelihood optimization and a simple binomial model estimated that a bottleneck size of <10 transmitted viruses is most consistent with the number and frequency of shared variants in our cohort.
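To make the bottleneck reasoning concrete, here is a minimal sketch of a presence/absence binomial model fit by maximum likelihood. It is an illustration under stated assumptions, not the pipeline used in the study: it assumes that each of Nb transmitted viruses is drawn independently from the donor population, so a donor iSNV at frequency f appears in the recipient with probability 1 - (1 - f)^Nb. The function names, the grid search over integer Nb, and the example values are all hypothetical.

```python
import math


def log_likelihood(nb, observations):
    """Log-likelihood of bottleneck size nb under a presence/absence model.

    observations: iterable of (donor_frequency, shared) tuples, where
    shared is True if the donor iSNV was also detected in the recipient.
    Assumption: a variant at donor frequency f is transmitted with
    probability 1 - (1 - f) ** nb (at least one of nb founders carries it).
    """
    total = 0.0
    for freq, shared in observations:
        p_transmit = 1.0 - (1.0 - freq) ** nb
        # Clamp away from 0 and 1 so the logarithm is always defined.
        p_transmit = min(max(p_transmit, 1e-12), 1.0 - 1e-12)
        total += math.log(p_transmit if shared else 1.0 - p_transmit)
    return total


def estimate_bottleneck(observations, max_nb=100):
    """Return the integer bottleneck size in 1..max_nb with the highest likelihood."""
    return max(range(1, max_nb + 1),
               key=lambda nb: log_likelihood(nb, observations))


if __name__ == "__main__":
    # Made-up donor frequencies and sharing outcomes, for illustration only.
    example = [(0.40, True), (0.05, False), (0.10, False), (0.02, False)]
    print(estimate_bottleneck(example))
```

Under this simplified model, low-frequency donor variants that fail to appear in the recipient pull the estimate toward small bottlenecks, while shared low-frequency variants pull it upward. A fuller treatment would also account for variant-calling sensitivity and for changes in frequency after transmission, which this sketch ignores.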

Conclusion. We have used deep sequencing to identify transmission pairs in a household-based cohort. Our data suggest that the size of the transmission bottleneck is much lower than previously thought and that infections originating from more than one virus are relatively uncommon in this population. This study is the most comprehensive to date of the genetics and molecular epidemiology of influenza virus transmission.

Publication
Open Forum Infectious Diseases
Ryan Malosh
Assistant Research Scientist