Genetic association with B-cell acute lymphoblastic leukemia in allogeneic transplant patients differs by age and sex

Alyssa I. Clay-Gilmour, Theresa Hahn, Leah M. Preus, Kenan Onel, Andrew Skol, Eric Hungate, Qianqian Zhu, Christopher A. Haiman, Daniel O. Stram, Loreall Pooler, Xin Sheng, Li Yan, Qian Liu, Qiang Hu, Song Liu, Sebastiano Battaglia, Xiaochun Zhu, AnneMarie W. Block, Sheila N. J. Sait, Ezgi Karaesmen, Abbas Rizvi, Daniel J. Weisdorf, Christine B. Ambrosone, David Tritchler, Eva Ellinghaus, David Ellinghaus, Martin Stanulla, Jacqueline Clavel, Laurent Orsi, Stephen Spellman, Marcelo C. Pasquini, Philip L. McCarthy and Lara E. Sucheston-Campbell

Key Points

  • IKZF1 associations with high-risk B-ALL may differ by age and sex.

  • A novel variant on chromosome 14, rs189434316, is associated with over a 3.5-fold risk of normal cytogenetic B-ALL.


The incidence and mortality rates of B-cell acute lymphoblastic leukemia (B-ALL) differ by age and sex. To determine if inherited genetic susceptibility contributes to these differences we performed 2 genome-wide association studies (GWAS) by age, sex, and subtype and subsequent meta-analyses. The GWAS included 446 B-ALL cases, and 3027 healthy unrelated blood and marrow transplant (BMT) donors as controls from the Determining the Influence of Susceptibility Conveying Variants Related to One-Year Mortality after BMT (DISCOVeRY-BMT) study. We identified 1 novel variant, rs189434316, significantly associated with odds of normal cytogenetic B-ALL (odds ratio from meta-analysis [ORmeta] = 3.7; 95% confidence interval [CI], 2.5, 6.2; P value from meta-analysis [Pmeta] = 6.0 × 10−9). The previously reported pediatric B-ALL GWAS variant, rs11980379 (IKZF1), replicated in B-ALL pediatric patients (ORmeta = 2.3; 95% CI, 1.5, 3.7; Pmeta = 1.0 × 10−9), with evidence of heterogeneity (P = .02) between males and females. Sex differences in single-nucleotide polymorphism effect were seen in those >15 years (OR = 1.7; 95% CI, 1.4, 2.2, PMales = 6.38 × 10−6/OR = 1.1; 95% CI, 0.8, 1.5; PFemales = .6) but not ≤15 years (OR = 2.3; 95% CI, 1.4, 3.8; PMales = .0007/OR = 1.9; 95% CI, 1.2, 3.2; PFemales = .007). The latter association replicated in independent pediatric B-ALL cohorts. A previously identified adolescent and young-adult onset ALL-associated variant in GATA3 is associated with B-ALL risk in those >40 years. Our findings provide more evidence of the influence of genetics on B-ALL age of onset and we have shown the first evidence that IKZF1 associations with B-ALL may be sex and age specific.


Acute lymphoblastic leukemia (ALL) is a disease primarily impacting children; however, one-third of cases are in adults (>20 years of age).1 The 5-year relative survival for children 1 to 5 years of age is >90%, but for those over 60 years of age, the 5-year relative survival is <20%.2,3 Causation appears to be multifactorial, including exogenous or endogenous exposures and genetic susceptibility.4 Studies of environmental risk contributing to ALL are inconclusive. Suggested risk factors include obesity and smoking in individuals >55 years of age, occupational exposures, chemotherapy/radiation as therapy for other diseases, and prenatal/early exposures.5-15 These external exposures only account for a small proportion of disease risk, and a strong case for genetic contribution can be made for pediatric and adolescent/young-adult (AYA) ALL.16-19

Genome-wide association studies (GWASs) have identified common genetic variants for pediatric and AYA ALL in 6 regions either within or near these genes: ARID5B, IKZF1, CDKN2A/B, CEBPE, GATA3, BMI1-PIP4K2A.20-31 Recently, the largest meta-analysis of 2 pediatric GWASs of B-cell ALL (B-ALL) identified 2 new susceptibility loci within genes LHPP and ELK3.32 Variants in ARID5B, IKZF1, and GATA3 vary in effect by age and cytogenetic subgroups. For example, ARID5B variants are associated with hyperdiploid B-ALL in pediatric patients,20,30,33,34 which accounts for ∼30% of pediatric ALL and is a marker of favorable prognosis.1 In contrast, rs3824662, a variant in GATA3, was consistently associated with AYA ALL regardless of cytogenetic subgroup, but is not associated with pediatric ALL.29 Despite the evidence of potential age effects20,21 (evidenced by the high mortality rate in adult ALL cases3) and sex effects (both higher incidence and worse prognosis of ALL in males than females), GWASs have not been conducted in adult-onset ALL, nor have they been conducted by sex.3

To this end, we performed the first GWAS in a high-risk B-ALL population treated with unrelated donor (URD) allogeneic blood and marrow transplant (BMT), then further stratified by age, sex, and cytogenetic subgroup.


This study was conducted in accordance with the Declaration of Helsinki and was reviewed and approved by the Roswell Park Cancer Institute Institutional Review Board. All patient data were deidentified. Summary data are provided in this manuscript.

Study design and population

The cases and controls were selected from an ongoing parent study: Determining the Influence of Susceptibility Conveying Variants Related to One-Year Mortality after BMT (DISCOVeRY-BMT).35-38 Briefly, the parent study was designed to find common germ line genetic variation associated with survival after an URD-BMT. DISCOVeRY-BMT consists of 2 cohorts of ALL, acute myeloid leukemia, and myelodysplastic syndrome patients and their HLA-matched unrelated healthy donors35 (supplemental Methods). For this study, cases were diagnosed with B-ALL and controls were unrelated healthy donors aged 18 to 61 years who passed a comprehensive medical examination and were disease-free at the time of donation. T-cell ALL cases (N = 77) were removed from both cohorts due to the low number of cases and inherent differences between B- and T-cell ALL. All patients and donors provided written informed consent for their clinical data to be used for research purposes and were not compensated for their participation.

Genotyping and quality control

Genotyping was performed at the University of Southern California (USC) Genomics Facility using the Illumina Omni-Express BeadChip containing ∼733 000 single-nucleotide polymorphisms (SNPs). SNPs were removed if the missing rate was >2%, minor allele frequency (MAF) <1%, or for violation of Hardy-Weinberg equilibrium (P < 1.0 × 10−6).
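The three SNP-level exclusion rules above can be sketched as a single predicate. This is a minimal illustration only: the `SnpStats` record and `passes_qc` function are hypothetical names, and the study itself applied these filters in R and Plink rather than with this code.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical record for illustration)."""
    snp_id: str
    missing_rate: float  # fraction of samples with no genotype call
    maf: float           # minor allele frequency
    hwe_p: float         # Hardy-Weinberg equilibrium test P value


def passes_qc(snp, max_missing=0.02, min_maf=0.01, min_hwe_p=1.0e-6):
    """Keep a SNP unless missing rate >2%, MAF <1%, or HWE P < 1.0e-6."""
    return (
        snp.missing_rate <= max_missing
        and snp.maf >= min_maf
        and snp.hwe_p >= min_hwe_p
    )


# Made-up example SNPs, one failing each rule in turn.
snps = [
    SnpStats("snp_ok", 0.010, 0.20, 0.50),
    SnpStats("snp_missing", 0.050, 0.20, 0.50),
    SnpStats("snp_rare", 0.000, 0.005, 0.50),
    SnpStats("snp_hwe", 0.000, 0.20, 1.0e-9),
]
kept = [s.snp_id for s in snps if passes_qc(s)]
```

Only `snp_ok` survives in this toy input; the thresholds are exposed as keyword arguments so that the same predicate could be reused with different cutoffs.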

Problematic samples were removed based on the sample missing rate, duplicates, reported-genotyped sex mismatch, abnormal heterozygosity, cryptic relatedness, and population outliers (supplemental Methods). All quality-control (QC) measures were implemented in R and Plink statistical software.39,40


Genotype data were imputed using Impute2 v2.041-43 with a reference panel of haplotypes from the 1000 Genomes phase 1v3.44,45 QCTOOL46 was used to remove imputed genotypes with a MAF <0.01 and info and certainty score <0.7 and <0.9, respectively.45,46

Statistical analyses

Descriptive statistics, including χ2 and Student t tests, were performed on demographic variables (age, sex, and cytogenetic subgroup) by case-control status. Logistic regression assuming an additive model implemented in SNPTESTv2.5 was used to perform genome-wide association analyses adjusted for age in cases and controls for overall B-ALL and subtype-specific B-ALL analyses.40,43 The following subtype-specific analyses were performed: hyperdiploid negative (<51 chromosomes), Philadelphia chromosome negative (Ph), abnormal and normal cytogenetic subgroups. Some high-risk subtypes have been omitted due to data availability and insufficient sample sizes. GWASs by age group were done for individuals <20 years (pediatric), 20 to 40 years (young adult), and >40 years (older adult) compared with all controls. These age categories represent an average of age-specific categories from clinical literature and prior pediatric and AYA GWASs. Sex-specific analyses were also performed in males and females; sex status of individuals was self-reported and further confirmed by genotyping. Cohorts 1 and 2 were combined with METAL software (odds ratio from meta-analysis [ORmeta] and P value from meta-analysis [Pmeta])47,48 using standard error weighted meta-analysis; genome-wide significance was defined as Pmeta < 5 × 10−8 with Pmeta < 5.0 × 10−6 considered suggestive of association. The Cochran Q method (P values) and I2 were used to test for heterogeneity across subgroups.49 To determine whether age or sex could be considered a mediator50,51 (thus significantly impacting the effect of the association between SNP and B-ALL risk), the Sobel mediation test was used.52,53 The proportion of the total effect mediated by age or sex was determined by comparing the difference between the β-coefficients for B-ALL before and after adjustment for age: (βunadj − βadj)/βunadj, where βunadj and βadj are the total effect and the direct effect, respectively.52,53
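As a rough illustration of the meta-analysis and heterogeneity statistics named above, the sketch below combines 2 per-cohort log-odds ratios by inverse-variance weighting (the textbook form of a standard error weighted fixed-effect meta-analysis) and computes Cochran Q and I2. It is not METAL's code; the function names and input values are hypothetical, and the Q test P value is given only for the 2-estimate case (1 degree of freedom).

```python
import math


def meta_fixed(betas, ses):
    """Inverse-variance weighted pooled log-OR, its SE, and a two-sided P value."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled, pooled_se, p_value


def heterogeneity(betas, ses):
    """Cochran Q and I2 (percent); P value is returned only when df == 1."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # For 1 df, the chi-square upper tail equals erfc(sqrt(Q / 2)).
    q_p = math.erfc(math.sqrt(q / 2.0)) if df == 1 else None
    return q, i2, q_p


# Hypothetical log-ORs and standard errors for two cohorts (not study data).
betas, ses = [0.6, 0.1], [0.10, 0.15]
pooled, pooled_se, p_meta = meta_fixed(betas, ses)
q, i2, q_p = heterogeneity(betas, ses)
or_meta = math.exp(pooled)  # back-transform the pooled log-OR to an OR
```

The same 2 functions apply whether the 2 estimates are cohorts 1 and 2 or a pair of subgroups such as males and females; with more than 2 estimates, the Q test P value would need a general chi-square distribution function.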

We considered there to be significant evidence of age mediating the SNP-disease relationship when P was <.05 and all of the Sobel test criteria for a mediator were met.52,53 To interrogate the role of genetics in ALL age of onset further, we also performed a case-only analysis, using age as outcome (continuous and categorical) in regression models for each genome-wide significant association initially identified. Interaction analyses of SNP and sex, together with comparisons of crude, sex-adjusted, and sex-stratified ORs, were used to determine whether sex was acting as an effect modifier.
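The mediation quantities described above can be written out in a few lines. The sketch assumes the usual first-order Sobel statistic, z = ab / sqrt(b²·SEa² + a²·SEb²), where a is the SNP-to-mediator coefficient and b is the mediator-to-outcome coefficient adjusted for the SNP; all names and numbers below are hypothetical and are not taken from the study.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel z statistic and two-sided P value for the indirect effect a*b."""
    z = (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def proportion_mediated(beta_unadj, beta_adj):
    """(beta_unadj - beta_adj) / beta_unadj: share of the total effect that is mediated."""
    return (beta_unadj - beta_adj) / beta_unadj


# Hypothetical coefficients for illustration only.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
share = proportion_mediated(beta_unadj=0.50, beta_adj=0.48)
```

With these made-up inputs, the total effect of 0.50 falls to a direct effect of 0.48 after adjustment, so 4% of the effect is attributed to the mediator.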

Replication of sex-specific effects

DISCOVeRY-BMT ALL cases are predominantly young adults and older adults; thus, sex-specific findings for our pediatric associations were tested within a meta-analysis of 2 previously performed pediatric B-ALL GWASs. These 2 data sets were previously used in a meta-analysis with 2 other published GWASs.54-60 Briefly, the cases consisted of 437 children of European ancestry with B-ALL treated on The Children’s Oncology Group P9904 protocol58,61 and 427 pediatric B-ALL cases from the German GWAS.55,62 Controls included 475 from the German GWAS and European ancestry controls (N = 958) from the Genetic Association Information Network (supplemental Methods).54 Stratified analyses by sex were performed within the cohort and we report ORs and P values for the logistic additive model. These data are only pediatric and thus replication for adult GWAS associations could not be performed.

Heritability and polygenic risk scores

Two approaches were used to better understand heritability and the aggregate contribution of genetic variation to B-ALL risk. We estimated the proportion of phenotypic variance explained by common SNPs genome-wide in males and females separately using genome-wide complex trait analysis (GCTA).63-66 Second, polygenic risk scores (PRSs) were calculated by combining significant loci weighted by effect sizes estimated from the logistic regression GWAS using PRSice software67 (supplemental Methods).
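At its core, a PRS of the kind described above is a weighted sum of risk-allele counts, with each weight taken as the log-OR estimated for that SNP. The sketch below shows only that core calculation; the SNP identifiers, weights, and dosages are made up, and PRSice performs additional steps not reproduced here.

```python
import math


def polygenic_risk_score(dosages, log_or_weights):
    """Sum of (risk-allele dosage x log-OR weight) over SNPs present in both inputs."""
    return sum(
        weight * dosages[snp]
        for snp, weight in log_or_weights.items()
        if snp in dosages
    )


# Hypothetical per-SNP weights (log of made-up ORs) and one person's dosages (0, 1, or 2).
weights = {"snp_a": math.log(2.0), "snp_b": math.log(1.5), "snp_c": math.log(1.2)}
person = {"snp_a": 1, "snp_b": 2, "snp_c": 0}
score = polygenic_risk_score(person, weights)
```

SNPs missing from an individual's dosage table are skipped rather than imputed, which is a simplifying assumption of this sketch.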



DISCOVeRY-BMT consists predominantly (>95%) of individuals self-reported as European American (EA) with 3073 BMT recipients and 3144 BMT donors initially genotyped. Sample QC on self-reported EA recipients and donors was performed on each cohort separately (supplemental Methods; supplemental Figure 1), which yielded 2111 recipients and 2219 donors in cohort 1 and 777 recipients and 808 donors in cohort 2.

Analyses described herein include either the 364 individuals with B-ALL who received a BMT (cases) and 2219 donors (controls) in cohort 1 and 82 B-ALL cases and 808 controls in cohort 2, or a subset of these (supplemental Table 1). Ages of cases ranged from 1 to 68 years, whereas controls ranged from 18 to 61 years due to minimum and maximum ages for donation. Controls are predominantly male, which reflects clinical selection bias against parous females who may increase risk of graft-versus-host disease. However, the proportion of males in the control group mirrors that of cases. In both cohorts, cases were also skewed to a hyperdiploid-negative karyotype, which is a more aggressive subtype of B-ALL frequently treated with BMT.

Genome-wide associations

The final genotyping data in recipients consisted of 637 655 typed SNPs in cohort 1 and 632 823 typed SNPs in cohort 2 (supplemental Methods; supplemental Figure 1) from which ∼8.5 million imputed variants were available for genome-wide analyses. Quantile-quantile (Q-Q) plots of SNP association with B-ALL (supplemental Figure 2) show no evidence of genomic inflation due to cryptic population structure (λ = 1.001) and none of the principal components (PCs) were associated with risk of B-ALL, therefore, PCs were not included in the regression analyses. We report on a novel genome-wide association with normal cytogenetic B-ALL, as well as genome-wide significant associations in genes previously identified by GWASs, considering age and sex in exploratory analyses (supplemental Table 2; supplemental Figure 3).
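For readers unfamiliar with λ, the genomic inflation factor is conventionally the median of the observed 1-degree-of-freedom χ2 association statistics divided by the expected median under the null (approximately 0.4549). The sketch below computes it from a list of P values using only the Python standard library; the function name and the evenly spaced toy P values are assumptions for illustration, not the study's pipeline.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Lambda = median(observed 1-df chi-square statistics) / expected null median."""
    normal = NormalDist()
    # A two-sided P value p corresponds to the chi-square statistic z**2,
    # where z is the standard normal quantile at p / 2.
    chi2_stats = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Toy input: evenly spaced P values, mimicking a null (uninflated) distribution.
n = 1001
null_p_values = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(null_p_values)
```

A value close to 1, as reported above, indicates that the bulk of the test statistics follow the null distribution; values well above 1 would suggest inflation from population structure or other systematic bias.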

A novel association, rs189434316 (92 697 912 bp), with normal cytogenetic B-ALL was identified on chromosome 14 (Table 1; Figure 1) between SLC24A4 and CPSF2. The T allele (MAF = 0.07) increases odds of normal cytogenetic B-ALL by over 3.5-fold compared with controls (ORmeta = 3.7; 95% confidence interval [CI], 2.5-6.2; Pmeta = 6.0 × 10−9). This association reached genome-wide significance only with normal cytogenetic B-ALL and not overall or in other subtypes (all B-ALL cases, Pmeta = 2.6 × 10−5; hyperdiploid negative, Pmeta = 1.6 × 10−6; Ph, Pmeta = 3.4 × 10−6; abnormal cytogenetic B-ALL, Pmeta = .3). To further explore the biological relevance of this novel variant, we analyzed the association of the SNP with both death due to disease and progression-free survival, defined as the time to disease progression or death following transplant. The T allele is associated with death due to disease (hazard ratio [HR]meta = 2.28; 95% CImeta = 1.27, 4.13; Pmeta = .006) and worse progression-free survival (HRmeta = 1.45; 95% CImeta = 1.03, 2.03; Pmeta = .03) in normal cytogenetic B-ALL cases.

Table 1.

Genome-wide significant associations with B-ALL by age (<20, 20-40, >40 y) in EAs

Figure 1.

Regional plot of SNP associations with normal cytogenetic B-ALL. Regional plot showing association with normal cytogenetic B-ALL in a region of chromosome 14 (rs189434316). The x-axis is the position on the chromosome (Mb) and y-axis is the −log10 P value of the SNP association with normal cytogenetics B-ALL. Filled colors indicate linkage disequilibrium (LD), as measured by r2, with the most significant SNP (shown in purple); red shows a high degree of LD whereas blue indicates lower r2. Meta-analyses show that the T allele (MAF = 0.07) increases odds of normal cytogenetic B-ALL by over 3.5-fold compared with controls (Pmeta = 5.6 × 10−9). This association was seen only with normal cytogenetic B-ALL and not observed in other subtypes. CPSF2 is the nearest gene to the association signal.

We found evidence of genome-wide associations in GATA3 and IKZF1, identified in published AYA and pediatric ALL GWASs, respectively.20-23,29 With the exception of GATA3 and IKZF1, other previously published pediatric and AYA GWAS associations (CDKN2A/B, ARID5B, BMI1-PIP4K2A, CEBPE, LHPP, ELK3) did not replicate in DISCOVeRY-BMT (supplemental Table 2). The variant in GATA3, rs3824662, previously shown to be associated with AYA B-ALL,29 increased odds of B-ALL overall (Pmeta = 3.29 × 10−13), within hyperdiploid-negative (Pmeta = 2.95 × 10−13), Ph (Pmeta = 1.16 × 10−12), and normal cytogenetic B-ALL (Pmeta = 1.09 × 10−8) (Table 1; supplemental Table 2). Another GATA3 variant, rs569421, showed a 60% increased risk of B-ALL in cases compared with controls (Pmeta = 4.4 × 10−8). This variant has not been reported to be associated with B-ALL overall or by subtype; however, joint analyses with rs3824662 indicated rs569421 is not an independent risk variant.

SNP rs11980379 in IKZF1 is a known pediatric B-ALL risk variant and perfectly correlated (r2 = 1.0) with rs413260, which has also previously been associated with pediatric B-ALL.20,21,23-28,30-32 SNP rs11980379 in IKZF1 was significantly associated at the genome-wide level with Ph (Pmeta = 3.6 × 10−9) and normal cytogenetics B-ALL (age-adjusted) (Pmeta = 4.6 × 10−8) (supplemental Table 2). However, rs11980379 demonstrated smaller effect sizes and higher P values with increasing age (Table 1). In pediatric patients, the C allele of this SNP conferred 2.3-fold increased odds of B-ALL, compared with an ∼1.4-fold increased risk of B-ALL in young adults (Pmeta = .005) and a 1.2-fold increased risk of B-ALL in older adults (Pmeta = .13) (Table 1; Figures 2-3). Stronger associations with smaller P values were also observed within the pediatric age group across all cytogenetic subgroups (hyperdiploid negative, Ph, abnormal and normal) (data not shown) and thus this age-specific association is not attributable to differences in underlying subtype distributions. Sobel tests of mediation estimate that the proportion of the total effect (percentage of mediation) of the SNP that is mediated by age is 4% (P = .055), indicating some evidence for age as a pathway mediator.

Figure 2.

Manhattan plots for SNP associations with B-ALL by age. (A) Pediatric (<20 years), (B) young adults (20-40 years), and (C) older adults (>40 years). The x-axis indicates the chromosome and the y-axis is the −log10 P value from meta-analyses of SNP associations with B-ALL in cohorts 1 and 2. The dashed red line indicates the genome-wide significance threshold of P < 5.0 × 10−8. The SNPs highlighted in red on chromosome 7 (pediatric) are known variants in IKZF1.

Figure 3.

Regional plot of age-specific (<20, 20-40, >40 years) associations with B-ALL in IKZF1. Regional plot showing age-stratified association with B-ALL within IKZF1. The x-axis is the position on the chromosome (Mb), and y-axis is the −log10 P value. The circles (older adults), diamonds (young adults) and squares (pediatric) show the P value for the SNP association with risk of B-ALL, rs11980379. Filled colors indicate the LD with the most significant SNP shown in purple. Red indicates a high degree of LD whereas blue indicates lower r2. This plot shows a significant association in pediatrics, but no evidence of association with B-ALL in young adults and older adults.

In addition to age-specific effects, there are strong sex-specific associations for IKZF1; however, unlike age, sex does not mediate the effect of the variant on the risk of B-ALL (Sobel mediation, P = .5), but rather there is evidence of heterogeneity of effect between males and females. The C allele in rs11980379 showed an 80% increased risk of B-ALL in males (OR = 1.8; 95% CI, 1.24-2.50; Pmeta = 3.8 × 10−8), whereas there is no significant association in females (OR = 1.26; 95% CI, 0.92-1.70; Pmeta = .06) (Figure 4; Table 2; supplemental Figure 4), with evidence of significant heterogeneity between males and females (I2 = 80; Q = 0.02). When comparing crude and sex-adjusted OR, the crude OR for the IKZF1 SNP is 1.58, whereas the sex-adjusted OR is 1.56, yielding about a 1% difference; the crude and adjusted ORs would differ if sex were a confounder. The stratified ORs for males (1.9 [95% CI, 1.4, 2.3]) and females (1.2 [95% CI, 0.9, 1.6]) differ and in turn differ from the crude OR. Allele frequencies of controls by sex show no difference between males and females; thus, the association is not simply a consequence of a higher allele frequency in men in the general population. This evidence indicates that sex could be modifying the effect of this IKZF1 variant on ALL risk. This sex-specific genetic effect shows some evidence of an age association as well. The C allele is not a risk factor in females >15 years (OR = 1.1; 95% CI, 0.8-1.5; P = .6) but appears to be more strongly associated in females <15 years (OR = 1.9; 95% CI, 1.2-3.2; P = .007). In contrast, males show little change in OR for those >15 years at diagnosis (OR = 1.74; 95% CI, 1.4-2.2; P = 6.38 × 10−6) vs <15 years (OR = 2.34; 95% CI, 1.4-3.8; P = .0007). In analyses of cases >15 years of age, the evidence of heterogeneity based on the Cochran Q statistic remains between males and females (I2 = 60), whereas in cases <15 years of age (replication cohort) there is no significant evidence of effect heterogeneity between males and females (supplemental Table 3). This finding for >15 years is also supported by evidence of statistical interaction between the SNP and sex variable (P = .02) in models of ALL susceptibility.
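The crude vs sex-adjusted comparison reported above is a change-in-estimate check, which can be written out explicitly. The sketch uses the 2 ORs given in the text; the function name is hypothetical, and the 10% cutoff mentioned in the comment is a common rule of thumb rather than a criterion stated by the study.

```python
def percent_change_in_estimate(crude_or, adjusted_or):
    """Absolute change between crude and adjusted OR, as a percentage of the crude OR."""
    return abs(crude_or - adjusted_or) / crude_or * 100.0


# ORs for the IKZF1 SNP as reported in the text: crude 1.58, sex-adjusted 1.56.
change = percent_change_in_estimate(1.58, 1.56)

# A change well below a conventional 10% rule of thumb argues against confounding by sex.
suggests_confounding = change >= 10.0
```

Because adjustment barely moves the estimate while the sex-stratified ORs differ from each other, the pattern is more consistent with effect modification than with confounding, which is the argument made in the paragraph above.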

Figure 4.

Regional plot of sex-specific associations with B-ALL in IKZF1. Regional plot of sex-specific associations with B-ALL in the IKZF1 region. The x-axis is the position on the chromosome (Mb), and y-axis is the −log10 P value. The squares (male) and circles (female) show the P value for the SNP association with risk of B-ALL. Filled colors indicate the LD with the most significant SNP shown in purple. Red indicates a high degree of LD whereas blue indicates lower r2.

Table 2.

Genome-wide significant associations with the C allele in rs11980379 (IKZF1) and B-ALL by sex in EAs

Additional genome-wide significant variants previously identified in pediatric and AYA B-ALL GWASs, in genes ARID5B, BMI1-PIP4K2A, CEBPE, CDKN2A/B, ELK3, and LHPP, did not reach genome-wide significance in our B-ALL GWAS either overall, or by subtype, age, or sex. The most significant association in these genes was seen in CDKN2B, rs1333035, Pmeta = 2.09 × 10−5, in LD (r2 = 0.8) with rs3218005, previously identified in pediatric genome-wide analyses of ALL (supplemental Table 2).

Replication analysis of sex-specific findings

Replication analysis of sex-specific findings for those ages 1 to 15 years was performed in meta-analyses of 2 B-ALL pediatric GWASs. The rs11980379 association replicated in both the male and female B-ALL pediatric population conferring a 40% increased risk (ORmeta = 1.4; 95% CI, 1.1-1.9; P = 3.9 × 10−5) in males and a 60% increased risk in females (ORmeta = 1.6; 95% CI, 1.2, 2.3; P = 1.4 × 10−8) (supplemental Table 3). There was no evidence of heterogeneity in effect sizes between males and females <15 years, similar to our findings. The novel significant variant, rs189434316, could not be replicated in this pediatric population as there were no cytogenetically normal cases; rs189434316 was not associated with abnormal cytogenetic B-ALL in either our data or the replication (P = .3).


GCTA estimates of male and female heritability (h2) were 0.83 (standard error [SE] = 0.34; P = .001) and 0.52 (SE = 0.18; P = .0002), respectively (supplemental Table 4). The correlation between male and female heritability estimates was weak with a large standard error (ρ = 0.11; SE = 0.29) (supplemental Table 4).

The PRS distributions and medians differ between cases (overall and by age group) and controls, with cases having a significantly higher median risk score than controls (P < .001) (supplemental Figure 5). In cohort 1, the high-risk group (carriers of the most high-risk alleles from SNPs) had a threefold increased risk of having B-ALL (P = 2.4 × 10−14) and the medium-risk group conferred an almost twofold increased risk of disease (P = .0006), when compared with the low-risk group (reference) (supplemental Table 5).

When stratified by age, pediatric individuals with high PRS had an almost fourfold increased risk of B-ALL (P = 1.5 × 10−7); the medium PRS group had 40% increased odds of having B-ALL (P = .3), although nonsignificant, compared with those with low PRS. In the young adults and older adults, individuals with high-risk PRS had threefold increased odds of disease, P = 4.3 × 10−7 and P = 3.7 × 10−6, respectively. The medium-risk groups showed an ∼1.8 and twofold difference from the low-risk group in young adults (P = .02) and older adults (P = .01), respectively (supplemental Table 5). Males had a 3.5-fold increased risk of B-ALL for the high-risk PRS group compared with the low-risk group (P = 4.1 × 10−11). Females also showed an increased risk of B-ALL (OR = 2.5) in the high-risk group compared with the low-risk group (P = .0001). The high-risk median PRS was not significantly different between males and females (P = .34).


In these GWASs of B-ALL susceptibility across age and between sexes in a high-risk BMT population, we found evidence for novel associations within subtypes, as well as sex- and age-specific associations in previously identified variants. Besides these variants, we did not replicate genome-wide or suggestive associations with other prior known pediatric GWAS regions. Most likely this is because these loci are associated with favorable-risk (pediatric) ALL, whereas the cases that comprise DISCOVeRY-BMT have high-risk ALL.

The established GATA3 B-ALL risk variant showed association regardless of cytogenetic subgroup. GATA3 encodes for a transcription factor that is critical for lymphoid cell lineage commitment and early T-cell differentiation, and loss-of-function somatic mutations have been discovered in early T-cell precursor ALL.68,69 Alterations in GATA3 have been linked to other blood cancers, including Hodgkin lymphoma.70 Our GATA3 findings in conjunction with the evidence for association of the GATA3 risk variant with pediatric Ph-like ALL,22 a more adverse type of ALL, and AYA associations suggest GATA3 is linked to high-risk ALL.

The strength of the IKZF1 variant in our high-risk pediatric B-ALL group indicates that this variant most likely contributes to risk of both adverse ALL (hyperdiploid negative) and favorable ALL as it has previously been strongly associated in pediatric ALL cases (hyperdiploid) with favorable outcomes. IKZF1 is a transcription factor that is needed for development of hematopoietic stem cells to lymphoid precursors.71,72 It is frequently targeted by copy-number alterations in ALL blast cells (particularly in high-risk ALL); deletions and mutations result in loss of function or dominant-negative isoforms73 and are associated with a poor prognosis.72,73 Two SNPs (rs6964969 and rs11978267) strongly correlated with the IKZF1 variant (r2 > .95) are cis-expression quantitative trait loci in monocytes and whole blood.74-76 In addition, rs6964969 is predicted to affect NFKB1 transcription factor binding, which can lead to inappropriate immune cell development or delayed cell growth when there are problems with normal binding/expression.77

The age-specific findings reinforce the idea that genetic variation may contribute differently to risk of pediatric vs adult high-risk B-ALL. This is reasonable as other genetic features, for example, chromosome aberrations, also differ between pediatric and adult ALL.

Unlike age, sex is not a mediator variable in the relationship between the IKZF1 genetic variant and B-ALL risk, but rather is modifying the effect of the SNP on risk of B-ALL. Interestingly, it is possible that this sex effect is age-specific. Our analyses of genetic association at an approximate pre- and postpuberty age may indicate that there is some relationship between female sex hormones (activated during puberty) and this SNP, manifesting in similar risk attributable to the IKZF1 variant before puberty in males and females, but not for females following puberty. Analyses of the (prepuberty) pediatric patients, and thus those with favorable B-ALL subtypes, who comprise the replication data set, clearly demonstrate that pediatric ALL germ line susceptibility IKZF1 associations do not differ by sex. Our data demonstrate that as age increases (>15 years), sex-specific associations are observable. Although we analyzed <15 vs >15 years, further investigation into the role of age and sex in large sample sets with greater variance in B-ALL subtype is warranted.

The PRS models show that the high-risk group and the medium-risk group have a significantly different risk than the low-risk category, with nonoverlapping CIs. Given the significantly increased risk of B-ALL in those in the high-risk group, it is valuable to consider these variants together and how they are contributing to disease risk.

Our novel finding on chromosome 14, rs189434316, was associated with normal cytogenetic B-ALL but not abnormal cytogenetic B-ALL. This variant is significantly associated with increased hazard of death due to disease and worse progression-free survival, adding to the biological plausibility of this novel finding. It is not immediately clear how this SNP could be correlated with outcome, as functional annotation does not demonstrate evidence for impacting gene expression or transcription factor binding. Replication in another high-risk population is an important next step.78

Although this is the first B-ALL susceptibility study of a high-risk population across the age spectrum, our study has some limitations. We had 80% power to detect ORs in line with previous reports, 2 to 1.5, for MAF ranging from 40% to 10%, respectively; this was reduced for the age and sex subgroup analyses. Also, specific translocations (eg, BCR-ABL, ETV-RUNX, and MLL) were not considered and analyses were limited to EAs; thus, these variants may not be valid for other continental ancestry groups. We used a standard genome-wide association significance level of P < 5.0 × 10−8 for each of our GWASs; a more stringent threshold might need to be considered given that we are testing 3 age groups and sex as well.

Our age-specific GWAS identified inherited variants that strongly influence B-ALL susceptibility in adults and validate AYA findings, shedding new light on age-related differences in ALL biology. To date, GWASs of ALL either adjusted for sex or did not report sex-specific results, hence our study provides the first evidence that sex is an effect modifier and different genetic variants are contributing to ALL in males and females.

Understanding genetic contribution can aid our understanding of B-ALL etiology. Also, identification of people at high risk for B-ALL enables the integration of genetic and clinical risk factors to improve patient stratification.79 This study is 1 step toward a more personalized understanding of inherited susceptibility to this heterogeneous and devastating disease.


K.O., E.H., and A.S. performed the replication analyses while K.O. and E.H. were employed at The University of Chicago.

This work was supported by the National Institutes of Health: National Cancer Institute grant R03 CA188733 (L.E.S.-C. and T.H.), National Heart, Lung, and Blood Institute grant R01 HL102278 (which funded DISCOVeRY-BMT) (L.E.S.-C. and T.H.), and National Cancer Institute grant P30 CA016056 (which partially supported the Roswell Park Cancer Institute Biostatistics & Bioinformatics Core).

Replication data sets were supported by grants from the National Institutes of Health (Eunice Kennedy Shriver National Institute of Child Health and Human Development HD0433871; National Cancer Institute CA129045 and CA40046 [K.O.]; National Institute of Mental Health R01 MH101820; National Cancer Institute U01CA176063 and National Institute of General Medical Sciences U01GM92666); the St. Baldrick’s Foundation (K.O.); the American Cancer Society, Illinois Division (K.O.); and the Cancer Research Foundation (K.O.).

The Children’s Oncology Group GWAS was also supported by grants from the National Institutes of Health, National Cancer Institute (the Chair’s grant U10 CA98543 and Human Specimen Banking grant U24 CA114766).

The German GWAS was supported by the German Ministry of Education and Research (Bundesministeriums für Bildung und Forschung) through the National Genome Research Network, the popgen biobank, the Federal Radiation Protection Agency (project no. 3609S30013), the Deutsche Krebshilfe, and the Madeleine Schickedanz Kinderkrebs-Stiftung; received infrastructure support through the Deutsche Forschungsgemeinschaft excellence cluster “Inflammation at Interfaces”; and was conducted within the frame of the International BFM Study Group.

The Center for International Blood and Marrow Transplant Research was supported by US Public Health Service grant/cooperative agreement 5U24-CA076518 from the National Cancer Institute, the National Heart, Lung, and Blood Institute, and the National Institute of Allergy and Infectious Diseases; grant/cooperative agreement 5U10HL069294 from the National Heart, Lung, and Blood Institute and the National Cancer Institute; contract HHSH250201200016C with the Health Resources and Services Administration/US Department of Health and Human Services; grants N00014-15-1-0848 and N00014-16-1-2020 from the Office of Naval Research; and grants from Alexion; Amgen, Inc*; anonymous donation to the Medical College of Wisconsin; Astellas Pharma US; AstraZeneca; Be the Match Foundation; Bluebird Bio, Inc*; Bristol-Myers Squibb Oncology*; Celgene Corporation*; Cellular Dynamics International, Inc; Chimerix, Inc*; Fred Hutchinson Cancer Research Center; Gamida Cell Ltd; Genentech, Inc; Genzyme Corporation; Gilead Sciences, Inc*; Health Research, Inc; Roswell Park Cancer Institute; HistoGenetics, Inc; Incyte Corporation; Janssen Scientific Affairs, LLC; Jazz Pharmaceuticals, Inc*; Jeff Gordon Children’s Foundation; The Leukemia & Lymphoma Society; Medac, GmbH; MedImmune; The Medical College of Wisconsin; Merck & Co, Inc*; Mesoblast; MesoScale Diagnostics, Inc; Miltenyi Biotec, Inc*; National Marrow Donor Program; Neovii Biotech NA, Inc; Novartis Pharmaceuticals Corporation; Onyx Pharmaceuticals; Optum Healthcare Solutions, Inc; Otsuka America Pharmaceutical, Inc; Otsuka Pharmaceutical Co, Ltd Japan; Patient-Centered Outcomes Research Institute; Perkin Elmer, Inc; Pfizer, Inc; Sanofi US*; Seattle Genetics*; Spectrum Pharmaceuticals, Inc*; St. Baldrick’s Foundation; Sunesis Pharmaceuticals, Inc*; Swedish Orphan Biovitrum, Inc; Takeda Oncology; Telomere Diagnostics, Inc; University of Minnesota; and Wellpoint, Inc* (*corporate members).

The views expressed in this article do not reflect the official policy or position of the National Institutes of Health, the Department of the Navy, the Department of Defense, the Health Resources and Services Administration, or any other agency of the US government.


Contribution: A.I.C.-G. performed research, analyzed/interpreted data, and wrote the paper; T.H. conceived and designed the study, acquired data, and interpreted data analyses; L.M.P., K.O., A.S., E.H., C.A.H., D.O.S., L.P., X.S., A.W.B., and S.N.J.S. analyzed and interpreted data; E.E., D.E., M.S., J.C., and L.O. contributed data and data analyses and interpreted data; S.S., M.C.P., and D.J.W. acquired data and interpreted data analyses; P.L.M. interpreted data analyses; L.E.S.-C. conceived and designed the study, acquired, analyzed, and interpreted data, and wrote the paper; and all authors participated in the revising of the manuscript, contributed critically important intellectual content, and gave final approval of the version/revised version submitted.

Conflict-of-interest disclosure: T.H. owns stock in Novartis Pharmaceuticals Corporation. D.J.W. provided consulting services to, and served on advisory boards for, Kadmon and Alexion, and received research funding from Alexion. S.S. received compensation for travel, accommodations, and expenses from Astellas Pharma. M.C.P. received honoraria, as well as compensation for travel, accommodations, and expenses, from Baxalta and Atara Biotherapeutics. P.L.M. received honoraria from Celgene, Bristol-Myers Squibb, Janssen Pharmaceutical, Sanofi, and Karyopharm Therapeutics Inc; research funding from Celgene; and compensation for travel, accommodations, and expenses from Celgene and Sanofi. The remaining authors declare no competing financial interests.

Correspondence: Lara E. Sucheston-Campbell, College of Pharmacy, College of Veterinary Medicine, The Ohio State University, 496 W. 12th Ave, 604 Riffe, Columbus, OH 43210; e-mail: sucheston-campbell.1{at}


  • The full-text version of this article contains a data supplement.

  • Submitted February 20, 2017.
  • Accepted July 7, 2017.

