Induced pluripotent stem cell–based mapping of β-globin expression throughout human erythropoietic development

Kim Vanuytsel, Taylor Matte, Amy Leung, Zaw Htut Naing, Tasha Morrison, David H. K. Chui, Martin H. Steinberg and George J. Murphy


  • Figure 1.

    Characterization and validation of the β-globin reporter human iPSC (hiPSC) line. (A) Targeting schematic depicting the introduction of a promoterless enhanced GFP (eGFP) cassette after the β-globin promoter. (B) Differentiation schematic and corresponding changes in β-globin transcript levels (quantitative reverse transcription PCR) during erythroid specification ± standard deviation (SD; 2 ≤ n ≤ 9). β-actin was used as a reference gene. CD34+ cells isolated from peripheral blood and specified toward the erythroid lineage (PB) were included as a primary control. (C-E) Characterization of wild-type (WT) or reporter (β_GFP) hiPSC-derived erythroid cells after 29 days of differentiation: representative FACS plots of GFP readout (C), bright field microscopy showing BFU-E colony morphology (×10) (D), and quantification of percentage of GFP+ cells ± SD (n = 9) (E). (F) HBB expression in individual GFP– (gray) and GFP+ (green) cells based on scRNAseq counts. (G) Violin plot demonstrating a significant enrichment for HBB transcripts in the GFP+ fraction (green) compared with the GFP– fraction (gray) based on normalized counts. ***FDR ≤ 0.0005, ****P ≤ .001. HA, homology arm; LCR, locus control region; PE, phycoerythrin; PGK, phosphoglycerate kinase promoter; PURO, puromycin resistance gene.

  • Figure 2.

    Characterization of erythroid differentiation. Time course including bright field microscopy images (×40) of Wright-Giemsa–stained cytospins and cell pellets (top) and FACS plots showing cell surface marker expression (bottom) at successive stages during erythroid differentiation. The time points shown correspond to the time points illustrated in Figure 1B. Day-15 cells represent HSPCs; suspension cultures were sampled at days 20, 22, and 25 to obtain progenitors at intermediate stages of erythroid specification; and day-29 cells represent BFU-E cells picked from Methocult cultures after 14 days of erythroid specification. Black arrows indicate examples of proerythroblasts (ProE), basophilic erythroblasts (Baso), polychromatic erythroblasts (Poly), and orthochromatic erythroblasts. A FACS plot showing unstained cells is included below as a reference. Uniform gating based on day-29 cells was maintained throughout the figure.

  • Figure 3.

    Heatmap showing a subset of genes differentially expressed between day-29 GFP– and GFP+ sorted fractions. Heatmap illustrating downregulation of ribosomal transcripts, upregulation of a subset of genes involved in the ubiquitin-proteasome system (UPS) pathway, and increased expression of genes linked to erythroid maturation in GFP+ vs GFP– cells. In addition to genes coding for ribosomal proteins, this heatmap contains a selection of genes identified in supplemental Figure 4 as linked to erythroid maturation based on comparison of transcriptomic data sets representing terminal erythroid maturation found in the literature.40,44 The UPS-related genes shown represent a subset of these erythroid maturation genes identified in supplemental Figure 4.

  • Figure 4.

    Pseudotime analysis of scRNAseq data. (A) Supervised monocle plot displaying the trajectory of day-29 hiPSC-derived erythroid cells along pseudotime, indicating a gradation from GFP–HBBlow to GFP+ cells. The 3 input fractions specified when performing this analysis are shown in the violin plot on the right. (B) Principal component (PC) analysis plot constructed using the top 500 differentially expressed genes between the 3 conditions, illustrating the separation of day-29 hiPSC-derived erythroid cells (top). Density graph showing the distribution of the 3 subpopulations with respect to PC1 (bottom). (C) Heatmap showing gene expression changes over pseudotime. Vertical columns represent units of pseudotime. (D) Scatter plots illustrating the expression pattern of selected genes over pseudotime. Normalized expression is shown as transcripts per million (TPM).

  • Figure 5.

    Globin expression in day-29 hiPSC-derived erythroid cells. (A) Coexpression of globin genes (HBE, HBG1, HBG2, HBB) in individual cells visualized as the percentage of transcripts from the individual globin type vs total β-globin gene cluster transcripts (HBE + HBG1 + HBG2 + HBB). Average percentages per globin type are shown on the right for the GFP– and GFP+ fractions. A 2-tailed t test was performed to determine significant differences between GFP– and GFP+ fractions. (B) Violin plots showing normalized expression of globin genes in the GFP– sorted (gray) vs GFP+ sorted (green) fraction of day-29 hiPSC-derived erythroid cells. *FDR ≤ 0.05, **FDR ≤ 0.005, ***FDR ≤ 0.0005, ****P ≤ .001.
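
The per-cell percentages described for panel A can be illustrated with a minimal sketch. It assumes each cell is available as a dictionary of transcript counts keyed by the gene labels used in the caption. The function names (`globin_percentages`, `average_by_fraction`) and the example counts are hypothetical and are not taken from the study.

```python
from statistics import mean

# Gene labels as written in the Figure 5A caption.
GLOBINS = ("HBE", "HBG1", "HBG2", "HBB")


def globin_percentages(counts):
    """Return each globin's share (%) of the summed HBE + HBG1 + HBG2 + HBB
    transcripts in one cell, or None if the cell has no such transcripts."""
    total = sum(counts.get(gene, 0) for gene in GLOBINS)
    if total == 0:
        return None
    return {gene: 100.0 * counts.get(gene, 0) / total for gene in GLOBINS}


def average_by_fraction(cells):
    """cells: iterable of (fraction_label, counts) pairs.
    Returns {fraction_label: {gene: mean percentage across cells}}."""
    per_fraction = {}
    for label, counts in cells:
        pct = globin_percentages(counts)
        if pct is not None:  # skip cells without any globin transcripts
            per_fraction.setdefault(label, []).append(pct)
    return {
        label: {gene: mean(p[gene] for p in pcts) for gene in GLOBINS}
        for label, pcts in per_fraction.items()
    }


# Made-up counts for illustration only (not data from the study).
cells = [
    ("GFP-", {"HBE": 10, "HBG1": 40, "HBG2": 45, "HBB": 5}),
    ("GFP-", {"HBE": 20, "HBG1": 30, "HBG2": 40, "HBB": 10}),
    ("GFP+", {"HBE": 5, "HBG1": 30, "HBG2": 35, "HBB": 30}),
]
averages = average_by_fraction(cells)
```

The per-cell percentages for one globin in each fraction could then be compared with a 2-tailed t test, for example with `scipy.stats.ttest_ind`. The caption does not state which t-test variant was used, so that choice would be an assumption.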


  • Table 1.

    Enrichr analysis of scRNAseq data

    | Gene fraction | Enriched terms* | Adjusted P | Source |
    | Top 200 expressed genes (all cells) | Hemoglobin’s chaperone_Homo sapiens_h_ahspPathway | 6.46E−10 | BioCarta 2016 |
    | | Heme biosynthesis_Homo sapiens_PWY-5920 | 1.82E−03 | HumanCyc 2016 |
    | | Heme biosynthesis_Homo sapiens_P02746 | 1.39E−01 | Panther 2016 |
    | | Hypoxic and oxygen homeostasis regulation of HIF-1-alpha_Homo sapiens_4c0f3584-6193-11e5-8ac5-06603eb7f303 | 5.54E−02 | NCI-Nature 2016 |
    | | Ribosome_Homo sapiens_hsa03010 | 7.22E−41 | KEGG 2016 |
    | | HIF-1 signaling pathway_Homo sapiens_hsa04066 | 2.75E−02 | KEGG 2016 |
    | Upregulated in GFP+ | Hemoglobin’s chaperone_Homo sapiens_h_ahspPathway | 8.26E−07 | BioCarta 2016 |
    | | Heme biosynthesis_Homo sapiens_PWY-5920 | 4.54E−02 | HumanCyc 2016 |
    | | Heme biosynthesis_Homo sapiens_P02746 | 7.41E−02 | Panther 2016 |
    | | Ubiquitin protein ligase binding (GO:0031625) | 4.20E−02 | GO molecular function 2017 |
    | | Oxygen transport (GO:0015671) | 1.29E−04 | GO biological process 2015 |
    | | Oxidative stress_Homo sapiens_WP408 | 1.12E−02 | WikiPathways 2016 |
    | | Cellular responses to stress_Homo sapiens_R-HSA-2262752 | 2.82E−04 | Reactome 2016 |
    | | HIF-1-α transcription factor network_Homo sapiens_20ef2b81-6193-11e5-8ac5-06603eb7f303 | 5.57E−04 | NCI-Nature 2016 |
    | Downregulated in GFP+ | Ribosome_Homo sapiens_hsa03010 | 4.75E−29 | KEGG 2016 |
    | | tRNA binding (GO:0000049) | 6.01E−16 | GO molecular function 2017b |
    | | Peptide chain elongation_Homo sapiens_R-HSA-156902 | 1.52E−31 | Reactome 2016 |
    | | Eukaryotic Translation Elongation_Homo sapiens_R-HSA-156842 | 1.71E−31 | Reactome 2016 |
    | | Nonsense Mediated Decay (NMD) independent of the Exon Junction Complex (EJC)_Homo sapiens_R-HSA-975956 | 1.71E−31 | Reactome 2016 |
    | | 3′-UTR-mediated translational regulation_Homo sapiens_R-HSA-157279 | 5.82E−30 | Reactome 2016 |
    | B. Cell type | | | |
    | Top 200 expressed genes (all cells) | CD71+_EarlyErythroid | 1.90E−23 | Human Gene Atlas |
    | | Bone marrow (bulk tissue) | 2.98E−25 | ARCHS4 Tissues |
    | | Cord blood | 1.60E−23 | ARCHS4 Tissues |
    | | Erythroblast | 1.07E−03 | ARCHS4 Tissues |
    | Upregulated in GFP+ | CD71+_EarlyErythroid | 4.25E−49 | Human Gene Atlas |
    | | Erythroid_cell | 6.60E−30 | Jensen Tissues |
    | | Blood | 7.61E−13 | Jensen Tissues |
    | | Erythroblast | 1.96E−12 | ARCHS4 Tissues |
    | C. Transcription factor targets | | | |
    | Upregulated in GFP+ | KLF1 | 5.196E−32 | Enrichr Submissions TF-Gene Coocurrence |
    | | KLF1_20508144_ChIP-Seq_FETAL-LIVER-ERYTHROID_Mouse | 1.490E−06 | ChEA 2016 |
    | | GATA1_CHEA | 4.279E−15 | ENCODE and ChEA |
    | | GATA1_erythroblast_hg19 | 7.311E−9 | Consensus TFs from ChIP-X |
    | | GATA1_19941826_ChIP-Seq_K562_Human | 4.508E−8 | ChEA 2016 |
    | | TAL1_20566737_ChIP-Seq_PRIMARY_FETAL_LIVER_ERYTHROID_Mouse | 5.805E−12 | ChEA 2016 |
    | | TAL1_erythroblast_mm9 | 4.769E−10 | ENCODE TF ChIP-seq 2015 |
    • * Enriched in different fractions of day-29 β_GFP reporter hiPSC-derived erythroid cells according to Enrichr analysis of scRNAseq data.

    • Gene set libraries were accessed through the Enrichr platform (