Standard measures for sickle cell disease research: the PhenX Toolkit sickle cell disease collections

James R. Eckman, Kathryn L. Hassell, Wayne Huggins, Ellen M. Werner, Elizabeth S. Klings, Robert J. Adams, Julie A. Panepinto and Carol M. Hamilton

Key Points

  • The PhenX Toolkit recommends standard SCD measures for use in clinical, epidemiologic, and genomic studies.

  • Widespread use of PhenX measures will accelerate translational research to elucidate the etiology, epidemiology, and progression of SCD.


Standard measures and common data elements for sickle cell disease (SCD) will improve the data quality and comparability necessary for cross-study analyses and the development of guidelines that support effective treatments and interventions. In 2014, the National Institutes of Health, National Heart, Lung, and Blood Institute (NHLBI) funded an Administrative Supplement to the PhenX Toolkit (consensus measures for Phenotypes and eXposures) to identify common measures to promote data comparability across SCD research. An 11-member Sickle Cell Disease Research and Scientific Panel provided guidance to the project, establishing a core collection of SCD-related measures and defining the scope of 2 specialty collections: (1) cardiovascular, pulmonary, and renal complications, and (2) neurology, quality-of-life, and health services. For each specialty collection, a working group of SCD experts selected high-priority measures using a consensus process that included scientific community input. The SCD measures were released into the Toolkit in August 2015. The 25 measures included in the core collection are recommended for use by all NHLBI-funded investigators performing human-subject SCD research. The 10 neurology, quality-of-life, and health services measures and 14 cardiovascular, pulmonary, and renal measures are recommended for use within these specialized research areas. For SCD and other researchers, PhenX measures will promote collaborations with clinicians and patients, facilitate cross-study analysis, accelerate translational research, and lead to greater understanding of SCD phenotypes and epigenetics. For clinicians, using PhenX measures will help elucidate the etiology, progression, and treatment of SCD, leading to improved patient care and quality of life.


Sickle cell disease (SCD) exerts a major impact on every organ system in the body starting early in life and causes complex clinical complications with physical, psychological, social, and economic consequences for the affected individual. Accordingly, investigators with diverse perspectives contribute to SCD research. To assess the complexity of SCD phenotypes, researchers use a variety of measures to collect information across SCD research studies. However, the scope of these studies and the heterogeneity of the measures used make combining or comparing studies difficult. Additionally, the orphan disease status of this disorder represents a further barrier to SCD research. There are approximately 100 000 affected individuals in the United States; thus, the potential to conduct studies with large cohorts of subjects is limited. Opportunities to collaborate via international studies, especially in low-resource populations in which the number of affected individuals is substantially greater, present additional measurement challenges. Collaboration across diverse disciplines in areas of common interest would increase the impact of individual studies and lead to improved health outcomes and quality of life for individuals with SCD.

Analyses of data collected across multidisciplinary SCD studies can be combined to test new hypotheses and accelerate scientific progress. Standard measures and associated common data elements (CDEs) are needed to improve data quality and consistency at the time of data collection. The use of standard measures in SCD research will improve data comparability and make cross-study analyses of data more efficient and informative. Furthermore, the use of such standard measures would facilitate cross-study comparisons of not only SCD studies but also other study populations in which the same standard measures are used.
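To illustrate why shared measures simplify cross-study analysis, the following minimal Python sketch pools records from 2 studies using only the variables that both collected under the same name. The study records and variable names (`current_age`, `hemoglobin_g_dl`, and so on) are invented for illustration; they are not PhenX identifiers, and the pooling logic is an assumption rather than a PhenX procedure.

```python
from typing import Dict, List

Record = Dict[str, object]


def shared_variables(study_a: List[Record], study_b: List[Record]) -> List[str]:
    """Return the sorted variable names present in every record of both studies."""
    keys = None
    for record in study_a + study_b:
        keys = set(record) if keys is None else keys & set(record)
    return sorted(keys or [])


def pool(study_a: List[Record], study_b: List[Record]) -> List[Record]:
    """Combine both studies, keeping only the variables they have in common."""
    common = shared_variables(study_a, study_b)
    return [{name: record[name] for name in common} for record in study_a + study_b]


# Hypothetical records: only the identically named variables survive pooling.
study_a = [{"current_age": 24, "hemoglobin_g_dl": 8.1, "site": "A"}]
study_b = [{"current_age": 31, "hemoglobin_g_dl": 9.4, "pain_score": 3}]
pooled = pool(study_a, study_b)
```

In this sketch, variables collected by only one study (`site`, `pain_score`) are dropped from the pooled data set, which is the loss that common measures are meant to avoid.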

The National Institutes of Health (NIH), National Heart, Lung, and Blood Institute (NHLBI) has a history of interest in standard measures and CDEs. In 1977, the natural history of SCD was studied by the Cooperative Study of Sickle Cell Disease (CSSCD), which used a detailed protocol to collect clinical, laboratory, organ damage, and complication data from 3000 subjects who were followed at 15 centers.1 The Comprehensive Sickle Cell Centers initiated a collaborative effort in 2005 to establish consensus definitions of the phenotypic manifestations of SCD.2 In 2010, NHLBI convened the Hemoglobinopathies Uniform Medical Language Ontology Working Groups (WGs) to address the emerging areas of data science that would inform the development of CDEs and standard measures as research resources. In 2006–2010, the Adult Sickle Cell Quality-of-Life Measurement Information System (ASCQ-Me) was developed to enable adults with SCD to self-report their physical, mental, and social health and indicators of disease severity.3 The domains and measures identified in these initial projects provided the foundation for the priorities in the PhenX (consensus measures for Phenotypes and eXposures) Measures for Sickle Cell Disease Research project.

In May 2014, NHLBI funded the project “PhenX Measures for Sickle Cell Disease Research” to provide investigators and clinicians with standard measures and CDEs for SCD research. This project was guided by the 11-member Sickle Cell Disease Research and Scientific Panel (SRSP), which identified a core collection of measures for use by all SCD researchers and prioritized 2 SCD WGs: the cardiovascular, pulmonary, and renal WG (WG 1) and the neurology, quality-of-life, and health services WG (WG 2). The measures selected by the SRSP and the 2 WGs are provided to the scientific community at no cost via the PhenX Toolkit.4 The goal of this report is to present the process used by the SRSP and WGs to select the SCD measures, briefly describe the measures, and provide examples of how these measures can be used. Measures that the WGs felt would be highly relevant to the SCD research community but did not quite meet the selection criteria for inclusion in the Toolkit were included in the PhenX Toolkit Supplemental Information (SI).

Materials and methods

To identify standard measures for SCD research, NHLBI took advantage of the well-established consensus process and infrastructure developed for the PhenX Toolkit.5 Driven by the scientific community, PhenX is guided by an overarching steering committee (SC), and measures are chosen by WGs of domain experts using a consensus process. For each measure, the Toolkit provides all of the information an investigator needs to implement the measure, including a description, the rationale for its inclusion, detailed protocol(s) for collecting the data, and other supporting documentation. Measures selected for inclusion in the PhenX Toolkit must meet established criteria. These criteria require that the measures be high quality, well established, reproducible, broadly applicable, and relatively low burden for both participants and investigators.6

Most of the selection criteria for PhenX measures are self-explanatory (eg, well established, broadly applicable, and reproducible).5,6 However, the SC recognized that the concept of “low burden” necessitates clearly defined parameters. If a measure requires major equipment, specialized training to conduct or collect, or more than 15 min for an unaffected person to complete, then the measure is considered “high burden”; only a limited number of high-burden measures can be added to the Toolkit. In the Toolkit, users are alerted when they select high-burden measures and are asked to “review requirements” before deciding to include the measure in their Toolkit. The WGs were also responsible for identifying “essential measures” (ie, any additional measures that need to be collected to interpret the data). For example, current age, gender, history of transfusion, and medication inventory are essential measures for the interpretation of complete blood count.
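The high-burden rule described above can be expressed as a simple check. The following Python sketch is illustrative only: the class and field names are hypothetical and the example measures are invented, but the 3 conditions and the 15-min threshold are taken from the text.

```python
from dataclasses import dataclass

# Threshold stated in the text: more than 15 min for an unaffected person.
HIGH_BURDEN_MINUTES = 15


@dataclass
class MeasureRequirements:
    """Hypothetical description of what a measure demands (names are assumptions)."""
    needs_major_equipment: bool
    needs_specialized_training: bool
    minutes_for_unaffected_person: float


def is_high_burden(req: MeasureRequirements) -> bool:
    """A measure is high burden if any one of the 3 conditions holds."""
    return (
        req.needs_major_equipment
        or req.needs_specialized_training
        or req.minutes_for_unaffected_person > HIGH_BURDEN_MINUTES
    )


# Invented examples: a short questionnaire and an imaging-based measure.
short_questionnaire = MeasureRequirements(False, False, 5)
imaging_measure = MeasureRequirements(True, True, 45)
```

Under these assumptions, `is_high_burden(short_questionnaire)` is `False` and `is_high_burden(imaging_measure)` is `True`; a measure taking exactly 15 min with no equipment or training needs is not high burden, because the text specifies "more than 15 min."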

For the PhenX SCD project, the SRSP provided overall direction and guidance, defined the scope to be addressed by the 2 SCD WGs, and helped identify individuals to serve as WG members. The SRSP cochairs also acted as SRSP liaisons to each SCD WG. The SRSP identified a core collection of measures recommended for use by all SCD researchers and was responsible for reviewing and approving the measures selected by the SCD WGs for release in the PhenX Toolkit. Other key stakeholders were engaged to participate in WG or SRSP meetings or both, including representatives from other NIH institutes and agencies, such as the Health Resources and Services Administration and the Centers for Disease Control and Prevention.

Two WGs were prioritized by the SRSP and assembled to include a diverse group of scientists with enthusiasm for the project and the expertise necessary to address the scope. The 2 WGs were the cardiovascular, pulmonary, and renal WG (WG 1) and the neurology, quality-of-life, and health services WG (WG 2). The WGs reviewed well-established scientific measures, identified measures and protocols currently in use in the field, examined SCD-related measures in the PhenX Toolkit to ensure that new measures would complement existing Toolkit content, assessed the diversity of methods for application to a range of study designs, and recommended individual protocols for collecting PhenX measures. As part of the measure-selection process, the WGs requested input from researchers, clinicians, and other stakeholders via a community outreach effort. The WGs were responsible for identifying appropriate groups to contact to obtain feedback and considered the results of the community outreach during their final deliberations. The measures proposed for inclusion in the Toolkit were subject to the approval of the SRSP and PhenX SC.


As a result of this effort, 3 collections of measures for SCD research were established and released in the PhenX Toolkit: a core collection intended for use by all SCD investigators and 2 specialty collections, 1 from each of the WGs. In some cases, a new protocol or annotation was added to an existing PhenX measure to address specific needs relevant to SCD research.

The SCD core collection of measures was defined by the SRSP and constitutes measures recommended for use by all investigators performing data collection in SCD. The SCD core measures are designed to create a framework of standard measures and CDEs that will facilitate cross-study data analysis and address questions broadly relevant to SCD-related outcomes. Core Tier 1 measures are recommended for inclusion by all investigators engaged in defining important baseline SCD characteristics in research involving human subjects. The core Tier 2 measures provide important baseline characteristics that may inform more specific areas of research. The 11 core Tier 1 measures recommended for all areas of SCD research are shown in Table 1.

Table 1.

Sickle cell disease research and scientific panel core tier 1 measures

Eight of the measures selected for inclusion in the core Tier 1 collection were already in the PhenX Toolkit. The 3 new measures that were added to the PhenX Toolkit were frequency of sickle cell pain episodes per year, history of transfusion, and hemoglobin characterization. The protocol used to collect the frequency of sickle cell pain episodes per year measure is the 3-question, self-report of pain episodes using a 6-month recall window, as in the CSSCD. Similarly, the history of transfusion measure involves 3 simple questions answered by patient/proxy recall; these questions ask for the number of “pints” of blood ever received and whether the patient is on chronic transfusion or iron chelation therapy. Hemoglobin characterization is designed to capture the results from diagnostic testing to determine the type of SCD, specifically, analyses of the types of hemoglobin by electrophoresis, high-performance liquid chromatography, or DNA. The SRSP included SCD case definitions from the Newborn Screening Technical Assistance and Evaluation Program (NewSTEPs) in the SI. The NewSTEPs case definitions are relevant guidelines for determining the phenotypes or genotypes of subjects included in SCD research but do not include detailed instructions for collecting the data.
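As a rough sketch of how the 3 history of transfusion responses might be stored as common data elements, the Python record below holds one field per question. The field names and the validation rule are assumptions made for illustration; they are not the variable names or wording used in the PhenX protocol.

```python
from dataclasses import dataclass


@dataclass
class TransfusionHistory:
    """Hypothetical record for the 3 patient/proxy recall questions."""
    pints_ever_received: int       # number of "pints" of blood ever received
    on_chronic_transfusion: bool   # currently on chronic transfusion therapy
    on_iron_chelation: bool        # currently on iron chelation therapy

    def __post_init__(self) -> None:
        # Assumed sanity check: a count of pints cannot be negative.
        if self.pints_ever_received < 0:
            raise ValueError("pints_ever_received must be zero or greater")


# Invented example response.
example = TransfusionHistory(
    pints_ever_received=12,
    on_chronic_transfusion=False,
    on_iron_chelation=True,
)
```

Storing each answer in a typed, consistently named field is what allows responses from different studies to be compared directly.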

The 14 core Tier 2 measures shown in Table 2 are relevant to many areas of SCD research but are more specialized and may require a greater time commitment or more resources to collect. As part of core Tier 2, 2 new measures were added to the Toolkit: marital status of primary caregiver and pediatric school performance.

Table 2.

Sickle cell disease research and scientific panel core tier 2 measures

The measures in the WG collections are complementary to core Tier 1 and core Tier 2 and are recommended for specialized research domains.

The cardiovascular, pulmonary, and renal complications specialty collection includes measures selected by WG 1 on the basis of the prioritization of appropriate available measures for the pathophysiological mechanisms and organ systems most affected by SCD. These 14 measures (including 11 new ones) address heart and lung function and disease and biomarkers for hemolysis, anemia, iron overload, and renal function, as is shown in Table 3. Brachial artery reactivity, cell free hemoglobin, liver iron, cardiac short-axis function, and sleep disordered breathing measures were included in the SI.

Table 3.

Cardiovascular, pulmonary, and renal complications specialty collection measures

The neurology, quality-of-life, and health services specialty collection includes measures identified and prioritized by WG 2 that address important aspects of SCD, including developmental delays, risk factors and outcomes for stroke, quality of life, and quality of care (Table 4). These 10 measures (including 9 new ones) include protocols for both pediatric and adult populations as needed, because many of the complications in this domain begin in early childhood and span the lifetimes of subjects with SCD.

Table 4.

Neurology, quality-of-life, and health services specialty collection measures

Three additional measures important for SCD-related neurological research (brain arterial blood supply, brain morphology by computerized tomography, and brain morphology by magnetic resonance imaging [MRI]) were included in the SI. The American College of Radiology–American Society of Neuroradiology–Society of Neurointerventional Surgery–Society for Pediatric Radiology have published standard protocols appropriate for data collection across all ages.7 The intent of including these measures in the SI is to promote the standardization and validation of these measures for SCD research.


The SCD collection of measures in the PhenX Toolkit provides a basic set of CDEs with standardized protocols for data collection in important areas of SCD, including a core set of essential data that should be included in all studies. Wide incorporation of PhenX measures into research going forward will markedly enhance the ability to perform cross-study analysis and compare findings in SCD with those in other populations also studied using the Toolkit.

The National Human Genome Research Institute (NHGRI) recognized the need for standard measures to allow investigators to more effectively compare or combine their studies. Standard measures and associated CDEs improve data quality and ensure data comparability, thereby facilitating study validation and meta-analyses. The goal of the PhenX project (phase 1), which was funded by NHGRI and the NIH Office of Behavioral and Social Sciences Research, was to provide the research community with standard measures for use in large-scale genomic studies.1-3,7 The approach was to select up to 15 measures for each of 21 research domains and make them freely available to the research community via the Web-based PhenX Toolkit. A 12-member SC prioritized 21 domains to be addressed by WGs; these domains included Demographics, Anthropometrics, Environmental Exposures, Speech and Hearing, Psychiatric, and Psychosocial.8 The PhenX SC established criteria for the WGs to be mindful of as they selected measures to be included in the Toolkit. The selection criteria require that the measures be high quality, well established, reproducible, broadly applicable, and relatively low burden for both participants and investigators.9

The first project implemented to add depth to a specific research area was the PhenX Measures for Substance Abuse and Addiction (SAA) supplement, which was funded by the National Institute on Drug Abuse (NIDA). Although the PhenX Toolkit already had an Alcohol, Tobacco, and Other Substances domain, NIDA wanted to provide additional support for SAA investigators. This project resulted in a core collection of measures for use by all SAA researchers and several specialty collections.10 In 2013, PhenX (phase 2) was funded via a Genomic Resource award, and the scope was expanded beyond genomics and common complex diseases to include rare genetic conditions and a variety of study designs. Recognizing the success of the NIDA-funded supplement, other NIH institutes and programs, including the National Institute of Mental Health and the Tobacco Regulatory Science Program, also provided funding to add depth to the PhenX Toolkit. In 2014, PhenX Measures for Sickle Cell Disease Research was funded by NHLBI to provide investigators and clinicians with standard measures and CDEs for SCD research.

PhenX measures have been recommended to promote the collection of comparable data in 226 funding opportunity announcements and 5 NIH guide notices. As a result, PhenX measures have been incorporated into both individual research projects11-25 and several major studies and consortiums, including the Precision Medicine Initiative,1,2 the Environmental Influences on Child Health Outcomes program, the Adolescent Brain Cognitive Development Study,3 and the Tobacco Centers of Regulatory Science.4

Establishing requirements to include PhenX measures in funding opportunities for SCD research by NHLBI and other NIH institutes will facilitate the generation of CDEs. For example, the recent NHLBI funding opportunity “Sickle Cell Disease Implementation Consortium (SCDIC): Using Implementation Science to Optimize Care of Adolescents and Adults with Sickle Cell Disease (U01)” (FOA RFA-HL-16-010 and RFA-HL-16-011) recommended that the PhenX measures be used to harmonize collected data. The goal of the SCDIC is to identify barriers to health care and its utilization and set goals for reducing those barriers and improving patients’ health status. To accomplish these aims, the SCDIC is integrating PhenX with patient-reported outcomes from the NIH PROMIS and ASCQ-Me. One year after the launch of the SCDIC, 53% of the targeted 1252 subjects have completed data in the needs assessment, and approximately 5% of the targeted 2400 subjects have entered baseline data into the registry.

PhenX measures are also being used in the Sickle Cell Disease Treatment Demonstration Project and the NHGRI-funded Human Heredity and Health in Africa (H3Africa). The Sickle Cell Disease Treatment Demonstration Project is using PhenX measures to collect a minimum data set from participating partners to document health services utilization and quality improvement in health services delivery in this population. In H3Africa, several current projects used PhenX measures when developing their case report forms. Going forward, a recommended minimum set of questions based on PhenX is being implemented to harmonize data from new H3Africa projects. Additionally, the Sickle Cell Disease Ontology group used PhenX measures as a basis for parts of the sickle cell disease ontology.26

The cardiovascular, pulmonary, and renal WG (WG 1)

The goal of WG 1 was to incorporate the standard measures used to diagnose the often-inter-related complications of SCD involving the heart, lungs, and kidneys. Conditions that fell under the auspices of this WG included congestive heart failure, endothelial dysfunction, systemic and pulmonary hypertension, airways disease, abnormal pulmonary function, sleep-disordered breathing, hypoxemia, proteinuria, and chronic kidney disease. Because hemolysis is a common feature of many of the complications within this category, the biomarkers comprising the hemolytic index27 were included. The members of this WG acknowledged that a number of the existing renal measures in the Toolkit have SCD-specific issues (eg, the urinary excretion of creatinine results in overestimation of the glomerular filtration rate).28 These measures were annotated accordingly. WG 1 identified several limitations of existing protocols and annotated them to provide normative values for different populations and age groups. Additionally, annotations were made to the pulse oximetry protocol to reflect limitations in SCD resulting from a right-shifted oxygen-hemoglobin dissociation curve in the setting of more severe anemia.29,30

Through their efforts, WG 1 identified several promising new measures that could not be included in the Toolkit because of insufficient validation or excessive burden. These included brachial artery reactivity, cell free hemoglobin, liver iron by MRI R2*, cardiac short-axis function, and sleep disordered breathing, all of which were included in the SI.

The neurology, quality-of-life, and health services WG (WG 2)

Although the importance of central nervous system imaging in the clinical evaluation of neurological disease in individuals with SCD was recognized, the relevant imaging techniques are high-burden measures and, as such, were not included in the PhenX Toolkit. Annotation is provided for protocols to obtain measures of brain arterial blood supply by cervicocerebral magnetic resonance angiogram, brain morphology by computed tomography, and brain morphology by MRI.

One of the important contributions of WG 2 to the PhenX Toolkit was the addition of life-stage–specific protocols to measures that were already in the Toolkit. For example, WG 2 added pediatric and adult protocols to existing measures and, in some cases, annotated existing measures with pediatric collection recommendations. Pediatric protocols were added to the migraine, visual memory, executive function, working memory, and pain type and intensity measures.

SCD affects every organ system, leading to accelerated morbidity and mortality while exerting considerable psychological, social, and economic impacts over the life of the affected individual. Because of limited resources, only 2 SCD WGs were engaged to develop SCD specialty collections. Thus, the need to systematically identify standard measures for SCD research in many other areas remains. The PhenX Toolkit focuses on low-burden measures with the aim of reducing subject inconvenience, research infrastructure requirements, and cost. Investigators are encouraged to first consider using PhenX SCD core and specialty measures for data collection but are also encouraged to include additional measures as needed to support their research hypotheses. Researchers will be able to generate custom data collection worksheets to help them integrate their PhenX measures into a new or existing study design. Additionally, they will be able to generate data dictionaries for their selected PhenX measures (in dbGaP or REDCap format). The PhenX Toolkit provides not only measures and detailed protocols but also tools to help researchers effectively use the measures and identify potential collaborations. The PhenX Toolkit Link Your Study feature allows registered users to share information about what PhenX measures they are using and identify potential collaborators.

The SCD community now has a resource that will improve the consistency of data collection across all SCD studies with additional depth in 2 SCD research domains. Measures released in the PhenX Toolkit are intended for use in clinical, epidemiologic, and genomic studies and have the potential to accelerate translational research. The PhenX Toolkit provides researchers with the tools they need to improve the consistency and quality of data collection, foster collaborations, and compare and combine data sets. For researchers, adopting and using PhenX standard measures will promote collaborations among researchers, clinicians, and patients, leading to greater understanding of the phenotypes in SCD. For clinicians, using PhenX measures should expedite quality assessment and improvement and, thereby, improve patient care, outcomes, and quality of life. Accordingly, the consistent use of these measures by SCD researchers and clinicians will accelerate translational research and help the SCD community better understand the etiology, epidemiology, and progression of SCD and improve the treatment of its complications.


The authors wish to acknowledge all Sickle Cell Disease Research and Scientific Panel (SRSP) and Working Group (WG) members: SRSP members included J.R.E. (co-chair), K.L.H. (co-chair), Jon A. Detterich, Jeffrey Glassberg, Allison A. King, Zora R. Rogers, Kim Smith-Whitley, John J. Strouse, James Taylor, Marilyn J. Telen, and E.M.W. WG 1 members included E.S.K. (chair), Carol Blaisdell, Jon A. Detterich, Antonio Guasch, Johnson Haynes, Vandana Sachdev, and John Wood. WG 2 members included R.J.A. (co-chair), J.A.P. (co-chair), F. Daniel Armstrong, David C. Brousseau, Judith A. Paice, Steven Pavlakis, and Marsha J. Treadwell.

Funding for the PhenX Toolkit was provided by the National Institutes of Health, National Human Genome Research Institute (via Cooperative Agreement U41 HG007050) with cofunding from the National Institute on Drug Abuse. The sickle cell disease Administrative Supplement to the U41 was supported with funding from the National Heart, Lung, and Blood Institute.


Contribution: J.R.E., K.L.H., E.M.W., W.H., and C.M.H. participated in the drafting and writing of the manuscript; W.H. created Tables 1-4; E.S.K. participated in writing the sections about the cardiovascular, pulmonary, and renal WG (WG 1) and provided a critical review of the manuscript; and R.J.A. and J.A.P. participated in writing the sections about the neurology, quality-of-life, and health services WG (WG 2) and provided a critical review of the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Carol M. Hamilton, RTI International, 3040 Cornwallis Rd, Research Triangle Park, NC 27709; e-mail: chamilton{at}

  • Submitted July 18, 2017.
  • Accepted November 3, 2017.

