Instrument review process and criteria - Developmental and social-emotional screening of young children (0-5 years of age) in Minnesota

The Minnesota Interagency Developmental Screening Task Force was convened in spring of 2004 to establish a standard of practice for the developmental component of the screening of children birth through age five, and assure its quality and effectiveness. Partners include the Minnesota Departments of Education, Health, and Human Services.

The goals of the Task Force are to:

  1. Establish criteria for developmental (cognition, fine and gross motor skills, speech and language) and social-emotional screening instrument selection.
  2. Develop a list of recommended and/or approved developmental and social-emotional screening instruments.

The Task Force continues to integrate research and evidence-based practice in the review of developmental and social-emotional screening instruments. Information on screening instruments is gathered from several sources, including administration manuals, technical documents, literature reviews, and communication with the instrument developers and publishers.

The Task Force reserves the right to modify the criteria standards used in the review process of developmental and social-emotional screening instruments. Developmental and social-emotional screening instruments that sufficiently meet the criteria outlined below are considered for recommended/approved status.

Review criteria

Instrument purpose

The Task Force evaluates the purpose of the instrument to ensure that it is focused on screening rather than assessment or diagnostic evaluation, and that it is designed to screen for developmental and social-emotional health rather than to predict the child's future academic success.

If a purpose is not clearly defined in a brief statement, the Task Force reviews the descriptive materials about the instrument in an attempt to determine the instrument's purpose.

Developmental domains

The following domains must be included in developmental screening: motor, language, cognitive, and social-emotional.

Currently, the social-emotional domains embedded within developmental screening instruments do not demonstrate adequate reliability and validity to determine if a child needs further assessment. Therefore, the Task Force also reviews and recommends separate instruments for the social-emotional domain.


Reliability

Reliability is an indicator of how consistently or how often identical results can be obtained with the same screening instrument. A reliable instrument is one in which differences in test results are attributable less to chance and more to systematic factors such as lasting and general characteristics of the child (Meisels & Atkins-Burnett, 2005).


  • The Task Force expects reliability scores of approximately 0.70 or above.
  • Each instrument is evaluated on the actual reliability scores and the methods used to obtain these scores, such as scores by age, test-retest, inter-rater and intra-rater reliability.
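
As an illustration of how a test-retest reliability score might be checked against the 0.70 expectation, the following minimal Python sketch computes a Pearson correlation between two administrations of the same instrument. The source does not say which statistic developers must report, so the use of Pearson correlation, the function name `pearson_r`, the sample scores, and the treatment of "approximately 0.70" as a strict cutoff are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squared deviations
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for six children screened twice with the same instrument
first_scores = [10, 12, 15, 18, 20, 22]
retest_scores = [11, 12, 14, 19, 19, 23]

r = pearson_r(first_scores, retest_scores)
# Assumption: "approximately 0.70 or above" is treated here as r >= 0.70
meets_threshold = r >= 0.70
print(f"test-retest r = {r:.2f}, meets 0.70 expectation: {meets_threshold}")
```

A real review would also look at how the coefficient was obtained (sample size, interval between administrations, scores by age), which a single number does not capture.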


Validity

Validity within a screening instrument indicates how accurately it distinguishes between those children who are at risk and those not at risk for developmental or social delays or concerns. There are various measures of validity. The following are the key measures of validity for screening instruments:

  • Sensitivity: Accuracy of the instrument in identifying delayed development.
  • Specificity: Accuracy of the instrument in identifying individuals who are not delayed.

Other validity measures include:

  • Content validity: The extent to which the measures represent all aspects of a given domain and skills of interest.
  • Construct validity: The degree to which the instrument measures what it is supposed to measure.
  • Concurrent validity: Comparison between the instrument under study and reference-standard (gold standard) measures or a valid diagnostic assessment, usually performed 7-10 days after the screening test. The validity coefficient reports the agreement between the two tests (Meisels & Atkins-Burnett, 2005).
  • Positive predictive value: The probability that a child identified by the instrument as having delayed development actually has delayed development.


  • The Task Force expects sensitivity and specificity scores of approximately 0.70 or above (AAP, 2015).
  • Each instrument is evaluated on the actual validity scores, sufficient sample size representative of the US population, and the methods used to obtain these scores.
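
The sensitivity, specificity, and positive predictive value described above can be computed from a two-by-two comparison of screening results against a reference-standard assessment. The following is a minimal Python sketch, not part of the Task Force's procedure; the function name `screening_validity`, the counts, and the treatment of "approximately 0.70" as a strict cutoff are illustrative assumptions.

```python
def screening_validity(true_pos, false_pos, true_neg, false_neg):
    """Return (sensitivity, specificity, positive predictive value)."""
    # Sensitivity: share of children with a confirmed delay who screened positive
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of children without a delay who screened negative
    specificity = true_neg / (true_neg + false_pos)
    # PPV: share of positive screens that were confirmed as delayed
    ppv = true_pos / (true_pos + false_pos)
    return sensitivity, specificity, ppv


# Hypothetical validation sample: 200 children, 40 with a confirmed delay
sens, spec, ppv = screening_validity(
    true_pos=32, false_pos=24, true_neg=136, false_neg=8
)

# Assumption: "approximately 0.70 or above" is treated here as >= 0.70
meets_expectation = sens >= 0.70 and spec >= 0.70
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ppv={ppv:.2f}")
print(f"meets 0.70 expectation: {meets_expectation}")
```

In this hypothetical sample the instrument meets the 0.70 expectation for both sensitivity (0.80) and specificity (0.85), yet only about 57% of children who screen positive have a confirmed delay, because positive predictive value also depends on how common delays are in the sample.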

Recent standardization

Understanding of and expectations for child development change over time as new research emerges, and as changes occur in population demographics, technology, and curriculum. According to published standards, screening instrument normative data should be updated every 10-15 years to account for these changes (Emmons & Alfonso, 2005; Head Start, 2011; Glascoe, 2014).


  • The Task Force recommends instruments that have been developed or normed within the last 15 years, unless no other equivalent instrument is available that better meets the screening need for the given population.
  • Other considerations may include whether the instrument has had recent or ongoing research that demonstrates its effectiveness in identifying children who need further evaluation for developmental or social-emotional concerns.

Additional considerations

The following items are important considerations in selecting an instrument and are reviewed by the Task Force. Issues with these items alone will not result in failure to achieve recommended/approved status, but when combined with concerns about the above criteria, they may lead to an instrument being eliminated from consideration.


Practicality

Practicality refers to the ease of administering the screening instrument and the amount of time needed to administer and score it. The instrument should typically take 30 minutes or less to administer to English-speaking populations.

Population and age span targeted by the instrument

The Task Force considers both the target group for whom the instrument was designed and standardized, and the age of the child the instrument is designed to screen. This information should be clearly stated by the developer or publisher.

Cultural, ethnic, and linguistic sensitivity

The Task Force considers:

  • The availability of the instrument in languages other than English, and whether the instrument has been validated in those languages.
  • The instrument’s ability to accurately screen children from diverse cultures.
  • Whether normative scores, or scores used to establish appropriate cutoff points for referral for the population for which the test is developed, are provided (Meisels & Atkins-Burnett, 2005).

Minimum expertise of screeners

Screening instruments are designed to be administered by persons with varying levels of expertise, such as assistants, teachers, or psychologists. Some instruments allow for the screening instrument to be administered by a paraprofessional, but need to be scored or evaluated by a professional to determine if the child should be referred for further assessment.

The Task Force looks at whether training materials or training workshops are available so that screeners can learn proper administration. The Task Force also considers that an instrument requiring administration by a psychologist or similar professional may be an assessment instrument rather than a screening instrument.


Cost

The Task Force understands that school districts and organizations responsible for screening programs consider cost when selecting a developmental screening instrument. For this reason, the Task Force provides cost information on each developmental screening instrument, as available from the publisher.


References

Emmons, M.R. & Alfonso, V.C. (2005). A critical review of the technical characteristics of current preschool screening batteries. Journal of Psychoeducational Assessment, 23(11).

Glascoe, F.P. (2014). Best practices in test construction: Quality standards for reviewers and researchers. Journal of Developmental and Behavioral Pediatrics (submitted).

Meisels, S. J., & Atkins-Burnett, S. (2005). Developmental screening in early childhood: A guide (5th ed.). Washington, DC: National Association for the Education of Young Children.

Nunnally, J.C. & Bernstein, I.H. (1994). Psychometric Theory (3rd ed.). New York: McGraw-Hill.

U.S. Department of Health and Human Services Administration for Children & Families (2011). Resources for Measuring Services and Outcomes in Head Start Programs Serving Infants and Toddlers. Retrieved 8/2014 from