Developmental and Social-emotional Screening Instrument review process and criteria


The Minnesota Interagency Developmental Screening Task Force convened in 2004 to establish a standard of practice for developmental and social-emotional screening of children from birth through age five, and to assure the quality and effectiveness of that screening. Partners include the Minnesota Departments of Education, Health, and Human Services.

The goals of the Task Force are to:

  1. Establish criteria for developmental (cognition, fine and gross motor skills, speech and language) and social-emotional screening instrument selection.
  2. Develop a list of recommended and/or approved developmental and social-emotional screening instruments.

The Task Force integrates research and evidence-based practice to review developmental and social-emotional screening instruments. Information on screening instruments is gathered from several sources, including administration manuals, technical documents, literature reviews, and communication with the instrument developers and publishers.

Review criteria are based on nationally accepted psychometric standards. The Task Force reserves the right to modify the criteria used in its review of developmental and social-emotional screening instruments. Instruments that sufficiently meet the criteria outlined below are considered for recommendation/approval.

Review criteria


The Task Force evaluates the purpose of the instrument to ensure that it is focused on screening, rather than assessment or diagnostic evaluation, and whether it is designed to screen for developmental and social-emotional health rather than to predict the future academic success of the child.

The following domains must be included in developmental screening: fine and gross motor, communication, cognitive, and social-emotional.

Currently, the social-emotional domains embedded within developmental screening instruments do not demonstrate adequate reliability and validity to determine if a child needs further assessment. Therefore, the Task Force also reviews and recommends separate instruments specifically for the social-emotional domain.

Reliability is an indicator of how consistently or how often identical results can be obtained with the same screening instrument. A reliable instrument is one in which differences in test results are attributable less to chance and more to systematic factors such as lasting and general characteristics of the child (Meisels & Atkins-Burnett, 2005).


  • The Task Force expects reliability scores of approximately 0.70 or above.
  • Each instrument is evaluated on the actual reliability scores and the methods used to obtain these scores, such as scores by age, test-retest, inter-rater and intra-rater reliability.
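
As an illustration of how a test-retest reliability score can be compared with the 0.70 expectation, the sketch below computes a Pearson correlation between two administrations of the same instrument. The choice of Pearson correlation, the function name, and the scores are assumptions made for this example only. Instrument manuals may report other reliability coefficients, and the Task Force does not prescribe a specific computation.

```python
from math import sqrt

# Assumed for illustration: the approximate reliability level named above.
RELIABILITY_EXPECTATION = 0.70


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six children screened twice with the same instrument.
first_administration = [10, 12, 15, 18, 20, 25]
second_administration = [11, 13, 14, 19, 21, 24]

test_retest_r = pearson_r(first_administration, second_administration)
meets_reliability_expectation = test_retest_r >= RELIABILITY_EXPECTATION

print(f"test-retest r = {test_retest_r:.2f}")
print(f"meets expectation: {meets_reliability_expectation}")
```

A single coefficient is only part of the picture: the methods used to obtain it, such as the interval between administrations and the ages sampled, are evaluated alongside the score itself.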

Validity of a screening instrument indicates how accurately it distinguishes between those children who are at risk and those not at risk for developmental or social delays or concerns. There are various measures of validity. The following are the key measures of validity for screening instruments:

  • Sensitivity: The proportion of children with delayed development whom the instrument correctly identifies as at risk.
  • Specificity: The proportion of children who are not delayed whom the instrument correctly identifies as not at risk.

Other validity measures include:

  • Content validity: How well the measures represent all aspects of a given domain and skills of interest.
  • Construct validity: How well the instrument measures what it is supposed to measure.
  • Concurrent validity: How well the instrument under study compares to reference-standard (gold standard) measures or valid diagnostic assessment, usually performed 7-10 days after the screening test. The validity coefficient reports the agreement between the two tests (Meisels & Atkins-Burnett, 2005).
  • Positive predictive value: The probability that a child with a positive screening result actually has delayed development.


  • The Task Force expects sensitivity and specificity scores of approximately 0.70 or above (AAP, 2015).
  • Each instrument is evaluated on the actual validity scores, whether the sample is large enough and representative of the U.S. population, and the methods used to obtain these scores.
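
The relationship between these measures can be sketched with a small worked example. The code below computes sensitivity, specificity, and positive predictive value from the four cells of a screening-versus-reference comparison and checks the first two against the 0.70 expectation. The function name and all counts are hypothetical and chosen only for illustration. They do not describe any reviewed instrument.

```python
# Assumed for illustration: the approximate level named above for
# sensitivity and specificity.
VALIDITY_EXPECTATION = 0.70


def screening_validity(tp, fp, fn, tn):
    """Return (sensitivity, specificity, positive predictive value).

    tp: delayed children who screened positive
    fp: non-delayed children who screened positive
    fn: delayed children who screened negative
    tn: non-delayed children who screened negative
    """
    sensitivity = tp / (tp + fn)  # share of delayed children identified
    specificity = tn / (tn + fp)  # share of non-delayed children cleared
    ppv = tp / (tp + fp)          # share of positive screens that are delayed
    return sensitivity, specificity, ppv


# Hypothetical validation sample of 250 children, 50 of them delayed
# according to the reference-standard assessment.
sens, spec, ppv = screening_validity(tp=40, fp=30, fn=10, tn=170)
meets_threshold = sens >= VALIDITY_EXPECTATION and spec >= VALIDITY_EXPECTATION

print(f"sensitivity = {sens:.2f}")
print(f"specificity = {spec:.2f}")
print(f"positive predictive value = {ppv:.2f}")
print(f"meets expectation: {meets_threshold}")
```

In this hypothetical sample, sensitivity (0.80) and specificity (0.85) both exceed 0.70, yet only about 57% of positive screens reflect a true delay. Positive predictive value depends on how common delays are in the sample, which is one reason sample size and representativeness are considered together with the scores.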

Expectations for child development change over time as new research emerges, and as changes occur in population demographics, technology, and curriculum. According to national standards, screening instrument normative data should be updated every 10-15 years to account for these changes (Emmons and Alfonso, 2005; Head Start, 2011; Glascoe, 2014).


  • The Task Force recommends instruments that have been developed or re-normed within the last 15 years, unless no other equivalent instrument is available that better meets the screening need for the given population.
  • Other considerations may include whether the instrument has had recent or ongoing research that demonstrates its effectiveness in identifying children who need further evaluation for developmental or social-emotional concerns.

Additional Considerations

The following critical considerations are also reviewed by the Task Force.


Practicality refers to the ease of administration of the screening instrument, and the amount of time needed to administer and score the screening instrument. The instrument should typically take 30 minutes or less to administer.

The Task Force considers both the target group for whom the instrument was designed and standardized, and the age of the child the instrument is designed to screen. This information should be clearly stated by the developer or publisher.

The Task Force considers:

  • The availability of the instrument in languages other than English, and whether the instrument has been validated in those languages.
  • The instrument's ability to accurately screen children from diverse cultures.
  • Whether normative scores, or scores used to establish appropriate cutoff points for referral, are provided for the population for which the instrument was developed (Meisels & Atkins-Burnett, 2005).

The minimum level of expertise required to administer the instrument and to score and interpret the results is instrument-specific, and can range from paraprofessional to doctorally prepared professional. Some instruments can be administered by a paraprofessional but must be scored or evaluated by a professional to determine whether the child should be referred for further assessment. The Task Force considers an instrument that requires administration by a psychologist or similar professional to be an assessment instrument rather than a screening instrument.

The Task Force also reviews the availability of training materials or workshops that prepare screeners to administer the instrument properly.

The Task Force understands that school districts and organizations responsible for screening programs consider cost when selecting a developmental screening instrument. For this reason, the Task Force provides cost information on each developmental screening instrument, as available from the publisher.


Emmons, M.R. & Alfonso, V.C. (2005). A critical review of the technical characteristics of current preschool screening batteries. Journal of Psychoeducational Assessment, 23(11).

Glascoe, F.P. (2014). Best practices in test construction: Quality standards for reviewers and researchers. Journal of Developmental and Behavioral Pediatrics (submitted).

Meisels, S. J., & Atkins-Burnett, S. (2005). Developmental screening in early childhood: A guide (5th ed.). Washington, DC: National Association for the Education of Young Children.

Nunnally, J.C. & Bernstein, I.H. (1994). Psychometric Theory (3rd ed.). New York: McGraw-Hill.

U.S. Department of Health and Human Services Administration for Children & Families (2011). Resources for Measuring Services and Outcomes in Head Start Programs Serving Infants and Toddlers. Retrieved 8/2014.

Updated Wednesday, 24-Mar-2021 18:55:04 CDT