ASDBank English Nadig Corpus

Aparna Nadig
School of Communication Sciences and Disorders
McGill University


Janet Bang
College of Education
San Jose State University

Participants: 38
Type of Study: longitudinal
Location: Canada
Media type: not yet available
DOI: doi:10.21415/T54P4Q

Browsable transcripts

Download English transcripts
  • A subset of this corpus includes 38 of the 40 children included in the matched group in Bang and Nadig (2015) below. The parent input variables used in the publication (i.e., maternal tokens, types, D, MLU and number of utterances) are provided in the Excel document: bang_nadig.xlsx.

    Citation information

    Publications using this data should cite:

    Additional publications presenting data from this sample:

    In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

    Usage Restrictions

    We set no restrictions on the use of this data, but would appreciate notice of studies and articles using this corpus. Please send notice to the following contacts:

    Janet Bang:

    Aparna Nadig:


    Warnings concerning the use of this data include:


    Only families who provided informed consent to have their transcriptions contributed to CHILDES are included in this corpus. Options for consent included either keeping all first names or replacing first names with a pseudonym. Parents were informed that no other identifying information (e.g., last names, birth dates) would be included in the transcript. Telephone numbers that were uttered during the interaction were replaced with different numbers.

    Project Description

    Data collection

    The overall goals of this project were to longitudinally examine word learning in children with autism spectrum disorders (henceforth ASD) and typically-developing children from both English-speaking and French-speaking families in Montreal, Canada. The study employed a variety of measures, including a natural language sample during parent-child interaction. Families participated at three time points over the course of a year (between 2009 and 2012). The first and last visits were conducted at McGill University and included developmental assessments and other experimental measures (i.e., fast mapping and a non-word repetition task). The present language sample was collected at the second time point (approximately 6 months after the first visit) and was conducted either at the family’s home or at McGill University, depending on the family’s preference.

    The language sample that comprises this corpus was collected during a freeplay task in which the parent and child played with a standardized set of toys for approximately 10 minutes and an experimenter video recorded the session. At the end of 10 minutes the parent was asked to clean up with the child. This freeplay task was often the first task of the second visit, depending on the child’s interests. Pretend play was another focus of the freeplay task; toys were chosen based on those commonly used in the pretend play literature. Other activities during this second visit included shared book-reading and a task examining how parents teach children novel words.

    For the freeplay interaction, a blanket of approximately 4 x 4 feet was laid down and parents were asked to try their best to stay on the blanket and play with their child as they typically would. Toys included: a tea set (two cups, saucers, spoons), colored blocks, two dolls (a female doll and a small baby doll), a telephone (on the telephone was a hologram of a horse), a baby bottle, and a dump truck. Families were videotaped and transcription was done from the video recording.

    Sampling procedure

    This study was conducted in the city of Montreal, Canada. The mother-child dyads include both English-exposed (En) and French-exposed (Fr) young children with ASD or typically developing children, and their mothers. Participants were included as En or Fr when their language exposure across settings consisted of 75% or more of the respective language by parent report.
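
    The 75% exposure rule above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the percentage inputs, and the handling of children who meet neither threshold are hypothetical and are not taken from the study materials.

```python
from typing import Optional


def exposure_group(english_pct: float, french_pct: float,
                   threshold: float = 75.0) -> Optional[str]:
    """Classify a child as English-exposed ("En") or French-exposed ("Fr").

    english_pct and french_pct are parent-reported exposure percentages
    across settings (hypothetical inputs). A child is assigned to a group
    when exposure to that language is at or above the threshold (75% in
    the corpus description). Returns None when neither language reaches it.
    """
    if english_pct >= threshold:
        return "En"
    if french_pct >= threshold:
        return "Fr"
    return None
```

    For example, exposure_group(80, 20) yields "En", while a child with a 60/40 split meets neither criterion and the function returns None.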

    Children with ASD (36 – 74 months) were recruited in collaboration with the Montreal Children’s Hospital Autism Spectrum Disorders Clinic, as well as through flyers in the Montreal area. Inclusion criteria for the children with ASD were: a clinical diagnosis of Autism Spectrum Disorder, most often from a multi-disciplinary autism evaluation team at the Montreal Children’s Hospital; meeting criteria for ASD on the Modified Checklist for Autism in Toddlers (M-CHAT; Robins, Fein, & Barton, 1999; 2 or more “critical items” failed, or any three items failed) and the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2000; score of 7 or higher); absence of other medical conditions associated with autism (e.g., fragile X syndrome, tuberous sclerosis); absence of physical disability that would interfere with completing the study procedures; and the ability to sit and work to complete developmental testing and study tasks.
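
    The M-CHAT cutoff quoted above (2 or more “critical items” failed, or any three items failed) can be written as a short check. This is a minimal sketch, not the scoring procedure used in the study: the function name is hypothetical, and the set of critical item numbers is not given in this description, so it is passed in by the caller.

```python
from typing import Iterable, Set


def mchat_screen_positive(failed_items: Iterable[int],
                          critical_items: Set[int]) -> bool:
    """Apply the cutoff described in the text to a list of failed items.

    failed_items: item numbers the child failed.
    critical_items: item numbers treated as "critical" (an assumption
    supplied by the caller; not specified in the corpus description).
    """
    failed = set(failed_items)
    failed_critical = failed & set(critical_items)
    # Positive if 2+ critical items are failed OR any 3 items are failed.
    return len(failed_critical) >= 2 or len(failed) >= 3
```

    With a placeholder critical set such as {1, 2, 3}, failing items 1 and 2 meets the cutoff, whereas failing items 1 and 4 does not (one critical item, two items in total).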

    Typically-developing children (henceforth TYP; 12 – 57 months) were recruited from the McGill Infant Research Group Database. Inclusion criteria for the typically-developing children were: no symptoms of autism based on parent report and the M-CHAT; no developmental, learning or behavioral disorder by parent report; no history of significant medical complications or conditions; no first- or second-degree relatives with an ASD; and absence of physical disability that would interfere with completing the study procedures. Typically-developing children were recruited to match children with ASD on language ability. Many children with ASD experience a significant language delay; therefore the TYP children were substantially younger than the children with ASD as a group.

    Transcription protocol

    There were 3 French-speaking transcribers (undergraduate students in Psychology, Neuroscience, and Anatomy and Cell Biology) and 3 English-speaking transcribers (2 graduate students in Communication Sciences and Disorders and 1 undergraduate student in Psychology); all were native speakers of the language they transcribed. All but 1 (i.e., the project manager) were blind to the child’s diagnostic group. Videos were assigned so that each transcriber encountered approximately the same number of children with ASD and TYP children, as well as children with higher and lower language levels. The project manager made these assignments; demographic and group status information was never provided to any other transcriber. Each video was viewed twice, each time by a different transcriber. The first transcriber completed the transcription and the second transcriber reviewed the original transcription for utterance breaks, possibility of deciphering unintelligible utterances, and adherence to CHAT conventions. Any large discrepancies between the first and second transcriber were to be noted on a protocol sheet for later review. No transcriptions needed to be reviewed.

    Transcribers worked in a quiet room and wore headphones while watching the video recording using Windows Media Player. The full duration of each interaction (approximately 10 minutes) was transcribed. Gems mark approximately 9 minutes of transcription. When applicable, utterance breaks were determined by clear intonational markers such as questions or exclamations. All other utterance breaks were determined by a clear pause followed by a breath. If an utterance was difficult to comprehend, transcribers were told to listen to the segment a maximum of 3 times before transcription. If the utterance was still unintelligible, the portion or full utterance was marked with “xxx”. If there was an overlap between speakers, the transcriber was to attend first to the parent utterances, then to the child utterances. All CHAT conventions applicable to the present study were provided in an abridged manual, which was made available in printed and electronic forms. The original manuals provided by CHILDES (v2000) were also made available.

    Project-specific conventions

    Biographical data

    Biographical data provided for the child include age, gender, diagnostic group, and language. Biographical data for the parent include gender, education and language. At the time of data collection, all families lived in the greater Montreal, Canada area. Montreal is commonly referred to as a bilingual city (i.e., English and French), although the official language is French. French-speaking families spoke a few different dialects, although the majority spoke Quebec French; dialectal information per family was not collected.

    To be eligible for the study, parents had to report that their children were exposed to English-speaking or French-speaking environments at least 75% of the time per week. However, some families did not have English or French as a native language. Any additional languages spoken by the families are noted. Occasionally English-speaking and French-speaking parents would use words in the other language; these words are marked with the CHAT codes @s:fre or @s:eng on the respective lexical item.
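
    As an illustration of how the @s:fre and @s:eng markers might be located in a transcript line, the sketch below uses a regular expression. It is an assumption-laden example: the function name and the sample utterance are made up for illustration and are not drawn from the corpus, and only the two codes named above are matched.

```python
import re
from typing import List, Tuple

# Matches a word immediately followed by the @s:eng or @s:fre marker.
SWITCH = re.compile(r"(\S+?)@s:(eng|fre)\b")


def code_switched_words(utterance: str) -> List[Tuple[str, str]]:
    """Return (word, language_code) pairs for marked words in one line."""
    return SWITCH.findall(utterance)


# Hypothetical main-tier line, not taken from the corpus.
example = "*MOT:\tlook at the chien@s:fre ."
switched = code_switched_words(example)
```

    Here switched holds a single pair, the word "chien" with the code "fre"; a line without any marker yields an empty list.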

    The majority of families were recorded in their homes. Other families were recorded at McGill University in our testing facility.


    Funding for this project was awarded to Dr. Aparna Nadig via a nouveau professeur-chercheur grant from the Fonds de recherche du Québec - Société et culture in 2008.