Marina Hu

Data Analysis for Frequency of Terms Indicating Autism Spectrum Disorder (ASD)

Marina Hu, BASIS Chandler High School

Behavior, Anorexia, and Obsessive-Compulsive Disorder (OCD) in ASD Electronic Health Records (EHRs).

In the US, boys are four times more likely than girls to be diagnosed with Autism Spectrum Disorder (ASD). We focused on two possible explanations for this disparity: some ASD symptoms in girls overlap with those of other conditions, such as Anorexia and Obsessive-Compulsive Disorder (OCD), and boys' behaviors are examined more comprehensively than girls' during ASD evaluation. To investigate these hypotheses, we applied Natural Language Processing (NLP) to 4480 Electronic Health Records (EHRs) from the Arizona dataset of a 2000-2010 CDC surveillance project. We created a Java program to count the terms indicating Anorexia, OCD, and ASD-Behavior (such as "play", "alone", and "repetition") in the EHRs and compared the counts between boys and girls. We found that terms indicating Anorexia appeared twice as frequently in girls' EHRs as in boys', while terms indicating OCD appeared 12.2% more frequently in boys' EHRs than in girls'. Additionally, the average number of behavior words per EHR was 14.9% greater for boys than for girls. Our results suggest that girls evaluated for ASD tend to also be considered for Anorexia, but less so for OCD, and that behavior words appear more frequently in boys' EHRs than in girls'. These results provide potential explanations for the disparity in ASD diagnosis between boys and girls. More research is needed to determine whether girls evaluated for ASD are considered only for Anorexia or for other conditions as well, and whether behavior words appear at similar frequencies in national datasets of ASD EHRs.
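The term-counting step described above can be illustrated with a minimal Java sketch. It is not the program used in this study: the class name, method names, whole-word case-insensitive matching rule, and the toy records in `main` are all assumptions made for illustration, and the three-word term list only repeats the examples quoted in the abstract.

```java
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TermFrequency {

    // Hypothetical term list: only the three example words quoted in the abstract.
    static final List<String> BEHAVIOR_TERMS = List.of("play", "alone", "repetition");

    /** Counts whole-word, case-insensitive occurrences of any listed term in one record. */
    static int countTerms(String record, List<String> terms) {
        String lower = record.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String term : terms) {
            // \b anchors keep "play" from matching inside "playing" (an assumed rule).
            Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(term.toLowerCase(Locale.ROOT)) + "\\b");
            Matcher m = p.matcher(lower);
            while (m.find()) {
                count++;
            }
        }
        return count;
    }

    /** Average number of term occurrences per record; 0.0 for an empty group. */
    static double averagePerRecord(List<String> records, List<String> terms) {
        if (records.isEmpty()) {
            return 0.0;
        }
        long total = 0;
        for (String record : records) {
            total += countTerms(record, terms);
        }
        return (double) total / records.size();
    }

    /** Percent by which value a exceeds value b (b must be non-zero). */
    static double percentGreater(double a, double b) {
        return (a - b) / b * 100.0;
    }

    public static void main(String[] args) {
        // Invented toy strings, not taken from the Arizona dataset.
        List<String> boys = List.of(
            "Prefers to play alone. Repetition of phrases noted.",
            "Does not play with peers.");
        List<String> girls = List.of(
            "Tends to play alone.",
            "Quiet in class.");

        double boysAvg = averagePerRecord(boys, BEHAVIOR_TERMS);
        double girlsAvg = averagePerRecord(girls, BEHAVIOR_TERMS);
        System.out.printf("Boys average: %.2f%n", boysAvg);
        System.out.printf("Girls average: %.2f%n", girlsAvg);
        System.out.printf("Percent greater (boys vs. girls): %.1f%%%n",
            percentGreater(boysAvg, girlsAvg));
    }
}
```

Averaging per record rather than comparing raw totals matters here because the two groups may contain different numbers of EHRs. A real analysis would also need to decide whether inflected forms such as "playing" or "repetitive" should count, which this sketch deliberately excludes.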

