Modified

September 13, 2024

Background

We use the datadictionary package to generate the data dictionary.

This is not a perfect solution: among other challenges, the package throws many warnings. We will use it for the time being.
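Because the package is noisy, it can help to capture its warnings rather than let them flood the rendered output. The following is a minimal sketch in base R, not something the datadictionary package provides: `collect_warnings()` is a hypothetical helper name, and the demonstration uses a base R call that is known to warn so the example runs without the screening data.

```r
# Hypothetical helper (not part of datadictionary): evaluate an expression,
# collect any warning messages it raises, and return them with the result.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # continue without printing the warning
    }
  )
  list(value = value, warnings = unique(msgs))
}

# Demonstration with a base R call that warns when coercion fails.
res <- collect_warnings(as.numeric(c("1", "x")))
res$value     # the result of the wrapped call
res$warnings  # the de-duplicated warning messages
```

In this workflow the same wrapper could, under that assumption, be applied as `collect_warnings(datadictionary::create_dictionary(scr_df))`, with the dictionary available in `$value` and the warnings kept for later review.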

Note

This data dictionary workflow does not yet include the questions that were asked. Adding them, along with more descriptive information, is a high priority.
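One possible stopgap is to keep a small lookup table of question text and join it to the dictionary by variable name. The sketch below is illustrative only: it uses a toy dictionary with one row per variable, the `questions` table is hypothetical, and the question text is a placeholder rather than the actual survey wording.

```r
# Toy dictionary with one row per variable (names taken from the screening data).
dd <- data.frame(
  item  = c("child_age_mos", "child_sex", "mom_education"),
  class = c("numeric", "character", "character"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table; the question text here is a placeholder.
questions <- data.frame(
  item     = c("child_age_mos", "child_sex"),
  question = c("<question text for child_age_mos>",
               "<question text for child_sex>"),
  stringsAsFactors = FALSE
)

# Left join: keep every dictionary row, with NA where no question is recorded.
dd_q <- merge(dd, questions, by = "item", all.x = TRUE, sort = FALSE)
dd_q
```

Variables without an entry in the lookup table keep an `NA` question, which makes the remaining gaps easy to list.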

Load data and generate dictionary

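# Read the latest aggregate screening data, relative to the project root.
scr_df <- readr::read_csv(here::here("data/csv/screening/agg/PLAY-screening-datab-latest.csv"),
                          show_col_types = FALSE)

# Summarize each column: class, summary statistics, and missing counts.
scr_dd <- datadictionary::create_dictionary(scr_df)

# Save the dictionary as a CSV file.
readr::write_csv(scr_dd, here::here("data/csv/screening/dd/PLAY-screening-data-dictionary.csv"))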

Here is the dictionary the package produces:

scr_dd |>
  kableExtra::kable() |>
  kableExtra::kable_classic()
| item | label | class | summary | value |
|---|---|---|---|---|
| Rows in dataset | | | | 933 |
| Columns in dataset | | | | 68 |
| session_id | No label | numeric | mean | 66910 |
| | | | median | 68325 |
| | | | min | 38196 |
| | | | max | 75625 |
| | | | missing | 0 |
| session_name | No label | character | unique responses | 898 |
| | | | missing | 0 |
| session_date | No label | Date | mean | 2023-08-12 |
| | | | mode | 2023-08-13 |
| | | | min | 2019-06-09 |
| | | | max | 2025-02-03 |
| | | | missing | 0 |
| session_release | No label | character | unique responses | 3 |
| | | | missing | 0 |
| participant_ID | No label | character | unique responses | 98 |
| | | | missing | 0 |
| participant_birthdate | No label | Date | mean | 2022-02-15 |
| | | | mode | 2021-08-18 |
| | | | min | 2017-06-08 |
| | | | max | 2024-02-08 |
| | | | missing | 0 |
| participant_gender | No label | character | unique responses | 2 |
| | | | missing | 0 |
| participant_race | No label | character | unique responses | 11 |
| | | | missing | 0 |
| participant_ethnicity | No label | character | unique responses | 3 |
| | | | missing | 0 |
| participant_language | No label | character | unique responses | 13 |
| | | | missing | 0 |
| exclusion_reason | No label | character | unique responses | 15 |
| | | | missing | 867 |
| group_name | No label | character | unique responses | 2 |
| | | | missing | 0 |
| context_setting | No label | character | unique responses | 2 |
| | | | missing | 6 |
| context_country | No label | character | unique responses | 2 |
| | | | missing | 47 |
| context_state | No label | character | unique responses | 19 |
| | | | missing | 6 |
| vol_id | No label | numeric | mean | 1299 |
| | | | median | 1376 |
| | | | min | 899 |
| | | | max | 1705 |
| | | | missing | 0 |
| participant_disability | No label | character | unique responses | 5 |
| | | | missing | 767 |
| pilot_pilot | No label | logical | missing | 933 |
| submit_date | No label | POSIXct POSIXt | mean | 2023-06-09 |
| | | | mode | 2024-02-09 2024-04-19 |
| | | | min | 2019-10-08 |
| | | | max | 2024-09-08 |
| | | | missing | 102 |
| site_id | No label | character | unique responses | 31 |
| | | | missing | 102 |
| play_id | No label | character | unique responses | 692 |
| | | | missing | 224 |
| child_age_mos | No label | numeric | mean | 18 |
| | | | median | 18 |
| | | | min | 9.24 |
| | | | max | 42.6 |
| | | | missing | 104 |
| child_sex | No label | character | unique responses | 3 |
| | | | missing | 104 |
| child_bornonduedate | No label | character | unique responses | 3 |
| | | | missing | 110 |
| child_onterm | No label | character | unique responses | 3 |
| | | | missing | 218 |
| child_birthage | No label | numeric | mean | 5 |
| | | | median | 3 |
| | | | min | -20 |
| | | | max | 367 |
| | | | missing | 131 |
| child_weight_pounds | No label | numeric | mean | 7 |
| | | | median | 7 |
| | | | min | 4 |
| | | | max | 100 |
| | | | missing | 114 |
| child_weight_ounces | No label | numeric | mean | 7 |
| | | | median | 7 |
| | | | min | 0 |
| | | | max | 143 |
| | | | missing | 126 |
| child_birth_complications | No label | character | unique responses | 3 |
| | | | missing | 114 |
| child_birth_complications_specify | No label | character | unique responses | 67 |
| | | | missing | 867 |
| child_hearing_disabilities | No label | character | unique responses | 3 |
| | | | missing | 114 |
| child_hearing_disabilities_specify | No label | character | unique responses | 2 |
| | | | missing | 932 |
| child_vision_disabilities | No label | character | unique responses | 3 |
| | | | missing | 114 |
| child_vision_disabilities_specify | No label | character | unique responses | 6 |
| | | | missing | 928 |
| child_major_illnesses_injuries | No label | character | unique responses | 3 |
| | | | missing | 114 |
| child_illnesses_injuries_specify | No label | character | unique responses | 30 |
| | | | missing | 900 |
| child_developmentaldelays | No label | character | unique responses | 3 |
| | | | missing | 226 |
| child_developmentaldelays_specify | No label | character | unique responses | 9 |
| | | | missing | 925 |
| child_sleep_time | No label | character | unique responses | 99 |
| | | | missing | 115 |
| child_wake_time | No label | character | unique responses | 114 |
| | | | missing | 116 |
| child_nap_hours | No label | character | unique responses | 39 |
| | | | missing | 116 |
| child_sleep_location | No label | character | unique responses | 6 |
| | | | missing | 116 |
| mom_bio | No label | character | unique responses | 4 |
| | | | missing | 128 |
| mom_childbirth_age | No label | numeric | mean | 33 |
| | | | median | 33 |
| | | | min | 20.22 |
| | | | max | 121.22 |
| | | | missing | 131 |
| mom_race | No label | character | unique responses | 9 |
| | | | missing | 121 |
| mom_birth_country | No label | character | unique responses | 5 |
| | | | missing | 121 |
| mom_birth_country_specify | No label | character | unique responses | 39 |
| | | | missing | 845 |
| mom_education | No label | character | unique responses | 15 |
| | | | missing | 122 |
| mom_employment | No label | character | unique responses | 4 |
| | | | missing | 122 |
| mom_occupation | No label | character | unique responses | 486 |
| | | | missing | 318 |
| mom_jobs_number | No label | character | unique responses | 7 |
| | | | missing | 316 |
| mom_training | No label | character | unique responses | 3 |
| | | | missing | 124 |
| biodad_childbirth_age | No label | character | unique responses | 639 |
| | | | missing | 102 |
| biodad_race | No label | character | unique responses | 13 |
| | | | missing | 102 |
| language_spoken_mom | No label | character | unique responses | 6 |
| | | | missing | 110 |
| language_spoken_mom_comments | No label | character | unique responses | 17 |
| | | | missing | 917 |
| language_spoken_child | No label | character | unique responses | 6 |
| | | | missing | 104 |
| language_spoken_home_comments | No label | character | unique responses | 832 |
| | | | missing | 102 |
| language_spoken_child_comments | No label | character | unique responses | 14 |
| | | | missing | 920 |
| language_spoken_home | No label | character | unique responses | 7 |
| | | | missing | 109 |
| language_spoken_house_other | No label | character | unique responses | 15 |
| | | | missing | 919 |
| language_spoken_home_other | No label | character | unique responses | 2 |
| | | | missing | 932 |
| childcare_types | No label | character | unique responses | 14 |
| | | | missing | 309 |
| childcare_location | No label | character | unique responses | 26 |
| | | | missing | 908 |
| childcare_hours | No label | character | unique responses | 71 |
| | | | missing | 500 |
| childcare_number | No label | numeric | mean | 6 |
| | | | median | 6 |
| | | | min | 0 |
| | | | max | 25 |
| | | | missing | 503 |
| childcare_age | No label | numeric | mean | 7 |
| | | | median | 5 |
| | | | min | 0 |
| | | | max | 45 |
| | | | missing | 613 |
| childcare_language | No label | character | unique responses | 38 |
| | | | missing | 499 |
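
The dictionary is in a long layout, with several summary rows per variable and the variable name shown only on the first of them. A minimal sketch of pulling out just the missing-value counts follows. It uses a hand-built toy data frame with values copied from the table above, and it assumes that continuation rows hold an empty string in `item`; the actual object returned by the package may differ, so check before reusing this.

```r
# Toy dictionary in the same long layout as the table above.
# Assumption: continuation rows carry an empty string in `item`.
dd <- data.frame(
  item    = c("session_id", "", "", "", "", "exclusion_reason", ""),
  summary = c("mean", "median", "min", "max", "missing",
              "unique responses", "missing"),
  value   = c("66910", "68325", "38196", "75625", "0", "15", "867"),
  stringsAsFactors = FALSE
)

# Fill each blank `item` with the most recent non-blank variable name.
is_named <- dd$item != ""
dd$item <- dd$item[is_named][cumsum(is_named)]

# Keep only the "missing" rows and convert the counts to numbers.
missing_df <- dd[dd$summary == "missing", c("item", "value")]
missing_df$n_missing <- as.numeric(missing_df$value)

# Proportion missing, using the 933 rows reported for the dataset.
missing_df$prop_missing <- missing_df$n_missing / 933
missing_df
```

Sorting `missing_df` by `prop_missing` would give a quick view of which screening variables are sparsely populated.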

Extracting questions

An alternative approach to generating the data dictionary starts with the questionnaire files themselves.