Gather

Modified

September 13, 2024

Collection protocol

Details about the data collection protocol for participant screening and recruiting can be found on the PLAY Project website.

Setup

We load functions needed to download KBT screening/demographic questionnaire files.

fl <- list.files(file.path(here::here(), "R"), "^kobo_|^file_|^screen_|CONSTANTS", full.names = TRUE)
purrr::walk(fl, source)

library(tidyverse) # for the `magrittr` pipe `%>%`
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Retrieve from KoBoToolbox (KBT)

We make use of the targets package for downloading data files from KoBoToolbox and for saving local XLSX and CSV copies. This allows us to download and process those files on a regular basis.

This excerpt from _targets.R shows two of these ‘targets’:

  # Download screening/demographic survey
  tar_target(
    kb_screen_df,
    kobo_list_data_filtered(kb_df, "[Dd]emographic"),
    cue = tarchetypes::tar_cue_age(
      name = kb_screen,
      age = as.difftime(update_interval, units = update_interval_units)
    )
  ),
  tar_target(
    screen_download,
    screen_download_convert(kb_screen_df, "data/xlsx/screening", "data/csv/screening"),
    cue = tarchetypes::tar_cue_age(
      name = screen_df,
      age = as.difftime(update_interval, units = update_interval_units)
    )
  ),

The kb_df data frame is input to the first target kb_screen_df. kb_df is generated when the following target is generated (by kobo_list_data()).

  tar_target(
    kb_df,
    kobo_list_data(),
    cue = tarchetypes::tar_cue_age(
      name = kb_df,
      age = as.difftime(update_interval, units = update_interval_units)
    )
  ),

Download

We have two targets specified in _targets.R that handle the regular downloading of screening data files.

First, we generate a data frame of KoBoToolbox forms that contain the screening (“Demographic”) data. Here is the target for that process:

# Not evaluated
  tar_target(
    kb_screen_df,
    kobo_list_data_filtered("[Dd]emographic"),
    cue = tarchetypes::tar_cue_age(
      name = kb_screen,
      age = as.difftime(update_interval, units = update_interval_units)
    )
  ),

Then we download and save the raw XLSX files to ../data/xlsx/screening using kobo_retrieve_save_many_xlsx(kb_screen_df, save_dir = "../data/xlsx/screening"). Finally, we convert the XLSX files to CSVs via file_load_xlsx_save_many_csv("../data/xlsx/screening", "../data/csv/screening", "Demographic"). The latter two steps are handled by the wrapper function screen_download_convert(kb_screen_df, "data/xlsx/screening", "data/csv/screening"). Here is the accompanying target:

# Download screening/demographic survey
  tar_target(
    screen_download,
    screen_download_convert(kb_screen_df, "data/xlsx/screening", "data/csv/screening"),
    cue = tarchetypes::tar_cue_age(
      name = screen_df,
      age = as.difftime(update_interval, units = update_interval_units)
    )
  ),

We can confirm that these functions have run properly, as follows:

screening_fl <- list.files("../data/csv/screening/", pattern = "*.csv$")

Survey questions

There are n=3 screening/demographic data files.

The KoBoToolbox API enables us to download a table with the questions, as follows.

targets::tar_load(kb_screen_df, store="../_targets")
kb_screen_df
      id              id_string                                    title
1 334134 agChfbnGHc6fiUdfYUaZ2z PLAY Demographic Questionnaire (Spanish)
2 359546 aGLEqT7eRBhuPgizCQBeqA           PLAY Demographic Questionnaire
3 275882 anqhvxKtQieo7gc9uPqJ3y           PLAY Demographic Questionnaire
                               description
1 PLAY Demographic Questionnaire (Spanish)
2            New Demographic Questionnaire
3                 PLAY Phone Questionnaire
                                            url
1 https://kc.kobotoolbox.org/api/v1/data/334134
2 https://kc.kobotoolbox.org/api/v1/data/359546
3 https://kc.kobotoolbox.org/api/v1/data/275882
                                                             form_url
1 https://kf.kobotoolbox.org/api/v2/assets/agChfbnGHc6fiUdfYUaZ2z.xls
2 https://kf.kobotoolbox.org/api/v2/assets/aGLEqT7eRBhuPgizCQBeqA.xls
3 https://kf.kobotoolbox.org/api/v2/assets/anqhvxKtQieo7gc9uPqJ3y.xls

Based on this information, we know that the questionnaires may be downloaded from the following URLs:

paste0("https://kf.kobotoolbox.org/api/v2/assets/", kb_screen_df$id_string, ".xls")
[1] "https://kf.kobotoolbox.org/api/v2/assets/agChfbnGHc6fiUdfYUaZ2z.xls"
[2] "https://kf.kobotoolbox.org/api/v2/assets/aGLEqT7eRBhuPgizCQBeqA.xls"
[3] "https://kf.kobotoolbox.org/api/v2/assets/anqhvxKtQieo7gc9uPqJ3y.xls"

We import and clean the screening/demographic data in the next section.