Research Project Summary
Children today are growing up in an immersive digital media culture, with nearly continuous interaction with mobile applications (apps), including via parents' online behaviours. Child digital engagement has extended to the realm of health, where apps target a range of health promotion and disease management foci. The perceived value of health apps has resulted in their endorsement by child health organizations and prescription by clinicians, often without a complete understanding of data privacy and security. It is known that app developers routinely transmit user data to third parties to enhance user experiences or monetize the app. Although the data sharing practices of these third parties are largely unexplored, adult health app research indicates that commercial entities may be the recipients of personal and health data. In the case of children, serious safety and privacy issues can arise if health information is used for data-driven health product and service advertising without clinician consultation. Further, data aggregators may identify child users and create digital health dossiers that eventually affect education or employment in adulthood. It is therefore critical to characterize the data sharing practices of child health apps and examine the advertising impact of transmitted data.
We conducted a cross-sectional study of top user-rated mobile apps labeled for children under 12 years available in the Apple App Store in Australia, Canada, the United Kingdom, and the United States as of July 2022. We aimed to: (1) characterize their data sharing practices by analyzing their network traffic; and (2) identify the third parties that received the information transmitted from these apps. Building on previously reported methods, we created a parent/child dummy profile and analyzed network traffic during simulated app use to identify transmission of 21 pre-specified types of user data and their network destinations. We examined the websites of identified data recipients to categorize their main activities.
Our analyses will enable the characterization of data being transmitted by child health apps, the entities receiving the data and their data sharing practices, and the resultant advertising (and potential health) impacts on children. This research will provide a crucially needed picture of the privacy risks to children associated with health app use.
Methods
This cross-sectional study is reported in accordance with the STROBE guideline (eTable 1) [1]. We conducted a network traffic and security analysis [2] to examine the data sharing practices of children's apps and a content analysis to describe the third-party data recipients.
Sampling and eligibility
We included apps that: (1) are labeled in the app store for use by a child aged 12 or under or their parents; (2) are available for download in major English-speaking markets (Australia, Canada, United Kingdom (UK), US) through the Apple App Store; (3) report collecting user data; and (4) require the user to input at least one piece of data. We developed a crawling program that systematically searched the Australian, Canadian, UK, and US Apple App Stores and ran it weekly from September 1 to September 28, 2021, using the keywords "kids," "kid," "children," and "child." The crawling program returned a list of the top 100 user-rated apps for each country store, which we screened to determine eligibility, classified by app store category, and ranked from highest to lowest user rating score. We sampled the top 5 scoring apps in each category. Using the app store metadata, we excluded all apps that were not available in English, were not present in a top 100 list for the entire 4-week sampling period, required payment for download, or had not been updated within the previous 6 months. We excluded duplicate apps with the same title, vendor, and logo.
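The screening and sampling steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the crawling program used in the study: the `App` fields, the `is_eligible` and `sample_top_apps` helpers, and the exact boundary used for the 6-month update rule are all assumptions, and the logo comparison for duplicates is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class App:
    # All field names are hypothetical stand-ins for app store metadata.
    title: str
    vendor: str
    category: str
    rating: float
    in_english: bool
    free_download: bool
    weeks_in_top_100: int     # number of the 4 weekly crawls listing the app
    months_since_update: int


def is_eligible(app: App) -> bool:
    """Apply the metadata-based exclusion criteria described above."""
    return (
        app.in_english
        and app.free_download
        and app.weeks_in_top_100 == 4
        and app.months_since_update <= 6   # assumed boundary
    )


def sample_top_apps(apps: list[App], per_category: int = 5) -> dict[str, list[App]]:
    """Drop ineligible and duplicate apps, then keep the top-rated apps per category."""
    seen = set()
    by_category = defaultdict(list)
    for app in apps:
        key = (app.title, app.vendor)      # logo comparison omitted in this sketch
        if not is_eligible(app) or key in seen:
            continue
        seen.add(key)
        by_category[app.category].append(app)
    return {
        category: sorted(group, key=lambda a: a.rating, reverse=True)[:per_category]
        for category, group in by_category.items()
    }
```

Calling `sample_top_apps(apps)` on the screened crawl results would return, for each app store category, at most five apps ordered from highest to lowest rating.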
Data collection and analysis
Pediatric clinicians on the author team created a dummy user profile that included demographic data, developmental stages, and health conditions for a parent/child dyad (eTable 2). Between February and July 2022, we downloaded apps to an iPhone 13 (iOS 14.5) and an iPhone XR (iOS 13.1.3) and used the mitmproxy tool [3] to expose and record all HTTP(S) network traffic. We purchased any required subscriptions and then ran each app by logging in, clicking available buttons, adjusting settings, and using the information from the dummy profile to explore all offered functionalities. For apps involving a parent and child, we used both iPhones to capture the network traffic for both user types. We manually reviewed all network traffic to identify transmissions of 21 pre-specified data types (e.g., name, birthdate) and recorded the domain name and IP address of each destination. We used Shodan.io and ip2location.com to identify the geographical location of IP addresses, and the WHOIS public service to identify the recipient of the user data (termed "host"). By examining the websites of entities associated with the IP address and domain name, we categorized hosts as "first party" (i.e., app developers, their parent companies, and freelance developers who make apps on their behalf), "third party" (i.e., an entity providing some type of service to the developer), or both [4]. We further categorized third parties receiving user data as either "infrastructure" providers (i.e., providing services necessary for the app to function, including cloud services and content delivery networks) or "analysis" providers (i.e., collecting and commercializing user data for the purposes of analytics, advertising, and/or engagement), which we considered higher risk in terms of privacy. We used descriptive statistics for analysis.
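Once hosts have been categorized by hand, the descriptive tally can be computed mechanically. The sketch below assumes hypothetical lookup tables (`FIRST_PARTY_DOMAINS`, `THIRD_PARTY_ROLES`) filled in from the WHOIS and website review; the domains are placeholders, and the "both" category is left out for brevity.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical lookup tables, filled in by hand from WHOIS records and
# the websites of the entities behind each host.
FIRST_PARTY_DOMAINS = {"example.com"}
THIRD_PARTY_ROLES = {
    "cdn.example.net": "infrastructure",
    "analytics.example.org": "analysis",
}


def host_of(url: str) -> str:
    """Extract the lower-cased host name from a request URL."""
    return (urlsplit(url).hostname or "").lower()


def categorize_host(host: str) -> str:
    """Label a host as first party, third party (with its role), or uncategorized."""
    if any(host == d or host.endswith("." + d) for d in FIRST_PARTY_DOMAINS):
        return "first party"
    role = THIRD_PARTY_ROLES.get(host)
    if role is not None:
        return f"third party ({role})"
    return "uncategorized"


def summarize(urls: list[str]) -> Counter:
    """Count recorded requests per host category."""
    return Counter(categorize_host(host_of(url)) for url in urls)
```

Hosts that fall through to "uncategorized" are the ones that would still need manual review.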
Ethics
This study is exempt per the University of Toronto Health Research Ethics Board.
Sources of Private Information
List of the sources of private information that we tracked during our study. While some of them clearly concern consumers' private data, others may seem irrelevant from a privacy perspective. However, when combined, such information can reveal users' characteristics and can be used to identify them.
- android_id: Unique ID of each Android device. For instance, it is used to identify devices for market downloads.
- birthday: User's birthday.
- browsing: App-related activity performed by the user (e.g., viewing pharmacies, searching for medicines).
- carrier: Mobile network operator, i.e., the provider of network communications services (e.g., Vodafone).
- connection_type: Cellular data or WiFi.
- country: Country in which the device is located (e.g., Australia).
- course-grain-location: Imprecise location. Usually indicates only the city in which the device is located (e.g., Sydney).
- device_id: IMEI code of the device.
- device_name: Name of the device (e.g., Google Pixel).
- doctor-name: Information about the user's doctor (e.g., name).
- doses: Medicine doses (e.g., 100 mg Aspirin per day).
- User's e-mail address.
- feelings: User's current feelings (e.g., happy, sad).
- gender: User's gender.
- name & lastname: User's first name and last name.
- med-conditions: User's medical conditions (e.g., past diseases).
- meds-instruction: Instructions on how to take medicines (e.g., after dinner, in the morning).
- meds-list: List of (prescribed) medicines taken by the user.
- meds-schedule: Times at which medicines are taken (e.g., 8:00 PM Aspirin).
- os_version: Device Android version.
- personal-conditions: User's personal conditions (e.g., smoker, pregnant).
- personal-factors: User's personal factors (e.g., height, weight, blood pressure, blood type).
- pharmacy-name: Information about the user's favorite pharmacies (e.g., name).
- symptoms: User's symptoms (e.g., headache).
- timezone: Timezone in which the device is located (e.g., GMT+11).
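The traffic in this study was reviewed manually. As an illustration only, a first automated pass could search captured payloads for the dummy-profile values of the data types listed above. The sketch below is an assumption, not the study's tooling: the `DUMMY_PROFILE` values are invented placeholders, and the chosen encodings (plain text, URL encoding, Base64, MD5, SHA-256) are just common ways a value may appear in transit.

```python
import base64
import hashlib
from urllib.parse import quote, quote_plus

# Hypothetical dummy-profile values, keyed by data type labels from the list above.
DUMMY_PROFILE = {
    "birthday": "2015-04-01",
    "name & lastname": "Alex Sample",
    "symptoms": "headache",
}


def candidate_encodings(value: str) -> set[str]:
    """Return common representations of a value as it might appear in traffic."""
    raw = value.encode("utf-8")
    return {
        value,                                  # plain text
        quote(value),                           # URL-encoded (%20 for spaces)
        quote_plus(value),                      # form-encoded (+ for spaces)
        base64.b64encode(raw).decode("ascii"),  # Base64
        hashlib.md5(raw).hexdigest(),           # MD5 digest
        hashlib.sha256(raw).hexdigest(),        # SHA-256 digest
    }


def find_leaks(payload: str, profile: dict[str, str] = DUMMY_PROFILE) -> set[str]:
    """Return the data types whose dummy value occurs in the payload."""
    return {
        data_type
        for data_type, value in profile.items()
        if any(token in payload for token in candidate_encodings(value))
    }
```

For example, `find_leaks("name=Alex+Sample&dob=2015-04-01")` would flag the name and birthday data types. Exact-match searching of this kind misses values that are encrypted, compressed, or otherwise obfuscated, which is one reason manual review or differential analysis [2] remains necessary.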
People
- Jessica Pimienta - University of Toronto
- Jacco Brandt - University of Twente
- Timme Bethe - University of Twente
- Ralph Holz - University of Twente and University of Münster
- Andrea Continella - University of Twente
- Lindsay Jibb - University of Toronto and Hospital for Sick Children
- Quinn Grundy - University of Toronto
References
[1] von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. Journal of Clinical Epidemiology. 2008;61(4):344-349. doi:10.1016/j.jclinepi.2007.11.008
[2] Continella A, Fratantonio Y, Lindorfer M, et al. Obfuscation-resilient privacy leak detection for mobile apps through differential analysis. In: Proceedings 2017 Network and Distributed System Security Symposium. Internet Society; 2017. doi:10.14722/ndss.2017.23465
[3] mitmproxy. mitmproxy. Published online 2022. Accessed March 2, 2023. https://mitmproxy.org/
[4] Grundy Q, Chiu K, Held F, Continella A, Bero L, Holz R. Data sharing practices of medicines related apps and the mobile ecosystem: Traffic, content, and network analysis. BMJ. Published online March 20, 2019:l920. doi:10.1136/bmj.l920