Associations between Single Nucleotide Polymorphisms & Unexplained Lung Cancer Risk in the ARIC Study Dataset

Martin Skarzynski
Capstone Mentor: Prof. Elizabeth Platz
Johns Hopkins School of Public Health

Atherosclerosis Risk in Communities (ARIC) Study Dataset

Cancer types with the highest primary cancer incidence and cancer mortality, 1987-2012, among 14,735 at-risk ARIC participants (8,028 females, 6,707 males)

Site                       Incidence   Mortality
Colon                            364         109
Lung and bronchus                748         526
Hematopoietic/lymphatic          378         177
Melanoma                         130          14
Breast (female & male)           696         112
Prostate                         887          91
Kidney                           178          42
Bladder                          234          36
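
One way to read the table, offered as an illustrative sketch rather than as part of the planned analysis, is to divide the number of deaths by the number of incident cases for each site. This crude mortality-to-incidence ratio is not a survival estimate, but it shows why lung cancer stands out: it has the highest ratio of the eight sites (about 0.70). The variable names below are made up for this example; the counts are copied from the table.

```python
# Counts copied from the table above: site -> (incident cases, cancer deaths).
counts = {
    "Colon": (364, 109),
    "Lung and bronchus": (748, 526),
    "Hematopoietic/lymphatic": (378, 177),
    "Melanoma": (130, 14),
    "Breast (female & male)": (696, 112),
    "Prostate": (887, 91),
    "Kidney": (178, 42),
    "Bladder": (234, 36),
}

# Crude mortality-to-incidence ratio per site (a rough summary, not a survival estimate).
ratios = {site: deaths / cases for site, (cases, deaths) in counts.items()}

# Print sites from the highest ratio to the lowest.
for site, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{site:<26}{ratio:.2f}")
```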

Can SNPs explain some of the variance in lung cancer risk that remains after accounting for known risk factors?
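
One common way to approach a question like this is to compare two nested logistic regression models: one with known risk factors only, and one that adds a SNP genotype. The sketch below does this with a likelihood ratio test. It is only an illustration under loud assumptions: the data are simulated (not ARIC data), smoking stands in for "known risk factors", the effect sizes and allele frequency are invented, and all function and variable names are hypothetical. It is written in Python using only the standard library; the project itself plans to use R scripts.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def prob(beta, row):
    """Predicted probability of disease under a logistic model."""
    return 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row))))

def fit_logistic(X, y, iters=25):
    """Fit logistic regression by Newton-Raphson; return (coefficients, log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            mu = prob(beta, row)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * row[j] * row[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = sum(
        math.log(prob(beta, row)) if yi else math.log(1.0 - prob(beta, row))
        for row, yi in zip(X, y)
    )
    return beta, loglik

# Simulated cohort (ASSUMPTION: invented parameters, not ARIC data).
rng = random.Random(0)
rows, y = [], []
for _ in range(2000):
    smoker = 1 if rng.random() < 0.3 else 0           # known risk factor
    snp = sum(rng.random() < 0.3 for _ in range(2))   # 0, 1 or 2 risk alleles
    rows.append((smoker, snp))
    y.append(1 if rng.random() < prob([-2.0, 1.2, 0.4], [1, smoker, snp]) else 0)

# Baseline model: intercept + known risk factor. Full model: baseline + SNP.
X_base = [[1.0, s] for s, _ in rows]
X_full = [[1.0, s, g] for s, g in rows]
beta_base, ll_base = fit_logistic(X_base, y)
beta_full, ll_full = fit_logistic(X_full, y)

# Likelihood ratio statistic; chi-square with 1 degree of freedom under the null.
lr_stat = max(0.0, 2.0 * (ll_full - ll_base))
p_value = math.erfc(math.sqrt(lr_stat / 2.0))  # chi-square (1 df) upper tail
print(f"LR statistic = {lr_stat:.2f}, p-value = {p_value:.3g}")
```

A real analysis would adjust for more covariates and, when testing many SNPs, correct for multiple comparisons.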

Everything I need is on the cluster:

  • ARIC epidemiologic & genomic data
  • BASH & R scripts to work with the data
  • Compute resources

Working on getting access to the cluster and the dataset :)

