CANCER DATA SCIENCE 101

Formulating the Question

Martin Skarzynski

2019-04-23

What is data science?

Data science terms

  • Workflow
  • Tidy (Wrangle, Munge)
  • Exploratory Data Analysis
  • Feature Engineering
  • Machine Learning
  • Deep Learning
  • Supervised learning (Classification, Regression)
  • Unsupervised learning (Clustering, Dimensionality Reduction)
  • Underfitting (Bias)
  • Overfitting (Variance)

  • R for Data Science
  • ModernDive

How is data science done?

  • Visualize ~ Exploratory Data Analysis
  • Transform ~ Feature Engineering
  • Model ~ Machine Learning

Exploratory Data Analysis

What is AI?

Supervised versus unsupervised

Logistic regression
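
As a minimal sketch of what logistic regression does, the block below fits a one-feature model by gradient descent using only the Python standard library. The data values, the function names (sigmoid, fit_logistic) and the learning-rate settings are illustrative assumptions, not material from the slides.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Made-up, already standardized feature values (not real patient data).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
probs = [sigmoid(w * x + b) for x in xs]
preds = [1 if p > 0.5 else 0 for p in probs]
print(f"w={w:.2f}  b={b:.2f}  predictions={preds}")
```

The model outputs a probability between 0 and 1; thresholding it at 0.5 turns the regression into a classifier, which is why logistic regression is listed under supervised classification.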

Breast cancer article

Underfitting and overfitting

Validation curve
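
A validation curve compares training and validation scores as model complexity changes. The sketch below is one hedged way to compute such a curve with the standard library; it swaps in a k-nearest-neighbours classifier as a stand-in model (smaller k means a more flexible model), and the data generator, noise rate and k values are all assumptions made for illustration.

```python
import random

def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, data, k):
    """Fraction of points in data that k-NN (fit on train) labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

def make_data(n, rng):
    """Made-up data: label is 1 when x > 0, then flipped 20% of the time."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x > 0 else 0
        if rng.random() < 0.2:
            y = 1 - y
        data.append((x, y))
    return data

rng = random.Random(0)
train = make_data(60, rng)
valid = make_data(60, rng)

# Odd k values avoid tied votes; k=1 is the most flexible setting.
ks = [1, 5, 15, 59]
curve = [(k, accuracy(train, train, k), accuracy(train, valid, k)) for k in ks]
for k, train_acc, valid_acc in curve:
    print(f"k={k:2d}  train={train_acc:.2f}  validation={valid_acc:.2f}")
```

At k=1 every training point is its own nearest neighbour, so the training score is perfect regardless of label noise. A gap between that score and the validation score is the usual sign of overfitting (variance), while low scores on both sets suggest underfitting (bias).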

Decision tree

Decision tree levels

Decision tree overfitting
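
To make the link between tree depth and overfitting concrete, here is a small sketch of a one-feature decision tree grown to different maximum depths and scored on training and held-out points. It uses Gini impurity to pick thresholds; the data (including one deliberately atypical label at x=3), the function names and the dictionary tree layout are assumptions for illustration only.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    """Most common label; ties go to 1."""
    return 1 if 2 * sum(labels) >= len(labels) else 0

def build_tree(points, max_depth):
    """Grow a tree on (x, label) pairs; each level adds one threshold test."""
    labels = [y for _, y in points]
    if max_depth == 0 or gini(labels) == 0.0:
        return {"leaf": majority(labels)}
    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(points)
        if best is None or score < best[0]:
            best = (score, t)
    if best is None:  # all x values identical, nothing to split on
        return {"leaf": majority(labels)}
    t = best[1]
    return {
        "threshold": t,
        "left": build_tree([p for p in points if p[0] <= t], max_depth - 1),
        "right": build_tree([p for p in points if p[0] > t], max_depth - 1),
    }

def predict(tree, x):
    """Walk from the root to a leaf and return that leaf's label."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]

def accuracy(tree, points):
    return sum(predict(tree, x) == y for x, y in points) / len(points)

# Made-up data: labels follow "x > 4.5", except the atypical point at x=3.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 1), (6, 1), (7, 1), (8, 1)]
held_out = [(1.5, 0), (2.5, 0), (3.5, 0), (4.5, 0), (5.5, 1), (6.5, 1), (7.5, 1)]

results = []
for depth in (1, 2, 3):
    tree = build_tree(train, depth)
    results.append((depth, accuracy(tree, train), accuracy(tree, held_out)))
    print(f"depth={depth}  train={results[-1][1]:.3f}  held-out={results[-1][2]:.3f}")
```

On this toy data the depth-1 tree misses the atypical training point but labels every held-out point correctly, while the depth-3 tree fits all training points and loses held-out accuracy. Scoring on held-out points rather than the training set is what exposes the difference.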

Decision tree evaluation
