The Departments of Statistical Science and Computer Science have collaboratively mapped out a data science pathway for an interdepartmental major (IDM) between the two departments. This pathway makes it easier for you to identify courses relevant to a career in data science, and to plan and optimize your program of study accordingly.
Note that this IDM is intended for students interested in data science, particularly its underpinning statistical techniques, but not necessarily its lower-level computational aspects. Depending on your interests, the other options include:
- The Data Science Concentration within the COMPSCI BS major, which requires fewer courses on the mathematical and statistical foundations and focuses more on the computational aspects of data science and the practical issues that arise in applying it.
- The IDM in MATH+CS on Data Science, which covers more of the mathematical foundations of data science.
Note also that some STAT and COMPSCI courses required below need Calculus, Multivariable Calculus, Linear Algebra, and Introduction to Computer Science as prerequisites. More specifically:
- Introduction to Computer Science: one of COMPSCI 101, 102, 116, or their AP, IB, or pre-college equivalents
- Calculus: MATH 111L and MATH 112L, or their AP, IB, or pre-college equivalents
- Multivariable Calculus: one of MATH 202, 212, or 222, taken at Duke or transferred
- Linear Algebra: one of MATH 216, 218, or 221, taken at Duke or transferred
From Computer Science:
- COMPSCI 201 - Data Structures and Algorithms
- COMPSCI 210** - Intro to Computer Systems or COMPSCI 250** - Computer Architecture
- COMPSCI 330 - Design and Analysis of Algorithms
- One of COMPSCI 371 - Elements of Machine Learning, COMPSCI 370* - Intro. Artificial Intelligence, COMPSCI 570 - Artificial Intelligence, or COMPSCI 671* - Machine Learning
*NOTE: COMPSCI 370 was re-numbered from 270 in Fall 2019, and COMPSCI 671 from 571 in Spring 2019.
NOTE: COMPSCI 571 (not listed here) is cross-listed as STA 561 and can count as an elective toward the Statistics requirements.
- 3 Electives from the following (or others approved by the Director of Undergraduate Studies):
  - COMPSCI 216 - Everything Data
  - COMPSCI 230 - Discrete Math for CS or COMPSCI 232 - Discrete Mathematics and Proofs
  - COMPSCI 226 - User Research Methods in Human-Centered Computing
  - COMPSCI 260 - Computational Genomics
  - COMPSCI 316 - Introduction to Databases or COMPSCI 516 - Data-Intensive Systems
  - COMPSCI 290 - Special Topics on the following subjects (some may not be offered regularly):
    - Introduction to Applied Machine Learning (Fain)
  - COMPSCI 321/521 - Graph-Matrix Analysis
  - COMPSCI 333 - Algorithms in the Real World, previously a COMPSCI 290
  - COMPSCI 390 - Special Topics on the following subjects (some may not be offered regularly):
    - Algorithmic Foundations of Data Science (Spring 2025)
  - COMPSCI 474 - Data Science Competition
  - COMPSCI 526 - Data Science
  - COMPSCI 527 - Computer Vision
  - COMPSCI 590 - Special Topics on the following subjects (some may not be offered regularly):
    - Theory of Deep Learning (Spring 2025)
    - Generative Models: Foundations and Applications (Spring 2025)
    - Causal Inference in Data Analysis with Applications to Fairness and Explanations (Spring 2025)
  - COMPSCI 290/590 - Special Topics on the following subjects (some may not be offered regularly):
    - Algorithmic Aspects of Machine Learning
    - Algorithms for Big Data
    - Algorithmic Foundations of Data Science
    - Reinforcement Learning
**NOTE: Students who matriculated before Fall 2022 may use COMPSCI 316 in lieu of the COMPSCI 210 or COMPSCI 250 requirement. In that case, COMPSCI 210 or COMPSCI 250D can be used as one of the three electives.
From Statistics:
- STA 199 - Intro to Data Science
- STA 221L - Regression Analysis: Theory and Applications or STA 211 - Regression
- One of the following probability courses:
  - STA 240L - Probability for Statistics
  - STA 230 / MATH 230 - Probability
  - STA 230S / MATH 230S - Probability Inquiry Based Learning
  - STA 231 / MATH 340 - Advanced Introduction to Probability
  - MATH 231 - An Algorithmic Introduction to Probability and its Applications
- STA 250 - Mathematical Statistics or STA 432 - Stat Learning and Inference
- STA 360 - Bayesian Modeling
- 2 Electives from the following (or others approved by the Director of Undergraduate Studies):
  - STA 310 - Generalized Linear Models
  - STA 313 - Advanced Data Visualization
  - STA 323 - Statistical Computing
  - STA 325 - Machine Learning and Data Mining
  - STA 440 - Capstone
  - STA 444 - Spatio-Temporal Modeling
  - STA 450 - Social Network Analysis
  - STA 465 - High Dimensional Data Analysis
  - STA 561 - Machine Learning