CS+: Computer Science Projects Beyond the Classroom

Applications are now open! Soft deadline: February 14, 2025

 

What is CS+?

CS+ is a ten-week summer program exclusively for Duke undergraduates to get involved in computer science research projects with faculty in a fast-paced but supportive community environment. Students participate in teams of 3-4 and are jointly mentored by a faculty project lead and a graduate student mentor. The experience is meant as a rich entry point into computer science research and applications beyond the classroom.

Logistics

  • Only students enrolled at Duke University are eligible to apply.
  • The program this summer will run from Monday, May 19, 2025 through Friday, July 25, 2025.
  • The program is held in-person, following Duke guidelines for summer programs. There is no virtual option available, and students must reside in Durham during the summer (on or off campus) to participate.
  • Students participate in this program full-time (40 hours/week). You cannot take summer courses or do other internships/fellowships while doing CS+.
  • Participants receive a stipend of $5,000 to cover expenses.
  • Applications received by Friday, February 14 will receive full consideration (afterwards applications will be considered depending on whether positions have been filled).

If you have questions about the program, please email csplus@cs.duke.edu.

Apply Now

CS+ Project Offerings Summer 2025

Lead: Kamesh Munagala

Description: The goal of this project is to explore algorithmic aspects of fairness and human alignment in societal and AI applications. Candidate problems include participatory budgeting, fair division, and reinforcement learning. The project will be exploratory in nature.
 

Goals/Deliverables: Students will:

  • Write a research paper. The work will be theoretical in nature, but the project could involve some empirical work as well.
     

Student Background/Prerequisites: Strong background in algorithms and math, with some programming experience. 
 

Lead: Kristin Stephens-Martinez

Description: Academic help-seeking is a sophisticated action. Students in modern CS/computing courses interact with a rich landscape of help resources thanks to the widespread use of autograders, human help provided by highly structured course staff, and the rise of generative AI tools. In the past few years, we have been studying computing students' help-seeking approaches, behaviors, and tendencies, yet more interesting questions remain, especially in non-introductory course contexts.

As concrete examples, we may study: (1) how do students' learning beliefs and practices relate to their help resource selection and their help-seeking tendencies? (2) how do students' experiences in using one help resource influence their usage of other resources? (3) how do students' individual help-seeking tendencies relate to their identities?
 

Goals/Deliverables: We expect the project to culminate in a research paper that can be published at a computing education research conference such as the ACM Technical Symposium on Computer Science Education (SIGCSE TS). Alternatively, students may form a team for the student research competition track at the conference.
 

Student Background/Prerequisites: 

(1) Basic data handling and analysis skills, at least at the level of CS216, are mandatory.

(2) Quantitative research skills such as hypothesis testing or statistical modelling are strongly recommended.

(3) Prior experience as a UTA providing help in any course will be useful but is not mandatory, as will rich first-hand help-seeking experience as a student.
 

Lead: Danyang Zhuo

Description: The network stack is a critical component of an operating system. Currently, bugs/vulnerabilities in the network stack can cause incorrect application behavior and may crash computer systems. The correctness of the network stack is especially important as our applications have become distributed systems themselves. In this project, students will leverage LLMs to develop a minimal but correct network stack. We will develop a network stack from first principles and also reason about its correctness. At the end of the summer, students will build a demo of the network stack.
 

Goals/Deliverables: We hope students can develop critical software pieces that will be used in our research group's projects.
 

Student Background/Prerequisites: Prior experience with LLMs is highly desirable. Preferred prerequisites: 310 and 356. Minimum requirement: 210.
 

Lead: Danfeng Zhang
 

Description: Side channels are unintended ways through which information is leaked from a computer system, often as a result of its physical or operational characteristics, rather than through explicit data exchange channels. For instance, traffic analysis might reveal information from encrypted packets via their timing, frequency, length, etc. Understanding and mitigating side-channel attacks is critical for ensuring the security of hardware, software, and cryptographic implementations.

In this project, we will explore static program analysis that can detect potential side channels in software implementations, including cryptographic implementations and network implementations. One issue that we will prioritize in the project is the scalability and precision of existing static side-channel analysis tools. If time permits, we will also explore novel ways to advance the state of the art. This project will give participating students first-hand experience with static program analysis and the emerging side-channel vulnerabilities in software implementations.
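As a small illustration of the kind of timing side channel described above, the following Python sketch contrasts a comparison that stops at the first mismatch with one that always scans the whole input. The function names and the step counter are hypothetical, chosen only for this example; the step count stands in for running time, and the sketch is not taken from the project or from any particular analysis tool. A secret-dependent early exit like the one in the first function is the sort of pattern a static side-channel analysis would aim to flag.

```python
def early_exit_equal(secret: bytes, guess: bytes):
    """Compare byte by byte and stop at the first mismatch.

    Returns (equal, steps). The number of steps depends on how many
    leading bytes of the guess match the secret, so an observer who can
    measure the work done learns something about the secret.
    """
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:  # secret-dependent branch: the leak
            return False, steps
    return True, steps


def fixed_steps_equal(secret: bytes, guess: bytes):
    """Compare every byte regardless of mismatches.

    Returns (equal, steps). For equal-length inputs the step count is
    always len(secret), so it reveals nothing about where they differ.
    """
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    diff = 0
    for s, g in zip(secret, guess):
        steps += 1
        diff |= s ^ g  # accumulate differences without branching
    return diff == 0, steps


if __name__ == "__main__":
    print(early_exit_equal(b"secret", b"sexxxx"))
    print(fixed_steps_equal(b"secret", b"sexxxx"))
```

In real Python code, the standard library's hmac.compare_digest is the usual choice for comparing secrets; the hand-written loop here only makes the idea visible and gives no actual timing guarantee.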
 

Goals/Deliverables: 

  • Learn how to use static analysis to detect side channel vulnerabilities.
  • Learn how to model system behavior and analyze system security at compile time.

Student Background/Prerequisites: Proficient in at least one programming language. Basic knowledge of program analysis and system security is a plus.
 

Lead: Alberto Bartesaghi
 

Description: Cryogenic electron microscopes, or cryo-EM for short, allow researchers to peer at the microscopic shape of cellular proteins like never before. These machines blast proteins with a 300,000-volt beam of electrons so that highly sensitive detectors underneath can tease out their shapes based on the interaction that occurs. Being able to “see” proteins (life’s crucial building materials) can help determine how they work. Recognizing protein structure and function is essential for scientists trying to design better drugs to tackle some of the world’s most devastating diseases, including HIV, cancer, COVID-19 and Alzheimer’s disease. A 300,000-volt electron beam is, however, extremely damaging to the proteins it is trying to image. To help protect the samples in the machine, researchers cryogenically freeze them to help maintain their integrity and use very low electron doses to prevent structural damage, which results in extremely noisy images.

An emerging modality of cryo-EM called cryo-electron tomography (cryo-ET) uses computerized tomography principles to provide an accurate representation of the 3D molecular architecture of entire cells. The mining of the rich information contained in the native cellular environment is hindered by the crowded nature of cells populated by many different molecular species. The accurate detection of individual molecules in 3D is a critical step towards allowing the visualization of these molecular machines at high-resolution [1]. Motivated by recent advances in deep neural network approaches for molecular pattern mining [2], this project seeks to improve these methods to detect the position of challenging macromolecules within 3D images of frozen hydrated cells with the ultimate goal of understanding cellular function and disease at the molecular level.

[1] Liu, HF., Zhou, Y., Huang, Q. et al. nextPYP: a comprehensive and scalable platform for characterizing protein variability in situ using single-particle cryo-electron tomography. Nat Methods 20, 1909–1919 (2023). https://doi.org/10.1038/s41592-023-02045-0

[2] Huang, Q., Zhou, Y. & Bartesaghi, A. MiLoPYP: self-supervised molecular pattern mining and particle localization in situ. Nat Methods 21, 1863–1872 (2024). https://doi.org/10.1038/s41592-024-02403-6
 

Goals/Deliverables: As part of this project, students will write computer code that will take as input 3D volumes of cells and automatically detect the location of multiple molecular species so they can later be extracted and used for high-resolution 3D visualization. Students will carry out the development in a dedicated high-performance computing (HPC) environment and at the end of the project will incorporate their code into the web-based application nextPYP (www.nextpyp.app). Depending on the progress made, a research paper describing their approach and presenting results obtained on real datasets will be produced.
 

Student Background/Prerequisites: Applicants should have experience with Python programming (including PyTorch/TensorFlow frameworks), distributed version control (git), and deep learning for image analysis, data science, or computer vision. Knowledge of web development frameworks or experience with high-performance computing (HPC) environments is desirable.
 

Lead: Pan Xu

Description: We conduct research in machine learning, focusing on algorithm design, analysis, and implementation. Our theoretical work primarily addresses efficient exploration and robust learning in reinforcement learning, including (1) randomized exploration and (2) distributionally robust reinforcement learning. We also engage in empirical research on (1) off-dynamics reinforcement learning, (2) large language models for decision-making, and (3) reinforcement learning applications in healthcare. More examples can be found in our recent publications (https://panxulab.github.io). Our group also has a strong track record of mentoring undergraduate interns to publish high-quality research in top machine learning conferences.

Goals/Deliverables: The purpose of this program is to provide students with a solid foundation in the theory or application of reinforcement learning, aiming to help them establish a research topic and gain a comprehensive understanding of its current progress. The focus will be on key areas such as randomized exploration methods, including Thompson sampling, and robust reinforcement learning approaches, such as distributionally robust Markov Decision Processes. Participants will study foundational research papers on reinforcement learning and bandits, guided by two recommended textbooks, "Bandit Algorithms" and "Reinforcement Learning: Theory and Algorithms". Students will also engage in discussions and analysis of recent research papers with our team. By the end of the program, the goal is to develop a working paper with the potential for refinement and submission to top machine learning conferences.
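For readers new to the topic, Thompson sampling can be sketched in a few lines for the simplest setting, a Bernoulli bandit with Beta(1, 1) priors. The function name, the simulated arm means, and the seed below are assumptions made for this illustration only; this is a textbook-style sketch, not the research group's code or method.

```python
import random


def thompson_sampling(true_means, rounds, seed=0):
    """Run Thompson sampling on a simulated Bernoulli bandit.

    true_means: hypothetical success probability of each arm (used only
                to simulate rewards; the learner never reads it directly).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible mean per arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a 0/1 reward and update that arm's posterior counts.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return [successes[a] + failures[a] for a in range(n_arms)]


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.8], rounds=2000))
```

The randomness in the posterior draws is what drives exploration: an arm with few observations has a wide posterior and is still sampled occasionally, while a clearly better arm comes to dominate the pulls.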
 

Student Background/Prerequisites: A strong foundation in linear algebra, calculus, probability theory, and stochastic processes is highly desirable. Prior research experience in reinforcement learning (RL) is not required.

Lead: Xiaowei Yang

Description: A home network connects to the Internet via a device called a home router. In the U.S., a home router often obtains an IP address from the home’s broadband provider. The IP address is dynamic and can change over time, making it difficult to connect to any home device from outside. Most commercial household IoT devices, such as cameras, can only be accessed through the vendors’ cloud services, which raises security and privacy concerns. An alternative is to use Dynamic DNS, which requires a user to create a DNS name for their home router and rely on a dynamic DNS service to update the DNS record as the household public IP changes. This approach requires purchasing a domain name and is difficult to maintain.

We want to develop a system such that a home router’s IP address changes are automatically pushed to the user’s devices through a messaging system such as email. The user can then keep a local DNS store and connect back to home IoT devices without third-party services.
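One way to picture the idea is the minimal Python sketch below: the router side builds an email message when its public IP changes, and the device side reads that message to update a local name-to-address table. Every name here (the two functions, the "old=/new=" message body, the hostname, and the example addresses) is a hypothetical choice for illustration and not part of the project's design. Actually sending the message, discovering the router's public IP, and authenticating or encrypting the notice are all left out.

```python
from email.message import EmailMessage


def build_ip_change_notice(last_ip, current_ip, sender, recipient):
    """Router side: return an EmailMessage if the IP changed, else None."""
    if current_ip == last_ip:
        return None
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Home router IP changed"
    # Hypothetical plain-text body format: one key=value pair per line.
    msg.set_content(f"old={last_ip}\nnew={current_ip}\n")
    return msg


def apply_notice(records, hostname, notice):
    """Device side: update the local name -> IP table from a notice."""
    for line in notice.get_content().splitlines():
        key, _, value = line.partition("=")
        if key == "new":
            records[hostname] = value.strip()
    return records


if __name__ == "__main__":
    # Addresses come from the documentation-only range 203.0.113.0/24.
    notice = build_ip_change_notice(
        "203.0.113.5", "203.0.113.77",
        "router@example.com", "user@example.com",
    )
    local_records = {"home.local": "203.0.113.5"}
    print(apply_notice(local_records, "home.local", notice))
```

A real system would also need to verify that a notice genuinely comes from the home router before trusting the new address.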
 

Goals/Deliverables: Several small software components and a communication protocol that together implement the described system.
 

Student Background/Prerequisites: Basic network knowledge (IP, DNS, etc.), mobile software development, and some basic knowledge of encryption.
 

Leads: Anru Zhang, Danyang Zhuo
 

Description: Research on digital twins has become a vital tool for advancing modern applications in healthcare. A digital twin is a simulator of a physical system, such as a patient, that is continuously updated with real-time data. The advent of digital twins has revolutionized predictive analytics by enabling real-time monitoring and decision support in critical healthcare.

This project aims to explore digital twins and their applications in critical care settings. Students will investigate the challenges of analyzing multi-modal, high-dimensional healthcare data and study methods for creating robust digital twin models. This project offers an exciting opportunity to explore cutting-edge research at the intersection of data science, machine learning, and healthcare.
 

Goals/Deliverables: Students will gain foundational knowledge about digital twins and learn how to design and implement them in clinical settings. They will work on real-world healthcare datasets to build models and analyze them in critical care applications.
 

Student Background/Prerequisites: Proficient Python programming. Understanding of deep learning and generative models.
 

Lead: Xiaowei Yang
 

Description: Many websites employ third-party cookies to track users' browsing behavior. In a recent study, we found that even after users turn off tracking cookies with cookie notices, numerous websites continue to use tracking cookies. One mechanism to defeat tracking is to disable all third-party cookies in web browsing. However, websites can employ other mechanisms such as cookie respawning or first-party cookie leakage. In this project, we aim to first research how these tracking mechanisms work and how prevalent they are, and then develop techniques to detect and defeat such tracking behavior.
 

Goals/Deliverables: Survey and measurements of web tracking mechanisms, and possible solutions to detect and deter them. 
 

Student Background/Prerequisites: Basic network knowledge (IP, DNS, etc.) and proficiency in at least one programming language.
 

FAQs

What is the difference between Code+, Data+, Climate+, and CS+?  All of these “plus” programs have the same model: students collaborating in teams on a project in tech/data for the same 10 weeks of the summer and receiving a stipend of the same amount. We also partner to provide some common events (talks, social events, final poster fair, etc.) in order to create a larger ecosystem of students studying in tech and data over the summer; over 100 students participated in 2019 across the programs. Each program has its own application.

  • CS+ focuses on projects in computer science research and applications and is run by the Department of Computer Science. Project leads are typically computer science faculty.
  • Data+ focuses on interdisciplinary data science projects from all over the university, and is run by Rhodes I.I.D. in Gross Hall. Project leads are typically faculty from diverse areas of the university, with frequent additional participation from community and/or industry partners.
  • Code+ focuses on projects in software and product development and is run by Duke OIT at the American Tobacco Campus in downtown Durham. Project leads are professional IT developers, with an emphasis on gaining real-world development experience.
  • Climate+ focuses on climate-related, data-driven interdisciplinary research projects on diverse topics like electricity consumption, wetland carbon emissions, climate change’s impacts on river and ocean ecosystems, and the use of remote sensing data to inform climate strategies. Project leads are data science experts as well as climate, environment, and energy researchers and practitioners, with additional participation from other project teams.

Do I apply to the program, or can I pick the projects I want to be a part of?  You can apply specifically to the projects and faculty of interest to you.

How much background do I need?  CS+ is intended for students who have some computer science experience, but students do not need to be computer science majors or rising seniors in order to apply. We welcome and encourage applications from rising 2nd and 3rd year students who have completed the introductory course sequence in computer science and have skills and interests that make them a good fit for their projects. Feel free to reach out to individual project leaders to discuss background for specific projects.
