2026 Edition

A Novel Skin Cancer Detector Using Machine Learning

Karthika Hariprasad

Skin cancer is one of the most common forms of cancer worldwide, yet early detection can dramatically improve patient outcomes. Inspired by years of dermatology appointments related to eczema and skin abnormalities, I explored whether machine learning could accurately distinguish cancerous lesions from benign skin conditions. Using thousands of dermoscopic images from the International Skin Imaging Collaboration (ISIC) database, I developed and trained a neural network capable of identifying suspicious lesions. To make the technology more accessible, I also designed and built a low-cost handheld prototype that integrates image capture, machine learning, and embedded hardware into a practical screening device.

2026 Edition

PsychSPT: A Novel AI System for Mental Health Assessment using Large Language Models (LLMs)

Winston Fan

Mental health disorders affect nearly one billion people worldwide, yet many individuals struggle to access timely support and assessment. In this project, I developed PsychSPT, an artificial intelligence system that uses large language models to identify signs of loneliness, depression, and other mental health concerns from written language. Using millions of posts collected from online mental health communities, I trained and evaluated models capable of both prediction and explanation. This work explores how advances in natural language processing can contribute to earlier mental health assessment while providing interpretable results for researchers and clinicians.

2026 Edition

Topology-Informed Flood Detection from Satellite Images

Max Zhao

Flood detection from satellite imagery is often complicated by noise, cloud cover, and changing environmental conditions. In this project, I explored whether tools from topological data analysis could improve the ability of machine learning systems to recognize flooding events. Using persistent homology and satellite observations from the SEN12-FLOOD dataset, I developed models that capture large-scale geometric features of flooded regions rather than relying solely on local image patterns. This work demonstrates how mathematical ideas from topology can provide new approaches for environmental monitoring and disaster response.

2025 Edition

Linguistic Frankenstein: Reconstructing Low-Resource Languages with Natural Language Processing

Kabilan Prasanna

Thousands of languages around the world are endangered, placing unique histories, traditions, and cultural identities at risk of disappearing. In this project, I explored whether natural language processing techniques could help reconstruct missing vocabulary in low-resource languages. Inspired by challenges facing endangered Dravidian languages, I investigated how machine translation and computational linguistics can leverage related languages to generate new words and support language revitalization efforts. This work highlights how artificial intelligence can contribute to preserving cultural heritage in the digital age.

2021 Edition

2-D Analog to Segment Trees?

Jason Yang

The main problem of our project was investigating whether there is an efficient 2D analog of the segment tree. Here, instead of updating and querying arbitrary ranges of a list of numbers, we want to update and query arbitrary submatrices of a matrix of numbers. When updating a submatrix, we add an arbitrarily chosen constant to every number in the submatrix; when querying a submatrix, we find the minimum of all numbers in the submatrix. We wanted to see whether there was a data structure that could efficiently perform a sequence of these operations, one after the other, where each update can affect future queries. The answer we found is a conditional no: it holds if a specific long-standing conjecture is assumed. The conjecture is the “All-Pairs Shortest Path Conjecture”, which states that the “All-Pairs Shortest Path” problem cannot be solved in “truly subcubic” time. Since at least the 1970s, it has been known that the “All-Pairs Shortest Path” problem is as hard as computing a certain operation on matrices, called “min-plus matrix multiplication”, which is similar to the standard matrix multiplication in linear algebra but where the summation is replaced with the minimum() function and the multiplication is replaced with addition. Since the 1970s, this conjecture has also remained open, and little progress has been made towards refuting it. The key to achieving our research result was realizing that if there were an efficient 2D segment tree, then it could perform min-plus matrix multiplication in truly subcubic time, which would refute the conjecture. Therefore, if the conjecture is true, which many believe to be the case, then an efficient 2D segment tree is impossible.
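To make the min-plus operation concrete, here is a minimal Python sketch of the straightforward cubic-time version. It is not the project's reduction; the function name and the small example graph are assumptions chosen for illustration.

```python
import math

def min_plus_product(a, b):
    """Return c where c[i][j] = min over k of (a[i][k] + b[k][j])."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[math.inf] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            for j in range(p):
                # "Sum" becomes min, "product" becomes +.
                c[i][j] = min(c[i][j], a[i][k] + b[k][j])
    return c

# Hypothetical edge weights of a 3-vertex directed graph;
# math.inf marks a missing edge, 0 on the diagonal means "stay put".
INF = math.inf
weights = [
    [0, 3, INF],
    [INF, 0, 1],
    [2, INF, 0],
]

# Entry [i][j] is the cheapest route from i to j using at most two edges.
two_step = min_plus_product(weights, weights)
```

The three nested loops are exactly the cubic cost that the conjecture says cannot be improved to truly subcubic time. Repeatedly squaring the weight matrix this way yields all-pairs shortest paths, which is the link between the two problems described above.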

2018 Edition

User-Tailored Privacy by Design

Henry Sloan

The goal of my research is to determine the effects of showing different people various privacy suggestions in a Facebook-like system. In theory, this understanding could provide tools that allow users who want different amounts of privacy to achieve it conveniently. To adapt to users, however, one needs a model of the users. We picked Privacy Profiles, a previously established user model. Privacy Profiles categorize users of a social network based on the privacy features they use and are aware of. For example, people who often block others may be categorized as Privacy Maximizers. Based on this model, we designed three ways of creating adaptations and three ways of showing them. When generating suggestions, we can use optimization, which helps the user with things they already do; solidification, which helps them with features inside their profile (ones which they should be using); or self-actualization, which suggests things they might not do themselves. These are called Adaptation Methods. These adaptations can then be shown in various ways, called Introduction Methods: automation implements changes without asking the user first (with an undo button), highlighting makes features more visible or prominent, and suggestion shows a Privacy Dinosaur (based on a similar dinosaur on Facebook) to give personalized suggestions.

2014 Edition

A Naturally Efficient Computing Technique using Molecular Logic Gates with a DNA-cleaving Deoxyribozyme

Vishnu Shankar

Current computational devices and techniques are based on silicon microprocessors. Computer manufacturers have been increasing transistor density on microprocessor chips at a rate that approximates Moore’s Law, which states that the number of gates on a single chip will double every two years. Unfortunately, Moore’s Law has been predicted to reach an end because of the physical speed and miniaturization limits of silicon microprocessors. The advantages of DNA Computing include large storage capacity and an ample supply of DNA, making it a cheap natural resource, in contrast to the cost of fabricating Si-based computers. Even though it has been shown empirically that DNA computation has a slower cycle than a silicon system, the parallel processing capabilities of a DNA system are significant in solving NP-hard problems. Further motivations for studying DNA Computing or the construction of molecular scale computing devices include its scale. Biological systems through superior control have been shown to solve many complex problems while avoiding the inefficiency of current von Neumann architecture ….

2014 Edition

Investigation of Rule 73 as a Case Study of Class 4 Long-Distance Cellular Automata

Lucas Kang

That summer, I applied to and was accepted to the Wolfram Science Summer School (WSSS). WSSS2012 was hosted at Curry College in Milton, Massachusetts. At WSSS2012, I met Stephen Wolfram, members of the Wolfram Science team, and numerous computer science enthusiasts from around the world, all with unique and interesting backgrounds. It was after talking to Dr. Wolfram for the first time that I decided to study long-distance cellular automata, or LDCA, a field of cellular automata that had not been extensively documented before. I began by creating a nomenclature for LDCA, and started to study their basic characteristics … Cellular automata (CA) have been utilized for decades as discrete models of physical, mathematical, chemical, and biological systems. The most common form of CA, the elementary cellular automaton (ECA), has been studied intensively in the past due to its simple form and versatility. However, ECA are constrained to evolve according to a neighborhood of adjacent cells, which limits their sampling radius and the environments in which they can be used. The purpose of my study was to explore the behavior of one-dimensional CA in configurations other than that of ECA, namely long-distance cellular automata (LDCA), a construct that had been described in the past but never studied …
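As a rough illustration of the adjacent-cell constraint, the following Python sketch applies one update step of a one-dimensional, two-state CA. With the default offsets it is a standard ECA under Wolfram's rule numbering. The `offsets` parameter is a hypothetical way to sample non-adjacent cells and is only an assumption about what a long-distance neighborhood might look like; the excerpt does not define the LDCA neighborhood or its rule numbering, so the value 73 is borrowed from the title purely as an example.

```python
def ca_step(cells, rule, offsets=(-1, 0, 1)):
    """Advance a 1-D binary CA by one step with wraparound boundaries.

    cells   -- list of 0/1 values
    rule    -- integer rule number (0..255 for a 3-cell neighborhood)
    offsets -- positions sampled relative to each cell; (-1, 0, 1) is the
               usual ECA neighborhood, other values are illustrative only
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        # Pack the sampled cells into a binary index, leftmost offset first.
        index = 0
        for off in offsets:
            index = (index << 1) | cells[(i + off) % n]
        # The new state is the bit of the rule number at that index.
        nxt.append((rule >> index) & 1)
    return nxt

start = [0, 0, 0, 1, 0, 0, 0]
adjacent = ca_step(start, 73)                        # standard ECA neighborhood
stretched = ca_step(start, 73, offsets=(-2, 0, 1))   # hypothetical wider sampling
```

Changing only the offsets while keeping the same rule number gives a different successor row, which hints at why the sampling radius matters for the behavior being studied.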

2014 Edition

Precision Impact of Emoticons for Social Media Sentiment Analysis

Tanya Lee

It all started with social media. Like many Facebook fans of my age, a significant part of my life was spent on social media. As we take knowledge from the infinite pool of cyberspace, cyberspace, in return, instilled appalling social habits, and my social interactions simply became competitions of who can glue eyes to their screen the most. Consequently, for me (and my 819 friends), my speech patterns regressed to a level akin to OMG LOL I have to get to class. I lived in social media, knowing it inside and out … In sophomore year, I had an opportunity to put my social media expertise to some use as a paid summer intern at a Silicon Valley startup that automatically tracks public opinions and sentiments from social media. Their system uses natural language technology to do sentiment analysis of consumer opinions about a brand or topic … My initial job was to incorporate social media jargon into the system, especially the emotional expressions from Urban Dictionary. I was also assigned to test entries from Facebook fan pages, sorting positive sentiment from negative. I soon immersed myself in my work routine but noticed that the system always disregarded smiley faces (emoticons) as these are things beyond words, extra-linguistic symbols. As visible representations of emotion, isn’t that a missed opportunity to help gather sentiment? A happy face like :) usually denotes a positive tone of sentiment while a sad face :( a negative tone. Intuitively, it should help the system for the purpose of sentiment analysis … This research presents a novel study of how emoticons can help sentiment analysis precision. Data analysis shows that emoticons alone cannot determine sentiments towards a brand and they can only be used together with other evidence. Further study has discovered a use of emoticons as counter evidence to block glaring errors in sentiment analysis …
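One way to picture emoticons acting as counter evidence is the small Python sketch below. It is not the startup's system: the emoticon lists, the function names, and the choice to withhold a label on conflict are all assumptions made for illustration. The label passed in stands for whatever the text-based analysis already produced.

```python
# Hypothetical emoticon lexicons; a real system would use a larger list.
POSITIVE_EMOTICONS = {":)", ":-)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}
OPPOSITE = {"positive": "negative", "negative": "positive"}

def emoticon_polarity(text):
    """Return 'positive', 'negative', or None based only on emoticon counts."""
    tokens = text.split()
    pos = sum(t in POSITIVE_EMOTICONS for t in tokens)
    neg = sum(t in NEGATIVE_EMOTICONS for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return None

def checked_sentiment(text, text_label):
    """Keep the text-based label unless the emoticons point the opposite way.

    Emoticons never assign a label on their own here; they can only veto,
    in which case None is returned to mean "withhold this result".
    """
    hint = emoticon_polarity(text)
    if hint is not None and OPPOSITE.get(text_label) == hint:
        return None
    return text_label

blocked = checked_sentiment("Great, my flight got cancelled again :(", "positive")
kept = checked_sentiment("Love this phone :)", "positive")
```

Treating the emoticon as a veto instead of a vote matches the finding that emoticons alone cannot determine sentiment, while still letting them block an obviously wrong label.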

2008 Edition

Chip-Firing Analysis of Stabilization Behaviors, Hitting Times, and Candy-Passing Games

Paul Kominers

Math can be an intimidating field. To work on some problems, one must know decades or centuries of background before one can even understand the question. However, what tends to get lost in all of that is that math can be fun, even for the relatively uninitiated. There are problems in mathematics that are discrete (essentially, self-contained), and with some combination of background research, mathematical thought, and appropriate mentoring, they are easily within reach of the high school student. Generally, random walks on graphs are approximated by computing the expected hitting time, or the expected number of random moves required to go from one vertex to another. Although random walks are useful in mathematics and computer science, probabilistic systems do not offer sufficient precision for some applications. There are, however, several emerging methods of deterministically simulating random walks which can be used to more efficiently compute hitting times [4, 6]. One such deterministic simulation uses a process known as chip-firing.
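For reference, the expected hitting time of a simple random walk can be computed exactly on a small graph by solving a linear system. The Python sketch below does that with exact fractions. It is a baseline illustration only, not the chip-firing method the paper goes on to study, and the function name and the example path graph are assumptions.

```python
from fractions import Fraction

def expected_hitting_times(adj, target):
    """Exact expected number of steps to reach target from every vertex.

    adj maps each vertex to a list of neighbors (connected, undirected graph
    assumed). Solves h(target) = 0 and h(v) = 1 + average of h over neighbors.
    """
    nodes = [v for v in adj if v != target]
    pos = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    # Row for v encodes: deg(v) * h(v) - sum of h(u) over neighbors u = deg(v).
    rows = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in nodes:
        row = rows[pos[v]]
        row[pos[v]] += len(adj[v])
        row[n] = Fraction(len(adj[v]))
        for u in adj[v]:
            if u != target:          # h(target) = 0 contributes nothing
                row[pos[u]] -= 1
    # Gauss-Jordan elimination over exact fractions.
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for i in range(n):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[col])]
    times = {v: rows[pos[v]][n] for v in nodes}
    times[target] = Fraction(0)
    return times

# Path graph 0 - 1 - 2 - 3, walking until vertex 3 is reached.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
times = expected_hitting_times(path, 3)
```

Solving the system directly costs cubic time in the number of vertices, which is one reason faster or deterministic alternatives are of interest.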

2007 Edition

Slope of the Rate Distortion Function

Jeffrey Wang

Computer Science

The next hurdle was to figure out what information theory was really about. Since it is traditionally taught as an undergraduate or graduate level course, I dedicated the several weeks I had before RSI started to learning everything I could about information theory. This was already far too little time to digest an entire field, but you would be surprised at how much you can cover when you are focused on a single subject.

2007 Edition

The Effects on Read Performance from the Addition of a Long Term Read Buffer to YAFFS2

Sam Neubardt

Hard drives have much smaller sectors than flash memory, which does give them a slight advantage in performance. While the data on a hard drive can be changed many times without causing physical wear on the drive, the data on a flash memory device can only be changed a certain number of times (often over 500,000 times per page, but it still adds up) before the reliability of the drive starts to degrade.

2007 Edition

TopicWeb: A Novel Approach to Automatic Document Similarity Measurement and Categorization

Stanislav Nikolov

Computer Science

I eventually picked the topic of automatic summarization of documents, which involves creating a shortened version of a document by analyzing the most important topics, or themes, in the document. The more I read about the field, the more interesting and feasible it seemed.

2005 Edition

Discovery of the Predictors of the Standard Heats of Formation of Group 1 and 7 Compounds: A Heuristic Genetic Algorithm with Multiple Regression

Swarup Sai Swaminathan

As a student in the Medical Sciences Specialized Learning Center of Freehold High School in Freehold, New Jersey, I was given the opportunity to conduct independent research for the complete duration of my junior year. Having been given permission to complete any type of research, I wanted to challenge myself by working in a unique research field: I attempted to combine my knowledge and interest in the life sciences with computer science. A friend suggested the use of regression modeling in my project, as he advised that empirically estimating values was becoming essential in today’s scientific world. With helpful guidance from friends, teachers, and advisors, I continued to narrow down my field of research.

2005 Edition

Web Based Searches

Jennifer Ding

Computer Science

My initial inspiration to perform scientific research in the field of computer science came from my parents, whom I admire for their endless interest and successful efforts in their field of work. As a young child, I had visited their offices and seen some of their work. Although at the time I didn’t completely understand their jobs, I was amazed that the computer had the power to create such an impact on people’s lives. When I started my science research in high school, I knew it would be the perfect opportunity to experience firsthand what I had witnessed as a child. However, I also had personal aspirations. As a longtime art enthusiast and visual learner, I wanted to find a field that combined my passion for art with my interests in computers, research that would affect me personally. My science research teachers in high school have always encouraged me to pursue a field that I am interested in. My mentor has also helped me to select and focus on a valuable research area. Ultimately, I found the field of Information Visualization, which deals with human-computer interaction through visualized information.