Research Projects for Summer 2025

1. Neural networks for system control

Basic skills: Python/MATLAB

Preferred skills: Control theory

Data: Simulation data from physical systems: cruise control, drones, gene expression, or others

Model: GPT-2 transformer

Number of positions available: 1

References: This project uses simple neural network architectures for data-driven control tasks and compares their performance with optimal control strategies. See the lecture notes on data-driven MPC (Findeisen et al.) and on neural network based control (Abu et al.).

Brief description: In this project, we will explore how neural networks can be leveraged for control tasks in physical systems such as autonomous vehicles, robotics, and biological modeling. The goal is to implement and test neural network-based controllers and compare their effectiveness with traditional optimal control approaches. By analyzing simulation data from various physical systems, we aim to quantify the robustness and adaptability of AI-driven control strategies.
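The comparison against an optimal baseline can be sketched on a toy problem. The example below is a minimal illustration, not the project's method: it assumes a hypothetical scalar speed-error model x[k+1] = a*x[k] + b*u[k], computes the discrete-time LQR gain by iterating the Riccati equation, and compares its quadratic cost with that of a fixed candidate gain. The candidate gain is only a stand-in for whatever a trained network would produce, and all parameter values are made up.

```python
def lqr_gain(a, b, q, r, iters=500):
    """Scalar discrete-time LQR gain via fixed-point iteration of the Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def rollout_cost(gain, a, b, q, r, x0=1.0, steps=200):
    """Accumulated cost sum(q*x^2 + r*u^2) under the linear feedback u = -gain * x."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -gain * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost


# Hypothetical model and cost weights (assumptions, not taken from the project).
a, b = 0.95, 0.1
q, r = 1.0, 0.1

k_lqr = lqr_gain(a, b, q, r)
k_candidate = 0.5  # stand-in for a gain realized by a learned controller

cost_lqr = rollout_cost(k_lqr, a, b, q, r)
cost_candidate = rollout_cost(k_candidate, a, b, q, r)
print(f"LQR cost: {cost_lqr:.3f}, candidate cost: {cost_candidate:.3f}")
```

The gap between the two costs is one simple way to quantify how far a data-driven controller is from the optimal one on a given system.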

Student(s): None

2. Safety of LLM-based control of dynamical systems

Basic skills: Python

Preferred skills: Control theory

Data: Srinivasa et al. 2019

Model: Safe transformers

Number of positions available: 1

References: Develop safety certificates for transformer models by adapting recent literature on barrier functions for transformer models (Meng et al. 2023).

Brief description: As large language models (LLMs) are increasingly applied to real-world control tasks, ensuring their safety and reliability becomes a critical concern. This project focuses on developing formal safety guarantees for LLM-based controllers in dynamical systems. Using concepts from control theory, we will investigate how safety constraints, such as barrier functions, can be integrated into transformer models to prevent failures and ensure stability.
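To make the barrier-function idea concrete, here is a minimal sketch under simplifying assumptions: a one-dimensional integrator x[k+1] = x[k] + dt*u[k], a safe set x <= x_max, and the barrier function h(x) = x_max - x. The filter enforces the discrete-time condition h(x[k+1]) >= (1 - gamma) * h(x[k]) by capping the control input. The nominal policy is a placeholder for any controller, including an LLM-based one; the function names and parameter values are hypothetical and not taken from Meng et al.

```python
def safety_filter(x, u_nominal, x_max, dt, gamma=0.5):
    """Cap u so that h(x) = x_max - x satisfies h(x_next) >= (1 - gamma) * h(x)."""
    u_bound = gamma * (x_max - x) / dt
    return min(u_nominal, u_bound)


def simulate(x0, nominal_policy, x_max, dt=0.1, steps=30):
    """Roll out the integrator x_next = x + dt * u with the filter in the loop."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = safety_filter(x, nominal_policy(x), x_max, dt)
        x = x + dt * u
        trajectory.append(x)
    return trajectory


# An intentionally unsafe nominal policy that always pushes toward the boundary.
trajectory = simulate(x0=0.0, nominal_policy=lambda x: 5.0, x_max=1.0)
print(max(trajectory))  # stays at or below x_max
```

The filter leaves the nominal input untouched whenever it already satisfies the barrier condition, so it only intervenes near the boundary of the safe set.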

Student(s): None

3. Are LLMs cheaper for control tasks?

Basic skills: Python

Preferred skills: Control theory / AI finetuning

Data: GPU usage data / theoretical data on signal energy used to control the system

Model: GPT-2

Number of positions available: 1

References: Compute the "signal energy" needed to control physical systems with LLMs by following Abu et al. 2023, then apply the same measure to other control algorithms and neural networks to analyze the "cost" of using LLMs for control tasks.

Brief description: The computational cost of using LLMs for control applications is a pressing issue in modern AI. This project aims to assess the feasibility of LLM-based control by analyzing the energy consumption and computational efficiency of GPT-2 in comparison to traditional control strategies. By studying GPU power usage and signal energy in different control scenarios, we will determine whether LLMs offer a cost-effective alternative to conventional methods.
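As a rough illustration of the signal-energy side of this comparison, the sketch below assumes the standard discrete-time definition, E = dt * sum(u[k]^2), for a sampled control signal u. Whether Abu et al. 2023 use exactly this definition is an assumption here, and the two control sequences are made-up numbers rather than results.

```python
def signal_energy(u, dt=1.0):
    """Discrete-time signal energy: dt times the sum of squared samples."""
    return dt * sum(sample * sample for sample in u)


# Made-up control sequences standing in for two controllers on the same task.
u_controller_a = [1.0, -2.0, 2.0]
u_controller_b = [0.5, -0.5, 0.5]

energy_a = signal_energy(u_controller_a, dt=0.5)
energy_b = signal_energy(u_controller_b, dt=0.5)
print(energy_a, energy_b)
```

The same function can be applied to the control inputs produced by any controller, which keeps the energy comparison independent of how the inputs were generated.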

Student(s): None

4. Develop comprehensive unit testing for AutoReduce: Formal validation of dimensionality reduction

Basic skills: Python

Preferred skills: SymPy

Data: --

Model: Ordinary differential equation models

Number of positions available: 1

References: Identify/design canonical unit tests for dimensionality reduction and build them with SymPy (using AutoReduce). Develop a formal validation framework for dimensionality reduction with ODE models and their reduced versions, considering error and robustness metrics (Pandey et al. 2024).

Brief description: Dimensionality reduction is a crucial technique in analyzing complex systems modeled by differential equations. This project aims to develop a robust validation framework for reduced-order models using AutoReduce. The student will work on designing a comprehensive set of unit tests using SymPy and integrating them into an automated validation pipeline that checks the accuracy and stability of reduced models against full-scale systems.
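One kind of check such a pipeline could run is a numerical comparison between a full model and its reduced version. The sketch below is an assumption-laden toy: it does not use AutoReduce or SymPy, the two-state model (dx/dt = -2x + y, eps * dy/dt = x - y) is a made-up example, and the reduced model dx/dt = -x comes from setting the fast variable to its quasi-steady state y = x. It reports the maximum trajectory error for a given time-scale separation eps.

```python
def simulate_full(eps, x0=1.0, y0=1.0, dt=1e-3, t_end=2.0):
    """Forward-Euler solution of dx/dt = -2x + y, eps * dy/dt = x - y."""
    x, y = x0, y0
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        dx = -2.0 * x + y
        dy = (x - y) / eps
        x, y = x + dt * dx, y + dt * dy
        xs.append(x)
    return xs


def simulate_reduced(x0=1.0, dt=1e-3, t_end=2.0):
    """Forward-Euler solution of the reduced model dx/dt = -x (obtained with y = x)."""
    x = x0
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        x = x + dt * (-x)
        xs.append(x)
    return xs


def max_reduction_error(eps):
    """Largest absolute gap between the full and reduced trajectories of x."""
    full = simulate_full(eps)
    reduced = simulate_reduced()
    return max(abs(f - r) for f, r in zip(full, reduced))


error_small_eps = max_reduction_error(0.01)  # strong time-scale separation
error_large_eps = max_reduction_error(0.5)   # weak time-scale separation
print(error_small_eps, error_large_eps)
```

A unit test built on this pattern would assert that the error stays below a chosen tolerance when the time scales are well separated, and would flag reductions applied outside that regime.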

Student(s): None

5. Automatic generation and verification of formal system specifications using differential equation descriptions

Basic skills: Python

Preferred skills: Linear analysis

Data: TruthfulQA

Model: Pacti and AutoReduce

Number of positions available: 2

References: Use code-generating LLMs to produce JSON specifications for Pacti, and verify them by integrating AutoReduce for symbolic dimensionality reduction.

Brief description: Many engineering and biological systems are modeled using differential equations, but manually deriving formal specifications is time-consuming and error-prone. This project will use large language models to automatically generate system specifications in JSON format that can be verified using Pacti. The student will also integrate AutoReduce to analyze the validity and efficiency of these generated specifications through symbolic model reduction techniques.

Student(s): None

6. Self-driving car navigation using large language models

Basic skills: Python

Preferred skills: CARLA/Unity or other backend simulation

Data: Waymo Open dataset and its variant, the WOMD Reasoning dataset.

Model: Llama 3b

Number of positions available: 3

References: Wei et al. arXiv 2024, Shi et al. arXiv 2022

Brief description: Large language models are increasingly being explored for decision-making tasks in autonomous systems. This project will investigate how LLMs can be integrated into self-driving car navigation frameworks by using large-scale datasets such as the Waymo Open dataset. The project will focus on three key areas: designing a navigation policy, training on real-world trajectory data, and evaluating performance using reinforcement learning in CARLA or Unity simulations.

Student(s): None

7. A web-app for Python-based modeling of protein expression

Basic skills: Python

Preferred skills: Web development (e.g., Flask)

Data: Experimental data from Caltech

Model: Networks with BioCRNpyler

Number of positions available: 1

References: Jurado et al. bioRxiv 2023, GitHub for Python models

Brief description: Modeling protein expression dynamics is an essential component of computational biology. This project will focus on developing a web application that allows researchers to simulate, visualize, and analyze protein expression models using BioCRNpyler. The web interface will integrate experimental data, provide interactive plotting capabilities, and offer export options for further analysis. The ideal candidate should have experience with Flask or other web frameworks.
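For a sense of what the simulation backend of such an app would return, here is a minimal hand-written sketch. It does not use BioCRNpyler; it assumes a generic two-state expression model (mRNA m and protein p, with dm/dt = k_tx - d_m*m and dp/dt = k_tl*m - d_p*p), and every rate constant is a made-up value. The function returns a time series that a web front end could plot or export.

```python
def simulate_expression(k_tx, d_m, k_tl, d_p, dt=0.01, t_end=200.0):
    """Forward-Euler simulation of mRNA (m) and protein (p) starting from zero."""
    m, p = 0.0, 0.0
    series = [(0.0, m, p)]
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        dm = k_tx - d_m * m        # transcription minus mRNA degradation
        dp = k_tl * m - d_p * p    # translation minus protein degradation
        m, p = m + dt * dm, p + dt * dp
        series.append((step * dt, m, p))
    return series


# Hypothetical rate constants (arbitrary units), chosen only for illustration.
series = simulate_expression(k_tx=2.0, d_m=0.5, k_tl=1.0, d_p=0.1)
t_final, m_final, p_final = series[-1]
print(f"t={t_final:.0f}: mRNA={m_final:.3f}, protein={p_final:.3f}")
```

At long times the model settles at m = k_tx / d_m and p = k_tl * k_tx / (d_m * d_p), which gives a quick sanity check for any simulation the app serves.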

Student(s): None

8. Autograder integration with Gradescope

Basic skills: Python

Preferred skills: AI models in Python

Data: Student data from UC Merced

Model: Llama, GPT

Number of positions available: 1

References: Frias et al. 2025

Brief description: With the increasing use of AI in grading, integrating an autograder with platforms like Gradescope can streamline the evaluation process. This project will develop an AI-enhanced grading system that allows automated feedback generation using LLMs. The goal is to enable automated unit test generation, adaptive scoring, and integration into existing university assessment workflows. The student will work with Python, AI models, and automation tools.

The work starts by integrating a Python runtime environment with Gradescope so that submitted code can be executed and tested. A first (easy) step is to write the unit tests manually and combine the unit-test grade with an LLM grade into a final score. A second step is to generate the unit tests with an LLM and grade the code against them. Further ideas: finetune a model to act as a “general unit test” creator; use subprocess to handle inputs to the code; use GPT-4 one-shot to generate the unit tests, then Llama to provide feedback and grade.
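The first step described above can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the project's design: the test cases are written by hand as (stdin, expected stdout) pairs, the LLM grade is a placeholder number rather than a real model call, and the 70/30 weighting is arbitrary. Note that subprocess only isolates the process; it is not a security sandbox for untrusted code.

```python
import subprocess
import sys


def run_submission(source, stdin_text, timeout=5.0):
    """Run student code in a separate Python process and return its stripped stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()


def unit_test_score(source, cases):
    """Fraction of (stdin, expected_stdout) cases the submission passes."""
    passed = 0
    for stdin_text, expected in cases:
        try:
            if run_submission(source, stdin_text) == expected:
                passed += 1
        except subprocess.TimeoutExpired:
            pass  # a timed-out case counts as a failure
    return passed / len(cases)


def final_score(test_score, llm_score, test_weight=0.7):
    """Weighted combination of the unit-test grade and the LLM grade (both in [0, 1])."""
    return test_weight * test_score + (1.0 - test_weight) * llm_score


# Hypothetical assignment: read an integer and print its double.
submission = "print(int(input()) * 2)"
cases = [("2\n", "4"), ("5\n", "10"), ("-1\n", "-2")]

tests = unit_test_score(submission, cases)
score = final_score(tests, llm_score=0.8)  # 0.8 is a placeholder for an LLM-assigned grade
print(tests, round(score, 2))
```

In the second step, the hand-written cases list would be replaced by LLM-generated test cases, while the scoring functions stay the same.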

Student(s): None

9. Thematic: Design an LLM app for thematic analysis of educational data

Basic skills: Python and App design

Preferred skills: Finetuning AI

Data: Real student data from UC Merced

Model: Llama / DeepSeek

Number of positions available: 3

References: Shailja et al. ASEE 2025 (under review), Xiao et al. IUI 2023

Brief description: Thematic analysis is widely used in education research to extract insights from qualitative student feedback and responses. This project will involve designing an LLM-powered application that automates thematic analysis for educational data. The system will categorize student responses, detect patterns, and generate insights using models like Llama or DeepSeek. The application will include an AI pipeline for data processing, statistical correlation tools, and an interactive GUI for researchers.

Student(s): None



