
In a nutshell

  • ChatGPT passed a full college engineering course with a B grade (82.24%), coming close to the class average of 84.99%, simply by being fed questions, with no tutoring, prompting tricks, or human refinement involved.
  • The AI excelled at structured tasks like multiple-choice questions and auto-graded exams, but struggled with open-ended programming projects that required deeper understanding, integration, and optimization.
  • Researchers argue it’s time to rethink education, suggesting that assignments should focus less on pattern-based problem-solving and more on real-world judgment and conceptual reasoning, areas where humans still outperform AI.

URBANA, Ill. — Meet the newest engineering student at the University of Illinois: it never sleeps, never complains about workload, and just scored better than some human classmates. ChatGPT aced multiple-choice tests and stumbled through programming projects to earn a respectable B grade, all while researchers simply fed it questions with zero additional guidance.

A new study from the University of Illinois Urbana-Champaign reveals that ChatGPT can pass a challenging undergraduate engineering course with a solid B grade (82.24%), approaching but not quite reaching the class average of 84.99%. It did so with virtually no human guidance, simply by having questions pasted into it. The research will be presented at the 14th International Federation of Automatic Control Symposium on Advances in Control Education in June.

For countless students already using AI tools to complete assignments, this research confirms what many educators fear: Large Language Models (LLMs) like ChatGPT can successfully navigate much of the traditional college curriculum without breaking a sweat.

How AI Tackled a Full Engineering Course

Researchers evaluated the AI system across 115 assignments in a junior-level Aerospace Control Systems course. The study meticulously tested ChatGPT’s performance across multiple assignment types, from multiple-choice questions to complex programming tasks.

ChatGPT prompt on computer
With minimal help from humans, ChatGPT came close to the class average in a college engineering course. (Bangla press/Shutterstock)

The researchers selected ChatGPT because it’s the most accessible and widely used AI system among students. Their testing approach mimicked a minimal-effort student: questions were simply copied and pasted into the chat window, with no additional context or guidance provided.

The AI performed remarkably well on structured assignments, scoring 90.38% on multiple-choice homework (just under the class average of 91.44%) and 89.72% on exams (well above the class average of 84.81%). However, it struggled significantly with open-ended projects, scoring just 64.34% compared to students’ 80.99%.

“A student might take 20 minutes to answer a question. ChatGPT solves it in less than 20 seconds, but the correctness is sometimes questionable,” says study author Gokul Puthumanaillam from the University of Illinois, in a statement.

Where AI Falls Short

AI excels at pattern recognition but still falters when deep understanding and practical engineering judgment are required. In programming assignments, the AI showed that it could generate technically correct code, but these solutions were often inefficient and unnecessarily complex.

When writing project reports, ChatGPT showed a tendency toward “inappropriate sophistication,” using advanced terminology and precise numerical values without proper justification, a pattern that frequently concealed its lack of fundamental comprehension.

To ensure fair testing, researchers evaluated ChatGPT across three different prompting methods. The first simply uploaded screenshots of questions, the second translated questions into simplified text, and the third added relevant lecture notes before questions. As expected, providing context improved performance, but even the basic screenshot method yielded passing results.
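As a rough illustration, the three prompting strategies could be sketched as simple prompt-builders. This is a hypothetical sketch only; the function names and prompt formats are illustrative assumptions, not code from the study.

```python
# Hypothetical sketch of the study's three prompting strategies.
# All names and formats here are illustrative, not from the paper's code.

def image_prompt(screenshot_path: str) -> dict:
    """Strategy 1: hand the model a raw screenshot of the question."""
    return {"type": "image", "source": screenshot_path, "text": ""}

def text_prompt(question: str) -> dict:
    """Strategy 2: the question translated into simplified plain text."""
    return {"type": "text", "source": None, "text": question}

def context_prompt(question: str, lecture_notes: str) -> dict:
    """Strategy 3: relevant lecture notes prepended to the question."""
    return {
        "type": "text",
        "source": None,
        "text": f"Lecture notes:\n{lecture_notes}\n\nQuestion:\n{question}",
    }
```

The third builder captures the paper's finding that supplying lecture notes as context improved scores, while even the bare screenshot approach was enough to pass.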

Education in the AI Era

Rather than prohibiting AI tools, the researchers suggest redesigning courses to emphasize projects requiring deeper conceptual understanding and practical judgment, areas where humans still outperform machines.

College engineering course
Melkior Ornik lecturing in his AE 353: Aerospace Control Systems class. (Credit: The Grainger College of Engineering at University of Illinois Urbana-Champaign)

“Like calculators in math classes, ChatGPT is a tool that’s here to stay and that students will use. What the results of this study pointed out to me is that I need to adjust as an educator,” says Melkior Ornik, Puthumanaillam’s advisor at the University of Illinois. 

Students might celebrate AI shortcuts, but this cuts deeper than homework help. When a computer can pass your course without understanding fundamentals, college assessments aren’t measuring what matters. The only question is whether our educational systems will adapt or become increasingly irrelevant.

Paper Summary

Methodology

The researchers tested ChatGPT (GPT-4) on a complete undergraduate Aerospace Control Systems course (AE 353) at the University of Illinois Urbana-Champaign, evaluating its performance across approximately 115 course deliverables including homework assignments, exams, and programming projects. They used three different prompting approaches: image-based prompting (uploading screenshots), simplified mathematical notation in text form, and context-enhanced prompting that included relevant lecture notes. They employed both “zero-shot” (single attempt) and “multi-shot” (multiple attempts with feedback) approaches to simulate realistic student behavior. All of ChatGPT’s outputs were converted into gradable formats following strict translation protocols that preserved the model’s raw capabilities without human enhancement.
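The "zero-shot" versus "multi-shot" distinction above can be sketched as two small evaluation loops. This is a minimal sketch under stated assumptions: `ask_model` and `grade` are hypothetical stand-ins for the model call and the grader, not the study's actual harness.

```python
# Hypothetical sketch of zero-shot vs. multi-shot evaluation.
# `ask_model` and `grade` are assumed stand-ins, not the study's code.
from typing import Callable, Optional

def zero_shot(ask_model: Callable[[str], str], question: str) -> str:
    # Single attempt: the first answer is taken as-is.
    return ask_model(question)

def multi_shot(ask_model: Callable[[str], str],
               grade: Callable[[str], Optional[str]],
               question: str, max_attempts: int = 3) -> str:
    # Repeated attempts: grader feedback is appended to the prompt,
    # simulating a student revising after seeing comments.
    prompt, answer = question, ""
    for _ in range(max_attempts):
        answer = ask_model(prompt)
        feedback = grade(answer)  # None signals the answer passed
        if feedback is None:
            break
        prompt = (f"{question}\n\nPrevious attempt:\n{answer}\n"
                  f"Feedback:\n{feedback}")
    return answer
```

The multi-shot loop mirrors realistic student behavior: resubmitting after feedback rather than stopping at the first attempt.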

Results

ChatGPT achieved an overall grade of 82.24% (a B grade) compared to the class average of 84.99%. Performance varied significantly by assessment type. The AI excelled in structured homework assignments (90.38% vs. class average of 91.44%) and examinations (89.72% vs. class average of 84.81%), with strongest results in multiple-choice questions. However, it struggled considerably with programming projects (64.34% vs. class average of 80.99%), showing limitations in system integration, error handling, and optimization. Context-enhanced prompting consistently outperformed other methodologies across all question types. The researchers observed that the AI often used template-like structures in explanations and produced inefficient code solutions that lacked the elegance and optimization of high-quality student work.

Limitations

The study focused on a single undergraduate control systems course rather than attempting to draw broad conclusions about AI in education. All testing was done with ChatGPT (GPT-4) only, not comparing against other LLMs. The researchers intentionally set aside broader questions about educational policy and ethics of AI use in academia to focus on technical performance evaluation. The findings are specific to AE 353 during Fall 2024 and may not generalize to other courses or disciplines. Additionally, the study simulated a “minimal effort” approach and did not explore how more sophisticated prompting techniques might further improve AI performance.

Funding/Disclosures

The work was supported by the Grants for Advancement of Teaching in Engineering program at the Grainger College of Engineering, University of Illinois Urbana-Champaign. The researchers acknowledged Professor Timothy Bretl for providing course materials, Grayson Schaer for developing project environments, and Pranay Thangeda for contributing to the question bank and lecture materials.

Publication Information

The paper “The Lazy Student’s Dream: ChatGPT Passing an Engineering Course on Its Own” was authored by Gokul Puthumanaillam and Melkior Ornik from the University of Illinois Urbana-Champaign. It was published as arXiv:2503.05760v2 [cs.CY] on March 11, 2025. Additional materials including syllabus, examination papers, design projects, and example responses can be found at the project website: https://gradegpt.github.io. The research will be presented at the 14th International Federation of Automatic Control Symposium on Advances in Control Education in June.

About StudyFinds Analysis

Called "brilliant," "fantastic," and "spot on" by scientists and researchers, our acclaimed StudyFinds Analysis articles are created using an exclusive AI-based model with complete human oversight by the StudyFinds Editorial Team. For these articles, we use an unparalleled LLM process across multiple systems to analyze entire journal papers, extract data, and create accurate, accessible content. Our writing and editing team proofreads and polishes each and every article before publishing. With recent studies showing that artificial intelligence can interpret scientific research as well as (or even better than) field experts and specialists, StudyFinds was among the earliest to adopt and test this technology before approving its widespread use on our site. We stand by our practice and continuously update our processes to ensure the very highest level of accuracy. Read our AI Policy (link below) for more information.

