Investment is the best word that summarizes Agam Shah’s journey as a graduate student at Georgia Tech.
That is clearest on the surface, where Shah studied how public statements by businesses and financial institutions shape market behavior. At a deeper level, though, his success was buoyed by support from professors and his mentorship of younger students.
Shah’s ability to connect with and invest in others led him to partner with Georgia Tech colleagues and start a financial technology business. He returns to campus this week to officially graduate from Tech, giving us a chance to catch up about his grad school experience and life as an entrepreneur.
Graduate: Agam Shah
Research Interests: Quantitative and computational finance, artificial intelligence, natural language processing, large language models (LLMs)
Education: Ph.D. in Machine Learning, home unit in the School of Computational Science and Engineering (CSE)
Faculty Advisors: Scheller College of Business Professor Sudheer Chava and School of CSE Associate Professor Chao Zhang
What persuaded you to attend graduate school at Georgia Tech?
Georgia Tech’s dedicated College of Computing strongly appealed to me. I was particularly drawn to the interdisciplinary nature of its machine learning Ph.D. program and the School of Computational Science and Engineering, both of which align well with my research interests.
What research project(s) from Georgia Tech are you most proud of and why?
I am proud of all 20-plus research papers I have had the opportunity to contribute to at Georgia Tech. However, if I had to choose one, it would be my work on Federal Open Market Committee (FOMC) text analysis, which was also highlighted in the news.
This work is not only well-cited in academic literature, but the language model developed in the paper is also actively used by economists at many of the world’s top central banks, including researchers at the FOMC and the Bank of England. It is also used by leading financial institutions such as BlackRock and Daiwa Securities. Since its release, the model has achieved over 100,000 downloads on Hugging Face.
What more can you tell us about your startup, ZettaQuant?
ZettaQuant aims to solve one of the biggest challenges in using LLMs and agents: working effectively with massive underlying datasets. We serve as a layer between raw data and LLMs, helping distill billions of tokens into the relevant context that models can use.
As a deep-tech startup, we are actively engaging with industry practitioners to better understand how to design and engineer our system to integrate seamlessly with their evolving AI workflows. Given the complexity of the problem we are tackling, particularly in advancing document intelligence systems, we are currently very focused on research and foundational development.
How did your Georgia Tech education prepare you for starting ZettaQuant?
Not just my education, but my entire experience at Georgia Tech, extending beyond the classroom, prepared me for this journey. I met my co-founders at Georgia Tech, and many of the initial use cases we are exploring at ZettaQuant are built on open-source research I conducted there.
In addition to research, I mentored more than 300 students through the Vertically Integrated Project “NLP for Financial Markets.” This experience taught me how to manage teams and think about building systems with a long-term vision.
What advice would you give someone interested in graduate school?
Most people pursue graduate school after already completing more than 15 years of education. Also, people who are admitted to a top school like Georgia Tech are often already well-positioned to secure strong job opportunities. So, graduate school should provide value beyond what you could learn outside the classroom.
Before deciding, think carefully about what you hope to gain from graduate school that you cannot gain otherwise. Once you enroll, take full advantage of the faculty, research labs, networks, and seminars. Many students underutilize these opportunities during their undergraduate and graduate years.
I would also like to quote the epilogue of my Ph.D. thesis: ‘Advice is abundant; conviction must be your own.’ Build a strong conviction about what you want to achieve from graduate school before committing to it.
What did you do for fun and relaxation while attending Georgia Tech? Do you still keep up with these now?
This may sound unconventional, but I spent a significant amount of time mentoring and teaching throughout my Ph.D. Many of my mentees went on to gain admission to top graduate programs. This included two students I mentored for all four years of their undergraduate studies who later joined the ML Ph.D. program at Georgia Tech. They are now teaching and mentoring students, completing a full-circle journey.
Working with mentees and supporting their growth gives me a strong sense of fulfillment and serves as a form of relaxation. In addition, I enjoy listening to music, especially while coding, and I continue to do that today.
What is your favorite Georgia Tech memory?
If I had to choose one favorite memory, beyond the many exciting late nights in the lab, it would be proposing to my wife on Tech Green at Georgia Tech. She is also a Yellow Jacket, having completed her undergraduate degree here and currently pursuing her Ph.D. Our home truly is a hive of Yellow Jackets.
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu
When Chengrui Li walks across the stage this Thursday at Commencement, it will be his final, and perhaps easiest, performance at Georgia Tech.
Between orchestra concerts, magic shows, and yo-yo exhibitions, Li thrives in the limelight. In fact, not much rattles his nerves considering the five years of pressure he endured studying computational neuroscience at Tech.
Before he returns to New York City to continue building brain-interface technologies at Meta, we caught up with Li to learn how he keeps such a cool head at Georgia Tech and beyond.
Graduate: Chengrui Li
Research Interests: Computational neuroscience, eye-tracking experiments and data analysis, statistical machine learning
Education: Ph.D. in Computational Science and Engineering (CSE)
Faculty Advisor: School of CSE Assistant Professor Anqi Wu
What persuaded you to attend graduate school at Georgia Tech?
I completed my undergraduate studies at Sichuan University in China. We knew that the most cutting-edge technology and research were in the United States, so I participated in an undergraduate exchange program at the University of Tennessee, Knoxville, during my third year.
I wanted to pursue a Ph.D. in neuroscience while also becoming very proficient in math and computer science (CS). This led me to apply to the CSE Ph.D. program over others. Georgia Tech’s CS ranking is very high, and the CSE program is very interdisciplinary, which matched my expectations super well. I did attain a solid education in math and CS at Georgia Tech. I also advanced my interest in neuroscience and its application by studying mathematical models and algorithms.
What research project from Georgia Tech are you most proud of?
My variational importance sampling paper is a favorite. That one was based heavily on statistical inference. I spent many hours working through complicated derivation calculations, often half-awake and half-asleep after several late nights.
This paper confirmed to me, though, that innovative research requires both hard work and inspiration, and that this endeavor can be rewarding. The paper was selected as a top 5% spotlight paper at ICLR 2024, a world-leading conference on artificial intelligence research.
Could you share more about your role as a research scientist at Meta?
I have been working on Meta’s electromyography (EMG) neural band. This next-generation human-computer interaction device connects with and navigates Meta’s AI glasses.
With the neural band, you can use finger gestures to control the display content you see through the glasses, like swiping your thumb to scroll the screen, or writing on your lap as if you had a pen in your hand to send WhatsApp messages.
How did your Georgia Tech education prepare you for this role?
By pursuing my Ph.D., I became more proficient in critical thinking, math, coding, and presentation. During my interview, I demonstrated these skills and shared my publication record. This helped me land an internship, enabled my success in that role, and led to a full-time position. Additionally, my background in computational neuroscience closely matched the work on the EMG neural band team at a big tech company.
What advice would you give someone interested in graduate school?
First, be clear whether a bachelor’s or master’s degree meets your work needs, or if you are truly interested in a scientific research topic. This interest should be based on your own passion, not the current trends. Interest is an important factor in deciding to pursue a Ph.D. because you have to like the topic and like it for a long time. A Ph.D. will require you to dive deep into a subject you must be genuinely curious about.
Second, we are in a new era with rapid advances in information technology. Time is an invaluable resource and is shaped by technology. You have to think more about your time, consider where and how you spend it, and embrace ways to use it more efficiently.
Can you tell us more about your hobbies and how you keep up with them?
I started learning violin when I was five years old, and magic tricks when I was 11. The brain is a supercomputer suitable for functional computation. Our brain is an interface between the objective and subjective, where computation plays a core role in integrating these exact mechanics into interpretations of the world. This realization was one of the important factors that inspired me to pursue my Ph.D. research in computational neuroscience.
Another comparison I’ve learned after playing violin for 23 years is that the cochlea in our inner ear is a fast Fourier Transformer that simultaneously computes the aesthetic of music for us. Performing magic tricks for 17 years taught me that all the occurrences of seemingly low-probability magic phenomena are achieved by either letting it be a certain event or exhausting all possibilities.
I also have other hobbies, like yo-yo. I enjoy performing all these skills in front of audiences. Performing brings me satisfaction when I see excitement and happiness from the people I entertain. I am very grateful to my parents for their cultivation and encouragement in doing things that bring me fulfillment. They taught me to be curious and explore my interests, to enjoy pastimes, and to never give up on my passions. These were not secondary things that distracted me from coursework or Ph.D. research, but rather complementary parts of my life that bring out the best in me.
What is your favorite Georgia Tech memory?
I have a lot. For my research, I debated frequently with Anqi Wu, my advisor. These debates often went late into the night as I defended my stances. They challenged my beliefs and made me a stronger scholar, and I am grateful to Anqi for her time and patience.
I also enjoyed performing in the Georgia Tech symphony orchestra with our great conductor, Chaowen Ting. I was involved with the Georgia Tech Chinese Students and Scholars Association, where I showcased magic and yo-yo performances at organization events.
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu
At Georgia Tech, undergraduate students are an integral part of the research enterprise – particularly when it comes to neuroscience. That dedication to undergraduate research was on full display on April 8, when more than 100 students from Atlanta-area universities gathered for the annual ATL Neuro Networking and Symposium Night.
This student-run event, hosted by the Georgia Tech Student Neuroscience Association (SNA) and co-sponsored by the Institute for Neuroscience, Neurotechnology, and Society (INNS) and the Neuroscience Undergraduate Program at Georgia Tech, aimed to bring together students and faculty from the broader Atlanta neuroscience community for an evening of data-blitz talks showcasing faculty research, undergraduate poster presentations, and catered networking.
“Our goal was to bridge the gap between Atlanta’s institutions and showcase the diversity of undergraduate research,” says Harshin Vijay, symposium director of SNA. “By bringing these groups together through SNA, we’re fostering an ecosystem where the next generation of scientists can exchange ideas and build collaborative networks essential for future innovation.”
The impact of undergraduate neuroscience research is “more than bench to bedside,” said INNS Executive Director Chris Rozell at the event. “It’s about advancing neuroscience and neurotechnology to improve society through discovery and innovation. Undergraduate research catalyzes innovation – invigorating and advancing educational programs through collaboration that empowers society – fueling impact and fostering the community of next-generation scientists.”
The symposium featured more than 40 undergraduate posters, with research topics ranging from the impact of music on associative memory to the role of taste projection neurons in Drosophila. Some students even examined their own coursework, either as TAs or through their involvement with capstone research.
“There are neuroscientists in every College at Georgia Tech, and we have undergraduate neuroscience students performing research all over campus and in the broader Atlanta neuroscience community,” says Katharine McCann, the director of Undergraduate Research for Georgia Tech’s neuroscience program. “Events like this bring those students together to learn from each other and broaden their networks. It is exciting to see so many students passionate about their research.”
Four posters were awarded for their work:
Best Poster Design: “Role of Taste Projection Neurons in Drosophila Taste Processing”
- Hanti Jiang, Emory University
Best Presentation: “Neuroscience and Computer Science Roots of Pattern Recognition”
- Rishi Polepally, Georgia Tech
- Aryan Kumar, Georgia Tech
- Vedanth Natarajan, Georgia Tech
Best 4001 Group: “Evaluating Cognitive Engagement in AI-Generated VS. Human-Created Educational Content”
- Hannah Ammari, Georgia Tech
- Shobini Palaniappan, Georgia Tech
- Rayhan Quraishi, Georgia Tech
- Aryan Shah, Georgia Tech
- Divya Tadanki, Georgia Tech
People's Choice Award: “Vibration as an effective facilitation of sensorimotor learning in Blaptica dubia cockroaches”
- Diana Sethna, Georgia Tech
- Jacob Hayes, Georgia Tech
- Ellie Kate Watson, Georgia Tech
- Arya Oak, Georgia Tech
- Esha Panse, Georgia Tech
- Hersh Mathur, Georgia Tech
News Contact
Writer: Hunter Ashcraft
Communications Student Assistant
Institute for Neuroscience, Neurotechnology, and Society
Media Contact: Audra Davidson
Research Communications Program Manager
Institute for Neuroscience, Neurotechnology, and Society
Earlier this year, Georgia Tech researchers showed that specially designed lenses could harvest energy from ambient wireless signals, pointing toward a future of battery-free sensors embedded throughout smart cities and digital infrastructure.
But powering devices is only part of the challenge. Enabling those same systems to communicate at modern data rates is much harder. That’s the leap the team is now making. The same lens-based approach is being used to unlock high-speed communication once considered out of reach for ultra-low-power systems.
In a study published in Nature Communications, researchers in Professor Manos (Emmanouil) Tentzeris’ Agile Technologies for High-performance Electromagnetic Novel Applications (ATHENA) lab demonstrated a first-of-its-kind lens-enabled backscatter system capable of multi-gigabit data rates, reaching up to 4 gigabits per second (Gbps). At the same time, it operates using only a fraction of the power required by conventional wireless devices — bringing high-speed connectivity to systems that were never meant to support it.
For years, backscatter has been treated as a tradeoff: extremely low power, but extremely limited performance. Rather than generating its own radio signal, a backscatter device modulates and reflects existing wireless transmissions to communicate, allowing it to operate with minimal energy.
As a result, backscatter has typically been used only to send small amounts of data, most often in simple identification and sensing systems.
“What we’ve shown is that backscatter doesn’t have to be slow,” said Marvin Joshi, the research lead and Ph.D. candidate in the School of Electrical and Computer Engineering. “With the right architecture, it can operate at gigabit‑per‑second speeds while remaining ultra‑low power.”
The Lens That Makes It Possible
The Georgia Tech team’s dielectric lens — similar in spirit to an optical lens — focuses incoming millimeter-wave energy onto an array of tiny antenna elements, enabling both wireless energy capture and high‑speed backscatter communication within the same system.
The system reshapes and reflects existing wireless signals, with each element modulating the reflected signal to enable high-speed data transmission without requiring a traditional transmitter.
At millimeter-wave frequencies, used by 5G and future 6G systems, there is plenty of available bandwidth, but signals at these frequencies are highly directional and sensitive to alignment.
In practice, that means even small misalignment can break the link. This has been a major limitation for real-world deployment. The lens overcomes that constraint by enabling high gain and wide angular coverage simultaneously, without the need for active beam steering.
“Think of it like a camera lens for wireless signals,” said Tentzeris, who is an Ed and Pat Joy Chair Professor in ECE. “It captures energy coming from many different directions and focuses it efficiently onto the device.”
The result is a system that can communicate over a ±55-degree field of view, maintaining strong performance even when the device and the reader are not perfectly aligned.
Fiber-Level Speeds, Nearly Zero Power
In controlled experiments, the researchers achieved data rates of up to 4 Gbps, with sustained gigabit communication at distances of up to 20 meters, using high-order modulation schemes like those used in modern cellular networks.
For a system that doesn’t generate its own signal, those numbers are unexpectedly efficient. The system operates at just 0.08 picojoules per bit — approaching million-fold improvements compared to conventional wireless radios.
“To put that in perspective,” Tentzeris said, “a typical wireless transmitter burns milliwatts of power. This system operates at essentially near-zero power while pushing the data rates 1,000 times higher than what traditional backscatter could do.”
Taken together, the results point to a fundamentally different class of wireless system, according to Tentzeris, one that combines high data rates with ultra-low power in a way that hasn’t been demonstrated before.
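To relate the energy-per-bit figure to the power levels mentioned in the quote, the two quoted numbers can be multiplied together. The sketch below is only back-of-the-envelope arithmetic. It assumes the 0.08 picojoules per bit applies at the peak 4 Gbps rate, which the article does not state, and the function name is illustrative.

```python
PICO = 1e-12  # joules per picojoule
GIGA = 1e9    # bits per gigabit


def modulation_power_watts(energy_per_bit_pj: float, data_rate_gbps: float) -> float:
    """Average power in watts = (joules per bit) x (bits per second)."""
    joules_per_bit = energy_per_bit_pj * PICO
    bits_per_second = data_rate_gbps * GIGA
    return joules_per_bit * bits_per_second


# Figures quoted in the article; pairing them is an assumption.
power_w = modulation_power_watts(energy_per_bit_pj=0.08, data_rate_gbps=4.0)
power_mw = power_w * 1e3
```

Under that assumption, the product works out to roughly 0.32 milliwatts spent on communication at the peak rate.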
Based on standard wireless modeling, the team estimates the technology could support Gbps communication over distances of kilometers when paired with existing 5G millimeter-wave infrastructure, extending high-speed, ultra-low-power links far beyond what has been achievable with backscatter systems.
“That combination is exactly what future wireless networks are moving toward. This capability aligns naturally with next‑generation 6G systems,” said Tentzeris, pointing to the growing importance of Integrated Sensing and Communication (ISAC) and Joint Communication and Sensing (JCAS) frameworks that require simultaneous communication, sensing, and localization.
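The article does not say which model is behind the kilometer-range estimate. One common starting point for this kind of projection is the free-space (Friis) path-loss formula, sketched below. The 28 GHz carrier is an assumed example of a 5G millimeter-wave band, not a figure from the study, and the sketch leaves out lens and antenna gains, tag modulation loss, and receiver sensitivity, all of which a real link budget needs.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def free_space_path_loss_db(distance_m: float, freq_hz: float) -> float:
    """One-way free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


def backscatter_round_trip_loss_db(distance_m: float, freq_hz: float) -> float:
    """Round-trip loss when the reader both illuminates the tag and receives
    its reflection (monostatic case): the signal crosses the distance twice."""
    return 2.0 * free_space_path_loss_db(distance_m, freq_hz)


ASSUMED_CARRIER_HZ = 28e9  # hypothetical example frequency, not from the article

loss_20m = backscatter_round_trip_loss_db(20.0, ASSUMED_CARRIER_HZ)
loss_40m = backscatter_round_trip_loss_db(40.0, ASSUMED_CARRIER_HZ)
extra_loss_per_doubling = loss_40m - loss_20m
```

In this simplified model, each doubling of distance costs about 6 dB one way and about 12 dB round trip, which is why range is hard to extend in backscatter links and why high-gain focusing elements matter.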
From Smart Cities to Disaster Response
But speed and efficiency are only part of the story. Because the devices are low-cost, lightweight, and printable, they could be deployed at massive scale on buildings, roads, vehicles, drones, or wearable systems.
In a smart city, thousands of these tags could continuously exchange information about traffic, air quality, or structural health without ever needing batteries. That means dense, always-on sensing and communication without worrying about power or upkeep.
In disaster zones, temporary high-speed networks could be set up almost instantly, without cables or power infrastructure.
“Imagine an ambulance transmitting high-resolution medical images in real time, or first responders building a live digital map of a disaster area,” Joshi said. “You get fiber-like performance, but completely wireless and energy-efficient.”
What’s Next
The architecture also lends itself to intelligent optimization, where AI-based control can be enabled to dynamically enhance signal capture and system efficiency, further expanding performance in large-scale deployments.
“This is really about adding intelligence to anything, anywhere,” Tentzeris said. “When communication becomes this fast, efficient, and scalable, entirely new applications become possible.”
With the core architecture now demonstrated, the ATHENA Lab team is shifting focus from proof‑of‑concept to deployment. That means moving out of the lab and into real-world environments. The next phase includes testing the system outdoors, integrating it onto drones and mobile platforms, and exploring flatter, more compact lens designs that could be easier to mount on real-world infrastructure.
“We’re thinking about how this fits into the broader wireless ecosystem,” Joshi said. “We’ve shown what’s possible. Now the question is how far we can push it in the real world.”
News Contact
Dan Watson
Titan, Msholo, Kelly, and Tara are just like any other African elephants — intelligent creatures that require mental stimulation in their everyday lives.
They would normally get this in their natural habitats while foraging for food and staying alert to predators that might target calves.
However, the four elephants reside at Zoo Atlanta, so they don’t have to worry about these things.
That’s why zoo caretakers are always on the lookout for better ways to help their elephants exercise their brains.
The caretakers at Zoo Atlanta found one when they met Arianna Mastali, a Ph.D. student in Georgia Tech’s School of Interactive Computing. Mastali designed an audio enrichment wall to help stimulate Zoo Atlanta’s elephants.
Many zoos build concrete enrichment walls to foster elephant problem-solving and critical thinking. The walls usually have holes for the elephants to reach through with their trunks as they search for food, treats, or playful objects on the other side.
Mastali enhanced Zoo Atlanta’s enrichment wall by adding an interactive audio component. A nearby speaker system emits distinctive low-frequency tones when an elephant sticks its trunk into a hole.
“They’re intelligent creatures that require a lot of complexity in their habitat,” Mastali said. “We wanted to add to that complexity while giving them more control.”
Experimenting in the Wild
Mastali’s system uses cameras and computer vision to detect when an elephant’s trunk is inside a hole and then sends a signal to the speakers to play a sound.
Mastali is a member of the Georgia Tech Animal Lab, directed by School of IC professor Melody Jackson. The lab often uses sensing technology to enhance animal wellness.
Mastali said she tried incorporating sensing devices into her project several times. She constructed an insert made of PVC pipe and attached a sensor to its base that used infrared beams to detect the elephant’s trunk.
However, she said it was difficult to account for the elephants’ strength. Their trunks would break the insert after a day or two.
She pivoted toward computer vision to remove the risk of damage and keep the enrichment wall as close to natural as possible.
“A big lesson we learned was that using existing materials the elephants are already familiar with was the best way to do things, and it simplified our design process,” she said.
Shane Rosse, a student in Georgia Tech’s Online Master of Science in Computer Science (OMSCS) program, assisted Mastali with the computer vision component.
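The article does not describe how the vision component decides that a trunk is in a hole. As a rough illustration of the general idea only, the sketch below assumes a fixed camera, a stored image of the empty wall, and a rectangular region of interest around one hole. It flags the hole as occupied when that region differs enough from the empty image, and it reports the frames where a tone would start. All names and the threshold value are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]           # grayscale image, pixel values 0-255
Roi = Tuple[int, int, int, int]   # top, left, bottom, right (bottom/right exclusive)


def roi_change(frame: Frame, background: Frame, roi: Roi) -> float:
    """Mean absolute pixel difference between a frame and the empty-wall image,
    measured only inside the region of interest around one hole."""
    top, left, bottom, right = roi
    total = 0
    count = 0
    for row in range(top, bottom):
        for col in range(left, right):
            total += abs(frame[row][col] - background[row][col])
            count += 1
    return total / count


def trunk_in_hole(frame: Frame, background: Frame, roi: Roi,
                  threshold: float = 30.0) -> bool:
    """Treat the hole as occupied when the region changed more than the threshold."""
    return roi_change(frame, background, roi) > threshold


def tone_trigger_frames(frames: List[Frame], background: Frame, roi: Roi,
                        threshold: float = 30.0) -> List[int]:
    """Indices of frames where the hole goes from empty to occupied,
    which is where this sketch would signal the speakers."""
    triggers = []
    was_inside = False
    for index, frame in enumerate(frames):
        inside = trunk_in_hole(frame, background, roi, threshold)
        if inside and not was_inside:
            triggers.append(index)
        was_inside = inside
    return triggers
```

Triggering only on the empty-to-occupied transition is a design choice made for this sketch. A deployed system would also have to cope with lighting changes, shadows, and other animals, which simple background differencing handles poorly.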
Enhancing Environmental Enrichment
Mastali observed the elephants’ behavior at the wall seven days before and seven days after the installation of the audio enrichment system.
The number of times the elephants approached the wall after installation increased by 176%, and time spent at the wall increased by 71%.
“We weren’t sure at first if they would care that much, so it was great to see how much time they spent at the wall, especially our less dominant females,” said Kirby Miller, senior elephant caretaker at Zoo Atlanta. “They seem to like it the most.”
Miller said the elephants used to only approach the wall when they knew there was food behind it. That started to change after the audio enrichment system was installed.
“We would be off somewhere else, and we’d hear the speaker playing the sounds, and we knew there wasn’t any food back there,” Miller said. “Tara had her trunk in one of the holes, just listening to the sound. That let us know they do like it, and they’re very curious about it.”
Miller said because elephants have sharp memories and acute senses of hearing and smell, their habitats must be designed with that in mind.
Zoo Atlanta’s African Savanna elephant habitat was redesigned in 2019. In addition to the enrichment wall, it includes a bathing pond, two waterfalls, and swing boom devices that hold hay for elephants to eat as they would in the wild.
Miller said elephants sheltered at any zoo or conservation facility would benefit from enrichment devices enhanced by technology.
“I think anything they can participate in that gives them choice and control is great for all zoo elephants,” she said. “It depends on the elephants, but with our elephants, they can hear much higher frequencies than we can. That noise isn’t that loud for us, but for them, they’re feeling that noise, and they can hear much more, which makes it more stimulating for them.”
News Contact
Nathan Deen
College of Computing
Georgia Tech
Generative artificial intelligence (AI) is best known for creating images and text. Now, it is helping industries make better planning decisions.
Georgia Tech researchers have created a new AI model for decision-focused learning (DFL), called Diffusion-DFL. Recent tests showed it makes more accurate decisions than current approaches.
Along with optimizing industrial output, Diffusion-DFL lowers costs and reduces risk. Experiments also showed it performs across different fields.
Diffusion-DFL doesn’t just surpass current methods; it also predicts more accurately as problem sizes grow. The model requires less computing power despite these high-performance marks, making it more accessible to smaller enterprises.
Diffusion-DFL runs on diffusion models, the same technology that powers DALL-E and other AI image generators. It is the first DFL framework based on diffusion models.
“Anyone who makes high-stakes decisions under uncertainty, including supply chain managers, energy operators, and financial planners, benefits from Diffusion-DFL,” said Zihao Zhao, a Georgia Tech Ph.D. student who led the project.
“Instead of optimizing around a single forecast, the model evaluates many possible scenarios, so decisions account for real-world risk and become more robust.”
To test Diffusion-DFL, the team ran experiments based on real-world settings, including:
- Factory manufacturing to meet product demand
- Power grid scheduling to meet energy demand
- Stock market portfolio optimization
In each case, Diffusion-DFL made more accurate decisions than current methods. It also performed better as problems became larger and more complex. These results confirm the model’s ability to make important decisions in real-world scenarios with noisy data and uncertainty.
The experiments also show that Diffusion-DFL is practical, not just accurate. Training diffusion models is expensive, so the team developed a way to reduce memory use. This cut training costs by more than 99.7%. As a result, Diffusion-DFL can reach more researchers and practitioners.
“Our score-function estimator cuts GPU memory from over 60 gigabytes to 0.13 with almost no loss in decision quality, reducing the requirement for massive computing resources,” Zhao said. “I hope this expands Diffusion-DFL into other domains, like healthcare, where decisions must be made quickly under complex uncertainty.”
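For readers unfamiliar with the term, a score-function estimator approximates the gradient of an expected value using only samples and the gradient of their log-probability, so it never differentiates through the sampling procedure or the objective itself. The sketch below shows the generic identity on a one-dimensional Gaussian. It is not the estimator from the Diffusion-DFL paper, and the function and parameter names are illustrative.

```python
import random
from typing import Callable


def score_function_gradient(objective: Callable[[float], float],
                            theta: float,
                            sigma: float = 1.0,
                            num_samples: int = 200_000,
                            seed: int = 0) -> float:
    """Estimate d/d(theta) of E[objective(x)] for x ~ Normal(theta, sigma^2).

    Uses the identity: gradient = E[objective(x) * d/d(theta) log p(x; theta)].
    The objective is only evaluated, never differentiated.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(theta, sigma)
        score = (x - theta) / sigma ** 2  # d/d(theta) of log Normal(x; theta, sigma^2)
        total += objective(x) * score
    return total / num_samples


# Check against a case with a known answer:
# E[x^2] = theta^2 + sigma^2, so the exact gradient at theta = 1 is 2.
estimate = score_function_gradient(lambda x: x * x, theta=1.0)
```

The estimate is noisy, which is the usual trade-off with this family of estimators: lower memory and no backpropagation through sampling, in exchange for higher variance.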
Beyond decision-making applications, Diffusion-DFL marks a shift in DFL techniques and in the broader use of generative AI models.
In supply chain management, planners estimate future demand before deciding how much product to stock. In this DFL problem, engineers align ML models with predetermined decision objectives, like minimizing risk or reducing costs.
One flaw of DFL methods is that they optimize around a single, deterministic prediction in an uncertain future.
Diffusion-DFL takes a different approach. Instead of making a single guess, it determines a range of possible outcomes. This leads to decisions based on many likely scenarios, rather than on a single assumed future.
To do this, the framework uses diffusion models. These generative AI models create high-quality data such as images, text, and audio.
The forward diffusion process gradually adds noise to data until it becomes pure noise. Models trained on this process learn to reverse it. This means they can start with noise and then generate meaningful samples that resemble the training examples.
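The forward noising step can be sketched in a few lines. The version below follows the widely used formulation in which a noisy sample at step t is a weighted mix of the original data and Gaussian noise. The linear noise schedule and its values (1,000 steps, from 0.0001 to 0.02) are common defaults assumed here for illustration and are not taken from the Diffusion-DFL work.

```python
import math
import random
from typing import List


def linear_beta_schedule(num_steps: int, beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> List[float]:
    """Per-step noise variances, increasing linearly (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """Running product of (1 - beta): the fraction of signal variance left at each step."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0: List[float], t: int, alpha_bars: List[float],
                    rng: random.Random) -> List[float]:
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * value + noise_scale * rng.gauss(0.0, 1.0) for value in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
slightly_noisy = forward_diffuse([1.0, 2.0, 3.0], 0, alpha_bars, random.Random(0))
almost_pure_noise = forward_diffuse([1.0, 2.0, 3.0], 999, alpha_bars, random.Random(0))
```

At the first step almost all of the original signal remains, while at the last step the signal weight is close to zero. A trained model runs this process in reverse, one denoising step at a time.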
Real-world data is often noisy and uncertain. Traditional DFL methods struggle in these conditions, but diffusion models are designed to handle them.
Because of this, Diffusion-DFL can explore many possible outcomes and choose better actions. Like image-generation AI, the model works well with complex data from different sources. This enables its use across different industries.
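The difference between planning around one forecast and planning around many scenarios can be shown with a small stocking problem. The sketch below is not Diffusion-DFL itself: the demand scenarios and cost values are made up for illustration, and a plain list of scenarios stands in for samples that a generative model would produce.

```python
from typing import List

# Hypothetical demand scenarios (units) and per-unit costs, for illustration only.
DEMAND_SCENARIOS = [80.0, 90.0, 100.0, 110.0, 150.0]
SHORTAGE_COST = 3.0   # cost per unit of unmet demand
OVERSTOCK_COST = 1.0  # cost per unit left unsold


def expected_cost(quantity: float, scenarios: List[float],
                  shortage_cost: float, overstock_cost: float) -> float:
    """Average cost of stocking `quantity` across all demand scenarios."""
    total = 0.0
    for demand in scenarios:
        total += shortage_cost * max(demand - quantity, 0.0)
        total += overstock_cost * max(quantity - demand, 0.0)
    return total / len(scenarios)


def point_forecast_decision(scenarios: List[float]) -> float:
    """Stock exactly the single average forecast."""
    return sum(scenarios) / len(scenarios)


def scenario_based_decision(scenarios: List[float], shortage_cost: float,
                            overstock_cost: float) -> float:
    """Pick the candidate quantity with the lowest average cost over all scenarios.
    Candidates are the scenario values themselves, which is sufficient for this cost."""
    return min(scenarios,
               key=lambda q: expected_cost(q, scenarios, shortage_cost, overstock_cost))


point_q = point_forecast_decision(DEMAND_SCENARIOS)
scenario_q = scenario_based_decision(DEMAND_SCENARIOS, SHORTAGE_COST, OVERSTOCK_COST)
point_cost = expected_cost(point_q, DEMAND_SCENARIOS, SHORTAGE_COST, OVERSTOCK_COST)
scenario_cost = expected_cost(scenario_q, DEMAND_SCENARIOS, SHORTAGE_COST, OVERSTOCK_COST)
```

With these made-up numbers, stocking the average forecast of 106 units costs 38.4 on average, while the scenario-based choice of 110 units costs 36.0. Because shortages are costlier than overstock here, accounting for the high-demand scenario pays off.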
“Diffusion models have achieved significant success in generative AI and image synthesis, but our work shows their potential extends far beyond that,” said Kai Wang, an assistant professor in the School of Computational Science and Engineering (CSE).
“What makes Diffusion-DFL unique is that the specific downstream application guides how the model learns to handle uncertainty.
“Whether we are scheduling energy for power grids, balancing risk in financial portfolios, or developing early warning systems in healthcare, we can explicitly train these highly expressive models to navigate the unique complexities of each domain.”
Zhao and Wang collaborated with Caltech Ph.D. candidate Christopher Yeh and Harvard University postdoctoral fellow Lingkai Kong on Diffusion-DFL. Kong earned his Ph.D. in CSE from Georgia Tech in 2024.
Wang will present Diffusion-DFL on behalf of the group at the upcoming International Conference on Learning Representations (ICLR 2026). Occurring April 23-27 in Rio de Janeiro, ICLR is one of the world’s most prestigious conferences dedicated to artificial intelligence research.
“ICLR is the perfect stage for Diffusion-DFL because it brings together the exact community that needs to see the bridge between generative modeling and high-stakes decision-making for real-world applications,” Wang said.
“Presenting Diffusion-DFL allows us to challenge the traditional training framework of diffusion models. It’s about sparking a broader conversation on how we can align the training objectives of generative AI directly with actual, downstream decision-making needs.”
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu
While people use search engines, chatbots, and generative artificial intelligence tools every day, most don’t know how they work. This sets unrealistic expectations for AI and leads to misuse. It also slows progress toward building new AI applications.
Georgia Tech researchers are making AI easier to understand through their work on Transformer Explainer. The free, online tool shows non-experts how ChatGPT, Claude, and other large language models (LLMs) process language.
Transformer Explainer is easy to use and runs on any web browser. It quickly went viral after its debut, reaching 150,000 users in its first three months. More than 563,000 people worldwide have used the tool so far.
Global interest in Transformer Explainer is set to continue when the team presents the tool at the 2026 Conference on Human Factors in Computing Systems (CHI 2026). CHI, the world’s most prestigious conference on human-computer interaction, will take place in Barcelona, April 13-17.
“There are moments when LLMs can seem almost like a person with their own will and personality, and that misperception has real consequences. For example, there have been cases where teenagers have made poor decisions based on conversations with LLMs,” said Ph.D. student Aeree Cho.
“Understanding that an LLM is fundamentally a model that predicts the probability distribution of the next token helps users avoid taking its outputs as absolute. What you put in shapes what comes out, and that understanding helps people engage with AI more carefully and critically.”
A transformer is a neural network architecture that converts an input sequence into an output sequence. It does this by learning context and tracking mathematical relationships between the components of a sequence. Because text, audio, and images can all be represented as sequences, transformers are common in generative AI models.
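In most transformers, these relationships between sequence components are computed with scaled dot-product attention. The following is a minimal sketch in plain Python, not code from Transformer Explainer. The three token vectors are made up for illustration, and the same vectors stand in for queries, keys, and values, whereas a real model would first pass them through learned projections.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # How strongly this position relates to every position in the sequence
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors
        out = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Hypothetical 2-dimensional vectors for a three-token sequence (assumption)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
for row in result:
    print([round(x, 3) for x in row])
```

Each output row blends information from the whole sequence, with more weight given to the positions whose vectors are most similar to the current one.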
Transformer Explainer demystifies how transformers work. The platform uses visualization and interaction to show, step by step, how text flows through a model and produces predictions.
Using this approach, Transformer Explainer impacts the AI landscape in four main ways:
- It counters hype and misconceptions surrounding AI by showing how transformers work.
- It improves AI literacy among users by removing technical barriers and lowering the barrier to entry for learning about AI.
- It expands AI education by helping instructors teach AI mechanisms without extensive setup or computing resources.
- It influences future development of AI tools and educational techniques by providing a blueprint for interpretable AI systems.
“When I first learned about transformers, I felt overwhelmed. A transformer model has many parts, each with its own complex math. Existing resources typically present all this information at once, making it difficult to see how everything fits together,” said Grace Kim, a dual B.S./M.S. computer science student.
“By leveraging interactive visualization, we use levels of abstraction to first show the big picture of the entire model. Then users click into individual parts to reveal the underlying details and math. This way, Transformer Explainer makes learning far less intimidating.”
Many users don’t know what transformers are or how they work. The Georgia Tech team found that people often misunderstand AI. Some label AI with human-like characteristics, such as creativity. Others even describe it as working like magic.
Furthermore, barriers make it hard for students interested in transformers to start learning. Tutorials tend to be too technical and overwhelm beginners with math and code. While visualization tools exist, these often target more advanced AI experts.
Transformer Explainer overcomes these obstacles through its interactive, user-focused platform. It runs a familiar GPT model directly in any web browser, requiring no installation or special hardware.
Users can enter their own text and watch the model predict the next word in real time. Sankey-style diagrams show how information moves through embeddings, attention heads, and transformer blocks.
The platform also lets users switch between high-level concepts and detailed math. By adjusting temperature settings, users can see how randomness affects predictions. This reveals that probabilities, rather than creativity, drive AI outputs.
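The effect of temperature can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the tool's own code: the three candidate words and their scores (logits) are invented, and the function name `softmax_with_temperature` is hypothetical. It uses the standard formulation in which scores are divided by the temperature before being converted to probabilities.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores (logits) into next-token probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words (assumption)
vocab = ["cat", "dog", "car"]
logits = [2.0, 1.0, 0.1]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    summary = ", ".join(f"{w}={p:.3f}" for w, p in zip(vocab, probs))
    print(f"temperature={t}: {summary}")
```

Lower temperatures concentrate probability on the top-scoring word, so sampled output is more predictable. Higher temperatures flatten the distribution, so less likely words are chosen more often.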
“Millions of people around the world interact with transformer-driven AI. We believe that it is crucial to bridge the gap between day-to-day user experience and the models' technical reality, ensuring these tools are not misinterpreted as human-like or seen as sentient,” said Ph.D. student Alex Karpekov.
“Explaining the architecture helps users recognize that language generated by models is a product of computation, leading to a more grounded engagement with the technology.”
Cho, Karpekov, and Kim led the development of Transformer Explainer. Ph.D. students Alec Helbling, Seongmin Lee, and Ben Hoover, along with alumni Zijie (Jay) Wang (Ph.D. ML-CSE 2024) and Minsuk Kahng (Ph.D. CS-CSE 2019), assisted on the project.
Professor Polo Chau supervised the group and their work. His lab focuses on data science, human-centered AI, and visualization for social good.
Acceptance at CHI 2026 stems from the team winning the best poster award at the 2024 IEEE Visualization Conference. This recognition from one of the top venues in visualization research highlights Transformer Explainer’s effectiveness in teaching how transformers work.
“Transformer Explainer has reached over half a million learners worldwide,” said Chau, a faculty member in the School of Computational Science and Engineering.
“I'm thrilled to see it extend Georgia Tech's mission of expanding access to higher education, now to anyone with a web browser.”
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu
The Atlanta Community-Engaged Research Student Network launched this semester. The program is co-led by Nicole Kennard, assistant director for Community-Engaged Research with the Brook Byers Institute for Sustainable Systems (BBISS), along with Associate Professor Richard Milligan and Associate Professor Sarah Ledford from Georgia State University, Associate Professor Emily Burchfield and Associate Teaching Professor Carolyn Keogh from Emory University, and Iesha Baldwin from Spelman College. The program also partners with several community-based organizations to co-develop strategic direction and provide training. They are Science for Georgia, Historic Westside Gardens, HBCU Green Fund, South River Watershed Alliance, and Food Well Alliance.
The primary aim of the Atlanta Student Community-Engaged Research (CER) Network is to use a peer learning approach to equip graduate students with the skills to co-lead community-engaged and locally focused research, while at the same time building relationships with local community organizations. This approach will help address local sustainability and societal challenges, lay the foundation for community-engaged research programs, and enable young researchers interested in this work to thrive in the Atlanta area. Initial funding for the pilot program was provided by the Atlanta Global Studies Center and the Georgia Tech Provost's Excellence in Graduate Studies fund.
The program received a total of 41 applications from graduate students at Georgia Tech, Georgia State University, and Emory University. Thirty-five master’s and Ph.D. students were accepted into the cohort, spanning a wide range of disciplines, including the humanities, sciences, design, public health, engineering, and computing. The program has additionally engaged eight senior-level undergraduates from Spelman College to learn about graduate school tracks with community-engaged research opportunities.
This program provides a unique opportunity to learn engagement and leadership skills not typically taught in graduate programs. Students are attending one training a month over the course of the Spring 2026 semester. Here, they learn about the diversity of sustainability-focused, community-based organizations in the area, develop skills to engage meaningfully with community partners in research projects, and improve the ways they communicate to the public about research.
The Georgia Tech Provost's Excellence in Graduate Studies fund will provide a $2,500 stipend to five Georgia Tech students who will work on a research project with a community partner organization. These projects will take place over the spring and summer semesters this year, providing opportunities for graduate students to apply their newly acquired community-engagement skills to on-the-ground research, while also opening a new pathway for Georgia Tech’s engagement with community partners.
Fellows and projects include:
- Irene Jacob, M.S., city and regional planning, will work with the Food Well Alliance to update the implementation strategy for their 10-year community garden survey.
- Ethan Zhao, M.S., human-computer interaction, will work with Historic Westside Gardens to integrate new technologies into their community garden spaces and assess the benefits to the communities they serve.
- Virginia Cason, M.S., sustainable energy and environmental management, will work with Science for Georgia to translate data gathering and analysis into community-centered narratives.
- Sharon Rachel, Ph.D., history and sociology of technology and science, will work with the HBCU Green Fund to examine the environmental and community impacts of data center projects in Atlanta.
- Ella Neumann, Ph.D., interactive computing, will work with the South River Watershed Alliance to document and communicate the history and impact of the City of Atlanta's combined sewer consent decree, and assess if the intended results of the decree have been met.
Applicants expressed their passion for community-engaged research projects and working directly with local community members and organizations:
“Lived experience is just as valuable as academic expertise, and meaningful change only occurs when both work together. I think that this takes approaching problems with a lot of humility, care, and a genuine desire to listen to communities and their needs.” -Virginia Cason, M.S., sustainable energy and environmental management
“I want to do research that stems from a theoretical question, but is feasible in reality and benefits the community. One of the most efficient ways to achieve this goal is through doing research WITH the community.” -Keke Li, M.S., analytics
“Community-engaged research is not only a methodology, but a commitment to partnership, humility, and shared power.” -Grace Fraser, M.S., city and regional planning
“To me, community-engaged research means working with people, not just for them. CER is not only a method but also a mindset. True impact comes when research and community experience grow together.” -Bingjie Lu, Ph.D., civil engineering
The community partners involved in the program are equally enthusiastic about community-engaged research. As Fred Conrad of Food Well Alliance put it, “Food Well has been intentional about engaging our constituents since we began, and this is not only a continuation of that effort, but a significant refinement of how we accomplish that. I think all of us have deepened our understanding of the CER process since we began this journey.”
News Contact
Brent Verrill, Research Communications Program Manager, BBISS
Whether it’s a fire or a flood, a ship’s crew can only rely on itself and its training in emergencies at sea. The same is true for crews facing digital threats on oil tankers, cargo ships, and other commercial vessels.
New cybersecurity research from the Georgia Institute of Technology, however, revealed that crews aboard commercial vessels were often not adequately prepared to manage cyberattacks effectively due to systemic training gaps.
The findings are based on the researchers’ interviews with more than 20 officer-level mariners, conducted to assess the maritime industry’s readiness to handle cybersecurity attacks at sea.
“Historically, cybersecurity research has focused heavily on cyber-physical systems like cars, factories, and industrial plants, but ships have largely been overlooked,” said Anna Raymaker, Ph.D. student and lead researcher.
“That gap is concerning when more than 90% of the world’s goods travel by sea. Recent incidents, from GPS spoofing to ships linked to subsea cable disruptions, show that maritime systems are increasingly part of the global cyber threat landscape.”
The researchers proposed four practical strategies to strengthen maritime cyber defenses and close the training gaps. Their findings were presented recently at the ACM SIGSAC Conference on Computer and Communications Security (CCS).
1. Make Cybersecurity Training Actually Maritime
Many of those interviewed for the study described current cybersecurity training as “boilerplate” — generic modules that don’t reflect real shipboard risks.
Researchers recommend:
- Role-specific instruction: Navigation officers should learn to detect and identify GPS spoofing. Engineers should focus on vulnerabilities in remotely monitored systems.
- Bridging IT and Operational Technology: Crews need to understand how attacks on IT systems can trigger physical consequences in operational technology — including collisions, groundings, or explosions.
- Hands-on delivery: Replace passive PowerPoints with drills and in-person exercises that build muscle memory.
- Accessible standards: Training must account for the wide range of educational backgrounds across crews and be standardized across ranks.
2. Move Beyond “Call IT”
At sea, crews can’t simply escalate a cyber incident to a shore-based IT department and wait. Operational resilience requires onboard readiness.
Researchers recommend:
- Vessel-specific response plans: Ships need clear, actionable protocols for threats such as AIS jamming or radar manipulation.
- Military-style drills: Adopting EMCON (Emission Control) exercises, which are used by the U.S. Military Sealift Command, can train crews to operate safely without electronic systems.
- Stronger connectivity controls: High-bandwidth satellite systems like Starlink introduce new risks. Clear policies and network segregation are essential to prevent new entry points for attackers.
Related Article: When GPS lies at sea: How electronic warfare is threatening ships and their crews by Anna Raymaker
3. Create Unified, Ship-Specific Regulations
Maritime cybersecurity regulations are often reactive and fragmented. Researchers argue the industry needs a cohesive, domain-specific framework.
Key recommendations include:
- A unified global model: Like the energy sector’s NERC CIP standards, a maritime framework could mandate baseline controls such as encryption, network segmentation, and anonymous incident reporting.
- Rules built for real crews: Regulations designed for large naval operations don’t translate well to smaller merchant or research vessels. Standards must reflect actual shipboard conditions.
- Future-proofing requirements: Autonomous ships and remotely operated vessels expand the cyber-physical attack surface. Regulations must proactively address these emerging technologies.
4. Invest in Maritime-Specific Cyber Research
Finally, the researchers stress that long-term resilience requires deeper technical research focused on maritime systems.
Priority areas include:
- Real-time intrusion detection systems tailored to shipboard protocols.
- Proactive security risk assessments of interconnected onboard systems.
- Cyber-physical modeling to better understand cascading failures in complex maritime environments.
The Bottom Line
Cyber threats at sea are no longer hypothetical. Mariners report real-world incidents ranging from GPS spoofing to ransomware that disrupts global trade.
“Through our interviews with mariners, I saw firsthand how much dedication and pride they take in their work,” said Raymaker. “Our goal is for this research to serve as a call to action for researchers, policymakers, and industry to invest more attention in maritime cybersecurity and support the people who risk their lives every day to keep global trade, food, and energy moving.”
A Sea of Cyber Threats: Maritime Cybersecurity from the Perspective of Mariners was presented at CCS 2025. It was written by Raymaker and her colleagues: Ph.D. students Akshaya Kumar, Miuyin Yong Wong, and Ryan Pickren; Research Scientist Animesh Chhotaray; Associate Professor Frank Li; Associate Professor Saman Zonouz; and Georgia Tech Provost and Executive Vice President for Academic Affairs Raheem Beyah.
News Contact
John Popham
Communications Officer II School of Cybersecurity and Privacy
The in-state rivalry between the Yellow Jackets and the Bulldogs usually heats up when Georgia Tech visits the University of Georgia. However, one Saturday last month, the focus shifted from competition to collaboration.
The Georgia Scientific Computing Symposium (GSCS) held its annual meeting on February 21 in Athens. Since 2009, the event has hosted researchers from across the Peach State to showcase homegrown advances in scientific computing.
The symposium highlighted Georgia’s reputation as a computing innovation hub. People from around the world come to Georgia universities to lead computing research. By advancing science, engineering, medicine, and technology, their work improves communities at home and abroad.
Faculty and students from Georgia Tech, UGA, Georgia State University, and Emory University presented at the symposium. Georgia Tech participants came from the colleges of Computing, Engineering, and Sciences.
This year’s organizers agreed to meet in Atlanta for the 2027 symposium. Georgia Tech’s School of Computational Science and Engineering (CSE) will host the 19th GSCS.
“From healthcare to computer chip design, scientific computing underpins many of the technological advances we see in our lives,” said Professor Edmond Chow, associate chair of the School of CSE.
“Scientific computing provides the mathematical models, simulations, and data‑driven tools that make modern innovation possible. It allows people to analyze complex systems, test ideas virtually before building them, and make faster, more accurate decisions across nearly every sector of society.”
Professor Haomin Zhou and Assistant Professor Helen Xu delivered two of the symposium’s five plenary talks.
Zhou presented a new method for solving the Schrödinger equation, a landmark equation in quantum mechanics. Drawing inspiration from the mathematics used in generative artificial intelligence models, his approach develops an algorithm that more effectively simulates waves, particle motion, and other physical systems.
Xu focused on improving how computers move and organize data during complex calculations. Her work uses “cache-friendly” layouts that help computers access data more efficiently, boosting performance for scientific and engineering applications.
“Speaking at GSCS was a great opportunity,” Xu said. “The symposium fostered connections within the scientific computing community and gave us a chance to share exciting research.”
The symposium showcased student work through a poster blitz and a poster session. During the blitz, 36 students each had one minute to introduce their research to the full audience. They then shared more details about their research during the poster session.
The student projects showed the range of fields supported by scientific computing. The session also provided attendees with an opportunity to connect and expand their professional networks, helping grow the field’s future impact.
“As an aerospace engineer by training and aspiring computational scientist, GSCS gave me the platform to network with other researchers in the field while showcasing my own research,” said M.S. student Kashvi Mundra.
“I was able to connect with scientists across different disciplines whose work intersects with my own in unexpected ways. Those conversations pushed my thinking beyond my own lab's perspective, helping me see my work on physics-informed machine learning for inverse problems in a broader scientific computing context.”
Georgia Tech students who presented posters included:
Abir Haque (CSE), Massively Parallel Random Phase Approximation Correlation Energy via Lanczos Quadrature
Antonio Varagnolo (CSE), Physics-Enhanced Deep Surrogates for the Phonon Boltzmann Transport Equation
Ben Burns (CSE), Infinite-Dimensional Stein Variational Inference with Derivative-Informed Neural Operators
Ben Wilfong (CSE), Shocks without Shock Capturing; Compressible Flow at 1 quadrillion Degrees of Freedom without Loss of Accuracy
Daniel Vickers (CSE), Highly-Parallel Fluid-Solid Interactions for Compressible Flows
Eric Fowler (CSE), High-Performance Tensor Contractions in Computational Chemistry
Haoran Yan (Math), Understanding Denoising Autoencoders through the Manifold Hypothesis: A Geometric Perspective
Kashvi Mundra (CSE), Autoregressive Multifidelity Neural Surrogate Modeling under Scarce Data Regimes
Sebastián Gutiérrez Hernández (Math/CSE), PDPO: Parametric Density Path Optimization
Vivian Zhang (AE), Multifidelity Operator Inference: Non-Intrusive Reduced Order Modeling from Scarce Data
Xian Mae Hadia (CSE), Data Efficiency of Surrogate Models: Learning Physics Data from Full Field Data vs. Inductive Bias from Approximate PDE Solvers
Xiangming Huang (CSE), Neural Operator Accelerated Evolutionary Strategies for PDE-Constraint Optimization
Zhaiming Shen (Math), Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods
Zhongjie Shi (Math), Towards Understanding Generalization in DP-GD: A Case Study in Training Two-Layer CNNs
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu