I mostly focus on supercomputing, C++, and AI. I received my Ph.D. from Texas A&M University for my work on runtime systems for HPC; you can find my dissertation at the Texas A&M University Libraries.
These days I am a software engineer at AMD Research and Advanced Development (RAD).
- intellikit (as ypapadop-amd): I've been building IntelliKit, a suite of tools, such as skills and Model Context Protocol (MCP) servers, that give LLMs structured access to performance engineering workflows, bridging the gap between AI assistants and software optimization for AMD products.
- ROCr (as ypapadop-amd): I contribute to the foundational layers of the ROCm ecosystem, most recently extending the ROCm Runtime (ROCr) to expose AMD's XDNA NPU architecture, bringing AI accelerator support into the same runtime that drives GPU compute.
- MLIR-AIE and IRON (as ypapadop-amd): I contribute to IRON, a close-to-metal toolkit that empowers performance engineers to create fast and efficient designs for Ryzen™ AI NPUs powered by AI Engines.
- STAPL: I developed the STAPL Runtime System (STAPL-RTS), the platform abstraction layer of the STAPL framework, a parallel superset of the C++ Standard Template Library. STAPL-RTS offers seamless parallel algorithm composition for large-scale computations by introducing asynchronous nested parallelism support. It provides a novel consistency model and unified distributed/shared-memory communication and task scheduling primitives. The work demonstrated that parallel algorithm composition at scale doesn't sacrifice programmability. STAPL-RTS was developed as part of my Ph.D. dissertation under the supervision of Dr. Lawrence Rauchwerger.
You can find me at:
Code:
Academic:

