Open
167 commits
0d4ae79
[Frontend] Use ops instead of raw assembly code
YWHyuk Sep 9, 2025
bea9bd2
[Test] Add matmul vector fusion case
YWHyuk Sep 9, 2025
837b062
[Frontend] Fix ops conversion
YWHyuk Sep 9, 2025
a33659a
[Frontend] Use custom malloc in the validation wrapper code
YWHyuk Sep 10, 2025
4e2d0a0
[Device] Add missing operations
YWHyuk Sep 10, 2025
6e70edc
[Frontend] Add typecasting for logical operation
YWHyuk Sep 10, 2025
54f450a
[Device] register amp
YWHyuk Sep 10, 2025
8985ab8
[Frontend+Test] Support scatter pattern with a test case
YWHyuk Sep 12, 2025
1c2c8bf
[Fix] minor bugs
YWHyuk Dec 5, 2025
1895958
[Fix] Fix the acceess to wrong variable
YWHyuk Dec 8, 2025
cd14109
[Log] Add print lock to prevent log crash
YWHyuk Dec 8, 2025
5fe87e9
[Device] Add custom zero_, zeors_like
YWHyuk Dec 8, 2025
db18cbd
[Frontend/Spike] Use 64byte aligned buffer size
YWHyuk Dec 8, 2025
1152428
[Refactor] Seperate OpOverrides
YWHyuk Dec 8, 2025
8452f5c
[Test] Add Llama1&2 test cases
YWHyuk Dec 8, 2025
00cd8c7
[TOGSim] Add error handling
YWHyuk Dec 8, 2025
a8d96cd
[Scheduler] Use given config file for compilations
YWHyuk Dec 8, 2025
8aac3ab
[Fix/ops] Fix wrong implementation of sigmoid
YWHyuk Dec 9, 2025
fd6a846
[Tests] Use manual mask for Llama
YWHyuk Dec 9, 2025
dea7f47
[TOGSim] Use YAML instead of json
YWHyuk Dec 9, 2025
d66df91
[Frontend] Use YAML config file instead of json
YWHyuk Dec 9, 2025
dce58d0
[Test] Change attention masek for Llama
YWHyuk Dec 9, 2025
1c2ab36
[Autotune] Fix autotune log path
YWHyuk Dec 9, 2025
20af550
[Fix] Fix codegen error in ops.select
YWHyuk Dec 9, 2025
2276450
Merge pull request #164 from PSAL-POSTECH/ops
YWHyuk Dec 11, 2025
c39c3a3
[Tutorial] Update environment setting for the tutorial
YWHyuk Dec 11, 2025
8678fe6
[Tutorial] Add tutorail env setting scripts
YWHyuk Dec 12, 2025
0a5d0e7
[Tutorial] Change format of config files to yml
YWHyuk Dec 15, 2025
008cf4c
[Tutorial] Fix typo dockerfile
YWHyuk Dec 15, 2025
18d7bab
[Tutorial] Fix wrong config name
YWHyuk Dec 15, 2025
1e4d72a
[Fix] configuration reference in DNNServing.ipynb
YWHyuk Dec 17, 2025
232c4a6
Change log level from warn to debug for unused tags
YWHyuk Dec 17, 2025
8b0f535
Add placeholder echo command in Dockerfile
YWHyuk Dec 17, 2025
6021315
[Frontend] prevent reload device
YunseonShin Dec 17, 2025
7c5dccc
[fix] setup_device to class method
YunseonShin Dec 17, 2025
88d9eb8
[Fix] typo in TOGSIM_CONFIG
YunseonShin Dec 17, 2025
d1ffac2
Remove echo command from Dockerfile
YWHyuk Dec 17, 2025
7c45f80
Refactor NPU module variable naming convention
YWHyuk Dec 17, 2025
4ac84ae
Merge pull request #199 from PSAL-POSTECH/tutorial
YWHyuk Jan 5, 2026
af48bc3
[Fix] Indirect store & add a test case
YWHyuk Jan 5, 2026
6d043ad
[Fix] relax vlane_stride constraints to resolve tile size conflicts #201
YWHyuk Jan 5, 2026
f6ada1f
[Refactor] Remove unused env vars
YWHyuk Jan 5, 2026
3ccfc11
[CI] Add CI for pytorch2.8
YWHyuk Jan 6, 2026
0abfffe
PyTorch version upgrade: tested on single-operator tests
wok1909 Sep 24, 2025
b7a275e
[Test] Add torch.no_grad(), change to use torch.nn.ReLU, fuion off
wok1909 Sep 24, 2025
5c5e61c
[Implement] Hook and GuardImpl for extension device
wok1909 Nov 6, 2025
74704b8
[CI] Change the trigger condition
YWHyuk Jan 6, 2026
d3f3298
[CI] Use CMake 3 to build pytorchsim
YWHyuk Jan 6, 2026
0763363
[CI] Seperate base image
YWHyuk Jan 6, 2026
4591403
[Fix] PyTorch2.8 support (WIP)
YWHyuk Jan 7, 2026
b9d4144
[Fix] Use official prologue fusion path
YWHyuk Jan 7, 2026
9abc060
[Fix] Don't split a reduce kernel
YWHyuk Jan 7, 2026
2c7264b
[Fix] Add a missing reduction fusion condition
YWHyuk Jan 7, 2026
b951b95
[Fix] update indirect_index interface for v2.8
YWHyuk Jan 7, 2026
c6ba98c
[Fix] Allow cpp kernel code in the wrapper function
YWHyuk Jan 7, 2026
fd07eda
[Ops] Use V.kernel instead of argument passing
YWHyuk Jan 8, 2026
4bed31b
[Fix] Set epilogue fusoin condition
YWHyuk Jan 8, 2026
758b5b3
[Fix] Support Identity indexing + Fix wrapper codegen
YWHyuk Jan 8, 2026
a7ab604
[Fix] Keep contextvar after reset()
YWHyuk Jan 8, 2026
cd52f57
[Frontend] Add decompsition of default attetnion
YWHyuk Jan 8, 2026
08e0c8b
[Fix] Add missing case
YWHyuk Jan 8, 2026
1d1508a
[Test] Add GQA test file
YWHyuk Jan 8, 2026
862ba44
[Fix+Log] Change logging system + Fix meta_code interface
YWHyuk Jan 9, 2026
75207a4
[Test] Wrap softmax module
YWHyuk Jan 9, 2026
8df5fef
[Log] Add progress bar for auto-tuning
YWHyuk Jan 9, 2026
d7c16b1
[Test/MoE] Disable compiling sparse dispatcher
YWHyuk Jan 9, 2026
c88cabc
[Fix] Support identity in the dram_stride extraction
YWHyuk Jan 12, 2026
67612bb
[Fix] index to float casting
YWHyuk Jan 12, 2026
50ceb58
[Fix] Change vlane_split_axis in case of group-dim
YWHyuk Jan 12, 2026
319fd6c
[Frontend] Fix any operation codegen
YWHyuk Jan 13, 2026
c223258
[Decompose] Use F.softmax for decomposed SDPA
YWHyuk Jan 13, 2026
07be94b
[Frontend] Add recompiliation for ModularIndexing
YWHyuk Jan 13, 2026
e999bfc
[Test] Fix minor bugs in the test folder
YWHyuk Jan 13, 2026
d747e7e
[Log] Add progress bar in spike simulation
YWHyuk Jan 13, 2026
b49b679
[Fix] Use extraction for vlane_offset + Register extract op
YWHyuk Jan 15, 2026
729b999
[Tests/Diffusion] Add embedding test case
YWHyuk Jan 15, 2026
7fa8d54
[Tests/MoE] Add patch to avoid dynamo bug
YWHyuk Jan 15, 2026
7919094
[Fix] Change wrong TORCHSIM_DUMP_PATH usage
YWHyuk Jan 15, 2026
1ca3348
[Scheduler] Validate pytorchsim_timing_mode != 0 in Scheduler constru…
YWHyuk Jan 15, 2026
8df3bee
[Fix] Move rename_indexing before load cacheing
YWHyuk Jan 15, 2026
ea79ad0
[Fusion] Fix template codegen + Add custom fusion hook
YWHyuk Jan 16, 2026
0c6175f
[Template] Fix template fusion codegen
YWHyuk Jan 19, 2026
a90f114
[Fix] Fusion axis mechanism change
YWHyuk Jan 20, 2026
78613ad
[Test] Fix syntax error in experiment scripts
YWHyuk Jan 22, 2026
21d08f2
[CI] Change base image for OpenReg build
YWHyuk Jan 22, 2026
24e67ed
[OpenReg] Use OpenReg style Custom device
YWHyuk Jan 22, 2026
468f414
[Device] Use torch.device(npu)
YWHyuk Jan 22, 2026
a625409
[SDPA] Use math as a default
YWHyuk Jan 23, 2026
a053314
[AMP] Add amp interface for OpenReg style device
YWHyuk Jan 23, 2026
eda34ff
[Tests] Cleanup unnecessary code in tests
YWHyuk Jan 23, 2026
3f8b866
[Cleanup] Remove built libraries
YWHyuk Jan 23, 2026
174e10f
[Device] Rename deivce PyTorchSimDevice2 to PyTorchSimDevice
YWHyuk Jan 23, 2026
89546d7
[Test] Add YOLOv5 test file
Jagggged Jan 24, 2026
70f0f6c
Merge pull request #208 from Jagggged/fix/add_yolov5_2.8_ljg
YWHyuk Jan 27, 2026
d5be66e
[Cleanup] Fix indent error
YWHyuk Jan 27, 2026
5ec144d
[Test #204] Add yolov5 test ci
YWHyuk Jan 27, 2026
730fce9
[Fix] Remove comments
YWHyuk Jan 27, 2026
47c563e
[Frontend] Fix Identity handling for index expr
YWHyuk Feb 2, 2026
d3cf863
[OpenReg] Add Python interface for device stream, event API
YWHyuk Feb 3, 2026
5224cc9
[Scheduler] Reimplement Scheduling mechanism
YWHyuk Feb 3, 2026
09753bc
[TOGSim] Rename scheduler_graph to enqueue_graph
YWHyuk Feb 5, 2026
235bb5c
[TOGSim] Add comments feature in trace files
YWHyuk Feb 5, 2026
9dbe037
[Eager] Add eager mode POC
YWHyuk Feb 5, 2026
f9a9f5f
[Eager] Add eager to graph fallback API
YWHyuk Feb 6, 2026
a13f37b
[Template] Conv warpper minor fix
YWHyuk Feb 6, 2026
e840786
[Fix] Index_expr ops codegen issue
YWHyuk Feb 11, 2026
f60cbe5
[Codegen] Use ops instead of raw assembly
YWHyuk Feb 23, 2026
014cb11
[Test] Add DeepSeek v3 base test file and etc. (WIP)
Jagggged Feb 19, 2026
9a27549
[Fix] Polish the error handling of dram_stride calculation
YWHyuk Feb 25, 2026
9b92f11
[Frontend/template] add SDPA modules
student-Jungmin Mar 2, 2026
88e79e0
[CI] Update for torch 2.8 based image
YWHyuk Mar 3, 2026
e359ac8
Merge pull request #166 from PSAL-POSTECH/torch_v2.8
YWHyuk Mar 3, 2026
fc247be
[Template] Add cat & sort template + Multi-output (WIP)
Jagggged Mar 1, 2026
f615178
[Fix] Prevent fallback to eager mode after reaching compilation limit…
student-Jungmin Mar 4, 2026
8ca5d02
[FIX] Add idx_map to the first matmul for logical consistency
student-Jungmin Mar 4, 2026
41288bc
[Template] Polish template kernel of cat operation
YWHyuk Mar 3, 2026
434bbb1
[WIP]
YWHyuk Mar 4, 2026
5295dfb
[Template] Delay def_dma_op codegen
YWHyuk Mar 4, 2026
61caebd
[Template/Cat] Fix apply offset setting
YWHyuk Mar 4, 2026
47684a7
[TOGSim] Add help print
YWHyuk Mar 5, 2026
a24f1f1
[Template/Cat] Limit maximum rank of tile
YWHyuk Mar 5, 2026
4e4300e
[Template/Cat] Refactor cat + Support explicit dram+stride in def_dma_op
YWHyuk Mar 5, 2026
3d9cb38
[Frontend/template] Connect SDPA template to NPU using Torch OpenReg
student-Jungmin Mar 5, 2026
591e8a9
[Templte/Cat] Apply copy operation when node has view
YWHyuk Mar 5, 2026
dab3495
[Refactor] Refactored TopK test code for the OpenReg device
student-Jungmin Mar 7, 2026
a15f5d2
[Template/Sort] Add template code for Bitonic sort
YWHyuk Mar 11, 2026
752cbb8
[Template] Use buffer type instead of hard-coded type
YWHyuk Mar 11, 2026
7af91de
[Frontend] Fix incorrect constant key usage and boolean scientific-no…
HamHyungkyu Mar 12, 2026
7bad17a
[Fix] Refactor MLIR precision handling to be dtype-driven
YWHyuk Mar 11, 2026
fadba78
[Fix] malloc size align + fix origin info
YWHyuk Mar 12, 2026
0189ab9
[TOGSim] Fix local/remote memory stat
YWHyuk Mar 12, 2026
f7f2696
Merge branch 'feat/deepseek' into feature/TopK
YWHyuk Mar 13, 2026
37474cd
Merge pull request #218 from student-Jungmin/feature/TopK
YWHyuk Mar 13, 2026
5268be2
[Frontend/template] add SPDA decode GQA template imlementation
HamHyungkyu Mar 12, 2026
59bd8f8
WIP
YWHyuk Mar 12, 2026
bfc2b22
[Frontend/template] SPDA implementation debug
HamHyungkyu Mar 13, 2026
ce93306
[Template/SPDA] Remove subtile size temporarily
YWHyuk Mar 13, 2026
f2717e1
[Template/SPDA] minor fix
YWHyuk Mar 13, 2026
be23638
[Cleanup] Unflag debug option
YWHyuk Mar 16, 2026
e925ae4
[CI] Add deepseek test case
YWHyuk Mar 16, 2026
db85991
[Template/SPDA] Cleanup test case + Add an activate option
YWHyuk Mar 17, 2026
dd71c70
[Frontend] Handle RecompileSignal in MLIRKernel code generation
HamHyungkyu Mar 17, 2026
c5f085e
[Frontend] Enhance vector size handling for low-precision paths in ML…
HamHyungkyu Mar 17, 2026
fdd5b54
[Refactor] move to TOGSimulator-based scheduler API
YWHyuk Mar 18, 2026
3847f9b
[CI] Add missing package + Add test cases
YWHyuk Mar 19, 2026
1d7a3a9
[FIX] Fix zero systolic array utilization during SDPA execution in TO…
student-Jungmin Mar 22, 2026
10f5923
[Frontend/Fix] Enforce vector length constraints and resolve ext() wi…
HamHyungkyu Mar 17, 2026
a32f9e0
[Frontend] Add optimized GQA decode implementation with tile-based so…
HamHyungkyu Mar 17, 2026
9e20d95
[PyTorchSim/Frontend] Use kernel specific filelock to avoid race
YWHyuk Mar 23, 2026
070c43a
[Fix] replace outdated config name
YWHyuk Mar 23, 2026
9fc0811
[Experiment] use timing mode for validation script
YWHyuk Mar 23, 2026
cf56c59
[CI] Run validation script only for vector_lane==128
YWHyuk Mar 23, 2026
8d22583
[TOGSim] Add error handling of idle stat couting
YWHyuk Mar 23, 2026
0b60ddd
[TOGSim] Update DRAM Bw stat with exact number
YWHyuk Mar 23, 2026
6bc1204
[Experiment] Fix ils script to use updated config
YWHyuk Mar 24, 2026
336fdf3
[CI] Remove dump folder mount for test
YWHyuk Mar 24, 2026
8838bfe
[Decompse] Add naive group convolution decomposition + test
YWHyuk Mar 24, 2026
9b0ab3b
[Frontend] Fix attribute passing to TOGSIM
YWHyuk Mar 25, 2026
5cbe9d1
[Frontend] Fix loop_size argument passing
YWHyuk Mar 25, 2026
f03f727
[Script] Add utility option
YWHyuk Mar 25, 2026
1ae39bf
[Cleanup] #219 cleanup the deprecated scheduler module
YWHyuk Mar 26, 2026
8ca844a
[Frontend/MobileNet] Add MobileNet CI and 1x1 spatial conv linear dec…
YWHyuk Mar 26, 2026
699d9b9
Merge pull request #220 from student-Jungmin/feat/deepseek
YWHyuk Mar 26, 2026
6f74722
[Test] Add missing mobilenet test script
YWHyuk Mar 31, 2026
1c28159
Merge pull request #215 from PSAL-POSTECH/feat/deepseek
YWHyuk Apr 2, 2026
7b6cfe5
[TOGSim] Migration to Ramulator2.1
HamHyungkyu Apr 4, 2026
dd991c1
[CI] Add thirdparty release manifest; pin base image tag and build on…
YWHyuk Apr 9, 2026
68 changes: 0 additions & 68 deletions .github/workflows/docker-base-image.yml

This file was deleted.

115 changes: 97 additions & 18 deletions .github/workflows/docker-image.yml
@@ -3,68 +3,147 @@ name: Docker image CI
on:
  pull_request:
    branches: [ "master", "develop" ]
  workflow_dispatch:

env:
  BASE_IMAGE_REPO: ghcr.io/psal-postech/torchsim_base
  # PR: head commit; otherwise workflow_dispatch uses the branch SHA
  SOURCE_SHA: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}

jobs:
  ensure-base:
    runs-on: ubuntu-latest
    outputs:
      base_image: ${{ steps.pin.outputs.base_image }}
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout Code
        uses: actions/checkout@v4
        with:
          ref: ${{ env.SOURCE_SHA }}
          submodules: recursive

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: PyTorch base image from manifest
        run: |
          PYTORCH_IMAGE=$(python3 -c "import json; from pathlib import Path; v=json.loads(Path('thirdparty/github-releases.json').read_text()).get('pytorch_image'); print(v or '')")
          if [ -z "$PYTORCH_IMAGE" ]; then echo "thirdparty/github-releases.json: pytorch_image is required" >&2; exit 1; fi
          echo "PYTORCH_IMAGE=$PYTORCH_IMAGE" >> "$GITHUB_ENV"

      - name: Thirdparty pin
        id: pin
        run: |
          PIN="$(bash scripts/ci/thirdparty_base_pin.sh)"
          echo "pin=${PIN}" >> "$GITHUB_OUTPUT"
          echo "base_image=${BASE_IMAGE_REPO}:thirdparty-${PIN}" >> "$GITHUB_OUTPUT"
          echo "BASE_IMAGE=${BASE_IMAGE_REPO}:thirdparty-${PIN}" >> "$GITHUB_ENV"

      - name: Check base image exists
        id: exists
        run: |
          if docker manifest inspect "${BASE_IMAGE}" > /dev/null 2>&1; then
            echo "ok=true" >> "$GITHUB_OUTPUT"
          else
            echo "ok=false" >> "$GITHUB_OUTPUT"
          fi

      - name: Resolve GitHub release asset IDs
        if: steps.exists.outputs.ok != 'true'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: bash scripts/ci/thirdparty_github_asset_env.sh >> "$GITHUB_ENV"

      - name: Build and push base image (missing pin)
        if: steps.exists.outputs.ok != 'true'
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile.base
          push: true
          build-args: |
            PYTORCH_IMAGE=${{ env.PYTORCH_IMAGE }}
            GEM5_ASSET_ID=${{ env.GEM5_ASSET_ID }}
            LLVM_ASSET_ID=${{ env.LLVM_ASSET_ID }}
            SPIKE_ASSET_ID=${{ env.SPIKE_ASSET_ID }}
          tags: ${{ env.BASE_IMAGE }}

  build-and-test:
    needs: ensure-base
    runs-on: self-hosted

    permissions:
      contents: read
      packages: write

    steps:
      # Step 1: Checkout the repository
      - name: Checkout Code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          ref: ${{ env.SOURCE_SHA }}
          submodules: recursive

      # Step 2: Log in to GitHub Container Registry
      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # Step 3: Build and Push Docker Image
      - name: Build and Push Docker Image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          push: true
          no-cache: true
          tags: ghcr.io/psal-postech/torchsim-test:${{ github.sha }}
          build-args: |
            BASE_IMAGE=${{ needs.ensure-base.outputs.base_image }}
          tags: ghcr.io/psal-postech/torchsim-test:${{ env.SOURCE_SHA }}

      # Step 4: Wait for GHCR propagation
      # Do not use GITHUB_SHA here: on pull_request it is the merge commit, while the image tag uses SOURCE_SHA (PR head).
      - name: Wait for GHCR propagation
        env:
          IMAGE_SHA: ${{ env.SOURCE_SHA }}
        run: |
          for i in {1..30}; do
          IMG="ghcr.io/psal-postech/torchsim-test:${IMAGE_SHA}"
          echo "Verifying tag matches push: ${IMAGE_SHA}"
          for i in $(seq 1 30); do
            echo "Checking if image exists in GHCR (attempt $i)..."
            if docker manifest inspect ghcr.io/psal-postech/torchsim-test:${GITHUB_SHA} > /dev/null 2>&1; then
            if docker buildx imagetools inspect "$IMG" > /dev/null 2>&1; then
              echo "Image is now available in GHCR."
              exit 0
            fi
            echo "Image not yet available, retrying in 30 seconds..."
            if [ "$i" -eq 1 ]; then
              echo "buildx imagetools inspect failed; stderr (first attempt):"
              docker buildx imagetools inspect "$IMG" 2>&1 || true
            fi
            echo "Image not yet available, retrying in 20 seconds..."
            sleep 20
          done
          echo "Image did not become available in GHCR within expected time."
          exit 1

  test-pytorchsim-wrapper:
  test-pytorchsim-wrapper1:
    needs: build-and-test
    uses: ./.github/workflows/pytorchsim_test.yml
    with:
      image_name: ghcr.io/psal-postech/torchsim-test:${{ github.sha }}
      image_name: ghcr.io/psal-postech/torchsim-test:${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
      vector_lane: 128
      spad_size: 128

  # call-test2:
  #   needs: build-and-test
  #   uses: ./.github/workflows/pytorchsim_test.yml
  #   with:
  #     image_name: ghcr.io/psal-postech/${GITHUB_SHA}
  #     vector_lane: 8
  #     spad_size: 32
  test-pytorchsim-wrapper2:
    needs: build-and-test
    uses: ./.github/workflows/pytorchsim_test.yml
    with:
      image_name: ghcr.io/psal-postech/torchsim-test:${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
      vector_lane: 32
      spad_size: 32
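
Note on the docker-image.yml change: the workflow relies on two small pieces of logic, choosing which commit SHA tags the image and polling the registry until the pushed tag is visible. The bash sketch below restates both so they can be tried locally. It is illustrative only: `pick_source_sha` and `retry_until` are hypothetical helper names that do not appear in this PR, and the sketch assumes `seq` and `sleep` are available.

```shell
# Sketch only: these helpers are hypothetical and not part of the PR.

# Mirrors the workflow expression
#   github.event_name == 'pull_request' && <PR head SHA> || github.sha
# As in the expression, an empty head SHA falls through to the fallback.
pick_source_sha() {
  local event_name="$1" pr_head_sha="$2" fallback_sha="$3"
  if [ "$event_name" = "pull_request" ] && [ -n "$pr_head_sha" ]; then
    printf '%s\n' "$pr_head_sha"
  else
    printf '%s\n' "$fallback_sha"
  fi
}

# Runs a check command up to <attempts> times, sleeping <delay> seconds
# between failures. Returns 0 on the first success, 1 if all attempts fail.
retry_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for i in $(seq 1 "$attempts"); do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    # No need to sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

Under these assumptions, the "Wait for GHCR propagation" step would correspond roughly to `retry_until 30 20 docker buildx imagetools inspect "$IMG"`, with `IMG` built from the SHA that `pick_source_sha` returns, so the tag being polled matches the tag that was pushed.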
2 changes: 1 addition & 1 deletion .github/workflows/docker-tutorial-image.yml
@@ -30,6 +30,6 @@ jobs:
        uses: docker/build-push-action@v4
        with:
          context: .
          file: ./Dockerfile.ksc2025
          file: ./tutorial/jupyterhub/Dockerfile.ksc2025
          push: true
          tags: ghcr.io/psal-postech/torchsim_ksc2025:latest