petrosbal/mmb

Matrix Multiplication Benchmark

Overview

A benchmark comparing floating-point performance and resource efficiency across different container runtimes (standard Linux, static binary, WebAssembly).

Project Structure

  • Source code:

    • mmb.c: Benchmark source. Performs matrix multiplication with integrated validity check. Outputs iterations and throughput in MFLOPS.

    • single_env_bench.py: A Python orchestrator that handles pod deployment, log parsing, and cgroup resource monitoring for a single execution environment.

    • metaorchestrator.py: The main benchmarking script. Coordinates a benchmark suite across multiple execution environments. It is configured exclusively through bench_config.yaml.

  • Dockerfile: Multi-stage build file defining three targets:

    • debian: Standard GCC build (Debian Slim)
    • static: Statically linked binary (Scratch)
    • wasm: WebAssembly build using WASI SDK (Scratch).
  • Makefile: Automates building the Docker images and importing them directly into the K3s containerd registry.
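
For orientation, the relationship between the iteration count and the throughput figure can be sketched as follows. This is a Python illustration rather than the code in mmb.c: it assumes the conventional count of 2·N³ floating-point operations per N×N multiplication, and the function name `mflops` is hypothetical. The benchmark's actual accounting may differ.

```python
def mflops(iterations: int, n: int, elapsed_seconds: float) -> float:
    """Estimate throughput in MFLOPS for `iterations` multiplications of n x n matrices.

    Assumption: 2 * n**3 floating-point operations per multiplication
    (n**3 multiplies plus n**3 adds). mmb.c may count differently.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    flops = iterations * 2 * n ** 3
    return flops / elapsed_seconds / 1e6


if __name__ == "__main__":
    # Example: 100 multiplications of 256 x 256 matrices in 30 seconds.
    print(round(mflops(100, 256, 30.0), 2))
```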

Prerequisites

  • OS: Linux (root required for /sys/fs/cgroup access).
  • Cluster: K3s (Makefile targets k3s ctr).
  • Tools: Docker, kubectl, Python 3 (pyyaml required).
  • Wasm: KWasm Operator.

Installation & Setup

  1. Build and Import Images: Use the Makefile to build all Docker container images and import them into the K3s registry.
```shell
make image-build
```

```mermaid
flowchart TD
    Root[make image-build]

    subgraph DebianFlow [Debian]
        direction TB
        DebBuild[build standard image]
    end

    subgraph StaticFlow [Static]
        direction TB
        StatBuild[build static binary image]
    end

    subgraph WasmFlow [WASM]
        direction TB
        WasmBuild[build WASM image]
    end


    Import[import to k3s containerd]

    %% Connections
    Root --> DebianFlow
    Root --> StaticFlow
    Root --> WasmFlow

    %% Converge to single import step
    DebBuild --> Import
    StatBuild --> Import
    WasmBuild --> Import
```

This creates the images mmb-debian:latest, mmb-static:latest, and mmb-wasm:latest.

  2. Clean Up:
  • To remove the images from Docker and K3s: make reset
  • To also delete the build tools autogenerated by Docker: make hard-reset

Usage

```shell
sudo python3 metaorchestrator.py
```

Note: Root privileges (sudo) are required to read from /sys/fs/cgroup.

The metaorchestrator.py script automates benchmarking across multiple environments: it runs single_env_bench.py once for each entry listed under environments in bench_config.yaml, passing the arguments defined under shared_args to every run.

These are the pre-defined parameters of the benchmark suite:

```yaml
environments:
  - image: mmb-debian:latest
    runtime_class: default

  - image: mmb-static:latest
    runtime_class: default

  - image: mmb-wasm:latest
    runtime_class: wasmtime

  - image: mmb-wasm:latest
    runtime_class: wasmedge

  - image: mmb-wasm:latest
    runtime_class: wasmer

  - image: mmb-wasm:latest
    runtime_class: wamr

shared_args:
  sizes: [256, 512]
  trials: 3
  duration: 30
  warmup: true
  interval: 1.0
  namespace: default
  results_subfolder_name: initial_bm
```
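
As a rough illustration of how a configuration of this shape could be expanded into one single_env_bench.py invocation per environment, consider the sketch below. It is not the code in metaorchestrator.py. Apart from `--interval`, which this README mentions, every flag name is an assumption derived mechanically from the YAML keys, and `build_commands` is a hypothetical helper. The config is written as a dict literal to keep the example self-contained; in practice it would come from `yaml.safe_load`.

```python
import sys

# Same structure as the bench_config.yaml above (abbreviated to two environments).
config = {
    "environments": [
        {"image": "mmb-debian:latest", "runtime_class": "default"},
        {"image": "mmb-wasm:latest", "runtime_class": "wasmtime"},
    ],
    "shared_args": {
        "sizes": [256, 512],
        "trials": 3,
        "duration": 30,
        "warmup": True,
        "interval": 1.0,
        "namespace": "default",
        "results_subfolder_name": "initial_bm",
    },
}


def build_commands(cfg):
    """Return one argv list per environment (flag names are assumptions)."""
    commands = []
    for env in cfg["environments"]:
        cmd = [sys.executable, "single_env_bench.py",
               "--image", env["image"],
               "--runtime-class", env["runtime_class"]]
        for key, value in cfg["shared_args"].items():
            flag = "--" + key.replace("_", "-")
            if isinstance(value, bool):
                if value:  # booleans become bare switches
                    cmd.append(flag)
            elif isinstance(value, list):
                cmd.append(flag)
                cmd.extend(str(v) for v in value)
            else:
                cmd.extend([flag, str(value)])
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for argv in build_commands(config):
        print(" ".join(argv))  # each list could be passed to subprocess.run
```

Running the environments one after another, rather than in parallel, keeps the runs from competing for CPU while measurements are taken.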

Output Results

```mermaid
---
config:
  look: classic
  theme: redux-dark
---
flowchart TB
 subgraph DataSources["Data Sources"]
        PodLogs["mmb.c output 
        (via kubectl logs)"]
        Cgroups["cgroup filesystem 
        (/sys/fs/cgroup)"]
  end
 subgraph Host[" "]
  direction BT
        Orchestrator[("Orchestrator")]
        DataSources
  end
    PodLogs -- Iterations, MFLOPS, Validity --> Orchestrator
    Cgroups -- CPU Usage, Memory, Throttling --> Orchestrator
```

The results are saved in the /results directory, with one .json file for each execution environment tested. Each entry (<size>_<mode>) contains:

1. Phases (Timestamps)

Lifecycle events, recorded as Unix timestamps:

  • start: Deployment trigger time
  • running_time: Pod reached Running state
  • bench_start: The moment the C application prints BENCH_START (after optional warmup).
  • end: The moment the C application prints BENCH_END.
```mermaid
---
config:
  theme: mc
---
gantt
    title Trial execution flow
    dateFormat X
    axisFormat %

    section Orchestrator
    apply & wait                 :orch1, 0, 2
    logs streaming               :orch3, after orch1, 7s
    cgroup monitor thread active :crit, orch4, after ben2, 4s
    save / cleanup               :orch5, after ben4, 1s

    section Pod Status
    ContainerCreating            :pod1, 0, 2
    Running                      :pod2, after pod1, 7s
    Terminating                  :pod3, after pod2, 1s
    
    section App (mmb.c)
    init                         :ben1, after pod1, 1s
    warmup (5s)                  :ben2, after ben1, 1s
    workload loop                :crit, ben3, after ben2, 4s
    validation & exit            :ben4, after ben3, 1s

    section Timestamps
    start                        :milestone, m0, 0, 0s
    running_time                 :milestone, m1, after pod1, 0s
    bench_start                  :milestone, m2, after ben2, 0s
    ‎ end                 :milestone, m3, after ben3, 0s
```
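
The bench_start and end timestamps depend on spotting the BENCH_START and BENCH_END markers in the pod's log stream. A minimal sketch of that idea follows. It assumes the markers appear as substrings of individual log lines; `record_phases` and its `clock` parameter are hypothetical names, not taken from single_env_bench.py.

```python
import time
from typing import Callable, Dict, Iterable


def record_phases(lines: Iterable[str],
                  clock: Callable[[], float] = time.time) -> Dict[str, float]:
    """Timestamp the BENCH_START and BENCH_END markers as they are read.

    `lines` can be any iterable of log lines, for example the stdout of a
    `kubectl logs -f <pod>` subprocess read line by line.
    """
    phases: Dict[str, float] = {}
    for line in lines:
        if "BENCH_START" in line and "bench_start" not in phases:
            phases["bench_start"] = clock()
        elif "BENCH_END" in line:
            phases["end"] = clock()
            break  # nothing after the end marker matters for timing
    return phases


if __name__ == "__main__":
    demo = ["init", "BENCH_START", "working...", "BENCH_END"]
    print(record_phases(demo))
```

Because the timestamps are taken when the orchestrator reads each line, they include any delay introduced by log streaming.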

2. Parsed Metrics (C Application Log)

Metrics extracted directly from the C application's standard output:

  • iterations: Total matrix multiplications completed.
  • throughput_mflops: Speed in MFLOPS.
  • valid: true if calculation check passed.

3. Samples (Time Series)

A list of raw snapshots captured from /sys/fs/cgroup at the defined --interval. Each sample object contains:

  • timestamp: Time of the snapshot.
  • usage_usec: Total CPU time consumed in microseconds.
  • user_usec: CPU time spent in user space.
  • system_usec: CPU time spent in kernel space.
  • nr_throttled: Cumulative count of throttle events.
  • mem_bytes: Total memory usage (RSS + Cache).
  • rss_bytes: Resident Set Size.
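
To make the sample fields concrete, the sketch below builds one sample from the text of three cgroup v2 files: cpu.stat, memory.current, and memory.stat. It assumes a cgroup v2 hierarchy, where cpu.stat and memory.stat are flat "key value" listings. Mapping rss_bytes to the `anon` counter of memory.stat is an assumption made for this example, and the helper names are hypothetical. How the orchestrator locates the pod's cgroup directory is not shown.

```python
from typing import Dict


def parse_flat_keyed(text: str) -> Dict[str, int]:
    """Parse 'key value' lines, as found in cgroup v2 cpu.stat and memory.stat."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def make_sample(timestamp: float, cpu_stat: str,
                memory_current: str, memory_stat: str) -> Dict[str, float]:
    """Combine the three file contents into one sample dict."""
    cpu = parse_flat_keyed(cpu_stat)
    mem = parse_flat_keyed(memory_stat)
    return {
        "timestamp": timestamp,
        "usage_usec": cpu.get("usage_usec", 0),
        "user_usec": cpu.get("user_usec", 0),
        "system_usec": cpu.get("system_usec", 0),
        "nr_throttled": cpu.get("nr_throttled", 0),
        "mem_bytes": int(memory_current.strip()),
        "rss_bytes": mem.get("anon", 0),  # assumption: anonymous memory as RSS
    }


if __name__ == "__main__":
    cpu_text = "usage_usec 1500000\nuser_usec 1400000\nsystem_usec 100000\nnr_throttled 2\n"
    print(make_sample(0.0, cpu_text, "4096\n", "anon 2048\nfile 2048\n"))
```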

4. Additional Metrics (Computed)

Derived from processing the raw cgroup samples:

  • cold_start_time: Startup latency (running_time - start).
  • avg_cpu_cores: Average CPU usage expressed in cores (e.g., 1.0 means one full core was in use on average).
  • peak_mem_bytes: The highest memory usage observed during the trial.
  • avg_mem_bytes: The average memory usage throughout the execution.
  • throttled_events: The total number of times the CPU was throttled by the cgroup controller.

About

An automated Kubernetes (k3s) benchmarking suite that evaluates container runtime performance between Linux and WebAssembly. Features configuration via YAML, Python orchestration, streamlined Docker containerization, low-level resource monitoring, and a C-based matrix multiplication workload.
