PQCP Inclusion Proposal: ml-kem and ml-dsa — Portable C++20 constexpr Header-Only Zero-Dependency Implementations of FIPS 203 & FIPS 204 #204

Proposer: Anjan Roy (GitHub: @itzmeanjan)
Repositories:

Current licenses:

  • ml-kem: BSD-3-Clause
  • ml-dsa: MIT

(To be relicensed to Apache 2.0 — see section below)


Overview

I would like to propose two related projects for consideration as PQCP sub-projects:

  • ml-kem: a C++20 implementation of ML-KEM (NIST FIPS 203), supporting all three parameter sets.
  • ml-dsa: a C++20 implementation of ML-DSA (NIST FIPS 204), supporting all three parameter sets, with both hedged (randomized) and deterministic signing modes. Pre-hash signing is currently supported only with the SHA3 family of hash functions.

Both libraries are written in C++20 and are header-only, fully constexpr, and portable, with zero third-party dependencies. Integrating them into an existing C++ project requires no build-system support beyond pointing the compiler at the include/ directory; alternatively, CMake FetchContent can be given the git repository URL.


How do these complement existing PQCP implementations?

The existing PQCP portfolio covers C90, Assembly, Jasmin, and Rust — but has no C++ implementations at all. These two projects will fill that gap.

| Project | Language | Primary focus | Formal verification |
|---|---|---|---|
| mlkem-native | C90 + Assembly | Native performance, platform-optimized assembly | CBMC + HOL Light |
| mlkem-libjade | Jasmin | Formally verified native code | EasyCrypt |
| rust-libcrux | Rust | Formally verified, portable + AVX2 | Yes |
| liboqs | C | Unified multi-algorithm API wrapping reference implementations | No |
| ml-kem (this proposal) | C++20 | Header-only, fully compile-time evaluable, proven absence of core language undefined behaviour (UB) | Not yet |
| ml-dsa (this proposal) | C++20 | Header-only, fully compile-time evaluable, proven absence of core language undefined behaviour (UB) | Not yet |

What value do ml-kem and ml-dsa bring?

  • The public API uses statically-sized std::span throughout — no raw pointers, no implicit buffer size arguments. No pointer arithmetic. Key, ciphertext, and signature sizes are encoded in the type system itself, making API misuse a compile-time error rather than a runtime bug.
  • The two libraries and all their first-party dependencies are fully constexpr. Besides allowing full compile-time evaluation, this rules out a class of core C++ language undefined behaviours (UB), such as buffer overflows, out-of-bounds accesses and use-after-free, on every code path exercised during constant evaluation, because the C++ standard requires the compiler to reject any constant expression that would trigger such UB.
  • Known-answer test (KAT) vectors can be validated via static_assert at compile time, trading longer compilation for zero runtime cost. The program simply fails to compile if ml-kem or ml-dsa has a bug that is triggered by a deployed KAT.
  • For firmware and embedded C++ targets where seeds are known at build time, keys and shared secrets can be computed entirely by the compiler and baked into the binary as constants.
  • [[nodiscard]] on every function returning a value: the compiler warns if the caller ignores the return value. This prevents a class of API misuse bugs at compile time.
  • consteval parameter validation: invalid parameter combinations are rejected at compile time, not at runtime.
  • We compile with the -Wall -Wextra -Wpedantic -Wshadow -Wconversion -Wformat=2 -Wcast-qual -Wold-style-cast -Wundef -Werror compiler flags, i.e., any warning is treated as a compile-time error.
  • clang-tidy-based static analysis is integrated. It is configured with WarningsAsErrors: '*' across the bugprone, cert, clang-analyzer, concurrency, cppcoreguidelines, hicpp, modernize, performance, portability, and readability checks, which helps us maintain high code quality.

A representative example for ML-KEM-512 — full keygen, encaps, and decaps executed entirely at compile-time:

/**
 * Filename: compile-time-ml-kem-512.cpp
 *
 * Use `cmake --install` to install ml-kem system-wide and compile with:
 *
 * $ g++ -std=c++20 -Wall -Wextra -Wpedantic -fconstexpr-ops-limit=67108864 compile-time-ml-kem-512.cpp && ./a.out
 * $ clang++ -std=c++20 -Wall -Wextra -Wpedantic -fconstexpr-steps=33554432 compile-time-ml-kem-512.cpp && ./a.out
 */

#include "ml_kem/ml_kem_512.hpp"
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Compile-time hex character to nibble conversion.
constexpr uint8_t
hex_digit(char c)
{
  if (c >= '0' && c <= '9') return static_cast<uint8_t>(c - '0');
  if (c >= 'a' && c <= 'f') return static_cast<uint8_t>(c - 'a' + 10);
  if (c >= 'A' && c <= 'F') return static_cast<uint8_t>(c - 'A' + 10);
  return 0;
}

// Compile-time hex string to byte array conversion.
template<size_t L>
constexpr std::array<uint8_t, L>
from_hex(std::string_view str)
{
  std::array<uint8_t, L> res{};
  for (size_t i = 0; i < L; i++) {
    res[i] = static_cast<uint8_t>((hex_digit(str[2 * i]) << 4) | hex_digit(str[(2 * i) + 1]));
  }
  return res;
}

// Compile-time evaluation of ML-KEM-512 keygen, encapsulation and decapsulation, using seeds from KAT vectors.
constexpr bool
eval_ml_kem_512()
{
  constexpr auto seed_d = from_hex<32>("7c9935a0b07694aa0c6d10e4db6b1add2fd81a25ccb148032dcd739936737f2d");
  constexpr auto seed_z = from_hex<32>("b505d7cfad1b497499323c8686325e4792f267aafa3f87ca60d01cb54f29202a");
  constexpr auto seed_m = from_hex<32>("eb4a7c66ef4eba2ddb38c88d8bc706b1d639002198172a7b1942eca8f6c001ba");

  std::array<uint8_t, ml_kem_512::PKEY_BYTE_LEN> pubkey{};
  std::array<uint8_t, ml_kem_512::SKEY_BYTE_LEN> seckey{};
  std::array<uint8_t, ml_kem_512::CIPHER_TEXT_BYTE_LEN> cipher{};
  std::array<uint8_t, ml_kem_512::SHARED_SECRET_BYTE_LEN> sender_key{};
  std::array<uint8_t, ml_kem_512::SHARED_SECRET_BYTE_LEN> receiver_key{};

  ml_kem_512::keygen(seed_d, seed_z, pubkey, seckey);
  const auto encaps_ok = ml_kem_512::encapsulate(seed_m, pubkey, cipher, sender_key);
  ml_kem_512::decapsulate(seckey, cipher, receiver_key);

  return encaps_ok && (sender_key == receiver_key);
}

int
main()
{
  // Entire ML-KEM-512 keygen + encaps + decaps round-trip, evaluated at compile-time.
  static_assert(eval_ml_kem_512(), "ML-KEM-512 keygen/encaps/decaps must be correct at compile-time");
  return 0;
}

The same pattern works for ML-DSA: compile-time keygen, signing and verification. This is not possible with any current PQCP implementation. It opens the door to moving most validation from runtime to compile time.

One primary goal of these two libraries is to be high-assurance: they use modern C++ language features and compiler tooling to enforce strictness at compile time and thereby reduce runtime liabilities.
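
Much of this compile-time strictness rests on one language rule: an expression that would trigger core-language UB is not a constant expression, so the compiler must reject it. The following is a minimal sketch of that rule using a hypothetical helper, sum_prefix, that is unrelated to either library.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sums the first `n` bytes of a 4-byte buffer.
// Nothing in the signature stops a caller from passing n > 4.
constexpr std::uint32_t
sum_prefix(std::size_t n)
{
  const std::uint8_t buf[4] = { 1, 2, 3, 4 };
  std::uint32_t acc = 0;
  for (std::size_t i = 0; i < n; i++) {
    acc += buf[i]; // out-of-bounds read (core-language UB) whenever i >= 4
  }
  return acc;
}

// In-bounds call: a valid constant expression.
static_assert(sum_prefix(4) == 10);

// Out-of-bounds call: not a constant expression, so enabling this line makes the
// build fail instead of silently reading past the buffer at runtime.
// static_assert(sum_prefix(5) == 10);
```

The guarantee covers only the inputs that are actually evaluated at compile time; calling sum_prefix(5) at runtime would still be UB, which is why the KAT-driven static_assert coverage matters.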


Known Gaps

  • ml-dsa pre-hash mode is not yet implemented for the SHA2 family of hash functions.
  • Structure-aware fuzzing work is not yet finalized.
  • Constant-timeness is not yet enforced using Valgrind (dynamic analysis) or Binsec (static/symbolic analysis). This is work in progress.
  • No platform-specific optimisation yet. 4x SIMD-parallel Keccak-based hashing would bring better performance.
  • No formal proof of memory safety or functional correctness.
  • Neither implementation has been audited by a third party.

Performance

ml-kem-768:

| Platform | keygen | encaps | decaps |
|---|---|---|---|
| Intel x86_64, GCC 14 | 19.3 µs | 21.9 µs | 25.8 µs |
| Graviton4 aarch64, GCC 13 | 28.8 µs | 32.8 µs | 39.3 µs |

ml-dsa-65:

| Platform | keygen | sign (min) | verify |
|---|---|---|---|
| Intel x86_64, GCC 14 | 82.1 µs | 587 µs | 86.3 µs |
| Graviton4 aarch64, GCC 13 | 126.2 µs | 879 µs | 134.4 µs |

These are portable implementations that do not yet use any platform-specific intrinsics or assembly. They are roughly 3x slower than baseline mlkem-native and mldsa-native, which feature 4x SIMD-parallel Keccak-based hashing and hand-optimized NTT implementations.


Dependency Chain

ml-kem CMake dependencies:

| Dependency | Role | Main Author | License |
|---|---|---|---|
| subtle | Constant-time comparison utilities | @itzmeanjan | MIT |
| sha3 | FIPS 202 hash functions and XOFs, used internally | @itzmeanjan | BSD-2-Clause |
| RandomShake | TurboSHAKE256-based CSPRNG, used in tests/examples only | @itzmeanjan | BSD-2-Clause |
| google-test | Test runner | Google | Apache 2.0 |
| google-benchmark | Benchmark runner harness | Google | Apache 2.0 |

ml-dsa CMake dependencies:

| Dependency | Role | Main Author | License |
|---|---|---|---|
| sha3 | FIPS 202 hash functions and XOFs, used internally | @itzmeanjan | BSD-2-Clause |
| RandomShake | TurboSHAKE256-based CSPRNG, used in tests/examples only | @itzmeanjan | BSD-2-Clause |
| google-test | Test runner | Google | Apache 2.0 |
| google-benchmark | Benchmark runner harness | Google | Apache 2.0 |

License

Current state:

  • ml-kem is BSD-3-Clause
  • ml-dsa is MIT
  • All first-party CMake dependencies (sha3, subtle, RandomShake) are BSD-2-Clause or MIT

Plan:
We will open a dedicated GitHub issue in each affected repository, explicitly asking every contributor to confirm their consent to relicense their contributions under Apache 2.0. The contributor set is small:

  • ml-kem: 3 contributors
  • ml-dsa: 2 contributors
  • sha3, subtle, RandomShake: 2 contributors

Once all consents are received, all LICENSE files, SPDX headers, and NOTICE files will be updated. Given the small set of external contributors, we expect this process to complete before or shortly after any TSC vote.


Governance Readiness

We will adopt the PQCA Code of Conduct and prepare CONTRIBUTING.md, SECURITY.md, and GOVERNANCE.md documents prior to onboarding. We understand and accept the requirement to transfer repository assets to the Linux Foundation upon acceptance. We target the Incubation stage of the PQCA Production project lifecycle.


Current Maintainers

  • Anjan Roy (@itzmeanjan) — primary author and maintainer

We are seeking a second maintainer. Introductions from the TSC community — particularly anyone with C++ PQC interest — would be more than welcome.


Roadmap

  1. Apache 2.0 relicensing: Open consent issues and complete the process across all affected repos.
  2. Second maintainer: Recruit and onboard a maintainer from a separate organization.
  3. Security audit: Engage a third-party auditor to formally assess timing safety and correctness. This is the most significant current gap relative to high-assurance status. Support from PQCP would help make this happen.
  4. Filling known gaps: We acknowledge the gaps listed under Known Gaps. Closing them would make both implementations more high-assurance, and that work will remain a focus in the near term.

Community Interest

  • ml-kem: 127 stars, 44 forks, 610 commits
  • ml-dsa: 58 stars, 9 forks, 481 commits

No production deployments have been confirmed to date. The fork counts suggest active experimentation by others in the community.


Personal Opinion

I believe that both of these projects, ml-kem and ml-dsa, offer a genuinely complementary addition to the PQCP portfolio, serving the C++ developer ecosystem with zero-dependency, fully compile-time-evaluable implementations that none of the existing sub-projects provide. I welcome TSC feedback and am happy to present at one of the upcoming TSC meetings.

