Conversation
Force-pushed from b6c0a68 to eb08843.
Pure addition of semantic boolean accessors for checking whether a tensor uses dense (plain) or non-dense (exotic) storage, replacing the pattern of try_as_dense().is_ok() / storage.as_dense().is_none().
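The accessor pattern described above can be sketched with a minimal, self-contained model. All names here (`Storage`, `Tensor`) are illustrative stand-ins for tract's actual types, not its API:

```rust
// Hypothetical sketch: a StorageKind-style enum with semantic accessors,
// mirroring the is_plain()/is_exotic() pattern described above.
#[derive(Clone, Debug, PartialEq)]
enum Storage {
    Plain(Vec<f32>),      // dense, row-major payload
    Exotic(&'static str), // block-quantized / packed / device payloads
}

struct Tensor {
    storage: Storage,
}

impl Tensor {
    /// True when the tensor uses dense (plain) storage.
    fn is_plain(&self) -> bool {
        matches!(self.storage, Storage::Plain(_))
    }
    /// True when the tensor uses non-dense (exotic) storage.
    fn is_exotic(&self) -> bool {
        !self.is_plain()
    }
}

fn main() {
    let dense = Tensor { storage: Storage::Plain(vec![1.0, 2.0]) };
    let bq = Tensor { storage: Storage::Exotic("Q4_0") };
    // Replaces the old `try_as_dense().is_ok()` / `as_dense().is_none()` idioms.
    assert!(dense.is_plain() && !dense.is_exotic());
    assert!(bq.is_exotic() && !bq.is_plain());
}
```

A single pair of predicates keeps call sites from encoding the storage taxonomy via `Result`/`Option` inspection.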
Use the new semantic accessors in typed model's constant folding and eval paths.
Aligns internal enum naming with the public is_plain()/is_exotic() API. StorageKind is pub(crate), so no downstream impact.
Rename all Dense-related types, methods, variables, and comments to use Plain/Exotic terminology consistently:
- DenseStorage → PlainStorage
- DenseView/DenseViewMut → PlainView/PlainViewMut
- as_dense/as_dense_mut/into_dense → as_plain/as_plain_mut/into_plain
- try_as_dense/try_as_dense_mut → try_as_plain/try_as_plain_mut
- to_dense_array_view/to_dense_array_view_mut → to_plain_array_view/to_plain_array_view_mut
- dense_view.rs → plain_view.rs
- All local variable names and doc comments updated
Migrate 6 sites that checked datum_type().is_opaque() on actual tensors to use the storage-based is_exotic()/is_plain() predicates instead. This decouples the "is this tensor exotic?" question from the DatumType, moving toward eventual removal of DatumType::Opaque. Sites changed:
- typed.rs constant inlining: is_plain() subsumes both checks
- compare.rs reference overwrite guard: is_plain()
- konst.rs opaque_fact validation: is_plain()
- optimized.rs packed matrix debug_asserts: is_exotic()
- utils.rs clarify_tvalue: is_exotic()
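The shape of one migrated site can be sketched as a before/after pair. Everything below is a simplified stand-in for tract's types; the point is only that the new guard asks the storage, not the datum type:

```rust
// Hypothetical before/after of one migrated check. A hypothetical
// `exotic_storage` flag stands in for StorageKind; names are illustrative.
#[derive(Clone, Copy, PartialEq)]
enum DatumType {
    F32,
    Opaque,
}

struct Tensor {
    dt: DatumType,
    exotic_storage: bool,
}

impl Tensor {
    fn datum_type(&self) -> DatumType {
        self.dt
    }
    fn is_plain(&self) -> bool {
        !self.exotic_storage
    }
}

// Old guard: conflated "what are the elements?" with "how are they stored?".
fn can_inline_old(t: &Tensor) -> bool {
    t.datum_type() != DatumType::Opaque
}

// New guard: asks the storage directly, so dt can later carry the real type.
fn can_inline_new(t: &Tensor) -> bool {
    t.is_plain()
}

fn main() {
    // A block-quantized tensor that already carries its logical dt (f32):
    let bq = Tensor { dt: DatumType::F32, exotic_storage: true };
    assert!(can_inline_old(&bq));  // old check would wrongly allow inlining
    assert!(!can_inline_new(&bq)); // storage-based check stays correct
}
```

This is why the storage-based predicate "subsumes both checks": it stays correct even once exotic tensors stop reporting `Opaque`.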
…() checks

Step 2 of removing DatumType::Opaque: decouple fact-level opaque checks from datum_type by querying opaque_fact presence directly. All 16 call sites across core, nnef, and cli now use the new methods.
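At the fact level, the same idea reduces to testing opaque_fact presence. A minimal sketch, with a simplified `TypedFact` stand-in (the real struct carries shape and datum_type as well):

```rust
// Hypothetical sketch of fact-level predicates that test opaque_fact
// presence instead of inspecting datum_type.
struct TypedFact {
    opaque_fact: Option<&'static str>, // stand-in for the real fact payload
}

impl TypedFact {
    fn is_exotic(&self) -> bool {
        self.opaque_fact.is_some()
    }
    fn is_plain(&self) -> bool {
        self.opaque_fact.is_none()
    }
}

fn main() {
    let plain = TypedFact { opaque_fact: None };
    let bq = TypedFact { opaque_fact: Some("BlockQuantFact") };
    assert!(plain.is_plain());
    assert!(bq.is_exotic());
}
```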
Exotic tensors (BlockQuantStorage, PackedMatrixStorage) now carry the logical element type (f32, f16, etc.) instead of DatumType::Opaque. Exotic storage is already distinguished by StorageKind::Exotic / opaque_fact.is_some(), so datum_type can carry the real type.
- Add `dt` parameter to BlockQuantStorage::into_tensor_with_shape and PackedMatrixStorage::into_tensor
- Propagate input datum_type in output_facts for all ops that produce exotic outputs
- Migrate remaining Opaque::datum_type() checks to is_exotic()/is_plain()
- Update consistency check in TypedFact to match new invariant
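The new invariant can be sketched as follows. The types are simplified stand-ins; the `dt` parameter on the conversion is the change being described:

```rust
// Sketch of the new invariant: exotic storage materialises a tensor that
// reports the logical element type, not Opaque. Types are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DatumType {
    F32,
    F16,
}

struct BlockQuantStorage {
    blocks: Vec<u8>, // raw quantized blocks (contents irrelevant here)
}

struct Tensor {
    dt: DatumType,
    shape: Vec<usize>,
    exotic: bool,
}

impl BlockQuantStorage {
    // `dt` is the new parameter: the caller states the logical element type.
    fn into_tensor_with_shape(self, dt: DatumType, shape: Vec<usize>) -> Tensor {
        Tensor { dt, shape, exotic: true }
    }
}

fn main() {
    let t = BlockQuantStorage { blocks: vec![0; 18] }
        .into_tensor_with_shape(DatumType::F32, vec![32]);
    // datum_type now carries the real type; exoticness lives elsewhere.
    assert_eq!(t.dt, DatumType::F32);
    assert_eq!(t.shape, vec![32]);
    assert!(t.exotic);
}
```

Since exoticness is already encoded in the storage kind, nothing is lost by letting `dt` say what the dequantized elements would be.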
Now that exotic tensors carry real datum types (f32, f16) instead of DatumType::Opaque, code that used is_number() as a proxy for "plain tensor" needs an explicit is_plain() guard.
- EinSum::eval: skip ad-hoc optimization for exotic inputs
- dequant_inputs: dequantize exotic tensors even though dt is numeric
- kernel_selection: skip fast-path mmm lookup for exotic inputs
- prop_const: don't eagerly propagate exotic constants
- Gather: don't apply scalar-index slice optimization on exotic inputs
- PackedBlockQuantFormat::prepare_one: don't re-quantize exotic inputs
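The guard pattern common to these sites can be sketched like this. The path-selection function below is an illustrative stand-in for the EinSum::eval-style decision, not tract's real code:

```rust
// Sketch of the new guard: is_number() no longer implies dense data, so
// fast paths must also require is_plain(). Names are illustrative.
#[derive(Clone, Copy, PartialEq)]
enum DatumType {
    F32,
    I64,
}

#[derive(Clone, Copy)]
struct Tensor {
    dt: DatumType,
    plain: bool,
}

impl Tensor {
    fn is_number(&self) -> bool {
        matches!(self.dt, DatumType::F32 | DatumType::I64)
    }
    fn is_plain(&self) -> bool {
        self.plain
    }
}

// Only all-plain numeric inputs take the ad-hoc optimized path;
// exotic inputs fall back to the generic one.
fn eval_path(inputs: &[Tensor]) -> &'static str {
    if inputs.iter().all(|t| t.is_number() && t.is_plain()) {
        "adhoc"
    } else {
        "generic"
    }
}

fn main() {
    let plain = Tensor { dt: DatumType::F32, plain: true };
    let bq = Tensor { dt: DatumType::F32, plain: false }; // exotic, dt is numeric
    assert_eq!(eval_path(&[plain]), "adhoc");
    assert_eq!(eval_path(&[plain, bq]), "generic");
}
```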
During the transition period where some exotic tensors carry the real datum type (f32) and others still carry DatumType::Opaque (cuda-side quantized activations), normalise both to the logical type before comparing.
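A minimal sketch of that transitional normalisation, assuming (as the message states) that Opaque-carrying exotic activations are logically f32. The enum and helper are stand-ins, not tract's definitions:

```rust
// Sketch: map both sides to the logical element type before comparing, so
// f32-carrying and (still) Opaque-carrying exotic tensors compare equal.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DatumType {
    F32,
    F16,
    Opaque,
}

fn logical_dt(dt: DatumType) -> DatumType {
    // cuda-side quantized activations may still report Opaque during the
    // transition; their logical element type is f32.
    match dt {
        DatumType::Opaque => DatumType::F32,
        other => other,
    }
}

fn dts_compatible(a: DatumType, b: DatumType) -> bool {
    logical_dt(a) == logical_dt(b)
}

fn main() {
    assert!(dts_compatible(DatumType::F32, DatumType::Opaque));
    assert!(!dts_compatible(DatumType::F16, DatumType::Opaque));
}
```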
Now that exotic tensors carry real datum types, is_supported_dts() and resolve_output_facts() could incorrectly accept Q4_0 inputs as plain f32/f16 tensors:
- MlxGemm/MfaGemm::is_supported_dts: require is_plain() on both inputs
- GgmlGemm::is_supported_dts: require is_plain() for regular type matching, preserving the Q4_0-specific path via as_quant_fact
- MetalGemm::resolve_output_facts: add is_plain() guard to the is_number() check, so Q4_0 inputs fall through to as_quant_fact
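The tightened support check can be sketched as follows. `FactLike` and its fields are hypothetical simplifications of the real facts (which expose is_plain() and a quant-fact accessor):

```rust
// Hypothetical sketch of the tightened support check: plain float pairs go
// through the regular branch, and Q4_0 weights are only accepted via the
// quant-fact path. Names mirror the description, not tract's exact API.
#[derive(Clone, Copy, PartialEq)]
enum Dt {
    F32,
    F16,
}

struct FactLike {
    dt: Dt,
    plain: bool,
    q40: bool, // stand-in for "as_quant_fact() returns a Q4_0 fact"
}

fn is_supported_dts(a: &FactLike, b: &FactLike) -> bool {
    // Regular branch: both inputs must be plain and of matching float type.
    if a.plain && b.plain {
        return a.dt == b.dt;
    }
    // Q4_0-specific branch: exotic weights with a quant fact are still fine.
    b.q40 && a.plain && a.dt == Dt::F32
}

fn main() {
    let act = FactLike { dt: Dt::F32, plain: true, q40: false };
    let q40_w = FactLike { dt: Dt::F32, plain: false, q40: true };
    let exotic_no_fact = FactLike { dt: Dt::F32, plain: false, q40: false };
    assert!(is_supported_dts(&act, &q40_w)); // accepted via the quant path
    assert!(!is_supported_dts(&act, &exotic_no_fact)); // no longer mistaken for plain f32
}
```

Without the `is_plain()` requirement on the first branch, a Q4_0 weight reporting `dt = f32` would match the plain f32/f32 case and pick the wrong kernel.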
…c tensors

FloatPrecisionTranslator now skips Cast insertion for exotic (BQ/packed) inputs and outputs, and skips casting exotic constants. PropConst moves the is_plain() check to top-level to prevent eager eval of exotic constants. matmul_semantic_output_dt prefers plain inputs when determining operating_dt.
Exotic tensors (BlockQuant, Q8_1) now carry their logical element type (f32) instead of DatumType::Opaque, consistent with the core crate changes. Key fixes:
- arena_view: detect BQ via opaque_fact instead of dt==Opaque
- CudaTensor::uninitialized_opaque: use f32 dt for BQ and Q8_1
- pad_q40_weights: use f32 dt for padded BQ fact
- CudaGgmlQuantQ81: propagate input dt instead of hardcoding Opaque
- GgmlGemm::output_dt: remove Opaque→F32 normalization hack
- DeviceMemoryPool: pass dt through scalar_opaque_tensor_for_node
- Remove Opaque from SUPPORTED_DT and tname
Mechanical rename reflecting that these facts describe exotic storage (BlockQuant, packed matrices, device tensors) rather than opaque data.
- OpaqueFact -> ExoticFact (trait and all type references)
- opaque_fact -> exotic_fact (fields, methods, variables)
- uninitialized_opaque -> uninitialized_exotic
- uninitialized_device_opaque_tensor -> uninitialized_device_exotic_tensor
- scalar_opaque_tensor_for_node -> scalar_exotic_tensor_for_node
- make_scalar_opaque_tensor_for_node -> make_scalar_exotic_tensor_for_node

55 files changed across data, linalg, core, nnef, gpu, cuda, metal, cli, hir, and test-rt crates.
DatumType::Opaque was the last vestige of the old design where exotic tensors (block-quantized, packed matrices, device tensors) masqueraded as Opaque scalars. Now that exotic storage is properly distinguished via StorageKind::Exotic and ExoticFact, the Opaque datum type has no remaining users. Removed:
- DatumType::Opaque variant
- Opaque struct, OpaquePayload trait, DummyPayload
- is_opaque() method on DatumType
- datum!(Opaque) impl, dispatch macro arms
- Tensor hash/drop/zero-init/clone branches for Opaque
- Dead clarify_tvalue code in CLI
- Stale consistency check in fact.rs
- Commented-out validation in typed.rs
Mechanical cleanup of local variable names (opaque → input_value, a_raw, tensor, etc.) and error/comment strings across cuda, metal, gpu, core, nnef, and test-rt crates.
Exotic BlockQuant tensors now carry a logical datum_type (f32) instead of DatumType::Opaque, so is_number() returns true. Add is_plain() guards to prevent the plain-tensor code path from being taken for exotic inputs with empty shapes.
- Fix shape doubling in resolve_output_facts: exotic tensors now carry full logical shapes, so chaining with BlockQuantFact.shape() doubled them (e.g. [2048,2048] -> [2048,2048,2048,2048]).
- Fix can_convert_to_cuda_gemm: require is_plain() for the regular type-match branch so BQ weights (now datum_type=f32) don't bypass the input-swap logic.
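The shape-doubling failure mode is easy to reproduce in isolation. Both functions below are illustrative reductions of the described fix, not tract's resolve_output_facts:

```rust
// Sketch of the shape-doubling bug: the old code concatenated the (now full)
// logical input shape with BlockQuantFact's shape. Purely illustrative.
fn output_shape_old(input: &[usize], bq_fact_shape: &[usize]) -> Vec<usize> {
    // Valid when exotic tensors were Opaque scalars with no logical shape;
    // wrong once `input` already carries the full logical shape.
    input.iter().chain(bq_fact_shape.iter()).cloned().collect()
}

fn output_shape_new(input: &[usize]) -> Vec<usize> {
    // Exotic tensors already carry the full logical shape; use it as-is.
    input.to_vec()
}

fn main() {
    let input = [2048, 2048];
    let bq = [2048, 2048];
    assert_eq!(output_shape_old(&input, &bq), vec![2048, 2048, 2048, 2048]);
    assert_eq!(output_shape_new(&input), vec![2048, 2048]);
}
```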
The gemv kernel dispatcher matched dts[1]==F32 before checking q40_b, selecting kernel_mul_mv_f32_f32 instead of kernel_mul_mv_q4_0_f32 for block-quantized weights (which now carry datum_type=f32). The f32 kernel read quantized bytes as floats, producing NaN/garbage. Fix: check q40_b first in mv_kernel_name_and_dispatch_params.
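The ordering bug can be reduced to a small dispatch function. The kernel name strings are the ones quoted above; the function shape and flags are hypothetical simplifications of mv_kernel_name_and_dispatch_params:

```rust
// Sketch of the dispatch-order fix: with BQ weights now reporting
// datum_type=f32, the q40_b flag must be tested before the generic f32
// match, or the f32 kernel reads quantized bytes as floats.
#[derive(Clone, Copy, PartialEq)]
enum Dt {
    F32,
    F16,
}

fn mv_kernel_old(q40_b: bool, dt_b: Dt) -> &'static str {
    if dt_b == Dt::F32 {
        "kernel_mul_mv_f32_f32" // matched first: the bug
    } else if q40_b {
        "kernel_mul_mv_q4_0_f32"
    } else {
        "kernel_mul_mv_f16_f32"
    }
}

fn mv_kernel_new(q40_b: bool, dt_b: Dt) -> &'static str {
    if q40_b {
        "kernel_mul_mv_q4_0_f32" // quantized check first
    } else if dt_b == Dt::F32 {
        "kernel_mul_mv_f32_f32"
    } else {
        "kernel_mul_mv_f16_f32"
    }
}

fn main() {
    // Block-quantized weights that now carry a logical f32 dt:
    assert_eq!(mv_kernel_old(true, Dt::F32), "kernel_mul_mv_f32_f32"); // garbage path
    assert_eq!(mv_kernel_new(true, Dt::F32), "kernel_mul_mv_q4_0_f32");
}
```

This is the same class of bug as the is_number() sites: once `dt` stops being Opaque, any dt-first match must be re-ordered or guarded.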
… fallible

From<Arc<Tensor>> for TypedFact only reconstructed exotic_fact for BlockQuantStorage, silently dropping it for other exotic storages (packed matrices, device tensors). This is a behavioral regression now that exotic tensors carry a logical dt instead of Opaque.

Fix by adding exotic_fact(shape) to the TensorStorage trait so each storage backend produces its own ExoticFact: PlainStorage returns Ok(None), BlockQuantStorage and PackedMatrixStorage return the appropriate fact, and DeviceTensor bails (it lacks origin info and should never hit this path). Because DeviceTensor can fail, the conversion is now TryFrom instead of From, propagating errors instead of silently losing information.

Also adds debug_assert!(t.is_plain()) to TypedFact::shape_and_dt_of and simplifies TypedModel::add_const, which no longer needs a manual BlockQuantStorage special-case.
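The trait-based dispatch can be sketched as follows. All types here are simplified stand-ins (the real trait and facts are richer), but the per-backend behavior matches the description: plain yields no fact, block-quant yields one, device errors out:

```rust
// Sketch of the trait-based fix: each storage backend reports its own
// ExoticFact, and the conversion becomes fallible so DeviceTensor can refuse.
trait TensorStorage {
    fn exotic_fact(&self, shape: &[usize]) -> Result<Option<String>, String>;
}

struct PlainStorage;
struct BlockQuantStorage;
struct DeviceTensor;

impl TensorStorage for PlainStorage {
    fn exotic_fact(&self, _shape: &[usize]) -> Result<Option<String>, String> {
        Ok(None) // plain tensors never carry an exotic fact
    }
}

impl TensorStorage for BlockQuantStorage {
    fn exotic_fact(&self, shape: &[usize]) -> Result<Option<String>, String> {
        Ok(Some(format!("BlockQuantFact{:?}", shape)))
    }
}

impl TensorStorage for DeviceTensor {
    fn exotic_fact(&self, _shape: &[usize]) -> Result<Option<String>, String> {
        Err("device tensors lack origin info for fact reconstruction".into())
    }
}

// TryFrom-style conversion: errors propagate instead of facts being dropped.
fn exotic_fact_of(storage: &dyn TensorStorage, shape: &[usize])
    -> Result<Option<String>, String>
{
    storage.exotic_fact(shape)
}

fn main() {
    assert_eq!(exotic_fact_of(&PlainStorage, &[4]), Ok(None));
    assert!(exotic_fact_of(&BlockQuantStorage, &[4]).unwrap().is_some());
    assert!(exotic_fact_of(&DeviceTensor, &[4]).is_err());
}
```

Pushing the decision into the trait means a future storage backend cannot be silently forgotten by a match arm in the conversion.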
Force-pushed from 2c0bf85 to 14b026d.
The generic bound was unused in the function body and no longer satisfiable after the TryFrom conversion.
Force-pushed from 58eca31 to 420aefb.