Document CNN architecture compatibility#15
Merged
This commit establishes the foundation for supporting Convolutional Neural Networks (CNNs) through a pluggable architecture pattern while maintaining full backward compatibility with existing ANN/MLP code.

## New Components

### 1. Tensor<T> Class (tensor.h)
- N-dimensional array support (1D, 2D, 3D, 4D, etc.)
- Shape manipulation: reshape, transpose, flatten, squeeze, unsqueeze
- Element-wise operations: +, -, *, / (tensor-tensor and tensor-scalar)
- Interoperability with ml::Mat<T> (fromMat, toMat)
- Memory efficient using std::shared_ptr
- Factory methods: zeros, ones, random, randn
- Statistics: sum, mean, max, min
- Comprehensive test coverage (9/9 tests passing)

### 2. im2col/col2im Utilities (im2col.h)
- Transform convolution into matrix multiplication (industry standard)
- im2col: extract image patches into a column matrix
- col2im: inverse operation for backpropagation
- Supports arbitrary kernel size, stride, and padding
- Batch processing support
- Helper functions for dimension calculation and gradients
- Comprehensive test coverage (9/9 tests passing)

### 3. Documentation
- ARCHITECTURE_DESIGN.md: comprehensive design document (800+ lines)
  * Pluggable architecture pattern explanation
  * Design decisions and rationale
  * CNN vs ANN comparison (mathematical and structural)
  * Implementation phases and roadmap
  * API examples and usage patterns
- CNN_COMPATIBILITY.md: user-facing compatibility guide (600+ lines)
  * Core differences between ANN and CNN
  * Why CNNs for image data (parameter efficiency)
  * Mathematical operations comparison
  * im2col algorithm explanation with examples
  * Migration path for existing code
  * Performance comparison
- IMPLEMENTATION_PROGRESS.md: status tracking
  * Completed components
  * Testing strategy
  * Next steps and timeline
  * Quality metrics

## Design Principles
1. **Dependency Injection for Neural Architectures**
   - Different layer types (Dense, Conv, Pool) as injectable components
   - All conform to the ILayer<T> interface
2. **Backward Compatibility**
   - Tensor<T> added alongside Mat<T> (non-breaking)
   - Existing Layer<T> will become an alias for DenseLayer<T>
   - All existing code continues to work
3. **Open/Closed Principle**
   - Open for extension (new layer types)
   - Closed for modification (core Network unchanged)

## Test Results
All tests passing:
- test_tensor.cpp: 9/9 tests ✓
- test_im2col.cpp: 9/9 tests ✓

## Next Steps
1. Implement Conv2D layer with forward/backward pass
2. Implement MaxPool2D and AvgPool2D layers
3. Refactor Layer<T> to DenseLayer<T>
4. Create CNN MNIST example (LeNet-5)
5. Validate >95% accuracy on MNIST

## Technical Details
- im2col approach: ~2× memory overhead, but simple and maintainable
- Leverages existing Mat<T> operations and OpenMP parallelization
- Compatible with the existing optimizer infrastructure
- Ready for future RNN support (3D tensors)

Estimated time to a working CNN: ~6 hours from this point
Implements a fully functional 2D convolutional layer for CNN support.
## Conv2D Layer (conv_layer.h)
### Features
- Forward pass using im2col for efficient matrix multiplication
- Backward pass with gradient computation for:
* Input gradients (for previous layer)
* Kernel/filter gradients
* Bias gradients
- Multiple activation functions: ReLU, Sigmoid, Tanh, Linear, LeakyReLU, ELU
- Configurable hyperparameters:
* Kernel size (height, width)
* Stride (vertical, horizontal)
* Padding (vertical, horizontal)
- He initialization for ReLU activations
- Xavier initialization for other activations
- Support for:
* Multiple input/output channels
* Batch processing
* Arbitrary input dimensions
### Implementation Details
- Uses im2col to transform convolution into matrix multiplication
- Caches intermediate values for efficient backpropagation
- Gradient updates via simple SGD (can be extended to other optimizers)
- Proper shape tracking and validation
## Testing (test_conv_layer.cpp)
### Test Coverage (11/11 passing)
1. Construction with parameter validation
2. Weight initialization (He/Xavier)
3. Forward pass (basic 2x2 kernel)
4. Forward pass with padding
5. Forward pass with stride > 1
6. Multi-channel input/output
7. Batch processing (batch > 1)
8. Different activation functions
9. **Numerical gradient checking** (relative error < 0.0002!)
10. Weight updates
11. MNIST-like dimensions (28x28 input)
### Gradient Verification
Numerical vs analytical gradients match with relative error: 0.000162
This confirms backpropagation is implemented correctly.
## Example Usage
```cpp
// Create Conv2D layer: 32 filters, 5x5 kernel, ReLU activation
Conv2D<float> conv(32, 5, 5, ActivationType::RELU);
conv.setInputChannels(1); // Grayscale input
conv.init();
// Forward pass: [batch, channels, height, width]
Tensor<float> input({8, 1, 28, 28}); // 8 images, 28x28
auto output = conv.forward(input); // → [8, 32, 24, 24]
// Backward pass
Tensor<float> d_output = ...; // Gradient from next layer
auto d_input = conv.backward(d_output);
// Update weights
conv.updateWeights(0.01); // Learning rate = 0.01
```
## Performance
- Leverages existing OpenMP parallelization in matrix multiplication
- im2col approach is industry-standard (used by Caffe, PyTorch, TensorFlow)
- Memory overhead: ~2× input size (an acceptable trade-off for simplicity and GEMM speed)
## Next Steps
- MaxPool2D and AvgPool2D layers (simpler, no learnable parameters)
- Integration with existing Network<T> class
- CNN MNIST example to validate end-to-end
Implements pooling layers for CNN spatial downsampling.
## Pooling Layers (pooling_layer.h)
### MaxPool2D
- Takes maximum value within each pooling window
- Provides translation invariance
- Reduces spatial dimensions while preserving important features
- Backpropagation routes gradients only to max positions
- Configurable pool size and stride
### AvgPool2D
- Takes average value within each pooling window
- Smoother downsampling compared to max pooling
- Distributes gradients evenly during backpropagation
- Useful for certain architectures
### GlobalAvgPool2D
- Pools over entire spatial dimensions
- Reduces [batch, channels, height, width] → [batch, channels, 1, 1]
- Common alternative to fully-connected layers before classification
- Reduces parameter count in final layers
## Key Features
- No learnable parameters (pooling is a fixed operation)
- Preserves number of channels
- Supports overlapping and non-overlapping windows
- Batch processing support
- Efficient gradient routing during backpropagation
## Testing (test_pooling_layer.cpp)
### Test Coverage (11/11 passing)
1. MaxPool construction
2. MaxPool forward pass (basic)
3. MaxPool backward pass (gradient routing to max positions)
4. MaxPool with overlapping windows
5. MaxPool with multiple channels
6. MaxPool batch processing
7. AvgPool forward pass (basic)
8. AvgPool backward pass (gradient distribution)
9. GlobalAvgPool functionality
10. Channel preservation across pooling types
11. MNIST-like dimensions (24x24 → 12x12)
## Example Usage
```cpp
// Max pooling: 2x2 window, stride=2 (non-overlapping)
MaxPool2D<float> maxpool(2, 2, 2, 2);
Tensor<float> input({8, 32, 24, 24});
auto output = maxpool.forward(input); // → [8, 32, 12, 12]
// Backward pass
Tensor<float> d_output = ...;
auto d_input = maxpool.backward(d_output);
// Average pooling
AvgPool2D<float> avgpool(2, 2);
auto avg_output = avgpool.forward(input);
// Global average pooling (for classification)
GlobalAvgPool2D<float> gap;
Tensor<float> features({8, 512, 7, 7});
auto global_features = gap.forward(features); // → [8, 512, 1, 1]
```
## Design Notes
- MaxPool stores indices of max values for efficient backpropagation
- AvgPool distributes gradients evenly (1/pool_size to each position)
- Both preserve the number of channels (only spatial downsampling)
- GlobalAvgPool is commonly used in modern CNN architectures
## Performance
- Pooling is computationally cheap (no matrix multiplications)
- Max pooling: O(pool_size² × output_size)
- Backward pass is equally efficient
- Minimal memory overhead (no learnable parameters; MaxPool only caches argmax indices)