tuni56/ecommerce-streaming-data-platform


Real-Time Ecommerce Streaming Data Platform

This repository documents the design and implementation of a real-time data streaming platform for an ecommerce use case.

The project focuses on data engineering fundamentals: event-driven architecture, Kafka-based ingestion, observability, and infrastructure decisions that protect data producers at scale.


Problem Statement

Ecommerce platforms generate continuous streams of events:

  • page views
  • cart interactions
  • purchases

These events are:

  • high volume
  • bursty
  • business critical

A key constraint guided this design:

data producers must never break, even as the platform evolves.


Architectural Overview

All events enter the platform through a single, stable endpoint:

events.ecommerce-domain.com

Amazon Route 53 is used as a strategic routing layer to:

  • decouple producers from backend infrastructure
  • enable safe evolution of pipelines
  • support failover and traffic spikes

Behind this entry point, Kafka handles durable ingestion and streaming.

[Diagram: real-time ecommerce pipeline]
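As a rough illustration of how a routing layer can shift traffic between backends without producers noticing, the sketch below builds a Route 53 weighted-record change batch for the stable endpoint. The backend hostnames, set identifiers, and weights are hypothetical and not taken from this repository; the resulting dict has the shape expected by the `ChangeBatch` argument of boto3's `change_resource_record_sets`, but no AWS call is made here.

```python
import json


def weighted_change_batch(record_name, targets, ttl=60):
    """Build a Route 53 ChangeBatch that splits traffic across backends by weight.

    targets maps a set identifier to (backend DNS name, weight).
    All backend names and weights are illustrative assumptions.
    """
    changes = []
    for set_id, (dns_name, weight) in targets.items():
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,  # required for weighted records
                "Weight": weight,         # Route 53 accepts 0 to 255
                "TTL": ttl,
                "ResourceRecords": [{"Value": dns_name}],
            },
        })
    return {"Comment": "Shift ingestion traffic by weight", "Changes": changes}


# Hypothetical rollout: 90% of traffic to the current pipeline, 10% to a new one.
batch = weighted_change_batch(
    "events.ecommerce-domain.com",
    {
        "pipeline-v1": ("ingest-v1.internal.example", 90),
        "pipeline-v2": ("ingest-v2.internal.example", 10),
    },
)
print(json.dumps(batch, indent=2))
```

Producers keep sending to the same hostname throughout; only the weights behind it change, which is what lets a pipeline be replaced gradually.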


Event Types

The platform processes three core event categories:

  • Page Views
    • High volume, append-only
  • Cart Events
    • Bursty traffic, user-driven
  • Purchase Events
    • Low volume, business critical

Each event type is defined using explicit schemas to enforce contracts between producers and consumers.
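To make the idea of an explicit schema concrete, here is a minimal sketch of what a contract for purchase events could look like. The field names and validation rules are assumptions for illustration; the repository's actual schemas may differ in both format and content.

```python
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass(frozen=True)
class PurchaseEvent:
    """Hypothetical purchase event contract (field names are assumptions)."""
    event_id: str
    user_id: str
    order_id: str
    amount_cents: int
    currency: str
    occurred_at: str  # ISO 8601 timestamp, e.g. 2025-12-22T08:08:43+00:00

    def __post_init__(self):
        if not self.event_id:
            raise ValueError("event_id must be non-empty")
        if not isinstance(self.amount_cents, int) or self.amount_cents <= 0:
            raise ValueError("amount_cents must be a positive integer")
        if len(self.currency) != 3:
            raise ValueError("currency must be a 3-letter code")
        # fromisoformat raises ValueError on a malformed timestamp.
        datetime.fromisoformat(self.occurred_at)


def parse_purchase(payload: dict) -> PurchaseEvent:
    """Reject payloads with missing or unexpected fields, then validate values."""
    expected = {f.name for f in fields(PurchaseEvent)}
    received = set(payload)
    if received != expected:
        missing = sorted(expected - received)
        unknown = sorted(received - expected)
        raise ValueError(f"schema mismatch: missing={missing}, unknown={unknown}")
    return PurchaseEvent(**payload)


valid = {
    "event_id": "e-1",
    "user_id": "u-42",
    "order_id": "o-1001",
    "amount_cents": 2599,
    "currency": "USD",
    "occurred_at": "2025-12-22T08:08:43+00:00",
}
event = parse_purchase(valid)
print(event.order_id, event.amount_cents)
```

Rejecting unknown fields is a strict choice; a more lenient contract could ignore them so that producers can add fields before consumers are updated.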


Repository Structure

[Screenshot of the repository structure, captured 2025-12-22 08:08:43]

Kafka Design

  • Producers publish events asynchronously
  • Topics are partitioned based on access patterns
  • Consumers are designed to be idempotent
  • Consumer lag is treated as a first-class metric
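Idempotent consumption matters because Kafka consumers commonly see the same message more than once (for example after a rebalance or a retry). The sketch below shows one simple way to get that property, deduplicating on an event ID. The `event_id` and `amount_cents` fields are assumed names, and the in-memory set stands in for what would need to be a durable store in a real deployment.

```python
class IdempotentPurchaseConsumer:
    """Applies each purchase at most once, keyed by a hypothetical event_id field."""

    def __init__(self):
        self.seen_event_ids = set()  # stand-in for a durable dedup store
        self.revenue_cents = 0

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        event_id = event["event_id"]
        if event_id in self.seen_event_ids:
            return False
        self.revenue_cents += event["amount_cents"]
        self.seen_event_ids.add(event_id)
        return True


consumer = IdempotentPurchaseConsumer()
deliveries = [
    {"event_id": "e1", "amount_cents": 2500},
    {"event_id": "e2", "amount_cents": 1500},
    {"event_id": "e1", "amount_cents": 2500},  # redelivery of the first event
]
applied = [consumer.handle(e) for e in deliveries]
print(applied, consumer.revenue_cents)  # [True, True, False] 4000
```

Without the duplicate check, the redelivered event would inflate revenue to 6500, which is the kind of error that is hard to detect after the fact.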

Kafka configuration reflects traffic patterns and business criticality rather than uniform defaults.
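One way to picture non-uniform configuration is a per-topic plan like the sketch below. The topic names, partition counts, and values are hypothetical and chosen only to show the contrast between high-volume and business-critical topics; `min.insync.replicas` and `retention.ms` are standard Kafka topic settings, and `acks` is a standard producer setting.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000


@dataclass(frozen=True)
class TopicPlan:
    """Hypothetical per-topic settings; all values below are illustrative."""
    name: str
    partitions: int
    replication_factor: int
    min_insync_replicas: int
    retention_ms: int
    producer_acks: str  # "1" favors throughput, "all" favors durability

    def __post_init__(self):
        if self.min_insync_replicas > self.replication_factor:
            raise ValueError("min.insync.replicas cannot exceed replication factor")

    def topic_configs(self) -> dict:
        """Topic-level settings as the string key/value pairs Kafka uses."""
        return {
            "min.insync.replicas": str(self.min_insync_replicas),
            "retention.ms": str(self.retention_ms),
        }


plans = [
    # High volume, append-only: many partitions, short retention, relaxed acks.
    TopicPlan("page-views", 12, 3, 1, 3 * DAY_MS, "1"),
    # Bursty, user-driven: moderate parallelism, stronger durability.
    TopicPlan("cart-events", 6, 3, 2, 7 * DAY_MS, "all"),
    # Low volume, business critical: few partitions, long retention, acks=all.
    TopicPlan("purchases", 3, 3, 2, 30 * DAY_MS, "all"),
]

for plan in plans:
    print(plan.name, plan.partitions, plan.producer_acks, plan.topic_configs())
```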


Observability

Grafana dashboards track:

  • ingestion rate
  • consumer lag
  • processing latency
  • error rates

Observability is used not only for monitoring, but to inform routing and scaling decisions.
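Consumer lag is the gap between the latest offset written to a partition and the offset the consumer group has committed. The sketch below computes it from two offset maps and applies a simple threshold to suggest scaling out. The offsets and the threshold are made-up numbers, and the scaling rule is an assumption for illustration rather than the policy used in this project.

```python
def partition_lag(log_end_offsets: dict, committed_offsets: dict) -> dict:
    """Lag per partition: log end offset minus committed offset, floored at 0.

    A partition with no committed offset is treated as starting from 0,
    which is a simplifying assumption for this sketch.
    """
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in log_end_offsets.items()
    }


def should_scale_out(lag_by_partition: dict, max_total_lag: int) -> bool:
    """Hypothetical rule: add consumers when total lag exceeds a threshold."""
    return sum(lag_by_partition.values()) > max_total_lag


# Illustrative offsets for a three-partition topic.
log_end = {0: 1500, 1: 1200, 2: 900}
committed = {0: 1400, 1: 700, 2: 900}

lag = partition_lag(log_end, committed)
print(lag)                         # {0: 100, 1: 500, 2: 0}
print(should_scale_out(lag, 500))  # True
```

Looking at lag per partition, not just the total, also helps distinguish a slow consumer group from a single hot partition, which calls for a different fix.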


Future Enhancements

  • Traffic simulation for load testing
  • Schema versioning and compatibility checks
  • Infrastructure as Code (Terraform / CloudFormation)
  • Stream processing with Kafka Streams or Flink

Context

This project was built as a hands-on exercise to showcase data engineering skills through realistic system design and documented technical decisions.

