Skip to content

bhavay1998/sql_ecomm_analytics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

121 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SQL: Ecommerce and Web Analytics

The project assumes the role of a eCommerce database analyst for a newly launched eCommerce startup named Maven Fuzzy Factory. The project's primary focus is not only to develop advanced SQL queries, but to thoroughly emphasise the business logic and decision support behind it. This project is based on Advanced SQL: MySQL for Ecommerce & Web Analytics, a Udemy course by John Pauler. However, outside of the problem statement framework, everything in the project is done completely by me.

Living Business Approach

The project is aimed to simulate a realistic environment where business decisions are linked and evolve with time, rather than being a series of disconnected snapshots. By accounting for the passage of time, the project demonstrates how decisions made in early stages shape the company’s overall growth trajectory. The timespan begins from March 2012.

Project Ownership & Disclosure

This project is based on a series of problem statements related to an e-commerce startup's growth and operations. While the high-level business questions and raw datasets were provided as a framework, all SQL architecture, data querying & transformation logic, query building, and output verification were independently developed and executed solely by me. As an extension of the provided framework, I have learned the business context in which high-level business questions have been framed.

About the OLTP Database

The project is based on a highly comprehensive eCommerce database built from scratch by John Pauler, member of the Maven Analytics team. Instead of working on random samples of data, the database is designed to closely mimic real-word databases specific to eCommerce startups and internet retailers that heavily rely on online stores to generate sales. The official link of the dataset can be visited to retrieve the database.

Relational Database Schema

Schema Diagram



The database contains six related tables:

  • orders - contains purchase orders placed by customers; order might contain multiple items
  • order_items - can be linked to the orders table to get information about the number of items contained within a particular order
  • products - can be linked with order_items to get product level information for a placed order
  • order_item_refunds - can be linked to the orders table to get information about refunds made on orders with customer complaints
  • website_sessions - helps identify the source of website traffic (via utm and related parameters) through which orders are being generated
  • website_pageviews - contains information on pages of a website visited by a user; for a specific website session

Objective

The aim of the project is to help the management (i.e. CEO, Marketing Director and Website Manager) grow the eCommerce business and understand how to improve performance. Analysis is performed to optimize the business’ marketing channels, website, and product portfolio. All analyses are performed using the MavenFuzzyFactory database on MySQL Workbench. The project is intended to perform BI analysis by querying the database, instead of working on database warehousing, building and maintenance.

Project Highlights

The highlights mentioned for each section of the project can be verified by looking at the output tables.

Traffic Source Analysis

Assisted the Marketing Director in optimizing paid marketing channels using SQL-based traffic source analysis:

  • Analyzed UTM parameters to identify primary sources of website traffic, finding gsearch non-brand as the dominant paid channel.
  • Evaluated conversion rates against a 4% CVR benchmark to reveal that the paid traffic bids were not justified.
  • Performed weekly trend analysis to measure changes in paid traffic volume following bid adjustments.
  • Identified a significant CVR gap between desktop and mobile users (3.73% v. 0.97% resp.), leading to increased bids on desktop traffic.
  • Post-optimization, desktop sessions and overall paid traffic volume increased over time, shown via weekly trend analysis.

Website Content and Conversion Funnel Analysis

Analyzed on-site user behavior and conversion performance to identify content bottlenecks and optimize the purchase journey:

  • Ranked most-viewed website pages by session volume to understand how users interact with site content, identifying /home as the dominant entry page.
  • Calculated homepage bounce rate (~60%) by isolating landing-page sessions, highlighting poor engagement for paid search traffic.
  • Designed an A/B test for a new paid-search landing page /lander-1, showing a lower bounce rate for /lander-1 vs. /home.
  • Conducted weekly trend analysis to track traffic re-routing from /home to /lander-1, demonstrating a drop in overall bounce rate from ~60% to ~50%.
  • Built a full conversion funnel from landing page to order confirmation, quantifying user drop-off at each step of the purchase journey.
  • Identified key funnel leakage points (/lander-1, /products, and /billing pages) using step-level click-through rates.
  • Performed an A/B test on billing page redesign (/billing vs /billing-2), showing a significant lift in billing-to-order conversion (≈45% → 62%).

Interim Growth Report

Analyzed how the business is doing in the first 8 months of its operations, to help the CEO prepare for the board meeting:

  • gsearch sessions and orders increased steadily month over month, confirming it as the primary growth channel.
  • Rising brand based sessions and orders from gsearch traffic indicate increasing user awareness and intent.
  • desktop traffic generates most sessions and orders, but mobile sessions and orders also increased over time despite bid increases applied only to desktop.
  • Growth in direct-source sessions indicates increasing organic brand recognition where the business doesn't have to pay to attract users.
  • Session to order conversion rate in Nov 2012 is 38% higher than in Mar 2012, alongside 578% session growth and 835% order growth.
  • /lander-1 outperforms /home with a 5% higher entry-page clickthrough rate, within a seven stage conversion funnel.

Channel Portfolio Management

Identified company's channel portfolio to comprise four marketing channels, namely, nonbrand traffic (paid), brand traffic (paid), direct traffic (unpaid) and organic traffic (unpaid). Analyzed channel portfolio to improve the effectiveness of the marketing budget:

  • Within 3 months of launch bsearch nonbrand weekly sessions reached ~40% of gsearch nonbrand volume, validating bsearch as a meaningful secondary paid channel.
  • mobile accounts for 24.5% of gsearch sessions but only 8.6% for bsearch, indicating that bsearch is predominantly desktop driven and should not mirror gsearch device strategy.
  • gsearch outperforms bsearch in terms of CVR on both desktop (4.5% vs 3.8%) and mobile (1.14% vs 0.8%), justifying reduction of bsearch bids to avoid inefficient spend.
  • sessions from brand, direct, and organic channels grew both absolute terms & from ~0.5% to ~7% of nonbrand volume (in % terms), suggesting increased brand awareness and reduced dependency on paid viewer acquisition

About

Project Repository for the Maven Fuzzy Factory Marketing Team.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors