A leading retail company aims to better understand its customers’ shopping behavior to improve sales, customer satisfaction, and long-term loyalty. There have been noticeable changes in purchasing patterns across demographics, product categories, and sales channels (online versus offline). The goal is to uncover factors such as discounts, reviews, seasons, or payment preferences that drive consumer decisions and repeat purchases.
“How can the company leverage consumer shopping data to identify trends, improve customer engagement, and optimize marketing and product strategies?”
- Analyze customer shopping behavior using transactional data.
- Identify key patterns and trends in spending, customer segmentation, and purchase drivers.
- Provide actionable insights to guide strategic business decisions.
- Data Preparation & Modeling (Python): Clean and transform raw datasets.
- Data Analysis (SQL): Organize data into a structured format, simulate business transactions, and extract insights.
- Visualization & Insights (Power BI): Build an interactive dashboard to communicate findings.
- Report & Presentation: Summarize key findings and business recommendations.
- Well-Structured GitHub Repository: Host Python scripts, SQL queries, and Power BI dashboards.
- Rows: 3,900
- Columns: 18
- Key Features:
  - Customer demographics: Age, Gender, Location, Subscription Status.
  - Purchase details: Item Purchased, Category, Purchase Amount, Season, Size, Color.
  - Shopping behavior: Discount Applied, Promo Code Used, Review Rating, Frequency of Purchases, Shipping Type.
- Missing Data: 37 missing values, all in the "Review Rating" column.
- Data Cleaning:
  - Imputed missing values for "Review Rating" using the median rating by product category.
  - Renamed columns to snake_case for consistency.
  - Engineered new features such as age groups and purchase frequency.
- Integration: Loaded cleaned data into a PostgreSQL database for structured SQL analysis.
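The cleaning steps above can be sketched with pandas roughly as follows. This is a minimal illustration on a tiny made-up sample, not the project's actual script: the sample values, the `to_snake_case` helper, and the age-group bin edges are all assumptions chosen for the example.

```python
import re

import pandas as pd

# Tiny made-up sample standing in for the raw dataset (illustrative values only).
raw = pd.DataFrame({
    "Age": [22, 37, 45, 63, 29],
    "Category": ["Clothing", "Clothing", "Clothing", "Footwear", "Footwear"],
    "Purchase Amount": [53, 64, 73, 90, 49],
    "Review Rating": [3.0, None, 4.0, 2.5, None],
})


def to_snake_case(name: str) -> str:
    """Hypothetical helper: lowercase a name and join its words with underscores."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()


# Rename every column to snake_case.
df = raw.rename(columns=to_snake_case)

# Fill each missing rating with the median rating of its product category.
category_median = df.groupby("category")["review_rating"].transform("median")
df["review_rating"] = df["review_rating"].fillna(category_median)

# Bucket ages into groups; the bin edges and labels are assumptions.
df["age_group"] = pd.cut(
    df["age"],
    bins=[17, 30, 45, 60, 120],
    labels=["18-30", "31-45", "46-60", "61+"],
)

print(df)
```

One way to load the cleaned frame into PostgreSQL afterwards is pandas' `DataFrame.to_sql` with a SQLAlchemy engine, although the connection details depend on your setup.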
Using SQL, we answered critical business questions:
- Revenue by Gender: Compared total revenue generated by male vs. female customers.
- Top Products and Categories: Identified the highest-rated and most-purchased products in each category.
- Customer Segments:
  - Segmented customers into "New," "Returning," and "Loyal."
  - Observed key spending patterns by segment.
- Impact of Discounts:
  - Identified the products whose sales depend most on discounts.
  - Identified discount users who generated above-average revenue.
- Effect of Subscription Status: Measured spend and revenue differences between subscribers and non-subscribers.
- Shipping Choices: Compared purchasing behavior of customers using standard vs. express shipping.
- Demographic Insights: Analyzed revenue contribution by customer age groups.
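A few of the questions above can be sketched as SQL queries. To keep the example self-contained, it runs them from Python against an in-memory SQLite database rather than PostgreSQL. The table name `customer`, the `previous_purchases` column, the sample rows, and the segment thresholds (at most 1 previous purchase is "New", 2 to 10 is "Returning", more than 10 is "Loyal") are assumptions for illustration, not the project's actual schema or rules.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch needs no server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id        INTEGER PRIMARY KEY,
        gender             TEXT,
        purchase_amount    REAL,
        previous_purchases INTEGER,  -- hypothetical column used for segmentation
        discount_applied   TEXT
    )
""")
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Male", 50.0, 0, "Yes"),
        (2, "Female", 80.0, 5, "No"),
        (3, "Male", 30.0, 12, "Yes"),
        (4, "Female", 90.0, 1, "Yes"),
        (5, "Male", 70.0, 20, "No"),
    ],
)

# Revenue by gender.
revenue_by_gender = conn.execute("""
    SELECT gender, SUM(purchase_amount) AS revenue
    FROM customer
    GROUP BY gender
    ORDER BY gender
""").fetchall()

# Customer segments; the thresholds are assumptions, not the project's rules.
segments = conn.execute("""
    SELECT CASE
               WHEN previous_purchases <= 1  THEN 'New'
               WHEN previous_purchases <= 10 THEN 'Returning'
               ELSE 'Loyal'
           END AS segment,
           COUNT(*) AS customers,
           AVG(purchase_amount) AS avg_spend
    FROM customer
    GROUP BY segment
    ORDER BY segment
""").fetchall()

# Discount users whose purchase amount exceeds the overall average.
high_value_discount_users = conn.execute("""
    SELECT customer_id, purchase_amount
    FROM customer
    WHERE discount_applied = 'Yes'
      AND purchase_amount > (SELECT AVG(purchase_amount) FROM customer)
    ORDER BY customer_id
""").fetchall()

print(revenue_by_gender)
print(segments)
print(high_value_discount_users)
```

The queries use only standard `CASE`, `GROUP BY`, and a scalar subquery, so they should carry over to PostgreSQL with little or no change once the table and column names match the real schema.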
- Designed an interactive dashboard to provide stakeholders with key insights at a glance.
- Highlighted customer segments, revenue breakdowns, and actionable trends.
- High Revenue Segments: Certain age groups and express-shipping users contribute significantly to revenue.
- Discount Efficiency: While discounts drive sales, they must be balanced against profit margins.
- Customer Loyalty: Loyal customers and subscribers spend significantly more over time.
- Boost Subscriptions: Enhance subscription benefits to drive loyalty.
- Reward Repeat Buyers: Develop loyalty programs to drive repeat purchases.
- Optimize Discount Strategies: Balance sales boosts with margin protection.
- Product Campaigns: Focus marketing on top-rated and top-selling products.
- Targeted Marketing:
  - Prioritize high-revenue age groups.
  - Incentivize express-shipping users for additional purchases.
- Python:
  - Libraries: Pandas, NumPy, Matplotlib, Seaborn.
  - Use Case: Data cleaning, feature engineering, preliminary analysis.
- SQL:
  - Platform: PostgreSQL.
  - Use Case: Business transaction simulations, structured data analysis.
- Power BI:
  - Use Case: Creating interactive dashboards that visualize key trends and insights.
- Python 3.8+
- PostgreSQL
- Power BI Desktop
- Clone this repository.
- Follow instructions in the Python scripts for data preparation.
- Execute SQL queries using your preferred SQL client.
- Open the Power BI dashboard for an interactive visualization experience.