Zepto is India’s fastest-growing e-grocery company.

Business Problem:

- Batch Data Pipelines:

Zepto's initial data pipelines ran in batch mode, which introduced significant data lag. The delay reduced the timeliness and accuracy of key performance indicators (KPIs) and hindered real-time decision-making and operational efficiency.

- Real-time ETA and Demand-Supply Metrics:

Zepto struggled to identify and incorporate the right driver metrics for calculating the estimated time of arrival (ETA) and managing demand-supply dynamics in real time. Without timely and precise metrics, delivery coordination and resource allocation suffered.

- Packer Tracking:

Zepto had difficulty tracking and monitoring the movement of packers within the warehouse. The lack of real-time visibility into packer activities impeded efficient task allocation, coordination, and performance evaluation, leading to potential delays in order fulfillment.

- Real-time Inventory Management:

Zepto could not manage warehouse inventory in real time, which compromised product availability for customer orders. It was hard to maintain optimal stock levels, track inventory movement, and ensure seamless order fulfillment.

To address these challenges, we helped Zepto build a real-time data pipeline. Moving from batch processing to real-time processing reduces data lag, enables faster insights, and improves operational efficiency. Real-time ETA and demand-supply metrics support precise delivery coordination, and real-time tracking provides visibility into packer activities. Together, these improvements give Zepto a better customer experience and a competitive edge in the dynamic e-grocery market.

Solution:


- Identify Required Datasets and Tables:

Conduct a thorough analysis to determine all the datasets and tables necessary for calculating the required metrics.

- Setup CDC-Based Replication with Debezium Connectors:

Use Debezium connectors to set up Change Data Capture (CDC)-based replication from the identified datasets, capturing and replicating data changes in real time.
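As an illustration of the two steps above, the following minimal sketch derives a table list from the metrics and builds a Debezium source connector configuration. It assumes the source database is PostgreSQL, which the case study does not state. The metric names, table names, host, user, and slot name are all hypothetical placeholders; only the configuration keys are standard Debezium PostgreSQL connector properties.

```python
import json

# Hypothetical mapping from each metric to the source tables it needs.
# Metric and table names are illustrative, not Zepto's actual schema.
METRIC_TABLES = {
    "driver_eta": ["public.orders", "public.driver_locations"],
    "packer_activity": ["public.packer_tasks"],
    "inventory_levels": ["public.inventory", "public.orders"],
}


def build_debezium_config(metric_tables, host, dbname, topic_prefix):
    """Return a Debezium PostgreSQL source connector config as a dict."""
    # De-duplicate tables shared by several metrics and keep a stable order.
    tables = sorted({t for ts in metric_tables.values() for t in ts})
    return {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",  # logical decoding plugin built into PostgreSQL 10+
        "database.hostname": host,
        "database.port": "5432",
        "database.user": "cdc_user",                    # placeholder
        "database.password": "<from-secret-store>",     # placeholder
        "database.dbname": dbname,
        "topic.prefix": topic_prefix,  # Debezium 2.x; older releases use database.server.name
        "table.include.list": ",".join(tables),
        "slot.name": "cdc_slot",                        # placeholder
    }


if __name__ == "__main__":
    config = build_debezium_config(
        METRIC_TABLES, "db.example.internal", "orders_db", "zepto"
    )
    print(json.dumps(config, indent=2))
```

Restricting `table.include.list` to the tables the metrics actually need keeps the change stream small, which is the practical payoff of the dataset analysis step.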

- Leverage MSK Connect for Improved Scalability and Management:

Deploy the Debezium connector on Amazon MSK Connect, the managed Kafka Connect service of Amazon Managed Streaming for Apache Kafka (MSK).

Utilize MSK Connect to enhance scalability and streamline the management of the replication process.
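A deployment of this kind could be scripted roughly as follows. This sketch only assembles the request for the MSK Connect `CreateConnector` API and does not call AWS. The ARNs, subnets, bootstrap servers, worker counts, and scaling thresholds are hypothetical placeholders, and the Kafka Connect version is an assumption that should be checked against what the target account supports.

```python
def build_msk_connect_request(name, connector_config, bootstrap_servers,
                              subnets, security_groups, plugin_arn, role_arn,
                              kafka_connect_version="2.7.1"):
    """Assemble keyword arguments for the MSK Connect CreateConnector call."""
    return {
        "connectorName": name,
        # MSK Connect expects a flat map of string keys to string values.
        "connectorConfiguration": {k: str(v) for k, v in connector_config.items()},
        "kafkaConnectVersion": kafka_connect_version,  # assumption, verify before use
        "capacity": {
            # Autoscaling lets the service add workers as change volume grows.
            "autoScaling": {
                "mcuCount": 1,
                "minWorkerCount": 1,
                "maxWorkerCount": 4,
                "scaleInPolicy": {"cpuUtilizationPercentage": 20},
                "scaleOutPolicy": {"cpuUtilizationPercentage": 80},
            }
        },
        "kafkaCluster": {
            "apacheKafkaCluster": {
                "bootstrapServers": bootstrap_servers,
                "vpc": {"subnets": subnets, "securityGroups": security_groups},
            }
        },
        "kafkaClusterClientAuthentication": {"authenticationType": "IAM"},
        "kafkaClusterEncryptionInTransit": {"encryptionType": "TLS"},
        # The Debezium plugin must already be registered as a custom plugin.
        "plugins": [{"customPlugin": {"customPluginArn": plugin_arn, "revision": 1}}],
        "serviceExecutionRoleArn": role_arn,
    }


if __name__ == "__main__":
    request = build_msk_connect_request(
        name="debezium-source",
        connector_config={"tasks.max": 1},
        bootstrap_servers="b-1.example.kafka.amazonaws.com:9098",
        subnets=["subnet-aaa", "subnet-bbb"],
        security_groups=["sg-ccc"],
        plugin_arn="arn:aws:kafkaconnect:region:account:custom-plugin/example",
        role_arn="arn:aws:iam::account:role/example",
    )
    # With AWS credentials configured, this could be submitted with
    # boto3.client("kafkaconnect").create_connector(**request)
    print(sorted(request))
```

Keeping the request builder separate from the API call makes the capacity and network settings easy to review before anything is provisioned.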

- Complete CDC-Based Replication using MSK Kafka:

Run the CDC-based replication end to end on Amazon MSK, the managed Kafka service from AWS, which provides the reliable transport for the change events.

- Deploy PostgreSQL-Based Datamart:

Set up a datamart using PostgreSQL as the underlying database technology.

Utilize JDBC sink connectors to facilitate the data transfer from Kafka to the PostgreSQL-based datamart.
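A sink configuration for this step might look like the sketch below. It assumes the Confluent JDBC sink connector and Debezium's `ExtractNewRecordState` transform, neither of which the case study names explicitly. The topic names, JDBC URL, credentials, and primary key field are hypothetical placeholders.

```python
def build_jdbc_sink_config(topics, jdbc_url, pk_field="id"):
    """Return a JDBC sink connector config that upserts CDC rows into PostgreSQL."""
    return {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "topics": ",".join(topics),
        "connection.url": jdbc_url,
        "connection.user": "datamart_writer",           # placeholder
        "connection.password": "<from-secret-store>",   # placeholder
        # Upsert by primary key so replayed events do not create duplicates.
        "insert.mode": "upsert",
        "pk.mode": "record_key",
        "pk.fields": pk_field,
        # Treat tombstone records as deletes (requires pk.mode=record_key).
        "delete.enabled": "true",
        "auto.create": "true",
        "auto.evolve": "true",
        "transforms": "unwrap,route",
        # Flatten the Debezium envelope to the row's "after" state.
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
        # Keep tombstones so deletes reach the sink; this property name
        # varies across Debezium versions, so check the version in use.
        "transforms.unwrap.drop.tombstones": "false",
        # Map topics like "prefix.schema.table" to a table named "table".
        "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
        "transforms.route.regex": r"([^.]+)\.([^.]+)\.([^.]+)",
        "transforms.route.replacement": "$3",
    }


if __name__ == "__main__":
    sink = build_jdbc_sink_config(
        ["zepto.public.orders", "zepto.public.inventory"],
        "jdbc:postgresql://datamart.example.internal:5432/datamart",
    )
    for key, value in sink.items():
        print(f"{key}={value}")
```

Upsert mode keyed on the record key makes the sink idempotent, so a connector restart that replays recent events leaves the datamart unchanged.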

This approach replicates data changes from the identified datasets efficiently using CDC with Debezium connectors. Running the connectors on MSK Connect, backed by Amazon MSK, improves scalability and simplifies management. The PostgreSQL-based datamart acts as a centralized repository for the replicated data, enabling effective analysis.
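To make the replication semantics concrete, here is a small in-memory sketch of how a stream of Debezium-style change events keeps a replica table in sync. The `op`, `before`, and `after` fields follow the Debezium event envelope; the inventory rows and the `id` key are hypothetical sample data, and a real pipeline would apply these changes in the datamart rather than in a Python dict.

```python
def apply_change_event(table, event, pk="id"):
    """Apply one Debezium-style change event to an in-memory replica."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, update: upsert the new row state
        row = event["after"]
        table[row[pk]] = row
    elif op == "d":
        # delete: remove the row identified by its previous state
        table.pop(event["before"][pk], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


def replay(events, pk="id"):
    """Build a replica by applying events in order."""
    table = {}
    for event in events:
        apply_change_event(table, event, pk)
    return table


# Hypothetical sample events for an inventory table.
SAMPLE_EVENTS = [
    {"op": "r", "before": None, "after": {"id": 1, "sku": "milk", "qty": 10}},
    {"op": "c", "before": None, "after": {"id": 2, "sku": "bread", "qty": 5}},
    {"op": "u", "before": {"id": 1, "sku": "milk", "qty": 10},
     "after": {"id": 1, "sku": "milk", "qty": 7}},
    {"op": "d", "before": {"id": 2, "sku": "bread", "qty": 5}, "after": None},
]

if __name__ == "__main__":
    print(replay(SAMPLE_EVENTS))
```

Because each event carries the full row state, the replica converges to the source as long as events for a given key are applied in order, which Kafka preserves within a partition.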

Business Impact:

- On-Time Shipping:

Immediate access to inventory data allows prompt replenishment of products from the warehouse, so customer demand is met without delays.

Enhanced visibility into inventory levels enables timely decision-making, minimizing stockouts and improving customer satisfaction.

- Increased Visibility into Drivers' Key Performance Indicators (KPIs):

Real-time data analysis enables the identification of available drivers who can efficiently meet the demand for timely product delivery.

Accurate monitoring of drivers' KPIs facilitates effective resource allocation and helps optimize delivery schedules to improve overall operational efficiency.

- Improved Packer Availability:

Near real-time dashboards provide packers with a clear view of their tasks and timelines, making it easier for them to deliver products to stores on time.

Streamlined packer movements result in reduced delivery delays, ensuring efficient order fulfillment and customer satisfaction.

- Near Real-Time Inventory Management:

Real-time updates of inventory details across all stores enable field managers to maintain optimal stock levels at each location.

Timely inventory insights empower managers to make data-driven decisions, preventing stockouts and ensuring that stores have an adequate supply of products.

The real-time data platform brought significant benefits to Zepto. On-time shipping improved customer satisfaction and loyalty. Enhanced visibility into drivers' KPIs optimized resource allocation and streamlined delivery. This case study demonstrates the power of real-time data analysis in driving business success.

Industry:

E-commerce

Outcome:

Accelerating Decision-Making by 4x