01
Overview
ShopMy, a fast-growing American eCommerce platform, faced significant challenges with fragmented data sources, legacy infrastructure limitations, and scalability bottlenecks. It operated multiple CRM systems, including Salesforce and HubSpot, alongside several custom payment processors, and struggled to consolidate insights because of inconsistent data formats, manual reporting processes, and infrastructure constraints.
To address these limitations and unlock the full potential of its operational data, Solverix designed and implemented a modern, cloud-native data warehousing solution on Google Cloud Platform, leveraging BigQuery for scalable storage and Apache Airflow for orchestrated ETL workflows. The goal: build a unified, secure, and analytics-ready data ecosystem capable of supporting real-time insights and long-term growth.
02
Solution Approach
Stakeholder Workshops: Engaged Sales, Finance, IT, and Ops teams to map out critical reporting needs and pain points.
Data Discovery & Gap Analysis: Cataloged all data sources including Salesforce, HubSpot, custom SQL databases, and third-party APIs. Identified major inconsistencies in data types and quality.
ETL Pipeline Development: Deployed Apache Airflow to automate data ingestion from CRMs and payment processors. Implemented DAGs for granular task orchestration and monitoring.
Data Transformation: Used Python (pandas) for normalization, cleaning, and deduplication. Ensured consistency in formats such as timestamps and naming conventions.
Warehouse Architecture: Built a star schema in BigQuery with optimized partitioning and clustering for efficient querying.
Data Quality & Security: Set up validation scripts with automated Slack alerts, and enforced column-level security using Google IAM for compliance with data governance policies.
Change Management: Delivered training sessions and internal documentation to ease the transition from monthly static reports to on-demand analytics.
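As an illustration of the Data Transformation step, here is a minimal pandas sketch of normalizing two CRM extracts to a shared naming convention, standardizing timestamps to UTC, and deduplicating. The column names, sample values, and the `normalize` helper are hypothetical and are not taken from the ShopMy pipeline.

```python
import pandas as pd

# Hypothetical raw extracts; column names and values are illustrative only.
salesforce = pd.DataFrame({
    "Email": ["Ann@Example.com ", "bob@example.com"],
    "CreatedDate": ["2024-01-05T10:00:00Z", "2024-01-06T12:30:00Z"],
})
hubspot = pd.DataFrame({
    "email_address": ["ann@example.com", "cara@example.com"],
    "created_at": ["2024-01-07T08:15:00+00:00", "2024-01-08T09:00:00+00:00"],
})


def normalize(df, column_map, source):
    """Rename columns to a shared convention and standardize value formats."""
    out = df.rename(columns=column_map)[list(column_map.values())].copy()
    # Trim whitespace and lowercase emails so the same contact compares equal.
    out["email"] = out["email"].str.strip().str.lower()
    # Parse ISO 8601 strings into timezone-aware UTC timestamps.
    out["created_at"] = pd.to_datetime(out["created_at"], utc=True)
    out["source"] = source
    return out


contacts = pd.concat(
    [
        normalize(salesforce, {"Email": "email", "CreatedDate": "created_at"}, "salesforce"),
        normalize(hubspot, {"email_address": "email", "created_at": "created_at"}, "hubspot"),
    ],
    ignore_index=True,
)

# Deduplicate on email, keeping the earliest record for each contact.
contacts = (
    contacts.sort_values("created_at")
    .drop_duplicates(subset="email", keep="first")
    .reset_index(drop=True)
)
print(contacts)
```

Keeping the earliest record per email is only one possible deduplication rule; a real pipeline might instead prefer a particular source system or the most recently updated record.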
03
Technologies Used
A brief overview of the technology stack implemented to ensure scalability, performance, and data integrity.
Google BigQuery – Scalable, serverless data warehousing
Apache Airflow – ETL pipeline orchestration and scheduling
Google Cloud Storage (GCS) – Staging area for raw and transformed data
Python (pandas) – Data cleansing and transformation
Salesforce & HubSpot APIs – CRM data ingestion
Custom SQL Databases – Integration with ERP modules
Google IAM & Column-Level Security – Access control and compliance
04
Key Outcomes
The main results and improvements achieved through Solverix's technical solutions.
80% faster report generation – Reduced analytics turnaround from days to on-demand (e.g., campaign analysis is now available instantly).
70% increase in data accuracy – Due to schema normalization and automated quality checks.
30% lower infrastructure costs – Achieved via serverless scaling and query optimization in BigQuery.
Improved governance & compliance – Role-based access and PII protection met internal audit standards.
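The automated quality checks credited above can be pictured with a small sketch. It assumes a hypothetical order batch with `order_id`, `amount`, and `created_at` fields, and it only builds the alert message; a `{"text": ...}` JSON body is the form Slack incoming webhooks accept, but the actual POST to a webhook URL is left out. None of the rules or names below come from the ShopMy validation scripts.

```python
import json
from datetime import datetime

REQUIRED_FIELDS = ("order_id", "amount", "created_at")  # hypothetical schema


def validate_rows(rows):
    """Return a list of human-readable problems found in a batch of rows."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Completeness: every required field must be present and non-empty.
        for field in REQUIRED_FIELDS:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: missing {field}")
        # Uniqueness: an order_id may appear only once per batch.
        order_id = row.get("order_id")
        if order_id in seen_ids:
            problems.append(f"row {i}: duplicate order_id {order_id}")
        elif order_id not in (None, ""):
            seen_ids.add(order_id)
        # Range: amounts must not be negative.
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            problems.append(f"row {i}: negative amount {amount}")
        # Format: timestamps must parse as ISO 8601.
        created = row.get("created_at")
        if isinstance(created, str) and created:
            try:
                datetime.fromisoformat(created)
            except ValueError:
                problems.append(f"row {i}: bad timestamp {created!r}")
    return problems


def build_alert(table, problems):
    """Build a JSON message body listing at most the first ten problems."""
    lines = [f"Data quality check failed for {table}: {len(problems)} issue(s)"]
    lines += [f"- {p}" for p in problems[:10]]
    return json.dumps({"text": "\n".join(lines)})


rows = [
    {"order_id": "A1", "amount": 25.0, "created_at": "2024-03-01T09:00:00"},
    {"order_id": "A1", "amount": 10.0, "created_at": "2024-03-01T09:05:00"},
    {"order_id": "A2", "amount": -5.0, "created_at": "not-a-date"},
    {"order_id": "A3", "amount": None, "created_at": "2024-03-01T09:10:00"},
]
problems = validate_rows(rows)
if problems:
    print(build_alert("orders", problems))
```

Separating validation from alert construction keeps the rules testable without any network access; the same list of problems could also fail an orchestration task so that bad batches never reach the warehouse.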
Let's talk!
Office:
Kompleksi Square 21, Ap28
Tirana
Albania