From Manual Mapping to 7 Million SKU Mappings in Under an Hour

We built a custom merchandising engine that turned a manual, error-prone SKU mapping process into an automated system capable of generating 7 million+ mappings in under an hour.

SKU mappings generated per bulk run
7 M+
Full catalog processing time
< 1 hr
Products mapped automatically
80,000+

Real-Time

Incremental updates via event-driven processing

Zero

Manual mapping required for routine catalog updates

Overview

Rethinking merchandising data at the scale of millions of SKUs

A leading luxury furniture company with a catalog spanning 80,000+ products and millions of SKUs needed to fundamentally rethink how it managed product merchandising data across its ecommerce operations.

The company’s ecommerce merchandising had been built on a legacy platform (Oracle ATG) that served them well when the catalog was smaller and the business rules were simpler. But as product lines expanded, the cracks in the system had become impossible to ignore.


How one product becomes thousands of SKUs

Sofa model × sizes × leg finishes × fabric options = SKU variations

× 80,000 products = millions of relationships

The challenges

Cracks in a system built for a simpler era

The problems surfaced gradually, in different parts of the business, often looking like isolated issues before anyone connected the dots.

01   Delayed collection launches

New collections consistently launched late, and nobody could pinpoint a single cause. Every new product required its SKUs to be mapped to options, color codes, and swatches by hand, and each mapping had to be repeated across multiple environments. A seasonal launch of thousands of SKU combinations could take weeks, meaning misaligned campaigns, wasted spend, and shrinking revenue windows.

02   The catalog trust problem

Products were appearing on the storefront with incorrect options, missing swatches, or broken configurations. A sofa displaying fabric choices that weren't actually available. A table missing a finish that existed in the catalog. These weren't dramatic failures, but subtle inaccuracies like these erode customer confidence in a brand that trades on precision and curation. With millions of SKU relationships managed manually, even a fractional error rate translated into thousands of incorrect listings.

03   Merchandising hours lost to repetitive tasks

Web merchants spent the vast majority of their time on manual data entry: mapping SKUs, verifying options, replicating work across environments, troubleshooting inconsistencies. The more the catalog grew, the less time they had for work that actually required human judgment, such as strategy, presentation, and promotional planning.

04   Nobody had the full picture

There was no practical way for merchants, product managers, or leadership to query the catalog at scale. Straightforward questions went unanswered: How many SKUs in this collection have complete mappings? Which color codes are missing swatch assignments? What would happen if a specific business rule changed?

The solution

A purpose-built merchandising automation engine

RBMSoft’s approach was not to patch the existing system or layer automation on top of legacy infrastructure. We built a merchandising automation engine from the ground up, designed specifically for the scale, complexity, and business rules of the client’s catalog.

The architecture and capability choices were driven by a set of strategic priorities that came directly from the problems the business was experiencing:
BUSINESS PRIORITY | DESIGN PRINCIPLE
Eliminate launch delays | Automate the full SKU mapping lifecycle, from EPIM data ingestion to storefront-ready output, so the system does in minutes what used to take weeks.
Restore catalog integrity | Build validation and error detection into the core process, not as an afterthought. Catch errors before deployment, not after.
Free the merchandising team | Remove manual data entry from the workflow entirely. Let merchants define rules; let the system execute them.
Enable merchandising intelligence | Provide real-time query and analysis capabilities across the full catalog so the team can make data-driven decisions at scale.

What we built

Five capabilities that replaced weeks of manual work

1

Automated mapping at catalog scale

The engine processes the full EPIM data set and generates the complete web of relationships that make a product shoppable: item numbers to options, color codes to swatches, swatches to valid option combinations, and the business rules governing which configurations are permissible. In bulk mode it generates over 7 million mappings in under an hour. The rules engine is merchant-configurable, so the team can adjust logic and re-trigger automation without engineering involvement.
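As a rough illustration of the idea, the sketch below expands one product into candidate configurations and keeps only those that pass every rule. It is a minimal sketch under stated assumptions, not the engine's actual implementation: the `Mapping` type, the `generate_mappings` function, the sample rule, and all product data are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Mapping:
    """One shoppable configuration of a product (hypothetical shape)."""
    item_number: str
    size: str
    finish: str
    fabric: str


# A rule returns True when a configuration is permissible.
Rule = Callable[[Mapping], bool]


def generate_mappings(
    item_number: str,
    sizes: Iterable[str],
    finishes: Iterable[str],
    fabrics: Iterable[str],
    rules: Iterable[Rule],
) -> Iterator[Mapping]:
    """Yield every size/finish/fabric combination that passes all rules."""
    rules = list(rules)
    for size, finish, fabric in product(sizes, finishes, fabrics):
        candidate = Mapping(item_number, size, finish, fabric)
        if all(rule(candidate) for rule in rules):
            yield candidate


# Hypothetical business rule: leather is not offered on sectionals.
def no_leather_on_sectional(m: Mapping) -> bool:
    return not (m.fabric == "leather" and m.size == "sectional")


mappings = list(
    generate_mappings(
        "SOFA-001",
        sizes=["loveseat", "standard", "sectional"],
        finishes=["oak", "walnut"],
        fabrics=["linen", "velvet", "leather"],
        rules=[no_leather_on_sectional],
    )
)
# 3 sizes x 2 finishes x 3 fabrics = 18 candidates; the rule excludes 2.
print(len(mappings))  # 16
```

Expressing rules as plain predicates keeps the generation loop unchanged when a rule is added or removed, which mirrors the idea of adjusting logic and re-running automation.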

2

Real-time processing with event-driven architecture

Bulk processing handles seasonal launches and catalog-wide updates; day to day, product data changes constantly. The engine handles both modes natively. Apache Kafka powers real-time event streaming, so when data changes in any upstream source, affected mappings are processed incrementally with no batch window to wait for. Amazon SQS provides horizontal scalability, flexing capacity during high-load launches without over-provisioning in quieter periods. Supporting both natively means the business never has to choose between speed and completeness.
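The incremental path can be pictured as follows. This is a simplified sketch that uses an in-memory queue as a stand-in for the event stream, so it does not use any Kafka or SQS client API; the `dependents` index, the event shape, and the `remap_product` placeholder are assumptions made for illustration.

```python
from collections import defaultdict, deque

# Hypothetical reverse index: which products depend on which upstream record.
dependents = defaultdict(set)
dependents["color:BLU-07"].update({"SOFA-001", "CHAIR-014"})
dependents["color:GRY-02"].update({"SOFA-001"})

remapped = []  # records which products were reprocessed, for illustration


def remap_product(item_number: str) -> None:
    """Placeholder for regenerating one product's mappings."""
    remapped.append(item_number)


def handle_event(event: dict) -> set:
    """Reprocess only the products affected by a single change event."""
    affected = dependents.get(event["key"], set())
    for item_number in sorted(affected):
        remap_product(item_number)
    return affected


# In-memory stand-in for the stream of upstream change events.
events = deque(
    [
        {"key": "color:GRY-02", "change": "swatch_updated"},
        {"key": "color:UNKNOWN", "change": "swatch_updated"},
    ]
)
while events:
    handle_event(events.popleft())

print(remapped)  # ['SOFA-001']
```

Only the product that depends on the changed color code is reprocessed; the event with no known dependents is a no-op, which is the property that removes the need for a batch window.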

3

Built-in error detection and reporting

Error detection is treated as a core system capability, not a reporting feature. The engine validates every mapping against business rules and data-completeness requirements during generation itself. Missing data, broken relationships, and rule violations are flagged before they reach any deployment environment. When issues are found, merchants get structured reports telling them exactly what's wrong and which products are affected. They fix the source data and re-trigger automation for that subset, with no developers involved.
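A structured report of this kind might look like the sketch below. The issue codes, the `Issue` type, and the mapping fields are hypothetical, and the two checks shown (missing color code, missing swatch) are only examples of the completeness rules described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    item_number: str
    code: str
    detail: str


def validate(mappings: list, swatches: set) -> list:
    """Check each mapping for completeness and return the issues found."""
    issues = []
    for m in mappings:
        color_code = m.get("color_code")
        if not color_code:
            issues.append(
                Issue(m["item_number"], "MISSING_COLOR_CODE", "mapping has no color code")
            )
        elif color_code not in swatches:
            issues.append(
                Issue(m["item_number"], "MISSING_SWATCH", f"no swatch for color code {color_code}")
            )
    return issues


def build_report(issues: list) -> dict:
    """Group issues by product so a merchant can see what to fix where."""
    report = {}
    for issue in issues:
        report.setdefault(issue.item_number, []).append(f"{issue.code}: {issue.detail}")
    return report


# Hypothetical sample data.
mappings = [
    {"item_number": "SOFA-001", "color_code": "BLU-07"},
    {"item_number": "SOFA-001", "color_code": "GRY-02"},
    {"item_number": "TABLE-009", "color_code": None},
]
swatches = {"BLU-07"}

report = build_report(validate(mappings, swatches))
for item_number, problems in report.items():
    print(item_number, problems)
```

The keys of the report double as the subset of products to re-run once the source data has been corrected.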

4

Deployment governance and visual validation

Standardized workflows move validated data through authoring, preview, and production with consistent automation at each stage, eliminating the duplicated effort of replicating work across environments. A preview tool lets merchants see exactly how a product will appear on the storefront (SKUs, swatches, colorized images, and options) before it goes live, catching what rules alone can't.
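One way to think about this staged flow is as a small promotion gate. The sketch below is an assumption-laden illustration: the three stage names come from the text, but the `promote` function, the batch fields, and the merchant-approval check are hypothetical.

```python
STAGES = ["authoring", "preview", "production"]


def promote(batch: dict) -> dict:
    """Return a copy of the batch advanced to the next stage, if allowed."""
    if batch["open_issues"] > 0:
        raise ValueError("batch has unresolved validation issues")
    if batch["stage"] == "preview" and not batch.get("merchant_approved"):
        raise ValueError("preview has not been approved by a merchant")
    position = STAGES.index(batch["stage"])
    if position == len(STAGES) - 1:
        raise ValueError("batch is already in production")
    return {**batch, "stage": STAGES[position + 1]}


# Hypothetical batch moving through the stages.
batch = {
    "id": "spring-launch",
    "stage": "authoring",
    "open_issues": 0,
    "merchant_approved": False,
}
in_preview = promote(batch)
live = promote({**in_preview, "merchant_approved": True})
print(in_preview["stage"], live["stage"])  # preview production
```

Because the same function handles every transition, each environment receives data through an identical path, and the approval flag stands in for the visual check a merchant performs in preview.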

5

Merchandising intelligence layer

To close the visibility gap, we built a real-time analytics layer on ClickHouse, fed via Debezium change data capture and Apache Flink stream processing, plus a purpose-built query builder. Merchants can search across millions of SKUs, filter by any combination of attributes, analyze mapping completeness by collection or category, and model the impact of rule changes before applying them. The client described the problem as "we can't see what's in our catalog." What we built was the ability to not just see it, but interrogate it, moving the team from data entry to data-driven decision-making.
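To make "mapping completeness by collection" concrete, here is a small in-memory sketch of that kind of question. It does not reflect the actual ClickHouse schema or the query builder's API; the record fields, the definition of "complete", and the sample data are all assumptions.

```python
from collections import Counter


def completeness_by_collection(skus: list) -> dict:
    """Return the fraction of SKUs with complete mappings per collection."""
    total = Counter()
    complete = Counter()
    for sku in skus:
        total[sku["collection"]] += 1
        # Assumed definition: a SKU is complete with a swatch and at least one option.
        if sku["swatch"] is not None and sku["options"]:
            complete[sku["collection"]] += 1
    return {name: complete[name] / total[name] for name in total}


# Hypothetical sample records.
skus = [
    {"collection": "Harbor", "swatch": "BLU-07", "options": ["oak"]},
    {"collection": "Harbor", "swatch": "GRY-02", "options": ["oak"]},
    {"collection": "Harbor", "swatch": "GRN-11", "options": ["walnut"]},
    {"collection": "Harbor", "swatch": None, "options": ["walnut"]},
    {"collection": "Alder", "swatch": "BLU-07", "options": ["oak"]},
    {"collection": "Alder", "swatch": "BLU-07", "options": []},
]

print(completeness_by_collection(skus))  # {'Harbor': 0.75, 'Alder': 0.5}
```

In a columnar store the same question would typically be a grouped aggregate over the SKU table; the point here is only the shape of the answer a merchant gets back.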

How the engine works

A dual-mode, event-driven pipeline

The outcome

A merchandising operation transformed

What we delivered was a fundamental change in how this company’s merchandising operation works. SKUs that once took days to map now flow through the system in real time. Data-quality issues are caught before deployment, not after. The merchandising team, previously consumed by manual data work, now operates as a strategic function using real-time catalog intelligence.

The system scales with the business. Catalog growth, seasonal peaks, and new product lines are absorbed without performance degradation.

Because the architecture is modular and event-driven, new capabilities and business rules can be added without re-architecting the core platform.

BEFORE vs AFTER

AREA | BEFORE | AFTER
SKU Mapping | Manual, spreadsheet-driven | Fully automated with configurable business rules
Processing Time | Weeks for seasonal launches | Under 1 hour for full catalog
Error Detection | Post-deployment, often by customers and internal users | Pre-deployment, automated validation with merchant reporting
Environment Sync | Manual replication across each environment | Standardized automated workflows across all stages
Catalog Querying | Not possible at scale | Real-time query builder across millions of SKUs
Product Preview | Only in deployed preview environments | Built-in visual preview before deployment
Team Focus | Data entry and troubleshooting | Merchandising strategy and decision-making

Ready to rethink merchandising at scale?

Let’s talk about turning a manual, error-prone process into an automated engine built for the complexity of your catalog and the growth ahead.
