Duplicating data is bad. But we still did it anyway. Here’s why.

At Voltade, we’re doing what others believe is impossible - making customized software scalable. Today, we'd like to present a little design challenge that represents just a small piece of the larger puzzles we solve daily.

The Challenge

We’re building a platform for a machinery company where the operations team can add both individual products and packages (bundles of products) to their orders. These packages consist of complementary items, and here's the twist: users can customize these packages by adding or removing products. Think of adding a set menu to your cart but, you can add, remove or update each item in this set.

The Question: How would you structure the database to effectively track each customer's order, considering this level of customization?

Take a moment to think about it. What tables would you create? How would you manage the relationships between products, packages, and orders to allow for such flexibility?

Our Approach at Voltade

Here's how we tackled this problem:

  1. Products Table

    We start with a Products table—a comprehensive list of all available items, complete with prices and stock levels.

  1. Packages and PackageProducts Tables

    Next, we introduce Packages and PackageProducts tables. The Packages table lists all predefined bundles, while PackageProducts maps individual products to these packages via product IDs.

  1. Orders and OrderItems Tables

    To track customer purchases, we create Orders and OrderItems tables. When a customer adds a package to their order, we duplicate the relevant package and product data into the OrderItems table. This duplication allows customers to customize package contents—adding or removing products—without affecting the original Packages or Products tables.

Customizability

Now, you may want to change the name of the package in your order, or even adjust the price if you are part of an internal team. With this design, you can do that without affecting original package data. Simply create a new OrderItem with a packageId but no productId, to identify it as a parent package item.

// using drizzle 
import { z } from 'zod';
import { drizzle } from '../middlewares/drizzle.ts'
import { factory } from '../factory.ts'
import { nanoid } from 'lib/nanoid.ts'

const createOrderSchema = z
  .object({
    details: z.object({
      customerId: z.string().min(1),
    }),
    selectedProducts: z
      .object({
        id: z.string(),
        quantity: z.number(),
      })
      .array(),
    selectedPackages: z
      .object({
        id: z.string(),
        quantity: z.number(),
      })
      .array(),
  })
  .refine(
    ({ selectedPackages, selectedProducts }) =>
      selectedPackages.length > 0 || selectedProducts.length > 0,
    { message: 'Order cannot be empty ' },
  );
export const route = factory
  .createApp()
  .post(
    '/',
    zValidator('json', createOrderSchema),
    drizzle(),
    async (c) => {
      const { details, selectedPackages, selectedProducts } =
        c.req.valid('json');
      const { tx } = c.var;

      const [createdOrder] = await tx
        .insert(orders)
        .values({
          ...details,
        })
        .returning();

      if (!createdOrder) throw new HTTPException(400);

      const selectedProductIds = selectedProducts.map((p) => p.id);
      const productsData = await tx
        .select({
          id: products.id,
          name: products.name,
          price: products.pricerice,
        })
        .from(products)
        .where(inArray(products.id, selectedProductIds));

      if (productsData.length !== selectedProductIds.length) {
        throw new HTTPException(400); // Incorrect product IDs passed
      }

      const productById = Object.fromEntries(
        productsData.map((pd) => [pd.id, pd]),
      );

      const orderItemsToInsert: (typeof orderItems.$inferInsert)[] = [];

      orderItemsToInsert.push(
        ...selectedProducts.map((p) => ({
          orderId: createdOrder.id,
          name: productById[p.id]!.name,
          quantity: p.quantity.toString(),
          price: (productById[p.id]!.price ?? 0).toString(),
        })),
      );

      const packageIds = selectedPackages.map((p) => p.id);

      const packagesData = await tx
        .select({
          id: packages.id,
          name: packages.name,
        })
        .from(packages)
        .where(inArray(packages.id, packageIds));

      if (packagesData.length !== selectedPackages.length) {
        throw new HTTPException(400); // Incorrect package IDs passed
      }

      const packageById = Object.fromEntries(
        packagesData.map((p) => [p.id, p]),
      );

      const packageProductsData = await tx
        .select({
          packageId: packageProducts.packageId,
          productId: packageProducts.productId,
          quantity: packageProducts.quantity,
          name: products.name,
          price: products.price,
        })
        .from(packageProducts)
        .innerJoin(products, eq(packageProducts.productId, products.id))
        .where(inArray(packageProducts.packageId, packageIds));

      for (const selectedPackage of selectedPackages) {
        const { id, type, quantity } = selectedPackage;
        for (let i = 0; i < quantity; i++) {
          const parentOrderItemId = nanoid();
          const parentOrderItem: typeof orderItems.$inferInsert = {
            orderId: createdOrder.id,
            id: parentOrderItemId,
            packageId: id,
            name: packageById[id]!.name,
            price: '0',
            quantity: '1',
          };

          const subOrderItems: (typeof orderItems.$inferInsert)[] =
            packageProductsData
              .filter((p) => p.packageId === id)
              ?.map((p) => ({
                orderId: createdOrder.id,
                parentOrderItemId: parentOrderItemId,
                packageId: p.packageId,
                productId: p.productId,
                name: p.name,
                quantity: p.quantity.toString(),
                price: p.price ?? '0',
              }));

          orderItemsToInsert.push(parentOrderItem);
          orderItemsToInsert.push(...subOrderItems);
        }
      }

      const insertedOrderItems = await tx
        .insert(orderItems)
        .values(orderItemsToInsert)
        .returning();

      return c.json({
        details: createdOrder,
        items: insertedOrderItems,
      });
    },
  )

Exploring the Complexity

At first glance, this might seem like a straightforward solution. However, allowing customers to modify package contents introduces significant complexity:

  • Data Integrity: We need to ensure that changes to an order don't impact the original product and package data.
  • Performance: Duplicating data can lead to redundancy, affecting database performance if not managed correctly.
  • Scalability: The system must handle a growing number of products, packages, and customizations without degrading performance.

Why We Chose Denormalization

Our solution employs denormalization—intentionally duplicating data to optimize for specific use cases. While this introduces redundancy, it provides the necessary flexibility for customers to customize their orders freely.

The Benefits

  • Customer Flexibility: Users can tailor packages to their preferences, enhancing satisfaction.
  • Operational Efficiency: Simplifies processes like applying discounts or recording invoice line items on customized packages.
  • Data Stability: Keeps the core product and package data consistent across the platform.

Join Our Team of Problem Solvers

This brain teaser is just a glimpse into the complex problems we solve at Voltade. If you're an engineer who enjoys tackling challenging puzzles and developing innovative solutions, we'd love to hear from you.