Race condition in Event-based orders sync.

What is the purpose of this article?

This article explains the race condition we encounter daily while pulling shipment events, the other issues affecting event updates in Algolia, and the steps taken towards short-term and long-term solutions.

How does Event-based sync function?

Every update to a shipment is first logged in the database via the WMS module. These updates are stored in the shipment_status_logs table of the eshopbox_wms_production database.

Clickpost also sends us the updates it receives from the courier partner; these are logged in the database via the WMS module as well.

Once logged, the WMS module publishes the events to a PUBSUB topic.

In the Client-Portal module, pull-based subscribers are attached to this PUBSUB topic. They pull events from the topic in batches until all events have been acknowledged.

These pulled batches are then published to another PUBSUB topic in the eshopbox-client-portal-prod project, where the events are converted into Algolia objects and the corresponding records are updated in Algolia.

What is the race condition in Event-based orders sync?

The WMS module publishes events to PUBSUB in sequence. When the subscriber pulls them, however, the order in which they were published is not preserved.

As a result, events of the same shipment can land in the same batch or in different batches altogether.

The problem arises when two or more events of the same shipment are pulled out of order: the current (latest) status of the shipment is overwritten by an earlier shipment update.

For example, if “intransit” and “delivered” are published to the topic in that order, the two events can be pulled in the following ways:

  1. Both events are pulled in the same batch, but “intransit” is pulled after “delivered”.

  2. The events are pulled in separate batches: “delivered” in an earlier batch and “intransit” in a later one.

In both cases, this race condition leaves the shipment in Algolia with a status that is older than its actual latest status.

Here, intransit was pulled in a separate batch after delivered status.
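The overwrite can be reproduced with a minimal Python sketch. It only models the behaviour described above: the in-memory `index` dict stands in for Algolia, and the event field names are assumptions made for illustration, not the actual event schema.

```python
# Minimal model of the race: every pulled event overwrites the stored record.
# The dict-based "index" and the field names are illustrative assumptions.

def apply_last_write_wins(index, events):
    """Apply events in the order they were pulled; the last write wins."""
    for event in events:
        index[event["externalShipmentID"]] = event
    return index


intransit = {"externalShipmentID": "ESBM959652", "status": "intransit"}
delivered = {"externalShipmentID": "ESBM959652", "status": "delivered"}

# Published order: intransit, then delivered.
# Pulled order:    delivered, then intransit.
index = apply_last_write_wins({}, [delivered, intransit])

print(index["ESBM959652"]["status"])  # prints "intransit" (stale status)
```

The same result occurs whether the two events arrive in one batch or in two consecutive batches, because nothing in the update compares the incoming event with what is already stored.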

How can we resolve the race condition?

There are three solutions (short-term and long-term), which are as follows:

  1. SHORT-TERM solution:

    1. To handle situation 1, when both events are pulled in the same batch, the following steps are required:

      1. In the shipment events, send the “shipment_status_logs.id” of that status under the “latestStatusLogId” key.

      2. Pull the events from the WMS topic.

      3. When processing a batch, run a selection function before passing the events to the transformation function.

      4. The selection function iterates through all the shipment events in the batch and builds a map from “shipments.externalShipmentID” to “latestStatusLogId”.

      5. If shipments.externalShipmentID is not present in the map, save the key and value pair in the map.

      6. If shipments.externalShipmentID already exists in the map, check whether the event's latestStatusLogId is greater than the value present in the map. If so, replace the value with the current latestStatusLogId; otherwise do nothing and continue.
        At the same time, maintain a separate map with “shipments.externalShipmentID” as the key and the shipment event as the value. Whenever a greater latestStatusLogId is found, replace the stored event with the current event.

      7. Once the iteration is over, we are left with only the event of the latest status log for each shipment in the batch.
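The selection function described in the steps above could look roughly like the following Python sketch. It assumes each event is a flat dict carrying `externalShipmentID` and `latestStatusLogId` keys; the real event payload may be nested differently, and the function name is hypothetical.

```python
# Sketch of the in-batch selection step. Event shape and function name are
# assumptions for illustration.

def select_latest_events(batch):
    """Keep, per shipment, only the event with the greatest latestStatusLogId."""
    latest_by_shipment = {}
    for event in batch:
        shipment_id = event["externalShipmentID"]
        current = latest_by_shipment.get(shipment_id)
        # First event seen for this shipment, or a newer status log: keep it.
        if current is None or event["latestStatusLogId"] > current["latestStatusLogId"]:
            latest_by_shipment[shipment_id] = event
    return list(latest_by_shipment.values())


batch = [
    {"externalShipmentID": "ESBM959652", "status": "delivered", "latestStatusLogId": 102},
    {"externalShipmentID": "ESBM959652", "status": "intransit", "latestStatusLogId": 101},
    {"externalShipmentID": "BLCK168190-271-76", "status": "intransit", "latestStatusLogId": 55},
]
selected = select_latest_events(batch)
```

This sketch uses a single map whose value is the event itself. Since the event already carries its latestStatusLogId, this is equivalent to keeping the two maps described in step 6.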

The above solution works well for events that are pulled in the same batch. To handle events that are pulled in separate batches, we will follow a two-tier plan of action.

 

  1. Plan 1.

    1. When processing a batch of shipment events, query the database using the externalShipmentID of each event and build a map from externalShipmentID to the latest shipment_status_logs.id of that shipment.

      SELECT MAX(shipment_status_logs.id) AS latestStatusLogId, shipments.externalShipmentID
      FROM shipments
      LEFT JOIN shipment_status_logs ON shipments.id = shipment_status_logs.shipment_id
      WHERE shipments.externalShipmentID IN ('ESBM959652', 'BLCK168190-271-76')
      GROUP BY shipments.id, shipments.externalShipmentID;
    2. Keep only the events whose “latestStatusLogId” matches the latest shipment_status_logs.id returned from the database.

    3. Only these filtered events will be processed further.
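A rough Python sketch of Plan 1 follows. It uses an in-memory SQLite database purely as a stand-in for eshopbox_wms_production so that the example is self-contained. The table layout is reduced to the columns used in the query, and the function names and flat event shape are assumptions.

```python
import sqlite3

# Sketch of Plan 1: compare each event against the latest status log in the
# database. SQLite is a stand-in; function names and event shape are assumed.

def fetch_latest_log_ids(conn, external_ids):
    """Return {externalShipmentID: MAX(shipment_status_logs.id)}."""
    ids = sorted(set(external_ids))
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    query = (
        "SELECT shipments.externalShipmentID, MAX(shipment_status_logs.id) "
        "FROM shipments "
        "LEFT JOIN shipment_status_logs "
        "ON shipments.id = shipment_status_logs.shipment_id "
        f"WHERE shipments.externalShipmentID IN ({placeholders}) "
        "GROUP BY shipments.id, shipments.externalShipmentID"
    )
    return dict(conn.execute(query, ids).fetchall())


def filter_current_events(events, latest_log_ids):
    """Keep only events whose latestStatusLogId is the latest one in the database."""
    return [
        event for event in events
        if latest_log_ids.get(event["externalShipmentID"]) == event["latestStatusLogId"]
    ]


# Stand-in data: one shipment with two status logs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipments (id INTEGER PRIMARY KEY, externalShipmentID TEXT);
    CREATE TABLE shipment_status_logs (
        id INTEGER PRIMARY KEY, shipment_id INTEGER, status TEXT
    );
    INSERT INTO shipments VALUES (1, 'ESBM959652');
    INSERT INTO shipment_status_logs VALUES (101, 1, 'intransit');
    INSERT INTO shipment_status_logs VALUES (102, 1, 'delivered');
""")

# A late batch that contains only the stale "intransit" event.
late_batch = [
    {"externalShipmentID": "ESBM959652", "status": "intransit", "latestStatusLogId": 101},
]
latest_log_ids = fetch_latest_log_ids(conn, [e["externalShipmentID"] for e in late_batch])
to_process = filter_current_events(late_batch, latest_log_ids)
```

In this example the stale event is dropped because the database already holds a newer status log (102) for the shipment, so `to_process` ends up empty. The IN clause is built with bound parameters instead of string concatenation to avoid quoting problems with shipment IDs.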

  2. Plan 2 (LONG-TERM solution).

    1. Algolia's partial updates support a PartialUpdateOperation that applies the update only when the integer value passed for the attribute is greater than the value already present in the record on Algolia.

    2. In the events sync, we will pass latestStatusLogId through this PartialUpdateOperation, so that if an update carries an older status log id, Algolia itself will not update the record.
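As an illustration of Plan 2, the Python sketch below builds the payload for such a guarded update. It assumes that the operation meant here is Algolia's built-in `IncrementSet` partial-update operation, which, per Algolia's documentation, ignores the whole object update when the provided value is not greater than the stored one. It also assumes that externalShipmentID is used as the objectID and that events are flat dicts; both are guesses to be checked against the actual index. The call to the Algolia client is left out on purpose, since the method name depends on the client and its version.

```python
# Sketch of a guarded partial-update payload. Assumptions: IncrementSet is the
# intended operation, objectID is the externalShipmentID, events are flat dicts.

def build_guarded_update(event):
    """Build a partial-update object that Algolia applies only for newer log ids."""
    return {
        "objectID": event["externalShipmentID"],
        "status": event["status"],
        # Built-in operation: the update takes effect only if "value" is
        # greater than the latestStatusLogId currently stored in the record.
        "latestStatusLogId": {
            "_operation": "IncrementSet",
            "value": event["latestStatusLogId"],
        },
    }


payload = build_guarded_update(
    {"externalShipmentID": "ESBM959652", "status": "delivered", "latestStatusLogId": 102}
)
```

With this approach the ordering check moves into Algolia itself, so it holds regardless of which batch an event arrives in and needs no database lookup per batch.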
