About the Problem
When an event's handlers take too long to run (for example, 800ms), the event is retried after a certain amount of time, and possibly more than once if the execution is slow again. I believe this happens because the service that triggers the event has a timeout and keeps retrying until the handlers respond in less time than the configured timeout.
How to Reproduce
- Clone the events-example repository.
- Make the event handler take longer than 800ms (you can use a setTimeout).
- Check whether retries are occurring by watching the terminal.
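For reference, this is roughly how the delay can be simulated. It is only a minimal sketch: `sleep`, `slowHandler`, and the invocation counter are hypothetical names, and the real handler signature in events-example may differ.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms))
}

// Counts invocations so that retries show up in the terminal.
let invocations = 0

// Hypothetical handler body; the default delay is above the 800ms mentioned earlier.
async function slowHandler(delayMs: number = 1000): Promise<number> {
  invocations += 1
  console.log(`handler invoked ${invocations} time(s)`)
  await sleep(delayMs)
  return invocations
}
```

If the log line appears more than once for a single emitted event, a retry has occurred.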
What Is My Scenario?
I am emitting N events to process products in batches (50 products per batch). The events are as follows:
Event 1: Retrieve the product IDs from a specific category, then emit and send the data to Event 2.
Event 2: Create specifications and specification values, then emit and send the data to Event 3.
Event 3: Update the product specifications, then emit and send the data to Event 4.
Event 4: Save some information to Master Data, then emit and send the data to Event 1.
The data passed between Events 1, 2, and 3 consists of the product IDs for the current iteration. In Event 4, the next page is sent to continue the event chain.
So the event chain looks like this:
Event 1 => Event 2 => Event 3 => Event 4 => Event 1 => …
The chain finishes when the page number exceeds the last page (total products / 50).
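The stop condition can be sketched as follows. This assumes pages are 1-indexed and that a partial final batch still counts as a page (so the division is rounded up); `lastPage` and `nextPage` are hypothetical names, not part of any VTEX API.

```typescript
const BATCH_SIZE = 50 // products per batch, as described above

// Last page number, assuming 1-indexed pages and rounding up for a partial batch.
function lastPage(totalProducts: number): number {
  return Math.ceil(totalProducts / BATCH_SIZE)
}

// Page that Event 4 would send to Event 1, or null when the chain should stop.
function nextPage(currentPage: number, totalProducts: number): number | null {
  const next = currentPage + 1
  return next > lastPage(totalProducts) ? null : next
}
```

For example, with 120 products there are 3 pages, so Event 4 on page 3 would not emit Event 1 again.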
Goals
- Get more information on how events work in the VTEX IO backend.
- Get more information on what factors can trigger retries — timeouts, errors, HTTP error codes, etc.
- Get recommendations on how to handle events that may take longer than expected or that involve a complex event chain, and on how to explicitly avoid retries (perhaps with a middleware?).
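To illustrate the kind of middleware I have in mind, here is a minimal sketch of a guard that skips an event it has already seen. It does not prevent retries; it only keeps a retried event from being processed twice. Everything here is an assumption: `skipDuplicates` and `keyOf` are hypothetical names, the key would have to come from the event payload (for example, the page number), and I do not know whether VTEX IO exposes a unique event ID.

```typescript
type Handler<E> = (event: E) => Promise<void>

// Wraps a handler so that an event whose key was already seen is skipped.
function skipDuplicates<E>(
  keyOf: (event: E) => string,
  handler: Handler<E>
): Handler<E> {
  // In-memory only: lost on restart and not shared across service instances.
  const seen = new Set<string>()

  return async (event: E): Promise<void> => {
    const key = keyOf(event)
    if (seen.has(key)) {
      return // duplicate (for example, a retry of a slow event): do nothing
    }
    seen.add(key)
    try {
      await handler(event)
    } catch (err) {
      seen.delete(key) // let a later retry reprocess after a real failure
      throw err
    }
  }
}
```

Because the key is recorded before the handler runs, a retry that arrives while the first execution is still in progress is skipped as well.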
Suggestion
It would be great if you could create some tutorials or advanced guides on the events feature.