Jelou - Workflow Intermittency – Incident details

All systems operational

Workflow Intermittency

Resolved
Major outage
Started 9 days ago
Lasted about 1 hour

Affected

Channels

Partial outage from 4:26 PM to 5:26 PM

WhatsApp

Partial outage from 4:26 PM to 5:26 PM

Facebook Messenger

Partial outage from 4:26 PM to 5:26 PM

Instagram Direct Messages

Partial outage from 4:26 PM to 5:26 PM

Web Widget

Partial outage from 4:26 PM to 5:26 PM

Updates
  • Postmortem

    Incident Summary

    On May 19, 2026, an automated process to harden the infrastructure and strengthen its controls temporarily disrupted some platform services. The intervention began at approximately 11:10 (UTC-5). The infrastructure team applied a fix at around 11:30, monitored service behavior until 11:50, and confirmed full service stabilization at approximately 12:07.

    Impact

    Approximately one hour elapsed between the application of the change and full service stabilization. Within that window, certain services related to workflows and integrations experienced intermittency and availability errors for approximately 30 minutes while the team applied the fix and validated recovery. The impact was limited to temporary service unavailability: there was no data loss, no security compromise, and no unauthorized access.
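    As a quick check of the "approximately one hour" figure, the phases can be derived from the timestamps reported in this postmortem. This is only an illustrative sketch: the event labels and helper names are made up for the example, and the times are the approximate values stated above.

```python
from datetime import datetime

# Approximate timestamps (UTC-5) as reported in this postmortem.
# The labels are illustrative, not official incident states.
TIMELINE = [
    ("intervention began", "11:10"),
    ("fix applied", "11:30"),
    ("monitoring ended", "11:50"),
    ("stabilization confirmed", "12:07"),
]


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def phase_durations(events):
    """Pair each event with the next one and compute the gap in minutes."""
    return [
        (f"{name_a} -> {name_b}", minutes_between(time_a, time_b))
        for (name_a, time_a), (name_b, time_b) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    phases = phase_durations(TIMELINE)
    for label, minutes in phases:
        print(f"{label}: {minutes} min")
    print("total:", sum(minutes for _, minutes in phases), "min")
```

    With the reported times, the three phases come to 20, 20, and 17 minutes, for a total of 57 minutes.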

    Detection

    The service Health Status controls flagged the behavior. These controls send notifications via Slack and email when they detect a disruption, which allowed the infrastructure team to begin reviewing the event immediately.
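    To illustrate the kind of logic a health-status control like this might apply, here is a minimal sketch. It is not the actual Health Status implementation: the `HealthTracker` class, the threshold of three consecutive failures, and the message formats are all assumptions made for the example. Delivery to Slack or email is left out; the sketch only decides when a notification would be emitted.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks consecutive failed probes for one service (hypothetical model)."""

    service: str
    failure_threshold: int = 3  # assumed value, not taken from the incident report
    consecutive_failures: int = 0
    alerted: bool = False

    def record(self, probe_ok: bool) -> list[str]:
        """Record one probe result and return the notifications to send."""
        if probe_ok:
            messages = []
            if self.alerted:
                # Only announce recovery if an alert was previously raised.
                messages.append(f"RECOVERED: {self.service} is healthy again")
            self.consecutive_failures = 0
            self.alerted = False
            return messages

        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold and not self.alerted:
            # Alert once per disruption instead of on every failed probe.
            self.alerted = True
            return [
                f"ALERT: {self.service} failed "
                f"{self.consecutive_failures} consecutive health checks"
            ]
        return []
```

    Requiring several consecutive failures before alerting is one common way to avoid paging on a single transient error; the right threshold depends on probe frequency and how quickly a disruption must be detected.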

    Response

    Once the behavior was identified, the infrastructure team analyzed the affected services and components and determined the origin of the event. The team applied the fix at 11:30, carried out validations and post-fix monitoring until 11:50, and confirmed full platform stabilization at 12:07.

    Root Cause

    During the automated hardening and controls-strengthening process, an internal connectivity dependency between platform services reacted in an unexpected way. This dependency was not documented in sufficient detail in the configuration inventory. As a result, communication between components degraded temporarily and the affected services were transiently unavailable during the stabilization period.

    Resolution

    As part of the continuous improvement of controls already operated by the engineering team, the following mechanisms are being reinforced:

      • Validation controls applied before and after each infrastructure change, with more thorough connectivity tests incorporated into the standard intervention protocol.
      • Monitoring and early alerting mechanisms already in use, with expanded scope and faster escalation to on-call teams to further shorten detection and response times.
      • The internal connectivity dependency inventory maintained by the team, with a greater level of detail so that each future change is systematically cross-referenced against it before implementation.
      • The incremental change protocol already in place, in particular the practice of creating and validating new specific configurations before retiring previous ones.

    These measures enhance the operational maturity of the platform, reduce the risk of recurrence, and reinforce the resilience with which the team already manages infrastructure maintenance and strengthening activities. We sincerely apologize for any inconvenience caused and thank you for your understanding.
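    The practice of cross-referencing each change against the connectivity dependency inventory could be sketched as follows. This is a simplified illustration under stated assumptions, not the team's actual tooling: the `Dependency` record, the `review_change` function, and the service and resource names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """One inventory entry (hypothetical schema)."""

    source: str    # service that initiates the connection
    target: str    # service it connects to
    resource: str  # configuration item the connection relies on


def review_change(inventory, changed_resources):
    """Cross-reference a planned change against the dependency inventory.

    Returns the dependencies that need connectivity tests before and after
    the change, plus any changed resources missing from the inventory.
    """
    changed = set(changed_resources)
    known = {dep.resource for dep in inventory}

    affected = [dep for dep in inventory if dep.resource in changed]
    uninventoried = sorted(changed - known)

    return {
        "affected": affected,
        "uninventoried": uninventoried,
        # Assumed policy: hold the change until every touched resource
        # is documented in the inventory.
        "ready": not uninventoried,
    }


if __name__ == "__main__":
    # Hypothetical inventory entries, for illustration only.
    inventory = [
        Dependency("workflow-engine", "integration-gateway", "rule-a"),
        Dependency("web-widget", "workflow-engine", "rule-b"),
    ]
    review = review_change(inventory, ["rule-a", "rule-z"])
    for dep in review["affected"]:
        print(f"test connectivity: {dep.source} -> {dep.target}")
    print("missing from inventory:", review["uninventoried"])
    print("ready to apply:", review["ready"])
```

    In this sketch, a change that touches a resource absent from the inventory is held back, which reflects the idea that gaps in the inventory should be closed before a change is applied rather than discovered afterwards.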

  • Resolved
    This incident has been resolved.
  • Monitoring
    We implemented a fix and are currently monitoring the result.
  • Identified

    We are working on a fix for this incident.

  • Investigating

    We are currently investigating an issue that may be affecting agent performance. Our technical team is actively reviewing the situation and will provide an update within the next 30 minutes. We apologize for any inconvenience this may cause.