Jelou - Notice History

All systems operational

May 2026

WhatsApp Messaging Service – Increased Latency & Delayed Delivery
  • Postmortem

    Incident Summary

    An overload in the bot system limited its ability to process incoming messages in a timely manner. As the resources supporting the bot became saturated, wait times grew excessively and end users did not receive automated responses during the event.

    Impact

    The impact was limited to the interruption of service flows that depend on real-time message processing and user responses. Static or independent flows continued to run normally.

    Detection

    The incident was identified through automatic performance alerts and high latency in bot responses, in addition to reports from the support team and customers experiencing failures in digital signature flows. This confirmed service degradation in the processing of interactive messages.

    Response

    Once the issue was identified, the engineering team diagnosed the root cause and deployed a fix in under 30 minutes from escalation, increasing the performance and capacity of the resources supporting the bot to normalize the service.

    Root Cause

    The incident was triggered by two customers sending mass campaigns at the same time. This generated a sudden, concentrated spike in activity that outpaced the automatic scaling mechanisms: the system could not add capacity quickly enough to contain the load, and resources were temporarily saturated.

    Resolution

    As a definitive resolution, scaling controls were adjusted and the base performance capacity of the bot infrastructure was increased. This adjustment ensures the system has the necessary headroom to absorb mass events and sudden activity spikes without degrading service.
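
    The effect of base capacity on a spike that arrives faster than scaling can react can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the platform's actual scaling logic: the function name, the tick-based model, and every number below are hypothetical.

```python
def simulate_backlog(arrivals, base_capacity, scale_delay, scale_step, max_capacity):
    """Return the message backlog after each tick (hypothetical model).

    arrivals      -- messages arriving in each tick
    base_capacity -- messages processed per tick before any scale-up
    scale_delay   -- ticks between requesting a scale-up and it taking effect
    scale_step    -- capacity added by each scale-up
    max_capacity  -- upper bound on capacity
    """
    capacity = base_capacity
    backlog = 0
    pending = []   # ticks at which requested scale-ups become effective
    history = []
    for tick, incoming in enumerate(arrivals):
        # Apply scale-ups whose delay has elapsed.
        ready = [t for t in pending if t <= tick]
        pending = [t for t in pending if t > tick]
        capacity = min(max_capacity, capacity + scale_step * len(ready))
        # Process what we can; the remainder queues up.
        backlog = max(0, backlog + incoming - capacity)
        # Request more capacity only when saturated and nothing is in flight.
        if backlog > 0 and not pending and capacity < max_capacity:
            pending.append(tick + scale_delay)
        history.append(backlog)
    return history


# Two campaigns landing in the same ticks (illustrative numbers only).
spike = [100, 100, 500, 500, 500, 100, 100, 100]
low_base = simulate_backlog(spike, base_capacity=150, scale_delay=3,
                            scale_step=200, max_capacity=1000)
high_base = simulate_backlog(spike, base_capacity=600, scale_delay=3,
                             scale_step=200, max_capacity=1000)
print("peak backlog, low base capacity: ", max(low_base))
print("peak backlog, high base capacity:", max(high_base))
```

    In this toy model, the lower base capacity lets a backlog build up during the scaling delay, while the higher base capacity absorbs the same spike without queuing.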

  • Resolved
    This incident has been resolved.
  • Monitoring
    We implemented a fix and are currently monitoring the result.
  • Investigating

    Responses to WhatsApp messages are currently delayed. We are investigating this incident.

Workflow Intermittency
  • Postmortem

    Incident Summary

    On May 19, 2026, during an automated process to harden the infrastructure and strengthen its controls, a temporary disruption occurred in some platform services. The intervention began at approximately 11:10 (UTC-5). The infrastructure team applied the corresponding fix at around 11:30, monitored behavior until 11:50, and confirmed full service stabilization at approximately 12:07.

    Impact

    The total period between the application of the change and full service stabilization was approximately one hour. During that interval, certain services related to workflows and integrations experienced intermittency and availability errors for approximately 30 minutes, while the team applied the fix and validated recovery. The impact was limited to temporary service availability. There was no data loss. There was no security compromise or unauthorized access.
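
    The elapsed times can be checked against the timestamps reported in the summary. The sketch below simply does that arithmetic; the timestamps are the approximate ones from this report, and the labels and helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# The report gives all times in UTC-5.
UTC_MINUS_5 = timezone(timedelta(hours=-5))


def at(hour, minute):
    """Build a timestamp on the incident date (May 19, 2026)."""
    return datetime(2026, 5, 19, hour, minute, tzinfo=UTC_MINUS_5)


# Approximate milestones as stated in the incident summary.
timeline = [
    ("intervention began", at(11, 10)),
    ("fix applied", at(11, 30)),
    ("post-fix monitoring ended", at(11, 50)),
    ("stabilization confirmed", at(12, 7)),
]


def minutes_between(start, end):
    """Whole minutes elapsed between two timestamps."""
    return int((end - start).total_seconds() // 60)


start = timeline[0][1]
for label, moment in timeline[1:]:
    print(f"{label}: +{minutes_between(start, moment)} min")
```

    The last milestone falls 57 minutes after the start of the intervention, which matches the "approximately one hour" stated above.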

    Detection

    The behavior was flagged by the service Health Status controls, which notify via Slack and email upon detection of disruptions, allowing the infrastructure team to immediately begin reviewing the event.

    Response

    Once the behavior was identified, the infrastructure team analyzed the affected services and components, determined the origin of the event, applied the fix at 11:30, carried out validations and post-fix monitoring until 11:50, and confirmed full platform stabilization at 12:07.

    Root Cause

    During an automated process to harden the infrastructure and strengthen its controls, an internal connectivity dependency between platform services reacted in an unexpected way. The dependency was not documented in sufficient detail in the configuration inventory, which led to a temporary degradation in communication between components and, consequently, the transient unavailability of the affected services during the stabilization period.

    Resolution

    As part of the continuous improvement of controls already operated by the engineering team, the following mechanisms are being reinforced:

    • Deepening the validation controls applied before and after each infrastructure change, incorporating more thorough connectivity tests into the standard intervention protocol.

    • Expanding the scope of monitoring and early alerting mechanisms already in use, with more agile escalation to on-call teams to further shorten detection and response times.

    • Strengthening the internal connectivity dependency inventory maintained by the team, expanding its level of detail so that each future change is systematically cross-referenced against it prior to implementation.

    • Reinforcing the incremental change protocol already in place, strengthening the practice of creating and validating new specific configurations before retiring previous ones.

    These measures enhance the operational maturity of the platform, reduce the risk of recurrence, and consolidate the resilience with which the team already manages infrastructure maintenance and strengthening activities.

    We sincerely apologize for any inconvenience caused and thank you for your understanding.
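
    Cross-referencing a planned change against a dependency inventory, as described in this resolution, could look roughly like the following. This is a hedged sketch only: the `Dependency` record, its fields, the `review_change` helper, and the service names are all hypothetical and do not describe the team's actual inventory or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Dependency:
    """One internal connectivity dependency (hypothetical schema)."""
    source: str
    target: str
    ports: list = field(default_factory=list)  # empty list = not documented
    owner: str = ""                            # empty string = not documented


def review_change(touched_components, inventory):
    """Return (affected, under_documented) dependencies for a planned change.

    A dependency is affected if the change touches either end of it, and
    under-documented if it lacks ports or an owner in the inventory.
    """
    touched = set(touched_components)
    affected = [d for d in inventory
                if d.source in touched or d.target in touched]
    under_documented = [d for d in affected if not d.ports or not d.owner]
    return affected, under_documented


# Illustrative inventory; the names are invented for this example.
inventory = [
    Dependency("workflow-engine", "integration-gateway", [443], "infra"),
    Dependency("workflow-engine", "internal-dns"),  # missing ports and owner
    Dependency("reporting", "warehouse", [5432], "data"),
]

affected, gaps = review_change(["workflow-engine"], inventory)
for dep in gaps:
    print(f"Document {dep.source} -> {dep.target} before applying the change")
```

    Treating an under-documented dependency as a blocker before the change, rather than a warning after it, is the design choice this sketch is meant to illustrate.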

  • Resolved
    This incident has been resolved.
  • Monitoring
    We implemented a fix and are currently monitoring the result.
  • Identified

    We are working on a fix for this incident.

  • Investigating

    We are currently investigating an issue that may be affecting agent performance. Our technical team is actively reviewing the situation and will provide an update within the next 30 minutes. We apologize for any inconvenience this may cause.

Apr 2026

Operator Connection Status Issues
  • Postmortem

    Incident Summary
    During a production deployment related to visual and behavioral improvements of the operator profile component, the status selector (“Connected / Unavailable / Disconnected”) stopped responding correctly for some users.
    The incident lasted approximately 15 minutes.

    Root Cause
    A recent update introduced a regression in the handling of the component’s internal states, specifically in how the selected state was synchronized after the dropdown was rendered. As a result, in certain scenarios the selection event failed to update the operator’s status.

    Mitigation Applied
    Once the anomalous behavior was detected, an immediate rollback of the affected version was executed to restore normal operation of the component and minimize impact.

    Follow-up Fix
    A subsequent fix was implemented on the affected component, adjusting the update and validation logic of the selected state to prevent inconsistencies during rendering and ensure proper propagation of selector events.

    Preventive Actions
    We already perform validations prior to each deployment on critical components. In this case, the behavior did not occur during pre-deployment testing and only appeared under a specific condition in production.

    As an additional measure, we have strengthened post-deployment validations and monitoring of these components to detect similar behaviors earlier.

  • Resolved
    This incident has been resolved.
  • Monitoring

    The issue affecting the operator connection status has been identified and resolved. Connection indicators are now functioning as expected.

    Our team will continue to monitor the platform to ensure stability and prevent recurrence.

  • Investigating

    We are currently experiencing issues affecting the connection status of operators on the platform. Some users may observe incorrect or inconsistent availability indicators.

    Our team is actively investigating the root cause and working to restore normal behavior as soon as possible.

    Next Update: We will provide further information as soon as progress is made.

Delayed Delivery of WhatsApp Messages
  • Postmortem

    Incident Summary
    On April 17, between 08:00 a.m. and 12:15 p.m. (Ecuador time), a platform-level issue occurred that caused intermittent errors in the message delivery service.
    This directly affected message sending within the platform and caused occasional failures in some processes. Because the errors were intermittent, the service was never completely unavailable, but it behaved inconsistently.

    Impact
    The incident impacted users consuming the message delivery service, causing intermittent failures in the querying and management of phone numbers associated with WABA accounts.
    The impact was classified as medium, as not all requests failed and the service was not fully unavailable.

    Detection
    The issue was identified through error reports and service monitoring, where an irregular failure rate in responses was observed.

    Response
    Once the incident was detected, the team analyzed recent platform behavior and identified a recently deployed change as the potential cause.
    The change was immediately rolled back, and a fix was deployed across all clusters, accompanied by monitoring to validate service stability.

    Root Cause
    The incident was caused by a recent platform-level change that introduced unexpected behavior in the message delivery service, resulting in intermittent errors.

    Resolution and Preventive Measures

    Applied Solution:

    • Rollback of the change that caused the incident

    • Full deployment of the fix across all clusters

    • Validation of endpoint stability

    Preventive Measures:

    • Strengthen pre-deployment testing for critical endpoints

    • Implement stricter monitoring to detect intermittent errors

    • Apply progressive deployments (controlled rollout)

    • Improve post-deployment validation
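
    A progressive deployment gated on the observed error rate, one of the preventive measures listed above, can be sketched as follows. This is a minimal illustration under stated assumptions: the `progressive_rollout` function, the traffic fractions, the 1% threshold, and the error-rate callbacks are hypothetical and not the platform's deployment tooling.

```python
def progressive_rollout(stages, error_rate_at, max_error_rate=0.01):
    """Advance through traffic stages; stop and roll back on excess errors.

    stages         -- increasing traffic fractions, e.g. [0.05, 0.25, 1.0]
    error_rate_at  -- callable returning the observed error rate at a fraction
    max_error_rate -- highest error rate tolerated before rolling back
    """
    completed = []
    for fraction in stages:
        rate = error_rate_at(fraction)
        if rate > max_error_rate:
            # Halt here so the remaining traffic never sees the faulty change.
            return {"status": "rolled_back", "failed_at": fraction,
                    "completed": completed}
        completed.append(fraction)
    return {"status": "deployed", "failed_at": None, "completed": completed}


def healthy(fraction):
    """Simulated change with a steady, low error rate."""
    return 0.001


def intermittent(fraction):
    """Simulated change whose errors only show up under more traffic."""
    return 0.08 if fraction >= 0.25 else 0.002


stages = [0.05, 0.25, 1.0]
print(progressive_rollout(stages, healthy))
print(progressive_rollout(stages, intermittent))
```

    In the second call the rollout stops at the 25% stage, so a change that only misbehaves under load is caught before it reaches all traffic.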

  • Resolved
    This incident has been resolved.
  • Monitoring

    A fix has been implemented by the provider. We are currently monitoring the results.

  • Investigating

    We are currently experiencing delays in WhatsApp message delivery. Our team is investigating and working to resolve the issue.
