Jelou - RCA - Latency in bots associated with an automatic component update – Incident details

All systems operational

RCA - Latency in bots associated with an automatic component update

Resolved
Major outage
Started 9 days ago
Lasted about 1 hour

Affected

Channels

Major outage from 11:00 AM to 12:00 PM

WhatsApp

Major outage from 11:00 AM to 12:00 PM

Updates
  • Resolved
    This incident has been resolved.
  • Investigating

    1. Incident Summary

    On December 31, 2025, between 6:00 a.m. and 7:00 a.m. (local time), an automatic update of a system component temporarily affected bot functionality: bots did not respond correctly during that period.

    The technical team detected the issue and performed a manual rollback, which restored normal service operation.
    The incident lasted approximately 60 minutes.

    2. Impact

    Between 6:00 a.m. and 7:00 a.m., bots responded intermittently, which may have degraded the experience for end users.
    No data loss or full platform unavailability was identified.

    3. Detection

    The incident was identified by the technical team through system monitoring, after detecting variations in response times following an automatic update.

    Additionally, it was identified that an automated database backup process may have contributed to the latency observed during the same period.
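
    As an illustration of the kind of check that can surface "variations in response times" after a change, the sketch below compares recent latency samples against a pre-update baseline. It is not the monitoring actually used for this incident; the function name `latency_regressed`, the 3-sigma threshold, and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev


def latency_regressed(baseline_ms, recent_ms, sigmas=3.0):
    """Return True if the mean of recent samples exceeds the baseline
    mean by more than `sigmas` baseline standard deviations.

    Hypothetical helper: the threshold rule is an assumption, not the
    rule used by the team's monitoring.
    """
    if len(baseline_ms) < 2:
        raise ValueError("need at least two baseline samples")
    if not recent_ms:
        raise ValueError("need at least one recent sample")
    threshold = mean(baseline_ms) + sigmas * stdev(baseline_ms)
    return mean(recent_ms) > threshold


# Made-up response times in milliseconds, before and after an update.
before_update = [100, 110, 90, 105, 95]
after_update = [400, 420, 390]
print(latency_regressed(before_update, after_update))
```

    A rule like this is deliberately simple; a production alert would usually also look at percentiles and error rates rather than the mean alone.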

    4. Response

    The technical team carried out the following actions:

    • Review of recent changes applied to the environment.

    • Identification of the automatic component update as the associated factor.

    • Execution of a manual rollback to return to a stable version.

    • Continuous monitoring to confirm service stability.

    The incident was successfully resolved by the internal team.

    5. Cause

    • Automatic update of a system component, which caused a temporary impact on bot response capacity.

    • Simultaneous execution of an automated database backup process, which may have increased latency.
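
    The two factors above coincided in time. A minimal sketch of how overlapping maintenance windows can be detected ahead of time follows; the window times are hypothetical placeholders (the report does not give the exact start and end of the update or the backup), and `windows_overlap` is an illustrative name.

```python
from datetime import datetime


def windows_overlap(start_a, end_a, start_b, end_b):
    """Return True if the half-open intervals [start_a, end_a) and
    [start_b, end_b) share any instant."""
    return start_a < end_b and start_b < end_a


# Hypothetical windows, chosen only to demonstrate the check.
update_start = datetime(2025, 12, 31, 6, 0)
update_end = datetime(2025, 12, 31, 6, 20)
backup_start = datetime(2025, 12, 31, 6, 10)
backup_end = datetime(2025, 12, 31, 6, 40)

if windows_overlap(update_start, update_end, backup_start, backup_end):
    print("update and backup windows overlap; consider rescheduling one")
```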

    6. Solution

    • Manual rollback of the updated component.

    • Validation of proper bot functionality.

    • Confirmation of system stability following the intervention.
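
    One way to make "confirmation of system stability" concrete after a rollback is to require a run of consecutive healthy measurements before declaring recovery. The sketch below assumes a latency limit and a required run length; both values, and the name `is_stable`, are illustrative and not taken from the incident.

```python
def is_stable(samples_ms, limit_ms, required=5):
    """Return True if the most recent `required` samples are all at or
    below `limit_ms`. Earlier samples are ignored."""
    if required <= 0:
        raise ValueError("required must be positive")
    tail = samples_ms[-required:]
    return len(tail) == required and all(s <= limit_ms for s in tail)


# Made-up response times (ms): degraded at first, then recovered.
post_rollback = [900, 800, 120, 110, 100, 105, 98]
print(is_stable(post_rollback, limit_ms=200, required=5))
```

    Requiring several consecutive good samples avoids declaring recovery on a single lucky measurement while responses are still intermittent.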