Monitoring and Troubleshooting
This section presents background information about monitoring of the Business Bot Platform.
Why Is Monitoring of Chatbots Important?
Before a chatbot and its business logic can be made available in production, they must be integrated into the monitoring solution. Like any piece of software, chatbots and business logics can fail for various reasons: insufficient system resources (such as memory or storage), or a failure of the operating system, database, or application server. Monitoring can reduce chatbot downtime in the event of a failure, and can even serve as a preventive measure to detect and address problems before they cause a failure.
Another advantage of monitoring chatbots is that it can be used to verify whether a chatbot or business logic complies with the terms and conditions of its Service Level Agreement (SLA). The SLA for your chatbot or business logic can include guarantees in terms of availability, frequency of data backups, and retention period of backups.
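As a rough illustration of how an availability guarantee can be checked from monitoring data, the sketch below computes the availability over a period from the accumulated downtime and compares it to a target. The function names and the 99.9% target are hypothetical and are not taken from any Business Bot Platform SLA.

```python
def availability_percent(period_seconds: float, downtime_seconds: float) -> float:
    """Return the percentage of the period during which the service was up."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if not 0 <= downtime_seconds <= period_seconds:
        raise ValueError("downtime_seconds must be between 0 and period_seconds")
    return 100.0 * (period_seconds - downtime_seconds) / period_seconds


def meets_sla(period_seconds: float, downtime_seconds: float,
              target_percent: float = 99.9) -> bool:
    """Check the measured availability against a (hypothetical) SLA target."""
    return availability_percent(period_seconds, downtime_seconds) >= target_percent


# Example: a 30-day month with 30 minutes of recorded downtime.
THIRTY_DAYS = 30 * 24 * 3600
print(round(availability_percent(THIRTY_DAYS, 30 * 60), 3))
print(meets_sla(THIRTY_DAYS, 30 * 60))
```

With a 99.9% target, a 30-day period tolerates roughly 43 minutes of downtime, so 30 minutes passes the check while a full hour would not.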
Who Is Involved in Chatbot and Business Logic Monitoring?
Deployed chatbots and business logics can be monitored by various teams:
- L1 (Level 1) team. People working in an L1 team are assigned monitoring alerts that cannot be fixed automatically. An L1 operator must strictly follow a step-by-step checklist dedicated to the component involved in the alert. For example, for an alert on a database, the first task in the checklist might be to run the restart script for the database; a second task (if the first one did not fix the issue) could be to clean up some temporary data and then restart the database, and so on. If all procedures in the checklist have been exhausted without fixing the issue, the L1 operator must transfer the issue to the L2 team.
- L2 (Level 2) team. People working in an L2 team know the deployment architecture of your chatbot and business logic. They can perform advanced analysis of the cause of the problem using all data collected by the monitoring solution, run any L1 procedure, and use advanced methods to gather more data and try to fix the issue.
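The L1 escalation flow described above can be sketched as a simple loop over checklist steps. This is only an illustration: the step names and the `restart_database` and `cleanup_and_restart` functions are hypothetical stand-ins, not procedures shipped with the platform.

```python
from typing import Callable, List, Tuple

# A checklist step is a label plus an action that returns True if it fixed the issue.
Step = Tuple[str, Callable[[], bool]]


def run_checklist(steps: List[Step]) -> str:
    """Run the steps in order and stop at the first one that fixes the issue."""
    for name, action in steps:
        if action():
            return f"resolved by: {name}"
    # Every step was tried without success: hand the issue over to L2.
    return "escalate to L2"


def restart_database() -> bool:
    # Hypothetical stub: pretend the plain restart did not fix the alert.
    return False


def cleanup_and_restart() -> bool:
    # Hypothetical stub: pretend cleaning temporary data and restarting worked.
    return True


database_checklist: List[Step] = [
    ("restart database", restart_database),
    ("clean up temporary data and restart", cleanup_and_restart),
]
print(run_checklist(database_checklist))
```

Keeping each step as a function that reports success or failure makes the escalation rule explicit: the issue moves to L2 only when the list is exhausted.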
Monitoring Architecture
Business Bot Platform services and servers are monitored with third-party monitoring agent tools. These tools are not delivered with the Business Bot Platform utilities; they are part of a separate, dedicated product.
The Monitoring Agent is a standalone process called from the command line. An agent must be installed on each computer on which the Business Bot Platform (BBP) code is installed. Some agents must also be installed on machines from which they can perform integrity checks through the same load balancers that users go through to access these services.
When the agent is configured to monitor a BBP service or a BBP server instance, it periodically collects monitoring data (called metrics) for the service or server, converts it to an event/monitoring format, and stores it on the local computer where the agent is running. The administrator can then view the results in the company's own monitoring console.
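The collect, convert, and store cycle can be sketched as follows. This is not the third-party agent itself, whose metrics and event format are not described here: the disk-usage metric, the JSON Lines event layout, and all function and file names are assumptions chosen for illustration.

```python
import json
import shutil
import time
from pathlib import Path


def collect_metrics(path: str = ".") -> dict:
    """Collect a sample metric set (here: disk usage for the given path)."""
    usage = shutil.disk_usage(path)
    return {"disk_total_bytes": usage.total, "disk_free_bytes": usage.free}


def to_event(source: str, metrics: dict) -> dict:
    """Wrap raw metrics in a hypothetical event format."""
    return {"timestamp": time.time(), "source": source, "metrics": metrics}


def store_event(event: dict, out_file: Path) -> None:
    """Append the event as one JSON line to a local file."""
    with out_file.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def run_agent(source: str, out_file: Path,
              interval_seconds: float, iterations: int) -> None:
    """Collect, convert, and store metrics a fixed number of times."""
    for i in range(iterations):
        store_event(to_event(source, collect_metrics()), out_file)
        if i < iterations - 1:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical service name and output file.
    run_agent("bbp-service", Path("bbp_metrics.jsonl"),
              interval_seconds=1, iterations=3)
```

A monitoring console could then read the local file line by line; a real agent would typically run indefinitely rather than for a fixed number of iterations.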