Controller Manager Log Monitor Test
The Kubernetes Controller Manager handles background processes, like node lifecycles, job control, and replication. Its logs provide critical insights into these operations, such as resource reconciliation, scheduling delays, or errors. Monitoring these logs ensures controllers function as expected.
The Controller Manager Log Monitor Test continuously monitors the Controller Manager logs and reports key metrics. By analyzing these metrics, administrators can identify whether there are any issues with the system.
Target of the test : A Kubernetes Master Node
Agent deploying the test : A remote agent
Outputs of the test : One set of results for each type of log entry in the target Kubernetes master node being monitored
Parameter |
Description |
---|---|
Test Period |
How often the test should be executed. |
Host |
The IP address of the host for which this test is to be configured. |
Port |
Specify the port at which the specified Host listens. By default, this is 6443. |
Timeout |
In the Timeout text box, specify the duration (in seconds) beyond which the test will time out. The default value is 10 seconds. |
Container log search pattern |
Enter the specific patterns of alerts to be monitored. The pattern should be in the following format: <PatternName>:<Pattern>, where <PatternName> is the pattern name that will be displayed in the monitor interface and <Pattern> is an expression of the form - *expr* or expr or *expr or expr*, etc. A leading '*' signifies any number of leading characters, while a trailing '*' signifies any number of trailing characters. For example, say you specify ORA:ORA-* in the SearchPattern text box. This indicates that "ORA" is the pattern name to be displayed in the monitor interface. "ORA-*" indicates that the test will monitor only those lines in the alert log which start with the term "ORA-". Similarly, if your pattern specification reads: offline:*offline, then it means that the pattern name is offline and that the test will monitor those lines in the alert log which end with the term offline. A single pattern may also be of the form e1+e2, where + signifies an OR condition. That is, the <PatternName> is matched if either e1 is true or e2 is true. Multiple search patterns can be specified as a comma-separated list. For example: ORA:ORA-*,offline:*offline*,online:*online |
Container log exclude pattern |
In the Exclude Pattern text box, provide a comma-separated list of patterns to be excluded from monitoring, for example: *critical*,*exception*. By default, this parameter is set to 'none'. |
Container log mount path |
Provide the full path where the log file is mounted. |
Container name include list |
Provide a comma-separated list of the container names to be included. |
Container name exclude list |
Provide a comma-separated list of the container names to be excluded. |
Alias name |
Specify the alias name in the text box. If the alias is EG_ENV_APP_NAME, the test reports metrics from the app name in the environment configuration; if it is POD_Name, the test reports metrics from the pod; if it is CONTAINER_NAME, the test reports metrics from the container. |
Max containers to monitor |
Specify the maximum number of containers to report the log metrics from. |
Lines |
Specify two numbers in the format x:y. This means that when a line in the alert file matches a particular pattern, then x lines before the matched line and y lines after the matched line will be reported in the detailed diagnosis output (in addition to the matched line). The default value here is 0:0. Multiple entries can be provided as a comma-separated list. If you give 1:1 as the value for Lines, then this value will be applied to all the patterns specified in the SearchPattern field. If you give 0:0,1:1,2:1 as the value for Lines, and the corresponding value in the SearchPattern field is ORA:ORA-*,offline:*offline*,online:*online, then: 0:0 will be applied to the ORA:ORA-* pattern; 1:1 will be applied to the offline:*offline* pattern; and 2:1 will be applied to the online:*online pattern. |
Alert file |
In the ALERTFILE text box, the default value is auto_discover, which means that the eG agent automatically determines the container log file location, as shown in the console logs of docker logs <<container_name>> or kubectl logs <<podname>> -c <<container_name>>. Your AlertFile specification can also be of the following format: Name@logfilepath_or_pattern. Here, Name represents the display name of the path being configured. For example, for two paths with the display names 'dblogs' and 'applogs', the parameter specification can be: dblogs@/tmp/db/*dblogs*,applogs@/tmp/app/*applogs*. In this case, the display names 'dblogs' and 'applogs' alone will be displayed as descriptors of this test. Every time this test is executed, the eG agent verifies the following: whether any changes have occurred in the size and/or timestamp of the log files that were monitored during the last measurement period; and whether any new log files (that match the AlertFile specification) have been added since the last measurement period. If a few lines have been added to a log file that was monitored previously, then the eG agent monitors the additions to that log file, and then proceeds to monitor newer log files (if any). If an older log file has been overwritten, then the eG agent monitors this log file completely, and then proceeds to monitor the newer log files (if any). |
Rollover File |
By default, this flag is set to False. Set this flag to True if you want the test to support the 'roll over' capability of the specified AlertFile. A roll over typically occurs when the timestamp of a file changes or when the log file size crosses a pre-determined threshold. When a log file rolls over, the errors/warnings that pre-exist in that file will be automatically copied to a new file, and all errors/warnings that are captured subsequently will be logged in the original/old file. For instance, say, errors and warnings were originally logged to a file named error_log. When a roll over occurs, the content of the file error_log will be copied to a file named error_log.1, and all new errors/warnings will be logged in error_log. In such a scenario, since the RolloverFile flag is set to False by default, the test by default scans only error_log.1 for new log entries and ignores error_log. On the other hand, if the flag is set to True, then the test will scan both error_log and error_log.1 for new entries. |
Exclude Files |
Specify the pattern to use for excluding files from the container log file location. |
Unique Match |
By default, the UniqueMatch parameter is set to False, indicating that, by default, the test checks every line in the log file for the existence of each of the configured SearchPatterns. By setting this parameter to True, you can instruct the test to ignore a line and move to the next as soon as a match for one of the configured patterns is found in that line. For example, assume that Pattern1:*fatal*,Pattern2:*error* is the SearchPattern that has been configured. If UniqueMatch is set to False, then the test will read every line in the log file completely to check for the existence of messages embedding the strings 'fatal' and 'error'. If both the patterns are detected in the same line, then the number of matches will be incremented by 2. On the other hand, if UniqueMatch is set to True, then the test will read a line only until a match for one of the configured patterns is found and not both. This means that even if the strings 'fatal' and 'error' follow one another in the same line, the test will consider only the first match and not the next. The match count in this case will therefore be incremented by only 1. |
Case Sensitive |
This flag is set to No by default. This indicates that the test functions in a 'case-insensitive' manner by default. This implies that, by default, the test ignores the case of your AlertFile and SearchPattern specifications. If this flag is set to Yes on the other hand, then the test will function in a 'case-sensitive' manner. In this case therefore, for the test to work, even the case of your AlertFile and SearchPattern specifications should match with the actuals. |
Encode Format |
By default, this is set to none, indicating that no encoding format applies by default. However, if the test has to use a specific encoding format for reading from the specified AlertFile, then you will have to provide a valid encoding format here, e.g., UTF-8, UTF-16, etc. Where multiple log files are being monitored, you will have to provide a comma-separated list of encoding formats, one for every log file monitored. Make sure that your encoding format specification follows the same sequence as your AlertFile specification. In other words, the first encoding format should apply to the first alert file, and so on. For instance, say that your AlertFile specification is as follows: D:\logs\report.log,E:\logs\error.log,C:\logs\warn_log. Assume that while UTF-8 needs to be used for reading from report.log, UTF-16 is to be used for reading from warn_log. No encoding format need be applied to error.log. In this case, your EncodeFormat specification will be: UTF-8,none,UTF-16. |
Use Sudo |
By default, the eG agent does not require any special permissions to parse and read messages from the log file to be monitored. This is why the Use Sudo parameter is set to No by default. In some highly secure Unix environments, however, the eG agent install user may not have the permission to read the log file to be monitored. In such environments, you will have to follow the steps below to ensure that the test is able to read the log file and report metrics. First, edit the SUDOERS file on the target host and append an entry of the following format to it: <eG_agent_install_user> ALL=(ALL) NOPASSWD: <Log_file_with_path>. For instance, if the eG agent install user is eguser, and the log file to be monitored is /usr/bin/logs/procs.log, then the entry in the SUDOERS file should be: eguser ALL=(ALL) NOPASSWD: /usr/bin/logs/procs.log. Next, save the file. Finally, when configuring this test using the eG admin interface, set the Use Sudo parameter to Yes. Once this is done, every time the test runs, it will check whether the eG agent install user has the necessary permissions to read the log file. If the user does not have the permissions, then the test runs the sudo command to change the permissions of the user, so that the eG agent is able to read from the log file. |
Sudo Path |
This parameter is relevant only when the Use Sudo parameter is set to ‘Yes’. By default, the Sudo Path is set to none. This implies that the sudo command is in its default location - i.e., in the /usr/bin or /usr/sbin folder of the target host. In this case, once the Use Sudo flag is set to Yes, the eG agent automatically runs the sudo command from its default location to allow access to the configured log file. However, if the sudo command is available in a different location in your environment, you will have to explicitly specify the full path to the sudo command in the Sudo Path text box to enable the eG agent to run the sudo command. |
Change File Permission |
If the user cannot read the container log file, set this flag to Yes. Once this flag is set to Yes, the administrator who manages this eG Manager environment changes the file permission so that the user can read the file. By default, this flag is set to Yes. |
DD Frequency |
Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency. |
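
The Container log search pattern syntax described in the table above can be illustrated with a short Python sketch. This is not the eG agent's implementation: the function names (`parse_search_patterns`, `wildcard_to_regex`, `matching_names`) are hypothetical, and the sketch assumes that a pattern without a leading or trailing `*` must match the whole line and that matching is case-insensitive by default, in line with the Case Sensitive parameter.

```python
import re


def parse_search_patterns(spec):
    """Split 'Name:Pattern,Name:Pattern' into {name: [alternatives]}."""
    patterns = {}
    for entry in spec.split(","):
        # Only the first ':' separates the display name from the pattern.
        name, _, pattern = entry.strip().partition(":")
        # '+' separates alternatives that are OR-ed together.
        patterns[name] = pattern.split("+")
    return patterns


def wildcard_to_regex(expr, case_sensitive=False):
    """Turn an expression such as 'ORA-*' into a compiled regex."""
    # Escape the literal parts, then let each '*' stand for any run of characters.
    body = ".*".join(re.escape(part) for part in expr.split("*"))
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(body, flags)


def matching_names(line, patterns, case_sensitive=False):
    """Return the pattern names whose expression matches the whole line."""
    names = []
    for name, alternatives in patterns.items():
        if any(wildcard_to_regex(alt, case_sensitive).fullmatch(line)
               for alt in alternatives):
            names.append(name)
    return names


# The example specification from the parameter description.
patterns = parse_search_patterns("ORA:ORA-*,offline:*offline*,online:*online")
print(matching_names("ORA-00600 internal error", patterns))    # ['ORA']
print(matching_names("node went offline at 10:00", patterns))  # ['offline']
print(matching_names("node is online now", patterns))          # []
```

The last line matches nothing because `*online` has no trailing `*`, so under the assumption above the line would have to end with "online".
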
Measurement |
Description |
Measurement Unit |
Interpretation |
---|---|---|---|
Recent Messages |
Indicates the number of errors, exceptions or warnings logged in the target master node log. |
Number |
Errors or exceptions in the logs indicate that the system is not behaving as expected, while warnings indicate a potential issue. |
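
How the Unique Match parameter influences a match count such as Recent Messages can be sketched in Python. This is a simplified illustration under stated assumptions, not the eG agent's logic: the function name `count_matches` is hypothetical, the patterns `*fatal*` and `*error*` are reduced to case-insensitive keyword checks, and the log lines are invented for the example.

```python
def count_matches(lines, keywords, unique_match=False):
    """Count pattern matches across log lines.

    keywords maps a pattern name to the text that '*text*' would look for.
    With unique_match=False every matching pattern on a line is counted;
    with unique_match=True a line contributes at most one match.
    """
    total = 0
    for line in lines:
        lowered = line.lower()
        hits = [name for name, word in keywords.items()
                if word.lower() in lowered]
        if unique_match:
            total += 1 if hits else 0
        else:
            total += len(hits)
    return total


# Hypothetical log lines, for illustration only.
log_lines = [
    "fatal error in replicaset controller",
    "error syncing job",
    "sync completed",
]
keywords = {"Pattern1": "fatal", "Pattern2": "error"}

print(count_matches(log_lines, keywords, unique_match=False))  # 3
print(count_matches(log_lines, keywords, unique_match=True))   # 2
```

The first line contains both 'fatal' and 'error', so it adds 2 to the count when Unique Match is False and only 1 when it is True.
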