GPUServer messages

Message origin

GPUServer logs messages from various modules, plugins and features. The format of the messages varies according to the origin.
Message prefix            Origin
GBSC, GSRV, GSDx, GSAx    Primary server (x is an alphanumeric character).
GBST                      Subserver.
GBRS                      Secondary server.

Message format

[timestamp] <level> message-id [[P:process-id] | [R:process-id] | [C:process-id] | [S:process-id:connection-id] | [R:process-id:connection-id]] message-text
[timestamp] This part is always displayed.
Timestamp is in format [YYYY-MM-DD hh:mm:ss.pppppp], for example [2013-04-26 15:25:58.835795]
<level> This part is always displayed.
Message level can be one of the following:
  • critical - The most serious errors; in most cases GPUServer will be terminated.
  • error - Errors after which GPUServer is usually allowed to continue; only in some cases is it terminated.
  • warning - Small errors, mostly related to requests. GPUServer will continue processing, but it is worth analyzing and examining the reported issue.
  • notice - Changes or events that are worth notifying the administrator about.
  • info - The normal reporting level of logging. It is only informational, but it provides useful data.
  • debug - More detailed information.
  • trace - Even more detailed information.
message-id This part is always displayed.
    Message-id has the following format: AAAA-BB-CCC where:
  • AAAA is the prefix of a module, plugin or feature. More details about the origin of the messages can be found in the Message origin section.
  • BB is the major message-id.
  • CCC is the minor message-id.
process-id This part may or may not be displayed.
    GPUServer consists of several concurrently running processes, and several subservers can be started, so it is necessary to show the origin of the message. The process-id is useful for correlating a particular process with those listed by the agpubox listprocess command.
    Not all messages contain a process-id, because the origin can often be easily derived from the message-id.

    It is easy to distinguish the primary and the secondary server. The secondary server broadcasts its process-id in the message:
    [2013-04-28 11:00:56.110357] <NOTICE > GBSC-SR-90A [18895] Rest server bound on:203.0.113.11:8080, PID: 18895
    Additionally, the primary server is the parent process of the secondary server and the subservers.

    connection-id This part may or may not be displayed.
    It is a unique identification number incremented by 1 with every new request.
    There are two counters: one for the secondary server and the other for the primary server and subserver.
    When a new connection from a Client arrives, the same connection ID is passed from the primary server to the subserver.
    [P:process-id] This part may or may not be displayed.
  • P stands for the primary server.
  • process-id is the primary server process identifier.
    [R:process-id] This part may or may not be displayed.
  • R stands for the secondary server.
  • process-id is the secondary server process identifier.
    [C:process-id] This part may or may not be displayed.
  • C stands for the configuration process. The configuration process sends information about the GPU device configuration to OServer.
  • process-id is the configuration process identifier.
    [S:process-id:connection-id] This part may or may not be displayed.
  • S stands for the subserver.
  • process-id is the subserver process identifier.
  • connection-id is the consecutive number of the request.
    [R:process-id:connection-id] This part may or may not be displayed.
  • R stands for the secondary server.
  • process-id is the secondary server process identifier.
  • connection-id is the consecutive number of the request.
    message-text This part is always displayed.
    The content of the message and its format depend on the origin, the severity of the message and its purpose.
    Some parts of the same message can vary, those variables are marked as {message_variable} in message descriptions.
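The message format described above can be split mechanically into its parts. The following Python function is a minimal sketch, not part of GPUServer: the regular expression, the function name parse_line and the returned dictionary keys are assumptions made for illustration, and the padding inside <level> is inferred only from the sample line in this section.

```python
import re

# Assumed pattern for: [timestamp] <level> message-id [origin] message-text
LINE_RE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\]\s+"
    r"<\s*(?P<level>[A-Za-z]+)\s*>\s+"
    r"(?P<message_id>[A-Za-z0-9]{4}-[A-Za-z0-9]{2}-[A-Za-z0-9]{3})\s+"
    r"(?:\[(?P<origin>[^\]]*)\]\s+)?"
    r"(?P<text>.*)$"
)


def parse_line(line):
    """Split one log line into its documented parts, or return None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["level"] = parts["level"].lower()
    # message-id has the form AAAA-BB-CCC (prefix, major, minor).
    prefix, major, minor = parts["message_id"].split("-")
    parts.update(prefix=prefix, major=major, minor=minor)
    # The origin tag is optional: "P:123", "S:123:45" or a bare PID.
    server_type = process_id = connection_id = None
    if parts["origin"]:
        fields = parts["origin"].split(":")
        if fields[0].isdigit():
            process_id = int(fields[0])
        else:
            server_type = fields[0]
            if len(fields) > 1 and fields[1].isdigit():
                process_id = int(fields[1])
            if len(fields) > 2 and fields[2].isdigit():
                connection_id = int(fields[2])
    parts.update(
        server_type=server_type,
        process_id=process_id,
        connection_id=connection_id,
    )
    return parts
```

Calling parse_line on the sample GBSC-SR-90A line shown earlier yields the timestamp, the level, the three message-id components and the bare process-id. A message without an origin tag whose text begins with a square bracket would be misread by this sketch, so treat it as a starting point only.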

    Message description

    The following topic shows the format of the message descriptions. Depending on the origin of the message, some of the description fields may or may not be displayed. A message description consists of several parts:

    message-id [info_type_C | info_type_I | info_type_R] message-text

    Message type:
    message-type
    message-description
    Source: message-origin
    Result: result
    Solution: solution

    Message part Description
    message-id Message identifier.

    bold text

    info_type_C Additional information about the origin of the message.
    Information has format: server_type:pid_id:connection_id
    Where:
  • server_type:
      • P - primary server
      • C - configuration process, starts when GPUServer sends the GPU configuration to OServer
      • S - subserver
  • pid_id - process identifier
  • connection_id - unique connection ID of the request

    normal text

    info_type_I Additional information about the origin of the message.
    Information has format: server_type:pid_id
    Where:
  • server_type:
      • P - primary server
      • C - configuration process, starts when GPUServer sends the GPU configuration to OServer
      • S - subserver
  • pid_id - process identifier

    normal text

    info_type_R Additional information about the origin of the message.
    Information has format: server_type:pid_id or server_type:pid_id:connection_id
    Where:
  • server_type:
      • R - secondary server, RESTful service
  • pid_id - process identifier
  • connection_id - unique connection ID of the request

    normal text

    message-text The full message text can contain the message variables marked as {message_variable}.

    normal text

    message-type One of the following labels:
  • critical
  • error
  • warning
  • notice
  • information
  • debug
  • trace

    normal text

    message-description Full message description.

    normal text

    message-origin The message origin is correlated with a prefix.

    normal text

    result What GPUServer does as a result of the condition reported by the message. Depending on the level of the message, the results could include: continue processing, request terminated, authorization failed etc.

    normal text

    solution Instructions for the GPUBox administrator, including actions to take, further investigations and decisions to make. When a solution cannot be found, or is not expected to solve the issue on its own, the solution directs the administrator to contact support as a last resort.

    normal text

    Messages

    AXSH-XS-500 Auxiliary server started on {hostname}:{port_number}

    Message type:
    information
    The auxiliary server has just started on host {hostname} and listens on port {port_number}.
    Source: Auxiliary server
    Result: auxiliary server is ready to accept new connections on port {port_number}
    Solution: none
    AXSH-XS-32A auxiliary server - received stop

    Message type:
    debug
    Auxiliary server received stop request.
    Source: Auxiliary server
    Result: auxiliary server is starting to terminate.
    Solution: none
    AXSH-XS-32B auxiliary server - received stop

    Message type:
    debug
    Auxiliary server received stop request.
    Source: Auxiliary server
    Result: auxiliary server is starting to terminate.
    Solution: none
    AXSH-XS-300 auxiliary server accepted connection from {remote_ip_address}:{port_number}

    Message type:
    debug
    Auxiliary server accepted a connection.
    Source: Auxiliary server
    Result: auxiliary server continues with the accepted request.
    Solution: none
    AXSH-XS-31B all tasks of auxiliary server finished.

    Message type:
    debug
    Auxiliary server is being terminated and all requests and threads are finished.
    Source: Auxiliary server
    Result: auxiliary server is being terminated.
    Solution: none
    AXSH-XC-800 auxiliary server connection in error

    Message type:
    error
    One of the auxiliary server's threads and its connection encountered an error.
    Source: Auxiliary server
    Result: the auxiliary server's connection and thread are terminated.
    Solution: examine previous messages
    AXSH-XC-510 auxiliary connection intentionally interrupted

    Message type:
    information
    Current connection received request to close.
    Source: Auxiliary server
    Result:
    • current connection is closed
    • auxiliary service continues processing
    Solution: none
    OSVC-MN-84A {detailed_message}

    Message type:
    error
    RESTful service cannot be started due to the error described in {detailed_message}.
    Source: GPUServer, Secondary Server
    Result: RESTful service will be terminated.
    Solution:
    • Examine message {detailed_message}, fix the issue and restart GPUServer.
    • The most common problem is a port that is already bound.
    GSRV-GS-91A [{info_type_I}] Unexpected exception

    Message type:
    critical
    GPUServer process {process_pid} caught an unexpected and unknown exception.
    Source: GPUServer
    Result:
    • GPUServer process {process_pid} is terminated immediately.
    Solution: Examine previous messages in the log. If you cannot fix the issue, contact support.
    GBSC-CX-80A [{info_type_I}] Child process ended, {detailed_messages}

    Message type:
    error
    The termination of the child process may already have been handled by a different module.
    Source: GPUServer, Primary Server
    Result: Child process ended.
    Solution: Use # ps -A | grep gpuserver to verify that the number of processes is not too high. The number of processes must be 2 (Primary Server + Secondary Server) plus the number of client processes (Subservers).
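The process-count check from this solution can also be scripted. The sketch below is an illustration only: the function names are hypothetical, and it assumes a Unix-like system where `ps -A` prints one process per line with the command name gpuserver.

```python
import subprocess


def expected_process_count(subserver_count):
    """Primary Server + Secondary Server + one Subserver per client process."""
    return 2 + subserver_count


def count_gpuserver_processes(ps_output):
    """Count lines of `ps -A` output that mention gpuserver (like grep)."""
    return sum(1 for line in ps_output.splitlines() if "gpuserver" in line)


def running_gpuserver_processes():
    """Run `ps -A` and count the gpuserver processes it reports."""
    result = subprocess.run(
        ["ps", "-A"], capture_output=True, text=True, check=True
    )
    return count_gpuserver_processes(result.stdout)
```

Comparing running_gpuserver_processes() with expected_process_count(n), where n is the number of connected client processes, shows whether leftover child processes are present.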
    GBSC-CX-50Z [{info_type_I}] Subprocess ended

    Message type:
    information
    Process ended normally.
    Source: GPUServer, Primary Server
    Result: Process ended.
    Solution: none.
    GBSC-CX-50A [{info_type_I}] Subprocess {pid} ended

    Message type:
    information
    Process ended normally.
    Source: GPUServer, Primary Server
    Result: Process {pid} ended.
    Solution: none.
    GBSC-CX-50B [{info_type_I}] Subprocess {pid} terminated by signal

    Message type:
    information
    Child process received a signal and ended as a consequence.
    This message is shown when child process is killed by SIGKILL (-9) signal.
    Source: GPUServer, Primary Server
    Result: Process {pid} ended.
    Solution: none.
    GBSC-CX-50C [{info_type_I}] Subprocess {pid} ended

    Message type:
    information
    Process received signal and ended.
    Source: GPUServer, Primary Server
    Result: Process {pid} ended.
    Solution: none.
    GBSC-CX-50D [{info_type_I}] Subprocess {pid} stopped by signal

    Message type:
    information
    Process received signal and stopped.
    Source: GPUServer, Primary Server
    Result: Process {pid} has been stopped by signal.
    Solution: none.
    GBSC-CX-50X [{info_type_I}] Subprocess {pid} ended

    Message type:
    information
    Process ended normally.
    Source: GPUServer, Primary Server
    Result: Process {pid} ended.
    Solution: none.
    GBSC-CF-51B [{info_type_I}] Configuration file: {config_file}

    Message type:
    information
    Source: GPUServer, Primary Server
    Result: Configuration parameters are being processed from file {config_file}.
    Solution: none.
    GBSC-GI-600 [{info_type_I}] Version: {version}

    Message type:
    notice
    Displays the GPUServer version.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-81A [{info_type_I}] Configuration parameter 'gpuserver_bind_ip' not found in configuration file

    Message type:
    error
    Parameter 'gpuserver_bind_ip' is not defined in configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: Update parameter according to manual to bind GPUServer to specific IP interface and restart GPUServer.
    GBSC-GI-71B [{info_type_I}] Configuration parameter 'gpuserver_bind_port' not found in configuration file, default value used: 9393

    Message type:
    warning
    Parameter 'gpuserver_bind_port' is not defined in configuration file.
    Source: GPUServer, Primary Server
    Result:
    • GPUServer is bound on default port 9393
    • GPUServer start continues
    Solution: Update parameter according to manual to bind GPUServer on specific port
    GBSC-GI-81B [{info_type_I}] Invalid port number for parameter 'gpuserver_bind_port': {port_number}

    Message type:
    error
    Parameter 'gpuserver_bind_port' has invalid value, valid port range is 1..65535.
    {port_number} is invalid port specified in configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: Change port number and restart GPUServer.
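The handling of 'gpuserver_bind_port' described by messages GBSC-GI-71B and GBSC-GI-81B can be sketched as follows. This is an assumed reconstruction for illustration, not GPUServer's implementation: the function name and the dictionary-based configuration are hypothetical, while the default 9393 and the range 1..65535 come from the message descriptions.

```python
DEFAULT_PORT = 9393  # default value reported by GBSC-GI-71B


def resolve_bind_port(config):
    """Return (port, message_id); message_id is None for a valid value.

    A port of None means the value was rejected (GBSC-GI-81B), in which
    case GPUServer is documented to terminate.
    """
    raw = config.get("gpuserver_bind_port")
    if raw is None:
        # Parameter missing: warning, default port is used.
        return DEFAULT_PORT, "GBSC-GI-71B"
    try:
        port = int(raw)
    except (TypeError, ValueError):
        return None, "GBSC-GI-81B"
    if not 1 <= port <= 65535:
        return None, "GBSC-GI-81B"
    return port, None
```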
    GBSC-GI-71C [{info_type_I}] Configuration parameter 'gpuserver_rest_oserver_address' not found in configuration file, default value used: http://127.0.0.1:8081

    Message type:
    warning
    Parameter 'gpuserver_rest_oserver_address' is not defined in the configuration file. It is the communication service point for OServer.
    Source: GPUServer, Primary Server
    Result: default value http://127.0.0.1:8081 is used
    Solution: Update the parameter according to the manual to set the OServer communication service point
    GBSC-GI-71D [{info_type_I}] Configuration parameter 'gpuserver_rest_bind' not found in configuration file, default value used: 127.0.0.1:8080

    Message type:
    warning
    Parameter 'gpuserver_rest_bind' is not defined in configuration file.
    Source: GPUServer, Primary Server
    Result:
    • GPUServer is bound on default interface 127.0.0.1:8080
    • GPUServer start continues
    Solution: Update parameter according to manual to bind GPUServer on specific interface and port
    GBSC-GI-81D [{info_type_I}] Cannot find infrastructure token

    Message type:
    error
    Infrastructure token parameter 'auth_token' is not specified in configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer is being terminated
    Solution: Set correct infrastructure token in 'auth_token' and restart GPUServer
    GBSC-GI-71H [{info_type_I}] Configuration parameter 'gpuserver_infiniband_enabled' not found in configuration file, default value used: no

    Message type:
    warning
    Config parameter 'gpuserver_infiniband_enabled' enables or disables InfiniBand communication between GPUBox client and GPUServer. Even when the parameter is set to 'yes', both client and server need to support InfiniBand, i.e. both need to have InfiniBand interfaces and the libInfiniBand-gpubox.so library installed and available.
    Source: GPUServer, Primary Server
    Result:
    • default value 'no' used
    • GPUServer start continues
    • When GPUServer starts with InfiniBand but the system does not have the proper hardware and software driver installed, GPUServer will continue with TCP/IP communication.
    Solution: Set desired 'gpuserver_infiniband_enabled' parameter
    GBSC-GI-71I [{info_type_I}] Configuration parameter 'gpuserver_infiniband_enabled' is invalid, default value used: no

    Message type:
    warning
    Config parameter 'gpuserver_infiniband_enabled' has to be 'yes' or 'no' only.
    Source: GPUServer, Primary Server
    Result:
    • default value 'no' used
    • GPUServer start continues
    • When GPUServer starts with InfiniBand but the system does not have the proper hardware and software driver installed, GPUServer will continue with TCP/IP communication.
    Solution: Set desired 'gpuserver_infiniband_enabled' parameter according to manual
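The two warnings for 'gpuserver_infiniband_enabled' (GBSC-GI-71H for a missing parameter, GBSC-GI-71I for an invalid value) can be summarized in a short sketch. The function name and the dictionary-based configuration are hypothetical, and exact, case-sensitive matching of 'yes' and 'no' is an assumption.

```python
def resolve_infiniband_enabled(config):
    """Return (enabled, message_id); message_id is None for a valid value."""
    raw = config.get("gpuserver_infiniband_enabled")
    if raw is None:
        # Parameter missing: default 'no' is used.
        return False, "GBSC-GI-71H"
    if raw == "yes":
        return True, None
    if raw == "no":
        return False, None
    # Any other value is invalid: default 'no' is used.
    return False, "GBSC-GI-71I"
```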
    GBSC-GI-61J [{info_type_I}] Parameter 'gpuserver_infiniband_device' take precedence over environment variable 'GPUBOX_IBDEV', {gpubox_ibdev}

    Message type:
    notice
    Parameter 'gpuserver_infiniband_device' was specified but environment variable 'GPUBOX_IBDEV' was also defined.
    Value of variable is displayed as {gpubox_ibdev}.
    Source: GPUServer, Primary Server
    Result: Parameter takes precedence over variable.
    Solution: none
    GBSC-GI-72J [{info_type_I}] Set parameter 'gpuserver_infiniband_device' failed, {detailed_message}

    Message type:
    warning
    Parameter 'gpuserver_infiniband_device' was specified and GPUServer was trying to set environment variable 'GPUBOX_IBDEV' but the action failed.
    Source: GPUServer, Primary Server
    Result: Value from 'gpuserver_infiniband_device' does not have effect, first device will be used.
    Solution: Examine {detailed_message}.
    GBSC-GI-71J [{info_type_I}] Configuration parameter 'gpuserver_infiniband_device' specified but InfiniBand is disabled

    Message type:
    warning
    Parameter 'gpuserver_infiniband_device' was specified but InfiniBand communication is not enabled in parameter 'gpuserver_infiniband_enabled' or the parameter is missing.
    Source: GPUServer, Primary Server
    Result: Value from 'gpuserver_infiniband_device' does not have effect, first device will be used.
    Solution: Enable InfiniBand communication, specify in configuration file: gpuserver_infiniband_enabled="yes"
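The interaction between 'gpuserver_infiniband_device' and the environment variable 'GPUBOX_IBDEV' (messages GBSC-GI-61J and GBSC-GI-71J) can be sketched as below. The order of the checks, the function name and the device names in the usage note are assumptions; the sketch takes the environment as a plain dictionary so that it does not modify the real process environment.

```python
def resolve_infiniband_device(config, environ, infiniband_enabled):
    """Return (device, message_ids) for the InfiniBand device selection.

    A device of None means no explicit device was applied; the document
    states that the first device is then used as the fallback.
    """
    messages = []
    device = config.get("gpuserver_infiniband_device")
    if device is None:
        # No parameter: fall back to the environment variable, if any.
        return environ.get("GPUBOX_IBDEV"), messages
    if not infiniband_enabled:
        # Parameter given but InfiniBand is disabled: value has no effect.
        messages.append("GBSC-GI-71J")
        return None, messages
    if "GPUBOX_IBDEV" in environ:
        # Both are defined: the parameter takes precedence.
        messages.append("GBSC-GI-61J")
    environ["GPUBOX_IBDEV"] = device
    return device, messages
```

For example, with a hypothetical device name "mlx4_0" in the configuration and "mlx4_1" already present in the environment, the sketch returns "mlx4_0" together with GBSC-GI-61J and overwrites the variable.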
    GBSC-GI-61K [{info_type_I}] Parameter 'gpuserver_infiniband_ports' take precedence over environment variable 'GPUBOX_IBPORTS', {gpubox_ibports}

    Message type:
    notice
    Parameter 'gpuserver_infiniband_ports' was specified but environment variable 'GPUBOX_IBPORTS' was also defined.
    Value of variable is displayed as {gpubox_ibports}.
    Source: GPUServer, Primary Server
    Result: Parameter takes precedence over variable.
    Solution: none
    GBSC-GI-72K [{info_type_I}] Type of 'gpuserver_infiniband_ports' not supported, first port will be used

    Message type:
    warning
    Parameter 'gpuserver_infiniband_ports' value is invalid and first InfiniBand port will be used.
    Source: GPUServer, Primary Server
    Result:
    • First InfiniBand port will be used
    • GPUServer start continues
    Solution: none
    GBSC-GI-72L [{info_type_I}] Set parameter 'gpuserver_infiniband_ports' failed, {detailed_message}

    Message type:
    warning
    Parameter 'gpuserver_infiniband_ports' was specified and GPUServer was trying to set environment variable 'GPUBOX_IBPORTS' but the action failed.
    Source: GPUServer, Primary Server
    Result: Value from 'gpuserver_infiniband_ports' does not have effect, first device will be used.
    Solution: Examine {detailed_message}.
    GBSC-GI-71K [{info_type_I}] Configuration parameter 'gpuserver_infiniband_ports' specified but InfiniBand is disabled

    Message type:
    warning
    Parameter 'gpuserver_infiniband_ports' was specified but InfiniBand communication is not enabled in parameter 'gpuserver_infiniband_enabled' or the parameter is missing.
    Source: GPUServer, Primary Server
    Result: Value from 'gpuserver_infiniband_ports' does not have effect, first device will be used.
    Solution: Enable InfiniBand communication, specify in configuration file: gpuserver_infiniband_enabled="yes"
    GBSC-GI-51A [{info_type_I}] GPUServer's hostname {hostname}

    Message type:
    information
    At start, GPUServer displays its hostname. The hostname is used for OServer communication and for the 'agpubox' command. Please see the 'agpubox' command reference for more details.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51B [{info_type_I}] gpuserver_bind_ip = {gpuserver_bind_ip}

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_bind_ip' parameter from configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51C [{info_type_I}] gpuserver_bind_port = {gpuserver_bind_port}

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_bind_port' parameter from configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51D [{info_type_I}] gpuserver_rest_bind = {gpuserver_rest_bind}

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_rest_bind' parameter from configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51F [{info_type_I}] gpuserver_rest_oserver_address = {gpuserver_rest_oserver_address}

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_rest_oserver_address' parameter from configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51G [{info_type_I}] gpuserver_infiniband_enabled = yes

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_infiniband_enabled' parameter from configuration file.
    'gpuserver_infiniband_enabled' = yes
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51H [{info_type_I}] gpuserver_infiniband_enabled = no

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_infiniband_enabled' parameter from configuration file.
    'gpuserver_infiniband_enabled' = no
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51J [{info_type_I}] gpuserver_infiniband_device = {gpuserver_infiniband_device}

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_infiniband_device' parameter from configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-51K [{info_type_I}] gpuserver_infiniband_ports = {gpuserver_infiniband_ports}

    Message type:
    information
    At GPUServer's start, it displays 'gpuserver_infiniband_ports' parameter from configuration file.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-52A [{info_type_I}] GPU devices in use: [ {id0}, {id1}, {id2} ]

    Message type:
    information
    At start, GPUServer resolves and displays the 'gpuserver_gpus' parameter from the configuration file. The message lists the numbers of the GPU devices that GPUServer will serve; the numbers match CUDA ordinal numbers. When 'all' is specified, this message lists all available devices.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-52B [{info_type_I}] All GPUs will be used

    Message type:
    information
    At GPUServer's start, it informs that all GPU devices are used.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
    GBSC-GI-52F [{info_type_I}] String in 'gpuserver_gpus' parameter not supported {value}, all GPUs will be used

    Message type:
    information
    At GPUServer's start, it informs that all GPU devices are used. 'all' string was not used. Parameter 'gpuserver_gpus' value is invalid and all GPU devices will be used.
    Source: GPUServer, Primary Server
    Result:
    • All GPUs will be used
    • GPUServer start continues
    Solution: none
    GBSC-GI-72A [{info_type_I}] Type of 'gpuserver_gpus' not supported, all GPUs will be used

    Message type:
    warning
    At GPUServer's start, it informs that all GPU devices are used. Neither array of numbers nor 'all' statement was used. Parameter 'gpuserver_gpus' value is invalid and all GPU devices will be used.
    Source: GPUServer, Primary Server
    Result:
    • All GPUs will be used
    • GPUServer start continues
    Solution: none
    GBSC-GI-72B [{info_type_I}] Cannot find 'gpuserver_gpus parameter', all GPUs will be used

    Message type:
    warning
    At GPUServer's start, it informs that all GPU devices are used. Parameter 'gpuserver_gpus' was not found and all GPU devices will be used.
    Source: GPUServer, Primary Server
    Result: GPUServer start continues
    Solution: none
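The messages about 'gpuserver_gpus' (GBSC-GI-52A, 52B, 52F, 72A and 72B) describe a fallback chain that always ends with all GPUs being used. The sketch below illustrates that chain under assumptions: the function name is hypothetical, the configuration is a plain dictionary, and only the most specific message ID is returned for each case.

```python
def resolve_gpus(config, available):
    """Return (device_ids, message_id) for the 'gpuserver_gpus' parameter.

    `available` is the list of CUDA ordinal numbers present on the host.
    """
    if "gpuserver_gpus" not in config:
        # Parameter not found: warning, all GPUs are used.
        return list(available), "GBSC-GI-72B"
    value = config["gpuserver_gpus"]
    if isinstance(value, str):
        if value == "all":
            return list(available), "GBSC-GI-52B"
        # Any other string is not supported: all GPUs are used.
        return list(available), "GBSC-GI-52F"
    is_int_array = isinstance(value, (list, tuple)) and all(
        isinstance(v, int) and not isinstance(v, bool) for v in value
    )
    if is_int_array:
        return list(value), "GBSC-GI-52A"
    # Neither an array of numbers nor 'all': warning, all GPUs are used.
    return list(available), "GBSC-GI-72A"
```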
    GBSC-CF-500 [{info_type_I}] Path to configuration file given by program parameter: {config_file}

    Message type:
    information
    Configuration file path was specified on the command line with the '-c' parameter.
    Source: GPUServer, Primary Server
    Result: Configuration parameters are processed from file {config_file}.
    Solution: none.
    GBSC-CF-51A [{info_type_I}] Path to configuration file from environment variable GPUSERVER_CONF: {config_file}

    Message type:
    information
    Configuration file path was specified in the environment variable GPUSERVER_CONF.
    Source: GPUServer, Primary Server
    Result: Configuration parameters are processed from file {config_file}.
    Solution: none.
    GBSC-CF-520 [{info_type_I}] Configuration file processed from {configuration_file}

    Message type:
    information
    Configuration file path was specified in /etc/gpuserver.conf.
    Source: GPUServer, Primary Server
    Result: Configuration parameters are processed from file:
    • Linux: /etc/gpuserver.conf
    • Windows: PROGRAMDATA%.conf
    Solution: none.
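Messages GBSC-CF-500, GBSC-CF-51A and GBSC-CF-520 correspond to three ways of locating the configuration file. The sketch below assumes a precedence order (command line, then environment variable, then the Linux default) that this document does not state explicitly; the function name is hypothetical and the Windows default is left out.

```python
DEFAULT_CONFIG_PATH = "/etc/gpuserver.conf"  # Linux default from GBSC-CF-520


def resolve_config_path(cli_path, environ):
    """Return (path, message_id) for the configuration file location.

    `cli_path` is the value of the '-c' parameter, or None if not given.
    The precedence order used here is an assumption.
    """
    if cli_path:
        return cli_path, "GBSC-CF-500"
    env_path = environ.get("GPUSERVER_CONF")
    if env_path:
        return env_path, "GBSC-CF-51A"
    return DEFAULT_CONFIG_PATH, "GBSC-CF-520"
```

Passing os.environ as the second argument applies the check to the real environment; a plain dictionary works for experiments.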
    GBSC-AX-80A [{info_type_C}] Failed to parse data from OServer, authorization failed

    Message type:
    error
    Data parser detected syntax errors while processing an authorization request from OServer.
    The specific error is displayed in message 'GBSC-AX-050' when 'gpuserver_log_level = "TRACE"' is set in the configuration file.
    Source: GPUServer, Primary Server
    Result:
    • authorization request failed
    • GPUServer processing continues.
    Solution: Restart GPUServer with 'gpuserver_log_level = "TRACE"' and analyze the error in message 'GBSC-AX-050'.
    GBSC-AX-050 [{info_type_C}] JSON error: {specific JSON error description}

    Message type:
    trace
    JSON parser detected syntax errors. The message contains the JSON error description.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer processing continues.
    Solution: none
    GBSC-AX-81A [{info_type_C}] Empty authorization vector of GPU devices

    Message type:
    error
    During authorization, parser cannot find GPU in data from OServer. Authorization vector is generated by OServer and defines what GPU devices Client can use on this GPUServer.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer processing continues.
    Solution: Likely client is not authorized or did not allocate GPU yet.
    Check user credentials and current user's allocations - for more details refer to 'agpubox command' reference.
    This message is at error level because it can also be an indicator of attacks or attempts of unauthorized access.
    GBSC-AX-80B [{info_type_C}] Too many GPU assigned to single user: {gpu_count}

    Message type:
    error
    The number of GPUs passed exceeds the number of devices allowed in GPUServer.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer processing continues.
    Solution:
    • Likely the data passed from OServer is incorrect, please contact support.
    • The message could also indicate unauthorized access attempts to GPUServer resources.
    GBSC-AX-80D [{info_type_C}] Userid not specified

    Message type:
    error
    During authorization, parser cannot find userid statement in data from OServer.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer processing continues.
    Solution:
    • Likely the data passed from OServer is incorrect, please contact support.
    • The message could also indicate unauthorized access attempts to GPUServer resources.
    GBSC-AX-80E [{info_type_C}] Username not specified

    Message type:
    error
    During authorization, parser cannot find username statement in data from OServer.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer processing continues.
    Solution:
    • Likely the data passed from OServer is incorrect, please contact support.
    • The message could also indicate unauthorized access attempts to GPUServer resources.
    GBSC-AX-80F [{info_type_C}] Invalid format of authorization vector of GPU devices

    Message type:
    error
    Data parser detected syntax errors. The authorization vector of GPU devices is incorrect or corrupted.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer continues processing.
    Solution:
    • Likely the data passed from OServer is incorrect, please contact support.
    • The message could also indicate unauthorized access attempts to GPUServer resources.
    GBSC-AX-91X [{info_type_C}] GPUServer internal error, user authorization failed

    Message type:
    critical
    Unexpected internal error occurred.
    Source: GPUServer, Primary Server
    Result:
    • request authorization failed
    • GPUServer processing continues.
    Solution:
    • Likely the data passed from OServer is incorrect, please contact support.
    • The message is also shown on unauthorized access attempts and/or when the data passed for authorization is entirely invalid.
    GBSC-CF-81B [{info_type_I}] I/O error while reading configuration file {config_file}

    Message type:
    error
    While opening configuration file {config_file} an error occurred.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated with message 'GSA2'.
    Solution: Check that the path to the configuration file and its permissions are correct, then restart GPUServer
    GBSC-CF-81C [{info_type_I}] Parse error in configuration file: {config_file}

    Message type:
    error
    Syntax error detected in configuration file
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated with message 'GSA2'.
    Solution: Correct the parameter and restart GPUServer.
    GBSC-HB-81L [{info_type_C}] Connection with OServer lost

    Message type:
    error
    Connection with OServer is lost. Likely OServer was terminated or network failure occurred.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process interrupted
    • GPUServer will keep trying to reconnect to OServer.
    Solution:
    • Verify if OServer is working properly, examine OServer's log.
    • Check if network is up and running, ping to OServer's IP address.
    GBSC-HB-31R [{info_type_C}] Connection with OServer error: {error_code}, {error_message}

    Message type:
    debug
    Connection to OServer is lost. The cause of the issue is displayed in {error_message}. This is only a debug message and should come paired with 'GBSC-HB-81R' or 'GBSC-HB-82R'.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process interrupted
    • GPUServer will keep trying to reconnect to OServer.
    Solution: none
    GBSC-HB-81R [{info_type_C}] Connection closed to OServer by peer

    Message type:
    error
    OServer was terminated and connection closed correctly.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process interrupted
    • GPUServer will keep trying to reconnect to OServer.
    Solution: start OServer to reconnect.
    GBSC-HB-82R [{info_type_C}] Connection to OServer error, connection lost

    Message type:
    error
    Connection to OServer is lost. Likely OServer was incorrectly terminated or network failure occurred.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process interrupted
    • GPUServer will keep trying to reconnect to OServer.
    Solution:
    • Verify if OServer is working properly, examine OServer's log.
    • Check if network is up and running, ping to OServer's IP address.
    GBSC-HB-71C [{info_type_C}] Incorrect response from OServer, retrying...

    Message type:
    warning
    During the OServer heartbeat, invalid data was received.
    Source: GPUServer, Primary Server
    Result:
    • GPUServer will retry the heartbeat communication up to 5 times in a row.
    • GPUServer processing continues but can fail when the OServer connection is really lost.
    Solution:
    • Verify if OServer is working properly, examine OServer's log.
    • Check if network is up and running. A network failure may have occurred while data was being sent to OServer.
    GBSC-HB-81C [{info_type_C}] Incorrect response from OServer, connection lost

    Message type:
    error
    During OServer's heartbeat, invalid data was received. An invalid response was received 5 times in a row, so the heartbeat is interrupted.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process interrupted
    • GPUServer will keep trying to reconnect to OServer.
    Solution:
    • Verify that OServer is working properly; examine OServer's log.
    • Check that the network is up and running; ping OServer's IP address.
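The relationship between GBSC-HB-71C and GBSC-HB-81C can be pictured as a counter of consecutive invalid heartbeat responses. The Python sketch below is illustrative only. The function name and the list-of-booleans input are hypothetical, and it assumes that a valid response resets the counter and that the fifth consecutive invalid response produces GBSC-HB-81C instead of another GBSC-HB-71C; this reference does not confirm either detail.

```python
MAX_INVALID_IN_A_ROW = 5  # threshold stated in the GBSC-HB-81C description


def heartbeat_messages(responses):
    """Return the message IDs logged for a sequence of heartbeat responses.

    `responses` is an iterable of booleans (True = valid response).
    Hypothetical model: a valid response resets the counter (assumption).
    """
    invalid_in_a_row = 0
    logged = []
    for is_valid in responses:
        if is_valid:
            invalid_in_a_row = 0
            continue
        invalid_in_a_row += 1
        if invalid_in_a_row >= MAX_INVALID_IN_A_ROW:
            logged.append("GBSC-HB-81C")  # error: heartbeat interrupted
            break
        logged.append("GBSC-HB-71C")  # warning: retrying
    return logged
```

Under these assumptions, isolated GBSC-HB-71C warnings that are not followed by GBSC-HB-81C indicate that the heartbeat recovered on its own.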
    GBSC-HB-80S [{info_type_C}] Connection lost to OServer

    Message type:
    error
    During OServer's heartbeat, an error occurred while data was being sent.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process interrupted
    • GPUServer will keep trying to reconnect to OServer.
    Solution:
    • Verify that OServer is working properly; examine OServer's log.
    • Check that the network is up and running; ping OServer's IP address.
    GBSC-HB-30T [{info_type_C}] OServer {ip_address} alive

    Message type:
    debug
    This message is shown every hour to confirm that the connection with OServer is up and running.
    Source: GPUServer, Primary Server
    Result:
    • heartbeat process continues
    • GPUServer processing continues.
    Solution: none
    GBSC-LR-60A [{info_type_C}] Log {log_file} reopened

    Message type:
    notice
    The log was reopened, most likely because a SIGHUP signal was received.
    {log_file} is the full path to the log file, or STDOUT.
    Source: GPUServer
    Result: log file reopened.
    Solution: none
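Reopening a log on SIGHUP is a common pattern that lets an external tool move the current log file away without stopping the server. The Python sketch below only illustrates the pattern; it is not GPUServer's implementation. The class and function names are hypothetical, and the text written by the handler is a simplified stand-in for GBSC-LR-60A.

```python
import os
import signal
import sys


class ReopenableLog:
    """Log sink that can be reopened, for example after the file was moved."""

    def __init__(self, path=None):
        self.path = path  # None means "log to STDOUT"
        self.stream = sys.stdout if path is None else open(path, "a")

    def write(self, line):
        self.stream.write(line + "\n")
        self.stream.flush()

    def reopen(self):
        """Reopen the log and return the name to report as {log_file}."""
        if self.path is None:
            return "STDOUT"
        self.stream.close()
        self.stream = open(self.path, "a")  # creates a fresh file if needed
        return os.path.abspath(self.path)


def install_sighup_handler(log):
    """Reopen `log` whenever SIGHUP arrives (POSIX only)."""
    def handler(signum, frame):
        log.write("notice: Log %s reopened" % log.reopen())
    signal.signal(signal.SIGHUP, handler)
```

With such a handler installed, renaming the log file and then sending SIGHUP to the process makes later messages land in a new file at the original path.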
    GBSC-LR-80A [{info_type_C}] Log {log_file} failed to reopen

    Message type:
    error
    GPUServer probably received a SIGHUP signal and tried to reopen the log as part of normal operation, but the reopen failed or the log was redirected to STDOUT.
    Source: GPUServer
    Solution: Verify that the path to GPUServer's log is accessible.
    GBSC-SC-90A [{info_type_I}] Cannot create subprocess to send initial configuration to OServer

    Message type:
    critical
    GPUServer cannot create subprocess to send initial configuration of GPU Devices to OServer.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated
    Solution:
    GBSC-SC-95A [{info_type_I}] Cannot initialize CUDA environment: {error}

    Message type:
    critical
    GPUServer cannot initialize CUDA environment due to {error}.
    Windows: Despite critical errors, other subprocesses can start successfully and broadcast readiness. Eventually all processes are terminated.
    Source: GPUServer, Primary Server
    Result: GPUServer will not register GPU devices.
    Solution:
    • Verify that at least one CUDA-compatible GPU device is installed.
    • Verify that the GPUServer process has access to /dev/nvidia* files.
    GBSC-SC-95B [{info_type_I}] Cannot initialize CUDA environment: {error}

    Message type:
    critical
    GPUServer cannot initialize CUDA environment due to {error}.
    Source: GPUServer, Primary Server
    Result: GPUServer will not register GPU devices.
    Solution:
    • Verify that at least one CUDA-compatible GPU device is installed.
    • Verify that the GPUServer process has access to /dev/nvidia* devices.
    GBSC-SC-60A [{info_type_I}] Found GPU device ID: {gpu_id}, name: {gpu_name}

    Message type:
    notice
    GPUServer found a GPU device with name {gpu_name} and ID {gpu_id}.
    The message is sent once for each GPU found during the initialization phase of GPUServer.
    Source: GPUServer, Primary Server
    Result: GPUServer will send the GPU information to OServer.
    Solution: none
    GBSC-SC-91D [{info_type_I}] Unexpected data exception occurred,, cannot send GPU configuration to OServer

    Message type:
    critical
    An unexpected exception occurred while the GPU device data was being prepared.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated
    Solution:
    GBSC-SC-99Z [{info_type_I}] Initial communication subprocess in critical error, GPUServer will be terminated

    Message type:
    critical
    GPUServer tried to send the configuration of GPUs but received a critical error.
    Source: GPUServer, Primary Server
    Result: GPUServer terminated
    Solution: Start OServer, then restart GPUServer.
    GBSC-SC-81A [{info_type_I}] Bad request response from OServer, retrying {retry_number}

    Message type:
    error
    During GPU device registration, OServer responded with 'bad request'. GPUServer will try to resend the request.
    If the resend fails 3 times in a row, GPUServer will be terminated with message GBSC-SC-91A.
    {retry_number} is the current retry number.
    Source: GPUServer, Primary Server
    Result: GPUServer will try to resend the request to OServer
    Solution:
    1. Try to find the cause by examining the previous messages.
    2. Please contact support.
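The retry behaviour described for GBSC-SC-81A and GBSC-SC-91A can be sketched as a bounded resend loop. This Python sketch is an illustration under assumptions, not GPUServer's code. `send_request` is a hypothetical callable that returns True on success and False on a 'bad request' response, and the sketch assumes that the first attempt is not counted as a retry.

```python
def register_gpus(send_request, max_resends=3):
    """Try to register GPU devices; return (registered, logged_messages).

    `send_request` is a hypothetical callable: True = accepted,
    False = OServer answered 'bad request'.
    """
    logged = []
    if send_request():  # first attempt, not counted as a retry (assumption)
        return True, logged
    for retry_number in range(1, max_resends + 1):
        logged.append("GBSC-SC-81A retrying %d" % retry_number)
        if send_request():
            return True, logged
    logged.append("GBSC-SC-91A")  # critical: GPUServer will be terminated
    return False, logged
```

The same shape would apply to the GBSC-SC-81Z and GBSC-SC-91Z pair, with 'unauthorized' in place of 'bad request'.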
    GBSC-SC-91A [{info_type_I}] Bad request response from OServer

    Message type:
    critical
    During GPU device registration, OServer responded with 'bad request'.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated
    Solution:
    1. Try to find the cause by examining the previous messages.
    2. Please contact support.
    GBSC-SC-81Z [{info_type_I}] Fatal response from OServer, retrying {retry_number}

    Message type:
    error
    During GPU device registration, OServer responded with 'unauthorized'. GPUServer will try to resend the request.
    If the resend fails 3 times in a row, GPUServer will be terminated with message GBSC-SC-91Z.
    {retry_number} is the current retry number.
    Source: GPUServer, Primary Server
    Result: GPUServer will try to resend the request to OServer
    Solution:
    1. Try to find the cause by examining the previous messages.
    2. Please contact support.
    GBSC-SC-91Z [{info_type_I}] Fatal response from OServer, GPUServer is unauthorized

    Message type:
    critical
    During GPU device registration, OServer responded that GPUServer is not authorized to register GPU devices.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated
    Solution:
    • Follow the 'Security' chapter in the manual and check if 'auth_token' in configuration file is correct.
    GBSC-SC-71A [{info_type_I}] Cannot send initial GPUs configuration, retrying every 5s ...

    Message type:
    warning
    OServer is not ready yet or it is not started.
    GPUServer tries to reconnect to OServer every 5 seconds; this message is shown every 1000 seconds.
    Source: GPUServer, Primary Server
    Result: GPUServer will keep trying to connect to OServer and send the configuration of GPUs
    Solution:
    • Wait until OServer is fully initialized, or start OServer.
    • If OServer is started and this message still appears, restart GPUServer.
    GBSC-SC-61A [{info_type_I}] Initial GPUs configuration sent to OServer

    Message type:
    notice
    The initial GPU device configuration was successfully sent to OServer.
    Source: GPUServer, Primary Server
    Result: GPUServer is ready to accept requests from Clients.
    Solution: none
    GBSC-SL-70A [{info_type_I}] Parameter 'gpuserver_log_path' not found in configuration file, default used: STDOUT

    Message type:
    warning
    Parameter 'gpuserver_log_path' is missing in configuration file.
    Source: GPUServer, Primary Server
    Result:
    • log will be redirected to standard output
    • GPUServer processing continues.
    Solution: If log output to a file is needed, specify the full path of the file in 'gpuserver_log_path'.
    GBSC-SL-70B [{info_type_I}] Parameter 'gpuserver_log_level' not found in configuration file, default value used: TRACE

    Message type:
    warning
    Parameter 'gpuserver_log_level' is missing in configuration file. Default value is NOTICE so the most relevant messages are displayed.
    Source: GPUServer, Primary Server
    Result: Only NOTICE, WARNING, ERROR and CRITICAL messages are displayed in log.
    Solution: Specify parameter 'gpuserver_log_level' in configuration file. Available log levels are: TRACE, DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL
    GBSC-SL-50C [{info_type_I}] Logging started

    Message type:
    info
    Logging has been started.
    Source: GPUServer, Primary Server
    Result: GPUServer processing continues.
    Solution: none
    GBSC-SL-68A [{info_type_I}] Invalid log level, 'TRACE' will be used instead

    Message type:
    notice
    GPUServer will use default value 'TRACE' for 'gpuserver_log_level'. The value from configuration file is invalid.
    Source: GPUServer, Primary Server
    Result: GPUServer will continue logging with 'TRACE' level.
    Solution: Value of parameter 'gpuserver_log_level' must be one of 'TRACE', 'DEBUG', 'INFO', 'NOTICE', 'WARNING', 'ERROR', 'CRITICAL'.
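The validation described for GBSC-SL-68A, and the way a level selects which messages appear in the log, can be sketched in a few lines of Python. This is only a sketch: the function names are hypothetical, case-sensitive matching is an assumption, and the fallback value is the one named in the GBSC-SL-68A message text.

```python
# Levels from least to most severe, in the order listed for 'gpuserver_log_level'.
LOG_LEVELS = ("TRACE", "DEBUG", "INFO", "NOTICE", "WARNING", "ERROR", "CRITICAL")


def resolve_log_level(configured, fallback="TRACE"):
    """Return (effective_level, message_id); message_id is None when valid."""
    if configured in LOG_LEVELS:  # exact, case-sensitive match (assumption)
        return configured, None
    return fallback, "GBSC-SL-68A"


def displayed_levels(level):
    """Return the levels that are written to the log at the given setting."""
    return LOG_LEVELS[LOG_LEVELS.index(level):]
```

For example, `displayed_levels("NOTICE")` yields NOTICE, WARNING, ERROR and CRITICAL, which matches the result described for GBSC-SL-70B.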
    GBSC-SL-71A [{info_type_I}] Cannot open output file {gpuserver_log_path} logging continue to standard output

    Message type:
    warning
    GPUServer cannot open the log file; further logging is redirected to standard output.
    Source: GPUServer, Primary Server
    Result:
    • log is written to standard output
    • GPUServer processing continues.
    Solution:
    • Check that the path to the log is correct.
    • Check that all directories in the path and the log file have write permissions.
    GBSC-SL-50A [{info_type_I}] gpuserver_log_path = {gpuserver_log_path }

    Message type:
    info
    Displays the path to the log specified in configuration parameter 'gpuserver_log_path'.
    Source: GPUServer, Primary Server
    Result: GPUServer processing continues.
    Solution: none
    GBSC-SL-50B [{info_type_I}] gpuserver_log_level = {gpuserver_log_level}

    Message type:
    info
    Displays the log level specified in the configuration parameter. If the value of 'gpuserver_log_level' is not one of the accepted values, ERROR will be used.
    Source: GPUServer, Primary Server
    Result: GPUServer processing continues.
    Solution: none
    GBSC-SL-90A [{info_type_I}] Unexpected exception occurred, failed to start logging

    Message type:
    critical
    An unexpected exception occurred during logging setup.
    Source: GPUServer, Primary Server
    Result: GPUServer is being terminated.
    Solution: Examine the previous messages.
    GBSC-SG-80A [{info_type_I}] Config parameter 'gpuserver_bind_ip' has invalid IP address: {gpuserver_bind_ip}

    Message type:
    error
    At GPUServer start, an invalid IP address was detected in parameter 'gpuserver_bind_ip'. The invalid value is displayed in {gpuserver_bind_ip}.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: Correct the configuration parameter and restart GPUServer.
    GBSC-SG-80C [{info_type_I}] Config parameter 'gpuserver_bind_ip' specified, but cannot be 0.0.0.0, current value : {gpuserver_bind_ip}

    Message type:
    error
    At GPUServer start, an invalid IP address was detected in parameter 'gpuserver_bind_ip'. The invalid value is displayed in {gpuserver_bind_ip}.
    The interface specified in 'gpuserver_bind_ip' cannot be 0.0.0.0 or a loopback address.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: Correct the configuration parameter and restart GPUServer.
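A value for 'gpuserver_bind_ip' can be checked against the two rules above before starting GPUServer. The Python sketch below uses the standard `ipaddress` module. The function name is hypothetical, and the mapping of each failure to a message ID is an assumption based on the descriptions of GBSC-SG-80A and GBSC-SG-80C.

```python
import ipaddress


def check_bind_ip(value):
    """Return the message ID the value would likely trigger, or None if it passes."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return "GBSC-SG-80A"  # not a valid IP address at all
    if address.is_unspecified or address.is_loopback:
        return "GBSC-SG-80C"  # 0.0.0.0 or loopback is not allowed
    return None
```

For example, `check_bind_ip("127.0.0.1")` returns "GBSC-SG-80C", while an address of a real network interface returns None.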
    GBSC-SG-82I [{info_type_I}] 'gpuserver_rest_bind' has invalid format, must be bound on specific IP address

    Message type:
    error
    The interface specified in 'gpuserver_rest_bind' cannot be 0.0.0.0; it must be a specific IP interface.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: correct configuration parameter and restart GPUServer
    GBSC-SG-82J [{info_type_I}] 'gpuserver_rest_bind' has invalid format, current value: {gpuserver_rest_bind}

    Message type:
    error
    The interface has an invalid format.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: Examine the incorrect value, change it according to the specification given in the manual and restart GPUServer.
    GBSC-SG-83U [{info_type_I}] 'gpuserver_rest_oserver_address' has invalid format

    Message type:
    error
    OServer address must be a full, valid HTTP or HTTPS address.
    Source: GPUServer, Primary Server
    Result: GPUServer will terminate
    Solution: Correct the configuration parameter and restart GPUServer.
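A quick pre-check of 'gpuserver_rest_oserver_address' can catch the most common mistake, a missing scheme. The Python sketch below uses `urllib.parse.urlsplit` from the standard library. The function name is hypothetical, and the exact rules GPUServer applies may be stricter than this approximation.

```python
from urllib.parse import urlsplit


def is_valid_oserver_address(value):
    """Return True if value looks like a full HTTP or HTTPS address."""
    try:
        parts = urlsplit(value)
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 brackets
        return False
    # Require an explicit http/https scheme and a non-empty host part.
    return parts.scheme in ("http", "https") and bool(parts.hostname)
```

A bare "host:port" value fails this check because it has no scheme, which is the kind of value that would be expected to trigger GBSC-SG-83U.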
    GBSC-SG-99J [{info_type_I}] GPUServer cannot be started

    Message type:
    critical
    As a consequence of previous errors, GPUServer cannot be started.
    Source: GPUServer, Primary Server
    Result: GPUServer is being terminated.
    Solution: Some configuration parameters are probably incorrect. Examine the previous messages, fix the issue and restart GPUServer.
    GBSC-SG-91J [{info_type_I}] TCP/IP exception: {exception_message}

    Message type:
    critical
    TCP/IP exception occurred.
    Source: GPUServer, Primary Server
    Result: GPUServer is being terminated.
    Solution:
    • Check if specified parameters related to TCP communication are correct, carefully examine parameters and their values
      • gpuserver_bind_port, value displayed in 'GBSC-GI-51C'
      • gpuserver_rest_bind, value displayed in 'GBSC-GI-51D'
      • gpuserver_rest_oserver_address, value displayed in 'GBSC-GI-51F'
      • gpuserver_bind_ip, value displayed in 'GBSC-GI-51B', errors can also be displayed in
        • 'GBSC-SG-80A'
        • 'GBSC-SG-80B'
        • 'GBSC-SG-80C'
        • 'GBSC-SG-82I'
        • 'GBSC-SG-82J'
    • Examine the previous messages, try to fix the issue and restart GPUServer.
    • Please contact support.
    GBSC-SG-91D [{info_type_I}] Unexpected exception occurred

    Message type:
    critical
    Unspecified exception occurred.
    Source: GPUServer, Primary Server
    Result: GPUServer is being terminated.
    Solution:
    GBSC-SG-98X [{info_type_I}] Cannot continue without RESTful service

    Message type:
    critical
    GPUServer cannot continue initialization without RESTful service.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated.
    Solution: Examine messages related to the RESTful service, such as 'OSVC-MN-84A', 'GBSC-SG-82I' or 'GBSC-SG-82J'.
    GBSC-SG-61D [{info_type_I}] GPUServer is ready

    Message type:
    notice
    This message is shown at start and indicates that GPUServer is ready to accept requests from clients. When the connection to OServer fails and is then re-established, message GBSC-SC-61A indicates that GPUServer is back to normal.
    Source: GPUServer, Primary Server
    Result: GPUServer will accept requests from clients.
    Solution: none
    GBSC-SG-98Z [{info_type_I}] Cannot continue without RESTful service

    Message type:
    notice
    The secondary server must have been terminated.
    Source: GPUServer, Primary Server
    Result: GPUServer will be terminated.
    Solution:
    • Try to find the cause of the secondary server's termination. The process could have been terminated from outside GPUServer.
    • Examine messages related to the RESTful service, such as 'OSVC-MN-84A'.
    GBSC-SG-73A [{info_type_C}] Parameter was not change for primary server

    Message type:
    warning
    The administrator tried to change a parameter whose new value must be propagated across all processes, but the parameter could not be changed for the primary server. The primary server could not read the configuration parameters passed by the secondary server. If parameters such as 'gpuserver_log_level' or 'gpuserver_infiniband_enabled' were changed, the change has no effect on the primary server of the currently running GPUServer.
    Source: GPUServer, Primary Server
    Result:
    • Primary server continues working with the parameters from the configuration file.
    • GPUServer continues processing
    Solution: Please contact support.
    GBSC-SG-78A [{info_type_C}] Invalid response, connection closed from {remote_ip_address}

    Message type:
    warning
    GPUServer detected an invalid handshake. The connection failure can be caused by OServer, a Client or an unauthorized source. The connection came from {remote_ip_address} and was denied on the port specified in configuration parameter 'gpuserver_bind_port'.
    Source: GPUServer, Primary Server
    Result: This request to GPUServer was denied.
    Solution:
    • Examine the previous messages to check whether the IP address belongs to OServer or a Client.
    • Many consecutive messages may indicate attempted attacks or many failed connection attempts from clients.
    • If the cause of the message is not obvious, save any client messages and the GPUServer and OServer logs, and contact support.
    GBSC-SG-50A [{info_type_C}] Connection accepted from {remote_ip_address}:{remote_port}

    Message type:
    info
    GPUServer accepted a connection from either OServer or a Client.
    Source: GPUServer, Primary Server
    Result: Request processing continues.
    Solution: none
    GBSC-SG-52A [{info_type_C}] Heartbeat to {remote_ip_address} started

    Message type:
    info
    GPUServer started a heartbeat connection to OServer. OServer has IP address {remote_ip_address}.
    Source: GPUServer, Primary Server
    Result: Request processing continues.
    Solution: none
    GBSC-SG-82A [{info_type_C}] Heartbeat authorization failed

    Message type:
    error
    Heartbeat connection detected but authorization failed.
    Source: GPUServer, Primary Server
    Result: heartbeat request failed.
    Solution: Check OServer log for appropriate message, otherwise unauthorized connection detected.
    GBSC-SG-83A [{info_type_C}] Cannot create child process

    Message type:
    error
    GPUServer cannot create a new process to fulfill the Client's request.
    Source: GPUServer, Primary Server
    Result:
    • client's request failed
    • connection terminated.
    Solution: Examine Client messages and previous messages in GPUServer.
    GBSC-SG-53A [{info_type_C}] Subprocess {child_pid} created

    Message type:
    info
    A new process with PID {child_pid} was created for the Client's request.
    Source: GPUServer, Primary Server
    Result: Processing of the client's request continues.
    Solution: none
    GBSC-SG-700 [{info_type_C}] Authorization failed

    Message type:
    warning
    The Client provided invalid credentials.
    Source: GPUServer, Primary Server
    Result: Client's request will be terminated
    Solution:
    • The user may have an outdated token in the Client's configuration file and may need to log in again with the 'gpubox token' subcommand. The token is stored in the following Client configuration files:
      • on Linux: $HOME/.gpubox
      • on Windows environment %LOCALAPPDATA%\.gpubox
    • The Client's credentials are no longer valid. Examine OServer's log and the user's credentials.
    GBSC-SG-53B [{info_type_C}] Process: {child_pid} finished

    Message type:
    info
    The Client's request has either finished correctly or failed, and the process has finished.
    Source: GPUServer, Primary Server
    Result: client's request finished.
    Solution: If the Client's request failed, examine the previous messages.
    GBSC-SR-90A [{info_type_I}] Cannot create RESTful service

    Message type:
    critical
    GPUServer cannot create the RESTful communication service point.
    Source: GPUServer, Secondary Server
    Result: GPUServer will be terminated
    Solution: Examine the previous messages.
    GBSC-SR-30A [{info_type_I}] Starting RESTful service on: {gpuserver_rest_bind}

    Message type:
    debug
    GPUServer is starting the RESTful server on the interface and port defined by {gpuserver_rest_bind}.
    Source: GPUServer, Secondary Server
    Result: Message 'GBSC-SR-91A' reports errors, and message 'GBSC-SR-60A' reports that the service started successfully.
    Solution: none.
    GBSC-SR-91A [{info_type_I}] Cannot create RESTful service

    Message type:
    critical
    GPUServer cannot create communication service point.
    Source: GPUServer, Secondary Server
    Result: GPUServer will send the GPU configuration to OServer and will not be terminated; however, it cannot serve any requests and must be restarted manually.
    Solution: Examine message OSVC-MN-84A, fix the issue and restart GPUServer.
    GBSC-SR-60A [{info_type_I}] Rest service bound on: {gpuserver_rest_bind}, PID: {pidid}

    Message type:
    notice
    The RESTful server is started and bound to the interface and port given by configuration parameter {gpuserver_rest_bind}. Windows: This message can be shown despite CRITICAL messages in other processes. Eventually all processes are terminated.
    The PID of the new process is {pidid}.
    Source: GPUServer, Secondary Server
    Result: Secondary server is ready to accept requests
    Solution: none
    GBSC-DT-32A [{info_type_I}] Heartbeat termination signal sent to OServer

    Message type:
    debug
    During termination, GPUServer sends heartbeat termination signal to OServer.
    Source: GPUServer, Primary Server
    Result:
    • GPU devices from this server will get STOPPED status
    • GPUServer continues termination
    Solution: none
    GSU0 User must be a root to switch to another one

    Message type:
    critical
    GPUServer must be started by root.
    Source: GPUServer
    Result: GPUServer is terminated, return code 1.
    Solution: Start GPUServer as root.
    GSU1 user {user} does not exist or cannot retrieve the user's information, {errno_message} ({errno_id})

    Message type:
    critical
    Argument '-u' was passed on the command line to set a new user, but the operation failed due to {errno_message}.
    Source: GPUServer
    Result: GPUServer is terminated, return code 1.
    Solution: Analyze {errno_message} to see if the error can be fixed; otherwise, please contact support.
    GSU2 cannot set user {user} for GPUServer, {errno_message} ({errno_id})

    Message type:
    critical
    Argument '-u' was passed on the command line to set a new user, but the operation failed due to {errno_message}.
    Source: GPUServer
    Result: GPUServer is terminated, return code 1.
    Solution: Analyze {errno_message} to see if the error can be fixed; otherwise, please contact support.
    GSD1 cannot start daemon for GPUServer - detach process failed {errno_message} ({errno_id})

    Message type:
    critical
    GPUServer cannot start the daemon process. Detaching the process failed. This message is redirected to STDERR and is not displayed in the log.
    Source: GPUServer
    Result: GPUServer is terminated.
    Solution: Analyze {errno_message}; if the error cannot be fixed, please contact support.
    GSD2 cannot start daemon for GPUServer - detach process failed {errno_message} ({errno_id})

    Message type:
    critical
    GPUServer cannot start the daemon process. Detaching the process failed. This message is redirected to STDERR and is not displayed in the log.
    Source: GPUServer
    Result: GPUServer is terminated.
    Solution: Analyze {errno_message}; if the error cannot be fixed, please contact support.
    GSA1 configuration file path is missing

    Message type:
    critical
    Parameter '-c' or environment variable 'GPUSERVER_CONF' is used but path to configuration file is missing.
    This message is redirected to STDERR and it's not displayed in log.
    Source: GPUServer
    Result: GPUServer is terminated.
    Solution: Check if path to configuration file exists and it is correct. Configuration path is given either by
    • '-c' parameter at command line or
    • environment variable GPUSERVER_CONF
    GSA2 Cannot find configuration file

    Message type:
    critical
    GPUServer cannot find configuration file given by parameter '-c' or by an environment variable 'GPUSERVER_CONF'.
    This message is redirected to STDERR and it's not displayed in log.
    Source: GPUServer
    Result: GPUServer is terminated.
    Solution: Check if path to configuration file exists and it is correct. Configuration path is given by
    • '-c' parameter at command line,
    • environment variable GPUSERVER_CONF
    GSRV-GS-94B [{info_type_I}] Exception occurred

    Message type:
    critical
    GPUServer caught an unexpected exception.
    Source: GPUServer
    Result: GPUServer is terminated immediately
    Solution: please contact support.
    GSRV-GS-96B [{info_type_I}] TCP/IP exception {tcp_message}

    Message type:
    critical
    GPUServer caught TCP/IP exception, message {tcp_message} gives details.
    Source: GPUServer
    Result: GPUServer is terminated immediately
    Solution: Save the GPUServer and OServer logs and contact support.
    GSRV-GS-96C [{info_type_I}] Exception occurred

    Message type:
    critical
    GPUServer caught an unexpected exception
    Source: GPUServer
    Result: GPUServer is terminated immediately
    Solution: please contact support.
    GBRS-CH-50A [{info_type_R}] Connection from {remote_ip_address}:{remote_port}, {http_request_method}:{uri}

    Message type:
    info
    Every request to GPUServer's RESTful interface is registered in log when 'gpuserver_log_level' is at least at 'INFO' level.
    Source: GPUServer, Secondary Server
    Result: GPUServer continues processing request.
    Solution: none
    GBRS-WS-500 [{info_type_R}] HTTP Status Code {HTTP_return_code}

    Message type:
    info
    GPUServer displays the HTTP return code when a request is finished.
    Source: GPUServer, Secondary Server
    Result: GPUServer continues processing.
    Solution: none.
    GBRS-GG-70A [{info_type_R}] URI has invalid GPU number

    Message type:
    warning
    The URI contains an invalid GPU number. The number must be an unsigned integer.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and origin of the request.
    • If the request comes from the GPUBox package, please contact support.
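When building requests against the RESTful interface from custom tooling, the GPU number can be validated on the client side before it is placed in the URI. The Python sketch below shows one way to check for an unsigned integer. The function name is hypothetical, and the URI layout of the RESTful interface is not described in this reference, so only the number itself is checked.

```python
def parse_gpu_number(segment):
    """Return the GPU number as an int, or None if it is not an unsigned integer."""
    # isdigit() alone also accepts some non-ASCII digits, so require ASCII too.
    if segment.isascii() and segment.isdigit():
        return int(segment)
    return None
```

A value such as "-1" or "1.5" yields None here and would be expected to produce HTTP return code 400 on the server side.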
    GBRS-GG-71Z [{info_type_R}] CUDA error: {cuda_error}

    Message type:
    warning
    GPUServer was affected by a CUDA error.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • Check if the CUDA environment for GPUServer is up and running.
    • Check if the GPUServer process has access to /dev/nvidia* files.
    • If the issue repeats, restart GPUServer.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-GG-71A [{info_type_R}] CUDA error: {cuda_error}

    Message type:
    warning
    GPUServer was affected by a CUDA error.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • Check if the CUDA environment for GPUServer is up and running.
    • Check if the GPUServer process has access to /dev/nvidia* files.
    • If the issue repeats, restart GPUServer.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-GJ-80A [{info_type_R}] Cannot retrieve GPU configuration

    Message type:
    error
    GPUServer cannot get the GPU configuration due to an unexpected exception.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled or served partially
    • GPUServer processing continues
    Solution:
    1. Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and origin of the request.
    2. If the issue repeats, try to restart GPUServer.
    3. Please contact support.
    GBRS-GU-70A [{info_type_R}] Invalid request, negative number {number}

    Message type:
    warning
    A negative number was passed in the URI.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-GU-70Z [{info_type_R}] CUDA error: {cuda_error}

    Message type:
    warning
    GPUServer was affected by a CUDA error.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • Check if the CUDA environment for GPUServer is up and running.
    • Check if the GPUServer process has access to /dev/nvidia* files.
    • If the issue repeats and GPUServer keeps sending HTTP return code 500, restart GPUServer.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-GU-70B [{info_type_R}] CUDA error: {cuda_error}

    Message type:
    warning
    GPUServer was affected by a CUDA error.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • Check if the CUDA environment for GPUServer is up and running.
    • If the issue repeats and GPUServer keeps sending HTTP return code 500, restart GPUServer.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-GU-80A [{info_type_R}] Cannot retrieve GPU information

    Message type:
    error
    While processing a request, GPUServer encountered an unexpected exception and could not retrieve information about GPU devices.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • Check if the CUDA environment for GPUServer is up and running.
    • If the issue repeats, restart GPUServer.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-LC-70C [{info_type_R}] Invalid configuration parameter specified

    Message type:
    warning
    While retrieving configuration parameters, the request contained invalid input data. Refer to message GBRS-CH-50A for more details about the URI and its origin.
    Source: GPUServer, Secondary Server
    Result:
    • Request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution: If the request comes from the GPUBox package, please contact support.
    GBRS-LC-70D [{info_type_R}] Data output is invalid

    Message type:
    warning
    Request has invalid output data. Refer to message GBRS-CH-50A for more details about URI and its origin.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and origin of the request.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-LC-800 [{info_type_R}] Unknown exception caught, failed to list configuration parameters

    Message type:
    error
    GPUServer cannot retrieve configuration parameters.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    • requestor received message: "Unknown error caught"
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and origin of the request.
    • Possibly, the failing request is a consequence of incorrect input parameters.
    • If the issue repeats, try to restart GPUServer.
    • Please contact support.
    GBRS-LL-61E [{info_type_R}] Log is redirected to STDOUT or is not accessible

    Message type:
    notice
    This message is displayed when the log is requested while [gpuserver_log_path] is set to [STDOUT].
    Source: GPUServer, Secondary Server
    Result: GPUServer processing continues.
    Solution: none.
    GBRS-LL-81D [{info_type_R}] Log too large to retrieve: {size}

    Message type:
    error
    The log is too large and cannot be retrieved.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    Solution:
    • Move the current log to an archive file and empty the current one. The file is specified by 'gpuserver_log_path'.
    • Reduce the log size on a daily or weekly basis. You can use tools like logrotate. Refer to the GPUServer operations section of the manual.
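    The first step of the solution above (archive the log, then empty it) can be sketched as follows. This is a minimal illustration, not a GPUServer tool: the function name and both paths are hypothetical, and it assumes that emptying the file in place is acceptable for the running server. Lines written between the copy and the truncation would be lost, so a tool such as logrotate remains the more robust choice.

```python
import shutil
from pathlib import Path


def archive_and_empty_log(log_path, archive_path):
    """Copy the current log to an archive file, then empty the current log.

    Returns the number of bytes that were in the log before it was emptied.
    """
    log_path = Path(log_path)
    archive_path = Path(archive_path)
    size = log_path.stat().st_size
    # Keep a full copy (including timestamps) before touching the original.
    shutil.copy2(log_path, archive_path)
    # Truncate in place so the file stays at the configured location.
    with open(log_path, "r+b") as handle:
        handle.truncate(0)
    return size
```

    The log path passed in would be the value of 'gpuserver_log_path'; the archive name is up to the administrator.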
    GBRS-LL-61F [{info_type_R}] Log is empty

    Message type:
    notice
    Log is empty or failed to be retrieved from file.
    Source: GPUServer, Secondary Server
    Result:
    • request returns HTTP return code 204
    • GPUServer processing continues
    Solution:
    GBRS-LL-71A [{info_type_R}] Limit must be an integer number

    Message type:
    warning
    The limit specified in the URI is not an integer.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • If the request comes from the GPUBox package, please contact support.
    GBRS-LL-65C [{info_type_R}] Log is empty

    Message type:
    notice
    Log is empty or failed to be retrieved from file.
    Source: GPUServer, Secondary Server
    Result:
    • request returns HTTP return code 204
    • GPUServer processing continues
    Solution:
    GBRS-LL-51A [{info_type_R}] HTTP Status Code 200

    Message type:
    info
    Log was retrieved successfully.
    Source: GPUServer, Secondary Server
    Result: Continue processing.
    Solution: none.
    GBRS-LL-51G [{info_type_R}] HTTP Status Code 200

    Message type:
    info
    Log was retrieved successfully.
    Source: GPUServer, Secondary Server
    Result: Continue processing.
    Solution: none.
    GBRS-LL-81B [{info_type_R}] unknown exception caught, failed to retrieve log

    Message type:
    error
    Cannot retrieve GPUServer's log.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    Solution:
    1. Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    2. Check if the log configuration is correct, i.e. verify the values of the gpuserver_log_... parameters.
    3. If the problem persists, please contact support.
    GBRS-LP-30A [{info_type_R}] Empty process list

    Message type:
    debug
    OServer requested the list of processes, but no clients are connected at the moment.
    Source: GPUServer, Secondary Server
    Result: processing continues
    Solution: none
    GBRS-LP-80A [{info_type_R}] Cannot retrieve information about processes

    Message type:
    error
    Unexpected exception occurred while retrieving process list.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 500
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and origin of the request.
    • If the issue repeats, try restarting GPUServer.
    • If the problem persists, please contact support.
    GBRS-NV-71A [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll {reason_message}

    Message type:
    warning
    Cannot load the nvidia-ml.so/nvml.dll library. A detailed message is given in {reason_message}.
    The message is shown about once every 20 unsuccessful attempts to open libnvidia-ml.so/nvml.dll.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution:
    • Examine {reason_message} and verify library accessibility.
      If the library is not reachable, add the library's directory to the standard system library search path.
    • If the library does not exist, install it from the NVIDIA website.
    • Linux: Check if the library already exists under a different extension such as nvidia-ml.so.1; if so, create a symbolic link, for example: # ln -s nvidia-ml.so.1 nvidia-ml.so
    • Windows: check if nvml.dll is on the search path for the gpuserver.exe program.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
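    To see the loader's own reason for a failure outside GPUServer, the load can be reproduced with Python's standard ctypes module. This is only a diagnostic sketch: the helper name is hypothetical, and the candidate names you pass in are your choice (this section names nvidia-ml.so, nvidia-ml.so.1 and nvml.dll).

```python
import ctypes


def try_load_library(candidates):
    """Try each candidate library name in order.

    Returns (handle, errors): handle is the first library that loaded, or None;
    errors maps each name that failed to the dynamic loader's reason message.
    """
    errors = {}
    for name in candidates:
        try:
            return ctypes.CDLL(name), errors
        except OSError as exc:
            # The text of the exception is the dynamic loader's explanation.
            errors[name] = str(exc)
    return None, errors
```

    For example, try_load_library(["nvidia-ml.so", "nvidia-ml.so.1"]) on Linux returns a handle if either name resolves on the search path, and otherwise a dictionary of reasons comparable to {reason_message}.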
    GBRS-NV-72A [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll, missing symbol nvmlInit {reason_message}

    Message type:
    warning
    Symbol nvmlInit is missing in the library nvidia-ml.so/nvml.dll. Detailed message is given with {reason_message}.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution: Verify that nvidia-ml.so/nvml.dll is the correct library and reinstall it if required.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
    GBRS-NV-72B [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll, missing symbol nvmlDeviceGetHandleByIndex {reason_message}

    Message type:
    warning
    Symbol nvmlDeviceGetHandleByIndex is missing in the library nvidia-ml.so/nvml.dll. Detailed message is given with {reason_message}.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution: Verify that nvidia-ml.so/nvml.dll is the correct library and reinstall it if required.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
    GBRS-NV-72C [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll, missing symbol nvmlDeviceGetTemperature {reason_message}

    Message type:
    warning
    Symbol nvmlDeviceGetTemperature is missing in the library nvidia-ml.so/nvml.dll. Detailed message is given with {reason_message}.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution: Verify that nvidia-ml.so/nvml.dll is the correct library and reinstall it if required.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
    GBRS-NV-72D [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll, missing symbol nvmlDeviceGetFanSpeed {reason_message}

    Message type:
    warning
    Symbol nvmlDeviceGetFanSpeed is missing in the library nvidia-ml.so/nvml.dll. Detailed message is given with {reason_message}.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution: Verify that nvidia-ml.so/nvml.dll is the correct library and reinstall it if required.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
    GBRS-NV-72E [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll, missing symbol nvmlShutdown {reason_message}

    Message type:
    warning
    Symbol nvmlShutdown is missing in the library nvidia-ml.so/nvml.dll. Detailed message is given with {reason_message}.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution: Verify that nvidia-ml.so/nvml.dll is the correct library and reinstall it if required.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
    GBRS-NV-72F [{info_type_R}] Failed to load nvidia-ml.so/nvml.dll, missing symbol nvmlDeviceGetUtilizationRates {reason_message}

    Message type:
    warning
    Symbol nvmlDeviceGetUtilizationRates is missing in the library nvidia-ml.so/nvml.dll. Detailed message is given with {reason_message}.
    Source: GPUServer, Secondary Server
    Result:
    • Temperature and fan speed are not available
    • Continue processing.
    Solution: Verify that nvidia-ml.so/nvml.dll is the correct library and reinstall it if required.
    • Windows: library name: nvml.dll
    • Linux: library name: nvidia-ml.so
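    Messages GBRS-NV-72A through GBRS-NV-72F each report one missing symbol. A quick way to check all six at once is sketched below. The symbol list is taken from the messages above; the helper name is hypothetical, and the check relies on the fact that a ctypes library object raises AttributeError for a symbol it cannot resolve.

```python
# Symbols named in messages GBRS-NV-72A .. GBRS-NV-72F.
REQUIRED_SYMBOLS = (
    "nvmlInit",
    "nvmlDeviceGetHandleByIndex",
    "nvmlDeviceGetTemperature",
    "nvmlDeviceGetFanSpeed",
    "nvmlShutdown",
    "nvmlDeviceGetUtilizationRates",
)


def missing_symbols(library, names=REQUIRED_SYMBOLS):
    """Return the names from `names` that `library` does not provide.

    `library` is any object exposing symbols as attributes, for example
    the object returned by ctypes.CDLL("nvidia-ml.so").
    """
    return [name for name in names if not hasattr(library, name)]
```

    An empty result means every symbol listed here was found; any name returned corresponds to one of the warnings above.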
    GBRS-PT-80A [{info_type_R}] Invalid process ID: {process_id}

    Message type:
    error
    The requested {process_id} has an invalid format. It must be an unsigned integer.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution:
    • Refer to previous message GBRS-CH-50A with the same connection ID and verify the URI and the origin of the request.
    • If the request comes from the GPUBox package, please contact support.
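    A caller that builds such requests can validate the process ID before sending it. The check below is a sketch of "unsigned integer" as plain ASCII digits; the function name is hypothetical, and whether GPUServer itself accepts leading zeros or enforces an upper bound is not stated in this section.

```python
def parse_process_id(text):
    """Return `text` as a non-negative int, or None if it is not an unsigned integer."""
    # isascii() excludes non-ASCII digit characters that isdigit() alone would accept.
    if text.isascii() and text.isdigit():
        return int(text)
    return None
```

    Signs, whitespace, and empty strings are all rejected, so "-1", " 12" and "" return None.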
    GBRS-PT-61G [{info_type_R}] Cannot remove active process, pid: {process_id}

    Message type:
    notice
    Process is active, only inactive processes can be removed from process list.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution: if the process has to be terminated, apply one of the following options.
    • Terminate the client's request related to the process.
      Note that if the client's process is killed with SIGKILL, GPUServer's processes become inactive.
    • Terminate the process in force mode with the SIGQUIT signal. Issue the command kill -3 {process_id}.
      This automatically terminates all clients' requests and the process itself.
    GBRS-PT-51G [{info_type_R}] Removed {number_of_entries} entries from process list for pid {process_id}

    Message type:
    info
    {number_of_entries} entries were removed successfully from process list.
    Source: GPUServer, Secondary Server
    Result: Continue processing.
    Solution: none.
    GBRS-SC-60D [{info_type_R}] Invalid log level specified

    Message type:
    notice
    While retrieving configuration parameters, the request contained an invalid log level. Refer to message GBRS-CH-50A for more details about the URI and its origin.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution: If the request comes from the GPUBox package, please contact support.
    GBRS-SC-70E [{info_type_R}] Invalid configuration parameter specified

    Message type:
    warning
    While retrieving configuration parameters, the request contained invalid input data. Refer to message GBRS-CH-50A for more details about the URI and its origin.
    Source: GPUServer, Secondary Server
    Result:
    • request is canceled with HTTP return code 400
    • GPUServer processing continues
    Solution: If the request comes from the GPUBox package, please contact support.
    GBRS-SC-50D [{info_type_R}] Parameter {parameter_name} changed to value {value}

    Message type:
    info
    The parameter {parameter_name} has been changed to value {value}.
    Source: GPUServer, Secondary Server
    Result: Parameter changed.
    Solution: none.
    GBST-LP-80A [{info_type_C}] Error while loading plugin: {plugin_name} ({error_reason})

    Message type:
    error
    Plugin {plugin_name} cannot be loaded due to {error_reason}.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Invalid plugin's name in configuration parameter 'gpuserver_plugins'.
    • Library's directory is not in LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '# ldconfig -p | grep {plugin_name}'
    • If the plugin is missing, verify if the library exists in 'installation_directory/lib64'; if it does not, reinstall GPUServer
    GBST-LP-80B [{info_type_C}] Invalid plugin {plugin_name}, ({error_reason})

    Message type:
    error
    Plugin's library was loaded but has an invalid format. Plugin {plugin_name} cannot be processed due to {error_reason}.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Plugin is corrupted
    • Invalid plugin version - verify if library {plugin_name} has the same version as GPUServer.
      • check GPUServer version: '$ gpuserver -v',
      • check library: '$ strings lib{plugin_name}.so | grep version:',
      Both should have the same version; if they do not, reinstall GPUServer.
    • Invalid plugin's name in configuration parameter 'gpuserver_plugins'.
    • Library's directory is not in LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '$ ldconfig -p | grep {plugin_name}'
    • If the plugin is missing, verify if the library exists in 'installation_directory/lib64'; if it does not, reinstall GPUServer
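    Where the strings utility is not available, the library check above ('strings ... | grep version:') can be approximated in Python. This is a rough sketch: the function name is hypothetical, it only looks for runs of printable ASCII at least four characters long, and the exact layout of the version string inside a plugin is not documented here.

```python
import re


def find_version_strings(data, min_length=4):
    """Return printable ASCII runs in `data` (bytes) that contain 'version:'."""
    # A "string" here is a run of at least min_length printable ASCII bytes.
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [
        run.decode("ascii")
        for run in re.findall(pattern, data)
        if b"version:" in run
    ]
```

    Typical use would be to read the plugin file as bytes (for example with pathlib.Path(...).read_bytes()), pass the result to this function, and compare what it finds with the output of 'gpuserver -v'.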
    GBST-LP-80C [{info_type_C}] Invalid plugin {plugin_name}, failed to call

    Message type:
    error
    Plugin's library was loaded but has an invalid format. Plugin {plugin_name} cannot be processed.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Corrupted plugin
    • Invalid plugin version - verify if library {plugin_name} has the same version as GPUServer.
      • check GPUServer version: '$ gpuserver -v',
      • check library: '$ strings lib{plugin_name}.so | grep version:',
      Both should have the same version; if they do not, reinstall GPUServer.
    • Invalid plugin's name in a configuration parameter 'gpuserver_plugins'.
    • Library's directory is not specified in an environment variable LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '$ ldconfig -p | grep {plugin_name}'
    • If plugin is missing, verify if the library exists in 'installation_directory/lib64', if it does not - reinstall GPUServer
    GBST-LP-80D [{info_type_C}] Invalid plugin {plugin_name}, ({error_reason})

    Message type:
    error
    Plugin's library was loaded but has an invalid format. Plugin {plugin_name} cannot be processed due to {error_reason}.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Plugin is corrupted
    • Invalid plugin version - verify if library {plugin_name} has the same version as GPUServer.
      • check GPUServer version: '$ gpuserver -v',
      • check library: '$ strings lib{plugin_name}.so | grep version:',
      Both should have the same version; if they do not, reinstall GPUServer.
    • Invalid plugin's name in a configuration parameter 'gpuserver_plugins'.
    • Library's directory is not specified in an environment variable LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '$ ldconfig -p | grep {plugin_name}'
    • If plugin is missing, verify if the library exists in 'installation_directory/lib64', if it does not - reinstall GPUServer
    GBST-LP-50A [{info_type_C}] Plugin {plugin_name} loaded

    Message type:
    info
    Plugin {plugin_name} successfully loaded.
    Source: GPUServer, Subserver
    Result:
    • GPUServer task continues processing
    • GPUServer process continues.
    Solution: none
    GBST-LP-81A [{info_type_C}] Invalid plugin {plugin_name}, failed to call

    Message type:
    error
    Plugin's library was loaded but has an invalid format. Plugin {plugin_name} cannot be processed.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Plugin is corrupted
    • Invalid plugin version - verify if library {plugin_name} has the same version as GPUServer.
      • check GPUServer version: '$ gpuserver -v',
      • check library: '$ strings lib{plugin_name}.so | grep version:',
      Both should have the same version; if they do not, reinstall GPUServer.
    • Invalid plugin's name in the configuration parameter 'gpuserver_plugins'.
    • Library's directory is not specified in an environment variable LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '$ ldconfig -p | grep {plugin_name}'
    • If plugin is missing, verify if the library exists in 'installation_directory/lib64', if it does not - reinstall GPUServer
    GBST-LP-81B [{info_type_C}] Plugin number error for {plugin_name}

    Message type:
    error
    Plugin's library was loaded but has an invalid format. Plugin {plugin_name} cannot be processed.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Plugin is corrupted
    • Invalid plugin version - verify if library {plugin_name} has the same version as GPUServer.
      • check GPUServer version: '$ gpuserver -v',
      • check library: '$ strings lib{plugin_name}.so | grep version:',
      Both should have the same version; if they do not, reinstall GPUServer.
    • Invalid plugin's name in a configuration parameter 'gpuserver_plugins'.
    • Library's directory is not specified in an environment variable LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '$ ldconfig -p | grep {plugin_name}'
    • If plugin is missing, verify if the library exists in 'installation_directory/lib64', if it does not - reinstall GPUServer
    GBST-LP-81C [{info_type_C}] Plugin number error, plugin: {plugin_name} {internal}

    Message type:
    error
    Plugin's library was loaded but has an invalid format. Plugin {plugin_name} cannot be processed.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: {error_reason} is related to a dynamic linking error.
    Possible causes of this error:
    • Plugin is corrupted
    • Invalid plugin version - verify if library {plugin_name} has the same version as GPUServer.
      • check GPUServer version: '$ gpuserver -v',
      • check library: '$ strings lib{plugin_name}.so | grep version:',
      Both should have the same version; if they do not, reinstall GPUServer.
    • Invalid plugin's name in a configuration parameter 'gpuserver_plugins'.
    • Library's directory is not specified in an environment variable LD_LIBRARY_PATH.
    • Library's directory is not specified in one of the files from '/etc/ld.so.conf.d/'. To verify if the plugin exists on the search path, issue the command '$ ldconfig -p | grep {plugin_name}'
    • If plugin is missing, verify if the library exists in 'installation_directory/lib64', if it does not - reinstall GPUServer
    GBST-LP-91A [{info_type_C}] No plugins found

    Message type:
    critical
    GPUServer cannot find any valid plugins.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: Specify correct plugin names in the 'gpuserver_plugins' parameter.
    GBST-ST-51A [{info_type_C}] Connection terminated, task ended

    Message type:
    info
    Connection from Client's thread has been terminated by Client.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: none
    GBST-ST-52A [{info_type_C}] Connection closed, task ended

    Message type:
    info
    Connection from Client's thread has been terminated by Client.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution: none
    GBST-ST-72A [{info_type_C}] Client called invalid plugin ({internal_var}:{internal_var})

    Message type:
    warning
    Client called invalid plugin.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution:
    GBST-ST-82A [{info_type_C}] Execution error ({internal_var}:{internal_var})

    Message type:
    error
    Client called invalid plugin.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is terminated
    • GPUServer task ended
    • GPUServer process continues.
    Solution:
    GBST-CH-68A [{info_type_C}] Client's connection terminated

    Message type:
    notice
    Heartbeat connection to Client has been terminated.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is being terminated
    • GPUServer process continues.
    Solution: Examine Client's messages if there are any and verify if Client's process was not terminated
    GBST-CH-68C [{info_type_C}] Client's connection terminated

    Message type:
    notice
    Heartbeat connection to Client has been terminated.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is being terminated
    • GPUServer process continues.
    Solution:
    • This message can be a consequence of normal client's termination.
    • If you suspect abnormal behavior please examine Client's messages if there are any, and verify if Client's process was not terminated.
    GBST-CH-68B [{info_type_C}] Client's connection terminated

    Message type:
    notice
    Heartbeat connection to Client has been terminated.
    Source: GPUServer, Subserver
    Result:
    • Client's connection is being terminated
    • GPUServer process continues.
    Solution: Examine Client's messages if there are any and verify if Client's process was not terminated
    GBST-SP-78A [{info_type_C}] Cannot start subserver, connection closed from {ip_address}

    Message type:
    warning
    Most likely all ports are busy or the system's security settings prevent the subserver from starting.
    The subserver is used in phase II of communication between client and GPUServer. For every new process started within the client, GPUServer starts a subserver.
    Source: GPUServer, Subserver
    Result:
    • current connection to GPUServer terminated
    • primary GPUServer continues processing
    Solution: Check if ports are available and can be used by GPUServer
    GBST-SP-78B [{info_type_C}] Cannot start subserver, client failed, connection terminated by {ip_address}

    Message type:
    warning
    Client's connection failed. Likely connection between client and GPUServer was terminated by client.
    Source: GPUServer, Subserver
    Result:
    • handshake of phase II failed
    • primary GPUServer continues processing
    Solution:
    • verify client's connection and examine messages if any
    • examine OServer's log
    • check if network connection between client and GPUServer is up and running
    GBST-SP-78C [{info_type_C}] Invalid handshake, connection terminated by {ip_address}

    Message type:
    warning
    Client's handshake with GPUServer at phase II failed due to a terminated connection.
    Source: GPUServer, Subserver
    Result:
    • connection from Client to GPUServer terminated
    • GPUServer continues processing.
    Solution:
    • verify Client's connection and examine messages if any
    • examine OServer's log
    • check if network between client and GPUServer is up and running
    GBST-SP-78D [{info_type_C}] Invalid handshake from Client {ip_address}

    Message type:
    warning
    Client failed with handshake at phase II.
    Source: GPUServer, Subserver
    Result:
    • connection to GPUServer terminated
    • GPUServer continues processing
    Solution:
    • verify client's messages if any
    • examine message in OServer's log
    • examine OServer's log
    GBST-SP-78E [{info_type_C}] Invalid handshake, connection terminated by client {ip_address}

    Message type:
    warning
    Client handshake failed.
    Source: GPUServer, Subserver
    Result:
    • phase II handshake between client and GPUServer failed
    • connection from client to GPUServer terminated
    • GPUServer continues processing
    Solution:
    • verify client's messages if any
    • examine message in OServer's log
    • examine OServer's log
    GBST-SP-78F [{info_type_C}] Invalid phase II handshake from client {ip_address}

    Message type:
    warning
    Client sent an invalid handshake during the phase II connection.
    Source: GPUServer, Subserver
    Result:
    • phase II handshake between client and GPUServer failed
    • connection from client to GPUServer terminated
    • GPUServer continues processing
    Solution:
    • verify client's messages if any
    • examine message in OServer's log
    • examine OServer's log
    GBST-SP-73A [{info_type_C}] Subprocess proceeds with inherited configuration parameters only

    Message type:
    warning
    During phase I of communication, the subprocess could not read the configuration parameters passed by the secondary server. If any parameters such as 'log_level' or 'gpuserver_infiniband_enabled' were changed, the changes will have no effect on the currently running subprocess.
    Source: GPUServer, Subserver
    Result:
    • client continues with parameters inherited from primary server only
    • GPUServer continues processing
    Solution: save GPUServer's log and contact support
    GBST-SP-51A [{info_type_C}] Upgraded to InfiniBand

    Message type:
    info
    Connection between GPUServer and Client has been switched to InfiniBand.
    GPUServer tries to upgrade to InfiniBand only when parameter 'gpuserver_infiniband_enabled' is set to 'yes'.
    Source: GPUServer, Subserver
    Result: further communication between GPUServer and client will continue over the InfiniBand connection
    Solution: none
    GBST-SP-71B [{info_type_C}] Upgrade to InfiniBand failed: {reason_message}

    Message type:
    warning
    {reason_message} gives the reason for the failure.
    Parameter 'gpuserver_infiniband_enabled' is set to 'yes' but the switch to InfiniBand communication failed. Both Client and GPUServer have to be ready to enable InfiniBand communication:
    • they must have correct path to library libInfiniBand-gpubox.so,
    • they must have valid and available InfiniBand devices,
    • if they have more than one available device:
      • GPUServer specifies correct device name in parameter 'gpuserver_infiniband_device'
      • Client specifies correct device name under environment variable 'GPUBOX_IBDEV'
    Source: GPUServer, Subserver
    Result: further communication between GPUServer and client will continue over TCP.
    Solution:
    • 'InfiniBand disabled in gpuserver config'
      InfiniBand is not enabled in configuration i.e. parameter 'gpuserver_infiniband_enabled' is not set to 'yes'. Change parameter to "yes" to enable InfiniBand communication.
    • 'Client couldn't load InfiniBand library' or 'Server couldn't load InfiniBand library' Client or GPUServer could not find the library libInfiniBand-gpubox.so. Verify if:
      • library is included on a path in 'LD_LIBRARY_PATH', issue command: '$ env | grep LD_LIB'
      • library is on the standard search path, i.e. defined in a configuration file in directory '/etc/ld.so.conf.d/', to verify issue command: '$ ldconfig -p | grep libInfiniBand-gpubox.so'
    • 'InfiniBand device test failed on client' or 'InfiniBand device test failed on server' Cannot find any valid InfiniBand device on client or GPUServer side
    • 'InfiniBand init failed on client' or 'InfiniBand init failed on server' Device was found but client or GPUServer cannot initialize InfiniBand communication, the detailed reason is displayed in the 'IBCC-IC-xxx' message
    Some messages are followed by more detailed 'IBCC-IC-xxx' messages, review them to find the real cause of IB communication failure.
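    The LD_LIBRARY_PATH check described above can also be scripted. The sketch below only inspects the directories listed in LD_LIBRARY_PATH (or a path string passed in explicitly); it does not cover directories configured under '/etc/ld.so.conf.d/', for which 'ldconfig -p' remains the right tool. The function name is hypothetical.

```python
import os


def find_in_library_path(library_name, search_path=None):
    """Return the directories from `search_path` that contain `library_name`.

    `search_path` uses the same separator-delimited format as LD_LIBRARY_PATH;
    when omitted, the current value of LD_LIBRARY_PATH is used.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        # Empty entries are skipped in this sketch.
        if directory and os.path.isfile(os.path.join(directory, library_name)):
            hits.append(directory)
    return hits
```

    Calling find_in_library_path("libInfiniBand-gpubox.so") in the same environment as the client or GPUServer shows which listed directories, if any, contain the library.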
    GBST-SP-51C [{info_type_C}] PhaseII:connection accepted from {client_ip_address}:{client_port}, {client_system}, {client_version}

    Message type:
    info
    Connection from Client has been accepted and successfully passed handshake verification.
    {client_system} - client's operating system.
    Source: GPUServer, Subserver
    Result: Client continues processing
    Solution: none
    GSRV-SH-76D [{info_type_I}] Received {signal_name} ({signal_number})

    Message type:
    warning
    Process {process_pid} of GPUServer received signal.
    Source: GPUServer
    Result: Depending on the signal type, GPUServer's process can:
    1. Continue processing
    2. Be terminated in 'normal mode' with signals SIGINT and SIGTERM, process returns code 0
    3. Be terminated in 'force mode' with signal SIGQUIT, process returns code 2
    4. Abend, process returns code 4, when it receives:
    • SIGFPE - floating point exception
    • SIGILL - illegal instruction
    • SIGSEGV - invalid memory reference
    • SIGBUS - bus error (bad memory access)
    • SIGABRT - abort signal
    • SIGTRAP - trace/breakpoint trap
    • SIGSYS - bad argument to routine
    Solution:
    • case 1: no action is required
    • case 2: process received termination signal in normal mode, no action is required
    • case 3: process received termination signal in force mode, no action is required
    • case 4: please contact support
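    For scripts that post-process the log, the cases above can be written as a small lookup. This is an illustrative sketch, not GPUServer code: the function name is hypothetical, return code 4 is assumed to belong to the abend case, and any signal not named in this entry is treated as case 1, which is a simplification.

```python
NORMAL_STOP = {"SIGINT", "SIGTERM"}
FORCE_STOP = {"SIGQUIT"}
ABEND = {"SIGFPE", "SIGILL", "SIGSEGV", "SIGBUS", "SIGABRT", "SIGTRAP", "SIGSYS"}


def expected_outcome(signal_name):
    """Map {signal_name} to (case number, mode, expected process return code)."""
    if signal_name in NORMAL_STOP:
        return 2, "normal", 0
    if signal_name in FORCE_STOP:
        return 3, "force", 2
    if signal_name in ABEND:
        # Assumption: return code 4 applies to the abend case.
        return 4, "abend", 4
    # Simplification: everything else is treated as "continue processing".
    return 1, "continue", None
```

    Only case 4 calls for contacting support; the other cases need no action.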
    GSRV-SH-77A [{info_type_I}] GPUServer forced to quit

    Message type:
    warning
    Process {process_pid} received SIGQUIT signal and will be forced to quit.
    Source: GPUServer
    Result:
    • all Clients' requests are terminated immediately
    • GPUServer's process is being terminated.
    Solution: process received termination signal in force mode, no action is required
    GSRV-SH-67B [{info_type_I}] GPUServer has still running processes: {number_of_processes}

    Message type:
    notice
    Process {process_pid} received SIGQUIT but still has {number_of_processes} running Clients' requests.
    Source: GPUServer
    Result:
    • displays message 'GSRV-SH-77C'
    • all Clients' requests are terminated immediately
    • GPUServer's process is being terminated.
    • displays message 'GSRV-SH-57C'
    Solution: none.
    GSRV-SH-77C [{info_type_I}] Stopping GPUserver...

    Message type:
    warning
    Process {process_pid} received SIGQUIT and began stopping procedure.
    Source: GPUServer
    Result:
    • all clients' requests will be terminated
    Solution: termination process is irreversible.
    GSRV-SH-67C [{info_type_I}] GPUServer stopped

    Message type:
    notice
    GPUServer is stopped by signal SIGQUIT.
    Source: GPUServer
    Result: All threads of GPUServer are being terminated.
    Solution: none.
    GSRV-SH-76B [{info_type_I}] Cannot shutdown GPUServer, processes are still running: {number_of_processes}

    Message type:
    warning
    Process {process_pid} received SIGINT or SIGTERM while {number_of_processes} Clients' requests were still running.
    Source: GPUServer
    Result:
    • GPUServer cannot be terminated.
    Solution:
    • close all Clients' requests and then retry with SIGINT or SIGTERM, or
    • terminate GPUServer with signal SIGQUIT by issuing one of the following commands:
      • # service gpuserver force-stop - when gpuserver is running as a service
      • $ pkill -3 gpuserver - for all processes of GPUServer
      • $ kill -3 {process_pid} - for a single process
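
    The escalation described above (retry with SIGTERM, then fall back to SIGQUIT) can be sketched in Python. This is only an illustration, not part of GPUServer: the function name stop_gpuserver, the grace period, and the polling approach are assumptions made for this example. The send and sleep parameters exist only so the logic can be exercised without signalling a real process.

```python
import os
import signal
import time


def stop_gpuserver(pid, grace_seconds=30.0, poll_interval=0.5,
                   send=os.kill, sleep=time.sleep):
    """Ask a process to stop with SIGTERM, then force it with SIGQUIT.

    Hypothetical helper. Returns "not-running" if the process did not
    exist, "graceful" if it exited within the grace period, and
    "forced" if SIGQUIT had to be sent.
    """
    def is_alive():
        try:
            send(pid, 0)  # signal 0 only checks that the process exists
        except ProcessLookupError:
            return False
        return True

    try:
        send(pid, signal.SIGTERM)  # normal stop request
    except ProcessLookupError:
        return "not-running"

    waited = 0.0
    while waited < grace_seconds:
        if not is_alive():
            return "graceful"
        sleep(poll_interval)
        waited += poll_interval

    if not is_alive():
        return "graceful"
    send(pid, signal.SIGQUIT)  # same signal as `kill -3 {process_pid}`
    return "forced"
```

    A call such as stop_gpuserver(process_pid, grace_seconds=60) would give running Clients' requests a minute to finish before the force mode is used. Note that force mode terminates all Clients' requests immediately.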
    GSRV-SH-66A [{info_type_I}] GPUServer is already stopping, please wait, {counter}

    Message type:
    notice
    GPUServer's process {process_pid} has already received SIGTERM or SIGINT signals {counter} times.
    Source: GPUServer
    Result: GPUServer is already in a stopping phase. All termination procedures have been started.
    Solution: if GPUServer cannot be terminated:
    • Verify whether clients still have running processes by issuing the command $ agpubox lp. Message 'GSRV-SH-76B' also reports running processes.
      • If processes are inactive, they can be terminated by issuing the command $ agpubox rprocess. For details refer to the Command Reference.
      • If processes are active, close the Clients' requests and then retry with SIGTERM or SIGINT, or terminate GPUServer with SIGQUIT in force mode.
    • If all Clients' requests are already closed and GPUServer still does not quit, there are some incompatibilities between processes. You can try to terminate the non-primary and non-secondary servers (that is, all subservers) with the SIGKILL signal, or simply kill all GPUServer processes with pkill -9 gpuserver.
    GSRV-SH-76C [{info_type_I}] Stopping GPUserver...

    Message type:
    warning
    Process {process_pid} received SIGTERM or SIGINT and began the stopping procedure.
    Source: GPUServer
    Result:
    • all Clients' requests should already be terminated
    Solution: none, the termination process is irreversible.
    GSRV-SH-66C [{info_type_I}] GPUServer stopped

    Message type:
    notice
    GPUServer is stopped by signal SIGTERM or SIGINT.
    Source: GPUServer
    Result: All threads of GPUServer are terminated.
    Solution: none.
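
    When scanning logs, it can help to tell the forced (SIGQUIT) shutdown messages apart from the normal (SIGINT or SIGTERM) ones. The following Python sketch builds a lookup table from the GSRV-SH entries listed in this section. The function name classify_shutdown_line and the sample log line in the test below are made up for illustration; the regular expression assumes the AAAA-BB-CCC message-id layout described in the Message format section.

```python
import re

# Shutdown messages documented in this section:
# message-id -> (level, triggering signal, short meaning)
SHUTDOWN_MESSAGES = {
    "GSRV-SH-77A": ("warning", "SIGQUIT", "forced to quit"),
    "GSRV-SH-67B": ("notice", "SIGQUIT", "processes still running"),
    "GSRV-SH-77C": ("warning", "SIGQUIT", "stopping"),
    "GSRV-SH-67C": ("notice", "SIGQUIT", "stopped"),
    "GSRV-SH-76B": ("warning", "SIGINT/SIGTERM", "cannot shut down"),
    "GSRV-SH-66A": ("notice", "SIGINT/SIGTERM", "already stopping"),
    "GSRV-SH-76C": ("warning", "SIGINT/SIGTERM", "stopping"),
    "GSRV-SH-66C": ("notice", "SIGINT/SIGTERM", "stopped"),
}

# Assumed message-id shape: AAAA-BB-CCC, uppercase letters or digits.
MESSAGE_ID = re.compile(r"\b[A-Z0-9]{4}-[A-Z0-9]{2}-[A-Z0-9]{3}\b")


def classify_shutdown_line(line):
    """Return (message_id, level, signal, meaning) or None.

    None means the line carries no shutdown message from the table.
    """
    for match in MESSAGE_ID.finditer(line):
        entry = SHUTDOWN_MESSAGES.get(match.group(0))
        if entry is not None:
            return (match.group(0),) + entry
    return None
```

    Lines whose message-id is not in the table are ignored, so the function can be applied to a whole log file and the results filtered for None.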
    GSRV-WP-60A [{info_type_I}] Child of current process terminated

    Message type:
    notice
    A child process of the current GPUServer process was terminated.
    Source: GPUServer
    Result: GPUServer waits for other child processes, if any; the current GPUServer process will probably be terminated as well
    Solution: none
    GSRV-WP-80A [{info_type_I}] Wait for child process failed, {child_pid}

    Message type:
    error
    A GPUServer process was waiting for all of its child processes and the wait failed.
    Source: GPUServer
    Result: the current GPUServer process will exit
    Solution: none
    GSRV-WP-61A [{info_type_I}] All children of current process terminated

    Message type:
    notice
    All child processes of the current GPUServer process were terminated.
    Source: GPUServer
    Result: the current GPUServer process will probably be terminated as well
    Solution: none
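
    The three GSRV-WP messages above correspond to the usual pattern of a parent process reaping its children. The Python sketch below shows one way such a loop could produce them. It is an assumption made for illustration, not GPUServer's actual implementation; the function name wait_for_children and the exact log strings are hypothetical, and wait_fn and log are parameters only so the loop can be tried without real child processes.

```python
import os


def wait_for_children(wait_fn=os.wait, log=print):
    """Reap child processes until none remain.

    Returns the list of reaped child PIDs. Illustrative only.
    """
    reaped = []
    while True:
        try:
            pid, _status = wait_fn()
        except ChildProcessError:
            # ECHILD: there are no children left to wait for.
            log("GSRV-WP-61A All children of current process terminated")
            return reaped
        except OSError as exc:
            # Any other failure of the wait call itself.
            log(f"GSRV-WP-80A Wait for child process failed: {exc}")
            raise
        reaped.append(pid)
        log("GSRV-WP-60A Child of current process terminated")
```

    With the default os.wait, the call blocks until each child exits, which matches the "wait for other child process if any" behaviour described for GSRV-WP-60A.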