Introduction

What is GPUBox?

GPUBox is a virtualization technology and a cloud-ready GPU-computing platform. It is the foundation of a fully functioning, scalable, and elastic cloud infrastructure that meets the challenges of today's 3D graphics industry by taking a new approach to how GPUs are used.

The GPUBox software simplifies GPU management by separating applications and operating systems from the underlying GPU devices. It allows many users to dynamically share GPU devices from the same pool. A GPU-cloud-enabled infrastructure of any size and purpose can be set up with GPUBox, whether by a small 3D graphics studio or by a large provider. The GPUBox infrastructure can also be extremely efficient thanks to recent network technology with high throughput and low latency.

GPUBox enables on-demand provisioning of GPU devices to a physical or virtual machine with a Linux or Windows operating system. The pool of GPU devices is shared between users on demand, which reduces total power consumption and the amount of idle hardware.

GPUBox is made to work with applications operating within the CUDA environment, such as GPU-based renderers. Virtual GPU devices are provided seamlessly on demand to a virtual or physical system. Existing applications do not need to be modified. While running in the GPUBox infrastructure, software such as Blender feels and behaves as if it were operating in a native CUDA-enabled system.

Component Function
OServer OServer = Operational + Server.
It is an operational server responsible for coordinating operations between GPUServers and Clients and maintaining the GPUBox infrastructure. OServer monitors and collects accounting information about users and GPU-related activities. It plays a fundamental role in managing users and related security flow between particular elements of the infrastructure.
GPUServer GPUServer = GPU + Server.
It is responsible for managing the GPUs installed in a particular computer. It also delivers the virtualized GPUs to the GPUBox infrastructure.
Client Enables regular users and administrators to connect to the GPUBox infrastructure and use or manage the virtualized GPUs.
There are two basic GPUBox commands. The gpubox command lets users and administrators use the resources: allocating and dropping GPUs and generating various listings. The agpubox command is for administrators only; its main purpose is managing the infrastructure through a variety of subcommands.
GPUBox Web Console A browser-based user interface for administrators and users to control and monitor resources of the GPUBox infrastructure. GPUBox Web Console has two modes: an administrator mode with full access to every feature and a user mode with limited functionality.

OServer coordinates the operations between the other two components: GPUServer and Client. Users can dynamically allocate and drop virtual GPUs from a centrally managed and monitored pool of shared devices, created from the set of GPUServers.
When it comes to the specific GPU-related processes (such as rendering), Client and GPUServer communicate directly with each other.

A Client can simultaneously connect to many GPUServers and a GPUServer can serve GPUs to many Clients at the same time. A simple example of such an interconnection is shown below:
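
The many-to-many relationship described here can also be pictured as a set of Client-to-GPUServer links. The sketch below is only an illustration: the client names, the data layout, and both function names are assumptions made for this example and are not part of the GPUBox software.

```python
# Illustrative model only: names and layout are assumptions, not GPUBox APIs.
from collections import defaultdict

# Each pair records one direct Client-to-GPUServer connection.
connections = [
    ("client-a", "GPU1"),
    ("client-a", "GPU2"),
    ("client-b", "GPU1"),
]


def servers_by_client(pairs):
    """Map every Client to the set of GPUServers it is connected to."""
    result = defaultdict(set)
    for client, server in pairs:
        result[client].add(server)
    return dict(result)


def clients_by_server(pairs):
    """Map every GPUServer to the set of Clients it is serving."""
    result = defaultdict(set)
    for client, server in pairs:
        result[server].add(client)
    return dict(result)


print(servers_by_client(connections))
print(clients_by_server(connections))
```

Here client-a is connected to two GPUServers at once, while GPU1 serves two Clients at the same time.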

GPU virtualization not only makes devices available on demand and delivers them to virtual systems, but also enables the allocation of dozens of GPUs at once, which drastically speeds up computations and rendering. Reducing the time needed to render a single frame has a great impact on the length of the production pipeline.

The key advantages of GPUBox:

Virtualization of the GPU devices GPUBox separates applications and operating systems from underlying GPU devices.
Provides the GPU devices on demand The user can allocate and drop the GPUs at any time.
Lowers total costs GPUBox improves the utilization by sharing the pool of the GPU devices between users. The GPU cards are no longer dedicated to a physical computer. The total number of GPU devices required within a team for efficient work can be significantly lowered.
Reduces power consumption Sharing GPUs and reducing the total number of devices allows the reduction of power consumption.
Exceeds physical limitations The number of available PCIe slots on a motherboard is no longer a limitation in using multiple GPU devices.

Document structure

The main purpose of this documentation is to provide core information on how to handle the GPUBox software and its features. This documentation will help you to properly install, configure, and manage the components of GPUBox. Chapters such as Command Reference and Messages provide you with the most detailed information, which you may find extremely useful in some circumstances.

GPUBox Starter for Windows A guide on how to install and manage GPUBox and all its components in the Windows operating system.
Quick start Brief instructions for the impatient about how to quickly install and configure the GPUBox software.
Installation A guide to every way of installing each component of the GPUBox software, i.e. OServer, GPUServer, and Client. It covers both dialog-based installation and silent installation from the command line. The Client installation section guides you through the entire installation process on Windows and Linux operating systems.
OServer This chapter covers every aspect of managing OServer and the GPUBox infrastructure. It shows how to configure, start, and stop OServer, and describes the accompanying messages. It also covers monitoring and accounting for the pool of GPUs.
GPUServer Configuring and managing GPUServer — a component responsible for serving the GPUs in the infrastructure. This section also provides the most crucial aspects of daily operations, such as managing the GPU devices, recovering the GPUs, or changing the system parameters.
Using GPUBox Basic operations (such as allocating and dropping) of the GPUs that can be performed by a regular user of the GPUBox infrastructure.
Managing GPUBox Handling of the GPUs served in the infrastructure by a GPUBox user with administrator credentials.
Security A chapter regarding the management of users and protection of the GPUBox infrastructure from unauthorized access.
gpubox command reference Detailed descriptions of each of the subcommands for the gpubox command.
agpubox command reference Detailed descriptions of each of the subcommands for the agpubox command.
OServer messages Descriptions of messages that appear on the OServer's log, provided together with information about their meaning, context, and (if necessary) a solution for the issue reported by a message.
GPUServer messages Descriptions of messages that may appear in the GPUServer's log. As with the OServer messages, they are provided along with useful annotations.

We made every effort to ensure that this document is an accurate representation of the functionality of the GPUBox software. Nevertheless, we would appreciate any comments or suggestions regarding the clarity and completeness of this publication and/or any GPUBox software component. Feel free to share your ideas and remarks with us by sending us an email.

Document conventions

To help readers fully understand the content of this document, we use special typographical conventions to convey certain information. For the sake of clarity and concision, we advise you to read the conventions before you proceed with the rest of this book. The section below explains each of the conventions used in the document.

gpubox Output quotations, paths, filenames, commands, subcommands, and parameters.
$ gpubox free Commands starting with $ are issued by a user with non-superuser privileges.
# service gpuserver start Commands starting with # must be issued by a system administrator.
$PATH Linux environment variables.
%WINDIR%\system32 Windows environment variables.
{variable} Variables in messages.

For InfiniBand support

Annotation labels.

Default

Default values marker.

RedHat/CentOS

Commands or scripts related to a specific operating system.

Required

This parameter or element is required.

Replaceable tags
<OSERVER_INSTALLATION_DIR> Path to the OServer installation directory.
<OSERVER_IP> OServer's IP address.
<GPUSERVER_INSTALLATION_DIR> Path to the GPUServer installation directory.
<CLIENT_INSTALLATION_DIR> Path to the Client installation directory.
<LICENSE_UID> Unique license UID provided after the purchase.
<LICENSE_KEY> License key provided after the purchase.
<NUMBER_OF_GPUs> Number of licenses used by a particular OServer.
<PORT> Number of a TCP port.
<PATH_TO_DIRECTORY> Path to a directory.
<PATH_TO_FILE> Path to file.

Paragraphs with a blue background and icon at the beginning have information that is worth noticing.

Paragraphs with a red background and icon contain important information that should be paid attention to.

Paragraphs with a green background and icon contain less important information and ideas that sometimes might be useful.

Version 0.1.234 A change in the content of the document has been introduced. Paragraphs marked with a green dashed line are valid for GPUBox since version 0.1.234.

Version 0.1.234 A change in the content of the document has been introduced. Paragraphs marked with a red dashed line are valid for GPUBox up to version 0.1.234.

In the examples throughout the entire document we preserve special conventions for the names of users, names of computers, and IP addresses.

userid
  • bob: Bob is a superuser in the GPUBox infrastructure and has administration privileges in the operating system.
  • mary and john: Mary and John are regular users in the GPUBox infrastructure and the operating system.
  • gpubox: On Linux operating systems, this account is dedicated to starting the oserver and gpuserver services. In the GPUBox infrastructure, the gpubox user is a superuser.
  • infuser: The infrastructure user, used for communication between OServer and GPUServers.

IP addresses
  • 203.0.113.1 .. 203.0.113.9 or 198.51.100.1 .. 198.51.100.9 are OServers
  • 203.0.113.10 .. 203.0.113.20 or 198.51.100.10 .. 198.51.100.20 are GPUServers
  • 203.0.113.50 and above, or 198.51.100.50 and above, are Clients

Hostnames
  • GPU1, GPU2, GPU3 . . . are GPUServers
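
The address convention above can be written down as a small lookup, which may help when reading the example listings. This is only a sketch of the documentation convention: the function name `role_of` is hypothetical and nothing here is part of GPUBox.

```python
import ipaddress

# Networks used for example addresses in this manual.
EXAMPLE_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def role_of(address):
    """Return the role an example address plays under this manual's convention.

    Returns None for addresses that the convention does not cover.
    """
    ip = ipaddress.ip_address(address)
    if not any(ip in net for net in EXAMPLE_NETWORKS):
        return None
    host = int(ip) & 0xFF  # last octet of the IPv4 address
    if 1 <= host <= 9:
        return "OServer"
    if 10 <= host <= 20:
        return "GPUServer"
    if host >= 50:
        return "Client"
    return None


print(role_of("203.0.113.1"))
print(role_of("198.51.100.15"))
print(role_of("203.0.113.50"))
```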

To enhance the readability of this documentation, some less important columns, rows, and values in example listings that are too wide to fit this document are replaced with .... As an example, the table below:

    $ agpubox list
    +----+----+---------+----------------------+---------------+-------------+-------------+------------+-----------+--------------------+
    |GID |LID |UserID   |User name             |GPU name       |PCI          |Client's IP  |GPUServer   |Status     |Since               |
    +----+----+---------+----------------------+---------------+-------------+-------------+------------+-----------+--------------------+
    |1   |1   |gpubox   |GPUBox administrator  |GeForce GTX 690|710C:03:00.0 |203.0.113.12 |GPU6        |SHARED     |2013-05-28 12:51:16 |
    |2   |2   |gpubox   |GPUBox administrator  |GeForce GTX 690|710C:04:00.0 |203.0.113.12 |GPU6        |SHARED     |2013-05-28 12:51:16 |
    |3   |3   |gpubox   |GPUBox administrator  |GeForce GTX 690|710C:07:00.0 |203.0.113.12 |GPU6        |SHARED     |2013-05-28 12:51:16 |
    |4   |4   |gpubox   |GPUBox administrator  |GeForce GTX 690|710C:08:00.0 |203.0.113.12 |GPU6        |SHARED     |2013-05-28 12:51:16 |
    +----+----+---------+----------------------+---------------+-------------+-------------+------------+-----------+--------------------+
    depending on the context, can be shown as:
    $ agpubox list
    +----+----+-------------+---+---------------+---+---------+---+
    |GID |LID |UserID       |...|GPU name       |...|Status   |...|
    +----+----+-------------+---+---------------+---+---------+---+
    |1   |1   |gpubox       |...|GeForce GTX 690|...|SHARED   |...|
    |2   |2   |gpubox       |...|GeForce GTX 690|...|SHARED   |...|
    |3   |3   |gpubox       |...|GeForce GTX 690|...|SHARED   |...|
    |4   |4   |gpubox       |...|GeForce GTX 690|...|SHARED   |...|
    +----+----+-------------+---+---------------+---+---------+---+
    or
    $ agpubox list
    +----+----+-------+------------+----------+--------+-------------+----------+-------+-------+
    |GID |LID |UserID |User name   |GPU name  |PCI     |Client's IP  |GPUServer |Status |Since  |
    +----+----+-------+------------+----------+--------+-------------+----------+-------+-------+
    |1   |1   |gpubox |GPUBox...   |GeForce...|710C... |203...       |GPU6      |SHARED |2013...|
    |2   |2   |gpubox |GPUBox...   |GeForce...|710C... |203...       |GPU6      |SHARED |2013...|
    |3   |3   |gpubox |GPUBox...   |GeForce...|710C... |203...       |GPU6      |SHARED |2013...|
    |4   |4   |gpubox |GPUBox...   |GeForce...|710C... |203...       |GPU6      |SHARED |2013...|
    +----+----+-------+------------+----------+--------+-------------+----------+-------+-------+
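
Listings in this layout are plain text, so a script can turn them into structured records. The sketch below parses such a table into a list of dictionaries. It assumes only the visible format (rows delimited by |, borders starting with +) and unique column names; the function name `parse_listing` is hypothetical and not part of GPUBox.

```python
def parse_listing(text):
    """Parse a '|'-delimited ASCII table into a list of row dictionaries."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip the command line, the '+---+' borders, and any other non-row text.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows  # the first table row holds the column names
    return [dict(zip(header, row)) for row in body]


sample = """\
$ agpubox list
+----+----+-------+------------+----------+--------+-------------+----------+-------+-------+
|GID |LID |UserID |User name   |GPU name  |PCI     |Client's IP  |GPUServer |Status |Since  |
+----+----+-------+------------+----------+--------+-------------+----------+-------+-------+
|1   |1   |gpubox |GPUBox...   |GeForce...|710C... |203...       |GPU6      |SHARED |2013...|
|2   |2   |gpubox |GPUBox...   |GeForce...|710C... |203...       |GPU6      |SHARED |2013...|
+----+----+-------+------------+----------+--------+-------------+----------+-------+-------+
"""

for row in parse_listing(sample):
    print(row["GID"], row["GPUServer"], row["Status"])
```

Note that abbreviated listings with several ... columns would produce duplicate keys, so a sketch like this fits best with full-width output.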
    

    How to read syntax diagrams

    Apply the following rules when reading syntax diagrams used in the Command Reference for GPUBox:

    • Command keywords and variables are given in rounded blocks.
      keyword
    • Subcommand blocks that are elaborated on another diagram are given in rectangles.
      subcommand block
    • If an item appears without a block, it is a place for a parameter's custom value.
      parameter_value
    • Elaborated subcommand blocks are described on the upper border.
      subcommand block
    • Required items appear on the horizontal line (main path).
      REQUIRED ITEM
    • Optional items appear below the main path. They are not necessary for a command to be executed.
      OPTIONAL ITEM
    • If you can choose from two or more items and making a choice is required, they appear vertically on the stack and the first alternative appears on the main path.
      REQUIRED CHOICE #1 REQUIRED CHOICE #2 REQUIRED CHOICE #3
    • If you can choose from two or more items and making a choice is optional, they appear vertically on the stack and the first alternative appears below the main path.
      OPTIONAL CHOICE #1 OPTIONAL CHOICE #2 OPTIONAL CHOICE #3
    • Default values that will be applied if a choice is omitted are displayed above the main path.
      DEFAULT VALUE
    • A path returning to the left side of an item, a choice, or a sequence of items means that it can be repeated in the command. If a character is specified on the returning path (e.g. ,), it must be used to separate the repeated items. If no character is specified on the returning path, the items are separated by a space.
      ITEM ,

    How to read command syntax

    Apply the following rules when reading command syntax such as:
    agpubox user|u  --add|-a --userid|-u=<user_id> --username|-n=<username> [--password|-p=<password>]
    • Expressions separated with | can be used interchangeably. For example, agpubox user|u --add|-a means that $ agpubox user -a has the same result as $ agpubox u --add.
    • Expressions in the < > tags are required values and parameters.
    • Expressions in the [ ] tags are optional for a particular command.
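
To make the | rule concrete, the sketch below expands a syntax line into every equivalent spelling. It is a reading aid only: the function `expand_alternatives` is hypothetical, it handles just the | notation between whitespace-separated tokens, and it does not interpret the < > or [ ] tags.

```python
from itertools import product


def expand_alternatives(syntax):
    """Return every command line obtained by picking one side of each '|'."""
    choices = []
    for token in syntax.split():
        # Keep a '=<value>' suffix attached to whichever alternative is chosen.
        name, sep, value = token.partition("=")
        choices.append([alt + sep + value for alt in name.split("|")])
    return [" ".join(combo) for combo in product(*choices)]


for line in expand_alternatives("agpubox user|u --add|-a"):
    print(line)
```

For the fragment agpubox user|u --add|-a this yields four spellings, all of which describe the same operation.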

    Terminology

    Administrator/superuser The GPUBox user with administrator credentials, authorized to use the agpubox subcommands and the GPUBox Web Console in superuser mode. An administrator/superuser is always assigned to group 0.
    Allocation mode A user can allocate a given number of GPUs in one of two modes:
      • loose mode - if fewer GPUs are free than requested, those that are free are still allocated.
      • strict mode - only the exact number of GPUs requested is allocated. If fewer devices are free than requested, no GPU is allocated.
    Client Uses the resources of the GPUBox infrastructure. The GPUBox Client software is installed on a Linux- or Windows-based system. Client allows a user to connect and sign in to the GPUBox infrastructure and allocate GPU devices from the available pool. All occurrences of 'Client' with an initial capital letter in this manual refer to the GPUBox Client software.
    Connection ID A seven-digit number displayed within some of the messages in the GPUServer and OServer logs. The connection ID helps group messages from the same origin: messages with the same connection ID come from the same request.
    GPUBox infrastructure The OServer, Clients, and GPUServers working within the same network, understood as both software and hardware (especially GPU devices).
    GPUServer The GPUBox GPUServer software installed on a Linux-based system containing one or more GPU devices that are served to the GPUBox infrastructure.
    Infrastructure token A randomly generated string of 32 printable ASCII characters used to authenticate users and superusers.
    Infrastructure user A superuser whose token is used as the Infrastructure token.
    OServer The GPUBox OServer software installed on a Linux-based system. It is the central coordinating component of the GPUBox infrastructure. Its key functions are:
      • managing the GPUBox infrastructure
      • coordinating the allocations of the GPUs across the GPUBox infrastructure
      • managing the security (users, access, credentials, etc.)
      • accounting for and logging a user's activities
    Primary Server The first process started by the GPUServer. The main process of the GPUServer.
    Secondary Server The second process started by the GPUServer. It is responsible for communicating with the GPUBox infrastructure.
    Subserver A subprocess started by the Primary Server to handle requests from clients.
    Usage mode The GPU can be used in one of two modes:
      • shared mode - the GPU device is shared between users, which means that they can use it simultaneously.
      • exclusive mode - the GPU device is allocated exclusively to a single user.
    User A regular GPUBox user who is authorized to use the gpubox command.
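
The two allocation modes defined in this glossary differ only in what happens when fewer GPUs are free than requested. The following is a minimal sketch of that rule; the function name and its arguments are hypothetical, and it says nothing about how GPUBox implements allocation internally.

```python
def gpus_to_allocate(requested, free, mode):
    """Return how many GPUs a request would receive under the given mode."""
    if requested < 0 or free < 0:
        raise ValueError("counts must be non-negative")
    if mode == "loose":
        # Loose mode: take whatever is free, up to the requested number.
        return min(requested, free)
    if mode == "strict":
        # Strict mode: all requested GPUs or none at all.
        return requested if free >= requested else 0
    raise ValueError(f"unknown mode: {mode!r}")


print(gpus_to_allocate(4, 3, "loose"))   # 3
print(gpus_to_allocate(4, 3, "strict"))  # 0
```

With four GPUs requested and three free, loose mode grants three while strict mode grants none; when enough GPUs are free, both modes grant the full request.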