Cloud Deployment Guide
Be sure to read the Deployment Overview first.
- Scenario 1: Deploy your own instance of VIAME Web to GCP Compute Engine.
- Scenario 2: Run VIAME pipelines on a GCP Compute Engine VM from the command line.
- Scenario 3: Run a private GPU worker in GCP to process jobs from any VIAME Web instance, including viame.kitware.com (standalone mode).
The Terraform section is the same for all scenarios. The Ansible section differs by scenario.
Before you begin
You'll need a GCP Virtual Machine (VM) with the features listed below. This section will guide you through creating one and deploying VIAME using Terraform and Ansible.
- Terraform automates the process of creating and destroying cloud resources such as VMs.
- Ansible automates configuration, such as software installation, on newly created machines.
Together, these tools allow you to quickly create a reproducible environment. If you do not want to use them, you can create your own VM manually through the management console and skip to the Docker documentation instead.
|Operating system||Ubuntu 20.04|
|Disk Type||SSD, 128GB or more depending on your needs|
To run the provisioning tools below, you need the following installed on your own workstation.
Google Cloud worker provisioning can only be done from an Ubuntu Linux 18.04+ host. Ansible and Terraform should work on Windows Subsystem for Linux (WSL) if you only have a Windows host. You could also use a cheap CPU-only cloud instance to run these tools.
- Install Google Cloud SDK
- Install Terraform
- Install Ansible
- Find your Google Cloud project ID. It looks like
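Before moving on, it can help to confirm that the three tools in the list above are actually on your PATH. The following is a minimal sketch, not part of the original guide; `missing_tools` is a hypothetical helper name, and `gcloud`, `terraform`, and `ansible-playbook` are the command-line entry points these tools normally install.

```shell
#!/usr/bin/env bash
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Check the CLI entry points of the Google Cloud SDK, Terraform, and Ansible.
missing="$(missing_tools gcloud terraform ansible-playbook)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "All provisioning tools found."
fi
```

Once the Google Cloud SDK is installed and configured, `gcloud config get-value project` prints the ID of the currently active project.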
Google Cloud imposes GPU quotas. You may need to request a quota increase. Anecdotally, requests to increase the quota by 1 unit are approved automatically, but larger requests are rejected.
Creating a VM with Terraform
GPU-accelerated VMs are significantly more expensive than typical VMs. Make sure you are familiar with the cost of the machine and GPU you choose. See main.tf for default values.
See devops/main.tf for a complete list of variables. The default gpu_count can be overridden.
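A typical Terraform run for this step might look like the sketch below. It is an illustration under stated assumptions, not the guide's original commands: it assumes you run it from the devops/ directory that holds main.tf, that Terraform picks up credentials from `gcloud auth application-default login`, and that the project ID is passed through a variable called `project_name`, which is a hypothetical name (check devops/main.tf for the real one). Only `gpu_count` is taken from the text above. By default the script prints the commands instead of running them; set `DRY_RUN=0` to execute.

```shell
#!/usr/bin/env bash
set -e

# Print each command instead of running it unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Placeholder values: replace with your own project ID and GPU count.
PROJECT_ID="${PROJECT_ID:-my-project-id}"
GPU_COUNT="${GPU_COUNT:-1}"

create_stack() {
  # Assumption: Terraform authenticates with application-default credentials.
  run gcloud auth application-default login
  # Download providers and prepare the working directory.
  run terraform init
  # Preview the resources, overriding the default gpu_count.
  # "project_name" is an assumed variable name; see devops/main.tf.
  run terraform plan -var "project_name=${PROJECT_ID}" -var "gpu_count=${GPU_COUNT}"
  # Create the resources; Terraform asks for confirmation first.
  run terraform apply -var "project_name=${PROJECT_ID}" -var "gpu_count=${GPU_COUNT}"
}

create_stack
```

Reviewing the `terraform plan` output before applying is a cheap way to confirm the machine and GPU choice, given how much GPU-accelerated VMs cost.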
Destroy the stack
Later, when you are done with the server and have backed up your data, use Terraform to destroy your resources.
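As a rough sketch of that teardown, assuming you run it from the same directory that holds the Terraform state, you could preview the deletion and then destroy. The `destroy_stack` function is a hypothetical wrapper. Any arguments you give it, such as the `-var` values used when creating the stack, are forwarded to Terraform. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -e

# Print each command instead of running it unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

destroy_stack() {
  # Preview exactly which resources would be deleted.
  run terraform plan -destroy "$@"
  # Delete them; Terraform prompts for confirmation before acting.
  run terraform destroy "$@"
}

destroy_stack "$@"
```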
Configure with Ansible
This step prepares the new host to run a VIAME worker by installing NVIDIA drivers and Docker, then downloading VIAME and all optional add-ons.
The playbook may take 30 minutes or more to run because it must install NVIDIA drivers and download several GB of software packages.
Ansible Extra Vars
These are all the variables that can be provided as Ansible extra vars.
|viame_bundle_url||latest bundle url||Optional for scenario 2 & 3. Change to install a different version of VIAME. This should be a link to the latest Ubuntu Desktop (18/20) binaries from viame.kitware.com (Mirror 1)|
|DIVE_USERNAME||null||Required for scenario 3. Username to start private queue processor|
|DIVE_PASSWORD||null||Required for scenario 3. Password for private queue processor|
||Optional for scenario 3. Max concurrent jobs. Change this to 1 if you run training.|
||Optional for scenario 3. Remote URL to authenticate against.|
||Optional for scenario 3. kwiver log level|
The examples below assume the inventory file was created by Terraform above.
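For scenario 3, a playbook run that passes the two required variables could look like the sketch below. It is an assumption-laden illustration rather than the guide's original command. `playbook.yml` is a hypothetical file name and `provision_worker` is a hypothetical helper; only `DIVE_USERNAME`, `DIVE_PASSWORD`, and the `inventory` file come from the text above. `-i` and `--extra-vars` are standard `ansible-playbook` options. The command is only printed, with the password masked, unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Print the command instead of running it unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
# "inventory" is the file created by Terraform; "playbook.yml" is an assumed name.
INVENTORY="${INVENTORY:-inventory}"
PLAYBOOK="${PLAYBOOK:-playbook.yml}"

provision_worker() {
  if [ "$DRY_RUN" = "1" ]; then
    # Mask the password so a dry run never prints it.
    echo "+ ansible-playbook -i $INVENTORY $PLAYBOOK --extra-vars 'DIVE_USERNAME=$DIVE_USERNAME DIVE_PASSWORD=***'"
  else
    ansible-playbook -i "$INVENTORY" "$PLAYBOOK" \
      --extra-vars "DIVE_USERNAME=$DIVE_USERNAME DIVE_PASSWORD=$DIVE_PASSWORD"
  fi
}

# Only attempt provisioning when both required variables are set.
if [ -n "${DIVE_USERNAME:-}" ] && [ -n "${DIVE_PASSWORD:-}" ]; then
  provision_worker
else
  echo "Set DIVE_USERNAME and DIVE_PASSWORD before provisioning a scenario 3 worker."
fi
```

Values containing spaces do not survive the `key=value` form of `--extra-vars`. In that case, pass the variables as JSON or from a file instead.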
Once provisioning is complete, jobs should begin processing from the job queue. You can check viame.kitware.com/#/jobs to see queue progress and logs.
This Ansible playbook can be run from any Ubuntu 18.04+ host against any Ubuntu 18.04+ target. To run it locally, use the inventory.local file instead. If you already have NVIDIA drivers or Docker installed, you can comment out the corresponding lines in the playbook.
If you run locally you'll need to restart the machine and run the playbook a second time. The playbook will do this automatically for remote provisioning, but cannot restart if you're provisioning localhost.
You may need to run through the Docker post-install guide if you get permission errors when trying to run
Check that it worked
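One possible smoke test, run on the provisioned machine, is sketched below. The two checks are assumptions about what is worth verifying after this playbook (that the NVIDIA driver responds and that the Docker daemon is reachable), not the guide's original checks. `check` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Run a command quietly and report whether it succeeded.
check() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK   $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

failures=0
# nvidia-smi exits non-zero when the driver is missing or not loaded.
check "NVIDIA driver responds (nvidia-smi)" nvidia-smi || failures=$((failures + 1))
# docker ps fails when the daemon is down or the user lacks permission.
check "Docker daemon reachable (docker ps)" docker ps || failures=$((failures + 1))
echo "$failures check(s) failed"
```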
You can enable your private queue on the jobs page and begin running jobs.
- Scenario 1: Proceed to Docker Compose Deployment.
- Scenario 2: Setup is complete. Proceed to the VIAME Documentation.
- Scenario 3: Setup is complete. Make sure your private queue is enabled.
- Ansible provisioning is idempotent. If it fails, run it again once or twice.
- You may need to change the global GPUS_ALL_REGIONS quota in IAM -> Quotas.
- Nvidia drivers may not install correctly the first time. Try installing manually using