Cloud Storage Integration

This page is intended for storage administrators who would like to make their existing data available through VIAME Web.

Tip

This guide assumes you are working with viame.kitware.com. If you are using a different deployment, substitute your deployment's URL and endpoints where appropriate.

Tip

Regarding data transfer costs: if you keep both your data storage and your job runners in Google Cloud (or AWS), you avoid paying data egress fees for transfers between storage and the processing node.

Google Cloud Storage Mirroring

DIVE Web can mirror your data from Google Cloud Storage buckets so that your team retains full control of upload and data organization while still being able to view, annotate, and run analyses in VIAME Web.

Creating access credentials

  1. Create a new service account with read-only access to the bucket(s) and prefixes you want to map (see the command sketch after this list).
  2. In your project's storage settings, on the Interoperability tab:
    1. Create an access key (service account HMAC) for the read-only service account.
    2. Set the current project as the default project for interoperability access.
    3. Take note of your Access Key, Secret Key, Storage URI, and Bucket Name.
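
If you prefer the command line, the same setup can be scripted with the gcloud and gsutil CLIs. This is a minimal sketch: the service account name dive-mirror-reader is illustrative, and PROJECT_ID and BUCKET_NAME are placeholders for your own values.

# Create a read-only service account (name is illustrative)
gcloud iam service-accounts create dive-mirror-reader \
    --display-name="DIVE read-only mirror"

# Grant read-only object access on the bucket
gsutil iam ch \
    serviceAccount:dive-mirror-reader@PROJECT_ID.iam.gserviceaccount.com:roles/storage.objectViewer \
    gs://BUCKET_NAME

# Create an HMAC key pair (your Access Key / Secret Key) for the account
gsutil hmac create dive-mirror-reader@PROJECT_ID.iam.gserviceaccount.com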

Setting up CORS

You'll also need to configure CORS headers for any buckets where media will be served.

  • Save the snippet below as bucket-cors-config.json.
  • "origin" should be whatever you type into your browser to get to the web application.
  [
    {
      "origin": ["https://viame.kitware.com"],
      "method": ["GET", "PUT", "POST", "DELETE"],
      "responseHeader": ["Content-Type"],
      "maxAgeSeconds": 3600
    }
  ]

Then use gsutil to configure each bucket.

gsutil cors set bucket-cors-config.json gs://BUCKET_NAME
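
To verify the configuration, read it back; this should echo the JSON document you just applied.

gsutil cors get gs://BUCKET_NAME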

Choose a mount point

Choose a folder as a mount point inside DIVE Web. This folder should ideally be dedicated to mapping from your GCS buckets.

We recommend creating a folder named Google Cloud Storage, with sub-folders named for each bucket you mount. You can do this using the New Folder button in DIVE Web's File Browser. You can get the folder ID from your browser's URL bar.
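
As a hypothetical example of where the ID appears: with a mount-point folder open, the address bar shows something like the following, where the trailing hexadecimal string is the folder ID (the exact route may differ by deployment and version).

https://viame.kitware.com/#/folder/62b0f1a9c3d4e5f6a7b8c9d0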

Send us the details

If you want to use your bucket with viame.kitware.com, email the following details to viame-web@kitware.com.

subject: Add a google cloud storage bucket mount

Bucket name:
Service provider: Google Cloud
Access Key: 
Secret Key:
Mount point folder:
Prefix (if applicable):

S3 and MinIO Mirroring

If you have data in S3 or MinIO, you can mirror it in DIVE for annotation.

  • Data is expected to be videos or images organized into folders (see the example layout after this list).
  • Do not change a folder's contents once it has been mirrored into DIVE. Adding or removing images in a mirrored folder may cause annotation alignment issues.
  • Adding entirely new folders is supported, but requires a re-index of your S3 bucket.
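
As an illustrative example (bucket, folder, and file names are hypothetical), a layout like the following would mirror as two datasets: one image sequence and one video.

s3://my-survey-bucket/
    flights/2021-06-flight1/      # becomes an image-sequence dataset
        frame_0001.jpg
        frame_0002.jpg
    flights/2021-06-flight2/      # becomes a video dataset
        recording.mp4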

Pub/Sub notifications

Creating Pub/Sub notifications is optional, but doing so keeps your mount point up to date automatically as new data is added to the bucket. To use this feature, your DIVE server must have a public static IP address or domain name.

  1. Create a bucket notification configuration.
  2. Create a topic subscription.
  3. Set a push delivery method for the subscription.
    1. The URL for delivery should be https://viame.kitware.com/api/v1/bucket_notifications/gcs

Our server will process events from this subscription to keep your data current.
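
These steps can be done from the command line as well. A minimal sketch with gsutil and gcloud, where the topic and subscription names (dive-bucket-events, dive-bucket-events-push) are illustrative:

# Publish this bucket's change events to a Pub/Sub topic
gsutil notification create -t dive-bucket-events -f json gs://BUCKET_NAME

# Push those events to the DIVE server's notification endpoint
gcloud pubsub subscriptions create dive-bucket-events-push \
    --topic=dive-bucket-events \
    --push-endpoint=https://viame.kitware.com/api/v1/bucket_notifications/gcs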

Mirroring setup

If you have your own DIVE deployment, you can create a bucket mirror yourself through the Girder admin console.

  1. Open /girder#assetstores in your browser.
    1. Choose Create new Amazon S3 Assetstore.
    2. Enter a name and all the details you collected above.
    3. For Region, enter the AWS or GCS region you're using, such as us-east-1.
    4. For Service, enter the service URI if you're using an S3-compatible provider other than AWS (such as MinIO or GCS).
    5. Mark the assetstore as Read only.
  2. Now import your data. Choose the green Begin Import button on the new assetstore.
    1. Leave Import path blank unless you only want to import part of the bucket.
    2. For the destination, choose type Folder and enter the folder ID of the mount point you chose above.

The import may take several minutes. You should begin to see datasets appear inside the mount point folder you chose.
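
If you prefer to script this step, Girder also exposes assetstore creation through its REST API. The sketch below is a hedged example using curl: type 2 is Girder's S3 assetstore type, GIRDER_TOKEN is an admin authentication token, and the parameter names should be verified against your deployment's /api/v1 Swagger page before use.

# Create a read-only S3-compatible assetstore (parameter names assumed from Girder's API docs)
curl -X POST "https://viame.kitware.com/api/v1/assetstore" \
    -H "Girder-Token: $GIRDER_TOKEN" \
    --data-urlencode "type=2" \
    --data-urlencode "name=gcs-mirror" \
    --data-urlencode "bucket=BUCKET_NAME" \
    --data-urlencode "accessKeyId=ACCESS_KEY" \
    --data-urlencode "secret=SECRET_KEY" \
    --data-urlencode "service=https://storage.googleapis.com" \
    --data-urlencode "readOnly=true"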