Cloud Storage Integration
This page is intended for storage administrators who would like to make their existing data available through VIAME Web.
Tip
This guide assumes you are working with viame.kitware.com. If you are using a different deployment, be sure to change the appropriate fields.
Tip
Regarding data transfer costs, if you choose to keep both your data storage and job runners in Google Cloud (or AWS), you will avoid paying a data egress fee for transferring data between storage and the processing node.
Google Cloud Storage Mirroring
DIVE Web can mirror your data from Google Cloud Storage buckets so that your team keeps full control over uploads and data organization while still being able to view, annotate, and run analysis within VIAME Web.
Creating access credentials
- Create a new service account with read-only access to the bucket(s) and prefixes that you want to map.
- In storage settings, open the interoperability tab.
- Create an access key (Service account HMAC) for your read-only service account.
- Set the current project as the default project for interoperability access.
- Take note of your Access Key, Secret Key, Storage URI, and Bucket Name.
Setting up CORS
You'll also need to configure CORS headers for any buckets where media will be served.
- Save a CORS configuration as bucket-cors-config.json. Its "origin" value should be whatever you type into your browser to get to the web application.
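As a rough sketch, a configuration of this kind could be generated with a few lines of Python. The field names follow the Google Cloud Storage CORS JSON format. The specific values (GET only, a Content-Type response header, a one-hour max age) and the function names are assumptions for illustration, so adjust them to your deployment.

```python
import json


def build_cors_config(origin: str) -> list:
    """Return a GCS CORS configuration for read-only browser access from one origin.

    The method, header, and max-age values are illustrative assumptions.
    """
    return [
        {
            "origin": [origin],
            "method": ["GET"],
            "responseHeader": ["Content-Type"],
            "maxAgeSeconds": 3600,
        }
    ]


def write_cors_config(path: str, origin: str) -> None:
    """Write the configuration to a JSON file such as bucket-cors-config.json."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_cors_config(origin), handle, indent=2)


# Preview the JSON for the default deployment without touching the disk.
print(json.dumps(build_cors_config("https://viame.kitware.com"), indent=2))
```

Calling write_cors_config("bucket-cors-config.json", "https://viame.kitware.com") would produce the file referred to above.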
Then use gsutil to apply the configuration to each bucket.
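When several buckets need the same configuration, the gsutil invocations can be generated rather than typed by hand. The sketch below only builds and prints the commands. It assumes the configuration file is named bucket-cors-config.json, and the bucket names are placeholders.

```python
import shlex


def cors_commands(bucket_names, config_path="bucket-cors-config.json"):
    """Build one `gsutil cors set` command (as an argument list) per bucket."""
    return [
        ["gsutil", "cors", "set", config_path, f"gs://{name}"]
        for name in bucket_names
    ]


# Placeholder bucket names; replace with your own.
for command in cors_commands(["my-bucket-a", "my-bucket-b"]):
    print(shlex.join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) on a machine where gsutil is installed and authenticated.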
Choose a mount point
Choose a folder as a mount-point inside DIVE Web. This folder should ideally be dedicated to mapping from your GCS buckets.
We recommend creating a Google Cloud Storage folder with sub-folders named for each bucket you mount. You can do this using the New Folder button in DIVE Web's File Browser. You can get the folder ID from your browser's URL bar.
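If you need to pull the folder ID out of a copied URL programmatically, a small helper along these lines may be enough. It relies on Girder IDs being 24-character hexadecimal strings. The URL shape in the example is hypothetical, and the helper simply returns the first such ID it finds.

```python
import re

# A 24-character hex string that is not part of a longer hex run.
_OBJECT_ID = re.compile(r"(?<![0-9a-fA-F])[0-9a-fA-F]{24}(?![0-9a-fA-F])")


def folder_id_from_url(url: str) -> str:
    """Return the first 24-character hex ID found in a copied browser URL."""
    match = _OBJECT_ID.search(url)
    if match is None:
        raise ValueError(f"no folder ID found in {url!r}")
    return match.group(0)


# Hypothetical URL; the real path layout may differ between deployments.
print(folder_id_from_url("https://viame.kitware.com/#/folder/5f6a7b8c9d0e1f2a3b4c5d6e"))
```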
Send us the details
If you want to use your bucket with viame.kitware.com, send an email to viame-web@kitware.com with the details collected above: your Access Key, Secret Key, Storage URI, and Bucket Name, plus the folder ID of the mount point you chose.
S3 and MinIO Mirroring
If you have data in S3 or MinIO, you can mirror it in DIVE for annotation.
- Data is expected to be either videos or images organized into folders.
- You should not make changes to folder contents once a folder has been mirrored into DIVE. Adding or removing images in a particular folder may cause annotation alignment issues.
- Adding entire new folders is supported, but requires a re-index of your S3 bucket.
Pub/Sub notifications
Creating Pub/Sub notifications is optional, but doing so keeps your mount point up to date automatically as new data is added to the bucket. To use this feature, your DIVE server must have a public static IP address or domain name.
- Create a bucket notification configuration.
- Create a topic subscription.
- Set a push delivery method for the subscription.
- The URL for delivery should be https://viame.kitware.com/api/v1/bucket_notifications/gcs
Our server will process events from this subscription to keep your data current.
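The notification steps above could be scripted roughly as follows. This sketch only builds the gsutil and gcloud command lines. The bucket, topic, and subscription names are placeholders, and the choice of JSON payload format is an assumption.

```python
import shlex

DELIVERY_URL = "https://viame.kitware.com/api/v1/bucket_notifications/gcs"


def notification_commands(bucket, topic, subscription, push_endpoint=DELIVERY_URL):
    """Build the commands for a bucket notification and a push subscription."""
    return [
        # Publish bucket events to a Pub/Sub topic.
        ["gsutil", "notification", "create", "-t", topic, "-f", "json", f"gs://{bucket}"],
        # Subscribe to that topic and push each event to the DIVE server.
        [
            "gcloud", "pubsub", "subscriptions", "create", subscription,
            f"--topic={topic}",
            f"--push-endpoint={push_endpoint}",
        ],
    ]


# Placeholder names; replace with your own.
for command in notification_commands("my-bucket", "my-bucket-events", "dive-push"):
    print(shlex.join(command))
```

If you run your own deployment, pass your server's address as push_endpoint instead of the default.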
Mirroring setup
If you have your own DIVE deployment, you can create a bucket mirror yourself through the Girder admin console.
- Open /girder#assetstores in your browser.
- Choose Create new Amazon S3 Assetstore.
- Enter a name and all the details you collected above.
- For region, enter the AWS or GCS region you're using, like us-east-1.
- For service, enter the service URI if you're using an S3 provider other than AWS (such as MinIO or GCS).
- Mark as Read only.
- Now import your data. Choose the green Begin Import button on the new assetstore.
- Leave Import path blank unless you only want to import part of a bucket.
- Set Destination type to Folder, and for Destination ID enter the folder ID you chose as the mount point above.
The import may take several minutes. You should begin to see datasets appear inside the mount point folder you chose.
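For administrators who prefer scripting to the admin console, the same two steps might be expressed as REST request parameters. This is a sketch only. The parameter names are assumptions based on Girder's assetstore endpoints (POST /assetstore and POST /assetstore/{id}/import) and should be checked against the API documentation page of your own deployment. The function names and example values are hypothetical.

```python
# Assumed numeric code for an S3 assetstore in Girder; verify for your version.
S3_ASSETSTORE_TYPE = 2


def assetstore_params(name, bucket, access_key, secret_key,
                      service="", region="us-east-1", prefix=""):
    """Parameters for creating a read-only S3-compatible assetstore (assumed names)."""
    return {
        "type": S3_ASSETSTORE_TYPE,
        "name": name,
        "bucket": bucket,
        "prefix": prefix,
        "accessKeyId": access_key,
        "secret": secret_key,
        "service": service,   # leave empty for AWS; set for MinIO or GCS
        "region": region,
        "readOnly": "true",   # mirror only, never write to the bucket
    }


def import_params(folder_id, import_path=""):
    """Parameters for importing bucket contents into the mount-point folder."""
    return {
        "importPath": import_path,  # blank imports the whole bucket
        "destinationType": "folder",
        "destinationId": folder_id,
        "progress": "true",
    }


# Example values only; the keys and folder ID below are placeholders.
print(assetstore_params("gcs-mirror", "my-bucket", "ACCESS_KEY", "SECRET_KEY",
                        service="https://storage.googleapis.com"))
print(import_params("5f6a7b8c9d0e1f2a3b4c5d6e"))
```

These dictionaries could then be sent with an authenticated HTTP client of your choice. Keeping them as plain data makes it easy to review the values before anything is created on the server.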