UC Berkeley Data Storage

This unofficial guide provides a brief overview of the data storage options offered by UC Berkeley. The information is current as of 6/1/2024; however, storage options are subject to change. Please consult the Berkeley Research Data Management Program for the most up-to-date information and support.



Box

  • Accounts for individuals
  • Free 50 GB storage quota, 15 GB max file size
  • Request up to 2 TB using this form
  • 500GB for Special Purpose Accounts* (SPA)
  • Important note: based on Berkeley’s contract, Box is only approved for active data, not backups or archives

*SPAs are accounts with CalNet IDs that can be shared by multiple users. They allow for persistence in data access as they are not tied to individuals.


MyDrive (bdrive)

  • Accounts for individuals
  • Free 50 GB storage quota
  • Request up to 150 GB using this form.

Shared Drives

  • Accounts for groups
  • Two options:
    1. Passthrough Pay Option: pay $1,440 annually for 10 TB (you can purchase more in 10 TB blocks). This is probably the easiest-to-use choice for PIs or small research groups, though it is more expensive than Wasabi or Active Archive Object Storage.
    2. Expanded Shared Drives: free. A 1 TB Expanded Shared Drive is available per unit/group or per 40 employees (i.e., departments); availability of 50 GB Expanded Shared Drives is more flexible.
  • 5 TB max file size for both options.
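
As a rough illustration of how the Passthrough Pay pricing scales, the sketch below rounds a storage need up to whole 10 TB blocks. It assumes partial blocks cannot be purchased (the guide only says storage is sold in 10 TB blocks); the function and constant names are hypothetical.

```python
import math

BLOCK_TB = 10            # Passthrough Pay is sold in 10 TB blocks
BLOCK_PRICE_USD = 1440   # annual price per block, as quoted in this guide


def passthrough_annual_cost(needed_tb: float) -> int:
    """Annual cost in USD for enough 10 TB blocks to hold needed_tb.

    Assumption: partial blocks are not sold, so the need is rounded up.
    """
    if needed_tb <= 0:
        raise ValueError("needed_tb must be positive")
    blocks = math.ceil(needed_tb / BLOCK_TB)
    return blocks * BLOCK_PRICE_USD


print(passthrough_annual_cost(10))       # 1440
print(passthrough_annual_cost(25))       # 4320 (three blocks)
print(BLOCK_PRICE_USD / BLOCK_TB / 12)   # 12.0 dollars per TB/month
```

At a full block, this works out to $12 per TB/month, which is the figure to compare against the per-TB monthly prices of the other options in this guide.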

Wasabi

This is a relatively inexpensive (e.g., cheaper than Google) and easy-to-use option for data storage. It is user-friendly cloud storage with a simple GUI for accessing files and dropping them from Wasabi onto your local drive.

  • $5.99 per TB/month (starting minimum 1TB), waiver on data egress fees (i.e., data transfer)
  • Submit a ticket for more information: [email protected]
  • Note: the cost of Wasabi without a Berkeley affiliation is $6.99 per TB/month, so this may be a good option if you need to store data on a personal account between institutions
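
To make the Wasabi pricing concrete, here is a minimal sketch that estimates a monthly bill from the two rates quoted above. It assumes that usage below the 1 TB starting minimum is billed as 1 TB, which is one reading of "starting minimum 1TB"; actual billing rules may differ, and the names are hypothetical.

```python
BERKELEY_RATE = 5.99   # USD per TB/month with a Berkeley affiliation
PERSONAL_RATE = 6.99   # USD per TB/month without one
MINIMUM_TB = 1.0       # assumption: usage under 1 TB is billed as 1 TB


def wasabi_monthly_cost(stored_tb: float, rate: float = BERKELEY_RATE) -> float:
    """Estimated monthly cost in USD for stored_tb terabytes."""
    billable_tb = max(stored_tb, MINIMUM_TB)
    return round(billable_tb * rate, 2)


print(wasabi_monthly_cost(0.3))               # 5.99
print(wasabi_monthly_cost(4))                 # 23.96
print(wasabi_monthly_cost(4, PERSONAL_RATE))  # 27.96
```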

Commercial cloud providers

  • Options:
  1. Amazon Web Services (AWS)
  2. Google Cloud Platform (GCP)
  3. Microsoft Azure
  • Berkeley has data egress (i.e., data transfer) fee waivers for AWS and GCP
  • Pricing: highly variable and complex; reach out to [email protected] for support. You can submit a ticket through the following emails:
  • Note: AWS (Amazon S3 Glacier) and GCP (Archive storage) offer good deep storage options for data that you aren’t actively using and want to archive. These options are often much cheaper than alternatives like Wasabi, but are a bit more complicated to set up.

Active Archive Object Storage (AAOS)

  • Cheaper, long-term storage option, especially for splitting among labs or research groups
  • Book a consultation to figure out exact pricing: [email protected]
  • This is hardware purchased by the researcher, research group, or lab. The hardware and vendor support last for five years, after which the equipment must be replaced. Because you are purchasing the equipment, the purchase is a capital expense in grant accounting and is NOT included in the total on which overhead is assessed, which is a great saving in and of itself.
  • Note: these costs will be somewhere around $5,600 (plus sales tax) for 125 TB for 5 years. That’s about $50 per TB for five years’ storage (83 cents per TB/month)!
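
The per-TB figures in the note above can be reproduced with a short amortization sketch. Before tax, $5,600 for 125 TB works out to about 75 cents per TB/month; the quoted "about $50" and 83 cents appear to include sales tax and rounding. The 10% tax rate below is a placeholder assumption, not an official figure, and the function name is hypothetical.

```python
def cost_per_tb_month(price_usd: float, capacity_tb: float, years: int,
                      sales_tax_rate: float = 0.0) -> float:
    """Amortized cost in USD per TB per month for a one-time hardware purchase."""
    total = price_usd * (1 + sales_tax_rate)
    return total / capacity_tb / (years * 12)


pre_tax = cost_per_tb_month(5600, 125, 5)
print(round(pre_tax, 2))    # 0.75

# Placeholder 10% sales tax (assumption; check the current local rate)
with_tax = cost_per_tb_month(5600, 125, 5, sales_tax_rate=0.10)
print(round(with_tax, 2))   # 0.82
```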

Savio Condo Storage

  • For Condo Computing users of the Savio high performance computing cluster.
  • Cheaper, long-term storage option; a similar model to Active Archive Object Storage (AAOS). No cost after five years
  • 112 TB increments at a cost of $5,750 each (plus sales tax) (price subject to change due to supply chain issues)
  • Interested faculty and PIs should contact [email protected]

*This storage option is probably too expensive for most individuals and labs and is used more for business processes on campus, but it is included here because it is a Berkeley storage offering.

  • Utility tier: 4 cents per GB/month ($480 per TB/year)
  • Performance tier: 20 cents per GB/month ($2,400 per TB/year)
  • See here for information about the different tiers.
  • Hosted in a data center and can then be mounted on a local machine (e.g., gdrive, hdrive)
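
The per-TB/year figures for the two tiers follow from the per-GB monthly prices. The sketch below shows the conversion, assuming a decimal terabyte (1 TB = 1,000 GB), which is what the quoted numbers imply; the function name is hypothetical.

```python
GB_PER_TB = 1000  # decimal terabyte, consistent with the figures quoted above


def per_tb_year(cents_per_gb_month: float) -> float:
    """Convert a price in cents per GB/month to USD per TB/year."""
    return cents_per_gb_month * GB_PER_TB * 12 / 100


print(per_tb_year(4))    # 480.0  (Utility tier)
print(per_tb_year(20))   # 2400.0 (Performance tier)
```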

CalShare

Intended for departmental collaboration, departmental intranet websites, and low-volume P4 data storage (not for individual use).

  • Free, 1 TB storage quota. Maximum single file upload: 15 GB.

Additional notes

Privacy and security: most of the Berkeley data storage options have a Data Security Level of P3. For P4 data storage options check out CalShare, Secure Research Data and Compute, and Active Archive Object Storage (AAOS).

Note about non-Berkeley-affiliated storage options: there are other storage options (such as Dropbox, OneDrive, and iCloud) with which UC Berkeley does not have an existing contract. Using these services carries greater risk because UC Berkeley contracts include more security and liability measures to protect you and your data. Here is basic pricing information for some common non-affiliated services:

  • Dropbox
    • Free: 2GB
    • Plus: $11.99/month or $119.88/year for 2TB; Large file delivery up to 2 GB
    • Professional: $19.99/month or $198.96/year for 3TB; Large file delivery up to 100 GB
    • Standard (3+ users): $18/user/month or $180/user/year for 5TB; Large file delivery up to 2 GB
    • Advanced (3+ users): $24/user/month or $288/user/year for 15TB; Large file delivery up to 100 GB
  • iCloud (USA prices)
    • 50GB: $0.99/month
    • 200GB: $2.99/month
    • 2TB: $9.99/month
    • 6TB: $29.99/month
    • 12TB: $59.99/month
  • OneDrive (Personal Accounts - other options for business pricing)
    • Free - 5GB
    • Microsoft 365 Personal - $69.99/year for 1TB
    • Microsoft 365 Basic - $19.99/year for 100GB
  • Google Drive (Personal Account)
    • Free: 15GB
    • Basic: $1.99/month for 100GB
    • Standard: $2.99/month for 200GB
    • Premium: $9.99/month for 2TB

Unofficial Advice: I created this guide as a resource for UC Berkeley graduate students, post-docs, staff, and PIs who are responsible for generating and maintaining research data. For those looking for the short answer to “what do I do with my data?”: based on the resources described here, I think the cheapest and easiest option for labs or research groups is to pay for storage via Wasabi or Google Shared Drives. For individuals who need to store their data independently, Box or Google Drive works if you need less than 2 TB, and Wasabi seems to be the easiest and cheapest option if you need more than that. There are definitely more cost-effective options described above, though they may take a bit more work to set up initially. - Anusha

Sources

Berkeley Research Data and Management: Data Storage & Backup

Berkeley bConnected: Alternative Storage Options

Contributors

Anusha Bishop (Creator, maintainer)

Rick Jaffe
