S3DMap

S3DMap provides an interactive 3D Tree Map of your S3 bucket, to aid in S3 cost optimization and object management.
Use S3DMap to gain an intuitive visual map of your S3 bucket, at the prefix-level.
It is based on a suggested cost optimization methodology, Prefix-Oriented Object Management (POOM), which was presented at PlatformCon 2024.

The methodology and tool emerged from extensive research performed by the 5x team at PointFive and are based on real-world case studies.

Inspired by SpaceMonger from the 2000s, the tool enables interactive treemap browsing of your bucket's storage with configurable layers of insights.

Think of it as a self-serve tool for mining cost optimization opportunities, based on your S3 Bucket Inventory export.

🚀 Please do contribute and share your use cases and ideas.

For a fully managed experience and automatic cost optimization recommendations across all dimensions and use cases, feel free to reach out and get the PointFive platform in your environment!

✨ Features

  • 🧮 Interactive treemap browsing of S3 bucket storage
  • 📟 Detailed prefix-level analysis, using configurable layers of insights
  • 📜 Direct SQL interface at the object level and the prefix level, for custom advanced research
  • 🤡 Anonymizer script to share the bucket structure without revealing object names

🌟 Example Use Cases

🎯 The Goal: Efficient Bucket Architecture

Choose the correct storage class for all objects given their usage pattern and attributes.

🧩 The Methodology

Prefix-Oriented Object Management (POOM)

From AWS Official Documentation:

A prefix is a string of characters at the beginning of the object key name. A prefix can be any length, subject to the maximum length of the object key name (1,024 bytes). You can think of prefixes as a way to organize your data in a similar way to directories. However, prefixes are not directories.
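To make the definition concrete, here is a minimal Python sketch that lists the directory-like prefixes of a key and rolls object sizes up into each of them, which is roughly the shape of data a treemap needs. It only considers prefixes that end at a delimiter (an S3 prefix can be any leading substring). The function names, keys, and sizes are hypothetical, and this is not S3DMap's actual implementation.

```python
from collections import defaultdict


def prefixes_of(key: str, delimiter: str = "/") -> list[str]:
    """Return every delimiter-terminated prefix of an object key, shortest first."""
    parts = key.split(delimiter)[:-1]  # drop the final segment (the object name)
    return [delimiter.join(parts[: i + 1]) + delimiter for i in range(len(parts))]


def rollup_sizes(objects: dict[str, int]) -> dict[str, int]:
    """Sum object sizes (in bytes) into every ancestor prefix."""
    totals: dict[str, int] = defaultdict(int)
    for key, size in objects.items():
        for prefix in prefixes_of(key):
            totals[prefix] += size
    return dict(totals)


# Hypothetical keys and sizes, for illustration only.
sample = {
    "logs/2024/01/app.log": 100,
    "logs/2024/02/app.log": 300,
    "images/cat.png": 50,
}
totals = rollup_sizes(sample)
for prefix in sorted(totals):
    print(f"{prefix:<16}{totals[prefix]:>6}")
```

Three objects collapse into five prefixes here; on a real bucket the ratio runs the other way, with far fewer prefixes than objects.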

While the ideal architecture strives to create the "designated bucket" (coined by @omritsa) with a well-defined purpose, you likely already have huge "generalized buckets" in your cloud environment. And you would probably prefer almost any other activity to migrating those existing piles of data to new buckets...

The remedy comes in the form of designated-prefixes!

In a nutshell:

  • The bucket is only a semantic wrapper for the actual cost-driving entities: the prefixes (directories)
  • S3 storage is not hierarchical (excluding the new Express One Zone), but prefixes and sub-prefixes essentially create a hierarchical tree structure
  • Moreover, it is common for objects' attributes to be fairly consistent within a specific prefix branch
  • The prefixes are the tangible organizational units in S3 for storage class management via Lifecycle Policies (a bucket does not have a storage class)
    • Lifecycle Policies, Expiration Policies and Intelligent Tiering, in turn, are the toolset for you to achieve the goal of the game
  • There are an order of magnitude fewer prefixes than objects, which makes them feasible to manage and easy to grasp.
  • Under the hood, prefixes are implicit instructions for S3 to partition the physical data storage. Thus, most relevant S3 mechanisms work by the prefix:
    • Lifecycle Policies
    • Intelligent Tiering
    • Expiration Policies
    • API (prefixes actually let you horizontally scale API requests per second!)
    • Inventory
    • ...
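As an illustration of managing storage class by prefix, the sketch below builds a lifecycle configuration whose rules are each scoped to one prefix through a `Filter`. The dictionary layout follows the S3 lifecycle configuration accepted by boto3's `put_bucket_lifecycle_configuration`. The helper name, prefixes, day thresholds, and bucket name are all hypothetical.

```python
import json


def transition_rule(prefix: str, days: int, storage_class: str) -> dict:
    """Build one lifecycle rule that applies only to keys under `prefix`."""
    return {
        "ID": f"transition-{prefix.strip('/').replace('/', '-')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


# Hypothetical prefixes and thresholds; pick yours from what the map shows.
lifecycle_configuration = {
    "Rules": [
        transition_rule("logs/", 30, "STANDARD_IA"),
        transition_rule("backups/db/", 90, "GLACIER"),
    ]
}
print(json.dumps(lifecycle_configuration, indent=2))

# Applying it needs boto3 and AWS credentials, so the call is left commented out:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-generalized-bucket",  # hypothetical bucket name
#     LifecycleConfiguration=lifecycle_configuration,
# )
```

Note that this call replaces the bucket's entire existing lifecycle configuration, so merge any rules you already have before applying it.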

🚀 Getting Started

Prerequisites

Installation

  1. Clone the repository:

    git clone https://github.com/PointFiveLabs/s3dmap.git
  2. Enter the s3dmap directory:

    cd s3dmap
  3. End-to-end docker-compose build:

    make full
  4. Open browser at: https://localhost:2323/ and hit "Update Treemap"

This lets you browse the preloaded sample-bucket out of the box.

📚 Usage Guides

Loading your own Bucket

This is where it gets interesting and you can start mining insights visually! Really just by looking at the map!

CSV

  1. Create the CSV S3 Inventory export for your bucket.
    When creating the export, include as many of the optional columns as you like. Any column left unchecked limits the dimensions the tool can offer.
  2. Put the CSV files under user_input_data/inventories/<BUCKET_NAME>/csv, along with the corresponding manifest.json.
  3. Run:
    make full
  4. Open browser at: https://localhost:2323/
  5. Enter <BUCKET_NAME> as the bucket name and hit Enter
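The manifest.json matters because S3 Inventory CSV data files are gzip-compressed and carry no header row; the column names live in the manifest's `fileSchema` field. The sketch below shows one way to pair the two in Python. It is not how S3DMap loads inventories internally, and the manifest and data row are trimmed, hypothetical stand-ins for real exports.

```python
import csv
import gzip
import io
import json


def schema_columns(manifest: dict) -> list[str]:
    """Split the manifest's comma-separated fileSchema into column names."""
    return [name.strip() for name in manifest["fileSchema"].split(",")]


def read_inventory_rows(gz_bytes: bytes, columns: list[str]) -> list[dict]:
    """Parse one gzip-compressed, header-less inventory CSV into dict rows."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), fieldnames=columns))


# A trimmed, hypothetical manifest and data file standing in for a real export.
manifest = json.loads("""
{
  "sourceBucket": "my-bucket",
  "fileFormat": "CSV",
  "fileSchema": "Bucket, Key, Size, StorageClass",
  "files": [{"key": "inventory/data/part-0001.csv.gz"}]
}
""")
data = gzip.compress(b'"my-bucket","logs/2024/01/app.log","100","STANDARD"\n')

columns = schema_columns(manifest)
rows = read_inventory_rows(data, columns)
print(columns)
print(rows[0]["Key"], rows[0]["Size"])
```

The optional columns you checked when creating the export are exactly the extra names that show up in `fileSchema`. Keys in CSV inventories are URL-encoded, so decode them (for example with `urllib.parse.unquote`) before comparing them to real object names.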

Parquet

Not supported yet. Accepting PRs!

Run your own SQL Queries on Inventory and Prefixes

For advanced research and custom investigations, you may directly query the raw inventory table or the transformed prefixes table, using the underlying Postgres DB.

  1. Load your bucket inventory as instructed above.
  2. Run:
    make sql QUERY="<YOUR QUERY>;"

Example usage:

    make sql QUERY="select * from inventory limit 10;"
    make sql QUERY="select * from prefixes limit 10;"
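If you would rather drive these queries from a script, the sketch below wraps the same `make sql` target in Python. Passing the arguments as a list avoids one layer of shell quoting. The helper names are hypothetical, and how the Makefile forwards `QUERY` to Postgres is an assumption, so queries containing quotes may still need care.

```python
import subprocess


def make_sql_command(query: str) -> list[str]:
    """Build the argv for `make sql QUERY="<query>;"` without shell quoting."""
    query = query.strip()
    if not query.endswith(";"):
        query += ";"
    return ["make", "sql", f"QUERY={query}"]


def run_query(query: str, repo_dir: str = ".") -> str:
    """Run the query from the repository root and return whatever make prints."""
    result = subprocess.run(
        make_sql_command(query),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if make exits non-zero
    )
    return result.stdout


print(make_sql_command("select * from prefixes limit 10"))
# Needs the running s3dmap stack, so the call is left commented out:
# print(run_query("select * from inventory limit 10"))
```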

Anonymize your Bucket Object Names

If you want to show, share, or screenshot your bucket's map without revealing real object names, you can anonymize the bucket's inventory.

  1. Load your bucket inventory as instructed above.
  2. Run:
    make anonymize BUCKET_NAME=<BUCKET_NAME>
    This deep-copies your bucket's inventory data, using randomly mangled names, into a new bucket called sample-bucket.
  3. Open browser at: https://localhost:2323/
  4. Enter sample-bucket as the bucket name and hit Enter
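To give a feel for what anonymizing a key while keeping the bucket structure can look like, here is a minimal Python sketch. It is not the repository's anonymizer script. The class name is hypothetical, and the approach (one random token per distinct path segment, reused within a run) is an assumption chosen so that shared prefixes stay shared.

```python
import secrets


class KeyAnonymizer:
    """Replace each path segment with a random token, consistently per run."""

    def __init__(self, delimiter: str = "/") -> None:
        self.delimiter = delimiter
        self._tokens: dict[str, str] = {}

    def _token_for(self, segment: str) -> str:
        # Reuse the token of a segment already seen so shared prefixes stay shared.
        if segment not in self._tokens:
            self._tokens[segment] = secrets.token_hex(4)
        return self._tokens[segment]

    def anonymize(self, key: str) -> str:
        # Empty segments (for example from a trailing delimiter) are kept as-is.
        return self.delimiter.join(
            self._token_for(part) if part else part
            for part in key.split(self.delimiter)
        )


anonymizer = KeyAnonymizer()
first = anonymizer.anonymize("logs/2024/01/app.log")
second = anonymizer.anonymize("logs/2024/02/app.log")
print(first)
print(second)
```

Both outputs share their first two segments, so the tree keeps its shape. The trade-off is that identical segment names remain recognizably identical, and file extensions are lost.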

🌟 Future Plans

Accepting PRs!

  • Support Parquet Inventory input (available in our platform)
  • Automate Inventory export creation, using AWS CLI or IaC files (Terraform/CloudFormation) (available in our platform)
  • Obtain an existing Inventory export directly from the target bucket
  • Ingest and process other GIS-like layers of insights (available in our platform):
    • Cost (CUR)
    • Access Logs
    • Lifecycle Rules
    • Object Attributes
    • CloudFront

Contributors

azouri-pf, dorazouri, molaga
