Comments (4)

zhang-ziang commented on August 24, 2024

I tried this code, it seems to work fine.

I tried the code but still encounter some problems. Here is my code and output.

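import numpy as np
import torch
from PIL import Image
from tqdm import tqdm

from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType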

sensor_to_params = {
    "kv1": {
        "baseline": 0.075,
    },
    "kv1_b": {
        "baseline": 0.075,
    },
    "kv2": {
        "baseline": 0.075,
    },
    "realsense": {
        "baseline": 0.095,
    },
    "xtion": {
        "baseline": 0.095, # guessed based on length of 18cm for ASUS xtion v1
    },
}


def convert_depth_to_disparity(depth_file, intrinsics_file, sensor_type, min_depth=0.01, max_depth=50):
    """
    depth_file is a png file that contains the scene depth
    intrinsics_file is a txt file supplied in SUNRGBD with sensor information
            Can be found at the path: os.path.join(root_dir, room_name, "intrinsics.txt")
    """
    with open(intrinsics_file, 'r') as fh:
        lines = fh.readlines()
        focal_length = float(lines[0].strip().split()[0])
    baseline = sensor_to_params[sensor_type]["baseline"]
    depth_image = np.array(Image.open(depth_file))
    depth = np.array(depth_image).astype(np.float32)
    depth_in_meters = depth / 1000.
    if min_depth is not None:
        depth_in_meters = depth_in_meters.clip(min=min_depth, max=max_depth)
    disparity = baseline * focal_length / depth_in_meters
    return torch.from_numpy(disparity).float()

# ...

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

with torch.no_grad():
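    for dep_file in tqdm(depth_files):
        sensor_type = ...
        intrinsics_file = ...  # elided like sensor_type: path to this scene's intrinsics.txt
        disparity = convert_depth_to_disparity(dep_file, intrinsics_file, sensor_type, min_depth=0.01, max_depth=50).unsqueeze_(dim=0).to(device)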
        print(disparity.shape)
        # Load data
        inputs = {
            ModalityType.DEPTH: disparity,
        }
        embeddings = model(inputs)

The print output is torch.Size([1, 530, 730]), and ImageBind throws an error:

RuntimeError: Given normalized_shape=[384], expected input with shape [*, 384], but got input of size[384, 45, 33]

Do you have any idea? Or could you share your code? Thanks a lot. :)

I solved the problem by resizing the tensor to the shape [B, 1, 224, 224]; it seems to work well. :)
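For anyone hitting the same error, here is a minimal sketch of that resize step. The comment above does not say how the resizing was done, so the use of `torch.nn.functional.interpolate`, the bilinear mode, and the helper name `resize_disparity` are all assumptions for illustration, not the code actually used in this thread.

```python
import torch
import torch.nn.functional as F


def resize_disparity(disparity: torch.Tensor, size: int = 224) -> torch.Tensor:
    """Hypothetical helper: turn an [H, W] or [B, H, W] disparity map into [B, 1, size, size]."""
    if disparity.dim() == 2:
        disparity = disparity.unsqueeze(0)  # [H, W] -> [1, H, W]
    disparity = disparity.unsqueeze(1)      # [B, H, W] -> [B, 1, H, W] (add channel dim)
    # Bilinear interpolation is an assumption; it squashes the map to a square,
    # so the aspect ratio is not preserved.
    return F.interpolate(disparity, size=(size, size), mode="bilinear", align_corners=False)


# Random stand-in with the same shape as the tensor printed above.
dummy = torch.rand(1, 530, 730)
resized = resize_disparity(dummy)
print(resized.shape)  # torch.Size([1, 1, 224, 224])
```

The result can then be passed as `ModalityType.DEPTH: resized.to(device)`. If the aspect-ratio distortion matters for your data, a center crop before resizing may be a better fit.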

tfwang08 commented on August 24, 2024

I tried this code, it seems to work fine.

LinB203 commented on August 24, 2024

I tried this code, it seems to work fine.

We can use absolute depth in meters for inference with this repo.
