Comments (5)

Jiayuan-Gu commented on August 23, 2024

Thanks for providing a minimal example. However, I am not able to reproduce the segfault with this script. KuafuRenderer is experimentally supported in ManiSkill2, and we hope the community can help us improve the relevant features.

After some quick checks, I think the error results from creating new scenes for different assets. If we use only one model, the env never creates a new scene, so the segfault is not triggered.

Here is my modified minimal example:

import gym
import numpy as np
from mani_skill2.utils.wrappers import RecordEpisode


def main():
    np.set_printoptions(suppress=True, precision=3)
    env = gym.make(
        "TurnFaucet-v0",
        obs_mode=None,
        model_ids=["5001", "5021"],
        reward_mode=None,
        control_mode="pd_ee_delta_pose",
        enable_kuafu=True,
    )
    record_dir = "./tmp/issue25"
    # env = RecordEpisode(env, record_dir, render_mode="cameras")

    frames = []
    for i in range(10):
        print("Episode", i)
        env.reset()
        for _ in range(10):
            render_frame = env.render(mode="cameras")
            frames.append(render_frame)

            action = env.action_space.sample()
            obs, reward, done, info = env.step(action)
            print("step", info["elapsed_steps"])
            if done:
                break


if __name__ == "__main__":
    main()

For now, the workarounds are:

  1. Use environments without asset changes.
  2. Use only one model at a time (PickSingleYCB, PickSingleEGAD, and TurnFaucet all support this through initialization, and all environments also support fixing randomness on reset).
  3. If you want to replay trajectories, I suggest first splitting the trajectories by model id according to the associated JSON file, then using a bash script to process each model.
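The splitting step in the third workaround could look roughly like the sketch below. It is only an illustration: the key names "episodes", "episode_id", "reset_kwargs", and "model_id" are assumptions about the layout of the trajectory JSON file, and the inline `example` data stands in for a real file, so check them against your own files before relying on this.

```python
import json
from collections import defaultdict


def group_episodes_by_model(meta):
    """Map each model id to the ids of the episodes recorded with it.

    Assumes (hypothetically) that `meta` has an "episodes" list whose
    entries carry "episode_id" and "reset_kwargs" -> "model_id".
    """
    groups = defaultdict(list)
    for episode in meta["episodes"]:
        model_id = episode.get("reset_kwargs", {}).get("model_id")
        groups[model_id].append(episode["episode_id"])
    return dict(groups)


# Inline stand-in for the contents of a trajectory JSON file (made-up data).
example = json.loads("""
{
  "episodes": [
    {"episode_id": 0, "reset_kwargs": {"model_id": "5001"}},
    {"episode_id": 1, "reset_kwargs": {"model_id": "5021"}},
    {"episode_id": 2, "reset_kwargs": {"model_id": "5001"}}
  ]
}
""")

groups = group_episodes_by_model(example)
for model_id, episode_ids in groups.items():
    print(model_id, episode_ids)
```

Each group can then be handed to a separate run, so that no single process ever has to switch models.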

@jetd1 Please help check how KuafuRenderer's memory is managed for a new scene.

from maniskill.

Xingyu-Lin commented on August 23, 2024

Indeed, the issue comes from creating a new scene with a different model. Using one process per model works around the issue, although the solution does not seem ideal.
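For reference, a minimal sketch of the one-process-per-model idea using only the standard library. The `WORKER` snippet is a hypothetical placeholder that just prints the model id; a real worker would create the environment for that single model and render it. The helper names `run_in_fresh_process` and `render_all` are made up for this illustration.

```python
import subprocess
import sys

# Placeholder worker code (hypothetical): a real worker would build the env
# for the single model id passed in sys.argv[1] and render it.
WORKER = "import sys; print('rendering model', sys.argv[1])"


def run_in_fresh_process(code, model_id):
    """Run `code` in a new Python interpreter with model_id as argv[1]."""
    result = subprocess.run(
        [sys.executable, "-c", code, model_id],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def render_all(model_ids, worker_code=WORKER):
    """Launch one interpreter per model and collect the exit codes."""
    statuses = {}
    for model_id in model_ids:
        returncode, _ = run_in_fresh_process(worker_code, model_id)
        statuses[model_id] = returncode
    return statuses


if __name__ == "__main__":
    print(render_all(["5001", "5021"]))
```

Because every model gets its own interpreter, a crash in one run shows up as a nonzero exit code for that model only (on POSIX, a negative code means the child was killed by a signal) and does not take down the remaining runs.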

Xingyu-Lin commented on August 23, 2024

A related issue: when the save_trajectory option is set, I get a weird error from h5py, as shown below. The error goes away if I comment out the rendering.

Traceback (most recent call last):
  File "examples/demo_manual_control.py", line 57, in <module>
    main()
  File "examples/demo_manual_control.py", line 49, in main
    render_traj(0)
  File "examples/demo_manual_control.py", line 43, in render_traj
    env.close()
  File "/home/xingyu/Projects/OpenWorldManipulation/mani_skill2/utils/wrappers/record.py", line 277, in close
    self._h5_file.close()
  File "/home/xingyu/software/miniconda3/envs/maniskill2/lib/python3.8/site-packages/h5py/_hl/files.py", line 552, in close
    self.id._close_open_objects(h5f.OBJ_LOCAL | h5f.OBJ_FILE)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 360, in h5py.h5f.FileID._close_open_objects
RuntimeError: Can't decrement id ref count (unable to extend file properly, errno = 9, error message = 'Bad file descriptor')
Segmentation fault (core dumped)

Minimal example to reproduce this issue:

from mani_skill2.utils.wrappers import RecordEpisode
from multiprocessing import Pool
import multiprocessing as mp


def render_traj(i):
    print('render traj ', i)
    import gym
    import numpy as np
    np.set_printoptions(suppress=True, precision=3)
    model_ids = ['5001', '5021']
    model_id = model_ids[i % len(model_ids)]
    env = gym.make(
        'TurnFaucet-v0',
        obs_mode=None,
        reward_mode=None,
        control_mode='pd_ee_delta_pose',
        enable_kuafu=True)

    record_dir = '/tmp/issue25/'
    env = RecordEpisode(env,
                        record_dir,
                        render_mode='cameras',
                        save_trajectory=True,
                        save_on_reset=False,
                        save_video=True,
                        trajectory_name=str(i))
    env.reset(model_id=model_id)
    env.close()


def main():
    render_traj(0)

if __name__ == "__main__":
    mp.set_start_method('spawn')
    main()

jetd1 commented on August 23, 2024

This bug should be fixed with haosulab/SAPIEN@31e12dd
Please consider building SAPIEN from scratch if you need the fix immediately (https://github.com/haosulab/SAPIEN/blob/dev/docker_install_38_debug.sh).

Jiayuan-Gu commented on August 23, 2024

https://github.com/haosulab/ManiSkill2/issues/25#issuecomment-1248769861 seems to be related to accidental corruption of the HDF5 file. However, the minimal example seems to be unrelated, and I cannot reproduce it yet. I may comment on this issue if I figure out what happens.
