
drake-blender's Introduction

Overview

drake-blender is an implementation of the Drake glTF Render Client-Server API atop Blender.

This is a relatively new project and may still have bugs. Please share your issues and improvements on GitHub.

Compatibility

This software is only tested on Ubuntu 22.04 "Jammy", but should probably work on any platform whose Python runtime can satisfy our requirements.txt.

Running the render server

There are two ways to run the server.

(1) From a git checkout of drake-blender:

./bazel run :server

This approach has no extra setup steps. Bazel automatically downloads the required dependencies into its sandbox, using the same versions pinned by our requirements lockfile and tested in our Continuous Integration build.

(2) From your own virtual environment:

The server.py file is self-contained -- it does not import any other files from drake-blender. Instead of using Bazel, you can also run it as a standalone Python program (python3 server.py) so long as the packages listed in our requirements.in are available in your Python runtime environment. You are responsible for preparing and activating an appropriate virtual environment on your own.
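For example, a typical setup might look like the following (the virtual environment path is illustrative):

python3 -m venv env
env/bin/pip install -r requirements.in
env/bin/python server.py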

Examples

See examples.

Testing (for developers)

From a git checkout of drake-blender:

./bazel test //...

Linting

Check for lint:

./bazel test //... --config=lint

Fix all lint:

./tools/fix_lint.sh

Credits

The Drake-Blender project was created by the Robotics Division at Toyota Research Institute. Many other people have since contributed their talents. Here's an alphabetical list (note to contributors: do add yourself):

  • Bassam ul Haq
  • Cody Simpson
  • Eric Cousineau
  • Jeremy Nimmer
  • John Shepherd
  • Kunimatsu Hashimoto
  • Matthew Woehlke
  • Sean Curtis
  • Stephen McDowell
  • Zach Fang

Licensing

Per LICENSE.TXT, this module is offered under the BSD-2-Clause license, but note that it imports bpy from Blender, so it is also governed by the terms of Blender's GPL-2.0-or-later license.

Per examples/LICENSE.TXT, the examples code is offered under the MIT-0 license.

drake-blender's People

Contributors

jwnimmer-tri, seancurtis-tri, zachfang


drake-blender's Issues

Regression test for video writing

The ball_bin example's video writing has no regression testing in our CI.

I assumed that because the --still mode worked, everything else would too, since VideoWriter is tested in Drake.

The problem is that import cv2 in Drake CI comes from the Ubuntu package, which has more codecs than the pip version of cv2, so in fact we were failing.

We should have a regression test to catch this in the future.

We will probably need RobotLocomotion/drake#19447 to land before the regression test would pass.
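As a starting point, even a small smoke test of the pip-installed cv2 would have caught this; a minimal sketch, assuming the example writes mp4 video (the codec and frame size here are illustrative):

import cv2

# Confirm that this cv2 build can actually open a VideoWriter with the
# codec the example relies on; the Ubuntu package succeeding is no
# guarantee that the pip wheel will.
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
writer = cv2.VideoWriter("/tmp/smoke.mp4", fourcc, 10.0, (64, 48))
assert writer.isOpened(), "cv2 lacks the required codec"
writer.release()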

blender server doesn't actively make use of GPU

When running the blender server and rendering with the Cycles renderer, I found that the default configuration was CPU-bound. Within Blender, I could configure the preferences to use the GPU (which makes a huge performance difference).

Following the guidance from a Stack Overflow question, I was able to introduce a --bpy_settings_file with the following code:

if bpy.data.scenes[0].render.engine == "CYCLES":
    # Tell the Cycles add-on to use CUDA compute and render on the GPU.
    cycles_prefs = bpy.context.preferences.addons["cycles"].preferences
    cycles_prefs.compute_device_type = "CUDA"
    bpy.context.scene.cycles.device = "GPU"
    # Refresh the device list, then enable every detected device.
    cycles_prefs.get_devices()
    for d in cycles_prefs.devices:
        d["use"] = 1

This successfully changed the rendering to be GPU accelerated.

Ideally, the server should be doing this for us.

Naively, we could simply stash this code into the server. At this point, it's not clear if that could be harmful:

  • What if the user has a card where "OPTIX" acceleration would be better than "CUDA"?
  • What if the user has a driver problem?

We might consider simply adding the code stanza along with a flag called --disable_auto_gpu. It would at least allow a user to opt out if it doesn't do what they want.

Finally, the stanza above is a bit of a black-magic incantation; it bears a bit more investigation to discover whether all of it is necessary and helpful.
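On the OPTIX question above, a rough illustration of what an auto-setup might try (untested, and not part of the server today): attempt the OPTIX backend first and fall back to CUDA.

import bpy

cycles_prefs = bpy.context.preferences.addons["cycles"].preferences
for device_type in ("OPTIX", "CUDA"):
    try:
        cycles_prefs.compute_device_type = device_type
    except TypeError:
        continue  # This Blender build doesn't offer the backend at all.
    cycles_prefs.get_devices()
    if any(d.type == device_type for d in cycles_prefs.devices):
        bpy.context.scene.cycles.device = "GPU"
        break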

Depth and label in example

In some example (probably ball_bin), we should also demonstrate the capability for depth and label images.

We could produce one video with all three image types side-by-side using ConcatenateImages, ColorizeDepthImage, etc. Possibly this should be a built-in feature of the AddCameraConfig toolbox.

Fix kTooClose handling

Per code review, it seems like the depth return for kTooClose is not implemented correctly.

We should add unit tests and then (presumably) fix the code.
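For reference, a minimal sketch of the expected behavior, assuming Drake's 32-bit-float depth convention (kTooClose = 0.0, kTooFar = infinity); the helper name is hypothetical:

import math

K_TOO_CLOSE = 0.0
K_TOO_FAR = math.inf

def clamp_depth(depth: float, min_range: float, max_range: float) -> float:
    # Returns nearer than the sensor's minimum range map to kTooClose;
    # farther than the maximum range map to kTooFar.
    if depth < min_range:
        return K_TOO_CLOSE
    if depth > max_range:
        return K_TOO_FAR
    return depth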

Test example in CI

As a follow-up from #5 and #9, we should enable the example smoke test in CI.

That will require adding a display server to the CI environment.

Release 0.2.0

I think we're ready for the next stable tag.

@SeanCurtis-TRI do you agree?

If yes, I'll work on it and start to make a playbook (it's only a few clicks).

Add CI

Use GitHub Actions.

Try to make sure that pip (PyPI) downloads are cached, if we can. Maybe GitHub has docs for this.

Likewise for the Bazel disk cache, though for the moment our build has almost no steps at all, so that probably isn't buying us much. If we end up needing to rebuild Blender from source, then we'd want it.

Explain how to test a Drake canary

Currently, our examples are pinned to run against the latest stable version of Drake.

During development, it's sometimes helpful to run against a WIP build of Drake instead, to see the effect of Drake changes.

We should make it as easy as practical to do that, and document how to do it.

Test coverage for different camera models

Our current tests use a single "camera model" (width, height, center, fov, etc.). We should expand our test coverage to ensure that any model the user specifies ends up being correctly applied.

Resolve ambiguity around RPC glTF importing

We apply a 90-degree rotation after importing the glTF we received over RPC. It is not sufficiently clear why we do this.

Is Blender importing the file incorrectly? Or is its importer documentation unclear about applying extra rotations that we need to undo? Or is Drake exporting the scene incorrectly? It has to be one of those, so we need to figure out which one and (at least) write it down. If it's Drake doing it wrong, we'll also need to fix that.

Sean's comment from review:

        # Rotate everything +90 degrees about the global X axis after import.
        bpy.ops.transform.rotate(
            value=math.pi / 2, orient_axis='X', orient_type='GLOBAL'
        )

Does this feel like a defect in VTK's glTF exporter? glTF is documented as y-up; Blender is z-up. If we need to rotate positive 90 degrees around the x-axis, it's because Blender correctly applied the negative 90-degree rotation to go from y-up back to z-up. Admittedly, as long as the server knows about the idiosyncrasies of the client's glTF flavor, they can form a coherent whole. But it does mean we're producing non-compliant glTF files.

Document some performance-related tips

Teach users how to use output_delay with RgbdSensorAsync.

Teach users how to set up an nginx proxy so that multiple concurrent servers can sit behind the same RPC URL.
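For the first tip, here is a hedged sketch; the argument order reflects my reading of Drake's RgbdSensorAsync and may differ across releases, and builder, scene_graph, parent_id, and color_camera are assumed to exist in the surrounding program:

from pydrake.math import RigidTransform
from pydrake.systems.sensors import RgbdSensorAsync

# Capture a frame every 1/10 s, but publish it output_delay seconds later,
# so the slow Blender RPC overlaps with simulation instead of blocking it.
sensor = builder.AddSystem(RgbdSensorAsync(
    scene_graph,                # the diagram's SceneGraph
    parent_id,                  # FrameId the camera is welded to
    RigidTransform(),           # X_PB: camera pose in the parent frame
    fps=10.0,
    capture_offset=0.0,
    output_delay=0.09,          # just under the 0.1 s capture period
    color_camera=color_camera,  # a ColorRenderCamera
))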

Need autoformatters

Accepting contributions from the community will be easier if we don't need to deal with code formatting.

  • We should adopt a one-true-style tool (e.g., black) and integrate it as a bazel test //... linter.
  • We should integrate isort.
  • We should integrate buildifier.

Load *.blend file for environment

Using a command-line argument, the user should be able to load an optional *.blend file into the server in order to add static elements to the scene, set lighting, etc. This allows the user to establish elements of the scene beyond the scope of what Drake's SceneGraph can express over the RPC protocol.
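A rough sketch of how the server might wire this up (the --blend_file flag name is an assumption, not a final design):

import argparse

import bpy

parser = argparse.ArgumentParser()
parser.add_argument(
    "--blend_file", default=None,
    help="Optional *.blend file to load before serving any renders.")
args = parser.parse_args()

if args.blend_file is not None:
    # Replace the default startup scene with the user's environment
    # (static geometry, lighting, world settings, etc.).
    bpy.ops.wm.open_mainfile(filepath=args.blend_file)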

Port conflicts in tests

PR #21 failed post-merge CI:

Port 8000 is in use by another program. Either identify and stop that program, or start the server with a different port.

The example test (ball_bin_test) and the new test (server_test) both try to claim port 8000. If Bazel runs them concurrently, they will fail.

Each test program should choose a unique port number, either by hard-coding or by something like hashing the program name into a unique integer.
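A sketch of the hashing flavor (the helper is hypothetical; zlib.crc32 is used because Python's built-in hash() is randomized per process):

import zlib

def unique_test_port(test_name: str, base: int = 10000, span: int = 20000) -> int:
    # Deterministically map the test's name into [base, base + span).
    return base + zlib.crc32(test_name.encode("utf-8")) % span

print(unique_test_port("server_test"), unique_test_port("ball_bin_test"))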

glTF parsing is *frightfully* slow

Problem statement

Importing the glTF file can dominate performance.

Quantify the problem

I'm working on a scenario trying to get the rendering as fast as possible.

I did some profiling and found (for a given set of rendering configurations) that the majority of the server's time was spent importing the glTF file. I took a bunch of measurements of the time the server spent handling a request. We'll see a bunch of tables in the following format:

X seconds - Request handling over N measurements.
  X seconds - Blender operations over N measurements.
    X seconds - Scene init (reset, script, glTF load) over N measurements.
      X seconds - Scene reset over N measurements.
      X seconds - Load blend over N measurements.
      X seconds - Blender script over N measurements.
      X seconds - Load glTF over N measurements.
    X seconds - Scene prep over N measurements.
    X seconds - Render over N measurements.
  • The times are averages; N is the number of measurements taken (to be interpreted as N rendering requests).
  • Measurements are hierarchical. For a reported time, any times indented below it represent a portion of that reported time.

In my particular scenario, I had a robot operating in an environment. For visualization reasons, the environment geometry is included in SceneGraph geometry. However, the environment geometry does not move. When it comes to rendering in drake-blender, we have two choices:

  1. Simply pass the SceneGraph geometry through to the server via the .gltf file, and make the .blend file bare bones.
  2. Filter out all the environment geometries from SceneGraph and only include the robot model in the glTF file. The .blend file includes a Blender version of all of the static environment geometries.

Probing those two cases and measuring the server behavior, we get the following:

11.864986 seconds - Request handling over 4 measurements.
  11.590297 seconds - Blender operations over 4 measurements.
    8.446637 seconds - Scene init (reset, script, glTF load) over 4 measurements.
      0.032361 seconds - Scene reset over 4 measurements.
      0.098421 seconds - Load blend over 4 measurements.
      0.011363 seconds - Blender script over 4 measurements.
      8.304475 seconds - Load glTF over 4 measurements.
    0.001303 seconds - Scene prep over 4 measurements.
    3.142343 seconds - Render over 4 measurements.

Table 1: Bare .blend file with all geometries transported via glTF -- glTF file approximately 256 MB.

5.413307 seconds - Request handling over 57 measurements.
  5.357982 seconds - Blender operations over 57 measurements.
    2.173868 seconds - Scene init (reset, script, glTF load) over 57 measurements.
      0.037553 seconds - Scene reset over 57 measurements.
      0.234062 seconds - Load blend over 57 measurements.
      0.011327 seconds - Blender script over 57 measurements.
      1.890909 seconds - Load glTF over 57 measurements.
    0.000771 seconds - Scene prep over 57 measurements.
    3.183332 seconds - Render over 57 measurements.

Table 2: Populated .blend file with only robot geometries transported via glTF -- glTF file approximately 45 MB.

Observations:

  • Blender loads .blend files very quickly: 10 ms for an empty file, 230 ms for a massive file.
  • Rendering time is indistinguishable.
  • glTF import is crazy slow: 1.89 s for robot only, 8.3 s for robot and environment.
  • I haven't explored the glTF file to determine whether any part of it is particularly expensive (e.g., number of nodes, number of triangles, amount of texture data). It's conceivable that not every byte comes with the same cost.
  • I have explored using a .glb instead of a .gltf file. Unsurprisingly, loading a .glb file is faster, but the benefit is a constant 1 s chopped off the cost of the parse: 2 seconds becomes 1 second, and 8 seconds becomes 7.
  • The 3-second render time is already highly optimized -- before tweaking the rendering settings, it was 23 seconds.

Possible solutions

Server caching

The basic idea is that the server maintains state from rendering to rendering and only updates poses.

Challenges:

  • The server would need to be informed when a new scene comes in, and when a scene is an update of a previous load.
    • Possibly make use of the scene id we currently broadcast?
  • In order to just update poses, it would have to extract node poses from the .gltf file.
    • It's not clear how fast a JSON parser would make that data available (compared to whatever it is Blender is doing). Still to be tested; see the sketch after this list.
    • Mapping may be a problem. Blender has very strict name-uniqueness requirements: every name is globally unique. SceneGraph has much weaker name requirements, and glTF has none at all. The act of importing a glTF file can silently change names. So we'd need some form of robust data that would allow us to successfully identify nodes across glTF files w.r.t. arbitrary .blend files.
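To gauge the JSON-parsing question above, a minimal sketch, assuming node poses live as the optional translation/rotation fields on the glTF's top-level nodes array (per the glTF 2.0 spec, rotation is an XYZW quaternion):

import json

with open("scene.gltf") as f:
    gltf = json.load(f)

# Collect each node's pose, defaulting to the identity when absent.
poses = {
    node.get("name", f"node{i}"): (
        node.get("translation", [0.0, 0.0, 0.0]),
        node.get("rotation", [0.0, 0.0, 0.0, 1.0]),
    )
    for i, node in enumerate(gltf.get("nodes", []))
}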

Multi-glTF API

More closely model the RenderEngine API in terms of "register geometry", "pose geometry", "remove geometry", and "render".

A geometry being registered would still be a glTF file. But it would also be accompanied by a unique client-owned identifier. The server would then map the identifier to whatever artifacts the glTF creates in the server. Thus when pose and geometry commands come with id information, the server can use its own maps to do whatever it needs to do.
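A rough server-side sketch of that bookkeeping (the map and function names are hypothetical):

import bpy

# Map each client-owned geometry id to the Blender object names created
# by importing that geometry's glTF.
registered_geometry: dict[str, list[str]] = {}

def register_geometry(geometry_id: str, gltf_path: str) -> None:
    before = {obj.name for obj in bpy.data.objects}
    bpy.ops.import_scene.gltf(filepath=gltf_path)
    registered_geometry[geometry_id] = sorted(
        obj.name for obj in bpy.data.objects if obj.name not in before)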

Challenges:

  • This is a completely different client-server API.
  • It goes from processing one glTF to a host of glTFs (although it is possible to embed the unique identifier within a glTF, so we could put many geometries and their identifiers in a single glTF).

Add example

In some kind of examples subdirectory, show a demonstration of how to render a Drake scene using the Blender server. Use pydrake from pip.

Desired:

  • A few objects moving under Drake dynamics.
  • Demonstrate loading a *.blend file.
  • Create a visually interesting movie (e.g., moving camera using a trajectory).
  • TRI branding and/or watermarking, where possible.

Upgrade to Blender 4.0

Our tests don't pass with Blender 4.0. It looks like depth and label images fail to configure the rendering as we want -- they still have normal coloring, just like the color image.

In #65, I've pinned to Blender 3.6 to work around this for now.
