Comments (19)

jstone-lucasfilm commented on May 29, 2024

Thanks for this question, @howetuft. My main experience in this area is with MaterialX renders to virtual framebuffers in GitHub Actions:

https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/.github/workflows/main.yml#L252

I'm also CC'ing @kwokcb, who has done initial work on rendering MaterialX content from Jupyter notebooks, and may have some good advice to offer.

from materialx.

kwokcb commented on May 29, 2024

Based on the error, which comes from here, it appears that GLAD cannot be initialized.

I don't have a Linux environment, but it seems you have an X Display set up properly, so I guess a check to see that the appropriate OpenGL libraries are installed and runnable on an X Display would help narrow the issue down. I think you can test this by running the MaterialXView and / or MaterialXNodeEditor which you can download from a distribution.

If you have this built locally then you can run MaterialXTest [renderglsl] from the build/bin area. That runs the same code as the Python wrappers.

Hope this helps. Ping if you need more follow-up.
(BTW, I have run this Python wrapper locally on Mac and Windows in virtual envs)

kwokcb commented on May 29, 2024

BTW, this is what I do for my renderer class, which is the same as the sample code.

howetuft commented on May 29, 2024

Hello,
Thank you for your quick response!

> a check to see that the appropriate OpenGL libraries are installed and runnable on an X Display would help narrow the issue down.

My OpenGL has been operational for years, so I have few doubts about it, but I ran eglinfo -B -p x11 and here is the result:

X11 platform:
EGL API version: 1.5
EGL vendor string: Mesa Project
EGL version string: 1.5
EGL client APIs: OpenGL OpenGL_ES 
OpenGL core profile vendor: Intel
OpenGL core profile renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
OpenGL core profile version: 4.6 (Core Profile) Mesa 24.0.3-arch1.1
OpenGL core profile shading language version: 4.60
OpenGL compatibility profile vendor: Intel
OpenGL compatibility profile renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
OpenGL compatibility profile version: 4.6 (Compatibility Profile) Mesa 24.0.3-arch1.1
OpenGL compatibility profile shading language version: 4.60
OpenGL ES profile vendor: Intel
OpenGL ES profile renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
OpenGL ES profile version: OpenGL ES 3.2 Mesa 24.0.3-arch1.1
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20

glxgears also runs perfectly.

> I think you can test this by running the MaterialXView and / or MaterialXNodeEditor which you can download from a distribution.

I confirm that, outside of any virtual environment:

  • my Python script above runs like a charm

  • and so do the official programs from the distribution, MaterialXView and MaterialXNodeEditor

Moreover, I've been using MaterialX for 4 months (developing something based on it...). I was fully satisfied and encountered no issues until yesterday, when I decided to move everything to a virtual env for deployment reasons...

> If you have this built locally then you can run MaterialXTest [renderglsl] from the build/bin area. That runs the same code as the Python wrappers.

Sorry, I installed MaterialX from the Arch Linux repo (sudo pacman -S materialx) rather than building it from source.

> Based on the error, which comes from here, it appears that GLAD cannot be initialized.

I fully agree with you!
If I may, I would particularly suspect this function: https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L88
For Linux (and only for Linux), it tries to dynamically load libGL.so or libGL.so.1. However, this lib is not installed in the virtual env, so I wonder if the interpreter is able to find it in the system.

> (BTW, I have run this Python wrapper locally on Mac and Windows in virtual envs)

That's why I suspect int open_gl(void), which behaves differently depending on the OS. Your opinion?
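
For anyone who wants to check this from inside the virtual environment, here is a minimal Python sketch of the lookup I have in mind. It only mimics the idea (try libGL.so.1, then libGL.so) with ctypes; it is not glad's actual code, and the helper name first_loadable is hypothetical.

```python
import ctypes

# Names assumed to match what glad tries on Linux, per the discussion above.
CANDIDATES = ("libGL.so.1", "libGL.so")


def first_loadable(names=CANDIDATES):
    """Return the first library name the dynamic loader can open, or None."""
    for name in names:
        try:
            ctypes.CDLL(name)  # uses dlopen() under the hood on Linux
        except OSError:
            continue
        return name
    return None


if __name__ == "__main__":
    print("loadable libGL:", first_loadable())
```

If this prints a library name both inside and outside the venv, the library itself is being found, and the failure has to come from somewhere else.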

howetuft commented on May 29, 2024

@kwokcb @jstone-lucasfilm
OK, I think I found the bug. I confirm it is related to libGL.so.1 being omitted from the wheel.

Indeed, if I copy (and patch somewhat) libGL.so.1 into the virtual environment, the Python script now runs smoothly.

The bash script becomes:

# Create virtual env (same as before)
python -m venv .venv
source .venv/bin/activate
pip install MaterialX

# Copy and patch libGL.so.1 (ugly, just for demo)
cd .venv/lib/python3.11/site-packages/materialx.libs
cp /usr/lib/libGL.so.1 .
patchelf --replace-needed libGLdispatch.so.0 libGLdispatch-dcc1ca97.so.0.0.0 libGL.so.1
patchelf --replace-needed libGLX.so.0 libGLX-404aa684.so.0.0.0 libGL.so.1

(Note: without the ELF patch, libGL.so.1 would keep pointing to the system libGLdispatch.so and libGLX.so libs instead of the local ones, and it would still fail...)

At this stage, the Python script now runs without error.

Of course, this copy-and-patch workaround is ugly; it is just for demo.

But it allows me to conclude that you (or your PyPI packager) should include a libGL.so.1 in the wheel for Linux.
--> Are you OK with that, and would it be possible for you to make the fix?

jstone-lucasfilm commented on May 29, 2024

That sounds like a good catch, @howetuft, and I'm open to that fix, but it's a bit beyond my own knowledge to include that OpenGL file in our Python distributions.

Would you have the bandwidth to propose that as a pull request to GitHub, and we can review the changes for our next release?

howetuft commented on May 29, 2024

Certainly! Or I can try, at least...

kwokcb commented on May 29, 2024

I think you need to add a dependency in pyproject.toml? Something like this:

[options.extras_require]
# Add dependencies specific to Linux
linux = ["libGL"]

Adding in @JeanChristopheMorinPerso for comment as he set this up.

howetuft commented on May 29, 2024

Unfortunately, this doesn't work, as far as I know. No matter what you do, libGL.so cannot be included in the wheel, because it is externally provided according to the policy (https://peps.python.org/pep-0599/#the-manylinux2014-policy).
But I think I may have found another solution, which I'll submit to you this weekend.

JeanChristopheMorinPerso commented on May 29, 2024

dlopen should be able to find it. The lookup is delegated to ld.so (https://man7.org/linux/man-pages/man3/dlopen.3.html). In our case, PyMaterialXRenderGlsl.cpython-39-x86_64-linux-gnu.so has a DT_RPATH set to $ORIGIN:$ORIGIN/../materialx.libs, so ld.so will try those directories first, fail to find libGL there, and then look in the usual system directories.
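
To make that search order concrete, here is a small Python sketch that expands $ORIGIN entries relative to a module's location. The install path in the example is hypothetical, and the sketch ignores everything else the real loader consults (LD_LIBRARY_PATH, the ld.so cache, default directories).

```python
import posixpath


def expand_rpath(rpath, module_path):
    """Expand $ORIGIN entries of an RPATH string relative to module_path."""
    origin = posixpath.dirname(module_path)
    return [
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath.split(":")
        if entry
    ]


if __name__ == "__main__":
    # Hypothetical install location, for illustration only.
    module = (
        "/venv/lib/python3.9/site-packages/MaterialX/"
        "PyMaterialXRenderGlsl.cpython-39-x86_64-linux-gnu.so"
    )
    for directory in expand_rpath("$ORIGIN:$ORIGIN/../materialx.libs", module):
        print(directory)
```

Neither of the two resulting directories contains a libGL.so.1 in the published wheel, which is why the lookup falls through to the system copy.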

Maybe the problem is with symbol conflicts? The wheel bundles libOpenGL. From what I understand, both libGL and libOpenGL export the same symbols? I might be missing something here. But I'm pretty sure we can't and shouldn't bundle libGL (because, as @howetuft said, it is forbidden by PEP 599).

howetuft commented on May 29, 2024

> dlopen should be able to find it.

Yes, and it does find it (in glad)!

> Maybe the problem is with symbol conflicts?

My answer would be yes, because ldd inspection shows the following:

  • externally provided libGL.so.1 links to system libGLX.so.0 and libGLdispatch.so.0
  • whereas wheel's PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so links to wheel's libGLX-404aa684.so.0.0.0 and libGLdispatch-dcc1ca97.so.0.0.0

So, when libGL.so.1 is dlopened by glad, there is some kind of conflict between the system and wheel versions of the two libs. And the conflict disappears when we patch libGL to require the wheel's versions of the libs; see my (ugly) workaround above.
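
If you want to reproduce the ldd inspection, here is a rough Python sketch. It assumes ldd is on the PATH; the helper names parse_ldd and gl_dependencies are mine, and the paths you pass in will depend on your own venv.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in ldd output to the path it resolves to."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skips entries such as linux-vdso and the loader itself
        name, _, target = line.partition("=>")
        # Drop the trailing load address, e.g. " (0x00007f...)".
        resolved[name.strip()] = target.strip().split(" (")[0]
    return resolved


def gl_dependencies(shared_object):
    """Run ldd on a shared object and keep only the libGL* entries."""
    result = subprocess.run(
        ["ldd", shared_object], capture_output=True, text=True, check=True
    )
    return {
        name: path
        for name, path in parse_ldd(result.stdout).items()
        if name.startswith("libGL")
    }
```

Calling gl_dependencies on /usr/lib/libGL.so.1 and on the wheel's PyMaterialXRenderGlsl module, then comparing the two results, should show the system paths on one side and the materialx.libs copies on the other.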

> But I'm pretty sure we can't and shouldn't bundle libGL

Yes, I was wrong to suggest that in my first posts, my apologies. libGL is and will remain externally provided, due to PEP 599, and we must adapt the code around that.

So, now, my suggested solution:
My proposal would be to make PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so depend on the system libGLX.so.0 and libGLdispatch.so.0.

This is made possible by a FindOpenGL parameter named OpenGL_GL_PREFERENCE, documented here:
https://cmake.org/cmake/help/latest/module/FindOpenGL.html#linux-specific

The expected result is achieved by setting this parameter to LEGACY: that is what I propose in PR #1766
Please note this setting would be restricted to the specific LINUX + SKBUILD context.

What do you think?

JeanChristopheMorinPerso commented on May 29, 2024

I'm unfortunately not very familiar with the OpenGL stack. On my Arch Linux system, libGL.so comes from https://gitlab.freedesktop.org/glvnd/libglvnd, which states:

> libGL is a wrapper library to libGLdispatch and libGLX which is provided for
> backwards-compatibility with applications which link against the old ABI.
>
> Note that since all OpenGL functions are dispatched through the same table in libGLdispatch, it doesn't matter which library is used to find the entrypoint. The same OpenGL function in libGL, libOpenGL, libGLES, and the function pointer returned by glXGetProcAddress are all interchangeable.

This makes me wonder, should MaterialX really use the old ABI? Though, it seems to be coming from https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L97, which seems like a vendored dependency, so I don't think we should change that...

> So, now, my suggested solution:
> My proposal would be to make PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so depend on the system libGLX.so.0 and libGLdispatch.so.0.
>
> This is made possible by a FindOpenGL parameter named OpenGL_GL_PREFERENCE, documented here:
> https://cmake.org/cmake/help/latest/module/FindOpenGL.html#linux-specific
>
> The expected result is achieved by setting this parameter to LEGACY: that is what I propose in PR #1766
> Please note this setting would be restricted to specific LINUX + SKBUILD context.

This won't fix the problem, at least setting OpenGL_GL_PREFERENCE won't. From what I see, the build system already links against the system libGLX.so.0 and libGLdispatch.so.0. The tool used to create the final wheel (cibuildwheel) uses https://github.com/pypa/auditwheel to bundle any required libraries. It's the thing that changes the RPATH, bundles the libs in materialx.libs, etc.

We could instruct it not to bundle those libraries, but we would end up with a broken wheel (like we have today)...

JeanChristopheMorinPerso commented on May 29, 2024

Basically, what I'm currently thinking is that changing https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L97 to use libOpenGL instead of libGL would potentially fix the problem...

howetuft commented on May 29, 2024

> This won't fix the problem, at least setting OpenGL_GL_PREFERENCE won't.

Did you test my proposal? On my system, it fixes the problem: Auditwheel does not bundle the libs anymore, and PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so only links to system libs.

--> Can you please try it on your own system before rejecting?

> This makes me wonder, should MaterialX really use the old ABI?

As you know, the wheel is compiled in a CentOS 7 environment, dating from 2014. The new ABI (GLVND) was introduced in 2017, if I'm right.
So my answer is: yes, the Python packaged version of MaterialX is better suited to the old ABI than to the new one.

> glad.c, which seems like a vendored dependency,

It is not a vendor dependency, it is an official dynamic loader from Khronos OpenGL: glad.

> We could instruct it not to bundle those libraries, but we would end up with a broken wheel (like we have today)...

That's the miracle: OpenGL_GL_PREFERENCE=LEGACY results in auditwheel not bundling libGL* into the wheel, and yet... this does not break the wheel!
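
To check this on your side after building, here is a small Python sketch that lists any libGL* files bundled under a *.libs directory of a wheel. The helper name bundled_gl_libs is mine, and the wheel filename in the usage note is only a placeholder.

```python
import zipfile


def bundled_gl_libs(wheel_path):
    """Return the libGL* files bundled under a *.libs directory of a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            # Only the libGL* family is checked, as discussed in this thread.
            if ".libs/" in name and name.rsplit("/", 1)[-1].startswith("libGL")
        )
```

Running bundled_gl_libs("path/to/your.whl") on a wheel built with the LEGACY preference should return an empty list if the libs are no longer bundled; on the currently published wheel I would expect it to list the libGLX and libGLdispatch copies.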

> Basically, what I'm currently thinking is that changing https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L97 to use libOpenGL instead of libGL would potentially fix the problem...

I'm not sure we should tamper with glad.c. It is a widely used solution, nearly an industry standard, so it should work as-is. Moreover, if we patch it in any way, this would prevent it from being regenerated in the future and break compatibility with potential future evolutions.

However, to extend your thoughts, there is a question we can ask:

Why should PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so be statically linked to libGLX and libGLdispatch, while MaterialX uses a dynamic loader at the same time?

Those two approaches, static and dynamic, should be mutually exclusive, in my understanding...

So, maybe another solution would be to remove all static dependencies on libGL* in favor of the dynamic loader. However, this would have a greater impact on the code (and is not guaranteed to succeed)...

howetuft commented on May 29, 2024

Hello guys,

Any news from your end?

Let me summarise my proposal:

Issue
In the current situation, when building a Python wheel for MaterialX under Linux:

  • the libGL.so is provided by the host system, as an unavoidable consequence of the standard (PEP 599)
  • but the libGLdispatch.so and libGLX.so are provided by the build environment (manylinux2014), as a consequence of build settings.

This results in conflicts between libs and breaks the wheel.

Proposed solution
When building a Python wheel under Linux, let's set the CMake OpenGL_GL_PREFERENCE parameter to LEGACY so that the wheel links to the host system's libGLdispatch.so and libGLX.so, rather than to the build system's ones. PR #1766

Key points

  • Works perfectly in the environments I tested (Arch Linux and Ubuntu). But I'll let you try on your own.
  • Low risk:
    • Modification restricted to the build case of "building a wheel under Linux". No impact on all other cases.
    • No impact on the code itself, just a build parameter
    • Very localised modification: 4 lines in one CMakeLists.txt.

Additional point
Please note that I need this bug fixed quickly so that I can deploy a feature in my own code, which requires MaterialX to be installed in a Python virtual environment.

Moreover, Linux distros are increasingly restricting the use of pip to virtual environments, by declaring the Python system environment "externally managed" (https://packaging.python.org/en/latest/specifications/externally-managed-environments/), so I think my case is likely to become more and more common.
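
As a quick way to see which situation a given interpreter is in, here is a minimal Python sketch. It assumes the marker file described in the linked specification (EXTERNALLY-MANAGED in the stdlib directory); the function names are mine.

```python
import os
import sys
import sysconfig


def in_virtualenv():
    """True when running inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def externally_managed():
    """True when the stdlib directory carries an EXTERNALLY-MANAGED marker file."""
    stdlib_dir = sysconfig.get_path("stdlib")
    return os.path.isfile(os.path.join(stdlib_dir, "EXTERNALLY-MANAGED"))


if __name__ == "__main__":
    print("virtual env:", in_virtualenv())
    print("externally managed:", externally_managed())
```

On a distro that marks its system Python this way, users are effectively pushed into a venv for pip install MaterialX, which is the case that currently fails.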

--> Therefore, can you get back to me quickly with a solution?

I obviously promote my own proposal, but I would equally welcome any solution that fixes the bug, as long as it can be implemented in the short term.

Thank you in advance for your responsiveness!

jstone-lucasfilm commented on May 29, 2024

@howetuft Your proposed fix in #1766 looks very reasonable to me, though I defer to @JeanChristopheMorinPerso and @kwokcb on whether this is the best path forward.

Ideally we'd like to consider our main branch locked for the upcoming MaterialX 1.38.10 release, but we're very open to proposed fixes in the dev_1.39 branch, which will be our main focus leading up to SIGGRAPH 2024.

jstone-lucasfilm commented on May 29, 2024

Since the change proposed in #1766 is not intrusive, only affecting library preferences for Linux Python wheels, I'd be open to including this in our upcoming 1.38.10 release, if @JeanChristopheMorinPerso and @kwokcb don't have objections.

Let me know what your thoughts are.

jstone-lucasfilm commented on May 29, 2024

Thanks for this report, @howetuft, as well as for the fix in #1766!

howetuft commented on May 29, 2024

Thanks to you, and thank you also for this outstanding software!
