Comments (19)
Thanks for this question, @howetuft. My main experience in this area is with MaterialX renders to virtual framebuffers in GitHub Actions:
https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/.github/workflows/main.yml#L252
I'm also CC'ing @kwokcb, who has done initial work on rendering MaterialX content from Jupyter notebooks, and may have some good advice to offer.
from materialx.
Based on the error, which comes from here, it appears that GLAD cannot be initialized.
I don't have a Linux environment, but it seems you have an X Display set up properly, so I guess a check to see that the appropriate OpenGL libraries are installed and runnable on an X Display would help narrow the issue down. I think you can test this by running the MaterialXView and / or MaterialXNodeEditor which you can download from a distribution.
If you have this built locally then you can run MaterialXTest [renderglsl] from the build/bin area. That runs the same code as the Python wrappers.
Hope this helps. Ping if you need more follow-up.
(BTW, I have run this Python wrapper locally on Mac and Windows in virtual envs)
BTW, this is what I do for my renderer class, which is the same as the sample code.
Hello,
Thank you for your quick response!
a check to see that the appropriate OpenGL libraries are installed and runnable on an X Display would help narrow the issue down.
My OpenGL has been operational for years, so I have few doubts about it, but I ran eglinfo -B -p x11 and here is the result:
X11 platform:
EGL API version: 1.5
EGL vendor string: Mesa Project
EGL version string: 1.5
EGL client APIs: OpenGL OpenGL_ES
OpenGL core profile vendor: Intel
OpenGL core profile renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
OpenGL core profile version: 4.6 (Core Profile) Mesa 24.0.3-arch1.1
OpenGL core profile shading language version: 4.60
OpenGL compatibility profile vendor: Intel
OpenGL compatibility profile renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
OpenGL compatibility profile version: 4.6 (Compatibility Profile) Mesa 24.0.3-arch1.1
OpenGL compatibility profile shading language version: 4.60
OpenGL ES profile vendor: Intel
OpenGL ES profile renderer: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
OpenGL ES profile version: OpenGL ES 3.2 Mesa 24.0.3-arch1.1
OpenGL ES profile shading language version: OpenGL ES GLSL ES 3.20
glxgears also runs perfectly.
I think you can test this by running the MaterialXView and / or MaterialXNodeEditor which you can download from a distribution.
I confirm that, outside of any virtual environment:
- my Python script above runs like a charm
- and so do the other official programs from the distribution, MaterialXView and MaterialXNodeEditor
Moreover, I've been using MaterialX for 4 months (developing something based on it...). I was fully satisfied and encountered no issues until yesterday, when I decided to move everything to a virtual env for deployment reasons...
If you have this built locally then you can run MaterialXTest [renderglsl] from the build/bin area. That runs the same code as the Python wrappers.
Sorry, I installed MaterialX from the Arch Linux repo (sudo pacman -S materialx), not built from source.
Based on the error, which comes from here, it appears that GLAD cannot be initialized.
I fully agree with you!
If I may, I would particularly suspect this function: https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L88
For Linux (and only for Linux), it tries to dynamically load libGL.so or libGL.so.1. However, this lib is not installed in the virtual env, so I wonder if the interpreter is able to find it in the system.
(BTW, I have run this Python wrapper locally on Mac and Windows in virtual envs)
That's why I suspect int open_gl(void), which behaves differently depending on the OS. Your opinion?
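The lookup suspected here can be reproduced from Python as a quick diagnostic. This is a minimal sketch, assuming the loader tries libGL.so.1 and then libGL.so through the dynamic linker, as described in this comment. The helper name first_loadable is hypothetical, and ctypes.CDLL stands in for the dlopen call made by the C code.

```python
import ctypes


def first_loadable(names):
    """Return the first library name the dynamic linker can load, or None."""
    for name in names:
        try:
            # RTLD_GLOBAL is an assumption about how the real loader opens
            # the library (symbols made visible process-wide).
            ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # The linker could not find or load this candidate; try the next.
            continue
        return name
    return None


if __name__ == "__main__":
    # Candidate names taken from the discussion above.
    print(first_loadable(["libGL.so.1", "libGL.so"]))
```

Running it with the virtual environment activated and then deactivated shows whether the interpreter can locate the system library in both cases.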
@kwokcb @jstone-lucasfilm
OK, I think I found the bug. I confirm it is related to your omitting libGL.so.1 in the wheel.
Indeed, if I copy (and patch somewhat) libGL.so.1 into the virtual environment, the Python script now runs smoothly.
The bash script becomes:
# Create virtual env (same as before)
python -m venv .venv
source .venv/bin/activate
pip install MaterialX
# Copy and patch libGL.so.1 (ugly, just for demo)
cd .venv/lib/python3.11/site-packages/materialx.libs
cp /usr/lib/libGL.so.1 .
patchelf --replace-needed libGLdispatch.so.0 libGLdispatch-dcc1ca97.so.0.0.0 libGL.so.1
patchelf --replace-needed libGLX.so.0 libGLX-404aa684.so.0.0.0 libGL.so.1
(Note: without the ELF patch, libGL.so.1 would still point to the system libGLdispatch.so and libGLX.so libs instead of the local ones, and it would still fail...)
At this stage, the Python script now runs without error.
Of course, this copy-and-patch workaround is ugly; it is just for demo.
But it allows me to conclude that you (or your PyPI packager) should include a libGL.so.1 in the wheel for Linux.
--> Are you OK with that and would it be possible for you to make the fix?
That sounds like a good catch, @howetuft, and I'm open to that fix, but it's a bit beyond my own knowledge to include that OpenGL file in our Python distributions.
Would you have the bandwidth to propose that as a pull request to GitHub, and we can review the changes for our next release?
Certainly! Or I can try, at least...
I think you need to add a dependency in pyproject.toml? Something like this:
[options.extras_require]
# Add dependencies specific to Linux
linux = ["libGL"]
Adding in @JeanChristopheMorinPerso for comment as he set this up.
Unfortunately, this doesn't work, as far as I know. Indeed, no matter what you do, libGL.so cannot be included in the wheel, because it is externally-provided according to the policy (https://peps.python.org/pep-0599/#the-manylinux2014-policy).
But I think I may have found another solution, which I'll submit to you this weekend.
dlopen should be able to find it. The lookup is handled by the dynamic linker, ld.so (https://man7.org/linux/man-pages/man3/dlopen.3.html). In our case, PyMaterialXRenderGlsl.cpython-39-x86_64-linux-gnu.so has a DT_RPATH set to $ORIGIN:$ORIGIN/../materialx.libs, so ld.so will try that first, fail, and then look in the usual system directories.
Maybe the problem is with symbol conflicts? The wheel bundles libOpenGL. From what I understand, both libGL and libOpenGL export the same symbols? I might be missing something here. But I'm pretty sure we can't and shouldn't bundle libGL (because, as @howetuft said, it is forbidden by PEP 599).
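To make the RPATH lookup order concrete, here is a minimal sketch of how $ORIGIN entries resolve to directories. The function name expand_rpath and the install path in the usage line are hypothetical; only the RPATH string comes from the comment above, and the real resolution is done by ld.so, not by Python.

```python
import posixpath


def expand_rpath(rpath, object_path):
    """Expand $ORIGIN in a colon-separated RPATH for the given shared object."""
    # $ORIGIN stands for the directory that contains the shared object itself.
    origin = posixpath.dirname(object_path)
    return [
        posixpath.normpath(entry.replace("$ORIGIN", origin))
        for entry in rpath.split(":")
        if entry
    ]


if __name__ == "__main__":
    # Hypothetical install location of the extension module inside a venv.
    module = "/venv/lib/python3.9/site-packages/MaterialX/PyMaterialXRenderGlsl.so"
    for directory in expand_rpath("$ORIGIN:$ORIGIN/../materialx.libs", module):
        print(directory)
```

These directories are searched before the usual system directories, which is why a library that is absent from them can still be found on the host.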
dlopen should be able to find it.
Yes, and it does find it (in glad)!
Maybe the problem is with symbols conflicts?
My answer would be yes, because ldd inspection shows the following:
- the externally provided libGL.so.1 links to the system libGLX.so.0 and libGLdispatch.so.0
- whereas the wheel's PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so links to the wheel's libGLX-404aa684.so.0.0.0 and libGLdispatch-dcc1ca97.so.0.0.0
So, when libGL.so.1 is dlopened by glad, there is some kind of conflict between the system and wheel versions of the two libs. And the conflict disappears when we patch libGL to require the wheel's versions of the libs; see my (ugly) workaround above.
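One way to spot this kind of double loading is to group the loaded library paths by base name and flag any name that appears from more than one location. The sketch below is illustrative only: find_duplicate_libs is a hypothetical helper, the paths are made up, and the assumption that bundled copies carry an 8-character hexadecimal suffix is inferred from the two file names quoted above.

```python
import posixpath
import re
from collections import defaultdict

# Matches "libGLX.so.0" and "libGLX-404aa684.so.0.0.0" with the same base name.
_LIB_NAME = re.compile(r"^(lib[^.]+?)(?:-[0-9a-f]{8})?\.so")


def find_duplicate_libs(paths):
    """Return {base_name: [paths]} for libraries present in more than one copy."""
    by_name = defaultdict(set)
    for path in paths:
        match = _LIB_NAME.match(posixpath.basename(path))
        if match:
            by_name[match.group(1)].add(path)
    return {name: sorted(found) for name, found in by_name.items() if len(found) > 1}


if __name__ == "__main__":
    loaded = [
        "/usr/lib/libGL.so.1",
        "/usr/lib/libGLX.so.0",
        "/usr/lib/libGLdispatch.so.0",
        "/venv/site-packages/materialx.libs/libGLX-404aa684.so.0.0.0",
        "/venv/site-packages/materialx.libs/libGLdispatch-dcc1ca97.so.0.0.0",
    ]
    for name, copies in find_duplicate_libs(loaded).items():
        print(name, copies)
```

On Linux, the list of paths could be collected from /proc/self/maps after the module has been imported.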
But I'm pretty sure we can't and shouldn't bundle libGL
Yes, I was wrong to suggest that in my first posts, my apologies. libGL is and will remain externally provided, due to PEP 599, and we must adapt the code around that.
So, now, my suggested solution:
My proposition would be to make PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so depend on system libGLX.so.0 and libGLdispatch.so.0.
This is made possible by a FindOpenGL parameter, named OpenGL_GL_PREFERENCE and documented here:
https://cmake.org/cmake/help/latest/module/FindOpenGL.html#linux-specific
The expected result is achieved by setting this parameter to LEGACY: that is what I propose in PR #1766
Please note this setting would be restricted to specific LINUX + SKBUILD context.
What would you think about that?
I'm unfortunately not very familiar with the OpenGL stack. On my Arch Linux system, libGL.so is coming from https://gitlab.freedesktop.org/glvnd/libglvnd which states:
libGL is a wrapper library to libGLdispatch and libGLX which is provided for backwards-compatibility with applications which link against the old ABI.
Note that since all OpenGL functions are dispatched through the same table in libGLdispatch, it doesn't matter which library is used to find the entrypoint. The same OpenGL function in libGL, libOpenGL, libGLES, and the function pointer returned by glXGetProcAddress are all interchangeable.
This makes me wonder, should MaterialX really use the old ABI? Though, it seems to be coming from https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L97, which seems like a vendored dependency, so I don't think we should change that...
So, now, my suggested solution:
My proposition would be to make PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so depend on system libGLX.so.0 and libGLdispatch.so.0.
This is made possible by a FindOpenGL parameter, named OpenGL_GL_PREFERENCE and documented here:
https://cmake.org/cmake/help/latest/module/FindOpenGL.html#linux-specific
The expected result is achieved by setting this parameter to LEGACY: that is what I propose in PR #1766
Please note this setting would be restricted to specific LINUX + SKBUILD context.
This won't fix the problem, at least setting OpenGL_GL_PREFERENCE won't. The build system already links against the system libGLX.so.0 and libGLdispatch.so.0, from what I see. The tool used to create the final wheel (cibuildwheel) uses https://github.com/pypa/auditwheel to bundle any libraries required. It's the thing that changes the RPATH, bundles the libs in materialx.libs, etc.
We could instruct it not to bundle those libraries, but we would end up with a broken wheel (like we have today)...
Basically, what I'm currently thinking is that changing https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L97 to use libOpenGL instead of libGL would potentially fix the problem...
This won't fix the problem, at least setting OpenGL_GL_PREFERENCE won't.
Did you test my proposal? On my system, it fixes the problem: auditwheel does not bundle the libs anymore, and PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so only links to system libs.
--> Can you please try it on your own system before rejecting?
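For anyone who wants to check this on their own build, the direct dependencies of the extension module can be listed from its dynamic section. This is a minimal sketch, assuming binutils' readelf is installed; parse_needed and needed_libraries are hypothetical helper names, and the path in the usage line is a placeholder.

```python
import re
import subprocess

# A NEEDED entry in `readelf -d` output looks like:
#  0x0000000000000001 (NEEDED)   Shared library: [libGL.so.1]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")


def parse_needed(readelf_output):
    """Extract DT_NEEDED library names from `readelf -d` text output."""
    return _NEEDED.findall(readelf_output)


def needed_libraries(path):
    """Run `readelf -d` on a shared object and return its DT_NEEDED entries."""
    result = subprocess.run(
        ["readelf", "-d", path], check=True, capture_output=True, text=True
    )
    return parse_needed(result.stdout)


if __name__ == "__main__":
    # Placeholder path: point this at the extension module inside the wheel.
    print(needed_libraries("PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so"))
```

If the output lists hashed names from materialx.libs, the libraries were bundled; if it lists plain names such as libGL.so.1, they will be resolved on the host at load time.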
This makes me wonder, should MaterialX really use the old ABI?
As you know, the wheel is compiled in a CentOS 7 environment, dating from 2014. The new ABI (GLVND) was introduced in 2017, if I'm right.
So my answer is: yes, the Python packaged version of MaterialX is better suited to the old ABI than to the new one.
glad.c, which seems like a vendored dependency,
It is not a vendored dependency; it is a dynamic loader generated by glad, a loader generator based on the official Khronos OpenGL specs.
We could instruct it not to bundle those libraries, but we would end up with a broken wheel (like we have today)...
That's the miracle: with OpenGL_GL_PREFERENCE=LEGACY, auditwheel no longer bundles libGL* into the wheel, but... doing so does not break the wheel!
Basically, what I currently thinking is that changing https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/source/MaterialXRenderGlsl/External/Glad/glad.c#L97 to use libOpenGL instead of libGL would potentially fix the problem...
I'm not sure we should tamper with glad.c. It is a widely used solution, nearly an industry standard, so it should work as-is. Moreover, if we patch it in any way, this would prevent it from being regenerated in the future and break compatibility with potential future evolutions.
However, to extend your thoughts, there is a question we can ask:
Why should PyMaterialXRenderGlsl.cpython-311-x86_64-linux-gnu.so be statically linked to libGLX and libGLdispatch, while MaterialX uses a dynamic loader at the same time?
Those two approaches, static and dynamic, should be mutually exclusive, in my understanding...
So, maybe another solution would be to remove all static dependencies on libGL* in favour of the dynamic loader. However, this would have a greater impact on the code (and is not guaranteed to succeed)...
Hello guys,
Any news from your end?
Let me summarise my proposal:
Issue
In the current situation, when building a Python wheel for MaterialX under Linux:
- libGL.so is provided by the host system, as an unavoidable consequence of the standard (PEP 599)
- but libGLdispatch.so and libGLX.so are provided by the build environment (manylinux2014), as a consequence of build settings.
This results in conflicts between libs and breaks the wheel.
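The split described in this Issue section can be illustrated by checking each dependency against the list of libraries that PEP 599 allows a manylinux2014 wheel to take from the host. This is a sketch only: the set below is transcribed from the PEP and should be checked against it, split_dependencies is a hypothetical helper, and the policy actually applied by the packaging tool may differ.

```python
# Libraries a manylinux2014 wheel may expect from the host system
# (transcribed from PEP 599; verify against the PEP before relying on it).
MANYLINUX2014_EXTERNAL = frozenset({
    "libgcc_s.so.1", "libstdc++.so.6", "libm.so.6", "libdl.so.2",
    "librt.so.1", "libc.so.6", "libnsl.so.1", "libutil.so.1",
    "libpthread.so.0", "libresolv.so.2", "libX11.so.6", "libXext.so.6",
    "libXrender.so.1", "libICE.so.6", "libSM.so.6", "libGL.so.1",
    "libgobject-2.0.so.0", "libgthread-2.0.so.0", "libglib-2.0.so.0",
})


def split_dependencies(needed):
    """Split library names into (host-provided, to-be-bundled) lists."""
    external = [name for name in needed if name in MANYLINUX2014_EXTERNAL]
    bundled = [name for name in needed if name not in MANYLINUX2014_EXTERNAL]
    return external, bundled


if __name__ == "__main__":
    deps = ["libGL.so.1", "libGLX.so.0", "libGLdispatch.so.0", "libc.so.6"]
    external, bundled = split_dependencies(deps)
    print("host-provided:", external)
    print("bundled:", bundled)
```

Under this policy libGL.so.1 stays on the host, while libGLX.so.0 and libGLdispatch.so.0 get copied into the wheel whenever the module links to them directly.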
Proposed solution
When building a Python wheel under Linux, let's set the CMake OpenGL_GL_PREFERENCE parameter to LEGACY so that the wheel links to the host system's libGLdispatch.so and libGLX.so, rather than to the build system's ones. PR #1766
Key points
- Works perfectly in the environments I tested (Arch Linux and Ubuntu). But I'll let you try on your own.
- Low risk:
  - Modification restricted to the case of "building a wheel under Linux". No impact on all other cases.
  - No impact on the code itself, just a build parameter.
  - Very localised modification: 4 lines in one CMakeLists.txt.
Additional point
Please note that I would need this bug fixed quickly so that I can deploy a feature in my own code, which requires MaterialX to be installed in a Python virtual environment.
Moreover, Linux distros are increasingly restricting the use of pip to virtual environments, by declaring the Python system environment "externally managed" (https://packaging.python.org/en/latest/specifications/externally-managed-environments/), so I think my case is likely to become more and more common.
--> Therefore, can you get back to me quickly with a solution?
I obviously promote my own proposal, but I would equally welcome any solution that could fix the bug, as long as it can be implemented in the short term.
Thank you in advance for your responsiveness!
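As a side note on the "externally managed" point, whether a given interpreter falls under that restriction can be checked from Python. This is a minimal sketch based on the linked specification, which describes an EXTERNALLY-MANAGED marker file in the interpreter's stdlib directory; the three function names are hypothetical.

```python
import os
import sys
import sysconfig


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def marker_path():
    """Location where the EXTERNALLY-MANAGED marker file would be found."""
    return os.path.join(sysconfig.get_path("stdlib"), "EXTERNALLY-MANAGED")


def is_externally_managed():
    """True when installers are expected to refuse system-wide installs here."""
    # The marker only applies outside of virtual environments.
    return not in_virtualenv() and os.path.isfile(marker_path())


if __name__ == "__main__":
    print("virtual environment:", in_virtualenv())
    print("externally managed:", is_externally_managed())
```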
@howetuft Your proposed fix in #1766 looks very reasonable to me, though I defer to @JeanChristopheMorinPerso and @kwokcb on whether this is the best path forward.
Ideally we'd like to consider our main branch locked for the upcoming MaterialX 1.38.10 release, but we're very open to proposed fixes in the dev_1.39 branch, which will be our main focus leading up to SIGGRAPH 2024.
Since the change proposed in #1766 is not intrusive, only affecting library preferences for Linux Python wheels, I'd be open to including this in our upcoming 1.38.10 release, if @JeanChristopheMorinPerso and @kwokcb don't have objections.
Let me know what your thoughts are.
Thanks for this report, @howetuft, as well as for the fix in #1766!
Thanks to you, and thank you also for this outstanding software!