Comments (9)
Camera models are always a pain to get right. I haven't bothered with the intrinsics at all, so I cannot help you with that. Regarding the world-to-pixel-space mapping, though, I think I can help.
First of all, you have the world_view_transform, which simply transforms points from world to view/camera space.
Then you have the projection_matrix, which transforms points from view/camera to NDC space.
The full_proj_transform is just the combination of the two: it transforms a point from world to NDC space.
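A minimal sketch of that chain in plain Python may help. It assumes column vectors (clip = projection * world_view * point), a camera looking along +z, and a pixel convention in which NDC -1 maps to pixel -0.5; the helper names and the example matrices are hypothetical and are not taken from the repository, whose tensors may be stored transposed.

```python
def matvec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def world_to_pixel(p_world, world_view, projection, width, height):
    """Map a 3D world point to NDC and then to pixel coordinates."""
    full_proj = matmul(projection, world_view)        # world -> clip in one matrix
    x, y, z, w = matvec(full_proj, [*p_world, 1.0])   # homogeneous clip coordinates
    ndc = (x / w, y / w, z / w)                       # perspective divide
    # Assumed convention: NDC -1 -> pixel -0.5, NDC +1 -> size - 0.5.
    px = ((ndc[0] + 1.0) * width - 1.0) * 0.5
    py = ((ndc[1] + 1.0) * height - 1.0) * 0.5
    return ndc, (px, py)

# Example: the view transform shifts the world by +3 along z.
world_view = [[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 3],
              [0, 0, 0, 1]]
n, f = 0.1, 100.0
# 90-degree field of view in x and y, +z forward, NDC depth in [0, 1].
projection = [[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, f / (f - n), -f * n / (f - n)],
              [0, 0, 1, 0]]

ndc, pix = world_to_pixel((1.0, 0.0, -1.0), world_view, projection, 800, 600)
print(ndc, pix)
```

The world point (1, 0, -1) ends up at (1, 0, 2) in camera space, so its NDC x is 1 / 2 and it lands to the right of the image centre.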
Regarding getProjectionMatrix, the code uses the OpenGL projection matrix. A very nice blog about that can be found here. However, there are two differences from the matrix shown on that blog. The first is that the sign used is the positive one (so entries [2,2] and [3,2] have a positive sign). Also, instead of using a cube spanning [-1, 1] in all coordinates for the NDC space, the z coordinate spans just [0, 1]. So at entry [2,2], instead of (f + n) / (f - n), you have f / (f - n).
As for the 4x4 shape, it is there to handle homogeneous coordinates.
I also want to point out that OpenGL uses column-major matrices, so they might be the transposed version of what you would expect (this is covered in the blog too).
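To make the entries concrete, here is a small sketch that builds a matrix with the properties described above (positive sign at [2,2] and [3,2], NDC depth in [0, 1]). It is modelled on this description rather than copied from the repository; the function name, the symmetric-frustum assumption and the field-of-view parameters are illustrative choices.

```python
import math

def projection_matrix(znear, zfar, fov_x, fov_y):
    """Perspective matrix with +z looking forward and NDC depth in [0, 1]."""
    top = math.tan(fov_y / 2) * znear
    right = math.tan(fov_x / 2) * znear
    bottom, left = -top, -right          # symmetric frustum (an assumption)
    P = [[0.0] * 4 for _ in range(4)]
    P[0][0] = 2.0 * znear / (right - left)
    P[1][1] = 2.0 * znear / (top - bottom)
    P[0][2] = (right + left) / (right - left)
    P[1][2] = (top + bottom) / (top - bottom)
    P[2][2] = zfar / (zfar - znear)              # OpenGL: -(f + n) / (f - n)
    P[2][3] = -(zfar * znear) / (zfar - znear)   # OpenGL: -2 f n / (f - n)
    P[3][2] = 1.0                                # OpenGL: -1; here w_clip = +z_cam
    return P

def transpose(m):
    """Transposed copy, e.g. for a row-vector or column-major layout."""
    return [list(row) for row in zip(*m)]

P = projection_matrix(0.01, 100.0, math.radians(90), math.radians(60))
P_T = transpose(P)
for row in P:
    print(row)
```

In the transposed layout the 1.0 that sits at [3][2] moves to [2][3], which is the kind of surprise the column-major remark warns about.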
Hope I helped
from gaussian-splatting.
I found a way to understand your projection matrix.
The OpenGL projection matrix is responsible for mapping the frustum into the NDC cube with w_n = -z_c, that is:
x: [l, r] -> [-1, 1]
y: [b, t] -> [-1, 1]
z: [-n, -f] -> [-1, 1].
Your projection matrix is responsible for mapping the frustum mirrored through the origin into the NDC cube, maps z only to [0, 1], and uses w_n = z_c, that is:
x: [-r, -l] -> [-1, 1]
y: [-t, -b] -> [-1, 1]
z: [n, f] -> [0, 1].
My derivation uses the following notation: x_c is the x coordinate in camera space, x_n is the x coordinate in homogeneous NDC (clip) coordinates, and x_n' is the x coordinate in NDC after the perspective divide.
I tried many ways to derive your projection matrix, and only this way of understanding works. Please correct me if I have misunderstood something.
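The derivation itself did not survive on this page. The following is one way to arrive at the mapping stated above, written as a sketch under the same assumptions (w_n = z_c, frustum bounds l, r, b, t given on the near plane); it is a reconstruction, not the original poster's working.

```latex
% A camera-space point (x_c, y_c, z_c) projects onto the near plane z_c = n at
x_p = \frac{n\,x_c}{z_c}, \qquad y_p = \frac{n\,y_c}{z_c}.

% Map x_p \in [-r, -l] and y_p \in [-t, -b] linearly onto [-1, 1]:
x_n' = \frac{2x_p + (r + l)}{r - l}, \qquad
y_n' = \frac{2y_p + (t + b)}{t - b}.

% Multiply by w_n = z_c to obtain the first two rows of the matrix:
x_n = \frac{2n}{r - l}\,x_c + \frac{r + l}{r - l}\,z_c, \qquad
y_n = \frac{2n}{t - b}\,y_c + \frac{t + b}{t - b}\,z_c.

% Depth: z_n = A z_c + B, so z_n' = A + B / z_c, with z_n'(n) = 0 and z_n'(f) = 1:
A + \frac{B}{n} = 0, \quad A + \frac{B}{f} = 1
\;\Longrightarrow\;
A = \frac{f}{f - n}, \quad B = -\frac{fn}{f - n}.
```

Together with the bottom row (0, 0, 1, 0), which encodes w_n = z_c, these rows give entry [2,2] = f / (f - n) as discussed earlier in the thread.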
from gaussian-splatting.
If we put $z = -n$ into your matrix, then we get $\frac{\frac{f}{f-n}(-n) - \frac{fn}{f-n}}{-n} = \frac{2f}{f-n} \neq 0$. So I think your projection is not meant to map z from (-n, -f) into (0, 1). Instead, it maps (n, f) into (0, 1). In other words, your camera z-axis is opposite to the OpenGL camera z-axis.
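This is easy to check numerically. The sketch below only uses the depth row and the w row discussed in this thread; the function name and the sample values of n and f are arbitrary choices for illustration.

```python
def ndc_depth(z_cam, n, f):
    """NDC depth when z_clip = f/(f-n) * z - f*n/(f-n) and w_clip = z."""
    z_clip = f / (f - n) * z_cam - f * n / (f - n)
    w_clip = z_cam
    return z_clip / w_clip

n, f = 0.5, 10.0
at_near = ndc_depth(n, n, f)         # close to 0
at_far = ndc_depth(f, n, f)          # close to 1
at_minus_near = ndc_depth(-n, n, f)  # 2f / (f - n), outside [0, 1]
print(at_near, at_far, at_minus_near)
```

Positive depths between n and f land in [0, 1], while z = -n does not, which supports the conclusion that the camera looks along +z.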
Ok, I see where the mixup is. You are right, the camera space here, unlike OpenGL, has a positive z-axis, so the boundaries are [n, f]. I'm sorry, I didn't even know that OpenGL had a negative z-axis even for the camera space. I thought it was only the NDC/clip space.
from gaussian-splatting.
Thank you for the answer, now it is way clearer.
It is still not clear to me what the point of projecting into NDC space is.
Does the rasterizer need to have the Gaussians in NDC in order to work?
I ask because points can still be projected from image space to world space without passing through NDC.
Thanks a lot for your availability
from gaussian-splatting.
NDC is nothing more than a 3D representation of the distorted (after the perspective transform) space. Definitely, you could skip it, but in general, I find it useful because it is still a 3D representation (you have a sense of depth), in which you have the comforts of orthographic projection (rays are parallel to each other, ray direction is just [0, 0, ±1], hit and occlusion tests are trivial, etc.).
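The parallel-rays property can be seen with two points on the same camera ray. The sketch assumes a symmetric frustum, a camera looking along +z and NDC depth in [0, 1]; `to_ndc` and its parameters are hypothetical names used only for this illustration.

```python
def to_ndc(p_cam, n, f, tan_half_fov):
    """Project a camera-space point (+z forward) to NDC, symmetric frustum."""
    x, y, z = p_cam
    w = z                                     # w_clip equals camera depth
    x_ndc = x / tan_half_fov / w
    y_ndc = y / tan_half_fov / w
    z_ndc = (f / (f - n) * z - f * n / (f - n)) / w
    return (x_ndc, y_ndc, z_ndc)

# Two points on the same ray through the camera origin, at different depths.
direction = (0.2, -0.1, 1.0)
near_pt = tuple(2.0 * c for c in direction)
far_pt = tuple(8.0 * c for c in direction)

a = to_ndc(near_pt, 0.1, 100.0, 1.0)
b = to_ndc(far_pt, 0.1, 100.0, 1.0)
ray_dir_ndc = tuple(q - p for p, q in zip(a, b))
print(ray_dir_ndc)
```

Both points share the same NDC x and y, so the ray between them points purely along z; an occlusion test between them then reduces to comparing their NDC depths.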
from gaussian-splatting.
I think you missed a third difference: your camera z-axis points in the opposite direction to the OpenGL camera's.
Otherwise, the projection matrix should be this, I think.
from gaussian-splatting.
The sign I mentioned affects just the z axis. The code is pretty clear on how the sign is used.
To make it clear, the projection matrix that post mentions is
while this code uses:
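The two matrices were posted as images and are not preserved here. As a hedged reconstruction: assuming the post shows the standard OpenGL perspective matrix, and taking the code's entries as described earlier in this thread, the pair would read as follows (zero-indexed entries, l, r, b, t on the near plane).

```latex
% Standard OpenGL perspective matrix (camera looks along -z, NDC z in [-1, 1]):
P_{\text{OpenGL}} =
\begin{pmatrix}
\frac{2n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\
0 & \frac{2n}{t-b} & \frac{t+b}{t-b} & 0 \\
0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\
0 & 0 & -1 & 0
\end{pmatrix}

% Matrix consistent with the entries described in this thread
% (camera looks along +z, NDC z in [0, 1]):
P_{\text{code}} =
\begin{pmatrix}
\frac{2n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\
0 & \frac{2n}{t-b} & \frac{t+b}{t-b} & 0 \\
0 & 0 & \frac{f}{f-n} & -\frac{fn}{f-n} \\
0 & 0 & 1 & 0
\end{pmatrix}
```

The sign flips at [2,2] and [3,2], and the change from (f + n) to f at [2,2], are the differences named in the thread; the entry [2,3] also changes from -2fn / (f - n) to -fn / (f - n) as a consequence of the [0, 1] depth range.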
from gaussian-splatting.
I found a way to understand your projection matrix. The OpenGL projection matrix is responsible for mapping the frustum into the NDC cube with w_n = -z_c, that is:
x: [l, r] -> [-1, 1], y: [b, t] -> [-1, 1], z: [-n, -f] -> [-1, 1].
Your projection matrix is responsible for mapping the frustum mirrored through the origin into the NDC cube, maps z only to [0, 1], and uses w_n = z_c, that is:
x: [-r, -l] -> [-1, 1], y: [-t, -b] -> [-1, 1], z: [n, f] -> [0, 1].
My derivation uses the following notation: x_c is the x coordinate in camera space, x_n is the x coordinate in homogeneous NDC (clip) coordinates, and x_n' is the x coordinate in NDC after the perspective divide.
I tried many ways to derive your projection matrix, and only this way of understanding works. Please correct me if I have misunderstood something.
I have tried many ways to understand this function, and your answer is the most convincing explanation I've seen so far.
from gaussian-splatting.