Comments (8)
Camera models are always a pain to get right. I haven't bothered with the intrinsics at all, so I cannot help you with that. Regarding the world-to-pixel transform, though, I think I can help.
First of all, you have the world_view_transform, which simply transforms points from world to view/camera space. Then you have the projection_matrix, which transforms points from view/camera to NDC space. The full_proj_transform is just the combination of the two: it transforms a point from world to NDC space.
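That chain can be sketched in a few lines. This is only an illustration of the idea, not the repo's actual code (the real implementation is in PyTorch and stores these matrices transposed, multiplying row vectors); here I use plain NumPy with column vectors:

```python
import numpy as np

def to_ndc(point_world, world_view_transform, projection_matrix):
    """world -> view -> clip space, then the perspective divide to NDC."""
    p = np.append(np.asarray(point_world, dtype=float), 1.0)  # homogeneous [x, y, z, 1]
    full_proj_transform = projection_matrix @ world_view_transform
    clip = full_proj_transform @ p
    return clip[:3] / clip[3]  # divide by w
```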
Regarding getProjectionMatrix, the code uses the OpenGL projection matrix. A very nice blog about that can be found here. However, there are two differences from the matrix shown on that blog. First, the sign used is the positive one (so entries [2, 2] and [3, 2] have a positive sign). Second, instead of using a cube spanning [-1, 1] in all coordinates for the NDC space, the z coordinate spans just [0, 1]. So at entry [2, 2], instead of (f + n) / (f - n), you have f / (f - n).
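Putting those two differences together, the matrix can be sketched like this (a sketch based on the description above, not the repo's exact function; the parameter names are my own):

```python
import math
import numpy as np

def projection_matrix_sketch(znear, zfar, fov_x, fov_y):
    """OpenGL-style perspective matrix, but with positive [2, 2] / [3, 2]
    entries and z mapped to [0, 1] instead of [-1, 1]."""
    right = math.tan(fov_x / 2) * znear
    top = math.tan(fov_y / 2) * znear
    left, bottom = -right, -top

    P = np.zeros((4, 4))
    P[0, 0] = 2 * znear / (right - left)
    P[1, 1] = 2 * znear / (top - bottom)
    P[0, 2] = (right + left) / (right - left)
    P[1, 2] = (top + bottom) / (top - bottom)
    P[2, 2] = zfar / (zfar - znear)            # f / (f - n), positive sign
    P[2, 3] = -(zfar * znear) / (zfar - znear)
    P[3, 2] = 1.0                              # positive sign: w = +z
    return P
```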
As for the 4x4 shape, it is that way to handle homogeneous coordinates.
I also want to point out that OpenGL uses column-major matrices, so they might be the transposed version of what you would expect (this is covered in the blog too).
Hope I helped
from gaussian-splatting.
If we put z = -n into your matrix, then we get (f / (f - n) · (-n) - fn / (f - n)) / (-n) = 2f / (f - n) ≠ 0. So I think your projection is not meant to map z from (-n, -f) into (0, 1). Instead, it maps (n, f) into (0, 1). In other words, your camera z-axis is opposite to the OpenGL camera z-axis.
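A quick numeric sanity check of this point, using the [2, 2], [2, 3], and [3, 2] entries discussed above (n and f are arbitrary values I picked):

```python
n, f = 1.0, 10.0

def z_ndc(z):
    # third row of the code's projection matrix, then the divide by w = +z
    return (f / (f - n) * z - f * n / (f - n)) / z

print(z_ndc(-n))  # 2f/(f-n): z = -n is NOT mapped to 0
print(z_ndc(n))   # 0: the near plane sits at z = +n
print(z_ndc(f))   # 1: the far plane sits at z = +f
```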
Ok, I see where the mix-up is. You are right: the camera space here, unlike OpenGL, has a positive z-axis, so the boundaries are [n, f]. I'm sorry, I didn't even know that OpenGL had a negative z-axis even for camera space; I thought it was only the NDC/clip space.
Thank you for the answer, it is much clearer now.
It is still not clear to me, though, what the point of projecting into NDC space is.
Does the rasterizer need the Gaussians in NDC in order to work?
Because points can still be projected from image space to world space without passing through NDC.
Thanks a lot for your availability.
NDC is nothing more than a 3D representation of the distorted (post-perspective-transform) space. You could definitely skip it, but in general I find it useful because it is still a 3D representation (you keep a sense of depth) in which you have the comforts of orthographic projection: rays are parallel to each other, the ray direction is just [0, 0, ±1], hit and occlusion tests are trivial, etc.
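As a toy example of why those tests become trivial (my own sketch, nothing to do with the repo's actual rasterizer): once two points are in NDC, "does a occlude b" reduces to comparing x/y for the pixel and z for depth.

```python
def occludes(p_ndc_a, p_ndc_b, eps=1e-3):
    """In NDC all rays point along [0, 0, 1], so a occludes b iff they
    land on (roughly) the same x, y and a is nearer in z."""
    same_pixel = (abs(p_ndc_a[0] - p_ndc_b[0]) < eps
                  and abs(p_ndc_a[1] - p_ndc_b[1]) < eps)
    return same_pixel and p_ndc_a[2] < p_ndc_b[2]
```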
I think you missed the third difference: your camera z-axis is opposite to that of the OpenGL camera.
Otherwise, the projection matrix should be this, I think.
The sign I mentioned affects just the z-axis. The code is pretty clear on how the sign is used.
To make it clear, the projection matrix that post mentions is

    | 2n/(r-l)   0          (r+l)/(r-l)    0          |
    | 0          2n/(t-b)   (t+b)/(t-b)    0          |
    | 0          0          -(f+n)/(f-n)   -2fn/(f-n) |
    | 0          0          -1             0          |

while this code uses:

    | 2n/(r-l)   0          (r+l)/(r-l)    0          |
    | 0          2n/(t-b)   (t+b)/(t-b)    0          |
    | 0          0          f/(f-n)        -fn/(f-n)  |
    | 0          0          1              0          |
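One way to see the difference concretely is to evaluate the third row of each version (followed by the divide by w) on its own near and far planes. A small check, with arbitrary n and f of my choosing:

```python
n, f = 0.5, 50.0

def z_ndc_opengl(z):
    """OpenGL third row with w = -z: maps z in [-n, -f] to [-1, 1]."""
    return (-(f + n) / (f - n) * z - 2 * f * n / (f - n)) / (-z)

def z_ndc_code(z):
    """This code's third row with w = +z: maps z in [n, f] to [0, 1]."""
    return (f / (f - n) * z - f * n / (f - n)) / z
```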
I found a way to understand your projection matrix.
The OpenGL projection matrix is responsible for mapping the frustum into the NDC cube:
x: [l, r] -> [-1, 1]
y: [b, t] -> [-1, 1]
z: [-n, -f] -> [-1, 1].
Your projection matrix, on the other hand, is responsible for mapping a frustum that is symmetric about the origin into the NDC cube, and maps z only to [0, 1]:
x: [-r, -l] -> [-1, 1]
y: [-t, -b] -> [-1, 1]
z: [n, f] -> [0, 1].
I tried many ways to derive your projection matrix, and only this way of understanding it works. Please help me if I have misunderstood something.
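The mappings listed above can be checked numerically. The following sketch (my own, with an arbitrary asymmetric frustum) applies the code's matrix row by row and confirms that at the near plane, [-r, -l] and [-t, -b] go to [-1, 1] while z goes to [0, 1]:

```python
n, f = 1.0, 10.0
l, r = -0.5, 2.0   # deliberately asymmetric so the x/y flip is visible
b, t = -1.0, 0.25

def ndc(x, y, z):
    """Apply the code's projection matrix row by row, then divide by w = z."""
    xn = (2 * n / (r - l) * x + (r + l) / (r - l) * z) / z
    yn = (2 * n / (t - b) * y + (t + b) / (t - b) * z) / z
    zn = (f / (f - n) * z - f * n / (f - n)) / z
    return xn, yn, zn
```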