Comments (4)

cuddlyogre commented on August 27, 2024

I have been working on a real-time LDraw renderer/editor in MonoGame/C# using Datsville as a performance benchmark. Datsville has helped me weed out errors in my Blender importer as well.

Although I get a decent framerate, I still hit a bottleneck because there is such a massive amount of data being processed per frame. I'm talking tens of millions of vertices and indices and roughly 60k objects on screen at ground level.

It also eats up about 5-6GB of RAM when everything is all loaded. This may be improved with some kind of streaming or deferred loading process, but fixing that is premature optimization at this point.

The single massive terrain mesh is over 8MB in triangles, the fence around the airport renders at a low framerate because of the number of elements that make it up, and the cornfield alone puts something like 50k objects on screen at once. It is no surprise that when these are either not on screen, or not loaded at all, frame rates and import times improve.

I've experimented with a handful of optimization strategies. Surprisingly, replacing the meshes with simple colored cubes had almost no effect on framerate on my machine. Not importing studs had almost no effect on framerate either. Only when I started limiting the number of objects being drawn did things improve. This also leads me to believe LOD meshes will have minimal impact, so I'm not prioritizing figuring that out.

Lowering draw distances helps, but there are so many objects that once you start getting a decent framerate, you can't even see one end of the airport to the other.

The best outcome I've had is where I merge the parts of individual models in a post-processing step so that there aren't so many draw calls per frame, which also means there are far fewer entities to loop through each frame. I import every element as color code 16 and apply the correct color in a shader, which cuts down on import time and RAM usage. It also improves framerates because I can draw more objects before having to switch index and vertex buffers.

I ran into problems with backface culling where models are mirrored, the Town Hall specifically, but enabling culling resulted in a huge performance boost. I will look into correcting these models, even if by hand, to see if it is really worth pursuing. Even if it doesn't help with this task, mirroring models is not recommended in the LDraw docs, so it's a worthwhile fix to make.

I'm also going to rework the terrain to be less dense and to be chunked out so parts that are not on screen can be ignored.

I know next to nothing about HLSL shaders, so everything is a basic color, so there are no fancy effects slowing anything down.

Instancing, interestingly enough, didn't seem to improve framerates much at all. Primarily because the instances need to be organized every frame, which is an expensive operation. I've tried a heap and a modified octree, and both perform pretty similarly. There may be a better way to do it. I will do further research on this.

I really think with a little bit of work, we can make this work.

from datsville.

ScanMountGoat commented on August 27, 2024

> I really think with a little bit of work, we can make this work.

I already have what I would consider acceptable framerates and loading times using ldr_tools and ldr_wgpu. Even my MacBook Air gets decent performance with an infinite draw distance. I plan on making the renderer into a dedicated library for people to use at some point. The optimizations are all documented on the ldr_wgpu repository. I would strongly recommend just using the Rust code, since it's based on WebGPU and will work on most modern GPUs and the web eventually. It's also going to be hard to match the performance even in a modern game engine.

If you really want to use MonoGame, I'll try and summarize some tips to improve your performance. You can also compile and run ldr_wgpu to use for a performance comparison while making optimizations.

> It also eats up about 5-6GB of RAM when everything is all loaded.

Use instancing to reduce memory usage. Any handmade LDraw scene should fit easily in GPU or system memory. There's not enough unique data to need streaming. ldr_wgpu uses less than 1 GB when loading Datsville. You can also try reducing the precision for colors and normals to save some space.
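
As a rough illustration of the precision suggestion, here is a minimal Rust sketch (standard library only) that quantizes colors and normals to 8 bits per component. The function names are made up for this example and are not taken from ldr_wgpu or MonoGame, and it assumes the vertex layout is declared with a matching normalized 8-bit format so the GPU converts the bytes back to floats.

```rust
/// Pack an RGBA color with components in [0.0, 1.0] into 4 bytes.
/// Stored as four f32 values, the same color would take 16 bytes.
pub fn pack_color_unorm8(rgba: [f32; 4]) -> [u8; 4] {
    rgba.map(|c| (c.clamp(0.0, 1.0) * 255.0).round() as u8)
}

/// Pack a unit normal into 4 signed bytes. The last byte is padding so the
/// attribute stays 4-byte aligned. Three f32 values would take 12 bytes.
pub fn pack_normal_snorm8(n: [f32; 3]) -> [i8; 4] {
    let quantize = |c: f32| (c.clamp(-1.0, 1.0) * 127.0).round() as i8;
    [quantize(n[0]), quantize(n[1]), quantize(n[2]), 0]
}

/// Decode a packed normal on the CPU, mirroring what a signed normalized
/// vertex format does on the GPU. Useful for checking the rounding error.
pub fn unpack_normal_snorm8(p: [i8; 4]) -> [f32; 3] {
    [p[0], p[1], p[2]].map(|c| (c as f32 / 127.0).max(-1.0))
}
```

With these two attributes, a vertex goes from 28 bytes of color and normal data to 8. The quantization error on normals is usually invisible for flat-shaded bricks, but that is worth checking on curved parts.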

> Surprisingly, replacing the meshes with simple colored cubes had almost no effect on framerate on my machine.

It sounds like you have too many draw calls. You can try instanced rendering to reduce the amount of overhead spent sending the drawing commands to the GPU. Whether it will actually make the GPU render faster or not depends on the scene.

> I know next to nothing about HLSL shaders, so everything is a basic color, so there are no fancy effects slowing anything down.

You're probably bottlenecked by vertex processing. You can either reduce the vertex count or use vertex indices to reduce the number of points the GPU needs to calculate.
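
To make the vertex index idea concrete, here is a small Rust sketch that turns a plain triangle list into unique vertices plus an index buffer. It is only an illustration under simple assumptions: `index_vertices` is a hypothetical helper, vertices carry a position only, and two points merge only when their coordinates are bit-for-bit identical (so `0.0` and `-0.0` stay separate).

```rust
use std::collections::HashMap;

/// Deduplicate a triangle list into unique vertices plus an index buffer.
/// Every three consecutive indices describe one triangle.
pub fn index_vertices(triangles: &[[f32; 3]]) -> (Vec<[f32; 3]>, Vec<u32>) {
    // Floats are not hashable, so the key is the raw bit pattern of each coordinate.
    let mut lookup: HashMap<[u32; 3], u32> = HashMap::new();
    let mut unique: Vec<[f32; 3]> = Vec::new();
    let mut indices: Vec<u32> = Vec::with_capacity(triangles.len());

    for v in triangles {
        let key = [v[0].to_bits(), v[1].to_bits(), v[2].to_bits()];
        // Reuse the existing index if this point was seen before,
        // otherwise append the point and hand out the next index.
        let index = *lookup.entry(key).or_insert_with(|| {
            unique.push(*v);
            (unique.len() - 1) as u32
        });
        indices.push(index);
    }
    (unique, indices)
}
```

For a quad made of two triangles, the six input points collapse to four unique vertices, so the GPU can reuse the shaded result for the two shared corners. A real mesh would also have to compare normals and colors before merging.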

> Primarily because the instances need to be organized every frame, which is an expensive operation.

Is there a reason you need to do this every frame?

If you want to improve GPU performance, you'll need to use some sort of profiling tool to see what actually takes up most of the time. GPU performance is complex, and it's difficult to accurately measure from your own CPU code. Hardware manufacturers provide their own tools like Nsight Graphics (Nvidia), Radeon GPU Profiler (AMD), or the frame profiler built into Xcode for macOS.

cuddlyogre commented on August 27, 2024

> If you really want to use MonoGame

I do. Nothing against any other project. This is how I relax. And it's nice to have a real problem to solve since my paying work has become fairly routine.

I'm not sure where the 5GB of RAM usage comes from. It is possible I'm not clearing some temporary lists, now that I'm thinking about it. The import process only creates one mesh object that is transformed by the unique objects that use the mesh, so it's not duplication of mesh data. It very well could be duplicate objects that differ only in their transformation. I will look into that. I might be able to adjust the precision of the normals and colors, but those calculations are done by MonoGame classes and they work, so I hesitate to venture in that direction.

> Is there a reason you need to do this every frame?

I have an octree that is populated every frame and groups entities by part name. I pull the cached instance data from each entity and send that to the GPU. There are fewer draw calls, but more loops. Looking through it just now, I realized I can combine loops, which may improve performance.

I will definitely look into the debugging tools, so thank you for that.

cuddlyogre commented on August 27, 2024

I redid my instancing approach, and the difference is massive.

The world is scaled to 0.02 of the actual size.

At position -50, -150, -500 facing east looking at the Dennett Ave sign using an octree, I get 30FPS. With my new instancing approach, I get 60FPS.

My initial approach was to build the instance collection using only the visible entities. The rationale was that I only wanted to send instances that were on screen. This proved prohibitively expensive.

My new approach is to build the instance collection at import and draw every item in that collection regardless of whether it is on screen. This causes the GPU to draw every item in the collection, but it cuts down the draw calls from 35k to about 500, which improves framerates considerably. I am going to explore pruning items that are not visible to improve things further.
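
For anyone following along, the import-time grouping described above can be sketched roughly like this in Rust. This is not the MonoGame code from the renderer: `Entity`, its fields, and `build_instance_batches` are hypothetical names, and the per-instance data is reduced to a bare 4x4 transform (a real batch would also carry the color applied in the shader).

```rust
use std::collections::BTreeMap;

/// A placed part: which mesh it uses and where it sits in the world.
pub struct Entity {
    pub part: String,
    pub transform: [[f32; 4]; 4],
}

/// Group the transforms of all entities by part name, once, at import time.
/// Each map entry can then be submitted as a single instanced draw call, so
/// the number of draw calls equals the number of unique parts rather than
/// the number of entities.
pub fn build_instance_batches(entities: &[Entity]) -> BTreeMap<&str, Vec<[[f32; 4]; 4]>> {
    let mut batches: BTreeMap<&str, Vec<[[f32; 4]; 4]>> = BTreeMap::new();
    for entity in entities {
        batches
            .entry(entity.part.as_str())
            .or_default()
            .push(entity.transform);
    }
    batches
}
```

Because the batches never change after import, nothing has to be regrouped per frame. The cost is that off-screen instances are still submitted, which is the trade-off described above.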

My next experiment is to see what effect LOD meshes might have.

Not loading studs raises the framerates to 90FPS. I will definitely look into stud instancing.

Rendering only colored boxes instead of the actual meshes has the same effect on framerates as disabling studs.

I'm running into issues with backface culling when models are mirrored. I haven't started on solving that yet.
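
One possible starting point, sketched in Rust: a transform mirrors geometry exactly when the determinant of its 3x3 rotation/scale block is negative, and a mirrored instance shows its triangles with the opposite winding. The helper names below are hypothetical, and the matrix is assumed to be a 4x4 array whose upper-left 3x3 block holds the rotation and scale.

```rust
/// Determinant of the upper-left 3x3 block of a 4x4 transform.
pub fn determinant3(m: &[[f32; 4]; 4]) -> f32 {
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
}

/// True when the transform mirrors the geometry, which reverses the
/// apparent triangle winding and makes backface culling hide the wrong side.
pub fn flips_winding(m: &[[f32; 4]; 4]) -> bool {
    determinant3(m) < 0.0
}
```

Instances flagged this way could be drawn from a separate batch with the opposite cull mode, or have their index order reversed at import. The check has to use the accumulated transform of the whole subfile chain, since two nested mirrors cancel out.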
