From Asset to Rendered MicroBlock: the current Update loop

This is a blog post describing how the Strongpoint voxel environment renders and organizes its data internally.

The current Update loop for all the Systems in the ECS World is complicated, but it provides a robust way to instantiate hundreds of thousands of Renderable Entities and display them at a minimum of 60FPS, instantiating an entire map in under 5 seconds. How is this feat accomplished? Via a series of carefully thought-out moving parts, all operating in sequence to efficiently create the Entities and prepare them for rendering, leveraging cutting-edge Unity preview libraries.

The SpawnerSystems and BlockLoader together provide a robust loading system that handles loading failures gracefully, thanks to the Unity Addressables API and its entirely asynchronous loading scheme.

Unity's default Transform hierarchy Systems are used to move everything into its proper location in the 3D environment.

After loading is accomplished, the RenderableSystemGroup handles preparing the Entities for display, utilizing the unique data structure of MicroBlock-type Entities to decide on the desired appearance. Then the Unity RenderMesh Systems handle the rendering of hundreds of thousands of Mesh/Materials in under 60 seconds, utilizing advanced culling and rendering techniques like batched and instanced rendering schemes to cut down on draw calls and display them to the user efficiently.

The System Update Loop:

The System Update loop is ticked once per frame on the Main Thread. Eventually I hope to isolate the creation of Entities (the Spawn Systems) in a World of its own, to separate it entirely from the Main Thread. For now, all Systems are Update()'d on the Main Thread, and they then spawn Jobs which are Schedule()'d to run across multiple threads and multiple CPU cores, taking advantage of the new Unity concurrency APIs.
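To make the "Update on the Main Thread, Schedule onto workers" shape concrete, here is a minimal sketch in Python rather than Unity C#. The SquareSystem class and run_frame function are hypothetical names invented for illustration, and a thread pool stands in for the Unity job workers. Python threads show the scheduling pattern only; they do not demonstrate real multi-core speedups.

```python
from concurrent.futures import ThreadPoolExecutor

class SquareSystem:
    """Hypothetical system: update() runs on the main thread and only schedules work."""
    def __init__(self, values, chunk_size=4):
        self.values = values
        self.chunk_size = chunk_size

    def update(self, pool):
        # Split the data into independent slices and schedule one job per slice.
        slices = [self.values[i:i + self.chunk_size]
                  for i in range(0, len(self.values), self.chunk_size)]
        return [pool.submit(lambda s: [v * v for v in s], s) for s in slices]

def run_frame(systems, pool):
    """Tick every system once, then wait for the jobs they scheduled."""
    handles = []
    for system in systems:          # main thread: one update() call per system
        handles.append(system.update(pool))
    # Complete the jobs before the frame ends, preserving slice order.
    return [[x for h in system_handles for x in h.result()]
            for system_handles in handles]

with ThreadPoolExecutor(max_workers=4) as pool:
    frame_result = run_frame([SquareSystem(list(range(10)))], pool)
```

The main thread never does the per-element work itself; it only decides what to schedule and then collects the results.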

Here's how it happens, start to finish. We'll start at the very first moment the game is run.

Addressables BlockLoader pre-loading of Assets occurs.

The BlockLoader, upon the initialization of the Scene, handles loading all of the Assets that it foresees will be used in the Scene, ahead of time. This is where you're going to get a "Loading Bar" and this will occur only once, when the game loads the Scene for the first time on creation of the World. It also now loads any "Default" Assets which are used as a fallback if anything fails to load or be configured later on.

It pre-loads all the Assets, creating a bunch of Asynchronous loading Jobs which are carried out by the Addressables API. The loading Jobs are actually done entirely in parallel and are non-blocking, which keeps the Main Thread available. They also handle various "ResourceLocations" which means the Assets can be loaded from the disk, an AssetLibrary/Package, off a mounted CD-ROM, or even from an Asset Streaming Server on the Internet via HTTP, and this happens entirely transparently (the BlockLoader does not have any custom code to handle any of these scenarios, it all 'looks the same' to it regardless of the location of these data).
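The pre-loading idea (start every load at once, never block, and fall back to a default when something fails) can be sketched as follows. This is Python with asyncio, not the Addressables API. The names load_asset, preload, and DEFAULT_ASSET are assumptions made for the example, and the source mapping stands in for whatever location the Assets actually come from.

```python
import asyncio

DEFAULT_ASSET = {"name": "default", "data": b""}  # hypothetical fallback asset

async def load_asset(key, source):
    """Hypothetical loader: 'source' is any mapping standing in for disk, package, or HTTP."""
    await asyncio.sleep(0)            # yield control, as a real non-blocking load would
    if key not in source:
        raise KeyError(key)
    return {"name": key, "data": source[key]}

async def preload(keys, source):
    """Start every load at once and cache the results; failed loads use the default."""
    results = await asyncio.gather(*(load_asset(k, source) for k in keys),
                                   return_exceptions=True)
    cache = {}
    for key, result in zip(keys, results):
        cache[key] = DEFAULT_ASSET if isinstance(result, Exception) else result
    return cache

source = {"stone": b"\x01", "wood": b"\x02"}
cache = asyncio.run(preload(["stone", "wood", "missing"], source))
```

The caller only ever sees the cache, so it does not need to know where an Asset came from or whether its load failed.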

World creation occurs. The Update Loop begins on the Main Thread!

Pre-frame ECS default Systems run.

The ECS System begins the Update loop by running various pre-frame Systems to prepare the ECS World for use.

SimulationSystemGroup runs.

After some smaller pre-frame Systems run, SimulationSystemGroup is ticked. This default ECS SystemGroup is where all of the "Simulation" Systems - which the vast majority of BlockEditor Systems are part of - are intended to run. This means anything not related to rendering or appearances goes here, such as the Transform Systems which move everything around, the spawning of Entities, etc.

Also at this time various Systems will run which deal with specific in-game events and Entities. For example, the DeathSystem will handle killing off any Entities which have a negative or zero Health here, and doing any work that needs to be done to them at the time of their death.
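As a rough illustration of what a System like this does, here is a minimal Python sketch (not the real DeathSystem, whose code is not shown in this post). The world dictionary and the Health class are hypothetical stand-ins for ECS storage and a Health Component.

```python
from dataclasses import dataclass

@dataclass
class Health:
    value: int

def death_system(world):
    """Remove every entity whose Health is zero or negative; return the removed ids."""
    dead = [eid for eid, comps in world.items()
            if "health" in comps and comps["health"].value <= 0]
    for eid in dead:
        # Any per-entity work at the time of death would go here, before removal.
        del world[eid]
    return dead

world = {1: {"health": Health(10)}, 2: {"health": Health(0)},
         3: {"health": Health(-5)}, 4: {}}
removed = death_system(world)
```

Entities without a Health Component (entity 4 here) are simply never matched by the System.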

SpawnTargetSystemGroup begins its Update.

SpawnTargetSystemGroup handles the creation of all in-game Entities in the form of Blocks and their children MicroBlocks, which make up the BlockEditor 3D world. This is naturally performed prior to any Transform-related or rendering-related processes.

SpawnerSystems use "Configs" to determine the data that will be associated with each Block/MicroBlock. These data encompass all of the Components that will be attached to each Entity as well as their appearance. [sidenote: I do intend to create a write-up on Configs eventually, it will be linked here.]

BlockSpawnerSystem runs.

BlockSpawnerSystem is triggered after World creation once the BlockLoader notices that all of the pre-loading Jobs have Completed, and additionally once every frame if any BlockSpawnerTarget Entities are detected in the World, indicating that a Block needs to be spawned. The BlockSpawnerSystem uses the BlockLoader to access pre-loaded and cached BlockConfigs via AssetReferences, to create Blocks and MicroBlocks en masse. For each Block it creates, it also spawns the many (up to 512 max.) requisite MicroBlockSpawnerTarget children. It does all of this using batch APIs, so these Entities are created from pre-cached Archetypes and are spawned incredibly fast. It also only grabs the data from BlockLoader once for each BlockConfig, even if the BlockConfig is used across many target Blocks.

The Blocks' ComponentData are then customized according to the data in their BlockConfig (ideally this will later be done in a parallelized Job, but for now it's done on the Main Thread), and the spawned MicroBlockSpawnerTargets are given a reference to their parent Block so they can be properly populated by MicroBlockSpawnerSystem. They are also registered with the MapGrid. At this point the processed BlockSpawnerTargets are entirely ready to go and are now officially Blocks. Their BlockSpawnerTargetComponent is removed, and they are given their BlockComponent instead.
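The two key ideas in this step, one config lookup per distinct BlockConfig and child targets that carry a reference to their parent, can be sketched like this. It is Python, not the actual BlockSpawnerSystem. CountingLoader, spawn_blocks, and the "microblocks" and "micro_config" fields are hypothetical, including the assumption that one Block maps to a single child config.

```python
from collections import defaultdict

MAX_MICROBLOCKS = 512  # per-Block cap stated in the text

class CountingLoader:
    """Hypothetical stand-in for the BlockLoader cache; counts lookups."""
    def __init__(self, configs):
        self.configs = configs
        self.lookups = 0

    def get(self, key):
        self.lookups += 1
        return self.configs[key]

def spawn_blocks(targets, loader):
    """Turn spawner targets into Blocks, fetching each config only once."""
    by_config = defaultdict(list)
    for target in targets:
        by_config[target["config"]].append(target)

    blocks, child_targets = [], []
    for key, group in by_config.items():
        config = loader.get(key)                    # one lookup per distinct config
        count = min(config["microblocks"], MAX_MICROBLOCKS)
        for target in group:
            block = {"id": target["id"], "tag": "Block", "config": key}
            blocks.append(block)
            # Each child target keeps a reference to its parent Block.
            child_targets.extend(
                {"parent": block["id"], "config": config["micro_config"]}
                for _ in range(count))
    return blocks, child_targets

loader = CountingLoader({"wall": {"microblocks": 8, "micro_config": "brick"},
                         "floor": {"microblocks": 1000, "micro_config": "tile"}})
targets = [{"id": 1, "config": "wall"}, {"id": 2, "config": "wall"},
           {"id": 3, "config": "floor"}]
blocks, child_targets = spawn_blocks(targets, loader)
```

Three Blocks are spawned here but the loader is only consulted twice, and the oversized "floor" request is capped at 512 children.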

MicroBlockSpawnerSystem runs.

MicroBlockSpawnerSystem sees all of the created MicroBlockSpawnerTarget children that the BlockSpawnerSystem created en masse using batch APIs. It begins to convert them into honest-to-god MicroBlocks, and it does this in massive batches, sorting them and processing them by their MicroBlockConfig AssetReference. Similarly to how the BlockSpawnerSystem runs, it only pulls in the cached MicroBlockConfig data from BlockLoader once for each MicroBlockConfig, even though there may be many thousands of MicroBlocks which each share the same MicroBlockConfig data. This is a big improvement on the old iteration process.

It's important to understand that the MicroBlockSpawnerSystem isn't thinking in terms of the parent-child hierarchy. It sees all of the MicroBlockSpawnerTargets as a flat array of thousands of MicroBlockSpawnerTargets. It's not grouping them by parent Block, but by their type of MicroBlockConfig. There may be MicroBlocks in these iterative groups which come from many different Block parents. The fact that it looks at these MicroBlockSpawnerTargets as a flat array is important.

For one, it means we can iterate over them extremely fast due to the nature of the ECS system's data structure, taking advantage of the CPU cache. It also means that we can process them in massive groups, not limited to the 512 max. of MicroBlocks in a parent Block. Additionally, we can use the EntityManager batch APIs to change entire Archetypes/Chunks of them at a time, meaning we do not need to iterate over each MicroBlock. If iteration is required, we can do so in parallel because we are not restricted to operating within a data hierarchy (all of the MicroBlockSpawnerTargets are considered independently of each other during this process). Because every MicroBlock is processed independently of the others, the work can be done concurrently on multiple worker threads.
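Grouping the flat array by config instead of by parent looks roughly like this in a Python sketch. The function names and the "health" field are made up for the example; the real System works on ECS Chunks rather than dictionaries.

```python
from collections import defaultdict

def group_by_config(micro_targets):
    """Group a flat list of targets by config key, ignoring which Block owns them."""
    groups = defaultdict(list)
    for target in micro_targets:
        groups[target["config"]].append(target)
    return groups

def apply_config(groups, configs):
    """Stamp each group with its config data; the config is read once per group."""
    for key, group in groups.items():
        data = configs[key]
        for target in group:
            target["health"] = data["health"]
            target["tag"] = "MicroBlock"

micro_targets = [
    {"parent": 1, "config": "brick"},
    {"parent": 2, "config": "brick"},
    {"parent": 2, "config": "glass"},
    {"parent": 3, "config": "brick"},
]
groups = group_by_config(micro_targets)
apply_config(groups, {"brick": {"health": 100}, "glass": {"health": 10}})
```

The "brick" group ends up holding targets from three different parent Blocks, which is exactly the point: the parent never enters into the grouping.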

The MicroBlockSpawnerSystem handles the setting of all ComponentData as defined by the MicroBlockConfig. It does some minor checks to verify the MicroBlocks are all sane and proper, and then adds the ECS default Components which register them with the ECS default Transform hierarchy Systems. The MicroBlockSpawnerTargets have now become MicroBlocks.

Transformations are applied based on Positional Components.

The default ECS Transform Systems handle the moving and rotating and scaling and parenting of all of these new Entities as defined by their Translation/Rotation/Scale/etc Components. Everything is moved into place.

Static MicroBlocks are separated from their Parents in the Transform Hierarchy

Due to an inefficiency with the current ECS API, transformation calculations seem to be repeatedly affecting Block->MicroBlock relationships every frame even though both parties have been marked as Static. LocalToWorldRemovalSystem.cs attempts to alleviate this performance constraint by destroying the ECS Transform Hierarchy we just created between Parent Blocks and Child MicroBlocks only after they are moved into place and if they are marked as Static.

The reasoning is: since these Blocks/MicroBlocks should never move (this is what the Static Component implies), their float4x4 LocalToWorld transformation matrix should never change. So it is useless to consider the parent's positional Components relative to the child, or even the MicroBlock's own positional Components, as having an effect on its matrix. We just remove all of them, stranding the MicroBlock in its current position with only a LocalToWorld matrix Component, which is used by the RenderMeshSystemV2 to calculate RenderBounds for culling and to determine the Entity's position for rendering. The overall result is that the ECS system no longer sees this MicroBlock as a child at all. That may have some negative implications down the line, but it does mean that Transform/Rendering-related Systems will not perform any additional calculations related to the hierarchy. This doubles performance with the current version of the ECS library.
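The "bake once, then strip the hierarchy" idea can be sketched in Python as below. This is a deliberate simplification, not LocalToWorldRemovalSystem.cs: it handles translation only and uses a plain tuple where the real Component is a float4x4 matrix, and the bake_static function and its field names are hypothetical.

```python
def bake_static(entity, parent_world_pos):
    """Bake a final world position, then strip hierarchy and local-transform data."""
    if not entity.get("static"):
        return entity                 # non-static entities keep their hierarchy
    local = entity["translation"]
    # Translation-only stand-in for computing the LocalToWorld matrix.
    entity["local_to_world"] = tuple(p + l for p, l in zip(parent_world_pos, local))
    for name in ("parent", "translation"):
        entity.pop(name, None)
    return entity

micro = {"static": True, "parent": 7, "translation": (1.0, 0.0, 2.0)}
bake_static(micro, parent_world_pos=(10.0, 5.0, 0.0))

mover = {"static": False, "parent": 7, "translation": (0.0, 1.0, 0.0)}
bake_static(mover, parent_world_pos=(10.0, 5.0, 0.0))
```

After baking, the static entity carries only its final position, so nothing downstream has a reason to recompute it.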

RenderableSystemGroup prepares the Blocks/MicroBlocks for Rendering

RenderableSystemGroup, which consists of MicroBlockRenderableSystem and MicroBlockRenderMeshSystem, comes into play to prepare the MicroBlocks (which are the only Entities in the game so far which have a visual appearance) for rendering and display.

MicroBlockRenderableSystem chooses a Renderable for each MicroBlock.

Each MicroBlock has an array of up to 5 Renderables (which are essentially Mesh/Material pairs, each describing a possible appearance for the renderer). These are stored in an array within the RenderableComponent, which is a Shared ComponentData attached to each MicroBlock. Because it's an ISharedComponentData type, the ECS system stores the RenderableData only once for each distinct configuration of RenderableComponent that exists. The MicroBlock Entities are only given a reference to the RenderableComponent, meaning there are no copies of the Renderable data in memory. This cuts down on memory usage dramatically. And since the RenderableComponent reference is stored with the Chunk, per-Archetype filtering is available, allowing us to easily filter by appearance: we can select all of the MicroBlocks in the World with a specific RenderableComponent and iterate over them without any pointer reference lookups, keeping the entire iteration in CPU caches. This is a massive advantage (if not the biggest advantage) of the ECS system and allows Render Systems to batch up MicroBlock Entities by appearance for instanced/batched Render calls to the GPU.
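The "store once, reference by index" behaviour described here can be illustrated with a small Python sketch. SharedStore and with_appearance are hypothetical names, and this models only the idea, not how Unity lays out ISharedComponentData internally.

```python
class SharedStore:
    """Stores each distinct value once and hands out an integer index."""
    def __init__(self):
        self.values = []
        self.index_of = {}

    def intern(self, value):
        if value not in self.index_of:
            self.index_of[value] = len(self.values)
            self.values.append(value)
        return self.index_of[value]

store = SharedStore()
stone = ("stone_mesh", "stone_material")   # hypothetical Mesh/Material pair
glass = ("glass_mesh", "glass_material")

# 1000 entities, but only two distinct appearances are ever stored.
entities = [{"id": i, "renderable": store.intern(stone if i % 2 else glass)}
            for i in range(1000)]

def with_appearance(entities, index):
    """Filter entities by shared index, without touching the stored value itself."""
    return [e for e in entities if e["renderable"] == index]

stone_entities = with_appearance(entities, store.intern(stone))
```

Filtering compares small integers only, which is the property that makes grouping by appearance cheap.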

All that aside - the MicroBlockRenderableSystem's main purpose is to determine which of those 5 possible Renderables in the RenderableComponent this MicroBlock will use. This is determined by indices in the RenderableGroupComponent that every MicroBlock has. This allows a System to adjust which appearance a MicroBlock uses by changing the selected RenderableGroup index. The use of an IndexSet in the RenderableGroupComponent means that a System modifying a MicroBlock's appearance does not need to know exactly what Renderables it has available, only that it wants to use the "4th" Renderable in its RenderableGroup, for example, whatever that may be. This allows us to do fun things like have each index in the IndexSet refer to increasingly damaged appearances (the first being pristine, the fifth being a heavily damaged, burnt, and cracked appearance) for walls and structures.

The MicroBlockRenderableSystem determines the appropriate Renderable index and adds a RenderMeshIndexComponent to the MicroBlock. This is an ISharedComponentData type which just contains the Renderable index, an integer in the inclusive range 0-4. Because it is an ISharedComponentData type, it not only causes the Archetype of these MicroBlocks to change, triggering a restructuring of the underlying ECS data layout, but also allows us to filter the MicroBlocks by their Renderable index.
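Selecting an appearance by index, without the caller knowing which Renderables exist, might look like this minimal Python sketch. The select_renderable function and the appearance names are hypothetical, and clamping an out-of-range request to the last available Renderable is an assumption of the example, not something the post specifies.

```python
MAX_RENDERABLES = 5  # the text gives up to five Renderables per MicroBlock

def select_renderable(renderables, requested_index):
    """Clamp a requested index to the Renderables that actually exist."""
    last = min(len(renderables), MAX_RENDERABLES) - 1
    index = max(0, min(requested_index, last))
    return index, renderables[index]

# Hypothetical appearances, ordered from least to most damaged.
wall = ["pristine", "scuffed", "cracked", "burnt"]

fresh = select_renderable(wall, 0)
damaged = select_renderable(wall, 2)
wrecked = select_renderable(wall, 9)   # asks for more damage than the wall has art for
```

A damage System can keep raising the index as a wall takes hits, and the wall resolves it to whatever appearance it has at that slot.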

MicroBlockRenderMeshSystem assigns a RenderMesh Component.

Using the newly-reorganized ECS data layout, the MicroBlockRenderMeshSystem leverages batch EntityManager API calls to filter all of the MicroBlocks en masse by not only their possible appearances (the RenderableComponent) but also their selected index into those appearances (the new RenderMeshIndexComponent). This means it now has easily-iterable groups of MicroBlocks sorted by, essentially, their chosen Renderable. And this all happens extremely fast due to the batch API calls and the fact that the ECS data layout was pre-organized by the addition of the RenderMeshIndexComponent beforehand.

In fact, this System does no iteration over any of the MicroBlocks at all! It only iterates over the large groups of them by appearance. This allows it to process 100K Entities in a small fraction of a second (usually under 30ms).

The MicroBlockRenderMeshSystem uses the filtered groups of Entities to assign each one a constructed RenderMesh Component. This default ECS Component is used by the built-in RenderMeshSystemV2 of the Unity Hybrid Renderer Package to perform high-performance rendering of all of the Entities in a data-agnostic format. The MicroBlockRenderMeshSystem, similarly to the SpawnerSystems, only performs the construction of each RenderMesh Component once for each possible appearance configuration. If 100K Entities all have the same selected Renderable, only one RenderMesh construction and assignment operation needs to be performed across all 100K of those Entities, with the EntityManager iterating over all of the Chunks of these Entities and updating the RenderMesh ISharedComponentData reference number for all the Chunks at once. This means no Entity iteration is needed, and it could update the appearance of all 100K Entities in under a millisecond assuming optimal ECS Archetype/Chunk layout (which is usually guaranteed by the MicroBlockRenderableSystem's pre-organization task).
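The reason no per-Entity iteration is needed can be shown with a small Python sketch of chunk-level assignment. The Chunk class, assign_render_meshes, and build are all hypothetical; in Unity the equivalent work is done by EntityManager batch calls, which this does not attempt to reproduce.

```python
class Chunk:
    """A group of entities sharing the same (renderable, index) pair."""
    def __init__(self, key, entity_ids):
        self.key = key
        self.entity_ids = entity_ids
        self.render_mesh = None      # one shared reference for the whole chunk

def assign_render_meshes(chunks, build):
    """Build one render mesh per distinct key and set it per chunk, never per entity."""
    built = {}
    for chunk in chunks:
        if chunk.key not in built:
            built[chunk.key] = build(chunk.key)
        chunk.render_mesh = built[chunk.key]
    return len(built)

build_calls = []

def build(key):
    build_calls.append(key)
    return {"mesh": key[0], "variant": key[1]}

chunks = [Chunk(("wall", 0), list(range(0, 50000))),
          Chunk(("wall", 0), list(range(50000, 100000))),
          Chunk(("wall", 3), [100000])]
distinct = assign_render_meshes(chunks, build)
```

The loop runs three times for 100,001 entities, and the render mesh is built only twice, once per distinct appearance.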

Important note: RenderableSystemGroup Systems are only run once per appearance change, which is triggered either by some other System or by the creation of a new MicroBlock. This means that for most frames the processes just described won't occur at all, because all of this generated data is cached once it's inside the RenderMesh Component. As a result, the RenderableSystemGroup Systems average about 0.01ms execution time during the Update loop (essentially zero overhead) across the execution of the game and are extremely efficient, running only when they are absolutely needed.
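A minimal Python sketch of this run-only-on-change behaviour is below. The RenderableSystem class and the "appearance_changed" flag are hypothetical; the post does not say how changes are actually detected, so a simple dirty flag is assumed here.

```python
class RenderableSystem:
    """Does its work only for entities flagged as changed since the last frame."""
    def __init__(self):
        self.runs = 0

    def update(self, entities):
        dirty = [e for e in entities if e.get("appearance_changed")]
        if not dirty:
            return 0                  # nothing to do: the cached result stays valid
        self.runs += 1
        for e in dirty:
            e["render_mesh"] = ("mesh", e["index"])
            e["appearance_changed"] = False
        return len(dirty)

system = RenderableSystem()
entities = [{"index": 0, "appearance_changed": True},
            {"index": 2, "appearance_changed": True}]

first = system.update(entities)    # new entities: work happens
second = system.update(entities)   # nothing changed: early exit
entities[1]["index"] = 3
entities[1]["appearance_changed"] = True
third = system.update(entities)    # one entity changed: only it is reprocessed
```

Across three frames the System only does real work twice, and on the third frame it touches a single entity.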

At this point all of the MicroBlocks have been assigned an appearance and are now ready to be rendered by the built-in Unity batch rendering Systems.

ECS default Rendering Systems kick in to draw all Entities.

Here the various rendering-related Systems, such as RenderMeshSystemV2 and other Systems that are part of the new, bleeding-edge Unity Hybrid Renderer Package (which at the time of writing this wiki is only at version v0.0.2-preview and doesn't even have documentation or a changelog written for it yet, since it's so new), perform their duties to render all of these MicroBlocks at a blazing fast speed. Efficient iteration and grouping of this many Entities is only possible due to the data layout and guarantees of the underlying Unity ECS system. At the current state of the project, we're able to achieve a solid 60FPS framerate while displaying nearly 100K MicroBlock Entities, each with their own Mesh and Material with diffuse shader, all on screen without any sort of LOD or viewport culling optimizations. Of the time it takes to render, only 10ms is spent actually waiting on the GPU, thanks to the reduced draw calls from batching/instancing. The vast majority of the per-frame computation time goes to the arrangement and preparation of Entities before the rendering process.

End frame clean-up Systems run.

At the end of the frame, after all of the SimulationGroup and PresentationGroup Systems have finished their duties, some extra Systems run to handle any clean-up that is needed before we go to the next frame.

The Update Loop is complete.

This loop runs all over again the next frame. As you can see, there's a lot going on each frame, and it's pretty amazing that the Unity ECS package makes it possible to do all of this within the roughly 16.7 milliseconds per frame (1000ms / 60) that a stable 60FPS requires. Hopefully this gives you a greater understanding of how the BlockEditor goes from an Asset to an on-screen rendered image.