
Optimize and rewrite texture loading #369

Draft
Phantomical wants to merge 15 commits into KSPModdingLibs:dev from Phantomical:faster-dds-loading


@Phantomical
Collaborator

KSPCF's texture loader is already pretty fast, so normally I wouldn't bother optimizing it, but I would like to do some additional work here, and a refactor + optimization pass will make that easier.

The big changes here are:

  • Every texture load is now launched in its own independent coroutine
  • Instead of using the normal Texture2D constructor we now use CreateUninitializedTexture2D, lifted from KSPTL.
  • DDS loading is now more async:
    • Create a texture
    • Wait a frame for Unity to finish its first upload
    • Use AsyncReadManager to read directly into the array returned by GetRawTextureData()
    • Apply (and make it unreadable if needed)
  • The PNG conversion cache is now gone; however, we still compress textures and mark them unreadable, just as before.
  • PNG/JPG textures are still loaded through UnityWebRequest, but normal map swizzling is now offloaded to a background thread.
  • TGA/MBM/etc. loading remains mostly the same, though normal map swizzling is offloaded there too.
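The DDS steps listed above (create, wait a frame, read in place, apply) can be modeled roughly as follows. This is a language-neutral sketch in Python, not the PR's C# code and not Unity's API: `FakeTexture`, `load_dds`, and the `raw` buffer are hypothetical stand-ins, with `raw` playing the role of the array returned by GetRawTextureData(). It also assumes a plain 128-byte DDS header (4-byte magic plus 124-byte header, no DX10 extension).

```python
import asyncio
import os

# Assumption: plain DDS header, i.e. 4-byte magic + 124-byte header.
DDS_HEADER_SIZE = 128


class FakeTexture:
    """Hypothetical stand-in for a texture with CPU-visible storage."""

    def __init__(self, size):
        self.raw = bytearray(size)  # plays the role of GetRawTextureData()
        self.applied = False
        self.readable = True

    def apply(self, make_unreadable):
        # Models the final upload and the optional "make unreadable" step.
        self.applied = True
        self.readable = not make_unreadable


async def load_dds(path, make_unreadable=True):
    payload_size = os.path.getsize(path) - DDS_HEADER_SIZE
    tex = FakeTexture(payload_size)      # 1. create the texture
    await asyncio.sleep(0)               # 2. yield once ("wait a frame")

    def read_payload():
        with open(path, "rb") as f:
            f.seek(DDS_HEADER_SIZE)      # skip the header
            return f.readinto(tex.raw)   # 3. read straight into the storage

    loop = asyncio.get_running_loop()
    read = await loop.run_in_executor(None, read_payload)
    if read != payload_size:
        raise IOError(f"short read: {read} of {payload_size} bytes")

    tex.apply(make_unreadable)           # 4. apply, optionally unreadable
    return tex
```

The point of step 3 is that the file contents land in the texture's own storage, so no intermediate byte array is allocated and copied.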

I do have plans to reintroduce a cache in a different form, so this PR keeps the popup around for that use case.

Even though we now compress every PNG texture on each load, texture loading is still a decent ~25% faster than with the old KSPCF texture loader.

Here's what the baseline looks like (2nd reload, warm cache, KSPCF 1.40.1 off CKAN):

[LOG 20:18:54.796] [KSPCF:FastLoader] 13th Gen Intel(R) Core(TM) i7-13700H | 65208 MB | NVIDIA RTX A1000 6GB Laptop GPU (6001 MB)
Total loading time to main menu : 97.088s
- Configs and assemblies loaded in 10.006s
- Configs reload done in 6.669s
- Configs translated in 0.075s
- 13942 assets loaded in 24.458s :
  - 243 audio assets (144.512 MiB) in 4.123s, 35.053 MiB/s
  - 8240 texture assets (7.569 GiB) in 10.593s, 731.758 MiB/s
  - 5430 model assets (1.586 GiB) in 9.740s, 166.818 MiB/s
- Asset bundles loaded in 0.555s
- GameDatabase (configs, resources, traits, upgrades...) loaded in 0.427s
- Built-in parts copied in 0.044s
- Part and internal configs extracted in 0.010s
- 3694 parts and 20662 modules compiled in 28.029s
  - 5.6 modules/part, 7.588 ms/part, 1.357 ms/module
  - PartIcon compilation : 10.499s
- 161 internal spaces and 523 props compiled in 0.938s
- 2 DLC (Making History, Breaking Ground) loaded in 0.387s
- Planetary system loaded in 10.382s

And here's the same load with this PR applied:

[LOG 02:04:25.475] [KSPCF:FastLoader] 13th Gen Intel(R) Core(TM) i7-13700H | 65208 MB | NVIDIA RTX A1000 6GB Laptop GPU (6001 MB)
Total loading time to main menu : 89.487s
- Configs and assemblies loaded in 11.225s
- Configs reload done in 5.646s
- Configs translated in 0.076s
- 12697 assets loaded in 17.452s :
  - 243 audio assets (144.512 MiB) in 4.407s, 32.788 MiB/s
  - 7688 texture assets (6.798 GiB) in 7.695s, 904.62 MiB/s
  - 4737 model assets (1.526 GiB) in 5.344s, 292.498 MiB/s
- Asset bundles loaded in 0.493s
- GameDatabase (configs, resources, traits, upgrades...) loaded in 0.337s
- Built-in parts copied in 0.031s
- Part and internal configs extracted in 0.013s
- 3694 parts and 20662 modules compiled in 22.009s
  - 5.6 modules/part, 5.958 ms/part, 1.065 ms/module
  - PartIcon compilation : 8.164s
- 16 internal spaces and 0 props compiled in 0.045s
- 2 DLC (Making History, Breaking Ground) loaded in 0.362s
- Planetary system loaded in 8.935s
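As a quick sanity check on the texture lines of the two logs, the arithmetic can be redone from the reported counts, sizes, and times. This is only a sketch over the numbers printed above; the helper names are made up for illustration. Note that the two runs did not load the same number of textures (8240 vs 7688), so elapsed time alone is not a like-for-like comparison and throughput is probably the fairer figure.

```python
# Texture figures copied from the two FastLoader logs above.
baseline = {"count": 8240, "gib": 7.569, "seconds": 10.593}
pr = {"count": 7688, "gib": 6.798, "seconds": 7.695}


def mib_per_s(run):
    """Throughput in MiB/s from the logged GiB total and elapsed seconds."""
    return run["gib"] * 1024 / run["seconds"]


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


throughput_gain = pct_change(mib_per_s(baseline), mib_per_s(pr))
time_change = pct_change(baseline["seconds"], pr["seconds"])
count_change = pct_change(baseline["count"], pr["count"])

print(f"throughput: {throughput_gain:+.1f}%")
print(f"wall time:  {time_change:+.1f}%")
print(f"textures:   {count_change:+.1f}%")
```

With these inputs the throughput gain comes out at roughly 24% and the wall-time reduction at roughly 27%, while the PR run loaded about 7% fewer textures.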

This is on a test save with FFT, NFT, BDB, Tantares, SSPX, Heat Control, and all of Sterling Systems - though IVAs are disabled since that seems to be the default?

As a bonus, this should perform acceptably well on HDDs, since AsyncReadManager only ever reads one texture at a time.
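To illustrate how "one coroutine per texture" and "one read at a time" can coexist, here is a minimal model in Python. It is an assumption-laden sketch rather than the PR's mechanism: the lock, `load_texture`, and `load_all` are hypothetical, and the description above does not say how the reads are actually serialized.

```python
import asyncio


async def load_texture(name, read_lock, stats):
    # Every texture gets its own coroutine, but the read stage is guarded
    # by a shared lock so at most one read is in flight at any time.
    async with read_lock:
        stats["active"] += 1
        stats["max_active"] = max(stats["max_active"], stats["active"])
        await asyncio.sleep(0)  # stands in for the disk read
        stats["active"] -= 1
    await asyncio.sleep(0)      # stands in for later work (swizzle, upload)
    return name


async def load_all(names):
    read_lock = asyncio.Lock()
    stats = {"active": 0, "max_active": 0}
    loaded = await asyncio.gather(
        *(load_texture(n, read_lock, stats) for n in names)
    )
    return loaded, stats["max_active"]
```

Serializing only the read stage keeps disk access sequential, which matters on HDDs where seeks are expensive, while the rest of each load can still overlap with other textures.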

Phantomical marked this pull request as draft on May 6, 2026 03:45
@Phantomical
Collaborator Author

The performance comparisons above are inaccurate. Marking this as draft until I can come up with better ones (and until those better ones actually show an improvement).
