dune-extract: icon pipeline #13 + super-review audit #14-#36 + real-texture validation #14

Open
Sponge wants to merge 4 commits from feature/dune-extract-html-icons into develop

Lands the dune-extract icon pipeline and the full security/robustness/structural hardening audit onto develop.

What's in this PR (4 commits)

  • 9b3dc5b — super-review audit #14-#36: output escaping (#14-#16), failure observability (#17), parser robustness (#18-#22), structural debt (#23-#26), hygiene/correctness (#27-#35), render reproducibility (#36).
  • f766aaa — #13 productionized per-item icon pipeline: dune_extract/icon_extract.py (Icon SoftObject -> Texture2D -> bounds-safe parse -> BC1-BC7/raw decode -> stdlib PNG), --icons opt-in, card + modal img, [icons] optional extra.
  • 1e5868e — POST-WORK GATE: archive #13-#36 verbatim to FINALIZED, clear TODO.
  • f7accd3 — #13 real-texture validation harness proving the production decode path on real Funcom data.

Validation

Real-texture run against the live DuneSandbox.pak (3.24 GB, 16,406 data-area entries, clean-room Oodle backend): 153 icon Texture2D assets found, 150/150 decoded to VALID PNG (BC7=147, DXT5=3), 0 invalid, 0 degenerate, 3 cleanly skipped by the safety contract. Independently re-confirmed via file(1) + Pillow (256x256 RGBA, real imagery).

All docs (TODO/FINALIZED) updated in-tree per the docs-before-push LAW.

Resolves the full /super-review audit of tools/dune-extract/ (~8.7k LOC).

P0 output escaping (XSS — fires on benign data, ships in a shareable artifact):
- #14 <script> JSON payload: escape <,>,& -> \uXXXX (was </-only, unsound)
- #15 coerce tier (int|None) + whitelist status before class/attr/innerHTML;
      hardened JS TIER_CLASS to integer-only
- #16 depth-cap _format_cell recursion (RecursionError on deep structs)
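A minimal sketch of the #14 idea, assuming the payload is serialized with the stdlib json module before being placed inside an inline script element. The helper name json_for_script is hypothetical and is not taken from the codebase; it only illustrates why escaping <, > and & as \uXXXX is sound where escaping </ alone is not.

```python
import json

# Characters that can end or reinterpret an inline <script> block.
# Each is replaced by its JSON \uXXXX escape, which json.loads and
# JavaScript's JSON.parse both decode back to the original character.
_SCRIPT_UNSAFE = {"<": "\\u003c", ">": "\\u003e", "&": "\\u0026"}


def json_for_script(payload) -> str:
    """Serialize payload to JSON that is safe to embed inside <script>.

    Hypothetical helper for illustration only.
    """
    text = json.dumps(payload)  # default ensure_ascii=True escapes non-ASCII
    for char, escaped in _SCRIPT_UNSAFE.items():
        text = text.replace(char, escaped)
    return text
```

Because these three characters can only occur inside JSON string literals, the replacement never changes the structure of the document, only how the strings are spelled.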

P1 failure observability:
- #17 narrow except Exception in enrichment + localization to expected decode
      errors; route unexpected ones through callbacks + a skip-breakdown summary
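One way the #17 pattern could look, as a sketch only. The tuple of expected errors, the function name enrich_all, and the callback signature are assumptions made for illustration; the real enrichment code may differ.

```python
import struct
from collections import Counter

# Assumed set of "expected" decode failures; the real list may differ.
EXPECTED_DECODE_ERRORS = (struct.error, UnicodeDecodeError, IndexError)


def enrich_all(rows, decode, on_unexpected):
    """Decode every row, counting skips instead of silently dropping them.

    Expected decode errors are tallied by exception type. Anything else
    is handed to on_unexpected(row, exc) so it stays visible.
    """
    results = []
    skipped = Counter()
    for row in rows:
        try:
            results.append(decode(row))
        except EXPECTED_DECODE_ERRORS as exc:
            skipped[type(exc).__name__] += 1
        except Exception as exc:  # unexpected: report it, do not hide it
            on_unexpected(row, exc)
            skipped["unexpected"] += 1
    return results, skipped
```

The returned Counter is what a skip-breakdown summary could be printed from at the end of a run.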

P2 parser robustness (crash-clean on corrupt / Steam-updated paks):
- #18 cap num_blocks + struct.error/IndexError -> ExtractionError
- #19 MAX_UNC_SIZE allocation guard in parse_inpak_header
- #20 bounds-check fstring length, export stride/offsets, summary counts
- #21 atomic artifact writes (temp + fsync + os.replace) for all writers
- #22 validate untrusted-JSON shape + generator marker in catalog diff
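Two of the P2 items lend themselves to short sketches: a bounds-checked length-prefixed read that turns struct.error into a clean error (#18, #20), and an atomic write via temp file + fsync + os.replace (#21). The ExtractionError class here is a local stand-in so the example runs, and MAX_STRING_BYTES, read_length_prefixed and atomic_write_bytes are hypothetical names, not the project's own.

```python
import os
import struct
import tempfile


class ExtractionError(Exception):
    """Local stand-in for the project's error type."""


MAX_STRING_BYTES = 1 << 16  # assumed cap, chosen for illustration


def read_length_prefixed(buf: bytes, offset: int):
    """Read a little-endian int32 length, then that many bytes."""
    if offset < 0:
        raise ExtractionError(f"negative offset {offset}")
    try:
        (length,) = struct.unpack_from("<i", buf, offset)
    except struct.error as exc:
        raise ExtractionError(f"truncated length at offset {offset}") from exc
    if length < 0 or length > MAX_STRING_BYTES:
        raise ExtractionError(f"implausible length {length}")
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ExtractionError(f"string overruns buffer at offset {offset}")
    return buf[start:end], end


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```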

P3 structural debt:
- #23 document provenance-vs-authoritative categorization boundary + reconcile
- #24 shared _common.py (dt_stem, PKG_TAG_LE, header window) + unified MD formatters
- #25 hoist CSS/JS to _assets.py + decompose _build_items_grid_html
      (html_writer 1782 -> ~1230 lines); stdlib-only, no Jinja2
- #26 delete dead _consume dict branch

P4 hygiene / correctness:
- #27 top-level error handling in main() + promote steam_locator.has_paks
- #28 subprocess timeouts + refuse-to-guess AES key + honest zeroing docstring
- #29 reject path-traversal in appmanifest installdir
- #30 tier _T<n> priority, description order, hoist _HOUSE_TOKENS
- #31 dead-import / dead-code sweep
- #32 asset-path value-is-None bias
- #33 _row_diff non-dict row guard
- #34 visible error banner on catalog-data parse failure
- #35 _walk_for_field falsy-name guard + dead depth default
- #36 verified render reproducibility across hash seeds
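For #29, a rough sketch of what rejecting traversal in an installdir value might involve, assuming the value is meant to be a single directory name joined under a library folder. The function name is_safe_installdir and the exact rule set are assumptions, not the code in steam_locator.

```python
from pathlib import PureWindowsPath


def is_safe_installdir(name: str) -> bool:
    """Accept only a plain, single-component directory name."""
    if not name or name in (".", ".."):
        return False
    # No separators of either flavor and no embedded NUL.
    if "/" in name or "\\" in name or "\x00" in name:
        return False
    # Reject Windows drive prefixes such as "C:".
    if PureWindowsPath(name).drive:
        return False
    return True
```

Checking for both separator styles keeps the rule independent of the platform the tool happens to run on.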

Every fix verified: compile + targeted behavior tests + end-to-end render.

Promotes the throwaway icon-decode spike to a real, wired, opt-in feature.

New dune_extract/icon_extract.py:
- resolve each item's Icon SoftObject ref -> Texture2D asset name
- locate + extract the texture from the pak, parse FTexturePlatformData
  (pixel format + dims + top-mip), bounds-safe per #18-#20
- decode_to_rgba: BC1/BC2/BC3/BC4/BC5/BC7 via optional texture2ddecoder,
  raw B8G8R8A8/R8G8B8A8/A8R8G8B8 in pure Python
- encode_png: pure-stdlib zlib PNG encoder (no Pillow)

Safety contract (matches the #14-#36 audit philosophy):
- emits a PNG ONLY when the format is supported AND mip bytes exactly match
  the format+dims size; any mismatch/unknown-format -> skip (never garbage)
- clean no-op when texture2ddecoder is absent
- traversal-safe PNG filenames; icon src is HTML-escaped on the card
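The exact-size rule in the first bullet comes down to simple arithmetic: block-compressed formats store 4x4 pixel blocks of 8 bytes (BC1, BC4) or 16 bytes (BC2, BC3, BC5, BC7), and the raw formats listed above use 4 bytes per pixel. A sketch follows, with hypothetical format keys and function names; the real pixel-format identifiers in the parser may be spelled differently.

```python
# Bytes per 4x4 block for each block-compressed format.
BLOCK_BYTES = {"BC1": 8, "BC2": 16, "BC3": 16, "BC4": 8, "BC5": 16, "BC7": 16}
# Bytes per pixel for the uncompressed formats.
RAW_BYTES_PER_PIXEL = {"B8G8R8A8": 4, "R8G8B8A8": 4, "A8R8G8B8": 4}


def expected_mip_size(fmt: str, width: int, height: int):
    """Return the exact byte size of one mip, or None if unsupported."""
    if width <= 0 or height <= 0:
        return None
    if fmt in BLOCK_BYTES:
        blocks_wide = (width + 3) // 4   # round up to whole blocks
        blocks_high = (height + 3) // 4
        return blocks_wide * blocks_high * BLOCK_BYTES[fmt]
    if fmt in RAW_BYTES_PER_PIXEL:
        return width * height * RAW_BYTES_PER_PIXEL[fmt]
    return None


def should_emit(fmt: str, width: int, height: int, mip: bytes) -> bool:
    """Emit only when the format is known and the size matches exactly."""
    expected = expected_mip_size(fmt, width, height)
    return expected is not None and len(mip) == expected
```

For a 256x256 BC7 icon this gives 64 * 64 * 16 = 65,536 bytes; a mip of any other length would be skipped rather than decoded.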

Wiring:
- --icons opt-in flag (__main__)
- Catalog.icons field + from_json_dump restore (enrichment)
- dump carry (catalog_writer.write_json)
- card <img class="item-icon"> + modal preview + CSS (html_writer/_assets)
- [icons] optional extra (pyproject); README + CLI reference updated

Verified here: PNG stdlib round-trip, mip-size math, parse_texture on
synthetic cooked layout (incl. unknown-format skip), B8G8R8A8 channel swap,
icon-ref resolution, graceful no-op, dump round-trip, card+modal img render,
icon-src escaping. Real-texture BC decode is user-validated against the local
DuneSandbox.pak per the project's testing model.

Completes the FINALIZED-before-DELETE ledger close-out interrupted by a
power loss after the code for #13 (icon pipeline, f766aaa) and #14-#36
(super-review hardening, 9b3dc5b) had already committed.

- FINALIZED.md: append #13-#36 verbatim (word-for-word per LAW #0) with
  per-entry branch + commit-hash resolution notes; section intro carries
  the original audit framing (threat model + P0-P5 grouping).
- TODO.md: Pending + In-progress cleared to none; Finalized summary
  extended to reference #13 and the #14-#36 batch.

No code change; docs land before any push (commits still unpushed).

Closes the #13 'REMAINING (user-validated)' gap with an actual run against
real Funcom data. validate/icon_decode_real.py drives the production
icon_extract functions (parse_texture -> decode_to_rgba -> encode_png)
against real cooked Texture2D icon assets from DuneSandbox.pak (auto-detected
via steam_locator, no hardcoded paths) and validates every emitted PNG.

Result on the live pak (3.24 GB, 16,406 entries, clean-room Oodle backend):
153 icon assets found, 150/150 decoded to VALID PNG (BC7=147, DXT5=3),
0 invalid, 0 degenerate, 3 cleanly skipped by the safety contract.
Independently re-confirmed via file(1) + Pillow (256x256 RGBA, real imagery).

FINALIZED #13 gets an appended real-texture-validation record (append-only).

Reference
Sponge/Dune-Awakening-Server-Tools!14