
docs: add I2_S format reference #444

Open
Bortlesboat wants to merge 1 commit into microsoft:main from Bortlesboat:codex/i2s-format-docs


@Bortlesboat

Summary

  • add a focused docs/i2s-format.md reference for the I2_S ternary weight layout
  • document the backend-specific packing layout for x86 (QK_I2_S = 128) and ARM NEON (QK_I2_S = 64)
  • add a small README pointer so runtime implementers can discover the format note
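
For readers unfamiliar with 2-bit ternary packing, the idea behind the per-backend block sizes can be sketched as follows. This is a minimal illustration, not the code in src/ggml-bitnet-mad.cpp: the helper names (`ternary_to_code`, `pack_block`, `unpack_block`), the code mapping (-1, 0, +1 to 0, 1, 2), and the interleaving order (four groups per block, earlier groups in the higher bits of each byte) are all assumptions made for the example. The block size is a parameter, so 128 stands in for the x86 case and 64 for the ARM NEON case; docs/i2s-format.md and quantize_i2_s() remain the authoritative description of the real layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; not taken from the repo.
// ASSUMPTION: ternary weights {-1, 0, +1} map to 2-bit codes {0, 1, 2}.
inline std::uint8_t ternary_to_code(int w) { return static_cast<std::uint8_t>(w + 1); }
inline int code_to_ternary(std::uint8_t c) { return static_cast<int>(c) - 1; }

// Packs one block of `block` ternary weights into block / 4 bytes.
// ASSUMPTION: the block is split into 4 equal groups; element `pos` of
// group `g` lands in byte `pos` at bit shift 6 - 2 * g.
std::vector<std::uint8_t> pack_block(const std::vector<int>& w, std::size_t block) {
    const std::size_t bytes = block / 4;
    std::vector<std::uint8_t> out(bytes, 0);
    for (std::size_t j = 0; j < block; ++j) {
        const int group = static_cast<int>(j / bytes);  // 0..3
        const std::size_t pos = j % bytes;              // byte index within the block
        out[pos] |= static_cast<std::uint8_t>(ternary_to_code(w[j]) << (6 - 2 * group));
    }
    return out;
}

// Inverse of pack_block under the same assumed layout.
std::vector<int> unpack_block(const std::vector<std::uint8_t>& packed, std::size_t block) {
    const std::size_t bytes = block / 4;
    std::vector<int> w(block);
    for (std::size_t j = 0; j < block; ++j) {
        const int group = static_cast<int>(j / bytes);
        const std::size_t pos = j % bytes;
        const std::uint8_t code = (packed[pos] >> (6 - 2 * group)) & 0x3;
        w[j] = code_to_ternary(code);
    }
    return w;
}
```

Under these assumptions a 128-weight block occupies 32 bytes and a 64-weight block occupies 16 bytes, which is why a runtime has to know which block size a file was written with before it can decode the weights.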

Why

Issue #412 asks for documentation of the I2_S layout so alternative runtimes do not need to reverse-engineer quantize_i2_s().

This write-up is grounded in the current implementation in src/ggml-bitnet-mad.cpp and avoids making assumptions that are not present in the repo.

Testing

  • Verified the new markdown file and README link locally
  • No code-path changes; documentation only

Closes #412



Development

Successfully merging this pull request may close these issues.

WebGPU inference engine for BitNet b1.58 — and notes on I2_S format documentation
