
fix: correct BitNetForCausalLM model registration and tokenizer type#442

Open
goodcomm74 wants to merge 1 commit into microsoft:main from goodcomm74:fix/bitnet-2b-4t-apple-silicon-support

Conversation

@goodcomm74

Problem

Two bugs in utils/convert-hf-to-gguf-bitnet.py prevent the microsoft/BitNet-b1.58-2B-4T model from being converted on Apple Silicon (and likely other platforms):

Bug 1: Architecture name mismatch (capital 'N')

# Before (wrong)
@Model.register("BitnetForCausalLM")

# After (correct)
@Model.register("BitNetForCausalLM")

The config.json of microsoft/BitNet-b1.58-2B-4T declares:

"architectures": ["BitNetForCausalLM"]

This mismatch caused:

NotImplementedError: Architecture 'BitNetForCausalLM' not supported!
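The registry lookup is case-sensitive, so the single-letter difference is enough to fail. A minimal sketch of how a decorator-based registry of this shape behaves (the class and method names here are illustrative, not the conversion script's actual internals):

```python
# Sketch of a case-sensitive, decorator-based model registry, showing why
# "BitnetForCausalLM" (lowercase 'n') never matches "BitNetForCausalLM".
# Illustrative only; not the real convert-hf-to-gguf-bitnet.py internals.

class Model:
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(model_cls):
            cls._registry[name] = model_cls  # exact-string key
            return model_cls
        return decorator

    @classmethod
    def from_architecture(cls, arch):
        # arch comes from config.json's "architectures" list
        try:
            return cls._registry[arch]
        except KeyError:
            raise NotImplementedError(f"Architecture {arch!r} not supported!")


@Model.register("BitnetForCausalLM")  # the buggy spelling
class BitnetModel(Model):
    pass

# config.json declares "BitNetForCausalLM" (capital 'N'), so lookup fails:
try:
    Model.from_architecture("BitNetForCausalLM")
except NotImplementedError as e:
    print(e)
```

With the registration string corrected to `"BitNetForCausalLM"`, the same lookup resolves to the model class.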

Bug 2: Wrong tokenizer type

# Before (wrong - looks for tokenizer.model which doesn't exist)
def set_vocab(self):
    self._set_vocab_sentencepiece()

# After (correct - uses tokenizer.json which exists)
def set_vocab(self):
    self._set_vocab_gpt2()

microsoft/BitNet-b1.58-2B-4T uses a GPT-2 style tokenizer (tokenizer.json), not SentencePiece (tokenizer.model). This caused:

FileNotFoundError: File not found: models/BitNet-b1.58-2B-4T/tokenizer.model
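One way to guard against this class of failure is to choose the vocab loader based on which tokenizer artifact actually exists in the model directory. A sketch under that assumption (the `_set_vocab_*` names stand in for the script's methods; the file check itself is the illustrative part, not code from this PR):

```python
from pathlib import Path

def choose_vocab_setter(model_dir: str) -> str:
    # Pick the vocab loader from the tokenizer file that is present.
    # BitNet-b1.58-2B-4T ships tokenizer.json (GPT-2 style BPE),
    # not tokenizer.model (SentencePiece).
    d = Path(model_dir)
    if (d / "tokenizer.model").is_file():
        return "_set_vocab_sentencepiece"
    if (d / "tokenizer.json").is_file():
        return "_set_vocab_gpt2"
    raise FileNotFoundError(f"File not found: {d / 'tokenizer.model'}")
```

For this model the check lands on `tokenizer.json`, which matches the fix of calling `self._set_vocab_gpt2()`.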

Testing

Tested on Apple Silicon M4 Max (macOS 15.3, arm64).

After this fix, the model converts successfully and runs via mlx-lm:

  • Speed: ~200 tokens/sec
  • Memory: ~1.2 GB

Note: tl1 quantization still requires preset kernels for BitNet-b1.58-2B-4T (not included in this PR). i2_s conversion works after this fix.

Related Issues

…itNet-b1.58-2B-4T

Two bugs prevent BitNet-b1.58-2B-4T from being converted on Apple Silicon:

1. Architecture name mismatch: Model was registered as "BitnetForCausalLM"
   (lowercase 'n') but the actual architecture in config.json is
   "BitNetForCausalLM" (capital 'N'). This caused:
   NotImplementedError: Architecture 'BitNetForCausalLM' not supported!

2. Wrong tokenizer type: BitNet-b1.58-2B-4T uses a GPT-2 style tokenizer
   (tokenizer.json) not SentencePiece (tokenizer.model). Using
   _set_vocab_sentencepiece() caused:
   FileNotFoundError: File not found: models/BitNet-b1.58-2B-4T/tokenizer.model

Tested on Apple Silicon M4 Max (macOS 15.3) using mlx-lm as the inference
backend after conversion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>