VLLM Metal Engine Error (ValueError: Model type qwen3_5 not supported when running Qwen3.5-4B-MLX-4bit via docker model run) #767

@emi-dm

Description:
When attempting to run the Qwen3.5-4B-MLX-4bit model from the MLX community using the docker model run command, the model fails to preload. The underlying vllm-metal runner crashes with a ValueError indicating that the qwen3_5 model type is not supported by mlx_lm.

Steps to Reproduce:

  1. Run the following command in the terminal:
    docker model run hf.co/mlx-community/Qwen3.5-4B-MLX-4bit
  2. Enter a prompt (e.g., > Hola).
  3. Observe the crash and the traceback.

Expected Behavior:
The model should load successfully and generate a response to the prompt.

Actual Behavior / Error Logs:
The background model preload fails with a status 500. Here is the traceback:

background model preload failed: preload failed: status=500 body=unable to load runner: error waiting for runner to be ready: vllm-metal terminated unexpectedly: vllm-metal failed:
packages/vllm_metal/server.py", line 288, in create_engine
    runner.load_model()
  File "/Users/emi/.docker/model-runner/vllm-metal/lib/python3.12/site-packages/vllm_metal/model_runner.py", line 51, in load_model
    self.model, self.tokenizer = mlx_load(
                                 ^^^^^^^^^
  File "/Users/emi/.docker/model-runner/vllm-metal/lib/python3.12/site-packages/mlx_lm/utils.py", line 322, in load
    model, config = load_model(model_path, lazy, model_config=model_config)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/emi/.docker/model-runner/vllm-metal/lib/python3.12/site-packages/mlx_lm/utils.py", line 201, in load_model
    model_class, model_args_class = get_model_classes(config=config)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/emi/.docker/model-runner/vllm-metal/lib/python3.12/site-packages/mlx_lm/utils.py", line 75, in _get_classes
    raise ValueError(msg)
ValueError: Model type qwen3_5 not supported.

Environment details:

  • OS: macOS on Apple Silicon (M3)
  • Command line tool: Docker Desktop local AI features (docker model CLI)

Additional Context:
It appears the version of mlx_lm bundled with the Docker model-runner (vllm-metal) is outdated and lacks support for the qwen3_5 architecture, which was added in a more recent release of mlx_lm.
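For context on why this surfaces as "Model type qwen3_5 not supported": the traceback points at mlx_lm's _get_classes, which appears to resolve each model architecture by dynamically importing a module named after the config's model_type (e.g. mlx_lm.models.qwen3_5), so an mlx_lm build that predates the architecture simply has no such module. The sketch below is a hypothetical illustration of that dispatch pattern, not mlx_lm's actual code; the function name get_model_module and the default package path are assumptions.

```python
import importlib


def get_model_module(model_type: str, package: str = "mlx_lm.models"):
    """Resolve a model architecture module the way mlx_lm seems to:
    import <package>.<model_type>; a missing module means the installed
    version does not support that architecture (hypothetical sketch)."""
    try:
        return importlib.import_module(f"{package}.{model_type}")
    except ImportError:
        # Mirrors the error reported in the traceback above.
        raise ValueError(f"Model type {model_type} not supported.")
```

Under this assumption, upgrading the bundled mlx_lm (so that a qwen3_5 module exists in its models package) should be enough to fix the preload, with no change needed in vllm-metal itself.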
