Add LoRA multimethod export to CoreML static LLM export#18347

Open
lucylq wants to merge 15 commits into gh/lucylq/144/head from gh/lucylq/146/head

Conversation


@lucylq lucylq commented Mar 19, 2026

Add --adapter CLI for exporting LoRA adapters as separate methods in
a CoreML PTE. CoreML POSITIONAL weight sharing deduplicates base weights
across methods. Supports combination with --multifunction for
decode/prefill variants per adapter.

Authored with Claude.
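As a rough sketch of the method layout this produces, assuming one exported program per adapter (the `_export_model` stub and its return value are hypothetical; the real exporter lowers each model to an ExportedProgram):

```python
# Sketch only: base weights are deduplicated across methods by CoreML
# POSITIONAL weight sharing; each adapter becomes its own named method.
def _export_model(model, example_inputs, tag):
    # Hypothetical stub standing in for the real torch.export lowering.
    return f"exported({tag})"

def build_methods(model, example_inputs, lora_models):
    # "forward" is the base model; each LoRA adapter gets a method
    # keyed by its adapter name.
    methods = {"forward": _export_model(model, example_inputs, "base")}
    for name, lora_model in lora_models.items():
        methods[name] = _export_model(lora_model, example_inputs, name)
    return methods

methods = build_methods("base_model", (), {"adapter_a": None, "adapter_b": None})
print(sorted(methods))  # ['adapter_a', 'adapter_b', 'forward']
```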

lucylq added 2 commits March 19, 2026 14:46
[ghstack-poisoned]
[ghstack-poisoned]

pytorch-bot bot commented Mar 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18347

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit 7ef0005 with merge base dd7464a:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

lucylq added a commit that referenced this pull request Mar 19, 2026
Add --adapter CLI for exporting LoRA adapters as separate methods in
a CoreML PTE. CoreML POSITIONAL weight sharing deduplicates base weights
across methods. Supports combination with --multifunction for
decode/prefill variants per adapter.

Authored with Claude.


ghstack-source-id: 16ab852
ghstack-comment-id: 4093498502
Pull-Request: #18347
@meta-cla bot added the CLA Signed label (managed by the Facebook bot; authors must sign the CLA before a PR can be reviewed) Mar 19, 2026
lucylq added 2 commits March 19, 2026 14:50
[ghstack-poisoned]
[ghstack-poisoned]
lucylq added a commit that referenced this pull request Mar 19, 2026
ghstack-source-id: eaac058
lucylq added 2 commits March 19, 2026 14:53
[ghstack-poisoned]
[ghstack-poisoned]
lucylq added a commit that referenced this pull request Mar 19, 2026
ghstack-source-id: d3fc801
lucylq added 2 commits March 19, 2026 15:00
[ghstack-poisoned]
[ghstack-poisoned]
lucylq added a commit that referenced this pull request Mar 19, 2026
ghstack-source-id: 54b25aa
lucylq added 2 commits March 19, 2026 16:39
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
    "forward": _export_model(model, example_inputs, "base"),
}
for name, lora_model in lora_models.items():
    methods[name] = _export_model(lora_model, example_inputs, name)
Add a method for each LoRA adapter.

[ghstack-poisoned]
methods[f"{name}_forward"] = _export_model(
    lora_model, decode_inputs, f"{name} decode"
)
methods[f"{name}_prefill"] = _export_model(
@lucylq commented Mar 20, 2026

Add methods for each LoRA with separate prefill and decode.

Not sure if this is how we want to do it, though.
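A minimal sketch of the naming scheme under --multifunction, where each adapter contributes a decode and a prefill entry point (the stub exporter and input placeholders are assumptions, not the real lowering):

```python
# Sketch only: per-adapter decode/prefill variants under --multifunction.
def _export_model(model, inputs, tag):
    # Hypothetical stub standing in for the real export call.
    return f"exported({tag})"

def build_multifunction_methods(lora_models, decode_inputs, prefill_inputs):
    methods = {}
    for name, lora_model in lora_models.items():
        # Decode variant keeps the conventional "_forward" suffix;
        # prefill gets its own "_prefill" method.
        methods[f"{name}_forward"] = _export_model(
            lora_model, decode_inputs, f"{name} decode"
        )
        methods[f"{name}_prefill"] = _export_model(
            lora_model, prefill_inputs, f"{name} prefill"
        )
    return methods

m = build_multifunction_methods({"adapter_a": None}, (), ())
print(sorted(m))  # ['adapter_a_forward', 'adapter_a_prefill']
```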

[ghstack-poisoned]
[ghstack-poisoned]
constant_methods=constant_methods,
compile_config=edge_compile_config,

if has_adapters:
    constant_methods["has_lora"] = True

Not sure if we should add a constant method like this here. If we do add one, it should probably be more granular, like a list of LoRA method names.
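A sketch of both options under discussion, assuming a plain dict of constant methods (the `lora_methods` key and `make_constant_methods` helper are hypothetical names for illustration):

```python
# Sketch only: boolean flag vs. a granular list of LoRA method names.
def make_constant_methods(methods, has_adapters):
    constant_methods = {}
    if has_adapters:
        # Option 1: simple flag, as in the diff above.
        constant_methods["has_lora"] = True
        # Option 2 (more granular): record the adapter method names so
        # the runtime can enumerate them without probing the program.
        constant_methods["lora_methods"] = [
            n for n in methods if n != "forward"
        ]
    return constant_methods

cm = make_constant_methods({"forward": None, "adapter_a": None}, True)
print(cm)  # {'has_lora': True, 'lora_methods': ['adapter_a']}
```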

[ghstack-poisoned]
@lucylq lucylq changed the base branch from gh/lucylq/145/head to gh/lucylq/144/head March 20, 2026 22:52
        return False
    return isinstance(m, nn.Linear)

linear_filter = _exclude_lora if has_lora_modules else None
@lucylq commented Mar 20, 2026

Exclude LoRA from quantization for now; this should probably be a config.


Actually, I think we can keep it in, and exclude it later if necessary?
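A minimal sketch of such a filter over fully qualified module names (the real diff checks `isinstance(m, nn.Linear)` on live modules; `make_linear_filter` and its signature here are assumptions for illustration):

```python
# Sketch only: quantize plain linear layers, skip LoRA submodules.
def make_linear_filter(has_lora_modules):
    def _exclude_lora(fqn, is_linear):
        # Skip anything under a LoRA submodule (e.g. "...wq.lora_a").
        if "lora" in fqn:
            return False
        return is_linear
    # No LoRA present: no custom filter needed.
    return _exclude_lora if has_lora_modules else None

f = make_linear_filter(True)
print(f("layers.0.attn.wq", True))         # True
print(f("layers.0.attn.wq.lora_a", True))  # False
```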
