Windows wheels [A]: fix XNNPACK test hang by disabling pthreadpool on Windows#18372

Draft
manuelcandales wants to merge 4 commits into main from manuel/windows-wheels-fix-A

@manuelcandales
Contributor

pthreadpool's condvar-based synchronization on Windows can deadlock with multiple threads due to a lost-wakeup bug where signal_num_recruited_threads uses cnd_signal on a condition variable shared between two different wait conditions. Return nullptr from get_pthreadpool() on Windows so XNNPACK runs single-threaded.
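A minimal sketch of the platform guard described above, not the actual ExecuTorch change. `FakeThreadpool`, `get_threadpool_sketch()`, and `effective_threads()` are hypothetical names introduced for illustration; the real `get_pthreadpool()` and its return type are not shown in this PR text. The sketch assumes the caller treats a null pool as "run on the calling thread".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the real thread pool handle type.
struct FakeThreadpool {
  std::size_t num_threads;
};

// Hypothetical analogue of get_pthreadpool(): on Windows it returns nullptr,
// so callers fall back to single-threaded execution.
inline FakeThreadpool* get_threadpool_sketch() {
#if defined(_WIN32)
  return nullptr;
#else
  static FakeThreadpool pool{4};  // thread count chosen arbitrarily for the sketch
  return &pool;
#endif
}

// Assumed caller-side convention: a null pool means one thread.
inline std::size_t effective_threads(const FakeThreadpool* pool) {
  return pool == nullptr ? 1 : pool->num_threads;
}
```

With this shape, the trade-off is explicit: Windows builds give up parallelism inside XNNPACK in exchange for never entering the code path that can deadlock.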

The previous fix (setting sslBackend in pre_build_script.sh) only
applied to nested tokenizer submodules. The top-level submodule
checkout still used schannel via the reusable workflow's
`submodules: true`, causing SEC_E_ILLEGAL_MESSAGE errors when
cloning from git.gitlab.arm.com.

Move all submodule initialization into the pre-build script where
we can control the SSL backend, and disable submodule checkout in
the workflow.

Move submodule initialization above the aarch64 sed workaround so
the file it edits is guaranteed to exist even if the caller disables
submodule checkout. Also remove the redundant UNAME_S assignment
later in the script.

The default 60-minute timeout from pytorch/test-infra is too tight for
the Windows wheel build + smoke test, causing jobs to be cancelled.

…dows

pthreadpool's condvar-based synchronization on Windows can deadlock
with multiple threads due to a lost-wakeup bug where
signal_num_recruited_threads uses cnd_signal on a condition variable
shared between two different wait conditions. Return nullptr from
get_pthreadpool() on Windows so XNNPACK runs single-threaded.
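
The hazard named in this commit message can be illustrated with a small standalone sketch. This is not pthreadpool's code: `SharedState` and `run_broadcast_demo()` are hypothetical, and `std::condition_variable` stands in for the C11-style `cnd_signal` API mentioned above. Two waiters share one condition variable but wait on different predicates. Waking a single waiter may pick the one whose predicate is still false; it goes back to sleep and the intended waiter is never woken. Broadcasting is one common way to avoid that, and is what the sketch does.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical state: one condition variable shared by two wait conditions.
struct SharedState {
  std::mutex m;
  std::condition_variable cv;
  bool work_ready = false;  // predicate for waiter A
  bool shutdown = false;    // predicate for waiter B
  int woken = 0;            // number of waiters that finished
};

// Returns how many waiters completed; uses notify_all() so that every
// waiter rechecks its own predicate after each state change.
inline int run_broadcast_demo() {
  SharedState s;
  std::thread a([&] {
    std::unique_lock<std::mutex> lock(s.m);
    s.cv.wait(lock, [&] { return s.work_ready; });
    ++s.woken;
  });
  std::thread b([&] {
    std::unique_lock<std::mutex> lock(s.m);
    s.cv.wait(lock, [&] { return s.shutdown; });
    ++s.woken;
  });
  {
    std::lock_guard<std::mutex> lock(s.m);
    s.work_ready = true;
  }
  // notify_one() here could wake waiter B, which would see shutdown == false
  // and sleep again, leaving waiter A blocked: the lost wakeup.
  s.cv.notify_all();
  {
    std::lock_guard<std::mutex> lock(s.m);
    s.shutdown = true;
  }
  s.cv.notify_all();
  a.join();
  b.join();
  return s.woken;
}
```

Calling `run_broadcast_demo()` joins both threads and returns the count of completed waiters. The PR itself does not change the signalling; it sidesteps the problem by not using the pool on Windows.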
@pytorch-bot

pytorch-bot bot commented Mar 20, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18372

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures

As of commit 67b4c8d with merge base 94e9ca6:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Mar 20, 2026
@manuelcandales manuelcandales force-pushed the manuel/build-windows-wheels-fix-2 branch from 8033f32 to 77989a2 March 20, 2026 20:15
Base automatically changed from manuel/build-windows-wheels-fix-2 to main March 20, 2026 20:19
Labels

ciflow/binaries, CLA Signed