[ET-VK] Enable int8 tiled compute shader to be used with buffer tensors #10302

SS-JIA · 2025-04-18T20:16:14Z

Stack from ghstack (oldest at bottom):

## Context

As title. Allow the optimized int8 tiled compute shader to be used with buffer-backed tensors as well.

## Changes

* Generate buffer variants for the int8 linear tiled shader.
* Force the scales tensor to always be a buffer to reduce the number of shader variants that need to be generated.
* Generate an additional variant that computes only 1 output row.
* Do not require the number of output rows to be an exact multiple of 4 or 6 to use the tiled implementation.
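The last change above can be illustrated with a small CPU-side sketch. This is not the shader from this PR; it is a minimal Python model, under the assumption that a tiled kernel can cover a row count that is not a multiple of the tile height by rounding the tile count up and guarding each row index against the real bound. The names `ceil_div`, `linear_int8_reference`, and `linear_int8_tiled`, and the weight layout `q_weight[n][k]` with one scale per output column, are hypothetical choices made for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling division, used to round the tile count up."""
    return -(-a // b)


def linear_int8_reference(x, q_weight, scales):
    """Naive weight-only int8 linear.

    out[m][n] = scales[n] * sum_k x[m][k] * q_weight[n][k]
    """
    return [
        [
            scales[n] * sum(xv * wv for xv, wv in zip(row, q_weight[n]))
            for n in range(len(q_weight))
        ]
        for row in x
    ]


def linear_int8_tiled(x, q_weight, scales, tile_rows):
    """Same computation, but processed in tiles of `tile_rows` output rows.

    The row count M does not need to be a multiple of `tile_rows`:
    the last tile is only partially filled, and out-of-range rows
    are skipped by the `m < M` guard.
    """
    M, N = len(x), len(q_weight)
    out = [[0.0] * N for _ in range(M)]
    for tile in range(ceil_div(M, tile_rows)):
        row_start = tile * tile_rows
        for n in range(N):
            # One accumulator per row of the tile, all sharing one weight row.
            acc = [0.0] * tile_rows
            for k, w in enumerate(q_weight[n]):
                for r in range(tile_rows):
                    m = row_start + r
                    if m < M:  # guard against the partially filled last tile
                        acc[r] += x[m][k] * w
            for r in range(tile_rows):
                m = row_start + r
                if m < M:
                    out[m][n] = scales[n] * acc[r]
    return out
```

With 7 input rows, tile heights of 1, 4, and 6 all give the same result as the naive version; only the number of tiles (7, 2, and 2 respectively) and the amount of masked work in the last tile differ.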

Differential Revision: D73276277


Differential Revision: [D73276277](https://our.internmc.facebook.com/intern/diff/D73276277/)
ghstack-source-id: 279008193
Pull Request resolved: #10302

pytorch-bot · 2025-04-18T20:16:18Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10302

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

[Infra] Jobs got intermittently cancelled/fail midway checkout

❌ 4 New Failures, 6 Unrelated Failures

As of commit 191e6c4 with merge base 5b7f235:

NEW FAILURES - The following jobs have failed:

* Check Labels / Check labels (gh)
  RuntimeError: Error checking labels: PR does not have required labels
* pull / test-llama-runner-linux (fp32, xnnpack+custom+quantize_kv, linux.arm64.2xlarge, executorch-ubuntu... / linux-job (gh)
  Access to the path '/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_f3aef0fc-0b71-47b7-8d20-5239adbad071' is denied.
* pull / test-llama-runner-linux (fp32, xnnpack+quantize_kv, linux.arm64.2xlarge, executorch-ubuntu-22.04-... / linux-job (gh)
  Access to the path '/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_a51879a9-14f7-4de1-9a78-8b3b66a5145e' is denied.
* pull / test-models-linux-basic (vit, xnnpack-quantization-delegation, cmake, linux.arm64.2xlarge, execut... / linux-job (gh)
  Access to the path '/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_383d6271-68aa-4557-a7d3-9c155a049b31' is denied.

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

* pull / test-llama-runner-linux (fp32, xnnpack+custom+qe, linux.arm64.2xlarge, executorch-ubuntu-22.04-gc... / linux-job (gh) (matched linux rule in flaky-rules.json)
  The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
* pull / test-models-linux-basic (mv3, portable, buck2, linux.2xlarge, executorch-ubuntu-22.04-clang12) / linux-job (gh) (matched linux rule in flaky-rules.json)
  The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
* pull / test-pybind-build-linux / linux-job (gh) (matched linux rule in flaky-rules.json)
  The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
* pull / test-selective-build-linux / linux-job (gh) (matched linux rule in flaky-rules.json)
  The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
* pull / unittest-buck / linux / linux-job (gh) (matched linux rule in flaky-rules.json)
  The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
* pull / unittest-editable / linux / linux-job (gh) (matched linux rule in flaky-rules.json)
  The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-04-18T20:16:28Z

This pull request was exported from Phabricator. Differential Revision: D73276277

github-actions · 2025-04-18T20:17:01Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with `release notes:`.

If not, please add the `topic: not user facing` label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


facebook-github-bot · 2025-04-18T20:54:54Z

This pull request was exported from Phabricator. Differential Revision: D73276277

facebook-github-bot added the CLA Signed label Apr 18, 2025

facebook-github-bot added the fb-exported label Apr 18, 2025

SS-JIA mentioned this pull request Apr 18, 2025

[ET-VK] Add coop shader for int8 linear #10304

Open
