Run tutorials on g5 #3324

Merged
merged 1 commit into from
Apr 15, 2025

Conversation

Contributor

@svekars svekars commented Apr 15, 2025

Why we're making this change

PR authors are frustrated that tutorials using torch.compile fail on our older workers, which do not support it. As more tutorials adopt torch.compile, authors must constantly update metadata.json to ensure their tutorials run on compatible workers.

This PR moves these tutorials to G5 instances, providing a consistent environment that supports torch.compile out of the box, eliminating the need for manual configuration and improving the overall developer experience. This change should also help to make the whole tutorial build faster.
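As a rough illustration of the per-tutorial configuration described above, the following Python sketch shows how a metadata.json-style override could select a worker, and how switching the default to G5 makes the override unnecessary. The "needs" key, the tutorial paths, and the old default worker label are assumptions made for this example and are not taken from this PR; only the linux.g5.4xlarge.nvidia.gpu label appears in the PR's original title.

```python
import json

# Hypothetical metadata.json content: tutorial path -> build requirements.
# The "needs" key and both tutorial paths are assumptions for illustration.
METADATA_JSON = """
{
  "intermediate_source/uses_compile_tutorial.py": {"needs": "linux.g5.4xlarge.nvidia.gpu"},
  "beginner_source/plain_tutorial.py": {}
}
"""

OLD_DEFAULT_WORKER = "linux.4xlarge.nvidia.gpu"  # assumed label for an older worker
G5_WORKER = "linux.g5.4xlarge.nvidia.gpu"        # label from the PR's original title


def pick_worker(tutorial, metadata, default):
    """Return the worker a tutorial runs on: its override if present, else the default."""
    return metadata.get(tutorial, {}).get("needs", default)


metadata = json.loads(METADATA_JSON)
tutorials = sorted(metadata)

# With the old default, only tutorials with an explicit override reach G5.
before = {t: pick_worker(t, metadata, OLD_DEFAULT_WORKER) for t in tutorials}

# With G5 as the default, every tutorial lands on G5, override or not.
after = {t: pick_worker(t, metadata, G5_WORKER) for t in tutorials}

for t in tutorials:
    print(f"{t}: {before[t]} -> {after[t]}")
```

Under these assumptions, authors of new torch.compile tutorials would no longer need to add an entry just to reach a compatible worker.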

Cons: Fewer and fewer tutorials are actually runnable on Colab, whose free tier still offers only T4 machines.

@svekars svekars added the "build issue" label (Issues relating to the tutorials build) Apr 15, 2025

pytorch-bot bot commented Apr 15, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3324

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Cancelled Job

As of commit 218788a with merge base 548ea1c (image):

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@svekars svekars changed the title from "Run tutorials on linux.g5.4xlarge.nvidia.gpu" to "Run tutorials on g5" Apr 15, 2025
Contributor

@malfet malfet left a comment

LGTM, but perhaps it would be good to undo the logic that dispatches different tests to different nodes

@malfet malfet merged commit 16e549f into main Apr 15, 2025
19 of 20 checks passed
Labels
build issue (Issues relating to the tutorials build), cla signed
3 participants