fix: jumpstart estimator for gated uncompressed training #5175

evakravi · 2025-05-13T15:21:22Z

Issue #, if available:

Description of changes:

This PR fixes an issue where JumpStartEstimator did not include the model channel when invoking CreateTrainingJob.
This PR also adds support for an accept_eula field in the base Estimator class and sets the corresponding low-level field accordingly.
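
For context, here is a minimal sketch of what a model channel entry in the CreateTrainingJob InputDataConfig might look like once the EULA is accepted. The helper name build_model_channel, the channel name "model", and the S3 URI are illustrative assumptions, not code from this PR; only the request field names (ChannelName, DataSource, S3DataSource, ModelAccessConfig, AcceptEula) come from the CreateTrainingJob API.

```python
from typing import Optional


def build_model_channel(s3_uri: str, accept_eula: Optional[bool] = None) -> dict:
    """Build a hypothetical 'model' channel entry for InputDataConfig.

    This is an illustration of the request shape only; it is not the
    SDK's implementation.
    """
    s3_data_source = {
        "S3DataType": "S3Prefix",
        "S3Uri": s3_uri,
        "S3DataDistributionType": "FullyReplicated",
    }
    # Only attach ModelAccessConfig when the caller made an explicit choice.
    if accept_eula is not None:
        s3_data_source["ModelAccessConfig"] = {"AcceptEula": accept_eula}
    return {
        "ChannelName": "model",  # assumed channel name for model artifacts
        "DataSource": {"S3DataSource": s3_data_source},
    }


# Placeholder URI, for illustration only.
channel = build_model_channel("s3://example-bucket/example-model/", accept_eula=True)
```

Leaving accept_eula as None omits ModelAccessConfig entirely, which keeps the request unchanged for models that are not gated.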

Testing done:

Unit tests added and updated
Integration tests pass
Tested workflow locally
TO-DO: Add integration test for gated uncompressed training once artifacts and metadata are in production

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

I have read the CONTRIBUTING doc
I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
I used the commit message format described in CONTRIBUTING
I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

I have added tests that prove my fix is effective or that my feature works (if appropriate)
I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
I have checked that my tests are not configured for a specific region or account (if appropriate)
I have used unique_name_from_base to create resource names in integ tests (if appropriate)
If adding any dependency in requirements.txt files, I have spell checked and ensured they exist in PyPi

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

benieric · 2025-05-14T02:58:38Z

src/sagemaker/estimator.py

+                data_source = train_args["InputDataConfig"][idx]["DataSource"]
+                if "S3DataSource" in data_source:
+                    s3_data_source = data_source["S3DataSource"]
+                    if "ModelAccessConfig" not in s3_data_source:


Who sets this ModelAccessConfig? Is it part of the default artifacts set by JumpStart, or is it something the user would explicitly add to their inputs?

evakravi · 2025-05-15

It's set by us when the customer passes accept_eula=True.
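
The behavior discussed in this thread can be sketched roughly as follows. This is a hedged reconstruction from the diff fragment above, not the method added in the PR: the function name, the model_channel_name parameter, the "model" default, and the choice to leave an existing ModelAccessConfig untouched are all assumptions.

```python
import copy


def set_accept_eula_for_model_channel(train_args, accept_eula, model_channel_name="model"):
    """Return train_args with AcceptEula attached to the model channel.

    Hypothetical sketch: only the channel named model_channel_name is
    modified, and only if it has an S3DataSource without a
    ModelAccessConfig already present.
    """
    if accept_eula is None:
        # Nothing requested, so return the arguments unchanged.
        return train_args

    updated = copy.deepcopy(train_args)  # avoid mutating the caller's dict
    for channel in updated.get("InputDataConfig", []):
        if channel.get("ChannelName") != model_channel_name:
            continue
        s3_data_source = channel.get("DataSource", {}).get("S3DataSource")
        if s3_data_source is None:
            continue
        if "ModelAccessConfig" not in s3_data_source:
            s3_data_source["ModelAccessConfig"] = {"AcceptEula": accept_eula}
    return updated


# Placeholder training arguments, for illustration only.
train_args = {
    "InputDataConfig": [
        {"ChannelName": "training",
         "DataSource": {"S3DataSource": {"S3Uri": "s3://example-bucket/data/"}}},
        {"ChannelName": "model",
         "DataSource": {"S3DataSource": {"S3Uri": "s3://example-bucket/model/"}}},
    ]
}
patched = set_accept_eula_for_model_channel(train_args, True)
```

In this sketch the training data channel is left alone, which matches the intent of the later commit "chore: only attach eula for model channel".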

bencrabtree · 2025-05-16T14:51:20Z

src/sagemaker/estimator.py

@@ -2674,6 +2695,36 @@ def _add_spot_checkpoint_args(cls, local_mode, estimator, train_args):
                raise ValueError("Setting checkpoint_local_path is not supported in local mode.")
            train_args["checkpoint_local_path"] = estimator.checkpoint_local_path

+    @classmethod
+    def _set_accept_eula_for_model_channel_input_data_config(cls, train_args, accept_eula):


Quick question: I know @Narrohag added some code that does something very similar. I want to make sure we're not doing the same work twice.

Narrohag · 2025-05-16

Yeah, I'm not fully following why this extra accept_eula logic is needed. For reference, Evan, this is where I added most of the model access config logic: https://github.com/aws/sagemaker-python-sdk/pull/5070/files. I'll take a closer look in a bit, and maybe we can chat about it today.

fix: jumpstart estimator for gated uncompressed training

8ae6835

evakravi requested a review from a team as a code owner May 13, 2025 15:21

evakravi requested a review from rsareddy0329 May 13, 2025 15:21

evakravi temporarily deployed to auto-approve May 13, 2025 15:21 — with GitHub Actions Inactive

fix: optional accept_eula arg

68202a7

evakravi temporarily deployed to auto-approve May 13, 2025 16:23 — with GitHub Actions Inactive

fix: unit tests

e9f2f65

evakravi temporarily deployed to auto-approve May 14, 2025 00:24 — with GitHub Actions Inactive

fix: unit tests

5de1e55

evakravi temporarily deployed to auto-approve May 14, 2025 02:33 — with GitHub Actions Inactive

benieric previously approved these changes May 14, 2025

fix: support legacy training models, fix cache override for unsupported files

402fbc9

evakravi dismissed benieric’s stale review via 402fbc9 May 14, 2025 16:23

evakravi temporarily deployed to auto-approve May 14, 2025 16:24 — with GitHub Actions Inactive

fix: unit tests

c9ceff2

evakravi temporarily deployed to auto-approve May 14, 2025 19:18 — with GitHub Actions Inactive

Merge branch 'master' into fix/jumpstart-estimator-gated-uncompressed-training

2830f39

evakravi temporarily deployed to auto-approve May 15, 2025 14:00 — with GitHub Actions Inactive

chore: add unit test + minor fix

4d45973

evakravi temporarily deployed to auto-approve May 15, 2025 15:03 — with GitHub Actions Inactive

chore: only attach eula for model channel

f26a170

evakravi temporarily deployed to auto-approve May 15, 2025 16:20 — with GitHub Actions Inactive

chore: undo changes to serverless_inference_config_dict

b1ffab2

evakravi temporarily deployed to auto-approve May 15, 2025 18:23 — with GitHub Actions Inactive

chore: cleanup unit tests

a7ff61d

evakravi temporarily deployed to auto-approve May 15, 2025 18:30 — with GitHub Actions Inactive

fix: unit tests

4c46f01

evakravi temporarily deployed to auto-approve May 15, 2025 19:09 — with GitHub Actions Inactive

Merge branch 'master' into fix/jumpstart-estimator-gated-uncompressed-training

87d752f

evakravi temporarily deployed to auto-approve May 15, 2025 21:06 — with GitHub Actions Inactive

bencrabtree reviewed May 16, 2025

evakravi closed this May 16, 2025


benieric May 14, 2025

evakravi May 15, 2025

bencrabtree May 16, 2025

Narrohag May 16, 2025
