A few corrections to the original post.
When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing was showing on the artifact tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but it only records the file path and does not upload the entity.
@<1523721697604145152:profile|YummyWhale40> are you able to manually save models from SageMaker using OutputModel
? None
I’ve just try hard-coding but the result doesn’t change.
Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file by the following line.
task.upload_artifact(name="config", artifact_object="config.py")
The artifact was uploaded to the file server with or without output_uri specification.
Hi @<1523721697604145152:profile|YummyWhale40> _, what if you specify the output_uri
through the code in Task.init()
?
Could you please try with an older sdk version just to make sure there were no regressions?
SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28
Hi @<1523721697604145152:profile|YummyWhale40> ! Are you able to upload artifacts of any kind other than models to the CLEARML_DEFAULT_OUTPUT_URI?
Why go into the environment variable and not just state it directly?
task = Task.init(
project_name="my_project",
task_name="my_task",
output_uri="
"
)
hmm, It seems that 1.10.2 also doesn’t work.
manual upload is ok, model save capture is not.
It doesn’t work…
task = Task.init(
project_name="my_project",
task_name="my_task",
output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)
@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.
I've tried OutputModel in local and SageMaker like:
task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")
...
for path in Path(cfg.work_dir).glob("**/*.pth"):
output_model.update_weights(str(path))
and got the results in both envs.
2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_1.pth'
2024-03-01 10:45:44,593 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:44,903 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_2.pth'
2024-03-01 10:45:44,903 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:45,261 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/latest.pth'
2024-03-01 10:45:45,262 - clearml.Task - INFO - Failed model upload
Is there anything wrong for my usage OutputModel?