A few corrections to the original post.
When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing was showing on the artifact tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but only the local file path is recorded; the file itself is not uploaded.
Why go into the environment variable and not just state it directly?
task = Task.init(
project_name="my_project",
task_name="my_task",
output_uri="
"
)
It doesn’t work…
task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)
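For reference, a minimal sketch of reading the variable defensively before handing it to Task.init, assuming an unset or blank value should be treated as "no destination". The helper name resolve_output_uri is hypothetical and not part of ClearML; it only wraps the os.getenv lookup shown above.

```python
import os
from typing import Mapping, Optional


def resolve_output_uri(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return CLEARML_DEFAULT_OUTPUT_URI, or None if it is unset or blank.

    Hypothetical helper (not a ClearML API). `env` defaults to os.environ
    and exists only so the function can be exercised with a plain dict.
    """
    env = os.environ if env is None else env
    value = env.get("CLEARML_DEFAULT_OUTPUT_URI", "").strip()
    return value or None


# Printing the resolved value shows in the job log which destination
# (if any) was actually picked up inside the training container.
print("output_uri =", resolve_output_uri())
```

The result can then be passed as Task.init(..., output_uri=resolve_output_uri()). This does not change the capture behaviour by itself; it only makes the value in use visible.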
Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file with the following line:
task.upload_artifact(name="config", artifact_object="config.py")
The artifact was uploaded to the file server whether or not output_uri was specified.
@<1523721697604145152:profile|YummyWhale40> are you able to manually save models from SageMaker using OutputModel?
Hi @<1523721697604145152:profile|YummyWhale40>, what if you specify the output_uri through the code in Task.init()?
Could you please try with an older SDK version, just to make sure there were no regressions?
Hmm, it seems that 1.10.2 also doesn’t work.
Manual upload works, but automatic capture of the model save does not.
Hi @<1523721697604145152:profile|YummyWhale40> ! Are you able to upload artifacts of any kind other than models to the CLEARML_DEFAULT_OUTPUT_URI?
SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28
I’ve just tried hard-coding it, but the result doesn’t change.
@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.
I've tried OutputModel both locally and on SageMaker, like this:
task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")
...
for path in Path(cfg.work_dir).glob("**/*.pth"):
    output_model.update_weights(str(path))
and got the following results in both environments:
2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_1.pth'
2024-03-01 10:45:44,593 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:44,903 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_2.pth'
2024-03-01 10:45:44,903 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:45,261 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/latest.pth'
2024-03-01 10:45:45,262 - clearml.Task - INFO - Failed model upload
Is there anything wrong with how I'm using OutputModel?
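One thing that could be checked here: the log shows relative paths such as 'outputs/epoch_1.pth' failing with "No such file or directory". Below is a rough sketch of the same loop that resolves each checkpoint to an absolute path and skips anything that is not a regular file at upload time. This is a diagnostic under assumptions, not a confirmed fix. The function name upload_checkpoints is hypothetical, and output_model is assumed to be any object with an update_weights(path) method, such as the OutputModel used above.

```python
from pathlib import Path
from typing import List


def upload_checkpoints(output_model, work_dir: str, pattern: str = "**/*.pth") -> List[Path]:
    """Upload each existing checkpoint by absolute path; return what was uploaded.

    Hypothetical helper: `output_model` only needs an `update_weights(path)`
    method, which is how OutputModel is called in the snippet above.
    """
    uploaded = []
    for path in sorted(Path(work_dir).glob(pattern)):
        resolved = path.resolve()
        if not resolved.is_file():
            # For example a dangling symlink, or a file removed after globbing.
            print(f"skipping missing checkpoint: {path}")
            continue
        # An absolute path does not depend on the process working directory.
        output_model.update_weights(str(resolved))
        uploaded.append(resolved)
    return uploaded
```

It would be called as upload_checkpoints(output_model, cfg.work_dir) in place of the original for loop. If the same error still appears with absolute paths, the relative path is probably not the cause.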