Hmm, it seems that 1.10.2 also doesn’t work.
Manual upload works, but automatic capture of model saves does not.
Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file with the following line:
task.upload_artifact(name="config", artifact_object="config.py")
The artifact was uploaded to the file server with or without output_uri specification.
I’ve just tried hard-coding it, but the result doesn’t change.
It doesn’t work…
task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)
Any suggestions?
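One thing worth ruling out with the Task.init call above: os.getenv returns an empty string, not None, when the variable is set but empty, so the value that reaches output_uri may not be what is expected. A minimal sketch, assuming a hypothetical helper named resolve_output_uri (it is not part of ClearML), that normalizes the value so it can be printed and checked before the task is created:

```python
import os
from typing import Mapping, Optional

ENV_NAME = "CLEARML_DEFAULT_OUTPUT_URI"


def resolve_output_uri(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the stripped URI from the environment, or None if unset or blank.

    Hypothetical debugging helper; this is not a ClearML API.
    """
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    value = env.get(ENV_NAME, "").strip()
    # An empty or whitespace-only value is treated the same as "not set".
    return value or None


# Show exactly what would be handed to Task.init(output_uri=...).
print(f"{ENV_NAME} resolves to: {resolve_output_uri()!r}")
```

The result could then be passed as output_uri=resolve_output_uri(), which makes it easy to compare the local and SageMaker values in the logs.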
@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.
I've tried OutputModel both locally and on SageMaker, like this:
task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")
...
for path in Path(cfg.work_dir).glob("**/*.pth"):
    output_model.update_weights(str(path))
and got the following error in both environments:
2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while upl...
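To tie an upload error like this to a specific file, the checkpoint loop from the snippet above can be wrapped in a small helper. This is only a sketch: register_checkpoints is a hypothetical name, and output_model is assumed to be any object with an update_weights(path) method, such as the OutputModel created earlier. The files are sorted so the order is the same on every run.

```python
from pathlib import Path
from typing import Any, List


def register_checkpoints(output_model: Any, work_dir: str, pattern: str = "*.pth") -> List[Path]:
    """Call output_model.update_weights() for every checkpoint under work_dir.

    Hypothetical helper; returns the paths in the order they were processed.
    """
    # rglob searches work_dir recursively, like glob("**/*.pth") in the original loop.
    checkpoints = sorted(p for p in Path(work_dir).rglob(pattern) if p.is_file())
    for path in checkpoints:
        # Print before the call so a failing upload can be matched to its file.
        print(f"registering {path}")
        output_model.update_weights(str(path))
    return checkpoints
```

Usage would be register_checkpoints(output_model, cfg.work_dir) in place of the bare for loop.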
Oh, I got it. My code outputs models and the task captures them automatically.
If you have any idea how to reuse the task ID even when models are output, please tell me. Thanks!
In my case, during development I write code and run a single-batch train/val pass, which includes saving a model. I want TRAINS to overwrite those dev runs to keep the dashboard clean.
I don't mean continuous training but I want to know about your plans for it 😋
A few corrections to the original post.
When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing was showing on the artifact tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but only the file path is recorded and the file itself is not uploaded.
SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28
Maybe the arguments are simply passed to Task.init():
self._trains = Task.init(
    project_name=project_name,
    task_name=task_name,
    task_type=task_type,
    reuse_last_task_id=reuse_last_task_id,
    output_uri=output_uri,
    auto_connect_arg_parser=auto_connect_arg_parser,
    auto_connect_frameworks=auto_connect_frameworks,
    auto_resource_monitoring=auto_resource_monitoring,
)
I would like to confirm just in case.
In the desired behavior, does reuse_last_task_id=True force the task to be reused regardless of the interval between runs?