Maybe the arguments are simply passed to Task.init():
self._trains = Task.init(
    project_name=project_name,
    task_name=task_name,
    task_type=task_type,
    reuse_last_task_id=reuse_last_task_id,
    output_uri=output_uri,
    auto_connect_arg_parser=auto_connect_arg_parser,
    auto_connect_frameworks=auto_connect_frameworks,
    auto_resource_monitoring=auto_resource_monitoring,
)
A few corrections to the original post.
When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing was showing on the artifact tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but it only records the file path and does not upload the file itself.
If you have any idea how to reuse the task ID even when models are output, please tell me. Thanks!
Oh, I got it. My code outputs models and the task captures them automatically.
@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.
I've tried OutputModel both locally and on SageMaker, like this:
task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")
...
for path in Path(cfg.work_dir).glob("**/*.pth"):
    output_model.update_weights(str(path))
and got the following result in both environments:
2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while upl...
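The glob loop above visits checkpoints in whatever order the filesystem returns them. A minimal sketch, assuming one wants the most recently written .pth file to be registered last, is to sort by modification time first. `find_checkpoints` is a hypothetical helper for illustration, not part of ClearML, and it does not address the upload error itself.

```python
from pathlib import Path


def find_checkpoints(work_dir, pattern="**/*.pth"):
    """Return checkpoint files under work_dir, oldest first by modification time."""
    root = Path(work_dir)
    # Keep regular files only; a directory could also match the pattern.
    files = [p for p in root.glob(pattern) if p.is_file()]
    # Sort by mtime, then by path, so the order is stable for equal timestamps.
    return sorted(files, key=lambda p: (p.stat().st_mtime, str(p)))
```

The loop would then read `for path in find_checkpoints(cfg.work_dir): output_model.update_weights(str(path))`.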
In my case, during the development phase I write code and run a single-batch train/val pass, which includes model saving. I want TRAINS to overwrite these dev runs to keep the dashboard clean.
I would like to confirm, just in case.
In the desired behavior, does reuse_last_task_id=True
force reuse regardless of the interval between runs?
I don't mean continuous training but I want to know about your plans for it 😋
Hmm, it seems that 1.10.2 doesn’t work either.
Manual upload is OK, but model save capture is not.
SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28
Any suggestions?
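For reference, the SDK version in a report like the one above can be read from the running environment with the standard library. This is a minimal sketch; `installed_version` is a hypothetical helper name, and it assumes the SDK is installed under the distribution name `clearml`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the SDK is distributed as "clearml".
    print("SDK:", installed_version("clearml"))
```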
Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file with the following line:
task.upload_artifact(name="config", artifact_object="config.py")
The artifact was uploaded to the file server whether or not output_uri was specified.
It doesn’t work…
task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)
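Note that os.getenv returns an empty string, not None, when the variable is set but blank. A minimal sketch, assuming a blank or whitespace-only value should be treated as unset, is shown here. `resolve_output_uri` is a hypothetical helper, not a ClearML function, and it only rules out one possible cause rather than explaining why the capture fails.

```python
import os


def resolve_output_uri(env=None, var="CLEARML_DEFAULT_OUTPUT_URI"):
    """Return the stripped value of var, or None if it is unset or blank."""
    env = os.environ if env is None else env
    value = (env.get(var) or "").strip()
    return value or None


if __name__ == "__main__":
    # Print with repr() so stray whitespace or an empty string is visible.
    print(f"output_uri = {resolve_output_uri()!r}")
```

The result could then be passed as `output_uri=resolve_output_uri()` in the Task.init call.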
I’ve just tried hard-coding it, but the result doesn’t change.