Answered
Hi, I’m trying to upload output model files (like .pth) to ClearML server. Assume my

Hi, I’m trying to upload output model files (like .pth) to ClearML server. Assume my train.py is as follows:

from clearml import Task

task = Task.init(
    project_name="my_project",
    task_name="my_task",
)

# Training code without any ClearML calls.
...

task.close()

I have already succeeded in uploading them from my EC2 instances by setting the environment variable CLEARML_DEFAULT_OUTPUT_URI:

$ CLEARML_DEFAULT_OUTPUT_URI=$CLEARML_FILES_HOST python train.py
clearml.Task - INFO - Completed model upload to 

However, I couldn’t do the same in a SageMaker Training Job.
Nothing is uploaded, even when the model save is captured (i.e., only the file system paths are logged) and CLEARML_DEFAULT_OUTPUT_URI is set.
Note that logging works well in both runtimes.

The main difference between EC2 and SageMaker is the initial setup.
In EC2, I set up the ClearML SDK with the clearml-init command.
In SageMaker, I can’t set it up interactively, so I configure it by passing the environment variables CLEARML_WEB_HOST, CLEARML_API_HOST, CLEARML_FILES_HOST, CLEARML_API_ACCESS_KEY, and CLEARML_API_SECRET_KEY.
I suspect that an additional setting is needed to upload models in SageMaker, but I can’t find anything relevant in the documentation.

Does anyone have any useful information? Thanks.
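
For readers reproducing the non-interactive setup described above, here is a minimal sketch of a pre-flight check that the five environment variables named in the question are present before Task.init() runs. The helper missing_clearml_env is hypothetical and not part of the ClearML SDK; it only inspects the environment and says nothing about whether the values are valid.

```python
import os

# Variable names are taken from the question; the helper itself is hypothetical.
REQUIRED_CLEARML_VARS = (
    "CLEARML_WEB_HOST",
    "CLEARML_API_HOST",
    "CLEARML_FILES_HOST",
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
)


def missing_clearml_env(env=None):
    """Return the required ClearML variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_CLEARML_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_clearml_env()
    if missing:
        print("Missing ClearML settings:", ", ".join(missing))
    # CLEARML_DEFAULT_OUTPUT_URI is optional; print it to see what the job receives.
    print("CLEARML_DEFAULT_OUTPUT_URI =", os.environ.get("CLEARML_DEFAULT_OUTPUT_URI"))
```

Running this at the top of train.py inside the training job would at least confirm that the container sees the same settings as the EC2 run.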

  
  
Posted 2 months ago

Answers 16


What ClearML SDK version are you using?

  
  
Posted 2 months ago

A few corrections to the original post.

When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing showed up in the artifacts tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but only the file path is recorded and the file itself is not uploaded.

  
  
Posted 2 months ago

SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28

  
  
Posted 2 months ago

@<1523721697604145152:profile|YummyWhale40> are you able to manually save models from SageMaker using OutputModel?

  
  
Posted 2 months ago

Hi @<1523721697604145152:profile|YummyWhale40>, what if you specify the output_uri through the code in Task.init()?

  
  
Posted 2 months ago

It doesn’t work…

import os

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)
  
  
Posted 2 months ago

1.10.2 should be old enough

  
  
Posted 2 months ago

sure

  
  
Posted 2 months ago

Hi @<1523721697604145152:profile|YummyWhale40> ! Are you able to upload artifacts of any kind other than models to the CLEARML_DEFAULT_OUTPUT_URI?

  
  
Posted 2 months ago

Could you please try with an older SDK version, just to make sure there were no regressions?

  
  
Posted 2 months ago

@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.

I've tried OutputModel both locally and in SageMaker, like this:

from pathlib import Path

from clearml import OutputModel, Task

task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")

...

# Register every checkpoint found under the working directory.
for path in Path(cfg.work_dir).glob("**/*.pth"):
    output_model.update_weights(str(path))

and got the following results in both environments:

2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_1.pth'
2024-03-01 10:45:44,593 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:44,903 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_2.pth'
2024-03-01 10:45:44,903 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:45,261 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/latest.pth'
2024-03-01 10:45:45,262 - clearml.Task - INFO - Failed model upload

Is there anything wrong with my usage of OutputModel?
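
The [Errno 2] lines in the log above say that the relative path outputs/epoch_1.pth could not be found when the upload ran. One thing worth ruling out (an assumption, not a confirmed cause) is that the relative path is resolved against a different working directory, or that the file no longer exists, by the time the upload starts. A minimal sketch that resolves each checkpoint to an absolute path and skips anything that is not an existing file; resolve_checkpoints is a hypothetical helper, not a ClearML function.

```python
from pathlib import Path


def resolve_checkpoints(work_dir, pattern="**/*.pth"):
    """Return sorted absolute paths of existing checkpoint files under work_dir."""
    root = Path(work_dir).resolve()
    return sorted(p.resolve() for p in root.glob(pattern) if p.is_file())


# Possible use with the snippet from the post (cfg and output_model come from there):
#
#     for path in resolve_checkpoints(cfg.work_dir):
#         output_model.update_weights(str(path))
```

If the absolute paths upload correctly, the working directory is the likely culprit; if they fail in the same way, the files are probably being removed or overwritten before the upload completes.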

  
  
Posted 2 months ago

Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file with the following line:

task.upload_artifact(name="config", artifact_object="config.py")

The artifact was uploaded to the file server whether or not output_uri was specified.
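
Since upload_artifact works in SageMaker according to this reply, one possible stopgap is to upload the checkpoints as plain artifacts at the end of training. This is only a sketch and does not fix the automatic model capture. The helper upload_checkpoints_as_artifacts is hypothetical; it takes the upload function as a parameter so that it can be exercised without a ClearML server, and in real use you would pass task.upload_artifact.

```python
from pathlib import Path


def upload_checkpoints_as_artifacts(upload_artifact, work_dir, pattern="**/*.pth"):
    """Call upload_artifact(name=..., artifact_object=...) for each checkpoint.

    upload_artifact is expected to accept the same keyword arguments as the
    task.upload_artifact call shown in the thread.
    Returns the list of artifact names that were submitted.
    """
    root = Path(work_dir).resolve()
    names = []
    for path in sorted(root.glob(pattern)):
        if not path.is_file():
            continue
        # Build a slash-free name from the path relative to work_dir,
        # e.g. outputs/epoch_1.pth becomes "outputs_epoch_1.pth".
        name = "_".join(path.relative_to(root).parts)
        upload_artifact(name=name, artifact_object=str(path))
        names.append(name)
    return names
```

For example, calling upload_checkpoints_as_artifacts(task.upload_artifact, cfg.work_dir) just before task.close() would submit every .pth file under the working directory.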

  
  
Posted 2 months ago

I’ve just tried hard-coding it, but the result doesn’t change.

  
  
Posted 2 months ago

Any suggestions?

  
  
Posted 2 months ago

Why go into the environment variable and not just state it directly?

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri="
"
)
  
  
Posted 2 months ago

Hmm, it seems that 1.10.2 doesn’t work either.
Manual upload is OK; model save capture is not.

  
  
Posted 2 months ago