Answered
Hi, I’m trying to upload output model files (like .pth) to the ClearML server. Assume my train.py is as follows:

from clearml import Task

task = Task.init(
    project_name="my_project",
    task_name="my_task",
)

# Training code without any ClearML calls.
...

task.close()

I have already succeeded in uploading them from my EC2 instances by setting the environment variable CLEARML_DEFAULT_OUTPUT_URI:

$ CLEARML_DEFAULT_OUTPUT_URI=$CLEARML_FILES_HOST python train.py
clearml.Task - INFO - Completed model upload to 

However, I couldn’t do the same in a SageMaker Training Job.
Nothing is uploaded there, even though model saving is captured (i.e., only the file system paths are logged) and CLEARML_DEFAULT_OUTPUT_URI is set.
Note that logging is working well in both runtimes.

The main difference between EC2 and SageMaker is the initial setting.
On EC2, I set up the ClearML SDK with the clearml-init command.
On SageMaker, I can’t set it up interactively, so I configure it by passing the environment variables CLEARML_WEB_HOST, CLEARML_API_HOST, CLEARML_FILES_HOST, CLEARML_API_ACCESS_KEY, and CLEARML_API_SECRET_KEY.
I suspect that some additional setting is needed to upload models from SageMaker, but I can’t find anything relevant in the documentation.
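
A quick sanity check for the non-interactive setup described above is to confirm that every variable is actually visible inside the training container. The sketch below uses only the standard library; the helper name `missing_clearml_env` is hypothetical, and the variable list is simply the one from this post plus CLEARML_DEFAULT_OUTPUT_URI. It does not verify that the values are correct, only that they are set.

```python
import os

# Variables named in the post above. CLEARML_DEFAULT_OUTPUT_URI is included
# because it is the one expected to trigger the model upload.
REQUIRED_VARS = (
    "CLEARML_WEB_HOST",
    "CLEARML_API_HOST",
    "CLEARML_FILES_HOST",
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
    "CLEARML_DEFAULT_OUTPUT_URI",
)


def missing_clearml_env(environ=None):
    """Return the names of expected variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_clearml_env()
    print("missing:", ", ".join(missing) if missing else "none")
```

Running this at the top of train.py in both runtimes makes it easy to compare what the EC2 process and the SageMaker container actually see.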

Does anyone have any useful information? Thanks.

  
  
Posted one month ago

Answers 16


@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.

I've tried OutputModel both locally and on SageMaker, like this:

task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")

...

for path in Path(cfg.work_dir).glob("**/*.pth"):
    output_model.update_weights(str(path))

and got the following results in both environments:

2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_1.pth'
2024-03-01 10:45:44,593 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:44,903 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_2.pth'
2024-03-01 10:45:44,903 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:45,261 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/latest.pth'
2024-03-01 10:45:45,262 - clearml.Task - INFO - Failed model upload

Is there anything wrong with my usage of OutputModel?
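
The log shows relative paths such as 'outputs/epoch_1.pth' that could not be found at upload time. The thread does not establish the cause, but one way to narrow it down is to pass only absolute paths of files that exist at the moment of the call. A minimal sketch under that assumption follows; the helper `existing_checkpoints` is hypothetical and uses only the standard library.

```python
from pathlib import Path


def existing_checkpoints(work_dir, pattern="**/*.pth"):
    """Return absolute paths of checkpoint files that exist right now."""
    root = Path(work_dir).resolve()
    # is_file() follows symlinks, so a dangling link is skipped.
    return sorted(p for p in root.glob(pattern) if p.is_file())


# Hypothetical usage with the objects from the post above:
# for path in existing_checkpoints(cfg.work_dir):
#     output_model.update_weights(weights_filename=str(path))
```

If the error persists with absolute paths, the files are probably being removed or moved between the update_weights call and the actual upload, which is worth checking separately.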

  
  
Posted one month ago

@<1523721697604145152:profile|YummyWhale40> are you able to manually save models from SageMaker using OutputModel?

  
  
Posted one month ago

Why go into the environment variable and not just state it directly?

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri="
"
)
  
  
Posted one month ago

It doesn’t work…

import os

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)
  
  
Posted one month ago

Hi @<1523721697604145152:profile|YummyWhale40>, what if you specify the output_uri through the code in Task.init()?

  
  
Posted one month ago

I’ve just tried hard-coding it, but the result doesn’t change.

  
  
Posted one month ago

A few corrections to the original post.

When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing showed up on the artifact tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but only the file path is recorded and the file itself is not uploaded.

  
  
Posted one month ago

Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file with the following line:

task.upload_artifact(name="config", artifact_object="config.py")

The artifact was uploaded to the file server whether or not output_uri was specified.
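
Since manual artifact upload works in this setup, one possible stopgap is to upload each checkpoint as an artifact until automatic model capture is sorted out. The sketch below is an assumption-laden illustration, not a fix from this thread: `upload_checkpoints` is a hypothetical helper, and it takes the upload function as a parameter so it can be exercised without a ClearML server. Note that artifacts uploaded this way are stored as task artifacts, not registered as output models.

```python
from pathlib import Path
from typing import Callable, List


def upload_checkpoints(work_dir: str, upload: Callable[..., object]) -> List[str]:
    """Call upload(name=..., artifact_object=...) once per .pth file under work_dir.

    Returns the artifact names used, in upload order.
    """
    uploaded = []
    for path in sorted(Path(work_dir).resolve().glob("**/*.pth")):
        if not path.is_file():
            continue  # skip directories and dangling symlinks
        # The file stem is used as the artifact name; files with the same
        # stem in different subdirectories would collide.
        upload(name=path.stem, artifact_object=str(path))
        uploaded.append(path.stem)
    return uploaded


# Hypothetical usage with the task from the original post:
# upload_checkpoints(cfg.work_dir, task.upload_artifact)
```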

  
  
Posted one month ago

Hi @<1523721697604145152:profile|YummyWhale40> ! Are you able to upload artifacts of any kind other than models to the CLEARML_DEFAULT_OUTPUT_URI?

  
  
Posted one month ago

Could you please try an older SDK version, just to make sure there were no regressions?

  
  
Posted one month ago

SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28

  
  
Posted one month ago

What ClearML SDK version are you using?

  
  
Posted one month ago

any suggestion?

  
  
Posted one month ago

sure

  
  
Posted one month ago

Hmm, it seems that 1.10.2 doesn’t work either.
Manual upload is OK, but model save capture is not.

  
  
Posted one month ago

1.10.2 should be old enough

  
  
Posted one month ago
132 Views