Answered

Hi, I’m trying to upload output model files (like .pth) to ClearML server. Assume my train.py is as follows:

from clearml import Task

task = Task.init(
    project_name="my_project",
    task_name="my_task",
)

# Training code without any ClearML calls.
...

task.close()

I have already managed to upload them from my EC2 instances by setting the environment variable CLEARML_DEFAULT_OUTPUT_URI:

$ CLEARML_DEFAULT_OUTPUT_URI=$CLEARML_FILES_HOST python train.py
clearml.Task - INFO - Completed model upload to 

However, I couldn’t do the same in a SageMaker Training Job.
Nothing is uploaded even when the model save is captured (i.e., only the file system paths are logged) and CLEARML_DEFAULT_OUTPUT_URI is set.
Note that logging works well in both runtimes.

The main difference between EC2 and SageMaker is the initial setup.
On EC2, I set up the ClearML SDK with the clearml-init command.
On SageMaker, I can’t set it up interactively, so I configure it by passing the environment variables CLEARML_WEB_HOST, CLEARML_API_HOST, CLEARML_FILES_HOST, CLEARML_API_ACCESS_KEY, and CLEARML_API_SECRET_KEY.
I suspect that some additional setting is needed to upload models from SageMaker, but I can’t find anything relevant in the documentation.

Does anyone have any useful information? Thanks.
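Since the SageMaker setup above relies entirely on environment variables, a quick check inside the container can rule out a missing or empty value before Task.init runs. The following is only a minimal sketch: the variable names come from the post, while the helper names missing_clearml_vars and describe_clearml_env are hypothetical and not part of the ClearML SDK.

```python
import os

# Variable names are the ones listed in the post above.
REQUIRED_VARS = (
    "CLEARML_WEB_HOST",
    "CLEARML_API_HOST",
    "CLEARML_FILES_HOST",
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
)
OPTIONAL_VARS = ("CLEARML_DEFAULT_OUTPUT_URI",)


def missing_clearml_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


def describe_clearml_env(environ=None):
    """Report which variables are set, without exposing their values."""
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in REQUIRED_VARS + OPTIONAL_VARS}
```

Printing missing_clearml_vars() and describe_clearml_env() at the top of train.py in both runtimes would show whether the two environments actually differ, without writing the secret key to the job log.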

  
  
Posted 10 months ago

Answers 16


A few corrections to the original post.

When I set CLEARML_DEFAULT_OUTPUT_URI on SageMaker, the model save was not captured and nothing was showing on the artifact tab.
If CLEARML_DEFAULT_OUTPUT_URI is not set, the model save is captured, but only the file path is recorded and the file itself is not uploaded.

  
  
Posted 10 months ago

Why go into the environment variable and not just state it directly?

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri="
"
)
  
  
Posted 10 months ago

It doesn’t work…

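import os

from clearml import Task

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    # Read the upload destination from the environment; None if it is unset.
    output_uri=os.getenv("CLEARML_DEFAULT_OUTPUT_URI", None),
)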
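To see which destination the code above actually receives, the lookup can be made explicit and logged. This is a sketch under assumptions: resolve_output_uri is a hypothetical helper, and the fallback to CLEARML_FILES_HOST simply mirrors the EC2 command from the question rather than any documented SDK behaviour.

```python
import os


def resolve_output_uri(environ=None):
    """Pick an upload destination from the environment, or None if unset.

    Prefers CLEARML_DEFAULT_OUTPUT_URI and falls back to CLEARML_FILES_HOST,
    as in the EC2 command `CLEARML_DEFAULT_OUTPUT_URI=$CLEARML_FILES_HOST`.
    """
    env = os.environ if environ is None else environ
    for name in ("CLEARML_DEFAULT_OUTPUT_URI", "CLEARML_FILES_HOST"):
        # Treat empty or whitespace-only values the same as unset.
        value = (env.get(name) or "").strip()
        if value:
            return value
    return None
```

Printing the result before passing it as output_uri to Task.init would confirm whether SageMaker hands the script an empty value.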
  
  
Posted 10 months ago

Hi @<1523701435869433856:profile|SmugDolphin23> 👋
Yes, I can upload a Python file with the following line:

task.upload_artifact(name="config", artifact_object="config.py")

The artifact was uploaded to the file server with or without output_uri specification.

  
  
Posted 10 months ago

@<1523721697604145152:profile|YummyWhale40> are you able to manually save models from SageMaker using OutputModel?

  
  
Posted 10 months ago

Hi @<1523721697604145152:profile|YummyWhale40>, what if you specify the output_uri through the code in Task.init()?

  
  
Posted 10 months ago

Could you please try with an older sdk version just to make sure there were no regressions?

  
  
Posted 10 months ago

Hmm, it seems that 1.10.2 doesn’t work either.
Manual upload works, but model save capture does not.

  
  
Posted 10 months ago

What clearml sdk version are you using?

  
  
Posted 10 months ago

1.10.2 should be old enough

  
  
Posted 10 months ago

Hi @<1523721697604145152:profile|YummyWhale40> ! Are you able to upload artifacts of any kind other than models to the CLEARML_DEFAULT_OUTPUT_URI?

  
  
Posted 10 months ago

sure

  
  
Posted 10 months ago

any suggestion?

  
  
Posted 10 months ago

SDK: 1.14.1
WebApp: 1.14.0-431
Server: 1.14.0-431
API: 2.28

  
  
Posted 10 months ago

I’ve just tried hard-coding it, but the result doesn’t change.

  
  
Posted 10 months ago

@<1523701435869433856:profile|SmugDolphin23> Sorry for my late response.

I've tried OutputModel both locally and on SageMaker, like this:

from pathlib import Path

from clearml import OutputModel, Task

task = Task(...)
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri="my file server URI")

...

# Upload every checkpoint found under the work directory.
for path in Path(cfg.work_dir).glob("**/*.pth"):
    output_model.update_weights(str(path))

and got the following results in both environments:

2024-03-01 10:45:44,592 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_1.pth'
2024-03-01 10:45:44,593 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:44,903 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/epoch_2.pth'
2024-03-01 10:45:44,903 - clearml.Task - INFO - Failed model upload
2024-03-01 10:45:45,261 - clearml.storage - ERROR - Exception encountered while uploading [Errno 2] No such file or directory: 'outputs/latest.pth'
2024-03-01 10:45:45,262 - clearml.Task - INFO - Failed model upload

Is there anything wrong with how I'm using OutputModel?
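The log above reports that relative paths such as 'outputs/epoch_1.pth' could not be found at upload time. One thing that might be worth trying is to pass absolute paths and skip anything that is not a regular file at the moment of the call. This is a sketch only: existing_checkpoints is a hypothetical helper, and whether it removes the error here is an untested assumption.

```python
from pathlib import Path


def existing_checkpoints(work_dir, pattern="**/*.pth"):
    """Return sorted absolute paths of checkpoint files that exist right now."""
    # Resolve the root first so results do not depend on the working directory.
    root = Path(work_dir).resolve()
    return sorted(p.resolve() for p in root.glob(pattern) if p.is_file())
```

In the loop above, this would replace the glob call, i.e. iterating over existing_checkpoints(cfg.work_dir) and passing str(path) to output_model.update_weights as before.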

  
  
Posted 10 months ago