I used an env variable to avoid creating an endless loop of init/enqueue (using an argument like clearml.queue that would be captured and forwarded to the agent)
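Roughly something like this (a minimal sketch; the variable name, project and queue names are placeholders, not my exact code):

```
import os
from clearml import Task

task = Task.init(project_name="mornag-plain-dense", task_name="training")

if not os.environ.get("ALREADY_ENQUEUED"):
    # Running locally: hand the task over to the queue instead of training here.
    # (In practice the task may need to be closed/reset first, see below.)
    Task.enqueue(task, queue_name="tesla-t4")
else:
    # Running on the agent, where ALREADY_ENQUEUED is set: skip the enqueue
    # step so the agent does not enqueue the task again in an endless loop.
    pass  # training code goes here
```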
Related GitHub issue https://github.com/allegroai/clearml/issues/847
By the way, if I create the task locally, reset it, and enqueue it, it works. This is the workaround that I'm using right now.
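In code the workaround looks roughly like this (a sketch; project and queue names are placeholders):

```
from clearml import Task

# Create the task locally, then reset and enqueue it so the agent re-runs it
# from a clean state.
task = Task.init(project_name="mornag-plain-dense", task_name="training")
task.close()    # stop the local execution
task.reset()    # clear state/outputs so the task can be executed again
Task.enqueue(task, queue_name="tesla-t4")
```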
Ok, I did some investigating and the bug appears starting from version 1.8.0. It is not present in version 1.7.0. I'll open an issue on GitHub.
I opened the issue on GitHub: https://github.com/allegroai/clearml-web/issues/46
I specified the upload destination in the logger: Logger.current_logger().set_default_upload_destination(cfg.clearml.media_uri)
Yes, MinIO with no special config. The S3 config is in the clearml.conf file.
The debug samples are correctly uploaded to the bucket (it is a MinIO bucket); I can see them from the MinIO web app. I have used logger.report_image.
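For reference, the reporting side looks roughly like this (a sketch; the destination stands in for cfg.clearml.media_uri, and the host/bucket are the ones from my config, treat them as placeholders):

```
import numpy as np
from clearml import Logger, Task

task = Task.init(project_name="mornag-plain-dense", task_name="training")

logger = Logger.current_logger()
# Same call as above, with the MinIO destination spelled out as a placeholder.
logger.set_default_upload_destination("s3://s3.myhost.tld:443/clearml")

# The debug sample is uploaded to the bucket and should show up in the
# experiment's Debug Samples tab.
image = np.random.randint(0, 255, size=(128, 128, 3), dtype=np.uint8)
logger.report_image(title="debug", series="sample", iteration=0, image=image)
```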
No problem! Thank you for your amazing work!
Also 1.9.1-312 is affected
If I print config_list from def from_config(cls, s3_configuration) in file bucket_config.py, line 121, I get: {'key': '', 'secret': '', 'region': '', 'multipart': True, 'use_credentials_chain': False, 'bucket': 'clearml', 'host': 's3.myhost.tld:443', 'token': '', 'extra_args': ConfigTree()}
just tried 1.9.1 and it is affected
Yes. It seems to be a bug in the UI, but it is weird that it went unnoticed.
This is the configuration in the webapp
The error is:
2022-11-28 14:40:17,099 - clearml.storage - ERROR - Failed creating storage object
Reason: Missing key and secret for S3 storage access ( )
Everything works fine if I force those values using the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env variables.
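i.e. something like this before the task starts (a sketch; the values are placeholders):

```
import os

# Force the MinIO credentials through the standard AWS variables so the
# storage layer picks them up even though clearml.conf is being ignored.
os.environ["AWS_ACCESS_KEY_ID"] = "<minio-access-key>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<minio-secret-key>"

from clearml import Task

task = Task.init(project_name="mornag-plain-dense", task_name="training")
```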
1.9.0 is still affected
I also save the models in the S3 bucket using output_uri=cfg.clearml.output_uri in the Task.init.
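The init call is roughly this (a sketch; the destination stands in for cfg.clearml.output_uri):

```
from clearml import Task

# output_uri makes ClearML upload model checkpoints to the bucket instead of
# keeping them on the local filesystem.
task = Task.init(
    project_name="mornag-plain-dense",
    task_name="training",
    output_uri="s3://s3.myhost.tld:443/clearml",
)
```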
This is the full print(cfg)
` {'dataset': {'name': '<my-dataset-name>', 'path': '', 'target_shape': [128, 128]}, 'model': 'unet', 'models': {'unet': {'dim': 8, 'dim_mults': [1, 2, 4], 'num_blocks_per_stage': [2, 2, 2], 'num_self_attn_per_stage': [0, 0, 1], 'nested_unet_depths': [0, 0, 0], 'nested_unet_dim': 16, 'use_convnext': False, 'resnet_groups': 2, 'consolidate_upsample_fmaps': True, 'weight_standardize': False, 'attn_heads': 2, 'attn_dim_head': 16}}, 'train': {'accelerator': 'auto...
When enqueued, the Configuration tab still shows the correct arguments.
But no argument is passed to the script. Here I am printing sys.argv.
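The check is just this (a minimal sketch):

```
import sys
from clearml import Task

task = Task.init(project_name="mornag-plain-dense", task_name="training")
# Compare what the script actually receives with the arguments shown in the
# UI's Configuration tab.
print(sys.argv)
```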
Never mind, my aws config was not under sdk.
:face_palm:
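For anyone hitting the same thing: the S3 credentials have to be nested under the sdk section of clearml.conf, roughly like this (a sketch; host and bucket match the ones printed above, key/secret are placeholders):

```
# ~/clearml.conf (relevant part only)
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "s3.myhost.tld:443"
                    bucket: "clearml"
                    key: "<minio-access-key>"
                    secret: "<minio-secret-key>"
                    multipart: false
                    secure: true
                }
            ]
        }
    }
}
```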
Yes, it uses Hydra and everything works fine without ClearML. The script is similar to this one: https://github.com/galatolofederico/lightning-template/blob/main/train.py
I tried it like this: clearml-task --script train.py --args overrides="log.clearml=True train.epochs=200 clearml.save=True" --project mornag-plain-dense --name mornag-plain-dense-training --queue tesla-t4 --skip-task-init