Yes, it uses Hydra and everything works fine without ClearML. The script is similar to this one: https://github.com/galatolofederico/lightning-template/blob/main/train.py
Yes, the task is running on a remote agent with the `--docker` flag.
This is the config on the machine(s) running the agent:

```
agent {
    venvs_cache: {
        max_entries: 50
        free_space_threshold_gb: -1
        path: ~/.clearml/venvs-cache
    }
    extra_docker_arguments: [
        "--network", "host",
        "-v", "/home/ubuntu/.ssh:/root/.ssh:ro",
        "-v", "/home/ubuntu/.cache:/root/.cache",
    ]
    docker_internal_mounts {
        sdk_cache: "/clearml_agent_cache"
        ...
```
The debug samples are correctly uploaded to the bucket (it's a MinIO bucket); I can see them from the MinIO web app. I used `logger.report_image`.
I just tried 1.9.1 and it is affected as well.
I opened the issue on github https://github.com/allegroai/clearml-web/issues/46
I used an env variable to avoid creating an endless loop of init/enqueue (using an argument like `clearml.queue` would be captured by Hydra and forwarded to the agent).
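A minimal sketch of that env-variable guard (`MY_CLEARML_QUEUE` is a hypothetical variable name, and the actual ClearML call is left as a comment):

```python
import os

def should_enqueue() -> bool:
    """True only on the first, local run.

    MY_CLEARML_QUEUE is a hypothetical env variable holding the target
    queue name. Unlike a CLI argument such as ``clearml.queue``, it is
    not captured by Hydra and forwarded to the remote run, so the copy
    executed by the agent never re-enqueues itself.
    """
    return bool(os.environ.get("MY_CLEARML_QUEUE"))

if should_enqueue():
    # Placeholder for the actual ClearML call, e.g.:
    # task.execute_remotely(queue_name=os.environ["MY_CLEARML_QUEUE"])
    pass
```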
But no arguments are passed to the script. Here I am printing `sys.argv`.
When enqueued, the configuration tab still shows the correct arguments.
This is the configuration in the webapp
1.9.0 is still affected
This is the full `print(cfg)`:

```
{'dataset': {'name': '<my-dataset-name>', 'path': '', 'target_shape': [128, 128]}, 'model': 'unet', 'models': {'unet': {'dim': 8, 'dim_mults': [1, 2, 4], 'num_blocks_per_stage': [2, 2, 2], 'num_self_attn_per_stage': [0, 0, 1], 'nested_unet_depths': [0, 0, 0], 'nested_unet_dim': 16, 'use_convnext': False, 'resnet_groups': 2, 'consolidate_upsample_fmaps': True, 'weight_standardize': False, 'attn_heads': 2, 'attn_dim_head': 16}}, 'train': {'accelerator': 'auto...
```
If I print `config_list` from `def from_config(cls, s3_configuration)` in `bucket_config.py` (line 121), I get:

```
{'key': '', 'secret': '', 'region': '', 'multipart': True, 'use_credentials_chain': False, 'bucket': 'clearml', 'host': 's3.myhost.tld:443', 'token': '', 'extra_args': ConfigTree()}
```
Related GitHub issue https://github.com/allegroai/clearml/issues/847
The file has been uploaded correctly to the bucket.
Yes. It seems to be a bug in the UI, but it is weird that it went unnoticed.
No problem! Thank you for your amazing work!
I specified the upload destination in the logger: `Logger.current_logger().set_default_upload_destination(cfg.clearml.media_uri)`.
Yes, MinIO with no special config. The S3 config is in `clearml.conf`.
I also save the models in the S3 bucket using `output_uri=cfg.clearml.output_uri` in the `Task.init`.
I installed clearml from source and printed the internal S3 configurations; basically `key` and `secret` are empty.
The error is:

```
2022-11-28 14:40:17,099 - clearml.storage - ERROR - Failed creating storage object
Reason: Missing key and secret for S3 storage access (
)
```
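For comparison, a per-bucket credentials entry in `clearml.conf` normally looks roughly like this (a sketch of the documented `sdk.aws.s3.credentials` layout with placeholder values, not my exact config):

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "s3.myhost.tld:443"
                    bucket: "clearml"
                    key: "<access-key>"
                    secret: "<secret-key>"
                    secure: true
                    multipart: false
                }
            ]
        }
    }
}
```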
By the way, if I create the task locally, reset it, and enqueue it, it works. This is the workaround that I'm using right now.
I printed `cfg` in the script and the config has not been overwritten 😢