looking in the web app, under the “App Credentials” section, it lists those credentials as “used” when I attempted the
Yeah, the server notes that
Yeah... I see it's quoted there, so it shouldn't be a problem...
` $ clearml-agent config
Current configuration (clearml_agent v1.0.0, location: /home/username/clearml.conf):
agent.worker_name = computer
agent.force_git_ssh_protocol = false
agent.package_manager.type = pip
agent.package_manager.pip_version = <20.2
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.torch_nightly = false
agent.venvs_dir = ~/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.vcs_cache.enabled = true
agent.vcs_cache.path = ~/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = ~/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = ~/.clearml/pip-cache
agent.docker_apt_cache = ~/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
agent.enable_task_env = false
agent.default_python = 3.7
agent.cuda_version = 112
agent.cudnn_version = 0
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.credentials.access_key = 7L******
api.host = `
here’s the file with the keys and IP redacted: https://clearml.slack.com/files/U01PN0S6Y67/F0231N0GZ19/clearml.conf
$ clearml-agent -d daemon --gpus 1 --foreground
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): DIFFERENT_IP_ADDRESS:8008
DEBUG:urllib3.util.retry:Incremented Retry for (url='/auth.login'): Retry(total=239, connect=3, read=240, redirect=240, status=240)
WARNING:urllib3.connectionpool:Retrying (Retry(total=239, connect=3, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff49318dd10>: Failed to establish a new connection: [Errno 111] Connection refused')': /auth.login
DEBUG:urllib3.connectionpool:Starting new HTTP connection (2): DIFFERENT_IP_ADDRESS:8008
DEBUG:urllib3.util.retry:Incremented Retry for (url='/auth.login'): Retry(total=238, connect=2, read=240, redirect=240, status=240)
WARNING:urllib3.connectionpool:Retrying (Retry(total=238, connect=2, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff49318d6d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /auth.login
DEBUG:urllib3.connectionpool:Starting new HTTP connection (3): DIFFERENT_IP_ADDRESS:8008
Could it be that you have old OS environment variables overriding the configuration file?
Can you change the IP of the server in the conf file, and make sure it has an effect (i.e. the error changed)?
NastyFox63 try using the credentials in the
curl request, just to make sure you can authenticate them with the server... do:
curl -u <key>:<secret>
except for the IP address and the actual keys, it’s the vanilla config generated by
Can you share the
clearml.conf ? Maybe something will pop up?
also, i’m noticing the “last used” field does not update when I try to start an agent, but does change when I issue the
curl command you gave earlier
be the same as the
Yes, it should. Isn't it?
api.credentials.access_key be the same as the
Can you just try
clearml-agent config ?
no, it’s a key I don’t recognize
looks like a previous user set
/etc/environment and then disabled the keys in the web app. I removed the two items from
/etc/environment and was able to successfully start a worker.
it seems, though, that the env vars take precedence even when a
--config-file is explicitly specified?
Env vars always win over a config file, but explicit CLI params trump env vars as well. In this case it's a close call - maybe we'll need to change that?
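A rough sketch of the precedence described here (explicit CLI param, then env var, then config file), plus the kind of warning suggested in this thread. This is not ClearML's actual implementation: `resolve_setting` and `shadow_warning` are hypothetical helpers, and for simplicity all three sources are assumed to be plain dicts keyed by the same setting name.

```python
def resolve_setting(name, cli_args, env, config_file):
    """Return (value, source) for a setting.

    Assumed order, taken from the discussion above:
    explicit CLI parameter > environment variable > config file.
    """
    for source, values in (("cli", cli_args), ("env", env), ("config_file", config_file)):
        if values.get(name) is not None:
            return values[name], source
    return None, None


def shadow_warning(name, env, config_file):
    """Return a warning string when an env var hides a different config-file value."""
    env_value = env.get(name)
    file_value = config_file.get(name)
    if env_value is not None and file_value is not None and env_value != file_value:
        return f"warning: {name} from the environment overrides a different value in the config file"
    return None


if __name__ == "__main__":
    key = "api.credentials.access_key"
    env = {key: "OLD_KEY"}    # stale value left behind in the environment
    conf = {key: "NEW_KEY"}   # freshly generated value in the config file
    print(resolve_setting(key, {}, env, conf))
    print(shadow_warning(key, env, conf))
```

With a check like this, the stale key would have been reported by name instead of surfacing only as a failed login.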
okay, they are somehow set as environment variables. let me figure out how they were set.
that seems like a good solution 🙂
thank you SuccessfulKoala55 and AgitatedDove14 for your help! Martin identified the problem early on, but I only checked my
.bashrc , 😞
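For anyone hitting the same thing: a small sketch that looks for leftover overrides in more places than just `.bashrc`. The `CLEARML_` / `TRAINS_` prefixes and the list of files are assumptions, so adjust them to your setup; values are masked so the output can be pasted into a thread.

```python
import os
import re

# Assumed prefixes; change these to whatever your tool actually reads.
PREFIXES = ("CLEARML_", "TRAINS_")

# Matches lines such as `NAME=value` or `export NAME=value`.
_ASSIGN = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def mask(value, keep=2):
    """Keep the first `keep` characters and replace the rest with '*'."""
    return value[:keep] + "*" * max(len(value) - keep, 0)


def overrides_in_process_env(environ=None, prefixes=PREFIXES):
    """Return variables in the (given or current) environment with a matching prefix."""
    environ = os.environ if environ is None else environ
    return {k: v for k, v in environ.items() if k.startswith(prefixes)}


def overrides_in_file(path, prefixes=PREFIXES):
    """Return matching NAME=value assignments found in an environment-style file."""
    found = {}
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                if line.lstrip().startswith("#"):
                    continue  # skip commented-out assignments
                m = _ASSIGN.match(line.rstrip("\n"))
                if m and m.group(1).startswith(prefixes):
                    found[m.group(1)] = m.group(2).strip().strip("'\"")
    except OSError:
        pass  # a missing or unreadable file simply yields no matches
    return found


if __name__ == "__main__":
    for name, value in sorted(overrides_in_process_env().items()):
        print(f"env: {name}={mask(value)}")
    candidates = ("/etc/environment", "~/.bashrc", "~/.profile")
    for path in map(os.path.expanduser, candidates):
        for name, value in sorted(overrides_in_file(path).items()):
            print(f"{path}: {name}={mask(value)}")
```

Anything it prints from the process environment is a value that would win over `clearml.conf`, per the precedence discussed in this thread.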
hmm, it was confusing to me, but it’s kind of an edge case: I was taking over a computer after a colleague left, which might not be a common scenario
I think we should just add some kind of a warning in these cases
I'm assuming something is wrong with the key/secret quoting ?!
Could you generate another one and test it ?
(you can have multiple key/secret pairs on the same user)
after generating a fresh set of keys
when you have a new set, copy-paste them directly into the 'clearml.conf' (should be at the top, can't miss it)
yes, i can do this again. i did use
clearml-agent init to generate
clearml.conf after generating a fresh set of keys
i did this and have the same error.
yes, that call appeared to be successful—had to wrap in quotes because of the contents of the key:
$ curl -u 'J9*****':'R2*****'
SERVER_IP_ADDRESS is the actual IP address of the server, AND i made sure that
CLEARML_HOST_IP was set correctly before issuing the
well, as generated by
clearml-agent init —i pasted the text directly from the web app into the CLI interface, and it generated
OK, so we know these credentials are good... how exactly do they appear in the config file?
i’ll try changing the IP and look for a different error.