The error is somehow connected to initializing the task twice. I don't know what the "true" way of using transformers' ClearMLCallback within a ClearML pipeline is.
Something changes when the program is executed through the agent, because I ran exactly the same code on exactly the same Docker image and it doesn't produce this error.
I have attached the full log. The error happened while starting a standard transformers training loop.
Alright, I have disabled the proxy entirely and now everything is fine. I still don't know the reason for this behaviour, since GET requests get through just fine.
For example, I'd like to set the agent.python_binary variable for my "agent-services" service. How can I achieve that?
Any idea how to handle that apart from not using Git LFS?
Yep, custom image, extension of the nvidia/cuda:12.0.0-base-ubuntu22.04 image.
These are the last lines of the Dockerfile:
In the default docker compose file there is an "agent-services" service. It is an agent which, for example, runs pipeline controllers. An agent run from the CLI can be configured using a clearml.conf file. Can I use a clearml.conf file to configure the "agent-services" service?
Yep, timeout as well, although requests.get(" None ") works just fine
Simple docker compose. The connection shouldn't go through the VPN. Generally, even when using an agent on the same machine, upload is woefully slow: a 4 GB file takes more than 40 minutes to upload (under 2 MB/s).
Yes, it's the same, because the passed URL is the same. I need to have the Git URL in the mentioned format or the agent cannot clone the repo.
I need to use SSH-based authentication, so I guess that's not an option. Well, the removal affects the cloning URL, i.e. while ssh://git@repo.com:2222/jks/experiment.git works fine, ssh://repo.com:2222/jks/experiment.git doesn't, because it assumes the current user as the SSH user.
I think it should be something like this:
f = furl(url)
if f.scheme in ['http', 'https', 'ssh'] and f.password:
    url = f.remove(username=True, password=True).tostr()
instead of:
url = furl(url).remove(username=True, password=True).tostr()
Therefore it's a minor bug and not working as intended, because that function converts:
ssh://git@repo.com:2222/jks/experiment.git into
ssh://repo.com:2222/jks/experiment.git
On the other hand: remove_user_pass_from_url("git@repo.com:2222/jks/experiment.git")
works correctly as it results in git@repo.com:2222/jks/experiment.git
I think it works now by accident: git@repo.com has no valid URL scheme, so it doesn't get parsed at all by the furl library.
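To make the proposed behaviour concrete, here is a minimal sketch of "strip credentials only when a password is actually embedded". It deliberately swaps furl for the standard library's urllib.parse so it runs without extra dependencies, and strip_credentials is a hypothetical name, not the function from the ClearML code base. It lowercases the host and does not handle IPv6 literals, so treat it as an illustration rather than a drop-in fix.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Drop user:password from a URL, but keep a bare username (e.g. the SSH user)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https", "ssh") or parts.password is None:
        # No embedded password, or not a scheme we handle: leave the URL untouched.
        return url
    # Rebuild the network location from host and port only.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

With this, ssh://git@repo.com:2222/jks/experiment.git and git@repo.com:2222/jks/experiment.git both come back unchanged, while a URL of the form https://user:token@repo.com/jks/experiment.git loses its credentials.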
The custom callback I have used is:
class MyClearMLCallback(ClearMLCallback):
    def __init__(self, *args, **kwargs):
        self._task_name = kwargs.pop("task_name", None)
        self._project_name = kwargs.pop("project_name", None)
        super().__init__(*args, **kwargs)

    def setup(self, args, state, model, tokenizer, **kwargs):
        if self._clearml is None:
            return
        if state.is_world_process_zero:
            logger.info("Automatic ClearML logging enabled."...
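For anyone reading the truncated snippet above, the constructor pattern it relies on (popping the custom keyword arguments before delegating to the parent) can be shown in isolation. This is only a sketch: BaseCallback is a hypothetical stand-in for ClearMLCallback so the example runs without transformers or clearml installed, and NamedCallback is an illustrative name.

```python
class BaseCallback:
    """Hypothetical stand-in for ClearMLCallback; accepts no custom keyword arguments."""

    def __init__(self):
        self.initialized = True


class NamedCallback(BaseCallback):
    def __init__(self, *args, **kwargs):
        # Pop our own options first so the parent never sees unknown keywords.
        self._task_name = kwargs.pop("task_name", None)
        self._project_name = kwargs.pop("project_name", None)
        super().__init__(*args, **kwargs)


# The custom options are stored on the subclass; the parent is initialized as usual.
cb = NamedCallback(task_name="train", project_name="demo")
```

Without the pop calls, the parent constructor would raise a TypeError on the unexpected task_name and project_name keywords.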
api {
    # Notice: 'host' is the api server (default port 8008), not the web server.
    api_server:
    web_server:
    files_server:
    # Credentials are generated using the webapp, /settings
    # Override with os environment: CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY
    credentials {"access_key": "snip", "secret_key": "snip"}
}