I think it works now by accident: git@repo.com is not a URL with a valid scheme, so it doesn't get parsed at all by the furl library.
Something changes when the program is executed through the agent, because I executed exactly the same code on exactly the same Docker image directly and it doesn't produce this error.
Yes, it's the same, because the passed url is the same. I need to have git url in the mentioned format or the agent cannot clone the repo.
Yep, custom image, extension of the nvidia/cuda:12.0.0-base-ubuntu22.04 image.
These are the last lines of the Dockerfile:
Any idea how to handle that apart from not using Git LFS?
Therefore it's a minor bug and not working as intended because that function converts:
ssh://git@repo.com:2222/jks/experiment.git into
ssh://repo.com:2222/jks/experiment.git
I need to use SSH-based authentication, so I guess that's not an option. The removal does affect the clone URL: ssh://git@repo.com:2222/jks/experiment.git works fine, while ssh://repo.com:2222/jks/experiment.git doesn't, because without the user part SSH assumes the current user as the SSH user.
The error is somehow connected to the task being initialized twice. I don't know what the "true" way of using transformers' ClearMLCallback within a ClearML pipeline is.
api {
# Notice: 'host' is the api server (default port 8008), not the web server.
api_server:
web_server:
files_server:
# Credentials are generated using the webapp, /settings
# Override with os environment: CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY
credentials {"access_key": "snip", "secret_key": "snip"}
}
On the other hand: remove_user_pass_from_url("git@repo.com:2222/jks/experiment.git")
works correctly as it results in git@repo.com:2222/jks/experiment.git
I think it should be something like this:
    f = furl(url)
    if f.scheme in ['http', 'https', 'ssh'] and f.password:
        url = f.remove(username=True, password=True).tostr()
instead of:
    url = furl(url).remove(username=True, password=True).tostr()
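The same scheme-and-password check can be sketched with only the Python standard library. This is a minimal illustration, not the agent's actual code: it reuses the function name remove_user_pass_from_url from the messages above, swaps furl for urllib.parse, and the https URL with user:secret is a made-up example. It assumes the goal is to strip credentials only when a password is actually embedded in the URL, so a bare SSH user such as git@ is left alone.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_user_pass_from_url(url: str) -> str:
    """Strip embedded credentials, but only when a password is present.

    Sketch using urllib.parse instead of furl (an assumption for illustration).
    """
    parts = urlsplit(url)
    # Leave scp-style URLs (no scheme) and password-less URLs untouched,
    # so the SSH user in ssh://git@host/... survives.
    if parts.scheme not in ("http", "https", "ssh") or parts.password is None:
        return url
    # Rebuild the network location without the user:password@ prefix.
    # Note: hostname is lowercased and IPv6 brackets are not restored here.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


ssh_url = remove_user_pass_from_url("ssh://git@repo.com:2222/jks/experiment.git")
scp_url = remove_user_pass_from_url("git@repo.com:2222/jks/experiment.git")
https_url = remove_user_pass_from_url("https://user:secret@repo.com/jks/experiment.git")
```

With this check, both SSH forms come back unchanged, and only the URL that carries a password is rewritten.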
Alright, I have disabled the proxy entirely and now everything is fine. I still don't know what the reason is for this behaviour, GET requests get through just fine.
I have attached full log. This error happened during starting some standard transformers training loop.
For example, I'd like to set up the agent.python_binary variable for my "agent-services" service. How can I achieve that?
In the default docker compose file there is an "agent-services" service. It is an agent which, for example, runs pipeline controllers. An agent run from the CLI can be configured using the clearml.conf file. Can I use a clearml.conf file to configure the "agent-services" service as well?
Yep, timeout as well, although requests.get(" None ") works just fine
The custom callback I have used is:
` class MyClearMLCallback(ClearMLCallback):
    def __init__(self, *args, **kwargs):
        # Pop the custom names before the parent constructor sees the kwargs.
        self._task_name = kwargs.pop("task_name", None)
        self._project_name = kwargs.pop("project_name", None)
        super().__init__(*args, **kwargs)

    def setup(self, args, state, model, tokenizer, **kwargs):
        if self._clearml is None:
            return
        if state.is_world_process_zero:
            logger.info("Automatic ClearML logging enabled."...
Simple docker compose. The connection shouldn't go through the VPN. Generally, even when using an agent on the same machine, upload is woefully slow: a 4 GB file takes more than 40 minutes to upload.