I need to use SSH-based authentication, so I guess that's not an option. The removal affects the clone URL: ssh://git@repo.com:2222/jks/experiment.git works fine, while ssh://repo.com:2222/jks/experiment.git doesn't, because it assumes the current user as the SSH user.
I think it should be something like this:
    f = furl(url)
    if f.scheme in ['http', 'https', 'ssh'] and f.password:
        url = f.remove(username=True, password=True).tostr()
instead of:
    url = furl(url).remove(username=True, password=True).tostr()
So it's a minor bug: the function is not working as intended, because it converts:
ssh://git@repo.com:2222/jks/experiment.git into
ssh://repo.com:2222/jks/experiment.git
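To make the proposed behaviour concrete, here is a minimal sketch of a credential-stripping helper that leaves a bare SSH username alone. It uses the standard library's urllib.parse instead of furl so that it runs without extra dependencies, and the name strip_credentials is hypothetical; it is not the actual ClearML function.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Drop user:password from a URL, but keep a bare username such as ssh://git@host.

    Hypothetical helper for illustration only, not the ClearML implementation.
    """
    parts = urlsplit(url)
    # Only rewrite URLs with a known scheme that actually carry a password.
    if parts.scheme not in ("http", "https", "ssh") or parts.password is None:
        return url
    # Rebuild the network location without the userinfo part.
    # Note: urlsplit lowercases the hostname.
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets back
        host = f"[{host}]"
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # The SSH user is preserved because no password is present.
    print(strip_credentials("ssh://git@repo.com:2222/jks/experiment.git"))
    # Credentials are removed when a password is embedded in the URL.
    print(strip_credentials("https://user:secret@repo.com/jks/experiment.git"))
```

With this check, a URL that only names the SSH user is returned unchanged, so the agent can still clone with it.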
api {
# Notice: 'host' is the api server (default port 8008), not the web server.
api_server:
web_server:
files_server:
# Credentials are generated using the webapp, /settings
# Override with os environment: CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY
credentials {"access_key": "snip", "secret_key": "snip"}
}
I think it works now by accident: git@repo.com is not a URL with a valid scheme, so the furl library doesn't parse it at all.
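A quick way to see the difference between the two address forms is to check whether they parse with an explicit scheme. This sketch uses the standard library's urlsplit as a stand-in for furl (furl may differ in details), and has_url_scheme is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def has_url_scheme(address: str) -> bool:
    """Return True when the address parses with an explicit scheme and a host part."""
    parts = urlsplit(address)
    return bool(parts.scheme) and bool(parts.netloc)


if __name__ == "__main__":
    for address in (
        "ssh://git@repo.com:2222/jks/experiment.git",  # URL form with a scheme
        "git@repo.com:2222/jks/experiment.git",        # scp-like form, no scheme
    ):
        print(address, has_url_scheme(address))
```

The scp-like form has no scheme, so a parser that only strips credentials from scheme-based URLs never touches it.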
Yes, it's the same, because the passed url is the same. I need to have git url in the mentioned format or the agent cannot clone the repo.
Something changes when the program is executed through the agent, because I ran exactly the same code on exactly the same docker image and it didn't produce this error.
The custom callback I have used is:
` class MyClearMLCallback(ClearMLCallback):
    def __init__(self, *args, **kwargs):
        self._task_name = kwargs.pop("task_name", None)
        self._project_name = kwargs.pop("project_name", None)
        super().__init__(*args, **kwargs)

    def setup(self, args, state, model, tokenizer, **kwargs):
        if self._clearml is None:
            return
        if state.is_world_process_zero:
            logger.info("Automatic ClearML logging enabled."...
Alright, I have disabled the proxy entirely and now everything is fine. I still don't know the reason for this behaviour, since GET requests get through just fine.
Simple docker compose. The connection shouldn't go through the VPN. Generally, even when using an agent on the same machine, upload is woefully slow: a 4 GB file takes more than 40 minutes to upload.
For example, I'd like to set the agent.python_binary variable for my "agent-services" service. How can I achieve that?
Yep, custom image, extension of the nvidia/cuda:12.0.0-base-ubuntu22.04 image.
These are the last lines of the Dockerfile:
I have attached the full log. This error happened while starting a standard transformers training loop.
On the other hand: remove_user_pass_from_url("git@repo.com:2222/jks/experiment.git")
works correctly as it results in git@repo.com:2222/jks/experiment.git
Yep, timeout as well, although requests.get(" None ") works just fine
Any idea how to handle that apart from not using Git LFS?
In the default docker compose file there is an "agent-services" service. It is an agent that runs, for example, pipeline controllers. An agent run from the CLI can be configured using the clearml.conf file. Can I use a clearml.conf file to configure the "agent-services" service?
The error is somehow connected to the task being initialized twice. I don't know what the "true" way is to use transformers' ClearMLCallback within a ClearML pipeline.