But what should I do? It does not work, it says incorrect password as you can see
How are you spinning the agent machine ?
Basically, port 10022 on the host (the agent machine) is routed into the container, but it still needs to be open on the host machine. Could it be behind a firewall? Are you (the client side running clearml-session) on the same network as the machine running the agent?
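A quick way to test the firewall question from the client side is a plain TCP connect to the port. This is a minimal sketch using only the Python standard library; the function name `port_is_open` is made up for illustration and is not part of clearml.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

Calling `port_is_open("<agent-host>", 10022)` from the machine running clearml-session should return True if the port is reachable. It only shows that something accepts connections on that port, not that the SSH login itself will succeed.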
Set the following: `CLEARML_AGENT_DISABLE_SSH_MOUNT=1 clearml-agent daemon ...`
The issue is that the agent automatically mounts the host's .ssh folder into the container, so that if you are using SSH to clone git you have credentials. In your case it also mounts the configuration, hence the failed login.
I will make sure we add it to the configuration file, so it is more visible
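If the agent is started from a script rather than a shell, the same environment variable can be set programmatically. This is a sketch only: the helper name `agent_launch` is hypothetical, and the `--queue` and `--docker` arguments are assumptions standing in for whatever follows `daemon ...` in your own command.

```python
import os
import subprocess


def agent_launch(queue: str = "default"):
    """Build the command and environment for an agent that skips the .ssh mount."""
    env = dict(os.environ)
    # Variable named in the thread: disables mounting the host .ssh folder.
    env["CLEARML_AGENT_DISABLE_SSH_MOUNT"] = "1"
    # Assumed arguments; replace with the ones you normally pass to the daemon.
    cmd = ["clearml-agent", "daemon", "--queue", queue, "--docker"]
    return cmd, env


# To actually start the agent (requires clearml-agent to be installed):
# cmd, env = agent_launch()
# subprocess.run(cmd, env=env, check=True)
```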
hmm can you share the log of the Task? (the clearml-session created Task)
Not available
I mean if I enter my host machine ssh password it works. But we will disable password auth in future, so it's not an option
To clarify, it should not allow users to ssh into the host machine (if you can do that this means you own it), it only allows users to SSH into the container the host machine spins, make sense ?
But it's running in docker mode and it is trying to ssh into the host machine and failing
It is not SSHing into the host machine; it is SSHing directly into the container.
Notice the port it is SSHing to is 10022, which is mapped into the container
same: Not Found (#404)
May I suggest to DM it to me (so it is not public)
It does not use key auth, instead sets up some weird password and then fails to auth:
AdventurousButterfly15 it SSHes into the container. Inside the container it sets up a new daemon with a new, random, very long password.
It will not SSH into the host machine (i.e. the agent needs to run in docker mode, not venv mode), make sense ?
Btw it seems the docker runs in `network=host`
Yes, this is so that if you have multiple agents running on the same machine, they can each find a new open port
I can telnet the port from my mac:
Okay this seems like it is working
Trains is fully open-source. That said, properly publishing and maintaining the web client is still on our to-do list (I mean, there is totally readable JavaScript code packaged in the trains-server and the dockers). It keeps getting pushed back because there are generally fewer contributions on the front-end with these kinds of projects. That said, if you guys are willing to help, it will greatly help in pushing it forward... LivelyLion31 what do you think, would you guys like to help with the fronte...
Awesome, PRs are always welcome, and we try to help with any request and feature coming from users. We just added audio support (RC releasing in a few days) based only on user requests.
https://github.com/allegroai/trains/issues/120
Hi, is there a possibility to use one GPU card with 2 agents concurrently?
RoundMosquito25 / EnviousPanda91
You need to change the WORKER_ID (no two workers can share the same ID): `CLEARML_WORKER_ID="machine:gpu01" clearml-agent daemon ....`
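To make the "unique ID per worker" point concrete, here is a small sketch that prepares one environment per agent. The helper `worker_envs` and the ID naming scheme are hypothetical; the only requirement taken from the thread is that no two agents share the same `CLEARML_WORKER_ID`.

```python
import os


def worker_envs(machine: str, gpu: int, count: int = 2):
    """Return one environment per agent, each with a unique CLEARML_WORKER_ID."""
    envs = []
    for index in range(count):
        env = dict(os.environ)
        # Hypothetical naming scheme; any scheme works as long as IDs are unique.
        env["CLEARML_WORKER_ID"] = f"{machine}:gpu{gpu}-{index}"
        envs.append(env)
    return envs


# Each environment would then be passed to its own agent process, e.g.
# subprocess.Popen(["clearml-agent", "daemon", ...], env=envs[0])
```

Both agents would be pointed at the same GPU; whether the workloads fit in GPU memory together is a separate question this sketch does not address.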
Hmm let me check, I think we changed the offline mode to use the latest API version (because by definition it cannot know what's the server).
Let me check if you can override it
Hi RipeGoose2
I just tested the hydra example, and it seems to work when you set offline mode right after the import:
` from clearml import Task
Task.set_offline(True) `
Hi @<1634001100262608896:profile|LazyAlligator31>
Is this because the code repo is being recreated in this directory?
Yes, this is correct
Basically the entire code base + venv is installed there, to make sure it does not interfere with the "system" preinstalled environment
(it also allows for caching on the host machine)
Right, DepressedChimpanzee34 what's the clearml version you are using ?
Hi DisgustedDove53
Is redis used as permanent data storage or just cache?
Mostly cache (I think)
Would there be any problems if it is restarted and comes up clean?
Pretty sure it should be fine, why do you ask ?
PunyBee36 to get HTTPS, add an AWS ELB in front of the server; the ELB will add HTTPS to any outside connection
Hi JitteryCoyote63 ,
When you shut down the task (manually with close() or when the process finishes), it waits for the uploads...
Why do you need to specifically wait for all the artifacts upload? (currently you can stop the artifacts upload thread and wait for all the artifacts, but that seems like a bad hack)
Hi DeliciousKoala34
I am using PyCharm and I have set up the clearml plugin, but it still doesn't work.
Did you provide the key/secret to the plugin? I think this is a must for it to actually work
In your code, can you print the following: `import os; print(os.environ.keys())`
There should be a few keys the Pycharm plugin is sending from the local machine, pointing to the git repo
hmm DeliciousKoala34
What are you getting if you put this at the top of your code (the one you are running in the remote docker): `import os; print([(k, os.environ[k]) for k in os.environ if k.startswith("CLEARML_")])`
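Since the output of that one-liner may end up pasted into a public channel, a variant that masks credential-like values can be safer. This is a sketch under my own assumptions: the helper `clearml_env` and the list of "sensitive" substrings are made up for illustration and are not part of clearml.

```python
import os


def clearml_env(environ=os.environ):
    """Return the CLEARML_* variables, masking values that look like credentials."""
    found = {}
    for key in sorted(environ):
        if not key.startswith("CLEARML_"):
            continue
        # Assumed heuristic: hide anything whose name hints at a secret.
        sensitive = any(word in key for word in ("KEY", "SECRET", "PASS", "TOKEN"))
        found[key] = "***" if sensitive else environ[key]
    return found


print(clearml_env())
```

The variable names still show up, which is usually enough to tell whether the plugin passed anything at all.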
Hi DeliciousKoala34
This means the pycharm plugin was not able to run git on your local machine.
What's your OS?
Could it be that if you open cmd / shell, "git" is not in the path?
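One way to check the PATH question from Python itself, as a rough sketch: `shutil.which` looks up an executable the same way the shell would. Note that the result reflects the environment of the Python process you run it in, which may differ from the environment PyCharm gives the plugin.

```python
import shutil
import subprocess


def git_on_path():
    """Return the resolved path of the git executable, or None if it is not on PATH."""
    return shutil.which("git")


git_path = git_on_path()
if git_path is None:
    print("git was not found on PATH")
else:
    # Ask the located binary for its version to confirm it actually runs.
    version = subprocess.run([git_path, "--version"], capture_output=True, text=True)
    print(git_path, version.stdout.strip())
```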
BTW: latest PyCharm plugin with 2022 support was just released:
https://github.com/allegroai/clearml-pycharm-plugin/releases/tag/1.1.0
Can you also make sure you did not check "Disable local machine git detection" in the clearml PyCharm plugin?
DeliciousKoala34 any chance you are using PyCharm 2022 ?
And this is with the latest pycharm plugin 1.1.0 ?
Hmm, could it be that the working dir is outside of the git repo?
I think RoughTiger69 was discussing this exact scenario
https://clearml.slack.com/archives/CTK20V944/p1629885416175500?thread_ts=1629881415.172600&cid=CTK20V944
wdyt?
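To check whether the working directory sits inside a git repository, one option is to walk up the directory tree looking for a `.git` entry. This is a minimal sketch; `find_git_root` is a hypothetical helper, not how the plugin or clearml detects the repository.

```python
from pathlib import Path


def find_git_root(start="."):
    """Walk up from start and return the first directory containing .git, else None."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        # .git is a directory in a normal clone and a file in worktrees/submodules.
        if (candidate / ".git").exists():
            return candidate
    return None


print(find_git_root())
```

If this prints `None` when run from the script's working directory, the working dir is outside any git repo.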
Thanks SubstantialElk6 !
Happy new year!