WickedElephant66 is this issue the same as this one?
https://clearml.slack.com/archives/CTK20V944/p1656537337804619?thread_ts=1656446563.854059&cid=CTK20V944
What do you mean by cache files? The cache is machine-specific and is configured in the clearml.conf file.
Artifacts / models are uploaded to the files server (or any other object storage solution)
DepressedChimpanzee34
I might have an idea: based on the log, you are getting LazyCompletionHelp instead of str.
Could it be that you installed Hydra bash completion?
https://github.com/facebookresearch/hydra/blob/3f74e8fced2ae62f2098b701e7fdabc1eed3cbb6/hydra/_internal/utils.py#L483
I am actually saving a dictionary that contains the model as a value (+ training datasets)
How are you specifically doing that? pickle?
This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
Correct
clearml.conf is the file that clearml-init is supposed to create, right?
Correct, specifically ~/clearml.conf
Building the pipeline in runtime from external configuration is very cool!!
I think nested components is exactly the correct solution, and it is a great use case.
ReassuredTiger98 can you send the full log?
Also, what's the clearml-agent version?
FYI: we fixed an issue where the default order of the conda repositories caused pytorch to be installed from conda-forge instead of the pytorch repo, which made it the CPU version instead of the GPU version.
This is the correct conda repo order:
https://github.com/allegroai/clearml-agent/blob/cb6bdece39751eaef975287609b8bab603f116e5/docs/clearml.conf#L66
yes, so it does exit the local process (at least, the command returns),
What do you mean by "the command returns"? Are you running the script from bash and it returns to bash?
Then, in the bash console, after some time, I see some messages being logged from clearml
JitteryCoyote63 Hmm that is strange, let me check something
SmallBluewhale13
And the Task.init registers 0.17.2, even though it prints (while running the same code from the same venv) 0.17.2?
And you have clearml v0.17.2 installed on the "system" packages level, and 0.17.5rc6 installed inside the pyenv venv?
It seems stuck somewhere in the python path... Can you check in runtime what's os.environ['PYTHONPATH']
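A quick way to run that check, as a minimal sketch: it prints PYTHONPATH and reports which file a package is actually imported from. The helper name describe_import is hypothetical, and "json" is only a stand-in so the snippet runs anywhere; swap in "clearml" inside the environment you are debugging.

```python
import importlib
import os
import sys


def describe_import(module_name):
    """Return where a module is imported from and which version it reports."""
    module = importlib.import_module(module_name)
    return {
        # Path of the file that was actually imported (shows system vs. venv).
        "file": getattr(module, "__file__", None),
        # Not every package defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
    }


if __name__ == "__main__":
    print("PYTHONPATH:", os.environ.get("PYTHONPATH"))
    print("interpreter:", sys.executable)
    # "json" is a placeholder; use "clearml" when debugging the version mismatch.
    print(describe_import("json"))
```

If the reported file path points at the system site-packages rather than the venv, the older install is the one being picked up.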
I'm glad it worked out, thanks SmallBluewhale13
Hi FiercePenguin76
Here's my workaround - ignore the fail messages, and manually create an SSH connection to the server with Jupyter port forwarded.
You are correct, clearml-session assumes it can SSH into the remote agent machine. From that point it automatically tunnels all other connections on top of the original SSH connection (well, with some fancy SSH keep-alive proxy).
I'm assuming that from home you cannot connect to the SSH machine at the office, which makes sense, but out of curiosity...
ngrok to connect to the remote server at the office?
That makes sense, I guess this is the equivalent of using a VPN, from that point onward clearml-session can directly access the remote machine, right?
Is it possible to do something so that the change of the server address is supported and the pictures are pulled up on the new server from the new server?
The link itself (the full link) is stored inside the server. Can I assume the access is IP-based, not host-based (i.e. DNS)?
Okay, I think I lost you...
DilapidatedDucks58 you mean detect at which "iteration" the max value was reported, and then extract all the other metrics for that iteration?
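To make sure we mean the same thing, here is a minimal sketch of that logic in plain Python. It assumes the scalars are already in memory as a nested dict of the form {title: {series: {"x": [iterations], "y": [values]}}}; that shape, the function name metrics_at_best_iteration and the sample numbers are all assumptions for illustration, not a confirmed ClearML API.

```python
def metrics_at_best_iteration(scalars, title, series):
    """Find the iteration where scalars[title][series] peaks and
    collect every other metric that was reported at that iteration."""
    target = scalars[title][series]
    # Index of the maximum value (first occurrence on ties).
    best_index = max(range(len(target["y"])), key=target["y"].__getitem__)
    best_iteration = target["x"][best_index]

    snapshot = {}
    for t, series_map in scalars.items():
        for s, points in series_map.items():
            by_iteration = dict(zip(points["x"], points["y"]))
            # Skip metrics that were not reported at the best iteration.
            if best_iteration in by_iteration:
                snapshot[(t, s)] = by_iteration[best_iteration]
    return best_iteration, snapshot


# Hypothetical sample data, shaped as described above.
example = {
    "val": {
        "accuracy": {"x": [1, 2, 3], "y": [0.7, 0.9, 0.8]},
        "loss": {"x": [1, 2, 3], "y": [0.6, 0.3, 0.4]},
    },
    "train": {"loss": {"x": [1, 3], "y": [0.5, 0.2]}},
}

if __name__ == "__main__":
    print(metrics_at_best_iteration(example, "val", "accuracy"))
```

Metrics that were not reported at exactly that iteration are left out rather than interpolated, which is a design choice you may want to change.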
Thank you so much PlainSeaurchin97!!!
Hmmm, yes we should definitely add --debug (if you can, please add a GitHub issue so it is not forgotten).
FiercePenguin76 Specifically are you able to ssh manually to <external_address>:<external_ssh_port> ?
It looks somewhat familiar ...
SuccessfulKoala55 any idea?
CheerfulGorilla72
yes, IP-based access,
Hmm, so this is the main downside of using an IP-based server: the links (debug images, models, artifacts) store the full URL (e.g. http://IP:8081/...). This means that if you switch IPs they will no longer work. Any chance you can assign the old IP to the new server?
(The other option is to somehow edit the links in the DB, which I guess is doable but quite risky.)
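To illustrate why the stored links break, here is a minimal sketch of the rewrite each link would need. It only transforms URL strings and does not touch the ClearML database; the function name rewrite_host and the addresses are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_host(url, old_netloc, new_netloc):
    """Swap the host:port part of a URL, leaving the path and query untouched."""
    parts = urlsplit(url)
    # Leave links that point at any other server unchanged.
    if parts.netloc != old_netloc:
        return url
    return urlunsplit(parts._replace(netloc=new_netloc))


if __name__ == "__main__":
    # Hypothetical old and new file-server addresses.
    old_link = "http://10.0.0.5:8081/project/task/image.png"
    print(rewrite_host(old_link, "10.0.0.5:8081", "10.0.0.9:8081"))
```

Every stored link embeds the old host:port, so each one would need this change, which is why keeping the old IP is the safer route.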
If I have an alternative location for the vscode, where should I indicate it in the configuration?
We might need to add support for that, but it should not be a problem to override (e.g. downloadable link like http/s3/ etc.)
Is this something that is doable?