
Hi @<1523703436166565888:profile|DeterminedCrab71> and @<1523701070390366208:profile|CostlyOstrich36>, coming back to this after a while. It actually seems to be related to Google Cloud permissions:
- The images in the ClearML dashboard do not show, as discussed above
- If I copy the image URL (coming out as something like None) and open it in another tab where I’m logged into my Google account, the image loads
- If I do t...
I’m on Safari actually, but I just checked on Chrome (which shows the insecure-connection indicator) and images are activated. Might it still be due to the non-HTTPS connection? We should get on that anyhow.
Yes, for example, or some other way to get credentials over to the container safely without them showing up in the checked-in code or the web UI.
Won’t they be printed in the Web UI as well? It shows the full Docker command for running the task, right…
Sorry to ask again, but the values are still showing up in the WebUI console logs this way (see screenshot).
Here is the config that I paste into the EC2 Autoscaler Setup:
```
agent {
    extra_docker_arguments: ["-e AWS_ACCESS_KEY_ID=XXXXXX", "-e AWS_SECRET_ACCESS_KEY=XXXXXX"]
    hide_docker_command_env_vars {
        enabled: true
        extra_keys: ["AWS_SECRET_ACCESS_KEY"]
        parse_embedded_urls: true
    }
}
```
Never mind, it came from setting the options wrong, it has to be ...
One correction though: while the secret is indeed hidden in the logs, it is still visible in the “Execution” tab of the experiment; see the two screenshots below.
Once again, I set them with `task.set_base_docker(docker_arguments=["..."])`.
Hi SuccessfulKoala55 , thanks for getting back to me!
In the docs of `Task.set_base_docker()` it says “When running remotely the call is ignored”. Does that mean that this function call is executed when running locally to “record” the arguments, and then, when I clone the experiment and run it remotely, the call is ignored and the recorded values are used?
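If it helps anyone later, here is a minimal sketch of how I understand that behaviour (the image name and the environment variable below are just placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="docker-args-demo")

# Running locally, this call only *records* the container settings on the task.
# When a clone of the task is executed remotely by an agent, the call is
# ignored and the recorded values are used to start the container instead.
task.set_base_docker(
    docker_image="python:3.10",                # placeholder image
    docker_arguments=["-e MY_ENV_VAR=value"],  # placeholder docker argument
)
```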
That was the missing piece - thank you!
Awesome how many details you have considered in ClearML 😉
So the container itself gets deleted, but everything is still cached because the cache directory is mounted to the host machine in the same place? Makes absolute sense and is what I was hoping for, but I can’t confirm this currently: I can see that the data is reloaded each time, even if the machine was not shut down in between. I’ll check again to find the cached data on the machine.
Hey guys, really appreciating the help here!
So what I meant by “it does work” is that the environment variables go through to the container: I can use them there, and everything runs.
The remaining problem is that, this way, they are visible in the ClearML web UI, which is potentially unsafe / bad practice; see the screenshot below.
Ok, I re-checked and saw that the data was indeed cached and reloaded; maybe I waited a little too long last time and it was already a new instance. Awesome implementation, guys!
Hi @<1523701087100473344:profile|SuccessfulKoala55> , sorry there was a mistake on my end - clearml.conf pointed to the wrong URL 🙈
More of the stack trace:
```
clearml-elastic | ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];
clearml-elastic | Likely root cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
clearml-elastic | at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
clearml-elastic | at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
clearml-el...
```
SuccessfulKoala55 AgitatedDove14 So I’ve tried the approach, and it does work. However, this of course results in the credentials being visible in the ClearML web interface output, which comes close to just hard-coding them in…
Is there any way to send the secrets safely?
Is there any way to access the clearml.conf file contents from within code? (afaik, the file does not get sent over to the container - otherwise I could just read it in myself…)
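In case it is useful, a minimal sketch of what I had in mind, assuming the file is available locally (clearml.conf is in HOCON format, so pyhocon can parse it; the path fallback and the looked-up key are assumptions for illustration):

```python
import os
from pyhocon import ConfigFactory  # pip install pyhocon

# Assumption: clearml.conf lives at the default ~/clearml.conf unless
# CLEARML_CONFIG_FILE points elsewhere.
conf_path = os.environ.get("CLEARML_CONFIG_FILE", os.path.expanduser("~/clearml.conf"))
config = ConfigFactory.parse_file(conf_path)

# Example lookup with a fallback default (key chosen for illustration):
api_server = config.get("api.api_server", "https://api.clear.ml")
print(api_server)
```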
SuccessfulKoala55 just in case you have any more thoughts, but we could also continue as is 😊
Well duh, now it makes total sense! I should have checked the docs or examples more closely 🙏
Yes, if that works reliably, then I think that option could make sense; it would have made things somewhat easier in my case - but this is just as good.
It was related: special characters also prevented some access.
But it was, and is, also related to some authentication problem with Google: if you open the dashboard in Chrome and go to the developer console, you see a bunch of failed requests to authentication URLs. If you open one of them in another tab, it shows the Google sign-in screen, and afterwards you can see the Debug Samples in the dashboard.
None of that works in Safari though, for some reason 🙂
To recap, the server started up on GCP as expected before migrating the data over. The migration was done by
- deleting the current data: `sudo rm -fR /opt/clearml/data/*`
- unpacking the backup: `sudo tar -xzf ~/clearml_backup_data.tgz -C /opt/clearml/data`
- setting permissions: `sudo chown -R 1000:1000 /opt/clearml`
Yes and yes - is that the issue, and would it likely go away if we host it via HTTPS?
So AgitatedDove14, if we use the CLEARML_OFFLINE_MODE environment variable instead, the program runs through again.
The only thing is that now we get errors of the form
```
0%|          | 0/18 [00:00<?, ?image/s]ClearML running in offline mode, session stored in /home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486
2022-11-07 07:49:06,986 - clearml.metrics - WARNING - Failed uploading to /home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486/...
```
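For context, this is roughly how offline mode is used on our side (a sketch; the project/task names are placeholders, and the session folder is the one from the log above):

```python
from clearml import Task

# Equivalent to setting CLEARML_OFFLINE_MODE=1; must be called before Task.init().
Task.set_offline(offline_mode=True)

task = Task.init(project_name="examples", task_name="offline-demo")  # placeholder names
# ... training code: everything is recorded into a local session folder ...
task.close()

# Later, on a machine that can reach the server, the recorded session can be imported:
Task.import_offline_session(
    session_folder_zip="/home/manuel/.clearml/cache/offline/offline-167ceb1cd3c946df8abc7206b781b486"
)
```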
Yes, totally, but we’ve been having problems getting these GPUs specifically (even manually in the EC2 console and across regions), so I thought maybe it’s easier to get one big instance than many small ones - but I’ve never actually checked whether that is true 🙂 Thanks anyhow!
I actually wanted to load a specific artifact, but didn’t think of looking through the task’s output models. I have now changed to that approach, which feels much safer, so we should be all done here. Thanks!
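For anyone finding this later, roughly what the new approach looks like (the task ID is a placeholder):

```python
from clearml import Task

# Fetch the training task and grab its last recorded output model
# (the task ID below is a placeholder).
task = Task.get_task(task_id="<your-task-id>")
output_models = task.models["output"]            # list of recorded output models
local_path = output_models[-1].get_local_copy()  # downloads the model weights file
print(local_path)
```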