Won't they be printed out as well in the Web UI? That shows the full Docker command for running the task, right…
Yes, for example, or some other way to get credentials over to the container safely, without them showing up in the checked-in code or the web UI.
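One common way to do this is Docker's pass-through form of `-e`: when only the variable name is given, Docker reads the value from the launching environment, so it never appears in the command line. A minimal sketch of the idea (the helper name and image are made up, not part of ClearML):

```python
import os

# Hypothetical helper: build "docker run" arguments that forward secrets
# by variable NAME only ("-e KEY"), so the value is read from the agent's
# environment and never appears in the logged command line.
def docker_env_passthrough(keys):
    args = []
    for key in keys:
        args += ["-e", key]  # no "=value" part on purpose
    return args

os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy-secret"  # set on the agent host
cmd = ["docker", "run", *docker_env_passthrough(["AWS_SECRET_ACCESS_KEY"]), "my-task-image"]
print(" ".join(cmd))  # the secret value is not part of the printed command
```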
Ok, I re-checked and saw that the data was indeed cached and reloaded - maybe I waited a little too long last time and it was already a new instance. Awesome implementation, guys!
So the container itself gets deleted, but everything is still cached because the cache directory is mounted from the host machine at the same place? That makes absolute sense and is what I was hoping for, but I can't confirm it currently: I can see that the data is reloaded each time, even if the machine was not shut down in between. I'll check again to find the cached data on the machine.
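That is how a host-mounted cache behaves; a small self-contained sketch of the idea (the paths and helper are illustrative, not the agent's actual code, which bind-mounts the cache with something like `docker run -v <host-cache>:<container-cache>`):

```python
import os
import tempfile

# Stand-in for the host directory that the agent bind-mounts into every
# short-lived task container.
host_cache = tempfile.mkdtemp(prefix="clearml_cache_demo_")

def run_container(mounted_cache):
    """Simulates one task container using the mounted cache directory."""
    marker = os.path.join(mounted_cache, "dataset.bin")
    if os.path.exists(marker):
        return "cache hit"
    open(marker, "wb").close()  # first run "downloads" and caches the data
    return "cache miss"

first = run_container(host_cache)   # container 1 populates the cache, then dies
second = run_container(host_cache)  # container 2 is brand new, mount is the same
print(first, "->", second)  # → cache miss -> cache hit
```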
Sorry to ask again, but the values are still showing up in the Web UI console logs this way (see screenshot).
Here is the config that I paste into the EC2 Autoscaler Setup:
```
agent {
    extra_docker_arguments: ["-e AWS_ACCESS_KEY_ID=XXXXXX", "-e AWS_SECRET_ACCESS_KEY=XXXXXX"]
    hide_docker_command_env_vars {
        enabled: true
        extra_keys: ["AWS_SECRET_ACCESS_KEY"]
        parse_embedded_urls: true
    }
}
```
Never mind, it came from setting the options wrong, it has to be ...
Hi SuccessfulKoala55 , thanks for getting back to me!
In the docs of Task.set_base_docker()
it says "When running remotely the call is ignored". Does that mean that this function call is executed when running locally to "record" the arguments, and then, when I duplicate the experiment and run the clone remotely, the call is ignored and the recorded values are used?
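If that reading is right, the behavior can be modeled with a toy class (`MiniTask` is an illustrative stand-in, not the ClearML class):

```python
# Toy model of the documented record-locally / ignore-remotely semantics
# of Task.set_base_docker(), assuming the reading above is correct.
class MiniTask:
    def __init__(self, running_remotely):
        self.running_remotely = running_remotely
        self.recorded_docker_args = None

    def set_base_docker(self, docker_arguments):
        if self.running_remotely:
            return  # ignored: the agent already applied the recorded values
        self.recorded_docker_args = docker_arguments  # recorded for clones

local = MiniTask(running_remotely=False)
local.set_base_docker(["-e", "MY_VAR=1"])

remote = MiniTask(running_remotely=True)
remote.recorded_docker_args = local.recorded_docker_args  # cloned config
remote.set_base_docker(["-e", "SOMETHING_ELSE=2"])        # call is ignored
print(remote.recorded_docker_args)  # → ['-e', 'MY_VAR=1']
```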
One correction, though: while the secret is indeed hidden in the logs, it is still visible in the "execution" tab of the experiment, see the two screenshots below.
Once again, I set them with task.set_base_docker(docker_arguments=["..."])
That was the missing piece - thank you!
It's awesome how many details you have considered in ClearML 🙂
Ok, so if I run task.flush(wait_for_uploads=True)
at the end of the script, it works ✅
I meant that maybe activating offline mode somehow changes something else in the runtime, and that in turn leads to the interruption. Let me try to build a minimal reproducible version 🙂
So without the flush, the error apparently occurred at the very end of the script, after all of my actual Python code had run.
Hey guys, really appreciating the help here!
So what I meant by "it does work" is that the environment variables go through to the container: I can use them there, and everything runs.
The remaining problem is that this way, they are visible in the ClearML web UI which is potentially unsafe / bad practice, see screenshot below.
It might be broken for me: as I said, the program works without offline mode, but with offline mode it gets interrupted and shows the results from above. There might be another issue in between, of course - any idea how to debug this?
The environment variable is good to know, I will try with that as well and report back.
Happy to and thanks!
I actually wanted to load a specific artifact, but didn't think of looking through the task's output models. I have now switched to that approach, which feels much safer, so we should be all done here. Thanks!
Hi AgitatedDove14 , so it took some time but I've finally managed to reproduce. The issue seems to be related to writing images via TensorBoard:
```
from torch.utils.tensorboard import SummaryWriter
import torch
from clearml import Task, Logger

if __name__ == "__main__":
    task = Task.init(project_name="ClearML-Debug", task_name="[Mac] TB Logger, offline")
    tb_logger = SummaryWriter(log_dir="tb_logger/demo/")
    image_tensor = torch.rand(256, 256, 3)
    for iter in range(10):
        t...
```
Hi @<1523703436166565888:profile|DeterminedCrab71> and @<1523701070390366208:profile|CostlyOstrich36> , coming back to this after a while. It actually seems to be related to Google Cloud permissions:
- The images in the ClearML dashboard do not show, as discussed above
- If I copy the image URL (coming out as something like None ) and open it in another tab where I'm logged into my Google account, the image loads
- If I do t...
@<1523701070390366208:profile|CostlyOstrich36> thank you, now everything works so far!
Last thing: is there any way to change all the links in the new ClearML server, such that an artifact that was previously under s3://…
is now taken from gs://…
? The actual data is already available under the gs:// link, of course.
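In case it helps to frame the request: the change being asked for is just a prefix swap on each stored artifact URI. A hypothetical sketch of that mapping (bucket and path are made up; this only illustrates the string rewrite, it does not touch the server's database):

```python
# Illustrative URI rewrite: map an artifact link from the old S3 bucket
# to the new GCS bucket by swapping the scheme prefix.
def rewrite_uri(uri, old_prefix="s3://", new_prefix="gs://"):
    if uri.startswith(old_prefix):
        return new_prefix + uri[len(old_prefix):]
    return uri  # already migrated or unrelated link: leave unchanged

print(rewrite_uri("s3://my-bucket/artifacts/model.pkl"))
# → gs://my-bucket/artifacts/model.pkl
```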