Hi @<1523708920831414272:profile|SuperficialDolphin93>, simply set output_uri="/mnt/nfs/shared"
in Task.init.
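A minimal sketch of what that could look like, assuming /mnt/nfs/shared is mounted at the same path on every agent. The project and task names are taken from the question; task_init_kwargs and init_task are hypothetical helper names used only for illustration, and the clearml import is kept inside init_task so the argument builder can be checked without a ClearML server.

```python
def task_init_kwargs(split_index, output_uri="/mnt/nfs/shared"):
    """Build the keyword arguments for Task.init (hypothetical helper)."""
    return {
        "project_name": "train-unet-giraffe",
        "task_name": f"split-{split_index}",
        # A plain folder path here asks ClearML to store output models
        # under that folder rather than leaving them on the agent.
        "output_uri": output_uri,
    }


def init_task(split_index):
    """Create the ClearML task; needs clearml installed and configured."""
    from clearml import Task  # imported lazily on purpose

    return Task.init(**task_init_kwargs(split_index))
```

Calling init_task(config.data.split_index) at the start of the training script would then replace the Task.init call in the question.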
How To Correctly Configure ClearML With PyTorch-Ignite To Write Checkpoints To The
Hi!
How do I correctly configure ClearML with PyTorch-Ignite to write checkpoints to the /mnt/nfs/shared
project directory in a 3-agent cluster?
I tried this:
task = Task.init(
    project_name="train-unet-giraffe",
    task_name=f"split-{config.data.split_index}",
    output_uri=config.clearml.output_uri,
)
...
last_checkpointer = Checkpoint(
    to_save={"model": model, "optimizer": optimizer, "trainer": trainer},
    save_handler=ClearMLSaver(create_dir=True),
    n_saved=2,
    filename_prefix="last",
    global_step_transform=global_step_from_engine(trainer),
)
However, ClearMLSaver writes to a temporary folder on each agent, which makes the checkpoints hard to collect afterwards.
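For reference, ClearMLSaver falls back to a temporary directory only when no dirname is given, so one thing that may be worth trying is pointing dirname at the shared mount. The sketch below is an assumption, not a confirmed fix: checkpoint_dir, make_saver and the "checkpoints" subfolder are hypothetical names, and the import path of ClearMLSaver depends on the installed ignite version.

```python
from pathlib import PurePosixPath

SHARED_ROOT = PurePosixPath("/mnt/nfs/shared")  # path from the question


def checkpoint_dir(split_index, root=SHARED_ROOT):
    """Hypothetical layout: one subfolder per split so agents do not collide."""
    return root / "checkpoints" / f"split-{split_index}"


def make_saver(split_index):
    """Build a ClearMLSaver that writes under the shared mount.

    Call this after Task.init; needs pytorch-ignite and clearml installed.
    """
    try:
        # Newer ignite releases
        from ignite.handlers.clearml_logger import ClearMLSaver
    except ImportError:
        # Older releases keep it under ignite.contrib
        from ignite.contrib.handlers.clearml_logger import ClearMLSaver

    return ClearMLSaver(
        dirname=str(checkpoint_dir(split_index)),
        create_dir=True,
        # Assumption: allow reruns into a folder that already has checkpoints.
        require_empty=False,
    )
```

The result of make_saver(config.data.split_index) would be passed as save_handler to Checkpoint in place of ClearMLSaver(create_dir=True).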