Unanswered
Hi, I see some strange behavior where training fails (loss = NaN) when running with ClearML, but not when running without ClearML. This is entirely reproducible.
Has anyone seen this?
The code that produces this is the fit call in TF: model.fit(train_dataset, validation_data=val_dataset, epochs=cfg.fit.epochs, callbacks=callbacks, verbose=2)
ClearML is activated in the usual way: task = Task.init(project_name=project_name, task_name=name, output_uri=True, auto_connect_frameworks={'tensorflow': False}, **kwargs)
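To pin down where the two runs diverge, something like the following could be applied to the `history` dict of the History object that `model.fit` returns, once for the run with ClearML and once for the run without. This is only a minimal sketch: the helper name `first_nan_epochs` and the sample numbers are made up for illustration and are not taken from the actual runs.

```python
import math


def first_nan_epochs(history):
    """Map each metric name to the first epoch index whose value is NaN.

    `history` is assumed to be a dict of metric name -> list of per-epoch
    floats, i.e. the shape of `History.history` returned by Keras `model.fit`.
    Metrics that never become NaN are left out of the result.
    """
    result = {}
    for name, values in history.items():
        for epoch, value in enumerate(values):
            if math.isnan(value):
                result[name] = epoch
                break
    return result


# Hypothetical numbers for illustration only, not from the real runs.
with_clearml = {
    "loss": [0.9, float("nan"), float("nan")],
    "val_loss": [1.0, 0.8, float("nan")],
}
without_clearml = {
    "loss": [0.9, 0.7, 0.5],
    "val_loss": [1.0, 0.8, 0.6],
}

print(first_nan_epochs(with_clearml))
print(first_nan_epochs(without_clearml))
```

With the real runs, the input would be `model.fit(...).history`. Knowing the first epoch at which each metric turns NaN shows whether the loss is NaN from the very first epoch or only diverges later in training.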
6 months ago