I'll see if I can get time to do a reproducible example.
The issue only arises upon sending Images. (Both numpy, mpl and PIL)
Hmm, can you test with the latest RC? pip install clearml==0.17.6rc1
Likely related to: https://github.com/allegroai/clearml/issues/321 - clearML waits indefinitely on logs from TF v1
It seems there is some async behavior going on. After ending a run, this prompt just hangs for a long time:
2021-04-18 22:55:06,467 - clearml.Task - INFO - Waiting to finish uploads
And there's no sign of updates on the dashboard.
The issue only arises upon sending Images. (Both numpy, mpl and PIL)
BTW: they should appear under the debug-samples tab in the results
I'm manually executing on a local machine
Hi TrickyRaccoon92
TKinter is suddenly used as the backend, and instead of writes to the dashboard I get popups per figure.
Are you running with an agent or manually executing the code?
And another detail, upon running the same code in a notebook session, everything gets stored as intended (to clearML dashboard)
If this is the case, then we do not change the matplotlib backend
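One thing that might be worth ruling out for the TKinter pop-ups, as a sketch rather than a confirmed fix: pin matplotlib to a non-GUI backend through the MPLBACKEND environment variable before matplotlib is first imported. The helper name force_headless_backend is made up for illustration, and whether this changes what ends up on the dashboard is an assumption that would need testing.

```python
import os


def force_headless_backend(env=None, backend="Agg"):
    """Set MPLBACKEND so matplotlib selects a non-GUI backend on import.

    `env` defaults to the real process environment; a plain dict can be
    passed instead to try the helper without touching os.environ.
    """
    env = os.environ if env is None else env
    env["MPLBACKEND"] = backend
    return env["MPLBACKEND"]


# This only has an effect if it runs before the first `import matplotlib`
# anywhere in the process.
force_headless_backend()
```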
Also, I've attempted converting the mpl image to PIL and using report_image to push the image, to no avail.
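For comparison, a variation on that attempt, sketched under assumptions: instead of converting to PIL, save the figure to a PNG file and hand the path to report_image through its local_path argument. The helper name report_figure_from_file is hypothetical; `logger` is assumed to be a ClearML Logger (e.g. from task.get_logger()) and `figure` anything with a matplotlib-style savefig method. This is not known to resolve the issue in this thread, it only exercises a different upload path.

```python
import os
import tempfile


def report_figure_from_file(logger, figure, title, series, iteration):
    """Save `figure` as a PNG and report it via report_image(local_path=...)."""
    # Reserve a temporary .png path; close the descriptor so savefig can write.
    fd, path = tempfile.mkstemp(suffix=".png")
    os.close(fd)

    # matplotlib infers the PNG format from the file extension.
    figure.savefig(path)

    # Uploads may happen in the background, so the file is not deleted here;
    # delete_after_upload asks the logger to clean it up once it is done.
    logger.report_image(
        title=title,
        series=series,
        iteration=iteration,
        local_path=path,
        delete_after_upload=True,
    )
    return path
```

If the image still does not show up under debug-samples with this path-based call, that would suggest the problem is in the upload itself rather than in the mpl-to-PIL conversion.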
What are you getting? error / exception ?
I see. No, I do not get anything. I either get pop-ups when trying to capture the mpl figure, or nothing happens when using report_image.
I should mention this is run within a TF v1 session context
Yes, I am aware of this. None of the values are being reported, neither the scalar, images in debug-samples, nor in the plots tab.
I am able to register that the image exists, however, the push towards the clearML server just does not happen
It seems there is some async behavior going on. After ending a run, this prompt just hangs for a long time:
2021-04-18 22:55:06,467 - clearml.Task - INFO - Waiting to finish uploads
And there's no sign of updates on the dashboard
Hmm, that could point to an issue uploading the last images (which are larger than regular scalars). Could you try flushing and waiting?
i.e. task.flush() then sleep(45)
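Spelled out, the flush-and-wait suggestion could look roughly like the sketch below. The helper name flush_and_wait and its parameters are made up for illustration; `task` is assumed to be the ClearML task object (for example, what Task.init returns), and 45 seconds is just the value mentioned above, not a recommended setting.

```python
import time


def flush_and_wait(task, wait_seconds=45, sleep=time.sleep):
    """Flush pending reports, then pause to give background uploads time.

    `sleep` is injectable only so the helper can be tried without
    actually waiting.
    """
    # Push anything still buffered on the client side.
    task.flush()
    # Give the uploads (images are larger than scalars) time to finish.
    sleep(wait_seconds)
```

Calling flush_and_wait(task) as the last step of the script, before the process exits, is the intended use.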
Scalars plot fine, images do not find their way to the dashboard
AgitatedDove14, after some more probing, the error seems to stem from the clearML server. The upload from client side does not seem to occur, or the server is not registering the uploads.
I've attempted to restart the server and pull the latest image, same error.
Got it to work now. I suspect it has something to do with a Hydra issue, as both the clearML hosted and self-hosted options presented the same issue. Will update as I get a chance.
I might add, this error only started to occur upon upgrading from trains to clearML
Hi TrickyRaccoon92
Yes please update me once you can, I would love to be able to reproduce the issue so we could fix for the next RC 🙂
I should mention this is run within a TF v1 session context
This should not be connected.
everything gets stored as intended (to clearML dashboard)
So in jupyter it works? But from command line it does not ? what's the difference ?
I'm running in the command line with Hydra and TFv1