Hi Martin, thanks for the swift response.
Yes, the artifacts, as backing up the full database would not resolve the question of capacity, unless I’m missing something.
Noted. Thanks SuccessfulKoala55 - I was aware of this, yet I didn’t think it would also compare in the comparison window - It does. My god, I love your product.
Additionally -
Is there any clever functionality for dumping experiment data to external storage to avoid filling up the server?
Ahh excellent! Thanks AgitatedDove14 🙂
I guess I could do a backup of the DB and flush the data, but what I’m looking for is more of a “Select X experiments -> Send to blob storage” to free up space.
Upon removing the phase loop, the epoch was detected automatically again.
Hi AgitatedDove14
Turns out my double loop caused some issues.
    for e in range(num_epochs):
        for phase in ['train', 'valid']:
            for batch in dataloader:
                ...
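For anyone who needs to keep the phase loop, one option that might help is to stop relying on automatic detection and pass an explicit iteration instead. The sketch below is only an illustration of that idea: `run_epochs` and `batches_per_phase` are hypothetical names, and the tuple appended to `reported` stands in for whatever reporting call is actually used.

```python
from collections import defaultdict


def run_epochs(num_epochs, batches_per_phase):
    """Walk the epoch/phase/batch loops with one explicit step counter per phase.

    batches_per_phase is a hypothetical mapping such as {"train": 2, "valid": 1}.
    Returns the (phase, epoch, step) triples that would be handed to a logger.
    """
    steps = defaultdict(int)  # independent, monotonically increasing counter per phase
    reported = []
    for epoch in range(num_epochs):
        for phase in ("train", "valid"):
            for _ in range(batches_per_phase[phase]):
                # Stand-in for a real reporting call; the step is given
                # explicitly rather than inferred from the loop structure.
                reported.append((phase, epoch, steps[phase]))
                steps[phase] += 1
    return reported
```

With this layout the train and valid series each get their own increasing step, so the nested loops no longer decide what counts as an iteration.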
Yes, exactly, from a previously executed run. Essentially, I write a grid of images from a model that is learning a generative task. I’d like to download all the images and generate a GIF from the collection.
I’d like to create an animated GIF.
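A minimal sketch of the GIF step, assuming the images have already been downloaded to a local folder (how they are fetched from the server is not covered here). It assumes Pillow is installed; `natural_key`, `build_gif`, and the file names are hypothetical. The frames are sorted by the numbers in their file names so that, for example, iteration 10 comes after iteration 2.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders embedded numbers numerically (2 before 10)."""
    # re.split with a capturing group alternates text and digit chunks.
    return [int(tok) if tok.isdigit() else tok
            for tok in re.split(r"(\d+)", Path(path).name)]


def build_gif(frame_paths, out_path, ms_per_frame=200):
    """Write the given image files into one looping animated GIF."""
    # Pillow is an assumption of this sketch; imported lazily so the
    # sorting helper above stays usable without it.
    from PIL import Image

    ordered = sorted(frame_paths, key=natural_key)
    if not ordered:
        raise ValueError("no frames given")
    frames = [Image.open(p).convert("RGB") for p in ordered]
    frames[0].save(
        out_path,
        save_all=True,             # write every frame, not only the first
        append_images=frames[1:],
        duration=ms_per_frame,     # display time per frame, in milliseconds
        loop=0,                    # 0 means loop forever
    )
    return ordered
```

Usage would look something like `build_gif(Path("downloaded_images").glob("*.png"), "progress.gif")`, where the folder name is a placeholder.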
According to https://plotly.com/python/v3/LaTeX/, Plotly should support LaTeX in labels since 3.6.0.
Hi SuccessfulKoala55 , I believe I was only given one option in my region (EU Stockholm) which was the 0.16.1 version with the AMI location:
aws-marketplace/allegroai-trains-server-0.16.1-320-273-c5c210e4-5094-4eb9-a613-a32c0378de31-ami-06f5e9f4dfa499dca.4
I used the Trains AMI, and I am not sure whether it was the auto-updated or static one
Looks like we’re up and running
Hi AgitatedDove14 - I used the TensorBoard writer.scalars function. I haven't tried Plotly natively, but I guess it's the same, since I imagine you're just doing a passthrough.
Hang on, it was the static (no auto-update) AMI: ami-0d6f44a1a7145a9f8
Yes, the login details to the Trains UI
A lot of bad requests and connection refused
AgitatedDove14, after some more probing, the error seems to stem from the ClearML server. Either the upload from the client side does not occur, or the server is not registering the uploads.
I've attempted to restart the server and pull the latest image, same error.
I might add that this error only started to occur upon upgrading from Trains to ClearML.
Yes, I am aware of this. None of the values are being reported: not the scalars, not the images in debug samples, and nothing in the plots tab.
I am able to confirm that the image exists; however, the push to the ClearML server just does not happen.
I see. No, I do not get anything. I either get pop-ups when trying to capture the mpl figure, or nothing happens when using report_image.
I should mention that this is run within a TF v1 session context.
Hi AgitatedDove14 , I am also unable to archive the individual experiments in it.
It's not a major issue, but it would be nice to remove it, as some users may get confused.
However, in this case the option is grayed out.
Scalars plot fine, but images do not find their way to the dashboard.
It seems there is some async behavior going on. After ending a run, this prompt just hangs for a long time:
2021-04-18 22:55:06,467 - clearml.Task - INFO - Waiting to finish uploads
And there's no sign of updates on the dashboard.
I'll see if I can get time to do a reproducible example.
The issue only arises when sending images (NumPy, mpl, and PIL alike).