but the reason I said the comparison could be an issue is that I'm not able to do comparisons of experiments
I can't access the web app, nor SSH into the server
yes, the code is inside a git repository. In the main script: from callbacks import function_plot_conf_matrix
and inside callbacks.py
of course at the beginning we have from sklearn.metrics import confusion_matrix
or something like that
Worked perfectly, thanks!
sorry, in my case it's the default mode
Ok, I think I figured it out. We started with a main script that imported sklearn, and then we moved that function outside the main script and imported it instead.
So when we cloned the first time we had sklearn in the Installed Packages, and therefore our agent was able to run. The (now) cached clearml-venv had sklearn installed, so when it ran the second experiment without the sklearn import in the main script, and therefore without it in the Installed Packages, it didn't matter, b...
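In case it helps anyone else hitting this: the failure mode can be illustrated with a tiny sketch of import-based package detection. This is only an illustration (ClearML's real analysis is more elaborate), and the script contents below are made up:

```python
import ast

def detect_top_level_imports(source: str) -> set:
    """Roughly mimic import-based package detection: collect the
    top-level module names imported by a script."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found

# Before the refactor: sklearn is imported directly in the main script
old_main = "from sklearn.metrics import confusion_matrix\n"
# After the refactor: the main script only imports the local callbacks module
new_main = "from callbacks import function_plot_conf_matrix\n"

print(detect_top_level_imports(old_main))  # {'sklearn'}
print(detect_top_level_imports(new_main))  # {'callbacks'} — sklearn is no longer visible
```

So a fresh agent venv built from the second experiment's Installed Packages would simply never get sklearn installed, while the cached venv masked the problem.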
could it be a memory issue triggered by the comparison of 3 experiments?
oh right, it will try to use globals from /etc/pip.conf first and then from the virtualenv's pip.conf
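For reference, a minimal sketch of the two config locations (the index URLs are placeholders):

```ini
# /etc/pip.conf — global defaults, read first
[global]
index-url = https://pypi.org/simple

# $VIRTUAL_ENV/pip.conf — per-virtualenv settings, which override the globals
[global]
index-url = https://my-internal-mirror.example.com/simple
```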
great! thank you for such a quick response!
I'm creating them for TensorBoard, yes, and they appear under the Debug Samples tab
how quick is "very quickly"? we are talking about maybe 30 minutes to reach 100 epochs
I see the correct confusion matrices in tensorboard
I'm afraid I'm still having the same issue..
don't think so, I'm saving the model at the end of each epoch
it's very odd for me too, I have another project running trainings longer than 100 epochs and I don't have this issue
Awesome! I'll let you know if it works now
I don't understand, though... why doesn't this happen in my other experiments?
the issue is that the confusion matrix showing for epoch 101 is in fact the one for epoch 1.
The images are stored in the default files server
I need to wait 100 epochs 😅
I'm plotting the confusion matrices the regular way: plot, then read the figure from a buffer to create the tensor, and save the tensor
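By "the regular way" I mean roughly this (a minimal sketch; the function name and the plotting details are made up, and the real callback then logs the returned array to TensorBoard as an image):

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

def conf_matrix_to_image(cm, class_names):
    # Draw the confusion matrix with matplotlib
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.imshow(cm, cmap="Blues")
    ax.set_xticks(range(len(class_names)))
    ax.set_xticklabels(class_names)
    ax.set_yticks(range(len(class_names)))
    ax.set_yticklabels(class_names)
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, str(cm[i, j]), ha="center", va="center")
    # Render the figure into an in-memory PNG buffer ...
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    buf.seek(0)
    # ... and decode it back into an (H, W, C) image array,
    # which the training loop then logs as a TensorBoard image
    return plt.imread(buf)

cm = np.array([[5, 1], [2, 7]])
img = conf_matrix_to_image(cm, ["cat", "dog"])
print(img.ndim)  # 3
```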
oh wait, I was using clearml == 0.17.5 and I also had this issue