
TRAINS Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start
TRAINS Monitor: Reporting detected, reverting back to iteration based reporting
yes, that solved the errors, however the two lines "could not detect iteration reporting" and "reporting detected" a few moments later, still show up
I'll look at the security group. Any tips on how to configure it so that it isn't exposed to the entire world, but also not locked to me?
So, inbound rules should allow custom TCP for the three ports 8080, 8008, and 8081? What about the outbound rules?
SuccessfulKoala55 in step 1. it isn't written
but if I just enter the URL with ending :8080 it takes me to the old login
Traceback (most recent call last):
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/torch/utils/tensorboard/__init__.py", line 2, in <module>
    from tensorboard.summary.writer.record_writer import RecordWriter  # noqa F401
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/trains/binding/import_bind.py", line 59, in __patched_import3
    level=level)
ModuleNotFoundError: No module named 'tensorboard'
During handling of the above exception, ...
It turns out that my docker-compose.yml wasn't in the environment path, so when I first ran the down command, it had no effect.
Good morning Alon, since you helped me so much getting tensorboard to show results yesterday, I'm hoping you can help me understand why some results I'm getting are strange:
Thanks for letting me know, I'd be very happy to update.
Thanks Jake, clearing the cache did the trick! Thank you so much for your assistance!
(and I didn't use the -f switch since it wasn't in the instructions, and I'm not all that familiar with Docker)
Turns out the step I missed (maybe it should be mentioned in the docs...) was configuring the Security Group for the EC2 machine to allow inbound connections on ports 8080, 8008, and 8081, and to limit the source to my IP (or my office IP) only.
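For anyone hitting the same thing, here is a minimal sketch of what those inbound rules could look like as data. It only builds the rule dictionaries and does not talk to AWS. The helper name `build_ingress_rules` and the address `203.0.113.7/32` (a documentation-only address standing in for the office IP) are assumptions for illustration. The dictionary layout follows the `IpPermissions` shape that boto3's `authorize_security_group_ingress` accepts, but treat that as something to check against your own setup.

```python
import ipaddress

# Ports named in the message above (assumed here to be web UI, API, file server).
CLEARML_PORTS = (8080, 8008, 8081)


def build_ingress_rules(source_cidr, ports=CLEARML_PORTS):
    """Return one TCP ingress rule per port, limited to source_cidr.

    Hypothetical helper: it only builds dictionaries, nothing is sent to AWS.
    """
    # Validate the CIDR; strict=True rejects values such as 10.0.0.1/24.
    network = ipaddress.ip_network(source_cidr, strict=True)
    if network.prefixlen == 0:
        # 0.0.0.0/0 or ::/0 would expose the server to the entire world.
        raise ValueError("refusing to open the ports to the entire world")
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": str(network)}],
        }
        for port in ports
    ]


if __name__ == "__main__":
    # 203.0.113.7 is a placeholder, not a real office address.
    for rule in build_ingress_rules("203.0.113.7/32"):
        print(rule)
```

Using a /32 limits access to a single address; a wider office range such as a /24 would work the same way, as long as it is not 0.0.0.0/0.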
it will switch to the new one
Thank you Martin for your fast response! Will do
After running docker ps I saw that all the ports were still listed. I then changed the name of /opt/clearml back to /opt/trains and ran the command sudo docker-compose -f /opt/trains/docker-compose.yml down, and it did the trick.
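The fix above comes down to pointing docker-compose at the compose file explicitly with -f instead of relying on the current directory. A small sketch of that, assuming you want to script it from Python; `compose_down_command` is a hypothetical helper name, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def compose_down_command(compose_file, use_sudo=True):
    """Build the argument list for `docker-compose -f <file> down`.

    Hypothetical helper: passing the file with -f means the command does
    not depend on which directory it is run from.
    """
    args = ["docker-compose", "-f", str(compose_file), "down"]
    return ["sudo", *args] if use_sudo else args


if __name__ == "__main__":
    cmd = compose_down_command("/opt/trains/docker-compose.yml")
    # Print the command for review; to execute it, one option is
    # subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```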
if I add :8080/login I get the new "clearML" welcome page
This is an error during training that points to an Elasticsearch error. This might also be the cause of the delete error; what do you think, SuccessfulKoala55?
I think you have the page cached - can you try reloading using Ctrl-F5?
Using Ctrl-F5, it redirects to the new ClearML login.
in the meantime, I got this error message, this time regarding Trains:
The train_loss is in the second column from the left (the far-left column is the epoch number, 30-36).