the train_loss is on the second from left column (the far left is epoch num 30-36)
Hi MinuteWalrus85 ,
Do you have tensorboard
installed too?
I installed trains
, fastai
, tensorboard
and tensorboardx
and run a simple example, can be view in this link -
https://demoapp.trains.allegro.ai/projects/bf5c5ffa40304b2dbef7bfcf915a7496/experiments/e0b68d0fe80a4ff6be332690c0d968be/execution
The valid_loss and Accuracy are showing on the Tboard in the same number values as they show up on the terminal, but the train_loss is showing in a different scale and I can't figure out why. I did not change anything in the core files of either torc, Tboard or fastai, and used the intialization in the same way that you showed, and was on fastai docs, using learn.callback_fns.append(partial(LearnerTensorboardWriter, base_dir=tboard_path, name=taskName))
TRAINS Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start TRAINS Monitor: Reporting detected, reverting back to iteration based reporting
` Traceback (most recent call last):
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/torch/utils/tensorboard/init.py", line 2, in <module>
from tensorboard.summary.writer.record_writer import RecordWriter # noqa F401
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/trains/binding/import_bind.py", line 59, in __patched_import3
level=level)
ModuleNotFoundError: No module named 'tensorboard'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/fastai/callbacks/tensorboard.py", line 234, in _queue_processor
request.write()
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/fastai/callbacks/tensorboard.py", line 424, in write
self.tbwriter.add_graph(model=self.model, input_to_model=self.input_to_model)
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/tensorboardX/writer.py", line 793, in add_graph
from torch.utils.tensorboard._pytorch_graph import graph
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/trains/binding/import_bind.py", line 59, in __patched_import3
level=level)
File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/torch/utils/tensorboard/init.py", line 4, in <module>
raise ImportError('TensorBoard logging requires TensorBoard with Python summary writer installed. '
ImportError: TensorBoard logging requires TensorBoard with Python summary writer installed. This should be available in 1.14 or above. `
yes, that solved the errors, however the two lines "could not detect iteration reporting" and "reporting detected" a few moments later, still show up
Here is an example of the results from terminal:
MinuteWalrus85 thanks for the screenshot, asking about TB dashboard to understand where the issue is coming from.
Trains is patching TB stats and showing it to you in the web-app, so if the results are the same in TB dashboard, the reporting of the values can be wrong, if the TB dashboard and the web-app have different results, there can be an issue with web-app reporting
I will try and let you know in the next experiment
This is the valin_loss, which is correct:
it isn't an error, but just an observation
Good morning Alon, since you helped me so much getting tensorboard to show results yesterday, I'm hoping you can help me understand why some results I'm getting are strange:
here is the result on the Tboard in Trains:
Understood. If there is something I can tweak in the reporting, I couldn't find where I tweak it since it is supposed to be related to the one line of activation of the reporting learn.callback_fns.append(partial(LearnerTensorboardWriter, base_dir=tboard_path, name=taskName))
do you have any ideas what are the options I can do to change the report of the train_loss?
Hi MinuteWalrus85 , Good morning 🌞
What do you get when view the results with Tensorboard dashboard?
(you can view those with tensorboard --logdir=<tboard_path>
)
change the report of the train_loss?
Are you referring for not sending the train_loss
results?
MinuteWalrus85 didn’t success to reproduce it, can you share with me your experiment (without any data, just how to reproduce)? We can continue in DM if you like
Thanks for letting me know, I'd be very happy to update.