Hi MinuteWalrus85.
Good news about fastai: the integration is almost done and a version will be released in the coming days :)
Thanks for letting me know, I'd be very happy to update.
In the meantime, I got this error message, this time regarding Trains:
```
Traceback (most recent call last):
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/torch/utils/tensorboard/__init__.py", line 2, in <module>
    from tensorboard.summary.writer.record_writer import RecordWriter  # noqa F401
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/trains/binding/import_bind.py", line 59, in __patched_import3
    level=level)
ModuleNotFoundError: No module named 'tensorboard'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/fastai/callbacks/tensorboard.py", line 234, in _queue_processor
    request.write()
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/fastai/callbacks/tensorboard.py", line 424, in write
    self.tbwriter.add_graph(model=self.model, input_to_model=self.input_to_model)
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/tensorboardX/writer.py", line 793, in add_graph
    from torch.utils.tensorboard._pytorch_graph import graph
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/trains/binding/import_bind.py", line 59, in __patched_import3
    level=level)
  File "/home/ubuntu/MultiClassLabeling/myenv/lib/python3.6/site-packages/torch/utils/tensorboard/__init__.py", line 4, in <module>
    raise ImportError('TensorBoard logging requires TensorBoard with Python summary writer installed. '
ImportError: TensorBoard logging requires TensorBoard with Python summary writer installed. This should be available in 1.14 or above.
```
TRAINS Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start
TRAINS Monitor: Reporting detected, reverting back to iteration based reporting
Hi MinuteWalrus85,
Do you have tensorboard installed too?
I installed trains, fastai, tensorboard and tensorboardx and ran a simple example; it can be viewed at this link:
https://demoapp.trains.allegro.ai/projects/bf5c5ffa40304b2dbef7bfcf915a7496/experiments/e0b68d0fe80a4ff6be332690c0d968be/execution
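For reference, a minimal sketch of that kind of setup could look like the following; the dataset, paths, and names are placeholders, not the exact script behind the linked experiment:
```python
from functools import partial
from pathlib import Path

from fastai.vision import untar_data, URLs, ImageDataBunch, cnn_learner, models, accuracy
from fastai.callbacks.tensorboard import LearnerTensorboardWriter
from trains import Task

# Requires: pip install trains fastai tensorboard tensorboardx

# Initializing the Trains task is what patches the TensorBoard writers
task = Task.init(project_name='examples', task_name='fastai tensorboard test')

tboard_path = Path('data/tensorboard/example')  # placeholder log directory

# Small toy dataset so the example runs quickly
data = ImageDataBunch.from_folder(untar_data(URLs.MNIST_TINY), bs=64)
learn = cnn_learner(data, models.resnet18, metrics=accuracy)

# Attach fastai's TensorBoard callback; Trains picks up the TB events automatically
learn.callback_fns.append(partial(LearnerTensorboardWriter,
                                  base_dir=tboard_path, name='run1'))
learn.fit_one_cycle(3)
```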
Yes, that solved the errors. However, the two lines "could not detect iteration reporting" and, a few moments later, "reporting detected" still show up.
It isn't an error, just an observation.
Hi MinuteWalrus85,
You are right, this is an observation about the reports on your experiment.
Once trains can't detect reports being sent by the script, it falls back to reporting iterations as seconds-from-start, until reports are detected.
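If you want the script to give Trains explicit iteration numbers instead of waiting for automatic detection, scalars can also be reported directly through the task's logger. A minimal sketch, with placeholder title/series names and values:
```python
from trains import Task

task = Task.init(project_name='examples', task_name='manual reporting')
logger = task.get_logger()

for iteration in range(100):
    loss = 1.0 / (iteration + 1)  # placeholder value
    # Explicit scalar reports give the monitor a real iteration number to follow
    logger.report_scalar(title='loss', series='train', value=loss, iteration=iteration)
```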
Good morning Alon, since you helped me so much getting tensorboard to show results yesterday, I'm hoping you can help me understand why some results I'm getting are strange:
The valid_loss and Accuracy are showing on the Tboard with the same values as they show up in the terminal, but the train_loss is showing on a different scale and I can't figure out why. I did not change anything in the core files of torch, Tboard or fastai, and used the initialization in the same way you showed and as in the fastai docs, using learn.callback_fns.append(partial(LearnerTensorboardWriter, base_dir=tboard_path, name=taskName))
Here is an example of the results from terminal:
the train_loss is in the second column from the left (the far-left column is the epoch number, 30-36)
here is the result on the Tboard in Trains:
This is the valid_loss, which is correct:
Hi MinuteWalrus85, good morning 🌞
What do you get when you view the results with the TensorBoard dashboard?
(You can view them with tensorboard --logdir=<tboard_path>)
I will try and let you know in the next experiment
MinuteWalrus85 thanks for the screenshot; I'm asking about the TB dashboard to understand where the issue is coming from.
Trains patches the TB stats and shows them to you in the web app, so if the results in the TB dashboard are the same as in the web app, the values themselves are being reported incorrectly; if the TB dashboard and the web app show different results, there may be an issue with the web-app reporting.
Understood. If there is something I can tweak in the reporting, I couldn't find where to tweak it, since it is all supposed to come from the one line that activates the reporting: learn.callback_fns.append(partial(LearnerTensorboardWriter, base_dir=tboard_path, name=taskName))
Do you have any ideas about what options I have for changing how the train_loss is reported?
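For reference, fastai v1's LearnerTensorboardWriter also accepts optional frequency arguments such as loss_iters, hist_iters and stats_iters, which control how often the per-batch smoothed training loss, histograms and model stats are written, so the TB train_loss curve is sampled differently from the per-epoch value printed in the terminal. A hedged sketch, assuming learn, tboard_path and taskName are defined as in the line quoted above:
```python
from functools import partial
from fastai.callbacks.tensorboard import LearnerTensorboardWriter

# Assumes learn, tboard_path and taskName already exist as in the snippet above.
# loss_iters / hist_iters / stats_iters set how often the callback writes the
# per-batch smoothed loss, weight histograms and model stats to the event files.
# The numbers here are illustrative, not a confirmed fix for the scale difference.
learn.callback_fns.append(partial(LearnerTensorboardWriter,
                                  base_dir=tboard_path,
                                  name=taskName,
                                  loss_iters=25,
                                  hist_iters=500,
                                  stats_iters=100))
```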