I call Task.init after I import tensorflow (and thus tensorboard?)
That should have worked...
Can you manually add a TB report before calling the opennmt function?
(I want to verify that Task.init is indeed catching the TB calls; my theory is that somewhere inside opennmt we lose the TB.)
It worked! I added this call shortly after Task.init:
tf.summary.create_file_writer("C:/mypath/logs")
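For anyone repeating this check, here is a minimal sketch of the manual TB report, assuming ClearML and TensorFlow 2.x are installed and ClearML is already configured. The function name, project name, task name, and scalar tag are placeholders made up for illustration; only `Task.init`, `tf.summary.create_file_writer`, and `tf.summary.scalar` come from the libraries themselves.

```python
def report_manual_scalar(log_dir, project_name="examples", task_name="opennmt-tb-check"):
    """Write a few scalars through tf.summary after Task.init has been called.

    Hypothetical helper: names and default values are illustrative only.
    """
    # Imported inside the function so that defining it does not require the packages.
    import tensorflow as tf
    from clearml import Task

    # Task.init goes first, before any summary writer exists.
    task = Task.init(project_name=project_name, task_name=task_name)

    # Same call as in the message above, followed by an actual scalar report.
    writer = tf.summary.create_file_writer(log_dir)
    with writer.as_default():
        for step in range(3):
            tf.summary.scalar("debug/manual_scalar", 1.0 / (step + 1), step=step)
    writer.flush()
    return task
```

If the scalar shows up in the ClearML UI after calling `report_manual_scalar("C:/mypath/logs")`, the hook on `tf.summary` is working and the problem is more likely on the opennmt side.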
Hmm StrangePelican34
Can you verify you call Task.init before TB is created ? (basically at the start of everything)
No, TB (TensorBoard) is not enabled.
That explains it 🙂 did you manage to get it working ?
The only place I see subprocess being called in opennmt is to determine the batch size, but not for the primary training task.
So, according to the article (and the code, as far as I could tell), OpenNMT-tf automatically enables TensorBoard. That is, it auto-logs the relevant features through tf.summary ( https://www.tensorflow.org/api_docs/python/tf/summary ). This is output on the cmd line with the likes of:
INFO:tensorflow:Evaluation result for step 9000: loss = 1.190986 ; perplexity = 3.290324 ; bleu = 63.569644
INFO:tensorflow:Step = 9100 ; steps/s = 2.17, source words/s = 28293, target words/s = 39388 ; Learning rate = 0.000927 ; Loss = 1.381563
However, this data is not picked up automatically by ClearML. I am specifically looking at opennmt's Runner.train: https://opennmt.net/OpenNMT-tf/package/opennmt.Runner.html
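As a possible stopgap while the automatic logging is not picked up, the console lines quoted above could be parsed into step/metric pairs. This is only a sketch that assumes the two line shapes shown in this message; the function names are made up here, and other OpenNMT-tf log formats are not handled.

```python
import re

PREFIX = "INFO:tensorflow:"
EVAL_RE = re.compile(r"Evaluation result for step (\d+):\s*(.*)")


def _parse_pairs(text):
    """Turn 'a = 1 ; b = 2, c = 3' into {'a': 1.0, 'b': 2.0, 'c': 3.0}."""
    pairs = {}
    for chunk in re.split(r"[;,]", text):
        key, sep, value = chunk.partition("=")
        if not sep:
            continue
        try:
            pairs[key.strip()] = float(value.strip())
        except ValueError:
            # Skip anything that is not a plain number.
            continue
    return pairs


def parse_log_line(line):
    """Return (kind, step, metrics) for a known log line, else None."""
    if not line.startswith(PREFIX):
        return None
    body = line[len(PREFIX):].strip()
    match = EVAL_RE.match(body)
    if match:
        return "eval", int(match.group(1)), _parse_pairs(match.group(2))
    pairs = _parse_pairs(body)
    if "Step" in pairs:
        step = int(pairs.pop("Step"))
        return "train", step, pairs
    return None
```

Each parsed value could then be sent to ClearML by hand, for example with the logger's `report_scalar(title=..., series=..., value=..., iteration=...)`, though getting the TB hook itself working is the cleaner fix.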
In TensorFlow's `__init__.py`, tensorboard appears to be initialized (including tf.summary):
```
# Hook external TensorFlow modules.

# Import compat before trying to import summary from tensorboard, so that
# reexport_tf_summary can get compat from sys.modules. Only needed if using
# lazy loading.
_current_module.compat.v2  # pylint: disable=pointless-statement
try:
  from tensorboard.summary._tf import summary
  _current_module.__path__ = (
      [_module_util.get_parent_dir(summary)] + _current_module.__path__)
  setattr(_current_module, "summary", summary)
except ImportError:
  _logging.warning(
      "Limited tf.summary API due to missing TensorBoard installation.")
```
I call `Task.init` after I import tensorflow (and thus tensorboard?) but before I create the opennmt runner. Should this be ok? Are you referring to something else when saying "call Task.init before TB is created"?
No, TB (TensorBoard) is not enabled. I just googled it and found this: https://forum.opennmt.net/t/running-tensorboard/4242 . I will try enabling TB and see if that fixes it.
From the docs, I think what's going on is that Runner.train ( https://opennmt.net/OpenNMT-tf/package/opennmt.Runner.html#opennmt.Runner.train ) is spinning up a new subprocess, and the training itself happens in that subprocess.
If this is the case, it would explain the lack of automagic, as the subprocess is lacking the "Task.init" call
wdyt, could that be the case ?
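To illustrate the theory (not what opennmt actually does): a hook installed in one Python interpreter is not present in a freshly started one. The sketch below uses a stand-in patch on `json.dumps` instead of the real `Task.init`, so every name in it is hypothetical and only the standard library is needed.

```python
import json
import subprocess
import sys


def install_hook():
    """Stand-in for an in-process hook: wrap a library function in THIS interpreter."""
    original = json.dumps

    def hooked_dumps(*args, **kwargs):
        return original(*args, **kwargs)

    hooked_dumps._hooked = True  # marker so we can detect the patch later
    json.dumps = hooked_dumps


def is_hooked_here():
    """Check for the marker in the current process."""
    return getattr(json.dumps, "_hooked", False)


def is_hooked_in_child():
    """Start a fresh interpreter and check whether it sees the marker."""
    code = "import json; print(getattr(json.dumps, '_hooked', False))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

After `install_hook()`, `is_hooked_here()` reports the patch while `is_hooked_in_child()` does not, which is the situation a training subprocess without its own `Task.init` would be in. A forked worker that inherits the parent's memory behaves differently, so this only covers newly launched interpreters.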
Hi StrangePelican34
What exactly is not working? Are you getting any TB reports?