 
trains (0.15.1-367) appears to be the version, same as yours. Thank you. It appears Trains is up to date.
Apparently there should be 6 of them:
Thank you 😉
Same problem with  775
OK, it turns out the group also has to be root. I ran the following:
` sudo chmod 775 -R /opt/trains/
sudo chown -R root:root /opt/trains `
and it works.
It seems that it has to be 775 with both user and group as root. For example, 771 does not work, because then the docker command has to be used with sudo (if I want to use my default sudo-user account).
It would have been nice if they had reached out to you guys/gals before removing Trains 😅
AgitatedDove14 TB has the confusion matrix like this:
Aah, I couldn't find it under PLOTS, but indeed it's there under DEBUG SAMPLES.
With PyTorch Lightning, I only use this line at the beginning of a Jupyter Notebook:
` Task.init(project_name=project_name, task_name=task_name) `
The code to log the confusion matrix is in a .py file, though, which does not contain any Trains code.
Is it possible to log it in a TB compatible way, that will be automatically picked up by Trains? I prefer to keep the .py Trains free.
AppetizingMouse58 If I run:
` sudo chmod 771 -R /opt/trains/ `
(taking away all permissions from others except execute), the file permission error comes back, even though everything is under the root user.
Hi  AgitatedDove14
Not using trains-agent yet. Just using PyTorch Lightning in a Jupyter Notebook with Trains as the logger.
So I'm talking about runtime and GPU usage in experiments.
AgitatedDove14 Done!
Ah I see, it's based on a naming scheme, thanks. Sorry I forgot to link the tutorial I was looking at: https://allegro.ai/docs/examples/frameworks/pytorch/pytorch_tensorboard/
That's useful to know! But actually in this case I just want to test whether the code works (run 2 epochs and see if it works). I don't want this to be logged, so I don't call Task.init in those cases.
I don't want the code to crash on Trains in those cases.
I see that  Task.current_task()  returns None if no task is running, so I can use that with an if statement  🙂
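A minimal sketch of that guard, assuming the `trains` package is installed whenever tracking is wanted. The helper name `report_scalar_if_tracking` and its `get_task` parameter are made up for illustration; `Task.current_task()`, `get_logger()` and `report_scalar()` are the Trains calls it relies on.

```python
def report_scalar_if_tracking(title, series, value, iteration, get_task=None):
    """Report a scalar to Trains only when a task is active.

    Returns True if the value was reported, False if it was skipped.
    `get_task` is a hypothetical hook (not part of Trains) so the guard
    can be exercised without a server.
    """
    if get_task is None:
        # Imported here so callers that pass their own get_task
        # do not need Trains installed.
        from trains import Task
        get_task = Task.current_task

    task = get_task()
    if task is None:
        # Task.init() was never called (e.g. a quick 2-epoch check): log nothing.
        return False

    task.get_logger().report_scalar(title, series, value, iteration)
    return True
```

In a training loop this would be called as `report_scalar_if_tracking("loss", "train", loss_value, step)`, and it does nothing when no task was initialized.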
Ah my bad, it seems I had to run
` docker-compose -f /opt/trains/docker-compose.yml pull `
once. I quickly tried Trains about half a year ago, so maybe it was using the old images? However, I thought --build would take care of that.
Now it's working 🙂
Even when I do a "clean install" (renamed the /opt/trains folder and followed the instructions to set up TRAINS), the error appears.
First I tried without build, but same problem.  --build  just means that it will re-download all layers instead of using the ones already cached.
Is it possible it's not just about the root user, but also the root group?
After a while I get the message:
New version available
Click the reload button below to reload the web page
I click the "RELOAD" button and the "newer version" message disappears. However, some plots still don't show up (fixed in 0.15.1). If I refresh the TRAINS web interface, the "newer version" message appears again.
AgitatedDove14 There is only an events.out.tfevents.1604567610.system.30991.0 file.
If I open this with a text editor, most of it is unreadable, but I do find the letters "PNG" close to the name of the confusion matrix. So it looks like the image is encoded inside the TB log file?
It's my colleague's experiment (with scikit-learn), so I'm not sure about the details.
Hmm, after connecting with the VPN again and using ctrl + F5, there is no complaint anymore. However, a colleague uploaded a Seaborn plot and it's still not showing up, which I thought was fixed in the new version?
The plots page of that experiment is pure white, instead of showing the usual "No chart data" that appears when no plot was uploaded.
The only change I made in the .yml file was:
` ports:
- "8080:80" `
to
` ports:
- "8082:80" `
 I already had something running on 8080, but since it's the trains-apiserver and not the webserver, this shouldn't be an issue.
As there are quite some hparams, which also change depending on the experiment, I was hoping there was some automatic way of doing it?
For example that it will try to find all dict entries that match  "yet_another_property_name": "some value"  , and ignore those that don't.
The value has to be converted to a string btw?
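One way this filtering could look, as a sketch only: keep the dict entries whose values are simple scalars and drop the rest. The helper name `loggable_hparams` and the sample dict are hypothetical. The assumption is that the filtered dict would then be handed to `task.connect(...)`, which accepts a dictionary. Values keep their native types here, since it is not confirmed above whether a string conversion is needed.

```python
def loggable_hparams(params, allowed=(str, int, float, bool)):
    """Keep only entries whose values are simple scalars; keys become strings."""
    return {
        str(key): value
        for key, value in params.items()
        if isinstance(value, allowed)
    }


hparams = {
    "lr": 0.001,
    "optimizer": "adam",
    "callbacks": [object()],  # not a simple value, so it is dropped
}
simple = loggable_hparams(hparams)
# With an active task, this dict could then be passed on, e.g. task.connect(simple).
```

Because the filter looks at value types rather than names, it keeps working when the set of hparams changes between experiments.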
Is there any way I can figure out in the web interface which version of Trains is actually running?
AgitatedDove14 Thank you, this code example is very helpful!
` trains-elastic exited with code 1
trains-elastic    | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
trains-elastic    | {"type": "server", "timestamp": "2020-11-02T08:04:57,699Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "trains", "node.name": "trains", "message": "uncaught exception in thread [main]",
trains-elastic    | "stacktrace": ["org.elast...
Ok, it was indeed something with permission. When I chown everything to root (1000) and chmod 777 it worked. 777 is of course not desirable, so I'm going to narrow it down now.
Thank you for the reply! The migration indeed created this elastic_7 folder.
Port  8008  cannot be changed apparently:
https://allegroai-trains.slack.com/archives/CTK20V944/p1592478619463200?thread_ts=1592476990.463100&cid=CTK20V944
So if I want it under plots, I would need to call e.g. report_confusion_matrix, right?
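A rough sketch of what such an explicit call could look like, assuming integer class labels and that `trains` and `numpy` are installed where reporting happens. The helper names `confusion_counts` and `report_to_plots`, the default title and series, and the sample labels are all made up for illustration. Only `Task.current_task()`, `get_logger()` and `report_confusion_matrix()` are Trains calls.

```python
def confusion_counts(y_true, y_pred, num_classes):
    """Count (true, predicted) label pairs into a num_classes x num_classes grid."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def report_to_plots(matrix, iteration, title="Confusion matrix", series="validation"):
    """Report the matrix through the Trains logger if a task is active."""
    # Imported here so computing the matrix does not require these packages.
    import numpy as np
    from trains import Task

    task = Task.current_task()
    if task is None:
        return False  # no Task.init() was called, so nothing is logged
    task.get_logger().report_confusion_matrix(
        title, series, np.asarray(matrix), iteration
    )
    return True


# Rows are true labels, columns are predicted labels.
matrix = confusion_counts([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)
```

This does put a Trains call into the .py file; the `current_task()` check at least keeps it harmless when no task is running.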