Error 12 : Validation error (value '['13b46b9325954517ab99381d5f45237d', 'bc76c3a7f0f6431b8e064212e9bdd2c0', '5d2a57cd39b94250b8c8f52303ccef92', 'e4731ee5b33e41d992d6d3fdb2913045', '698d9231155e41fbb61f8f3faa605727', '2171b190507f40d1be35e222045c58ea', '55c81a5db0ad40bebf72fdcc1b3be2a4', '94fbdbe26ef242d793e18d955cb3de58', '7d8a6c8f2ae246478b39ae5e87def2ad', '141594c146fe495886d477d9a27c465f', '640f87b02dc94a4098a0aba4d855b8f5']' length is bigger than allowed maximum '10'.)
we often do ablation studies with more than 50 experiments, and it was very convenient to compare their dynamics at different epochs
fantastic, everything is working perfectly
thanks guys
LOL
wow
I was trying to find how to create a queue using CLI
another stupid question - what is the proper way to delete a worker? so far I've been using pgrep to find the relevant PID
same here, changing arguments in the Args section of Hyperparameters doesn't work, training script starts with the default values.
trains 0.16.0
trains-agent 0.16.0
trains-server 0.16.0
I updated the version in the Installed packages section before starting the experiment
I change the arguments in Web UI, but it looks like they are not parsed by trains
nope, same problem even after creating a new experiment from scratch
I don't connect anything explicitly, I'm using argparse, it used to work before the update
it prints an empty dict
I'm doing Task.init() in the script, maybe it somehow resets connected parameters... but it used to work before, weird
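for reference, a minimal sketch of the kind of script described above: plain argparse plus Task.init(), nothing connected explicitly. The argument names (--lr, --epochs) and the project/task names are made up for illustration, and the import assumes the trains 0.16.x package (the same class is exposed as clearml.Task after the rename). This only shows the setup, it is not a confirmed reproduction of the bug.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical arguments; the real training script's flags are not shown in the thread.
    parser = argparse.ArgumentParser(description="toy training script")
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--epochs", type=int, default=10)
    return parser


def main() -> None:
    # Imported here so the parser can be exercised without the SDK installed.
    from trains import Task  # assumption: trains 0.16.x; later versions use `from clearml import Task`

    # Project and task names are placeholders.
    task = Task.init(project_name="examples", task_name="argparse check")

    # When run by an agent, values edited in the Web UI are expected to
    # replace these defaults; printing them shows what the script received.
    args = build_parser().parse_args()
    print(vars(args))


if __name__ == "__main__":
    main()
```

if the printed values still match the defaults after editing them in the Web UI, the override did not reach the script.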
I'm so happy to see that this problem has been finally solved!
for me, increasing shm-size usually helps. what does this RC fix?
{
username: "username"
password: "password"
name: "John Doe"
},
I updated S3 credentials, I'll check if they work later
it doesn't explain inability to delete logged images and texts though
self-hosted ClearML server 1.2.0
SDK version 1.1.6
standalone-mode gives me "Could not freeze installed packages"
problem is solved. I had to replace /opt/trains/data/fileserver with /opt/clearml/data/fileserver in Agent configuration, and replace trains with clearml in Requirements
if you click on the experiment name here, you get 404 because link looks like this:
https://DOMAIN/projects/PROJECT_ID/EXPERIMENT_ID
when it should look like this:
https://DOMAIN/projects/PROJECT_ID/experiments/EXPERIMENT_ID
I'm not sure it's related to the domain switch since we upgraded to the newest ClearML server version at the same time
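as a stopgap, the difference between the two link forms quoted above can be sketched as a small rewrite. The function name fix_experiment_link is made up for illustration, and it assumes the broken links always have exactly the shape projects/PROJECT_ID/EXPERIMENT_ID. It is not part of ClearML.

```python
from urllib.parse import urlsplit, urlunsplit


def fix_experiment_link(url: str) -> str:
    """Insert the missing 'experiments' path segment into a broken link.

    Hypothetical helper: only rewrites paths shaped like
    /projects/<project_id>/<experiment_id>; anything else is returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if len(segments) == 3 and segments[0] == "projects":
        segments.insert(2, "experiments")
        return urlunsplit(parts._replace(path="/" + "/".join(segments)))
    return url


print(fix_experiment_link("https://DOMAIN/projects/PROJECT_ID/EXPERIMENT_ID"))
```

links that already contain the experiments segment pass through untouched, so it is safe to apply to both forms.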