Check your agent logs (through the ClearML console tab) and see whether any errors are thrown.
What is probably happening is that your agent tries to upload the model but fails due to some kind of networking/firewall/port issue. For example, make sure your self-hosted server binds to host 0.0.0.0 so it can accept external connections, not only connections from localhost.
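To illustrate the bind-address point, here is a minimal sketch using only Python's standard library. It is not ClearML code; the helper name can_connect is made up for this example. It binds a throwaway TCP listener to a given host and checks whether a client can reach it through a given address.

```python
import socket


def can_connect(bind_host, connect_host):
    """Bind a temporary TCP listener on bind_host and try to reach it via connect_host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bind_host, 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        try:
            # The kernel completes the handshake for a listening socket,
            # so no explicit accept() is needed for this check.
            with socket.create_connection((connect_host, port), timeout=2):
                return True
        except OSError:
            return False


# A listener bound to 0.0.0.0 accepts connections on every local interface,
# including loopback.
print(can_connect("0.0.0.0", "127.0.0.1"))
```

A listener bound to 127.0.0.1 instead is only reachable from the same machine, which is why an agent on another host would fail to upload to it.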
Don't paste your API keys! 🙈
After re-reading your question, though, cross-process communication might be difficult. If you want the preprocessing to run at the same time as the training, with the training pulling data from the preprocessing on the fly, that will be harder to set up. Is this your use case?
You're not the first one with this problem, so I think I'll ask the devs to maybe add it as a parameter for clearml-agent. That way it will show up in the docs and you might have found it sooner. Do you think that would help?
Thanks! I've asked the autoscaler devs about this and it might be a bug; you are the second person to report it. He's checking and we'll get back to you!
Well, I'll be, you're 100% right: I can recreate the issue. I'm logging it as a bug now and we'll fix it ASAP! Thanks for sharing!!
Hi LackadaisicalDove24 !
Does this happen with every csv file? If so, I can reproduce it to check if it is a bug 🙂
Hi ExuberantParrot61 ! Can you try using a wildcard? E.g. ds.remove_files(dataset_path='folder_to_delete/*')
The above works for me, so if you try and the command line version does not work, there might be a bug. Please post the exact commands you use when you try it 🙂
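If it helps to sanity-check which paths a wildcard such as folder_to_delete/* would select before running the removal, here is a small sketch using Python's standard fnmatch module. This is only an illustration of shell-style wildcard matching; it is an assumption that ClearML's own matching behaves the same way, and the file list and the helper preview_matches are made up for the example.

```python
from fnmatch import fnmatchcase


def preview_matches(paths, pattern):
    """Return the paths that a shell-style wildcard pattern would select."""
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical dataset contents, for illustration only.
files = [
    "folder_to_delete/a.csv",
    "folder_to_delete/sub/b.csv",
    "keep/c.csv",
]

# In fnmatch, '*' also matches '/' characters, so nested files are selected too.
matched = preview_matches(files, "folder_to_delete/*")
print(matched)
```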
Nice find! I'll pass it through to the relevant devs, we'll fix that right up 🙂 Is there any feedback you have on the functionality specifically? I.e., would you use alias given what you know now, or would you e.g. name it differently?
Based on the screenshot of your package versions, it does seem like tensorboard is not installed there. We depend on that, because every scalar logged to tensorboard is captured in ClearML too. My guess would be that maybe you installed tensorboard in e.g. the wrong virtualenv.
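A quick way to test the wrong-virtualenv guess is to ask the interpreter that runs your training script whether it can see the package. The sketch below uses only the standard library; the helper name is_installed is made up for this example. Run it with the same Python executable you use for the experiment.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


# sys.executable shows which virtualenv (if any) this interpreter belongs to.
print("Interpreter:", sys.executable)
print("tensorboard installed:", is_installed("tensorboard"))
```

If the interpreter path is not the virtualenv you expected, or the second line prints False, the package was installed somewhere else.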
However, you do say you tested it with Tensorboard and even then it didn't work. In that case, are the scalars correctly logged to tensorboard? You should be able to easily check this by doing a run, and then launching...