My internet traffic looks weird. I think this is because TensorBoard logs too much data on each batch and ClearML sends it to the server. How can I fix it? My training speed decreased by 5-6 times.
BTW: ComfortableShark77 the network traffic is sent in a background process, it should not affect the processing time, no?
DeliciousBluewhale87 out of curiosity, what do you mean by "deployment functionality"? Is it model serving?
HandsomeCrow5 Seems like the right place would be in the artifacts, as a summary of the experiment (as opposed to ongoing reporting), is that the case?
If it is, then in the Artifacts tab clicking on the artifact should open another tab with your summary, which sounds like what you were looking for (with the exception of the preview thumbnail) 🙂
DeliciousBluewhale87 and is it working?
Sure thing, and I agree it seems unlikely to be an issue 🙂
So how do I solve the problem? Should I just relaunch the agents? Because they can't execute jobs now
Are you running in docker mode ?
If so you can actually delete mapped files (they will still be available inside the docker), just make sure you delete them X hours after they were created, and you should be fine.
wdyt?
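A minimal sketch of that kind of cleanup, in case it helps. It assumes the mapped files live under a single host folder and uses the modification time as a stand-in for creation time (true creation time is not portable across filesystems). The helper name delete_old_files and the path in the usage note are made up for illustration, they are not part of clearml-agent:

```python
import os
import time


def delete_old_files(root, max_age_hours):
    """Delete regular files under `root` last modified more than `max_age_hours` ago.

    Hypothetical helper, not a clearml-agent API. Returns the removed paths.
    """
    cutoff = time.time() - max_age_hours * 3600
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # The file vanished or cannot be removed; skip it.
                continue
    return removed
```

Something like delete_old_files("/path/to/mapped/folder", max_age_hours=24) could then run from cron on the host; both the path and the 24 hours are placeholders.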
Worker just installs by name from pip, and it installs not my package!
Oh dear ...
Did you configure additional pip repositories in the Agent's clearml.conf? https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L77 It might be that (1) is not enough, as pip will first try to search for the package in the pip repository, and only then in the private one. To avoid that, in your code you can point directly to an https URL of your package` Ta...
Hmm that is odd.
Can you verify with the latest from GitHub?
Is this reproducible with the pipeline example code?
Hi @<1571308003204796416:profile|HollowPeacock58>
I'm assuming this is the ARM support (i.e. you are running on a new Mac) fix we released in one of the last clearml-agent versions. Could you update to the latest clearml-agent?
pip3 install clearml-agent==1.6.0rc2
Seems like settings on the clearml-server disappeared (specifically default queue tag?!)
Hi HealthyStarfish45
Funny just today I had a similar discussion on slurm:
https://allegroai-trains.slack.com/archives/CTK20V944/p1603794531453000
Anyhow, when you say "[scale up agents]" are you referring to a machine constantly running an agent pulling jobs from the queue, where the machine itself (aka the resource) is managed as a slurm job?
parser.add_argument("--dataset_mean", type=float, nargs="+", default=0.5)
I think providing nargs='+' assumes the type is a list. Nonetheless we should be able to support it. Could you please add a GitHub issue so we do not forget?
On a side note, is there any way to automatically give more meaningful names to the running docker containers?
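For reference, a small standalone sketch of the plain argparse behavior behind this (no ClearML involved): with nargs="+" the parsed value is a list when the flag is given, but a non-string default is returned as-is, so here it stays a plain float. The variable names are only for illustration:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_mean", type=float, nargs="+", default=0.5)

# No flag given: argparse hands back the default untouched (a float, not a list).
no_flag = parser.parse_args([]).dataset_mean
# Flag given: nargs="+" collects the converted values into a list of floats.
with_flag = parser.parse_args(["--dataset_mean", "0.1", "0.2"]).dataset_mean

print(type(no_flag).__name__, no_flag)
print(type(with_flag).__name__, with_flag)
```

A possible workaround until this is supported might be default=[0.5], so that both cases produce a list.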
What do you mean by that? running where? and where will you see them ?
DeliciousBluewhale87 not on the opensource, for some reason it is not passed 😞
Could you explain the use case ?
This one is used when the agent manually downloads wheels, (pytorch mostly), but as you can see it is under ~/.clearml directory, which usually is already shared on the host
Hi BurlyPig26
I think you can easily change the Web port, but not the API (8008) or files (8081) port
How are you deploying it?
Task deletion failed: unhashable type: 'dict'
Hi FlutteringWorm14 trying to figure out where this is coming from, give me a sec
Hi SucculentBeetle7
The parameters passed to add_step need to contain the section name (maybe we should warn if it is not there, I'll see if we can add it).
So maybe something like:
{'Args/param1': 1}
Or
{'General/param1': 1}
Can you verify it solves the issue?
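If there are many parameters, a tiny helper can add the section prefix before the dictionary is passed to add_step. This is only a sketch: with_section is a made-up name, not a ClearML function, and the default section "Args" is an assumption that fits argparse-style parameters:

```python
def with_section(params, section="Args"):
    """Prefix each parameter name with 'section/' unless it already names a section.

    Hypothetical helper, not part of ClearML.
    """
    return {
        (name if "/" in name else f"{section}/{name}"): value
        for name, value in params.items()
    }


overrides = with_section({"param1": 1, "General/param2": 2})
print(overrides)
```

Names that already contain a "/" are left alone, so mixing sections in one dictionary still works.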
Yep, that would do it ...
You can disable it with:
Task.init(..., auto_connect_frameworks={'scikit': False})
I can see all the steps like git clone,
git clone has nothing to do with "env setup", this is bringing the code, you cannot skip that one. That said, this is why the git repository itself is cached on the host machine, so it is fast
... There may be some odd package that need to be installed because one of our DS is experimenting ... But all that we can see what is happening.
Even if everything is preinstalled, it verifies the packages match, this might take a long time. It's just pip being ...
Woot woot, great to hear 🎊
Notice you have to configure the shared driver for the docker, as the volume mount doesn't work without it. https://stackoverflow.com/a/61850413
Hi PanickyMoth78
You mean like another Task? or maybe Slack message?