"requirements.txt" is ignored if the Task has an "installed packges" section (i.e. not completely empty) Task.add_requirements('pandas') needs to be called before Task.init() (I'll make sure there is a warning if called after)
Are they expanded in the "api_server" ? (I verified on a linux machine, same error, the env in the api_server is not being resolved)
MagnificentSeaurchin79 do you have the "." package listed under "installed packages" after you reset the Task ?
Yes, only that task.execute_remotely() should be the last call, because it will literally stop the local run before you add the Args section
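Roughly like this (queue name and argument are placeholders; the argparse arguments are what end up in the Args section):
```
import argparse
from clearml import Task

task = Task.init(project_name="examples", task_name="remote demo")

parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=32)
args = parser.parse_args()  # argparse values are logged into the Args section

# execute_remotely() goes last: it stops the local run and enqueues the task,
# so anything registered after it would never make it into the Task
task.execute_remotely(queue_name="default")

# ... the actual training code runs only on the agent from here on
```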
Hi EnviousStarfish54
Color coding in the entire UI is stored per user (I think in your local cookies, but I might be wrong). Anyhow, any title/series combination will have the selected color regardless of the project.
This way you can configure once that loss is red, accuracy is green, etc.
It seems the code is trying to access an s3 bucket, could that be the case? PanickyMoth78 any chance you can post the full execution log? (Feel free to DM so it won't end up being public)
but I can't seem to figure out a way to do something similar using a task in add_step
VexedCat68 With "add_step" it assumes the Task you are adding is self contained (i.e. there is no "return object" to serialize), this means you can only add arguments, or use the artifacts the Task (i.e. step) will recreate, obviously you knowing in advance what the step creates. Make sense ?
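For illustration, a rough sketch of what that looks like — the project/task names, parameter keys, and the "${step_a.id}" wiring are assumptions for the example:
```
from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0")

# step_a is a self-contained Task; we can only override its arguments
pipe.add_step(
    name="step_a",
    base_task_project="examples",
    base_task_name="prepare data",
    parameter_override={"General/dataset_path": "/data/raw"},
)

# step_b consumes whatever step_a registers (we know in advance what it creates),
# here it is handed step_a's task id so it can fetch the artifacts itself
pipe.add_step(
    name="step_b",
    parents=["step_a"],
    base_task_project="examples",
    base_task_name="train model",
    parameter_override={"General/dataset_task_id": "${step_a.id}"},
)

pipe.start(queue="services")
```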
oh...so is this a bug?
It was always a bug, only an elusive one 🙂
Anyhow, I'll make sure we push a fix to GitHub; an RC is planned for later this week and it will contain it
yea, does the enterprise version have more functionality like this?
yes, all sorts of bits and pieces for easier DevOps / K8s etc.
With pipe.start(queue='services') it still tries to run some docker for some reason
The services agent is always running with --docker:
https://github.com/allegroai/clearml-agent/blob/e416ab526ba9fe05daa977b34c9e46b50fb214a0/docker/services/entrypoint.sh#L16
Actually I think we should have it as an argument, so it is easier to control from docker-compose
I'll be waiting for the full log to check the "git clone" issue
What are you seeing?
Yes π
BTW: do you guys do remote machine development (i.e. Jupyter / vscode-server) ?
SubstantialElk6 on the client side?
at that point we define a queue and the agents will take care of training
This is my preferred way as well :)
This workflow however is the only way I have found to easily fix my previous "Module not found" errors
Hmm okay, makes sense.
Did you try to set these ?
or even hack the sys.path with something like:
import sys, os
sys.path.insert(0, os.path.abspath(os.path.dirname(__file__) + "/../"))
WackyRabbit7 you can configure the AWS autoscaler with two types of instances, with priority to one of them. So in theory you do not need two autoscaler processes; with that in mind I "think" a single IAM should suffice
I just set the git credentials in the clearml.conf and it works out of the box
git has issues with passing the user/token from the main repo to the submodules, hence my surprise that it is working out-of-the-box.
Do notice that if you are using an ssh-key this is a non-issue.
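For reference, the relevant part of clearml.conf looks roughly like this (the values are placeholders):
```
agent {
    # credentials the agent uses when cloning the repository (and its submodules)
    git_user: "my-git-user"
    git_pass: "my-personal-access-token"
}
```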
Nope, no .netrc defined anywhere, ...
If this is the case can you try to add the following to your "extra_vm_bash_script"
` echo machine example.com > ~/.netrc && echo log...
Did you set 'force_git_ssh_protocol: true' ?
https://github.com/allegroai/clearml-agent/blob/249b51a31bee97d63f41c6d5542e657962008b68/docs/clearml.conf#L39
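i.e. something like this in clearml.conf:
```
agent {
    # force cloning over SSH even when the repository URL is https
    force_git_ssh_protocol: true
}
```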
Meanwhile you can just sleep for 24 hours and put it all on the services queue. It should work 🙂
Example here:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
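Not the linked example itself, just a minimal sketch of the idea — the project/task names, queue, and interval are placeholders, and the actual cleanup logic is left out:
```
import time
from clearml import Task

task = Task.init(
    project_name="DevOps",
    task_name="periodic cleanup",
    task_type=Task.TaskTypes.service,
)
# run it on the services queue instead of locally
task.execute_remotely(queue_name="services")

while True:
    # ... do the actual cleanup work here ...
    time.sleep(24 * 60 * 60)  # sleep for 24 hours between runs
```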
Hmm that should have worked ...
I'm assuming the Task itself is running on a remote agent, correct ?
Can you see the changes in the OmegaConf section ?
what happens when you pass --args overrides="['dataset.path=abcd']" ?
Hi PanickyMoth78, an RC is out with a fix.
pip install clearml==1.6.3rc0
Thank you for noticing the graph issue.
Btw do notice that since the data is being changed inside the controller loop, the parents are still kind of odd, because the source of the data is not clear to the logic, so it assumes it depends on the current state (i.e. all the leaves)
Okay, could you try to run again with the latest clearml package from GitHub?
pip install -U git+
Thanks EnviousStarfish54, we are working on moving them there!
BTW, in the meantime, please feel free to open a GitHub issue under the trains repo, at least until they are moved (hopefully end of Sept).
Hi NuttyLobster9
Hi All, is there a way to clone a pipeline from the web UI like you can with a task?
Right click on the pipeline and select Run (it is basically the same thing as cloning it)
EnviousStarfish54 something is also off in the git detection, it has no remote address, it just says "origin"
Any chance you have no git server ?
Regarding the installed packages, any chance you can send sample code for me to debug?