
Hi SkinnyPanda43
Are you trying to access the same Task or an external one?
Hi SpotlessLeopard9
I got many tasks that just hung at the end of the script without ...
I remember this exact issue was fixed with 1.1.5rc0, see here:
https://clearml.slack.com/archives/CTK20V944/p1634910855059900
Can you verify with the latest RC?
pip install clearml==1.1.5rc3
ItchyJellyfish73
Unfortunately this needs backend support and is only available in the enterprise version. What is your use case for it? (It was designed to allow out-of-the-box bare-metal multi-GPU dynamic allocation; think a DGX with 8 GPUs where, instead of spinning down agents when you want to change the queue->num-gpu mapping, you can do it on the fly.)
Okay, that looks good. Now in the UI start here and then go to the Artifacts tab.
Is it there?
Hi ComfortableHorse5
Yes, this is more of a suggestion that you should write them using the platform capabilities. The UI implementation is being worked on, as well as a few helper classes; I think you'll be able to see a few in the next release.
SmarmySeaurchin8 yes, the package containing the Controller is RC only; the plan is to release the stable one in a couple of days. In the meantime:
pip install git+
Open source defaults.
Basically I think I'm asking: is your code multi-node enabled to begin with?
BTW: if you only need the git diff you can just copy it from the UI into a txt file and do:
git apply <copied-diff.txt>
Hi SteadyFox10
I'll use your version instead and add a comment if I find something.
Feel free to join the discussion: https://github.com/pytorch/ignite/issues/892
Thanks for the output_uri , can I put it in the ~/trains.conf file?
Sure you can.
https://github.com/allegroai/trains/blob/master/docs/trains.conf#L152
You can add it in the trains-agent machine's conf file and/or on your development machine. Notice that once you run ...
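For completeness, the destination can also be set per task from code instead of the conf file; a minimal sketch, with a placeholder bucket URI:

```python
from trains import Task  # pre-ClearML package name used in this thread

# output_uri can also be passed per task instead of setting the
# default output URI in ~/trains.conf
task = Task.init(
    project_name="examples",
    task_name="output-uri-demo",
    output_uri="s3://my-bucket/artifacts",  # placeholder destination
)
```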
In that case, no, the helm chart does not spin up a default agent (you should, however, spin up a services-mode agent for running pipeline logic).
Hi JitteryCoyote63 ,
When you shut down the task (manually with close() or when the process finishes), it waits for the uploads...
Why do you need to specifically wait for all the artifacts to upload? (Currently you can stop the artifacts upload thread and wait for all the artifacts, but that seems like a bad hack.)
I see. If you are creating the task externally (i.e. from the controller), you should probably call task.close() ; it will return when everything is in order (including artifacts uploaded, and other async stuff).
Will that work?
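A minimal sketch of that flow, assuming the clearml package and placeholder project/artifact names:

```python
from clearml import Task

# Task created/initialized in the controller process (placeholder names)
task = Task.init(project_name="examples", task_name="controller-side-task")

# Artifacts are uploaded asynchronously by a background thread
task.upload_artifact(name="results", artifact_object={"accuracy": 0.9})

# close() returns only once pending async work, including artifact uploads, is done
task.close()
```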
Okay, wait, I'll see if I can come up with something.
Hmm, not a bad idea.
Could you please open a GitHub issue, so it will not get forgotten?
(BTW: I'm not sure how trivial it is to implement, but it is obviously possible.)
JitteryCoyote63 Not sure how/why the X-Pack feature was on (it is not used by the system), but you can disable it with an environment variable in the docker-compose:
xpack.security.enabled=false
Should solve the problem ...
JitteryCoyote63 There is a basic elastic license that should always be there. If for some reason it was deleted/expired then the following command should fix it:
curl -XPOST 'http://localhost:9200/_xpack/license/start_basic'
I think this is because of the version of xgboost that serving installs. How can I control these?
That might be
I absolutely need to pin the packages (incl main DS packages) I use.
you can basically change CLEARML_EXTRA_PYTHON_PACKAGES
https://github.com/allegroai/clearml-serving/blob/e09e6362147da84e042b3c615f167882a58b8ac7/docker/docker-compose-triton-gpu.yml#L100
For example:
export CLEARML_EXTRA_PYTHON_PACKAGES="xgboost==1.2.3 numpy==1.2.3"
I know there is an aux cfg with key-value pairs, but how can I use it in the Python code?
This is actually for helping to configure Triton services; you cannot (I think) easily access it from the code.
now, I need to pass a variable to the Preprocess class
You mean for the constructor?
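If it is just a value you need at construction time, one common workaround is to read it from an environment variable inside the class; a minimal sketch in the style of a clearml-serving preprocess.py (the variable name MY_THRESHOLD is hypothetical):

```python
# preprocess.py -- sketch of a clearml-serving style Preprocess class
import os


class Preprocess:
    def __init__(self):
        # hypothetical value passed via the serving container's environment
        self.threshold = float(os.environ.get("MY_THRESHOLD", "0.5"))

    def preprocess(self, body, state, collect_custom_statistics_fn=None):
        # use self.threshold when preparing the model input
        return body

    def postprocess(self, data, state, collect_custom_statistics_fn=None):
        return data
```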
NaughtyFish36
what's the error you are getting?
Also, did you try setting force_git_ssh_protocol: true ?
https://github.com/allegroai/clearml-agent/blob/76c533a2e8e8e3403bfd25c94ba8000ae98857c1/docs/clearml.conf#L39
1.
One reason I don't like using the configuration section is that it makes debugging much much harder.
Debugging? Please explain how it relates to the configuration, and presentation (i.e. preview).
2.
Yes, in theory, but in your case it will not change things, unless these "configurations" are copied onto every Task (which is just storage; otherwise no real harm).
3.
I was thinking of a "zip" file that the Task creates and uploads, and a new configuration type, say "external/zip" , and in the c...
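Until something like that exists, a rough sketch of the workaround available today: zip the configuration directory yourself and upload it as a regular artifact (paths and names are placeholders):

```python
import shutil
from clearml import Task

task = Task.init(project_name="examples", task_name="config-zip-demo")

# Pack the configuration directory into a zip archive (placeholder paths)
archive = shutil.make_archive("configs", "zip", root_dir="./configs")

# Upload it as a regular artifact; a dedicated "external/zip" configuration
# type, as discussed above, does not exist yet
task.upload_artifact(name="configs", artifact_object=archive)
```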
Could not find a version that satisfies the requirement open3d==0.15.2 .. from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)
This points to the agent installing with a different Python version than the one you ran the original code with; I would guess python3.6.
SIGINT (Ctrl-C) only.
Because flushing state (i.e. sending the request) might take time, we only do that when users interactively hit Ctrl-C. Make sense?
Sorry @<1689446563463565312:profile|SmallTurkey79>, just noticed your reply.
Hmm, so I know the enterprise version has built-in support for SLURM, which would remove the need to deploy agents on the SLURM cluster.
What you can do is, on the SLURM login server (i.e. a machine that can run sbatch), write a simple script that pulls the Task ID from the queue and calls sbatch with clearml-agent execute --id <task_id_here>
Would this be a good solution?
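For illustration, a rough sketch of such a polling script, assuming the clearml APIClient is used to pull the next Task ID from the queue; the queue name "slurm" and the response field layout are assumptions:

```python
import subprocess
import time

from clearml.backend_api.session.client import APIClient

client = APIClient()
# Look up the queue by name ("slurm" is a placeholder queue name)
queue_id = client.queues.get_all(name="slurm")[0].id

while True:
    # Pull the next pending task from the queue (the exact response layout
    # may differ between server versions; recent ones expose .entry.task)
    response = client.queues.get_next_task(queue=queue_id)
    entry = getattr(response, "entry", None)
    if not entry:
        time.sleep(30)
        continue
    # Submit a SLURM job that lets the agent execute the pulled task
    subprocess.run(
        ["sbatch", "--wrap", f"clearml-agent execute --id {entry.task}"],
        check=True,
    )
```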