Hi ConvolutedChicken69
but when running the script it only clones the repo the ClearML task is on; how can it get the other repo as well?
Do you have a wheel or a git repo you can install it from?
EmbarrassedPeacock82 are you using Keras/PyTorch etc. for serving (i.e. Triton)?
MagnificentSeaurchin79
Can this be solved by using a docker image with the preinstalled packages at a user level?
Yes 🙂
BTW: I think I missed how you managed to install the object_detection API in the first place?
Is it the git repo of the Task? did you fork it? is it a submodule of your git repo?
p.s.
Yes, Slack is quite good at reminding you, but generally speaking always prefer @ mentions; it will send me an email if I miss the message :)
Thanks MagnificentSeaurchin79, yes that makes it clear.
If that is the case, I think building a container is the easiest solution 🙂
(BTW: you could also build a wheel; if you have a setup.py, running bdist_wheel once will build a wheel, and then you install that wheel)
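For reference, a rough sketch of those two steps, assuming a standard setup.py at the repo root (the wheel filename below is hypothetical):
```bash
# Build the wheel once (requires the 'wheel' package); output lands in ./dist/
python setup.py bdist_wheel

# Then install the built wheel (filename is hypothetical)
pip install dist/my_package-0.1.0-py3-none-any.whl
```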
Amazing! 🙂
Let me know how we can help 🙂
Even before we had a chance to properly notify everyone 🙂
Thank you! All the details will follow in a dedicated post; for the time being, I can say that pushing a model with pre/post-processing Python code and a full scalable inference solution has never been easier
https://github.com/allegroai/clearml-serving/tree/main/examples/sklearn
Task.current_task().get_logger().flush(wait=True)  # <-- WILL HANG HERE
Okay, a bit of theoretical "how it actually works" (and I might be mistaken here...)
Console logging is being reported because the underlying DDP infra (gloo) pipes stdout to the main process, where clearml will catch it (I think). The scalars not working on the subprocesses and the flush-wait getting stuck are, I think, related, as the wait actually waits for the flush process, and it seems it cannot actually "talk" to i...
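If the hang really is a subprocess waiting on a flush it cannot complete, one hedged workaround is to flush only from the main DDP process. This assumes the standard RANK environment variable set by torch.distributed launchers; it is not confirmed in this thread:
```python
import os
from clearml import Task

# Only the main DDP process (rank 0) talks to the ClearML logger,
# so flush(wait=True) never runs inside a worker subprocess.
if int(os.environ.get("RANK", "0")) == 0:
    Task.current_task().get_logger().flush(wait=True)
```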
Please do, just so it won't be forgotten (it won't, but for the sake of transparency)
We are planning an RC later this week, I'll make sure this fix is part of it
I want that last python program to be executed with the environment that was created by the agent for this specific task
Well, basically they all inherit the Python environment that points to the venv they started from, so at least in theory it should be transparent when the agent is spinning up the initial process.
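As a minimal sketch of that inheritance (the helper script name is hypothetical): since the task already runs inside the agent's venv, sys.executable points at that venv's interpreter, so a child process launched with it sees the exact same packages.
```python
import subprocess
import sys

# sys.executable is the venv python the agent created for this task;
# launching the helper with it inherits the same package set.
subprocess.run([sys.executable, "run_last_step.py"], check=True)  # hypothetical script
```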
I eventually found a different way of achieving what I needed
Now I'm curious, what did you end up doing?
so that one app I am using inside the Task can use the python packages installed by the agent and I can control the packages using clearml easily
That's the missing part for me. You have all the requirements on the Task (which you can fully control), and the agent is setting up a brand new venv for each Task inside a container (the venv is cached, and you can also make the agent just use the default python without installing anything). The part where I'm lost is why would you need the path to t...
in my repo I maintain a bash script to set up a separate python env.
Hmm interesting, now I have to wonder what the difference is? Meaning, why doesn't the agent build a similar one based on the requirements?
JitteryCoyote63 I think this only holds for the conda distribution.
(Actually quite interesting, I wonder what happens if you already installed cudatoolkit...)
If you have CUDA 10.2, then the torch 1.3.1 build from the cu101 version should work
no, using the system drivers
i.e. change them per experiment ?
BTW: UnevenDolphin73 you should never actually do "task = clearml.Task.get_task(clearml.config.get_remote_task_id())"
You should just do "Task.init()"; it will automatically take the "get_remote_task_id" and do all sorts of internal setups, and you will end up with the same object but in an ordered fashion
Yes, even without any arguments given to Task.init(), it has everything from the server
I remember there were some issues with it ...
I hope not 🙂 Anyhow, the only thing that does matter is the auto_connect arguments (meaning if you want to disable some, you should pass them when calling Task.init)
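To make the recommended pattern concrete, a minimal sketch (under an agent, Task.init() re-attaches to the remote task on its own; the commented argument is just one example of the auto_connect family):
```python
from clearml import Task

# Preferred: Task.init() detects it is running under an agent and
# re-attaches to the remote task internally; there is no need to call
# Task.get_task(get_remote_task_id()) yourself.
task = Task.init()

# To disable some of the automatic logging, pass the auto_connect
# arguments explicitly at init time, e.g.:
# task = Task.init(auto_connect_frameworks=False)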
JitteryCoyote63 the fix should be pushed later today 🙂
Meanwhile you can manually add the Task.init() call at the top of the original script; it is basically the same 🙂
JitteryCoyote63
Should be added before the `if __name__ == "__main__":`?
Yes, it should.
From your code I understand it is not?
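A minimal sketch of the placement being suggested (project/task names are hypothetical):
```python
from clearml import Task

# Task.init() goes at the top of the script, before the main guard,
# so the task is registered as soon as the script starts.
task = Task.init(project_name="examples", task_name="my experiment")  # hypothetical names

def main():
    pass  # training / inference code goes here

if __name__ == "__main__":
    main()
```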
What's the clearml version you are using?
Task.add_requirements('.')
Should work
Hmm I assume it is not running from the code directory...
(I'm still amazed it worked the first time)
Are you actually using "."?
JitteryCoyote63 I found it 🙂
Are you working in docker mode or venv mode?
JitteryCoyote63 instead of _update_requirements, call the following before Task.init:
Task.add_requirements('torch', '1.3.1')
Task.add_requirements('git+')
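For context, a sketch of that call order (the git URL below is a hypothetical stand-in for the one stripped from the message above, and the project/task names are made up):
```python
from clearml import Task

# add_requirements() must run before Task.init() so the extra
# requirements are recorded on the task when it is created.
Task.add_requirements('torch', '1.3.1')
Task.add_requirements('git+https://github.com/example/some-package.git')  # hypothetical URL

task = Task.init(project_name="examples", task_name="pinned requirements")  # hypothetical names
```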
Hi @<1523701066867150848:profile|JitteryCoyote63>
Could you please push the code for that version on github?
oh, seems like it is not synced, thank you for noticing (it will be taken care of immediately)
Regarding the issue:
Look at the attached images
None does not contain a specific wheel for cuda117 for x86, they use the default pip one
(screenshot attached)
I am not sure what switching back will solve; here the wheel should have been correct, it's just that the architecture of the card is incompatible
So I tested the "old" code that did the parsing and matching, and it did resolve to the correct wheel (i.e. found that there is no 117, only 115, and installed that one)
I think we should switch back and have a configuration to control which mechanism the agent uses, wdyt?
Hi @<1523701066867150848:profile|JitteryCoyote63>
RC is out,
pip3 install clearml-agent==1.5.3rc3
Then set pytorch_resolve: "direct" in the configuration
None
Let me know if it worked
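For reference, a sketch of where that setting would live in clearml.conf, assuming the key sits under agent.package_manager as in recent agent versions (not confirmed in this thread):
```
agent {
    package_manager {
        # "direct" resolves the torch wheel link directly;
        # the previous behavior let pip pick the wheel
        pytorch_resolve: "direct"
    }
}
```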