@<1671689437261598720:profile|FranticWhale40> could you test the fix? just pull & run
allegroai/clearml-serving-triton:1.3.1
allegroai/clearml-serving-inference:1.3.1
Sure:
task = Task.init(..., auto_connect_arg_parser={'arg_not_to_log': False})
This will cause all argparse arguments to be automatically logged (and later editable), with the exception of the argument arg_not_to_log.
Notice that if you have --arg-something, to exclude it add 'arg_something': False to the dict.
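For example, a minimal sketch (the argument names here are made up):
```python
from argparse import ArgumentParser

from clearml import Task

parser = ArgumentParser()
parser.add_argument("--batch-size", type=int, default=32)
parser.add_argument("--arg-something", default="do-not-log-me")

# note the dash-to-underscore conversion: '--arg-something' -> 'arg_something'
task = Task.init(
    project_name="examples",
    task_name="argparse exclusion",
    auto_connect_arg_parser={"arg_something": False},
)
args = parser.parse_args()  # everything except arg_something gets logged
```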
You need to mount it to ~/clearml.conf (i.e. /root/clearml.conf)
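If you are running it via docker-compose, that would be a volume mount, something like this (a sketch; the service name depends on your compose file):
```yaml
services:
  clearml-serving-inference:
    volumes:
      # mount your local clearml.conf into the container
      - ${HOME}/clearml.conf:/root/clearml.conf
```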
Is there a way to move existing pipelines between projects?
You should be able to: go to your settings page and turn on "show hidden folders".
Then go to your project; you should see a ".pipeline" subproject there. Right-click it and move it to another folder.
ConvolutedChicken69, does it take the agent off the queue? Does it know it's not available to take tasks?
You mean will it "release" the GPU (i.e. the agent will pull another Task)?
If so, then no, it will not. An "Interactive Session" is (from the agent's perspective) a Task that will end at some point, and the agent will continue to monitor and run it until you manually close it. The idea is that you are actually using the GPU, hence no one else can run a job on it.
To shut it down, ...
Hi CourageousWhale20
Most documentation is here https://allegro.ai/docs
Hi TightElk12
would like to understand the limitations of Task.current_task()
Basically this will always get you an instance of the current Task. This will work from sub-processes as well as the main process. Is there a specific scenario you have in mind, or a challenge with the use case?
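For example, a quick sketch of both cases (project/task names are placeholders):
```python
from multiprocessing import Process

from clearml import Task

def worker():
    # Task.current_task() also resolves inside a sub-process
    Task.current_task().get_logger().report_text("hello from a sub-process")

if __name__ == "__main__":
    task = Task.init(project_name="examples", task_name="current_task demo")
    assert Task.current_task() is task  # main process
    p = Process(target=worker)
    p.start()
    p.join()
```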
it does *not* include the “internal.repo” as a package dependency, so it crashes.
understood
And for the time being we have not used the decorators,
So how are you building the pipeline component?
DeterminedToad86 were you running a Jupyter notebook or a Jupyter console?
Yes, actually ensuring pip is there cannot be skipped (I think in the past it caused too many issues, hence the version limit etc.)
Are you saying it takes a lot of time when running? How long is the actual process that the Task is running (just to normalize times here)?
RipeGoose2 That sounds familiar. Could you test with the latest RC?
pip install trains==0.16.4rc0
Hi @<1524922424720625664:profile|TartLeopard58>
can’t I embed scalars into Notion using the ClearML SDK?
I think that you need the hosted version for it (it needs some special CORS stuff on the server side to make it work)
Did you try it in a ClearML report? Does that work?
BTW: from the instance name it seems like it is a VM with preinstalled PyTorch; why don't you enable system site packages, so the venv will inherit all the preinstalled packages? It might also save some space 🙂
DeterminedToad86 see here:
https://github.com/allegroai/clearml-agent/blob/0462af6a3d3ef6f2bc54fd08f0eb88f53a70724c/docs/clearml.conf#L55
Change it in the agent's conf file to:
system_site_packages: true
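i.e. something along these lines in the agent's clearml.conf (the setting sits under agent.package_manager):
```
agent {
    package_manager {
        # create the task venv with access to the system packages
        system_site_packages: true
    }
}
```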
RoundMole15 what does the Task.init call look like?
When you log in with user/pass in the UI, the same "process" happens and you get a token to work with; this is the same as secret/key.
Since in both cases you provide credentials and get back access token, it should work
(This is of course only if you are setting user/pass manually and disabling pass_hashed as you have)
For some reason copying over everything and making another file and running it there does not allow it to run
Not sure I follow...
You should only have one ~/clearml.conf, and from wherever you are running your code it will always read the configuration from that same file.
Hmm, so the SaaS service? And when you delete (not archive) a Task, it does not ask for S3 credentials when you select "delete artifacts"?
Specifically your error seems to be an issue with nvidia Triton container upgrade
In the docker bash startup script:
apt-get install poppler-utils
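If the agent is running in docker mode, one way to do that is via extra_docker_shell_script in the agent's clearml.conf (a sketch):
```
agent {
    # commands executed inside the container before the Task starts
    extra_docker_shell_script: ["apt-get update", "apt-get install -y poppler-utils"]
}
```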
Hi BurlySeagull48
you mean for the clearml-server ?
It could be the model storing. Could it be the peak is at the end of the epoch?
tried it and restarted the agent, but not working properly
What do you mean by "not working"? Can you provide logs?
Hi @<1523701066867150848:profile|JitteryCoyote63>
RC is out,
pip3 install clearml-agent==1.5.3rc3
Then set pytorch_resolve: "direct" in the agent's clearml.conf.
Let me know if it worked
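i.e. something like this (a sketch; the setting sits under agent.package_manager):
```
agent {
    package_manager {
        # resolve torch wheels directly against the PyTorch download links
        pytorch_resolve: "direct"
    }
}
```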
Hmm, I would have the Dockerfile contain the default Azure credentials/output_uri, and then have the user's ClearML credentials passed as env variables at runtime. wdyt?
(I'm checking if you can pass the Azure credentials as env variables, give me a minute)
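Something along these lines (a sketch; the CLEARML_API_* names are the standard credential variables, the Azure ones should be verified against your storage setup):
```
# Dockerfile: bake in the default Azure credentials / output destination
ENV AZURE_STORAGE_ACCOUNT=<account>
ENV AZURE_STORAGE_KEY=<key>

# runtime: pass each user's ClearML credentials per container, e.g.
#   docker run -e CLEARML_API_ACCESS_KEY=<key> -e CLEARML_API_SECRET_KEY=<secret> ...
```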
Hmm... scaling these scalars while reporting might be a bit too much to do in the background. Don't you think you will lose transparency, as in TB you'll see graphs that differ from what you see in the system?
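If you do want scaled values, a more transparent option might be reporting them yourself as an additional series (a sketch; the scale factor and names are made up):
```python
from clearml import Logger

logger = Logger.current_logger()
SCALE = 100.0  # hypothetical scaling factor

def report_scaled(title, series, value, iteration):
    # keep the raw series AND an explicitly labeled scaled one,
    # so the UI stays consistent with what TB shows
    logger.report_scalar(title, series, value, iteration)
    logger.report_scalar(title, f"{series} (x{SCALE:g})", value * SCALE, iteration)
```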
Hi @<1544128915683938304:profile|DepravedBee6>
You mean like backup the entire instance and restore it on another machine? Or are you referring to specific data you want to migrate?
BTW, if you are upgrading an old version of the server, I would recommend upgrading through every intermediate version (a few of them have migration scripts that need to be run).