It's the same but done from the outside; you want the same, and "offline" as well, right?
Let me check something
CooperativeFox72 could you expand on "not working"?
If you have a yaml file, I would do:
```
import yaml

# local_path = './my_config.yaml'
path = task.connect_configuration(local_path, name=name)
if task.running_locally():
    with open(local_path, "r") as config_file:
        my_params_dict = yaml.load(config_file, Loader=yaml.FullLoader)
    my_params_dict['change_me'] = 'new value'
    my_params_text = yaml.dump(my_params_dict)
    # store back the change, my_params_text assumed to be the content of the param file (text)
    task.set_configuration_object(name=name, config_text=my_params_text)
else:
    # in remote execution `path` points at the configuration fetched from the server
    with open(path, "r") as config_file:
        my_params_dict = yaml.load(config_file, Loader=yaml.FullLoader)
```
Very odd, I still can't reproduce. This is just the cleanup service running, without anything else?
What's the clearml version it is using?
Nope - confirmed to be running on the OS's Python environment,
okay so bare metal root is definitely not recommended.
I'm not sure how/why it gets stuck though
Any chance you can run the agent as non-root?
Also, running it in docker mode is maybe preferable, so it is easier for you to control the environment of the Task
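For example (the standard docker-mode daemon invocation; the image here is just a placeholder):
```
clearml-agent daemon --queue default --docker python:3.10
```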
Thanks DefeatedOstrich93
Let me check if I can reproduce it.
SubstantialElk6 (2) yes definitely will be fixed
Regarding (1), what do you mean by "via the code"? Do you mean like as a Task docker cmd?
@<1639799308809146368:profile|TritePigeon86> +1
```
task.wait_for_status()
task.reload()
task.artifacts["output"].get()
```
Well, it should work out of the box as long as you have the full route, i.e. Section/param
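For example (the parameter names here are hypothetical):
```
# override a hyperparameter using the full "Section/param" route
task.set_parameter("Args/batch_size", 64)
```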
and I run the agent from a local user, and I would expect that setting to have effect: `-v /home/localuser/.ssh:/home/testuser/.ssh`
It does not map it directly; it creates a temp copy of the entire ".ssh" folder in the host /tmp folder, then maps this folder inside the container:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/clearml_agent/commands/worker.py#L3422
Notice that the "docker_internal_mounts" section is nested inside the "agent" section ...
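For example, in clearml.conf (a minimal sketch; exact key names may vary between agent versions):
```
agent {
    docker_internal_mounts {
        # where the temp copy of the host ".ssh" folder is mounted inside the container
        ssh_folder: "/root/.ssh"
    }
}
```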
Nice! So out of curiosity why didn't it work this time and you had to do it manually?
Hi @<1566596960691949568:profile|UpsetWalrus59>
All correct, with the exception of "...or 1GB Metric": this is a limit, since metrics (and metadata) are always stored on the clearml-server, so they are metered. There is also an API limit, basically anti-abuse, which of course resets every month, but if you are running tens of experiments at the same time you will hit this limit. Make sense?
why doesn't this happen on my other experiments?
same 100+ reports ?
(My new theory is that calling Task.reload() will fix it, and it might be called internally for the other experiments, like when reporting models/artifacts)
Could that be the case ?
Are there any services OOB like this?
On the open-source version I can't recall any, but it would probably be easy to write. The paid tier might have an offering though, not sure
oh, if this is the case, why not use the "main" server?
Because it lives behind a VPN and github workers don't have access to it
makes sense
If this is the case, I have to admit that combining offline-mode and remote execution makes sense, no?
SpotlessFish46
1. Yes, you can access the entire code in the uncommitted changes. You can test it with (see also the snippet after this list):
```
task = Task.get_task(task_id='aabb')
task_dict = task.export_task()
```
2. Correct, but then if you need the entire code base you need to clone the repo and apply the uncommitted changes. Basically trains-agent does that when executed with build:
```
trains-agent build --id aabb --target ~/my_task_env
```
3. See (2)
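To pull the uncommitted changes out of the exported dict, something along these lines should work (the exact key layout is from memory, treat it as an assumption):
```
from clearml import Task

task = Task.get_task(task_id='aabb')
task_dict = task.export_task()
# the uncommitted changes are stored as a git diff on the Task,
# presumably under the "script" section of the export
print(task_dict["script"]["diff"])
```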
... training script was set to upload every epoch. Seems like this resulted in a torrent of metrics being uploaded.
oh that makes sense, so basically you were bombarding the server with requests, ending up with a kind of denial of service
This should have worked with the latest clearml RC.
And you verified it is not working?
So there is a hack for it:
```
CLEARML_OFFLINE_MODE=1 python3 my_main.py
```
Which is the same as calling Task.set_offline
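i.e. (assuming the standard classmethod signature; as far as I remember it has to be called before Task.init):
```
from clearml import Task

# same effect as exporting CLEARML_OFFLINE_MODE=1
Task.set_offline(offline_mode=True)
```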
Then inside the code, after the Task.init call:
```
task = Task.init(...)
# not sure what the if here is?!
Task.debug_simulate_remote_task(task_id="offline-1")
```
This will make things act as if this is running remotely, i.e. your Task.running_remotely() logic will kick in.
Do notice that in remote mode, all the arguments / data are read from the clearml-server into the code.
tf datasets is able to handle batch downloading quite well.
SubstantialElk6 I was not aware of that, I was under the impression tf dataset is accessed on a file level, no?
Yes, albeit not actually "intercept", as the user will still be able to put Tasks directly in the B_machine_a/B_machine_b queues, but any time the user pushes Tasks into queue B, this service will pull them and push them to the individual machine queues.
what do you think?
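Something along these lines (a rough sketch; the APIClient response layout and the queue names are assumptions):
```
import itertools
import time

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
# id of the "virtual" queue B that users push into (name is hypothetical)
queue_b = client.queues.get_all(name="queue_b")[0].id
# round-robin over the per-machine queues
machine_queues = itertools.cycle(["B_machine_a", "B_machine_b"])

while True:
    result = client.queues.get_next_task(queue=queue_b)
    if result and result.entry:
        # re-route the dequeued Task into the next machine's queue
        Task.enqueue(task=result.entry.task, queue_name=next(machine_queues))
    else:
        time.sleep(5.0)
```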
ClearML maintains a github action that sets up a dummy clearml-server,
You have one: http://app.clear.ml (not a dummy one, but for this purpose it will work)
thoughts?
Ok no, it only helps as far as I don't log the figure.
you mean if you create the matplotlib figure with no automagic connect, you still see the mem leak?
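i.e. a repro along these lines (hypothetical sketch):
```
import matplotlib.pyplot as plt

# create figures in a loop without reporting/logging them anywhere
for i in range(1000):
    fig = plt.figure()
    plt.plot(range(10))
    plt.close(fig)  # does memory still grow even with an explicit close?
```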
I just cloned it from the examples that are available in the SaaS console upon account creation
Ohhh! that would explain it. Maybe it is broken there?! let me check a second
Correct, which makes sense if you have a stochastic process and you are looking for the best model snapshot. That said I guess the default use case would be min/max (and not the global variant)
is it displaying that it is running anything?
but it is still not able to run any task after I abort and rerun another task
When you "run" a task you are pushing it to a queue, so how come a queue is empty? what happens after you push your newly cloned task to the queue ?