AgitatedDove14

49 Questions, 8124 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8124

0 When It Comes To Continuous Training, I Wanted To Know How You Train Or Would Train If You Have Annotated Data Incoming? Do You Train Completely Online Where You Train As Soon As You Have A Training Example Available? Do You Instead Train When You Have A

Sorry for pinging you on this old thread.
...
And what was the learning strategy? ADAM? RMSProp?

Sorry, missed it...
I would actually use the HPO to test various setups (it uses Optuna under the hood so really SOTA hyper band Bayesian optimization ontop of them)
https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py

3 years ago

0 Hi All, I Am Running Into Ssl Verification Issues With Trying To Upload Model Artifacts To Minio. We Are Running The Clearml Agent In A Container, Have Mounted A Ca Bundle To The Container And Referenced It On Env Vars So That Aws Cli/Boto And Requests Us

Hmm can you try with additional configuration, next to "secure: true" in your clearml.conf, can you add "verify: false"

3 years ago

0 Hi, I'M Trying To Get Tensorboard Plots Into The Allegro Trains Server. Although I Followed The Example

Hi TrickyRaccoon92 , TB is automatically collected and converted into data stored on the system The UI uses plotly to display the data itself (on your web browser).
You still have the original TB protobuf file, if you want to dive deeper and debug the data (it is not automatically uploaded, but some users do upload it as additional artifact on the experiment)
Make sense ?

4 years ago

0 Hi, Is A Remote Task Execution Wit Azure Devops Private Repository Working? I Am Hitting A Problem Where Clearml Is Creating A Wrong Url For Loading:

Hi MelancholyChicken65
I'm assuming you need ssh protocol not https user/token, set this one to true 🙂
force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/76c533a2e8e8e3403bfd25c94ba8000ae98857c1/docs/clearml.conf#L39

2 years ago

0 Hello Again, How Can I Use The

Hi JumpyDragonfly13

is "10.19.20.15" accessible from your machine (i.e. can you ping to it)?
Can you manually SSH to 10.19.20.15 on port 10022 ?

4 years ago

0 Another Question, Is It Possible To Run A Single Experiment Which Is Composed Of Multiple Steps Executed As Sequential Sub-Processes Where The Current Task Is Fetched As

Hi TightElk12
One option will be to call task.close() at the end of each step and task.init at the beginning of another.
Will that do?

4 years ago

0 Hi, I’M Training On Multi-Node, Clearml Captures Only A Single Machine Utility (Memory/Cpu/Etc.). I Assume It Captures Node 0. Is There A Way To Make It Report All Nodes?

@<1558624430622511104:profile|PanickyBee11> how are you launching the code on multiple machines ?
are they all reporting to the same Task?

2 years ago

0 Another Question Is If I Have A Conda Env Available On My Workers Systemwide.. Can I Use That Env Directly When Running Tasks With

hmm... try to run the trains-agent from the ml environment with "system_site_packages: true", it might do the trick. Anyhow please let me know if it worked 🙂

5 years ago

0 Hello. Is There Any Doc Where I Could Find What Contributes As Api Usage? And Is It Possible To View The Usage Breakdown By Source/Type? I Want To Estimate Api Usage Costs Before Signing Up For Pro Plan On The Saas.

TroubledHedgehog16 generally speaking you can expect about 10 api calls per minute if you have many reports, and about 3 per minute on low report. We just optimized the sdk so in cases there are lots of consequential reports they are better batched, I would recommend the latest RC

2 years ago

0 Hello, I Have A Problem With Task.Set_Initial_Iteration(0) In Google Colab. After Continuing The Experiment, Gaps Appear On My Graph, But If You Use Colab. I Tried It On My Computer And Everything Is Normal There.

Yey! okay let me make sure we add this feature to the Task.init arguments so one can control it from code 🙂

3 years ago

0 Hi Fam! Sorry For The Potential Dumb Question, But I Couldn’T Find Anything On The Interwebs About It. I’M Hosting A Clearml Server On Aws, Using S3 As A Backend For Artifact Storage. I Find That Whenever I Delete Archived Artifacts In The Web App, I Get

It seems like the web server doesn’t log the call to AWS, I just see this:

This points to the browser actually sending the AWS delete command. Let me check with FE tomorrow

3 years ago

0 Hi Everyone, I Wanted To Inquire If It'S Possible To Have Some Type Of Model Unloading. I Know There Was A Discussion Here About It, But After Reviewing It, I Didn'T Find An Answer. So, I Am Curious: Is It Possible To Explicitly Unload A Model (By Calling

@<1657918706052763648:profile|SillyRobin38> out of curiosity did you compare performance of tensorrt-llm vs vllm ?
(the jury is still out on that, just wondered if you had a chance)

one year ago

0 I Am Using `

Hi SarcasticSparrow10

Is it better to post such questions on Stackoverflow so they benefit everybody?

Yes, I think you are correct it would please do 🙂

Try to do " reuse_last_task_id='task_id_here'" ,t o specify the exact Task to continue )click on the ID button next to the task name in the UI)
If this value is true it will try to continue the last task on the current machine (based on project/name, combination) if the task was executed on another machine, it will just start a ...

4 years ago

0 Hello! Is It Possible To Run Pipeline Controller Tasks Locally, Similar To Regular Tasks That Run Locally By Default If

Hi SucculentBeetle7
Sure check the latest implementation, it now has "start" and "start_remotely" 🙂

4 years ago

0 Hi Everyone, Is There A Way To Avoid The Environment Setup When Running A Task Using A Worker? I Am Currently Using A Custom Docker Image That Already Has All The Require Packages Installed. I Tried Setting The Env Var

SteepDeer88
Try the following:
` Task.add_requirements("pycocotools-windows", "; platform_system == "Windows"")
Task.add_requirements("pycocotools", "; platform_system != "Windows"")

Task.init(...) You should see in your "installed packages" something like: pycocotools-windows ; platform_system == "Windows"
pycocotools ; platform_system != "Windows" `

3 years ago

0 Hi All! Let'S Say I Have Two Functions Decorated With

why are all defined components shown in the UI Results/Plots/PipelineDetails/ExecutionDetails section? Shouldn't it make more sense to show only the ones that are used in that pipeline?

They are listed there (because of the decorator, you basically "say" these are steps so they are listed), the actual resolving (i.e. which steps are actually being called) is done in "real-time"
Make sense ?

3 years ago

0 Trying To Setup A Trains-Agent Worker On A Remote Machine; When I Run Trains-Init And Follow The Steps To Give It Credentials For Our Trains Server I Get This

https ?

4 years ago

0 More Of Pushing Clearml To It'S Data Engineering Limits

@<1541954607595393024:profile|BattyCrocodile47> first let me say I ❤ the dark theme you have going on there, we should definitly add that 🙂

When I run

python set_triggers.py; python basic_task.py

, they seem to execute, b

Seems like you forgot to start the trigger, i.e.
None
(this will cause the entire script of the trigger inc...

2 years ago

0 Hello! I’M Currently Using Clearml-Server As An Artifact Manager And Clearml-Serving For Model Inference, With Each Running On Separate Hosts Using Docker Compose. I’Ve Successfully Deployed A Real-Time Inference Model In Clearml-Serving, Configured Withi

Hi @<1697056701116583936:profile|JealousArcticwolf24>
Awesome deployment 🤩
Yes if you need another scalable model serving you can just run another instance of the clearml-serving-inference
https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/docker/docker-compose.yml#L77
So you end up with two of them, one per models environ...

one year ago

0 Hi, I Noted That Clearml-Serving Does Not Support Spacy Models Out Of The Box And That Clearml-Serving Only Supports Following;

Hi PerplexedCow66
I'm assuming an extension for this:
https://github.com/allegroai/clearml-serving/issues/32

Basically JWT can be used as a general access/block all endpoints, which is most efficnely used if handled by k8s loadbalancer (nginx/envoy),
but if you want a per-endpoint check (or maybe do something based on the JWT values)
See adding JWT to FastAPI here:
https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/?h=jwt#oauth2-with-password-and-hashing-bearer-with-jwt-tokens
T...

3 years ago

0 Hi, Does Anyone Have Some Issues With Cloning Git Repos Within Alegro? I Always Got Some Error Massage: Fatal: Unable To Access '

Okay, make sure that in your trains.conf on all the trains-agent machine you add the following:
agent.extra_docker_arguments: ["-v", "/etc/hosts:/etc/hosts",]

From here:
https://github.com/allegroai/trains-agent/blob/216b3e21790659467007957d26172698fd74e075/docs/trains.conf#L121

4 years ago

0 Do You Have Any Base Image Recommendation To Install Clearml Python Library? I'M Getting Error With Pip On Python:3.9.11-Alpine Image.

actually no it is not, alpine is Not a good baseline, is is very very slim missing a ton of stuff.
I would use bullseye or slim (depending how many aux things you need on the container)
https://hub.docker.com//python/tags?page=1&name=bullseye
https://hub.docker.com//python/tags?page=1&name=slim-bullseye

2 years ago

0 Hello, I Have A Small Question Regarding Ui: Currently, In The Artifacts Section Of A Task, The

Hi JitteryCoyote63

Or even better: would it be possible to have a support for HTML files as artifacts?

If you report html files as debug media they will be previewed, as long as the link is accessible.
You can check this example:
https://github.com/allegroai/trains/blob/master/examples/reporting/html_reporting.py

In the artifacts, I think html are also supported (maybe not previewed as nicely but clickable.
Regrading the s3 link, I think you are supposed to get a popup window as...

4 years ago

0 Hi. When Using The Logger'S

Hmm let me check, because I think it should have worked

3 years ago

0 Hello Folks! I Have A Pipeline With Three Tasks: A, B, And C I Want To Set It Up So That: A Gets Assigned A Machine (E.G. Based On The Queue) B Always Gets Assigned To The Same Machine As A (But May Run In A Different Docker Etc.) C Will Be Submitted To

Really what I need is for A and B to be separate tasks, but guarantee they will be assigned to the same machine so that the clearml dataset cache on that machine will be warm.

I think that what you are looking for is multi-machine cache (which is fully supported). Basically mount an NFS/SMB folder from a NAS to any of those machines, configure the cache folder to point to it, and not you do not need to worry about affinity ?
no?

Is there a way to group A and B into a sub-pipeline, h...

3 years ago

C will be submitted to a different queue and I don’t care as much

Is there a way to define “task affinity” in this way?

Hi RoughTiger69 ,
when you say Task affinity, you mean, I want C to be executed next to A/B ? Affinity as a concept doesn't really exist, it can be abstracted to a queue, where you have agents pulling from multiple queues. Then C can be pushed to one the the queues (in theory you might be able to programmtically control the Queue of C), wdyt?

3 years ago

0 Hi. When Using The Logger'S

DistressedGoat23 you are correct, since at the end this become a plotly object the extra_layout is for general purpose layout, but this specific entry is next to the data. Bottom line, can you open a github issue, so we do not forget to fix? In the mean time you can use the general plotly reporting as SweetBadger76 suggested

3 years ago

0 I'M Using

Hmm, I really like this one:
https://chart-studio.plotly.com/~empet/14632/plotly-joyplotridgelines/#plot
What I'm thinking is a global setting basically telling the TB binding layer to always do ridgeline instead of 3d surface.
wdyt?

3 years ago

0 Is There Documentation On The Clearml Slurm Enterprise Integration?

Hi @<1523711619815706624:profile|StrangePelican34>
Hmm, I think this is missing from the docs, let me ping the guys about that 🙏

one year ago

0 Hi, I Am Trying To Run Experiment From Clearml Web Ui. I Did Experiment Copy, Enqueue, But In The Execution Log I See That It Runs Command

orchestration module
When you previously mention clone the Task I the UI and then run it, how do you actually run it?
regarding the exception stack
It's pointing to a stdout that was closed?! How could that be? Any chance you can provide a toy example for us to debug?

4 years ago

Show more results