You can set torch to be installed last:
post_packages: ["horovod", "torch"]
This will make sure the trains-agent installs the version you specified in the "installed packages" section last.
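For reference, this key lives under the agent's package_manager section of the agent configuration file. A minimal sketch, assuming the standard config layout (the exact file name depends on your agent version):
agent {
    package_manager {
        # packages to install after all other required packages
        post_packages: ["horovod", "torch"]
    }
}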
Hi SubstantialElk6
We try to push a fix the same day a HIGH CVE is reported. That said, since the external API interface is relatively far away from the DBs / OS, and since, as a rule of thumb, authorized users are trusted (they basically inherit the agent's code-execution rights, so they have to be), it is an exception to have a CVE that actually affects the system. I think even this high-profile one does not actually have an effect on the system, as even if ELK were susceptible (which it is not), only authorized users co...
Hi SkinnyPanda43
cannot schedule new futures after interpreter shutdown
This seems like a strange exception...
What's the setup here? Jupyter notebook? How is the interpreter shut down?
I want to be able to access the data just avoid reporting the experiment results
Yes, you are correct :)
If you just want to skip the logging you can always add an if around the Task.init call, no?
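Something along these lines, as a minimal sketch (the REPORT_TO_CLEARML switch is just a hypothetical example, use whatever condition fits your setup):
import os
from clearml import Task

# hypothetical switch: only create and report the experiment when explicitly enabled
report_results = os.environ.get("REPORT_TO_CLEARML", "0") == "1"

task = None
if report_results:
    task = Task.init(project_name="examples", task_name="my experiment")

# the rest of the code (data access, training, ...) runs the same either way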
LazyTurkey38 configuration pushed to github :)
Hmm could it be this is on the "helper functions" ?
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
Hi @<1692345677285167104:profile|ThoughtfulKitten41>
Is it possible to trigger a pipeline run via API?
Yes! a pipeline is at the end a Task, you can take the pipeline ID and clone and enqueue it
from clearml import Task

pipeline_task = Task.clone("pipeline_id_here")
Task.enqueue(pipeline_task, queue_name="services")
You can also monitor the pipeline with the same Task interface.
wdyt?
First that is awesome to hear PanickyFish98 !
Can you send the full exception? You might be on to something...
2. Actually we thought of it, but could not find a use case, can you expand?
3. I'm not sure I follow, do you mean you expect the first execution to happen immediately?
Hi DepressedChimpanzee34
if you try to extend it more than the width of the column to the right, it doesn't do anything...
You mean outside of the window? or are you saying you cannot extend it?
Just verifying, we are talking about the latest version of clearml-server ?
Thanks @<1694157594333024256:profile|DisturbedParrot38> !
Nice catch.
Could you open a github issue so that at least we output a more informative error?
If this is the case then the easiest is:
from clearml.backend_api.session.client import APIClient

client = APIClient()
res = client.events.get_task_plots(task="<task-id>")
We should definitely have a nice interface :)
Also, on the ClearML dashboard, I can see the
clearml-agent
log:
Is the clearml-agent running in docker mode ?
Hi @<1691258563357315072:profile|ColorfulKitten60>
I think we need some context for this question :)
The difference is whether you are only supplying "minute" or you are also passing hour/day etc.
See the examples:
Every 15 minutes:
add_task(task_id='1235', queue='default', minute=15)
Every hour, on minute 20 of the hour (i.e. 00:20, 01:20 ...):
add_task(task_id='1235', queue='default', hour=1, minute=20)
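For context, a rough sketch of wiring these calls into the clearml TaskScheduler (treat it as an outline only; the exact keyword names, e.g. task_id vs schedule_task_id, can differ between clearml versions):
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# re-enqueue the referenced task every 15 minutes
scheduler.add_task(task_id='1235', queue='default', minute=15)
# start the scheduler loop (this script itself usually runs on the services queue)
scheduler.start()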
I see now, give me a minute I'll check
Hi @<1704304350400090112:profile|UpsetOctopus60>
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_kubernetes_helm
Just use the helm charts. It's the easiest
Is there an easy way to add a docker argument in the python script?
On the task itself in the UI you can edit the docker arguments and add any missing flags
(task.set_base_docker will do the same from code)
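For example, a minimal sketch, assuming a recent clearml version where set_base_docker accepts separate image/arguments parameters (older versions take a single command string):
from clearml import Task

task = Task.init(project_name="examples", task_name="docker args example")
# container image and extra docker run flags used when an agent executes the task in docker mode
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    docker_arguments="--ipc=host --shm-size=8g",
)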
You can also edit the configuration and always add this flag:
None
no, using the system drivers
You might need to play around a bit; it might be something like StorageHelper.get('gs://bucket') and then helper.list('folder/*')
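Roughly like this, as a sketch (assuming the bucket credentials are already set up in clearml.conf; bucket and folder names are placeholders):
from clearml.storage.helper import StorageHelper

# helper bound to the GCS bucket, using the credentials from clearml.conf
helper = StorageHelper.get('gs://my-bucket')
# list objects under a prefix - exact wildcard behavior may vary by version
files = helper.list('folder/*')
print(files)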
Let me know what worked :)
docker mode. they do share the same folder with the training data mounted as a volume, but only for reading the data.
Any chance they try to store the TensorBoard output in this folder? This could lead to "No such file or directory: 'runs'" if one process is deleting it while the other is trying to access it, or similar scenarios
This means that in your "Installed packages" you should see the line:
Notice that this is not a pypi artifactory (i.e. a server to add to the extra index url for pip), this is a direct pip install from a git repository, hence it should be listed in the "installed packages".
If this is the way the package was installed locally, you should have had this line in the installed packages.
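For illustration, such a line usually follows pip's VCS requirement format (repository and package names below are placeholders):
git+https://github.com/<org>/<repo>.git@<branch-or-commit>#egg=<package-name>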
The clearml agent should take care of the authentication for you (specifically here, it should do nothing).
If ...
Thanks ShakyJellyfish91 ! please let me know what you come up with, I would love for us to fix this issue.
Change it to add_missing_installed_packages=False here, and see if you still end up with the git diff:
https://github.com/allegroai/clearml/blob/1f82b0c4010799be6157f5c845c7f6ac48e71c0c/clearml/backend_interface/task/populate.py#L158
SubstantialElk6
The ~<package name with the first letter dropped> == a.b.c line is a known conda/pip temporary-install issue (a leftover from a previous package install).
The easiest way is to find the site-packages folder and delete the package, or create a new virtual environment
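If it helps, a quick way to locate the current interpreter's site-packages folder (just a sketch):
import sysconfig

# path to the site-packages folder of the active (virtual) environment
print(sysconfig.get_paths()["purelib"])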
BTW:
pip freeze will also list these broken packages
PricklyRaven28 basically this is the issue:
python -m fastai.launch <script>
There are multiple copies of the script running, but they are Not aware of one another.
are you getting any reporting from the different GPUs? I'm assuming there is a hidden OS environment variable that signals the "master" node, so all processes can communicate with it. This is what we should automatically capture. There is a workaround for fastai.launch that is probably similar to this one:
Hi @<1526371965655322624:profile|NuttyCamel41>
I think that the only way to actually get huge number of api calls is with a lot of machines.
For example, regardless of the amount of console logs you print, it will only be a single call, as these are packaged every 2-10 seconds. The same goes for metric reporting etc.
On the free tier you can already test the amount of API calls; I think the mechanism is exactly the same
fyi: I would put this question in the channel
Yep it is the scale :) and yes, it should appear once you upgrade
I think I found something,
https://github.com/allegroai/clearml/blob/e3547cd89770c6d73f92d9a05696018957c3fd62/clearml/storage/helper.py#L1442
What's the boto version you have installed?
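In case it helps, a quick way to check (assuming boto3 is the package in question):
import boto3

print(boto3.__version__)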