It's always the details... Is the new Task running inside a new subprocess?
basically there is a difference between:
1. a remote task spawning new tasks (as subprocesses, or as jobs on a remote machine)
2. a remote task, while still running, being replaced by a spawned task (same process?!)
UnevenDolphin73 am I missing a 3rd option? Which of these is your case?
p.s. I have a suspicion that there might be a misuse of "Task" here?! What are you considering a Task? (from the clearml perspective a Task...
JitteryCoyote63
are the calls from the agents made asynchronously/in a non-blocking separate thread?
You mean whether request processing on the apiserver is multi-threaded / multi-processed?
Hmm, I think the issue is here (the docker command mount): `'-v', '/tmp/.clearml_agent.de0n48pm.cfg:/root/clearml.conf'`
Hi PerplexedCow66
I'm assuming an extension for this:
https://github.com/allegroai/clearml-serving/issues/32
Basically JWT can be used as a general access/block-all on the endpoints, which is most efficiently handled by the k8s load balancer (nginx/envoy),
but if you want a per-endpoint check (or maybe to do something based on the JWT values)
See adding JWT to FastAPI here:
https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/?h=jwt#oauth2-with-password-and-hashing-bearer-with-jwt-tokens
T...
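As a rough illustration, here is a minimal per-endpoint JWT check in FastAPI (a sketch assuming python-jose, as in the tutorial above; the secret key, route name, and payload shape are placeholders, not part of clearml-serving):
```python
# Hypothetical sketch of a per-endpoint JWT check (all names are placeholders).
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt

SECRET_KEY = "change-me"  # assumption: a shared signing key
ALGORITHM = "HS256"

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")

def verify_token(token: str = Depends(oauth2_scheme)) -> dict:
    """Decode and validate the bearer JWT; raise 401 on failure."""
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
    except JWTError:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )

@app.post("/infer")  # hypothetical endpoint
def infer(payload: dict, claims: dict = Depends(verify_token)):
    # here you could also branch on the JWT claims themselves
    return {"user": claims.get("sub"), "result": "ok"}
```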
The issue only arises when sending images (numpy, mpl and PIL alike).
BTW: they should appear under the Debug Samples tab in the Results section.
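For reference, a minimal sketch of reporting an image so it lands under that tab (project/task names are placeholders):
```python
from clearml import Task
import numpy as np

task = Task.init(project_name="examples", task_name="image demo")
# reported images show up under Results > Debug Samples
task.get_logger().report_image(
    title="sample", series="random", iteration=0,
    image=np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8),
)
```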
I wonder if I just need to join 2 docker-compose files to run everything in one session
Actually that could also work
But for reference, when I said IP I meant the actual host network IP, not 127.0.0.1 (which is the same as localhost)
at the end of the manual execution
TrickyRaccoon92 the title provided by write.scalars is also the identifying string for the specific metric; it is more than just a title on the plot itself.
It means that this will be the name of the scalar metric (the title/series combination).
Is that your intention, or is it for viewing purposes only?
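To illustrate the title/series combination with ClearML's explicit reporting API (a minimal sketch; project/task/metric names are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="scalar demo")
# "loss" (title) + "train" (series) together name this scalar metric
task.get_logger().report_scalar(title="loss", series="train", value=0.42, iteration=1)
```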
(apologies I just got to it now)
First of all, kudos on the video, this is so nice!!!
And thanks to you I think I found it:
None
we have to call serialize before execute_remotely
(the reason it sometimes works is that it syncs in the background, so sometimes it's just fast enough and you get the config object)
Let me check if we can push an RC with a ...
Hi FranticCormorant35
So Tasks have a parent field, which links one to another.
Unfortunately there is no visual representation for it.
What we did with the hyper-parameter optimization, for example, was also to add a tag with the ID of the "parent" Task. This makes sense if you have multiple tasks all generated from the same "parent", like in hyper-parameter optimization.
What's your use case? Is it a single evaluation Task per training, multiple, or cron-job-like?
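For example, a minimal sketch of the parent-field + tag approach described above (the task ID and names are placeholders):
```python
from clearml import Task

parent = Task.get_task(task_id="<parent-task-id>")  # placeholder ID
child = Task.init(project_name="examples", task_name="evaluation")
child.set_parent(parent.id)               # link via the parent field
child.add_tags([f"parent:{parent.id}"])   # plus a tag for easy filtering in the UI
```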
You should manually remove cudatoolkit from the installed packages section in the UI, then try sending it to the agent and see if it works. The question is how it ended up there in the first place.
I'll try to find the link...
Hmm I guess it's doable 🙂 could you open a GitHub issue with the feature request?
If it gets enough support we will bump up its priority 🤞
BTW:
If I try to find the right model in `task.models["output"]` (this time there is just one, but in my code there may be several), it appears with the (see other attached screenshot).
What would make sense here? (I have to be honest, I'm not sure.)
To be specific, there is the "model name", which is not unique, and there is the model-key, which is unique to the Task (i.e. `task.models["output"]["model-key"]`).
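For instance, a minimal sketch of listing a task's output models to tell the (non-unique) names apart (the task ID is a placeholder):
```python
from clearml import Task

task = Task.get_task(task_id="<task-id>")  # placeholder ID
for model in task.models["output"]:
    # model.name is not necessarily unique; model.id always is
    print(model.name, model.id, model.url)
```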
WackyRabbit7 hmmm seems like a non-regular character inside the diff.
Let me check something
WickedGoat98 what's the clearml version you are using?
DefeatedCrab47 If I remember correctly, v1+ takes its arguments from argparse.
1. Are you using this feature?
2. How do you set the TB HParams? Currently Trains does not support TB HParams; the reason is that the set of HParams needs to match a single experiment. Is that your case?
Hi CharmingBeetle38
On the base task, do you see those arguments under the Configuration tab?
Also, if they are under the Args section, you should add the "Args/" prefix in the HP optimization (this is how you differentiate between the sections)
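For example, a minimal sketch of the "Args/" prefix in an optimization setup (the base task ID, metric names, and queue are placeholders):
```python
from clearml.automation import HyperParameterOptimizer, UniformParameterRange

optimizer = HyperParameterOptimizer(
    base_task_id="<base-task-id>",  # placeholder ID
    hyper_parameters=[
        # "Args/" selects the Args section of the base task's configuration
        UniformParameterRange("Args/learning_rate", min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="validation",
    objective_metric_series="loss",
    objective_metric_sign="min",
    execution_queue="default",
)
optimizer.start()
```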
Oh I see, that kind of makes sense
I think this is the section you should use:
None
But instead of the clearml-services container you should use the regular container (or just have it installed as part of the entry point on any ubuntu-based container)
Notice the important parts here are:
https://github.com/allegroai/clearml-server/blob/6a1fc04d1e8b112fb334c8743d...
btw, I looked deeper into the log:
File "/tmp/tmpfa8ifmka.py", line 80, in <module>
model.train(data='coco128.yaml',epochs=20)
I'm assuming this all starts here. I think the pipeline is not running the code from the same folder, and it's just missing the 'coco128.yaml'. Try passing a full path, wdyt?
I'm so glad you mentioned the cron job, it would have taken us hours to figure out
The only downside is that you cannot see it in the UI (or edit it).
You can now do:
```
data = {'datatask': 'idhere'}
task.connect(data, 'DataSection')
```
This will create another section named "DataSection" on the Configuration tab. Then you will be able to see/edit the input Task.id
JitteryCoyote63 what do you think?
I'm assuming you are building for x86
BattyLion34 let me see if I understand.
The same base_task_id, when cloned in the UI and enqueued on the same queue as the pipeline, works, but when the pipeline runs the same Task it fails?!
Could it be that you enqueue them on different queues ?
Hi VivaciousPenguin66
Seems like a CUDA/CUDNN issue.
Your agent is configured to work in venv mode, which means it will pull the correct pytorch version based on the detected CUDA driver support. Specifically, you can see in the log "agent.cuda_version = 111", which means CUDA 11.1, and from the log it found the correct pytorch version:
```
Torch CUDA 111 download page found
Found PyTorch version torch==1.8.1 matching CUDA version 111
Found PyTorch version torchvision==0.9.1 matching CUDA version 1...
```
MysteriousBee56 once you execute your code, it will appear in the server (with all fields pre-populated based on your setup/git etc.). Once it is there you can "clone" it and move it around.
Is this what you mean?
A bit of background: the idea behind Trains is that the environment definition (i.e., git repo, packages, code entry arguments, etc.) is collected when executing the code. This avoids the tedious task of generating and maintaining YAML/JSON configuration files.
What is exa...
The issue is the 400 returned from the server, let me check with the backend guys
Is this per Task, or for all the Tasks, always?
Does this mean the model weights are stored on the clearml-server file system?
By default they are just logged (i.e. the local path is stored, but the file is not uploaded). If you want to automatically store the model, pass `output_uri=True` to `Task.init`, or point it at any object store / shared folder (e.g. `output_uri='s3://bucket/folder'`). ClearML will automatically create a subfolder for the Task, and upload all models/artifacts to it.
`task = Task.init(project_name='ex...`
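Completing the truncated snippet, a minimal sketch (the project/task names and bucket path are placeholders):
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="model upload demo",
    # True -> upload to the clearml file server; or point at your own storage
    output_uri="s3://bucket/folder",
)
# models saved by your framework (e.g. torch.save) are now uploaded automatically
```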