AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Is It Possible To Avoid The Clearml-Agent For Local Installations, And Have The File Server Automatically Use An S3 Bucket? I'Ve Found

Are you aware of any other way then (other than the secure: false flag)?

Actually, self-signing and providing a certificate file is already supported by boto (and thus clearml), via the AWS_CA_BUNDLE environment variable:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
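
As a minimal sketch of the above, assuming a self-signed certificate bundle saved at a hypothetical path: boto reads the AWS_CA_BUNDLE environment variable, so setting it before any S3 access in the process should be enough. The helper name and the path are illustrative only and are not part of clearml.

```python
import os


def use_custom_ca_bundle(bundle_path):
    """Point boto (and anything built on it) at a custom CA bundle.

    bundle_path is a hypothetical location of a self-signed
    certificate bundle in PEM format.
    """
    # AWS_CA_BUNDLE is the environment variable documented by boto3.
    os.environ["AWS_CA_BUNDLE"] = str(bundle_path)
    return os.environ["AWS_CA_BUNDLE"]


# Call this before the first S3 access in the process.
use_custom_ca_bundle("/etc/ssl/certs/my-selfsigned-ca.pem")
```

Exporting the variable in the shell that launches the script or the agent would have the same effect.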

2 years ago

0 I'M Trying To Understand How Clearml Serving Works And Trying To Set It Up. I Have An Agent Listening To The Serving Queue And I'M Trying To Set Up Clearml Serving To Launch On The Serving Queue. Do I Need To Have Clearml-Serving Installed On The Machine

Hi VexedCat68
Yes the serving is a bit complicated. Let me try to explain the underlying setup, before going into more details.

clearml-serving CLI: a tool to launch / set up (it does the configuration and enqueuing, not the actual serving).
Control plane Task: stores the state of the serving (i.e. which endpoints need to be served, what models are used, collects stats). This Task has no actual communication with the serving requests/replies (it runs on the services queue).
Serving Task...

2 years ago

0 Hi, I Am Looking To Upload "Already Trained Models" As Experiments In My Clearml Server. How Should I Go About Doing That? Clearml Picks Up The Tensorboard Automatically While It'S Training And Reports It But How Would I Do This If I Had Everything Alread

SmarmyDolphin68 sadly if this was not executed with trains (i.e. the offline option of trains), this is not really doable (I mean it is, if you write some code and parse the TB 😉 but let's assume this is way too much work)
A few options:
On the next run, use the clearml OFFLINE option (i.e. in your code call Task.set_offline(), or set the env variable CLEARML_OFFLINE_MODE=1)
You can compress and upload the checkpoint folder manually, by passing the checkpoint folder, see https://github.com...
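
A rough sketch of the offline option, using the CLEARML_OFFLINE_MODE variable named above. The function names and the project/task names a caller would pass are hypothetical; the clearml import is deferred so that the variable is already set when the library loads.

```python
import os


def enable_offline_mode():
    """Set the environment variable mentioned above before clearml is used."""
    os.environ["CLEARML_OFFLINE_MODE"] = "1"


def start_offline_task(project_name, task_name):
    """Create a Task that records locally instead of talking to the server."""
    enable_offline_mode()
    # Imported lazily so the variable is already set when clearml loads.
    from clearml import Task

    return Task.init(project_name=project_name, task_name=task_name)


enable_offline_mode()
```

Calling Task.set_offline() at the top of the training script, as mentioned above, is the in-code alternative to the environment variable.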

3 years ago

0 Is It Possible To Run Multiple Agent On Ec2 Machines Started By The Autoscaler? Or Have The One Agent Run Multiple Queue Jobs At Once? E.G. Having The Autoscaler Start 1X P3.8Xlarge (4 Gpu) On Aws Might Be Better Than 4X P3.2Xlarge (1 Gpu) In Terms Of Ava

Hi ScantChimpanzee51

Is it possible to run multiple agent on EC2 machines started by the Autoscaler?

I think that by default you cannot,

having the Autoscaler start 1x p3.8xlarge (4 GPU) on AWS might be better than 4x p3.2xlarge (1 GPU) in terms of availability, but then we’d need one Agent per GPU.

I think that this multi-GPU setup is only available in the enterprise tier.
That said, the AWS pricing is linear, it costs the same having 2 instances with 1 GPU as 1 instanc...

one year ago

0 [Pipeline] Hey, Is It Possible To Specify The Output Uri For Pipelines And Their Components Using Pipeline Decorators? I Would Like To Store Pipeline Artifacts And Component Artifacts On S3.

Hi ReassuredOwl55
The easiest is to configure it as the default output_uri in the clearml.conf file of the agent, wdyt?
https://github.com/allegroai/clearml-agent/blob/ebb955187dea384f574a52d059c02e16a49aeead/docs/clearml.conf#L430

one year ago

0 Hi Team, I Am Running Task Using This Command "Clearml-Agent Execute --Id <Taskid>". And My Config File Is Look Like Below, Do I Need To Change Anything In Config File. To Run My Task ,It Taking Too Much Of Time.

Did you set an agent on a machine? (See clearml agent in docs for details)

one year ago

I think what you are looking for is clearml-agent daemon
https://clear.ml/docs/latest/docs/clearml_agent
https://clear.ml/docs/latest/docs/getting_started/video_tutorials/agent_remote_execution_and_automation

one year ago

You need to adjust it to your setup, specifically change the queue name to one you have. Does that make sense?

one year ago

0 Hi There, There Seems To Be An Issue In The Web Ui -> Viewing Plots In "View In Experiment Table" Doesn'T Respect The "Scalars To Display" One Sets When Viewing In "View In Fullscreen". Is This A Bug Or Expected Behaviour?

ElegantKangaroo44 it seems to work here?!
https://demoapp.trains.allegro.ai/projects/0e152d03acf94ae4bb1f3787e293a9f5/experiments/48907bb6e870479f8b230e6b564cd52e/output/metrics/plots

4 years ago

0 For Remote Execution Where The Queue Has

Hmm UnevenDolphin73 I think this is the reason, None
and this means that even without a full lock file poetry can still build an environment

one year ago

0 I Need Some Clarification, How To Train The Cloned Model ? Because I Have Changed Hyper-Parameter Settings

How about this one:
None

one year ago

0 Hi All, I Am Testing The New

Hi GiganticTurtle0

I have found that clearml does not automatically detect the imports specified within the function decorated

The pipeline decorator will automatically detect the imports inside the function, but not outside (i.e. global), to allow better control of packages (think, for example, of one step that needs the huge torch package while another does not).
Make sense?
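
To illustrate the point about imports, here is a plain-Python sketch of a step whose dependency is imported inside the function body rather than at module level. The function name is made up, and clearml's pipeline component decorator is deliberately left off so the snippet runs without clearml installed.

```python
def summarize(values):
    # Import inside the function body, where the pipeline decorator
    # is described as picking it up for this step only.
    import statistics

    return {"mean": statistics.mean(values), "max": max(values)}


# In a real pipeline this function would carry clearml's pipeline
# component decorator; it is left off here so the sketch runs anywhere.
print(summarize([1, 2, 3]))
```

A second step that never imports statistics (or torch, in the example above) would then not need that package in its environment.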

How can I tell clearml I will use the same virtual environment in all steps...

3 years ago

0 Hello, In The Following Context:

Metadata might be expensive, it's a REST API call, and we have found users putting hundreds of artifacts, with preview entries ...

4 years ago

0 Hi All, I Am Testing The New

Okay, so the idea behind the new decorator is not to group all the defined steps under the same script so that they share the same environment, but rather to simplify the process of creating scripts for each step and avoid manually calling Task.init on those scripts.

Correct, and allow users to more easily create Tasks from code.

Regarding virtual environment creation from caching, I will keep running benchmarks (from what you say it might be due to high workload ...

3 years ago

0 Hi Everyone, Is There Something Like A Clearml Context Manager To Disable Automatic Logging? I Use Torch.Save And Torch.Load To Temporarily Cache Something On Disk. I Delete It Afterwards. I Do Not Want Clearml To Push It To The Clearml-Server As An Artif

Hi ReassuredTiger98

is there something like a clearml context manager to disable automatic logging?

Sure, just pass a wildcard matching the files you actually want to autolog; the rest will be ignored:

task = Task.init(..., auto_connect_frameworks={'pytorch': '*.pt'})
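
To check which files a pattern like '*.pt' would cover, here is a small sketch using Python's fnmatch. It assumes shell-style wildcard matching on the file name, which is an assumption about how clearml applies the pattern; the helper name and the file names are made up.

```python
from fnmatch import fnmatch
from pathlib import Path


def would_autolog(filename, pattern="*.pt"):
    """Return True if the file name matches the shell-style pattern."""
    return fnmatch(Path(filename).name, pattern)


files = ["model_epoch_3.pt", "tmp_cache.pkl", "cache/scratch.pth"]
kept = [f for f in files if would_autolog(f)]
```

Under that assumption, saving the temporary cache with an extension the pattern does not match (for example .pkl) would keep it out of the automatic logging.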

one year ago

0 Another Question Is If I Have A Conda Env Available On My Workers Systemwide.. Can I Use That Env Directly When Running Tasks With

PompousParrot44
It should still create a new venv, but inherit the packages from the system-wide (or specific venv) installed packages. Meaning it will not reinstall packages you already installed, but it will give you the option of just replacing a specific package (or installing a new one) without reinstalling the entire venv
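
The "new venv that inherits installed packages" idea can be sketched with Python's standard venv module. This is only an analogue of the mechanism, not the agent's own implementation, and the helper name is hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_inheriting_venv(target_dir):
    """Create a venv that can see the system-wide installed packages."""
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(target_dir)
    # pyvenv.cfg records whether system packages are visible.
    return (Path(target_dir) / "pyvenv.cfg").read_text()


with tempfile.TemporaryDirectory() as tmp:
    cfg_text = create_inheriting_venv(Path(tmp) / "env")
```

Packages installed into such an environment shadow the inherited ones, which matches the "replace a specific package without reinstalling everything" behavior described above.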

4 years ago

0 Hi All, I Am Having Trouble Using The

Notice both need to be str
btw, if you need the entire folder just use StorageManager.upload_folder
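
A short sketch of the str requirement, assuming the two arguments are the local_file and remote_url of StorageManager.upload_file. The helper name, the local path and the bucket URL are hypothetical.

```python
from pathlib import Path


def as_upload_args(local_file, remote_url):
    """Coerce both arguments to plain str, as the upload call expects."""
    return {"local_file": str(local_file), "remote_url": str(remote_url)}


args = as_upload_args(Path("outputs") / "model.pt", "s3://my-bucket/models/model.pt")
print({name: type(value).__name__ for name, value in args.items()})

# With clearml installed, the call would then look like:
#   from clearml import StorageManager
#   StorageManager.upload_file(**args)
```

Passing a pathlib.Path directly is the usual way to end up with a non-str argument, so converting at the call site avoids the problem.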

3 years ago

0 Hi All, I Am Having Trouble Using The

Can you print the actual values you are passing? (i.e. local_file and remote_url)

3 years ago

0 Hi, I Am Using Logger.Report_Plotly() To Get My Roc_Curves In The Plot Window. But When Using The Comparing Feature Of Clearml, I Would Like The Plots With The Same Figure Title To Overlap. Is There A Way To Do This ?

Hi BrightGoat74
So merging general purpose plotly plots is very hard (i.e. putting both on the same graph)
But if you report using logger.report_scatter2d(...) the UI will merge the ROC curves into the same graph, wdyt?
https://clear.ml/docs/latest/docs/guides/reporting/scatter_hist_confusion_mat_reporting#2d-scatter-plots
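
A hedged sketch of reporting a ROC curve as a 2D scatter series. The roc_points helper is hypothetical and simplistic (it ignores tied scores and assumes at least one positive and one negative label); the title and series names are placeholders. The logger is passed in, so nothing here requires clearml until report_roc is actually called.

```python
def roc_points(labels, scores):
    """Return [[fpr, tpr], ...] by sweeping a threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Highest score first, so each step lowers the decision threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = false_pos = 0
    points = [[0.0, 0.0]]
    for _, label in ranked:
        if label:
            true_pos += 1
        else:
            false_pos += 1
        points.append([false_pos / negatives, true_pos / positives])
    return points


def report_roc(logger, points, iteration=0):
    """Send the curve to clearml as a line-mode 2D scatter series."""
    logger.report_scatter2d(
        title="ROC",       # same title across experiments
        series="model",    # same series name so the curves can be compared
        scatter=points,
        iteration=iteration,
        xaxis="False positive rate",
        yaxis="True positive rate",
        mode="lines",
    )


curve = roc_points([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In a real run, logger would come from task.get_logger() and report_roc(logger, curve) would be called once per evaluation.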

2 years ago

Correct 🙂

2 years ago

0 When Running In

PompousParrot44 now that I think about it, you might be able to limit the cpu affinity, would that help?

4 years ago

0 Hi Everybody. I Have Problem When Logging Model In A Specific Case. If Model Has Parameter That Is A Dict Than It Is Not Saved To Clearml Even Tho It Is Saved In A Model Folder Normally. I Have Also Attached Example When This Is Happening As A Snippet. D

Thanks OutrageousGiraffe8
Any chance you can expand the example code into a fully reproducible toy example? (I would really like to make sure we fix it)

2 years ago

0 Hi There, I'M Training A Pytorch Model And Save It Every Epoch. It Seems Like The Model Wights Are Overridden And I Can'T Choose The Best Model After The Experiment Ends. This Feature Is Missing Or I'M Not Using The Library Well?

PompousBeetle71 just making sure, and changing the name solved it?

4 years ago

0 Different Question About Warnings: I'M Getting (Infrequently) This Warning, Followed By My Script Hanging

With the warning ?
I was able to reproduce it on the old versions, but it seems fixed on the latest from GitHub.

3 years ago

Thanks for pinging OutrageousGiraffe8
I think I was able to reproduce.

model is saved to the clearml as an output model when b is not a dictionary.

How did you make the example work with the automagic ?

2 years ago

0 Different Question About Warnings: I'M Getting (Infrequently) This Warning, Followed By My Script Hanging

Okay, progress.
What are you getting when running the following from the git repo folder:
git ls-remote --get-url origin

3 years ago

0 Different Question About Warnings: I'M Getting (Infrequently) This Warning, Followed By My Script Hanging

I think this all ties into the non-standard git repo definition. I cannot find any other reason for it. Is it actually stuck for 5 min at the end of the process, waiting for the repo detection?

3 years ago

0 Hi I Was Running An Hyperparameter Optimization Task Using The Optuna Optimizer And Even Though The Hyperparameteroptimizer’S Argument Is Set To

Hi UpsetBlackbird87
This is an Optuna decision on how many concurrent tests to run simultaneously.
You limited it to 100, but remember Optuna does a Bayesian optimization process, where it decides on the best set of arguments based on the performance of the previous set, this means it will first try X trials, then decide on the next batch.
That said, you can add a pruner to Optuna specifying how it should start
https://optuna.readthedocs.io/en/v1.4.0/reference/pruners.html#optuna.pruners.Median...

2 years ago

0 Second: Is There A Way To Take Internally Tracked Training Runs And Publish Them Publicly, E.G. For A Research Paper? "Appendix A: Training Runs Can Be Found Here, Feel Free To Explore Them And Look At The Loss Curves"? For Example

Hi SmallDeer34
On the SaaS you can right click on an experiment and publish it 🙂
This will make the link available for everyone, would that help?

2 years ago

0 Hello All, I'M Trying To Adapt Clearml With My Workflow. I Installed A Server At My Server, With Workers Attached To It. I'M Trying To Execute A Task From My Local Within One Of My Workers. Trying To Use Docker Mode And A Custom Image. I Also Have A Local

ZanyPig66 you are correct in your assumptions. What exactly do you have in the Task? If there is no git repo, the entire script should be under "uncommitted changes". What is your case?

2 years ago
