this is pretty weird. PL should only save from rank == 0:
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/connectors/checkpoint_connector.py#L394
I'm not working with TensorFlow. I'm using SummaryWriter from torch.utils.tensorboard. Specifically add_pr_curve:
https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_pr_curve
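roughly what I'm doing (a minimal sketch with fake data; the tag name is just a placeholder):

```python
import numpy as np
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()

# fake binary ground truth and predicted probabilities, just for illustration
labels = np.random.randint(2, size=100)  # 0/1 ground truth
predictions = np.random.rand(100)        # predicted probability of class 1

# logs a full precision-recall curve for this step
writer.add_pr_curve("pr_curve/val", labels, predictions, global_step=0)
writer.close()
```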
cuDNN isn't CUDA; it's a separate library.
are you running in docker or on bare metal? you should have CUDA installed at /usr/local/cuda-<>
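a quick way to sanity-check what torch actually sees (just a sketch, not clearml-specific):

```python
import torch

# CUDA toolkit version torch was built against (not necessarily what's in /usr/local)
print("torch CUDA:", torch.version.cuda)
# cuDNN ships with the torch wheels, separately from the CUDA toolkit
print("cuDNN:", torch.backends.cudnn.version())
print("GPU available:", torch.cuda.is_available())
```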
not really... what do you mean by "free" agent?
oops. I used create instead of init 😳
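i.e. what I should have written (minimal sketch, assuming this is clearml's Task API; project/task names are made up):

```python
from clearml import Task

# Task.init registers the current process as the running experiment;
# Task.create only creates a task entry without attaching to this process
task = Task.init(project_name="examples", task_name="my experiment")
```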
I think so. IMHO all API calls should probably live in a separate module, since they usually happen inside some control code
The legacy version worked just before I mv'ed the folder, but now (after reverting to the old name) it doesn't work either 😢
as a workaround I just stick the epoch number in the series argument of report_scatter2d, with the same title
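i.e. roughly this (untested sketch; titles/series are placeholders):

```python
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="scatter workaround")
logger = task.get_logger()

for epoch in range(3):
    scatter = np.random.rand(50, 2)  # (x, y) pairs
    # same title every epoch, epoch number baked into the series name
    logger.report_scatter2d(
        title="embedding",
        series=f"epoch {epoch}",
        scatter=scatter,
        iteration=0,
        xaxis="x",
        yaxis="y",
        mode="markers",
    )
```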
I see what you mean. So in a simple "all-or-nothing" solution I have to choose between potentially starving either the single node tasks (high priority + wait) or multi-node tasks (wait for a time when there are enough available agents and only then allocate the resource).
I actually meant NCCL. nvcc is the CUDA compiler 😅
NCCL communication can be both inter- and intra-node
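e.g. the same init call works whether ranks share a node or not (sketch; assumes the usual MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE env vars are set, e.g. by torchrun):

```python
import torch.distributed as dist

# NCCL routes over NVLink/PCIe between ranks on one node
# and over the network (e.g. InfiniBand) across nodes
dist.init_process_group(backend="nccl", init_method="env://")
print("rank", dist.get_rank(), "of", dist.get_world_size())
dist.destroy_process_group()
```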
so you don't have CUDA installed 🙂
just seems a bit cleaner and more DevOps/k8s friendly to work with the container version of the agent 🙂
I thought some sort of gang-scheduling scheme should be implemented on top of the job.
Maybe the agents should somehow go through a barrier with a counter and wait there until enough agents have arrived
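conceptually something like this (a toy single-process sketch with threads standing in for agents; a real version would need the counter in a shared store like redis):

```python
import threading

NUM_AGENTS = 4  # gang size: how many agents the multi-node task needs
barrier = threading.Barrier(NUM_AGENTS)

def agent(rank: int) -> None:
    print(f"agent {rank} arrived, waiting for the gang")
    barrier.wait()  # blocks until NUM_AGENTS agents have arrived
    print(f"agent {rank} released, starting the multi-node task")

threads = [threading.Thread(target=agent, args=(i,)) for i in range(NUM_AGENTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```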
Can you elaborate on what you would do with it? Like an OS environment variable that disables the entire setup? will it clone the code base?
It will not do any setup steps. Ideally it would just pull an experiment from a dedicated HPO queue and run it in place
JitteryCoyote63 I still don't understand which CUDA version you are actually using on your machine
this is the CUDA driver API. you need libcudart.so
the hack doesn't work if conda is not installed 😞
I just don't fully understand the internals of an HPO process. If I create an Optimizer task with a simple grid search, how do different tasks know which arguments were already dispatched if the arguments are generated at runtime?
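for context, this is roughly what I mean by an Optimizer task (a sketch based on my reading of clearml.automation; the task id, queue, and parameter names are placeholders):

```python
from clearml import Task
from clearml.automation import DiscreteParameterRange, GridSearch, HyperParameterOptimizer

task = Task.init(project_name="examples", task_name="HPO controller")

optimizer = HyperParameterOptimizer(
    base_task_id="<base task id>",  # the template experiment to clone per trial
    hyper_parameters=[
        DiscreteParameterRange("General/lr", values=[1e-4, 1e-3, 1e-2]),
        DiscreteParameterRange("General/batch_size", values=[32, 64]),
    ],
    objective_metric_title="val",
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=GridSearch,
    execution_queue="default",  # queue the cloned trials are pushed to
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```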
Regardless, it would be very convenient to add a flag to the agent that points it to an existing virtual environment and bypasses the entire setup process. This would make it easier to ramp up new users to clearml who don't want the bells and whistles and just want a simple HPO from an existing env (which may not even be part of a git repo)