The issue was that nvidia-docker2
was not installed on the machine where I was trying to run the agent. Following this guide fixed it:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
Is there some minimal example of a docker env agent I can run, just to see that it works?
AgitatedDove14
made a new one:
https://pastebin.com/LxLFk7py
Sure, will send in a few min when it executes
Ok, it makes sense. But it’s running in docker mode and it is trying to ssh into the host machine and failing
"realmodelonly.pkl"
should be the full path, or just the file name?
AgitatedDove14 thanks!
(base) boris@adamastor:~/clearml_config$ clearml-agent --version
CLEARML-AGENT version 1.4.0
I also use TB.
I solved the issue by implementing my own ClearML logger
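Not the exact logger from this thread, but a minimal sketch of what such a wrapper can look like, using the standard ClearML Logger API (the project/task names and values below are just placeholders):
` from clearml import Task

class SimpleClearMLLogger:
    """Tiny wrapper that forwards scalar metrics to the ClearML Logger."""

    def __init__(self, project: str, task_name: str):
        self.task = Task.init(project_name=project, task_name=task_name)
        self.logger = self.task.get_logger()

    def log_scalar(self, name: str, value: float, step: int):
        # the title/series split is a convention choice, not required by ClearML
        self.logger.report_scalar(title=name, series="train", value=value, iteration=step)

# illustrative usage
log = SimpleClearMLLogger("Adhoc", "Custom logger test")
log.log_scalar("loss", 0.42, step=1) `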
CostlyOstrich36 in installed packages it has:
` # Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:41:22) [Clang 13.0.1 ]
Pillow == 9.2.0
clearml == 1.7.1
minio == 7.1.12
numpy == 1.23.1
pandas == 1.5.0
scikit_learn == 1.1.2
tensorboard == 2.10.1
torch == 1.12.1
torchvision == 0.13.1
tqdm == 4.64.1 `
Which is the same as I have locally and on the server that runs clearml-agent.
Locally I have a conda env with some packages and a basic requirements file.
I am running this thing:
` from clearml import Task, Dataset
task = Task.init(project_name='Adhoc', task_name='Dataset test')
task.execute_remotely(queue_name="gpu")
from config import DATASET_NAME, CLEARML_PROJECT
print('Getting dataset')
dataset_path = Dataset.get(
    dataset_name=DATASET_NAME,
    dataset_project=CLEARML_PROJECT,
).get_local_copy()  # .get_mutable_local_copy(DATASET_NAME)
print('Dataset path', d...
Yeah, pytorch is a must. This script is just a test, but after this I need to train models on GPUs
The image I am using is pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel
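In case it helps anyone later: a minimal sketch of pinning that image on the task itself, so a docker-mode agent runs the task inside it (the project, task and queue names just mirror the script above):
` from clearml import Task

task = Task.init(project_name='Adhoc', task_name='Dataset test')
# ask a docker-mode agent to run this task inside the given image
task.set_base_docker("pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel")
task.execute_remotely(queue_name="gpu") `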
AgitatedDove14 With --debug
I see that after installing packages there is an endless stream of this:
` Retrying (Retry(total=239, connect=239, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fac842e8be0>: Failed to establish a new connection: [Errno 111] Connection refused',)': /auth.login
Retrying (Retry(total=238, connect=238, read=240, redirect=240, status=240)) after connection broken by 'NewConnec...
On the agent side it’s trying to install different pytorch versions (even though the env already has it all configured), then it fails with "torch_<something>.whl is not a valid wheel for this system"
Agent is running in docker mode. The host OS is ubuntu
Freezing means that after the pip package installation, pictured in the screenshot, nothing happens. The screen hangs forever, and there is no other output anywhere, including in the web UI
This issue was resolved by setting the correct clearml.conf
(replacing localhost with a public hostname for the server) 🙂
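For anyone hitting the same thing, the relevant part of clearml.conf looks roughly like this (the hostname below is a placeholder, the ports are the self-hosted server defaults):
` api {
    web_server: http://my-clearml-server.example.com:8080
    api_server: http://my-clearml-server.example.com:8008
    files_server: http://my-clearml-server.example.com:8081
} `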
Yes, I am able to clone locally on the same server the agent is running on. However I do it using ssh auth
Yes, the git user is correct. It does not display the password of course. I tested and the config is definitely coming from clearml.conf
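To be clear, the part of clearml.conf I mean is roughly this (values are placeholders):
` agent {
    git_user: "my-git-username"
    git_pass: "my-personal-access-token"
    # force_git_ssh_protocol: true  # alternative: have the agent clone over SSH instead
} `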
Still, the error persists
The failure is that it does not even run
I don't have a short version.
I am using community clearml. How do I find out my version?
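(A minimal sketch of one way to check from Python; as far as I know the SDK exposes its version as a module attribute, and the agent version is what clearml-agent --version prints above:)
` import clearml

print(clearml.__version__) `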
Is there a way to debug what is happening?
The task log is here:
The log on my local machine is here:
Is there a way to check if the port is accessible from my local machine?
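(A quick sketch of how that can be tested from the local machine; the host and port below are placeholders for the server address and the port in question:)
` import socket

host, port = "my-clearml-server.example.com", 8008  # placeholders
try:
    with socket.create_connection((host, port), timeout=5):
        print(f"{host}:{port} is reachable")
except OSError as err:
    print(f"{host}:{port} is not reachable: {err}") `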
(But in venv mode it also hangs the same way)