HollowPeacock58

9 Questions, 37 Answers

Active since 22 May 2023

Last activity one year ago

Reputation

Badges 1

37 × Eureka!


0 Votes

24 Answers

1K Views

Hello, I'm running a local agent. While it's running the task I get this error. Any suggestion? Successfully installed numpy-1.24.4 Found PyTorch version torch==2.0.1 matching CUDA version 0 Found PyTorch version torchaudio==2.0.2 matching CUDA version 0 Er...

mlops

one year ago

0 Votes

4 Answers

1K Views

Hi all, I'm a new user of clearml-agent. I know it's supposed to automatically replicate the environment of a task, based on the INSTALLED PACKAGES list. However, the installed packages list of my task misses many of the installed packages (any idea why?). How do I co...

clearml

one year ago

0 Votes

2 Answers

655 Views

Hi all, did anyone try to register for "Pro" recently? I tried now, but I got a receipt for $0, and it is not clear whether I am registered at all. Thanks.

clearml

one year ago

0 Votes

19 Answers

1K Views

Hi, I run 'manually' on my local machine with no errors. Then I clone the completed task and enqueue it. I get to the stage 'Environment setup completed successfully', but right after that I get an error related to the 'connect' method - task.connect(config.mode...

clearml

one year ago

0 Votes

2 Answers

832 Views

Hello, I got this error message - failed logging task to backend (1 lines, <400/68: events.add_batch/v1.0 (The usage quota was exceeded: type=metrics_storage)>). I removed multiple tasks to clean up storage (permanently deleted from archive) but still keep...

clearml

one year ago

0 Votes

4 Answers

1K Views

Hi! I have a config dictionary which is a dot dictionary (a dictionary that supports dot notation as well as dictionary access notation; set attributes: d.val2 = 'second' or d['val2'] = 'second'; get attributes: d.val2 or d['val2']). I ru...

mlops

one year ago

0 Votes

2 Answers

1K Views

Hello! In my code I use a package that writes into wav files, named soundfile (import soundfile as sf). In 'conda list' there are - SoundFile 0.10.3.post1 pysoundfile 0.10.3.post1 pyhd3deb0d_0 conda-forge libsndfile...

clearml

one year ago

0 Votes

2 Answers

1K Views

Hi! I have some hyperparameters of my ClearML task which I connected to the task with - parameters = task.connect(model_train_dict, name='train_params'). I ran the task manually from VS Code. Then, from the ClearML dashboard, I cloned it, changed one of the para...

mlops

one year ago

0 Votes

1 Answers

1K Views

Hi, when running a task with an agent (located on my PC), I get an error related to PyTorch missing some attribute. I validated that the version installed by clearml-agent is the same as in my requirements list, both for PyTorch and for TensorBoard. What else is...

mlops

one year ago

0 Hi! I have some hyperparameters of my ClearML task which I connected to the task with - parameters = task.connect(model_train_dict, name='train_params'). I ran the task manually from VS Code. Then, from the ClearML dashboard, I cloned it, changed one of the para...

Looking at it now, I think it's probably because in my code I first read the parameters from the config file -
config = HpsYaml(paras.config)
and after that I put these parameters into a dictionary, which I connect to the task -
hparams_dict = {'batch_size': config.hparas.batch_size,
                'valid_step': config.hparas.valid_step,
                'max_step': config.hparas.max_step}
parameters = task.connect(hparams_dict, name='hyper_params')
So maybe the parameters from the config file override th...

one year ago
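
The override suspected in the post above can be illustrated with a small sketch. It assumes that `task.connect` returns the connected dictionary and that, when an agent runs a cloned task, values edited in the UI replace the defaults passed in. `FakeTask` is a hypothetical stand-in (not the ClearML class) so the sketch runs without a server, and the numeric values are made up for illustration.

```python
from types import SimpleNamespace


class FakeTask:
    """Hypothetical stand-in for a ClearML task, for illustration only."""

    def __init__(self, ui_overrides=None):
        # Values that would have been edited in the UI on the cloned task.
        self.ui_overrides = ui_overrides or {}

    def connect(self, mutable, name=None):
        # Assumed behaviour: UI-edited values replace the defaults in place.
        mutable.update(self.ui_overrides)
        return mutable


def load_hparams(task, config):
    """Build the dict from the config file, then let the task override it."""
    hparams = {
        "batch_size": config.hparas.batch_size,
        "valid_step": config.hparas.valid_step,
        "max_step": config.hparas.max_step,
    }
    return task.connect(hparams, name="hyper_params")


# Made-up config values standing in for the YAML file from the post.
config = SimpleNamespace(
    hparas=SimpleNamespace(batch_size=16, valid_step=500, max_step=10000)
)
task = FakeTask(ui_overrides={"batch_size": 32})
parameters = load_hparams(task, config)

# Read from the connected dict; reading config.hparas would ignore the UI edit.
print(parameters["batch_size"], config.hparas.batch_size)
```

If later code keeps reading `config.hparas.batch_size`, it sees the file value rather than the edited one, which would match the symptom described.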

0 Hi, I run 'manually' on my local machine with no errors. Then I clone the completed task and enqueue it. I get to the stage 'Environment setup completed successfully', but right after that I get an error related to the 'connect' method - task.connect(config.mode...

from einops import rearrange, repeat
ModuleNotFoundError: No module named 'einops'

one year ago

0 Hi, when running a task with an agent (located on my PC), I get an error related to PyTorch missing some attribute. I validated that the version installed by clearml-agent is the same as in my requirements list, both for PyTorch and for TensorBoard. What else is...

Training Translator ...
Traceback (most recent call last):
File "main.py", line 119, in <module>
from bin.train_module import Solver
File "/home/rakefet/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/clearml/binding/import_bind.py", line 54, in __patched_import3
mod = builtins.org_import(
File "/home/rakefet/.clearml/venvs-builds/3.8/task_repository/vq-bnf-translator-Rakefet.git/bin/train_module.py", line 4, in <module>
from src.solver import BaseSolver
File...

one year ago

I don't see why the dictionary is special

one year ago

OK, so I should generally avoid connecting complex objects? I guess I would create a 'mini dictionary' with a subset of the params, and connect this instead.

one year ago
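
The 'mini dictionary' idea from the post above could look roughly like the sketch below. `DotDict` is a hypothetical dot-notation dictionary written for this example (the poster's own class is not shown), and `to_plain_dict` is an illustrative helper name. The result contains only built-in dicts and values, which is the kind of object one would then pass to `task.connect`.

```python
class DotDict(dict):
    """Hypothetical dot-notation dictionary, similar to the one described."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as exc:
            raise AttributeError(key) from exc

    def __setattr__(self, key, value):
        self[key] = value


def to_plain_dict(obj, keys=None):
    """Copy a mapping into built-in dicts, optionally keeping only `keys`."""
    items = obj.items() if keys is None else ((k, obj[k]) for k in keys)
    return {
        k: to_plain_dict(v) if isinstance(v, dict) else v
        for k, v in items
    }


# Made-up config for illustration.
cfg = DotDict(lr=0.1, name="demo", model=DotDict(layers=4, dim=256))
mini = to_plain_dict(cfg, keys=["lr", "model"])
print(mini)

# With a real ClearML task this plain dict would be connected instead, e.g.:
# task.connect(mini, name="model params")
```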

Just to be clear, regarding the task.connect error I have no solution. I can't keep these lines commented out. Thanks in advance.

one year ago

0 Hello, I'm running a local agent. While it's running the task I get this error. Any suggestion? Successfully installed numpy-1.24.4 Found PyTorch version torch==2.0.1 matching CUDA version 0 Found PyTorch version torchaudio==2.0.2 matching CUDA version 0 Er...

But what about this error?
ERROR: Invalid requirement: 'cudatoolkit=12.2'
Hint: = is not a valid operator. Did you mean == ?
RequirementsManager handler
...
exception: Failed installing GIT/HTTPs package 'cudatoolkit=12.2'
Failed installing GIT/HTTPs package 'cudatoolkit=12.2'
clearml_agent: ERROR: Could not install task requirements!

What is the origin of cudatoolkit=12.2? How should I resolve it?

one year ago
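
The hint in the log above suggests that 'cudatoolkit=12.2' is written in conda's pin style (single `=`), while pip expects `==`. That is a reading of the error message, not a confirmed diagnosis of where the string comes from. The sketch below only illustrates the syntactic difference; `conda_to_pip_spec` is a hypothetical helper name, and rewriting the operator alone may not make pip able to install a package that is distributed through conda channels.

```python
import re

# Matches "name=version" or "name=version=build" (conda style, single "=").
_CONDA_SPEC = re.compile(r"([A-Za-z0-9][A-Za-z0-9._-]*)=([^=<>!~]+)(?:=.*)?")


def conda_to_pip_spec(spec: str) -> str:
    """Rewrite a conda-style 'name=version' pin as pip's 'name==version'.

    Anything that is not a single-'=' pin is returned unchanged.
    """
    cleaned = spec.strip()
    match = _CONDA_SPEC.fullmatch(cleaned)
    if match is None:
        return cleaned
    name, version = match.groups()
    return f"{name}=={version}"


for line in ["cudatoolkit=12.2", "numpy==1.24.4", "torch>=2.0"]:
    print(line, "->", conda_to_pip_spec(line))
```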

It may be related to the fact that I re-installed the CUDA drivers and did not re-create the virtual envs. However, on my PC it runs on the GPU with no errors.

one year ago

This has been holding me back for quite a long time. Perhaps we can meet virtually and solve it?

one year ago

Hi. To be on the safe side, I recreated the virtual env, ran locally, and afterwards ran through the locally installed agent.
I get the same error - see log file.

one year ago

I don't know where cudatoolkit=12.2 is taken from. It's not in requirements.txt.

one year ago

I tried adding
Task.add_requirements("cudatoolkit==12.2")  # replacing pip install cudatoolkit==12.2

but then got
...
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
agent.venvs_dir = /home/rakefet/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.venvs_cache.path = ~/.clearml/venvs-cache
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/rakefet/.clearml/vcs-cache
...

one year ago

Also, I am indeed using conda as the package manager.
package_manager: {
# supported options: pip, conda, poetry
type: conda,

one year ago

0 Hello, I got this error message - failed logging task to backend (1 lines, <400/68: events.add_batch/v1.0 (The usage quota was exceeded: type=metrics_storage)>). I removed multiple tasks to clean up storage (permanently deleted from archive) but still keep...

Thanks

one year ago

0 Hi! I have a config dictionary which is a dot dictionary (a dictionary that supports dot notation as well as dictionary access notation; set attributes: d.val2 = 'second' or d['val2'] = 'second'; get attributes: d.val2 or d['val2']). I ru...

You are right! I added this and indeed the issue was solved. Thanks!

one year ago

0 Hello! In my code I use a package that writes into wav files, named soundfile (import soundfile as sf). In 'conda list' there are - SoundFile 0.10.3.post1 pysoundfile 0.10.3.post1 pyhd3deb0d_0 conda-forge libsndfile...

Before doing anything I got -
Environment setup completed successfully
Starting Task Execution:
Traceback (most recent call last):
File "inference.py", line 10, in <module>
import soundfile as sf
File "/home/ubuntu/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/soundfile.py", line 142, in <module>
raise OSError('sndfile library not found')
OSError: sndfile library not found

one year ago
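
The `OSError` above is raised when the Python package cannot locate the native libsndfile library. A quick diagnostic, assuming the lookup behaves like the standard library's `ctypes.util.find_library`, is to run the sketch below inside the environment the agent built. `sndfile_status` is an illustrative helper name, not part of soundfile or ClearML.

```python
from ctypes.util import find_library


def sndfile_status() -> str:
    """Report whether the system loader can locate libsndfile."""
    path = find_library("sndfile")
    if path is None:
        return "libsndfile not found by the system loader"
    return f"libsndfile found: {path}"


print(sndfile_status())
```

If the library is not found, it usually has to come from outside pip, for example from the operating system's package manager or from a conda package.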

OK, thanks. Will try.

one year ago

What exactly do you mean by 'manually remove from installed packages in the UI'? Where on the UI?

one year ago

The original Task is created by simply executing the code, not through the agent.

one year ago

In the virtual env, these are the installed packages -

one year ago

Hi! After a deeper check I realized that my local PC also had a problem communicating with the Nvidia driver. I have now re-installed the driver and dependencies, validated with the nvidia-smi command, and the local run looks OK.
I re-ran with clearml-agent and am now getting this error -
Successfully installed AMFM_decompy-1.0.11 MarkupSafe-2.1.3 Pillow-10.0.0 PyYAML-6.0.1 antlr4-python3-runtime-4.8 appdirs-1.4.4 attrs-23.1.0 audioread-3.0.0 bitarray-2.7.6 cffi-1.15.1 clearml-1.12.2 cmake-3.27.2 colorama-0.4.6 con...

one year ago

Locally the virtual env is created with conda, but inside it there are also packages installed with pip. Is that what you mean?

one year ago

I'm running on a Dell XPS 15 7590 with Ubuntu 22.04.2 (not a Mac).
Processor - x86_64.
I did update but am still getting the same error.

one year ago

As you mentioned, the requirement for 'cudatoolkit=12.2' is internal to clearml-agent, so I have no way to fix it myself.

one year ago

Can you suggest a solution or a workaround?

one year ago

Looking at the 'installed packages' section after the Task reset, I only see this (NO cuda toolkit) -

Python 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]

AMFM_decompy == 1.0.11
Cython == 3.0.2
Pillow == 10.0.0
PyYAML == 6.0.1
bitarray == 2.8.1
clearml == 1.12.2
einops == 0.6.1
hydra_core == 1.0.7
joblib == 1.3.2
librosa == 0.10.1
matplotlib == 3.7.2
numpy == 1.24.4
omegaconf == 2.0.6
packaging == 23.1
psutil == 5.9.5
regex == 2023.8.8
requests == 2.31.0
sacrebleu == 2.3.1
scikit_learn == ...

one year ago

Hi, I re-ran now after minor updates. I get a similar error in the same part of the code -
Traceback (most recent call last):
File "main.py", line 84, in <module>
task.connect(config.model,name='model params')
File "/home/rakefet/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/clearml/task.py", line 1455, in connect
return method(mutable, name=name)
File "/home/rakefet/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/clearml/task.py", line 3573, in _connect_dictionary
dicti...

one year ago

ClearML results page: None
Traceback (most recent call last):
File "main.py", line 85, in <module>
task.connect(config.hparas,name='hyper params')
File "/home/rakefet/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/clearml/task.py", line 1455, in connect
return method(mutable, name=name)
File "/home/rakefet/.clearml/venvs-builds/3.8/lib/python3.8/site...

one year ago

After removing the task.connect lines, it encountered another error: 'einops' is not recognized. It does exist in my environment file but was not installed by the agent (according to what I see in 'Summary - installed python packages'). Should I add this manually?

one year ago
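
One way to see which imports the agent's environment is missing, before deciding what to add manually, is a small check like the sketch below. It uses only the standard library; `missing_modules` is an illustrative helper name, and the module list is an example built from names that appear in the posts above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example list: top-level modules the training code imports.
print(missing_modules(["einops", "soundfile", "json"]))
```

Any name reported here is a candidate for an explicit requirement, for instance through the `Task.add_requirements` call used elsewhere in this thread.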
