Okay this seems correct:
pytorch=1.8.0=py3.7_cuda11.1_cudnn8.0.5_0
I can't seem to find what's the diff between the two.
Give me a second let me check if I can reproduce it somehow.
Uninstall the current clearml-agent and reinstall this wheel, I hacked it to have ==, let's see if that works
@ReassuredTiger98 it works on my machine 🙂
Thanks @ReassuredTiger98
From the log this is what conda is installing, it should have worked
/tmp/conda_env1991w09m.yml:
channels:
- defaults
- conda-forge
- pytorch
dependencies:
- blas~=1.0
- bzip2~=1.0.8
- ca-certificates~=2020.10.14
- certifi~=2020.6.20
- cloudpickle~=1.6.0
- cudatoolkit~=11.1.1
- cycler~=0.10.0
- cytoolz~=0.11.0
- dask-core~=2021.2.0
- decorator~=4.4.2
- ffmpeg~=4.3
- freetype~=2.10.4
- gmp~=6.2.1
- gnutls~=3.6.13
- imageio~=2.9.0
-...
Okay this is very close to what the agent is building:
Could you start a new conda env,
then install cudatoolkit=11.1
then run:
conda env update -p <conda_env_path_here> --file the_env_yaml.yml
Does clearml resolve the CUDA Version from driver or conda?
Actually it starts with the default CUDA based on the host driver, but when it installs the conda env it takes it from the "installed packages" (i.e. the one you used to execute the code in the first place)
Regarding the link, I could not find the exact version, but this is close enough I guess:
None
VirtuousFish83 I remember an issue on GitHub with something similar. What's the clearml-server version you are using?
Oh, yes, that might be it (the threshold is 3 minutes if there are no reports), but you can change that:
task.set_resource_monitor_iteration_timeout(seconds_from_start=10)
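For reference, a minimal sketch of where that call would go (project/task names are just placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="resource monitor timeout")
# lower the resource-monitor threshold so machine metrics (CPU/GPU/RAM) are
# reported from the start, even if no other reports arrive within 3 minutes
task.set_resource_monitor_iteration_timeout(seconds_from_start=10)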
Could it be it was never allocated to begin with ?
Is there an easy way to add a docker argument in the python script?
On the task itself in the UI you can edit the docker arguments and add any missing flags
(task.set_base_docker will do the same from code)
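Something like this, as a rough sketch (the image name and flag are only examples; newer SDKs also accept a separate docker_arguments keyword):
from clearml import Task

task = Task.init(project_name="examples", task_name="docker args")
# single string with the container image followed by any extra `docker run` flags,
# used when an agent executes this task in docker mode
task.set_base_docker("nvidia/cuda:11.1.1-runtime-ubuntu20.04 --ipc=host")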
You can also edit the configuration and always add this flag:
None
But what I get with
get_local_copy()
is the following path: ...
get_local_copy() will return an immutable copy of the dataset; by definition, this will not be the "source" storing the data.
(Also notice that the dataset itself is stored in zip files, and when you get the "local-copy" you get the extracted files)
Make sense ?
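A quick sketch of what that looks like (dataset project/name are placeholders):
from clearml import Dataset

# returns a read-only, cached folder containing the extracted dataset files
dataset = Dataset.get(dataset_project="examples", dataset_name="my dataset")
local_folder = dataset.get_local_copy()
# if you need a copy you can modify, use get_mutable_local_copy() with a target folder
mutable_folder = dataset.get_mutable_local_copy("/tmp/my_dataset_copy")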
Where again does clearml place the venv?
Usually ~/.clearml/venvs-builds/<python version>/
With multiple agents on the same machine you'll get venvs-builds.1, venvs-builds.2, and so on
if executed remotely...
You mean cloning the local execution, sending it to the agent, and then when running on the agent the Args/command is updated to a list?
Could I just build it and log these parameters using
task.set_parameters()
so that I call
task.get_parameters()
later?
Instead of manually calling set/get, you call task.connect(some_dict_or_object), and it does both:
When running manually (i.e. without an agent), it logs the keys/values on the Task;
when running with an agent, it takes the values from the backend (Task) and sets them on the dict/object.
Make sense ?
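A minimal sketch of that pattern (parameter names and values are just examples):
from clearml import Task

task = Task.init(project_name="examples", task_name="connect params")

params = {"lr": 0.001, "batch_size": 32}
# local run: the dict values are logged on the Task
# agent run: the dict values are overridden by whatever is stored on the Task
params = task.connect(params)
print(params["lr"], params["batch_size"])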
I see, so basically pull a fixed set of configuration for everyone from the server.
Currently only the Scale/Enterprise version supports such a feature 🙂
WhimsicalLion91
What would you say the use case for running an experiment with iterations
That could be a loss value per iteration, or accuracy per epoch (iteration is just a name for the x-axis, in a sense; this is equivalent to a time series)
Make sense?
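For example, reporting a scalar per iteration would look roughly like this (names and values are illustrative):
from clearml import Task

task = Task.init(project_name="examples", task_name="scalar reporting")
logger = task.get_logger()

for epoch in range(10):
    loss = 1.0 / (epoch + 1)  # placeholder metric
    # "iteration" is simply the x-axis of this time series
    logger.report_scalar(title="loss", series="train", value=loss, iteration=epoch)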
Actually, what my service does is collect
stdout/stderr
from the Docker socket
That's exactly how the agent works, it cannot really filter it, it logs everything by default for full visibility ...
This seems to be the issue: PYTHONPATH = '.'
How is that happening?
Can you try running the agent with: PYTHONPATH= clearml-agent daemon ....
(Notice the PYTHONPATH= prefix clears the environment variable that is obviously breaking the python commands)
Curious what the advantage would be to using the StorageManager
Basically, if you set the clearml cache folder to the EFS, users can always do:
from clearml import StorageManager
local_file = StorageManager.get_local_copy("...")
where local_file is stored on the persistent cache (EFS), and the cache is automatically cleaned based on the last accessed file
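As a rough sketch (the URL is just a placeholder; the cache location is whatever you configure, e.g. sdk.storage.cache.default_base_dir in clearml.conf pointing at the EFS mount):
from clearml import StorageManager

# downloads the file once into the shared cache and returns the local path;
# subsequent calls on any machine mounting the same cache reuse that copy,
# and least-recently-accessed files are cleaned up automatically
local_file = StorageManager.get_local_copy("s3://my-bucket/path/to/file.zip")
print(local_file)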
my experiment logic
you mean the actual code doing the training ?
so that it gets lazily executed and not at task definition time
Task definition time -> you mean when creating the Pipeline Task? Remember, the base_task_factory at the end creates a Task object (it does not run the code itself).
BTW: if you have simple training logic, you can use pipeline decorators; it might be a better fit?
https://clear.ml/docs/latest/docs/fundamentals/pipelines#pipeline-from-function-decorator
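Roughly, a decorated pipeline looks like this (names and the toy logic are placeholders):
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["model_path"], cache=True)
def train(dataset_id):
    # the actual training logic runs here, as its own Task, when the pipeline executes
    return "/tmp/model.pt"

@PipelineDecorator.pipeline(name="training pipeline", project="examples", version="0.1")
def pipeline_logic(dataset_id="1234"):
    model_path = train(dataset_id)
    print(model_path)

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # debug locally; remove to launch on agents
    pipeline_logic()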
It will always set its own environment, either with static analysis or with "pip freeze" / "conda freeze"
It needs to log the exact setup that was actually installed.
When you later launch it on a remote machine, it can either use this to recreate the environment (using pip or conda), or you can clear the entire section, in which case it will fall back to "requirements.txt"
Any reason for specifically using the "environment.yaml" ?