JitteryCoyote63 I think I failed to explain myself.
- I think the problem with the controller is that you are interacting (i.e. changing hyperparameters) with a Task created using a new SDK version, from an older SDK version. Specifically, we added section names to the hyperparameters, and only the new version of the SDK is aware of them.
Make sense? - Regarding the actual problem: it seems like this is somehow related to the first one, the task at runtime is using an older SDK version, and I t...
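For reference, a minimal sketch of where the section names come from in the newer SDK (the section name "General" here is illustrative):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="hyperparam sections")

params = {"lr": 0.001, "batch_size": 32}
# Newer SDK versions store these under a named section via the `name` argument;
# older SDK versions don't know how to read the sectioned parameters back.
task.connect(params, name="General")
```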
UpsetTurkey67 my apologies, I just noticed the message
I would recommend reading this blog post, it should give you a glimpse of what can be built 🙂
https://medium.com/pytorch/how-trigo-built-a-scalable-ai-development-deployment-pipeline-for-frictionless-retail-b583d25d0dd
GentleSwallow91 what you are looking for is here 🙂
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L149
Sure thing, any specific reason for asking about multiple pods per GPU?
Is this for remote development process ?
BTW: the funny thing is, on bare metal machines multi-GPU works out of the box, and deploying it with bare metal clearml-agents is very simple
Hi CloudySwallow27
Is there a way to still use the auto_connect but limit the amount of debug imgs?
Basically you can set the number of images it will store for you (per title/series combination). The way it works, it rotates the image names, essentially overriding old images (the UI is aware of this and will only show the last X of them)
See here on setting it:
https://github.com/allegroai/clearml/blob/81de18dbce08229834d9bb0676446a151046e6a7/docs/clearml.conf#L32
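For example, a quick sketch of the rotation in action (I believe report_image also exposes a max_image_history argument to override the limit per call, but double-check against your SDK version):
```python
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="debug images demo")
logger = task.get_logger()

for i in range(100):
    img = (np.random.rand(64, 64, 3) * 255).astype("uint8")
    # only the last N samples per (title, series) pair are kept; the default N
    # comes from sdk.metrics.file_history_size in clearml.conf
    logger.report_image(title="validation", series="sample",
                        iteration=i, image=img, max_image_history=5)
```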
EnviousStarfish54 you can also run the docker-compose on one of the machines on your local LAN, but then you will not be able to access it from home 🙂
(I'll make sure it is added to the docstring because apparently it was not there)
Apparently it ignores it and replaces everything...
Hmmm, can you view the settings? That's the only thing I can think of at the moment that would be different between your setup and the working one...
Also, is there a way for you to have the trains-server behind https (on your GCP)?
Oh, you can achieve exactly the same with plotly and the REST API / Python interface.
Basically pull data from tasks, create a visualization, and log it on one of the Tasks or on a new one.
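Something along these lines (a rough sketch; the task id and the "loss" title are placeholders, and the exact layout returned by get_reported_scalars may differ between SDK versions):
```python
import plotly.graph_objects as go
from clearml import Task

# pull reported scalars from an existing task
source = Task.get_task(task_id="<source_task_id>")
scalars = source.get_reported_scalars()  # {title: {series: {"x": [...], "y": [...]}}}

fig = go.Figure()
for series, data in scalars.get("loss", {}).items():
    fig.add_trace(go.Scatter(x=data["x"], y=data["y"], name=series))

# log the figure on a new task (or use source.get_logger() to attach it there)
viz = Task.init(project_name="examples", task_name="aggregated viz")
viz.get_logger().report_plotly(title="loss comparison", series="all",
                               iteration=0, figure=fig)
```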
Hi CleanWhale17 , at least for the moment, the code although open ( https://github.com/allegroai/trains-web ) has no external theme/customization interface.
That said, we do have some thoughts on it... What did you have in mind?
Yes JitteryCoyote63 I think you are correct, this is currently the easiest way to do it. PompousParrot44 notice that you should have a "services" queue with a trains-agent in "services mode" running to enqueue those types of mostly-sleeping services 🙂
I was thinking we can quickly create a service that does that, maybe leverage one of these ?
https://github.com/mehrdadmhd/scheduler-py
https://github.com/dbader/schedule
WDYT?
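e.g. with the schedule package, something like this quick sketch (the queue name and template task id are placeholders):
```python
import time

import schedule  # https://github.com/dbader/schedule
from clearml import Task

def relaunch():
    # clone a template task and push the copy into an execution queue
    template = Task.get_task(task_id="<template_task_id>")
    cloned = Task.clone(source_task=template)
    Task.enqueue(cloned, queue_name="default")

schedule.every().day.at("02:00").do(relaunch)

while True:
    schedule.run_pending()
    time.sleep(60)
```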
That's the entire repo link? Not something like https://github.com/ ... ?
Yeah I can write a script to transfer it over, I was just wondering if there was a built in feature.
unfortunately no 😞
Maybe if you have a script we can put it somewhere?
If you use this one for example, will the component have pandas as part of its requirements?
```python
def step_two(...):
    import pandas as pd
    # do stuff
```
If so (and it should), what's the difference? How is "internal.repo" different from pandas?
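For reference, the explicit alternative would be something like this sketch (assuming the decorator-based pipeline; the packages argument pins the requirement instead of relying on import auto-detection):
```python
from clearml.automation.controller import PipelineDecorator

# explicitly list pandas instead of relying on auto-detection of the import
@PipelineDecorator.component(packages=["pandas"])
def step_two(data):
    import pandas as pd
    return pd.DataFrame(data)
```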
Hi BattyLizard6
does clearml orchestration have the ability to break gpu devices into virtual ones?
So this is fully supported on A100 with MIG slices. That said, dynamic multi-tenant GPU on Kubernetes is a Kubernetes issue... We do support multiple agents on the same GPU on bare metal, or over shared GPU instances on k8s with:
https://github.com/nano-gpu/nano-gpu-agent
https://github.com/intel/intel-device-plugins-for-kubernetes/tree/main/cmd/gpu_plugin#fractional-resources
http...
Thank you MuddyCrab47 !
Regarding model versioning:
All models are logged automatically by trains (no need to specify it, as long as you are using one of the automagically connected frameworks: PyTorch/Keras/TF/SKlearn)
You can see how it looks on the demoapp:
https://demoapp.trains.allegro.ai/projects/5371015f43f043b1b4ad7203c1ff4a95/models
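For example, with PyTorch nothing beyond Task.init is needed, a minimal sketch:
```python
import torch
from clearml import Task

task = Task.init(project_name="examples", task_name="model autolog")

model = torch.nn.Linear(4, 2)
# the save call is intercepted by the framework integration, and the
# checkpoint is registered as an output model of the task
torch.save(model.state_dict(), "model.pt")
```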
Regarding dataset management, we have a simple workflow demonstrated below, basically using artifacts as dataset storage, with very easy int...
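The artifact-based flow is roughly this sketch (project/task/artifact names are placeholders):
```python
import pandas as pd
from clearml import Task

# producer side: store the dataset as an artifact on the task
producer = Task.init(project_name="examples", task_name="create dataset")
df = pd.DataFrame({"a": [1, 2, 3]})
producer.upload_artifact(name="dataset", artifact_object=df)

# consumer side: locate the producer task and pull the artifact back
source = Task.get_task(project_name="examples", task_name="create dataset")
dataset = source.artifacts["dataset"].get()
```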
TenseOstrich47 this sounds like a good idea.
When you have a script, please feel free to share, I think it will be useful for other users as well 🙂
Okay, so you want to take the jupyter notebook (aka colab) and have that experiment show on Trains, then use the Trains UI to launch it remotely on one of the machines running the trains-agent. Is that correct?
okay this points to an issue with the k8s glue, I think it somehow failed to launch the pod. Can you send me the log of the clearml-k8s-glue ?
Would be cool to let it go untracked as well, especially if we want that as an option
How would you decide what should be tracked?
RoughTiger69 yes I think "Scale" tier covers it 😉
but why is it mounted only once?
Are you saying the second time this line is missing? this is very strange...
Can you send the full Task log?
Hi WickedBee96
Queue1 will take 3GPUs, Queue2 will take another 3GPUs, so in Queue3 can I put 2-4 GPUs??
Yes exactly !
if there are idle GPUs so take them to process the task?
Correct, basically you are saying, this queue needs a minimum of 2 GPUs, but if you have more, allocate them to the Task it pulled (with a maximum of 4 GPUs)
Make sense ?
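If it helps, on the agent side that looks roughly like clearml-agent daemon --dynamic-gpus --gpus 0-7 --queue gpu_queue=2-4 (a sketch; the queue name is a placeholder, and I believe the min-max range syntax depends on the agent version, so double-check against the docs)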
I think this is the main issue, is this reproducible ? How can we test that?
Okay let me check if I can test on this git version.
I am struggling with configuring ssh authentication in docker mode
GentleSwallow91 Basically the agent will automatically mount the .ssh into the container, just make sure you set the following in the clearml.conf:
force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L30