Hi UnsightlyHorse88
Hmm, try adding this to your clearml.conf file: `agent.cpu_only = true`
If that does not work, try adding to the OS environment: `export CLEARML_CPU_ONLY=1`
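As a sketch, the clearml.conf change would sit in the agent section like this (HOCON layout assumed; the env-var alternative is shown as a comment):

```
# clearml.conf -- force the agent to ignore GPUs
agent {
    cpu_only: true
}

# alternatively, before starting the agent:
#   export CLEARML_CPU_ONLY=1
```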
that is because my own machine has 10.2 (not the docker, the machine the agent is on)
No that has nothing to do with it, the CUDA is inside the container. I'm referring to this image https://allegroai-trains.slack.com/archives/CTK20V944/p1593440299094400?thread_ts=1593437149.089400&cid=CTK20V944
Assuming this is the output from your code running inside the docker, it points to CUDA version 10.2
Am I missing something ?
When you are running the base-task, are you providing any arguments to it?
Can you share the "execution" Tab? and the Args tab of the base-task ?
Oh sorry, from the docstring, this will work:
:param bool continue_last_task: Continue the execution of a previously executed Task (experiment)
.. note::
When continuing the execution of a previously executed Task,
all previous artifacts / models / logs are intact.
New logs will continue iteration/step based on the previous-execution maximum iteration value.
For example:
The last train/loss scalar reported was iteration 100, the next report will b...
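To illustrate just the iteration numbering described above (plain Python, not ClearML internals): if the last reported iteration was 100, reports from the continued run pick up right after it.

```python
# Plain-Python illustration of how continued reports are numbered
# (not ClearML code -- just the arithmetic described in the docstring).
def continued_iterations(previous_max, new_steps):
    """Iteration numbers assigned to reports after continuing a task."""
    return [previous_max + step for step in range(1, new_steps + 1)]

print(continued_iterations(100, 3))  # new reports start right after iteration 100
```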
I think I was not able to fully express my point. Let me try again.
When you are running the pipeline fully locally (both logic and components), the assumption is that this is for debugging purposes.
This means that the code of each component is locally available, could that be a reason?
Let me check, it was supposed to be automatically aborted
What do you have in "server_info['url']" ?
print(requests.get(url='
Hmm what do you have here?
os.system("cat /var/log/studio/kernel_gateway.log")
This is strange, let me see if we can get around it, because I'm sure it worked 🙂
Hi @<1523701066867150848:profile|JitteryCoyote63>
RC is out,
pip3 install clearml-agent==1.5.3rc3
Then set `pytorch_resolve: "direct"` in the conf file
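For reference, a sketch of where that setting would live in clearml.conf (section layout assumed, check the linked conf for the exact placement):

```
# clearml.conf on the agent machine (sketch)
agent {
    package_manager {
        # resolve PyTorch wheels via their direct links
        pytorch_resolve: "direct"
    }
}
```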
Let me know if it worked
Can clearml-agent currently detect this?
Hmm, you mean will the agent clean itself up?
@<1587253076522176512:profile|HollowPeacock33>
Is this a commercial ad? This seems out of scope for this channel
Can you expand?
In venv mode yes; in docker mode you can pass them by setting the `-e` flag in `docker_extra_flags`
https://github.com/allegroai/trains-agent/blob/121dec2a62022ddcbb0478ded467a7260cb60195/docs/trains.conf#L98
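A sketch of what that could look like (key name taken from the message above; exact format should follow the linked trains.conf, and the variable names here are hypothetical):

```
# trains.conf (agent section) -- hypothetical env variables
agent {
    docker_extra_flags: ["-e MY_VAR=my_value", "-e OTHER_VAR=123"]
}
```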
Hi BitterStarfish58
Where are you uploading it to?
Hmm BitterStarfish58 what's the error you are getting ?
Any chance you are over the free tier quota ?
It might be that the file upload was broken?
BitterStarfish58 I would suspect the upload was corrupted (I think this is the discrepancy between the file size logged and the actual file size uploaded)
GiganticTurtle0
If there are several tasks running concurrently, which task should `Task.current_task()` return?
How could you have that ?
Per process, there is one Main current Task (until you close it).
Are you referring to a pipeline with multiple steps ?
If this is the case, task.current_task
will return the Task of the component (if executed from the component) and the pipeline Task (if called from the pipeline logic function).
Notice we added the ability to s...
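Conceptually, the per-process behavior is like a module-level singleton; here is a plain-Python sketch (illustrative only, not ClearML's actual implementation):

```python
# Minimal per-process "current task" singleton, for illustration only
# (not ClearML's real implementation).
_current_task = None  # one slot per process

def init_task(name):
    """Initialize and remember the process-wide current task."""
    global _current_task
    _current_task = {"name": name}
    return _current_task

def current_task():
    # Return whatever was initialized in this process (None if nothing was).
    return _current_task

init_task("my-experiment")
print(current_task()["name"])
```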
Okay, let me see...
Is there any known issue with Amazon SageMaker and ClearML?
On the contrary it actually works better on Sagemaker...
Here is what I did on SageMaker:
- created a new SageMaker instance
- opened Jupyter notebook
- started a new notebook (conda_python3 / conda_py3_pytorch)
Then I just did `!pip install clearml` and Task.init
Is there any difference ?
DeterminedToad86 were you running a jupyter notebook or a jupyter console ?
Yey!
My pleasure 🙂
I mean clone the Task in the UI (right click Clone), then go to the execution Tab, to the "installed packages" section, then click on Edit -> go to the torchvision http link, and replace it with torchvision == 0.7.0
and save.
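After the edit, the relevant line in the "installed packages" section would read something like this (requirements syntax; the torch line is hypothetical, only the torchvision line is the actual change):

```
# "installed packages" after replacing the direct http link
torch == 1.6.0
torchvision == 0.7.0
```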
Then right-click Enqueue the Task (to the default queue) and see if the Agent can run it,
DeterminedToad86 Make sense ?
LOL, okay, I'm not sure we can do something about that one.
You should probably increase the storage on your instance 🙂
BTW: from the instance name it seems like it is a VM with preinstalled pytorch, why don't you add system site packages, so the venv will inherit all the preinstalled packages, it might also save some space 🙂
DeterminedToad86 see here:
https://github.com/allegroai/clearml-agent/blob/0462af6a3d3ef6f2bc54fd08f0eb88f53a70724c/docs/clearml.conf#L55
Change it in the agent's conf file to: `system_site_packages: true`
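In context, the setting sits under the package_manager section (layout as in the linked clearml.conf):

```
# clearml.conf on the agent machine
agent {
    package_manager {
        # let the created venv inherit the system's preinstalled packages
        system_site_packages: true
    }
}
```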