Yes, I mean trains-agent. Actually I am using 0.15.2rc0. But I am using local files: I cloned the trains and trains-agent repos and installed them from there. Their versions are 0.15.2rc0.
I see, that's why we get the git ref, not package version.
Thanks! I think I was able to locate the issue, but I wanted to verify 🙂
Thanks for pinging OutrageousGiraffe8
I think I was able to reproduce.
the model is saved to ClearML as an output model when b is not a dictionary.
How did you make the example work with the automagic ?
Hi VexedCat68
can you supply more details on the issue ? (probably the best is to open a github issue, and have all the details there, so we have better visibility)
wdyt?
Hmm, you will have to set up the trains-server on a machine somewhere; it can be any machine, Windows / Mac / Linux.
can you tell me what the serving example is in terms of the explanation above, and what the Triton serving engine is?
Great idea!
This line actually creates the control Task (2):
clearml-serving triton --project "serving" --name "serving example"
This line configures the control Task (the idea is that you can do that even when the control Task is already running, but in this case it is still in draft mode).
Notice the actual model serving configuration is already stored on the crea...
I assume so 🙂 Datasets are kind of agnostic to the data itself; for the Dataset it's basically a file hierarchy
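For example, a minimal sketch of registering any folder of files as a Dataset (the project name, dataset name and local path below are just placeholders):
from clearml import Dataset

# create a new dataset version and add a local folder to it
# (project name, dataset name and path are placeholders)
ds = Dataset.create(dataset_project="examples", dataset_name="raw_files")
ds.add_files(path="/path/to/local/folder")
ds.upload()    # upload the files to the configured storage
ds.finalize()  # close this version so it can be used / extended later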
just got the pipeline to run
Nice!
Is using the default queue okay?
Using the default queue is fine. The other queue is the "services" queue; by default the trains-server runs an agent that pulls jobs from it.
In "services" mode an agent pulls jobs one right after the other (it does not wait for the previous job to finish), as opposed to a regular queue (any other queue), where the trains-agent pulls a job only after the previous one has completed.
It was set to true earlier, I changed it to false to see if there would be any difference, but it doesn't seem like it
I would actually just add:
Task.add_requirements('google.cloud')
before the Task.init call (notice, it has to come before the init call).
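To make the ordering concrete, a minimal sketch (the project/task names below are placeholders):
from clearml import Task

# requirements must be registered before Task.init is called,
# otherwise they will not end up in the Task's "Installed Packages"
Task.add_requirements('google.cloud')

task = Task.init(project_name='examples', task_name='my task')  # placeholder names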
Hmm should not make a diff.
Could you verify it still doesn't work with TF 2.4 ?
Hi @<1598487094601191424:profile|MysteriousCow84>
only one of them uses an already created venv from cache for this task. And the other node starts to re-create the same virtual environment.
Just to be clear: the second one is running, but it does not use the same venv as the other one (that is running in parallel), is that correct?
I'll try to go with this option, I think it's actually perfect for my needs
Great!
-e :user/private_package.git@57f382f51d124299788544b3e7afa11c4cba2d1f#egg=private_package
Is this the correct link to the repo and a valid commit id ?
Can you post a few more lines from the agent's log ?
Something is failing to install, I'm just not sure what.
The agent is installing the "Installed Packages" section of the Task (think of it as a requirements.txt)
And again, what do you have there? Is it the outcome of the Task.init auto populating it?
If this is the case then the easiest is:
from clearml.backend_api.session.client import APIClient
client = APIClient()
res = client.events.get_task_plots(task="<task-id>")
We should definitely have a nice interface 🙂
Check here:
https://github.com/allegroai/trains/blob/master/docs/trains.conf#L78
You can configure credentials based on the bucket name. Should work for Azure as well
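For reference, this is roughly the structure in trains.conf (bucket name and credential values below are placeholders):
sdk {
    aws {
        s3 {
            # per-bucket credentials (placeholder values)
            credentials: [
                {
                    bucket: "my-bucket"
                    key: "AWS_ACCESS_KEY_ID"
                    secret: "AWS_SECRET_ACCESS_KEY"
                }
            ]
        }
    }
}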
And you cannot see it in Trains UI?
Hmm yes we should probably provide metrics:
client.workers.get_stats(..., items=[dict(key='cpu_usage'), dict(key='gpu_usage')])
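For context, a rough sketch of what a full call could look like via the APIClient; the from_date / to_date / interval arguments are my assumptions based on the workers.get_stats REST endpoint:
from time import time
from clearml.backend_api.session.client import APIClient

client = APIClient()
# last hour of worker stats, averaged over 60-second buckets
# (parameter names assumed from the workers.get_stats REST endpoint)
res = client.workers.get_stats(
    from_date=time() - 3600,
    to_date=time(),
    interval=60,
    items=[dict(key='cpu_usage'), dict(key='gpu_usage')],
)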
I'm assuming you mean for the clients, right?
GiddyTurkey39
BTW: you can always add the missing package via code:
Task.add_requirements('torch', optional_version)
VivaciousWalrus99
Yes this is odd:
1608392232071 spectralab:gpu0 DEBUG New python executable in /cs/usr/gal.hyams/.trains/venvs-builds/3.7/bin/python2
So it thinks it has python v3.7 but it is using python2 in the venv...
In your trains.conf file, set agent.python_binary to the python3.7 binary. It should be something like:
agent.python_binary=/path/to/python/python3.7
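In trains.conf this sits under the agent section, e.g. (the path below is a placeholder for wherever your python3.7 binary lives):
agent {
    # interpreter used when the agent builds the venv for a task
    python_binary: "/usr/bin/python3.7"
}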
LOL 🙂
Make sure that when you train the model or create it manually you set the default "output_uri"
task = Task.init(..., output_uri=True)
or
task = Task.init(..., output_uri="s3://...")
Hi GiddyTurkey39
First, yes you can just edit the "installed packages" section and add any missing package (this is equivalent to a requirements.txt)
I wonder why trains failed to detect the "bigquery" package in the first place... Any thoughts ?
If I point directly to the data.yaml, the training starts without any problem
what do you mean? how do you know where the extracted file is?
basically:
data_path = Dataset.get(...).get_local_copy()
then you should be able to open your file with open(data_path + "/data.yaml", "rt")
does that work?
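Putting it together, roughly (the dataset project/name below are placeholders for however you registered it):
from clearml import Dataset

# fetch a local, read-only copy of the dataset (names are placeholders)
data_path = Dataset.get(dataset_project="examples", dataset_name="my_dataset").get_local_copy()

# the extracted files keep their original hierarchy under data_path
with open(data_path + "/data.yaml", "rt") as f:
    print(f.read())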
It should actually work the same, if you find out it fails to properly register let me know (and then I guess a github issue is the next step)
Hi SmallDeer34
ClearML automagical logging will work on the current python process. But in your example your Bash is running another python script (that has nothing to do with the original notebook), hence clearml automagic is not aware of it (i.e. it cannot "patch" the tensorboard calls).
In order to make it work, you should do something like:
from joeynmt import train
train.main(...)
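i.e. something along these lines inside the notebook itself (a sketch; the project/task names and the argument to train.main are placeholders, pass whatever your joeynmt training entry point expects):
from clearml import Task
task = Task.init(project_name="examples", task_name="joeynmt training")  # placeholder names

# call the training code in the same python process so the automagic can patch it
from joeynmt import train
train.main("configs/my_config.yaml")  # placeholder arguments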
Or something similar 🙂
Make sense ?