AgitatedDove14

49 Questions, 8126 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8126

0 Hi! Does Clearml Have A Way To Turn On/Off Virtual Machines Depending If There Are Experiments On Queue?

Let me know if I can be of help 🙂

4 years ago

0 Hello Everybody, I Would Like To Start Off By Saying That I Absolutely Love Clearml. I Am Getting Familiar With Clearml Datasets And I Have A Quick Question. Is It Possible To Download Individual Files From A Dataset Without Downloading The Entire Datase

I would like to start off by saying that I absolutely love clearml.

@<1547028031053238272:profile|MassiveGoldfish6> thank you for saying that! 😍

Is it possible to download individual files from a dataset without downloading the entire dataset? If so, how do you do that?

Well by default files are packaged into multiple zip files, you can control the size of the zip file for finer granularity, but at the end when you download, you are downloading the entire packaged ...

2 years ago

0 Hi, I Assume It Is Very Basic But How Can I Add The Model That Is Created In The Training To The Artifacts And To See It In The Models Tab?

so I didn't have much time to upgrade all the packs because I have some issues with that but it is on my todo list

No worries 🙂

Quick question, if you run https://github.com/allegroai/trains/blob/master/examples/frameworks/keras/legacy/keras_tensorboard.py
Do you see models in the artifacts tab?

5 years ago

0 Hi All, I'M Trying To Deploy Trains On Rancher (Nice Kubernetes Cluster Orchestration Project) Where I'M Quite New To Rancher And Kubernetes. I Have Been Able To Install Trains Using Helm

WickedGoat98 nice!!
Can you also pass the login screen (i.e. can you access the API server)?

5 years ago

0 In Ui Under Execution Tab, I See That The Trains Has

Let me know if you managed to get it working, then we can see if we can detect it automatically.

5 years ago

0 For Clearml Serving, If I Am Trying To Deploy 100 Models On A Gpu That Can Handle 5 Concurrently, But Each One Will Be Sporadically Used (Fine Tuned Models Trained For Different Customers), Can Clearml-Serving Automatically Load And Unload Models Based Up

Hi @<1523711619815706624:profile|StrangePelican34>

if I am trying to deploy 100 models on a GPU that can handle 5 concurrently,

The main limitation is Triton's ability to dynamically load / unload models. We know Nvidia is adding this capability, but I think it is still not out. Once they support it, it should be transparent.

2 years ago

0 If Possible, I Would Like All Together Prevent The Fileserver And Write Everything To S3 (Without Needing Every User To Change Their Config)

i had a misconception that the conf comes from the machine triggering the pipeline

Sorry, this one :)

3 years ago

0 Hi, I Am Trying To Use The Config Values From A Experiment, But

Hi SkinnyPanda43
Are you trying to access the same Task or an external one ?

4 years ago

0 We Have A Environment Variables Definitions.Py File Which Every User Configures On Their Local Machine. This File Includes Local Paths As Well As Aws/Api Credentials. This Is An Issue When Spinning Up Clearml Tasks Since It Is Not Included In The Git Repo

that's the downside

3 years ago

0 Trying To Setup A Trains-Agent Worker On A Remote Machine; When I Run Trains-Init And Follow The Steps To Give It Credentials For Our Trains Server I Get This

web / api / files

4 years ago

0 Hi! Had A Basic Question: I Want To Retrieve All Tasks Created By A Clearml User Id (Using Task.Get_Tasks() And Filter). Is It Possible To Get User Id Of The Current User Configured In The Clearml.Config Using Clearml Python Api? Thanks In Advanced!

Hi @<1529633468214939648:profile|CostlyElephant1>

Is it possible to get user ID of the current user

On the Task.data object itself there should be a field named "user"; that's the user ID of the owner (creator) of the Task.
You can filter based on this id with

Task.get_tasks(..., task_filter={'user': ["user-id-here"]})

wdyt?
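A minimal sketch of the filter above, for illustration only. It assumes the owner ID is read from the `user` field on `Task.data` of a task you already know; the helper names `build_user_filter` and `tasks_created_by_same_user` are hypothetical, and the second function needs the `clearml` package and a reachable server.

```python
from typing import Any, Dict, List


def build_user_filter(user_ids: List[str]) -> Dict[str, Any]:
    """Build the task_filter dict from the answer for one or more user IDs."""
    return {"user": list(user_ids)}


def tasks_created_by_same_user(reference_task_id: str) -> list:
    """Return the tasks owned by the creator of `reference_task_id`.

    The clearml import is kept local so the filter helper above can be
    used (and tested) without a configured ClearML server.
    """
    from clearml import Task

    reference = Task.get_task(task_id=reference_task_id)
    # Assumption taken from the answer: Task.data carries a "user" field
    # holding the owner (creator) ID.
    owner_id = reference.data.user
    return Task.get_tasks(task_filter=build_user_filter([owner_id]))
```

Passing a list lets the same filter match several users at once.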

one year ago

0 Can Someone Point Me Whether/How The Services-Agent The Starts With The Clearml-Server Mounts The

BTW: the agent will resolve pytorch based on the installed CUDA version.

4 years ago

0 Hello, I Don'T Really Like The Idea Of Providing My Own Github Credentials To The Clearml Agent. We Have A Local Clearml Deployment. Is There A Way To:

Hi @<1573119962950668288:profile|ObliviousSealion5>

Hello, I don't really like the idea of providing my own github credentials to the ClearML agent. We have a local ClearML deployment.

If you own the agent, that should not be an issue, no?

forward my SSH credentials using

ssh -A

and then starting the clearml agent?

When you are running the agent and you force git cloning with SSH, it will automatically map the .ssh folder into the container for git to use

Ba...

2 years ago

0 If Possible, I Would Like All Together Prevent The Fileserver And Write Everything To S3 (Without Needing Every User To Change Their Config)

Yes... I think that this might be a bit much automagic even for clearml 😄

3 years ago

0 If Possible, I Would Like All Together Prevent The Fileserver And Write Everything To S3 (Without Needing Every User To Change Their Config)

that does happen when you create a normal local task, that's why i was confused

The parts that are not passed in both cases are the configurations from the conf file. Only the environment is passed (e.g. git, python packages, etc.). For example, if you have storage credentials in your conf file, they are not passed to a remote agent; instead, the credentials from the remote agent are used when it runs the task.
make sense?

3 years ago

0 Has Anyone Got Any Experience With C++ Extensions In Python When Using Clearml? In Our Setup.Py We Have:

So could it be that pip install --no-deps . is the missing issue ?
what happens if you add to the installed packages "/opt/keras-hannd" ?

3 years ago

0 Hi There, I'Ve Encountered A Problematic Behavior In Python. When Defining An Argument A Default Value Of

btw: both should work fine

5 years ago

0 Hello, I Don'T Really Like The Idea Of Providing My Own Github Credentials To The Clearml Agent. We Have A Local Clearml Deployment. Is There A Way To:

owning the agent helps, but still it's much better if the credentials don't show up in logs,

They are not, they are always filtered out,

how does force_git_ssh_protocol help please? it doesn't solve the issue of the agent simply not having access

It automatically maps the host .ssh into the container, so that git can use SSH to clone.
What exactly is not working?
and how are you configuring it?

2 years ago

0 Hello Everyone, I Have A Quick Question, I Am Using Clearml For An Ml Experiment Tracking Project. As Is, Clearml Is Saving A Version Of My Model After Each Epoch. Is There A Way For Clearml To Simply Save The Model Once Training Is Done And To Ignore The

Hi @<1547028031053238272:profile|MassiveGoldfish6>

Is there a way for ClearML to simply save the model once training is done and to ignore the model checkpoints?

Yes, you can simply disable the auto logging of the model and manually save the checkpoint:

task = Task.init(..., auto_connect_frameworks={'pytorch': False})
...
task.update_output_model("/my/model.pt", ...)

Or for example, just "white-label" the final model

task = Task.init(..., auto_connect_frameworks={'pyt...

one year ago

0 I Cannot Get Clearml-Agent With Docker Containers To Work. Clearml Uses

Can you clone the git with the .ssh credentials on the host machine ?
If so, can you do the same manually inside a docker (i.e. spin a docker with mount -v /home/hostuser/.ssh:/root/.ssh) ?

4 years ago

0 Hi Guys, With The New Venv Caching Available In Clearml, I Have The Following Problem: I Force My Pip Requirements To Be:

JitteryCoyote63

So there will be no concurrent cached files access in the cache dir?

No concurrent creation of the same entry 🙂 It is optimized...

4 years ago

0 So, I Did A Slew Of Pretrainings, Then Finetuned Those Pretrained Models. Is There A Way To Go Backwards From The Finetuning Task Id To The Pretraining Task Id? What I Tried Was:

SmallDeer34 the function Task.get_models() incorrectly returned the input model "name" instead of the object itself. I'll make sure we push a fix.

I found a different solution (hardcoding the parent tasks by hand),

I have to wonder, how does that solve the issue ?

4 years ago

0 How Can I Add My Requirements.Txt File To The Pipeline Instead Of Each Tasks?

but actually that path doesn't exist and it is giving me an error

So you are saying you only uploaded the "meta-data" i.e. a text file with links to the files, and this is why it is missing?

Is there a way to change the path inside the .txt file to clearml cache, because my images are stored in clearml cache only

I think a good solution would be to store the path in the txt file as relative path, i.e. instead of /Users/adityachaudhry/data/folder... as ./data/folder
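The relative-path idea can be sketched as below. This is only an illustration: the helper names `to_relative` and `resolve` are hypothetical, and the base folder used when loading is assumed to be wherever the dataset ends up locally (for example, the folder of a local dataset copy).

```python
from pathlib import PurePosixPath


def to_relative(path: str, root: str) -> str:
    """Rewrite an absolute path as './...' relative to `root`.

    Paths that are not located under `root` are returned unchanged.
    """
    try:
        rel = PurePosixPath(path).relative_to(PurePosixPath(root))
    except ValueError:
        return path
    return "./" + rel.as_posix()


def resolve(relative_path: str, base: str) -> str:
    """Turn a stored relative path back into a full path under `base`."""
    return str(PurePosixPath(base) / relative_path)


# When writing the .txt file: strip the machine-specific prefix.
stored = to_relative("/Users/someone/data/folder/img.png", "/Users/someone")

# When reading it on another machine: join with the local dataset folder.
local = resolve(stored, "/cache/ds")
```

Storing only the relative part keeps the text file valid on any machine, since the machine-specific prefix is supplied at load time.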

2 years ago

0 Hi There! Can Anybody Help Me With Specifying The 'Platform' For A Model In Clearml-Serving. I Am Using The K8S Clearml-Serving Setup (Version 1.3.1). I Already Tried A Bunch Of Variants Like

I'm assuming those errors are from the Triton containers? Were you able to run the simple PyTorch MNIST serving example from the repo?

one year ago

0 My Nth Question For The Day

What’s the general pattern for running a pipeline - train model, evaluate metrics and publish the model if satisfactory (based on a threshold, for example)

Basically I would do:
parameters for pipeline:
TaskA = Training model Task (think of it as our template Task)
Metric = title/series/sign we want to choose based on, where sign is max/min
Project = Project to compare the performance so that we could decide to publish based on the best Metric.

Pipeline:
Clone TaskA Change TaskA argu...
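As a rough illustration of the "publish if satisfactory" decision only (not the full pipeline), the comparison on the Metric sign could look like the sketch below. The function names and the `threshold` parameter are hypothetical; the metric values are assumed to have been collected beforehand from the new task and from the other tasks in the Project.

```python
from typing import Iterable


def is_better(candidate: float, reference: float, sign: str) -> bool:
    """Compare two metric values according to sign ('max' or 'min')."""
    if sign == "max":
        return candidate > reference
    if sign == "min":
        return candidate < reference
    raise ValueError("sign must be 'max' or 'min'")


def should_publish(candidate: float, previous: Iterable[float],
                   sign: str, threshold: float) -> bool:
    """Publish only if the candidate beats the threshold and every
    previously recorded value of the same metric in the project."""
    if not is_better(candidate, threshold, sign):
        return False
    return all(is_better(candidate, value, sign) for value in previous)
```

Keeping the decision in a pure function makes it easy to check on its own before wiring it into a pipeline step.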

4 years ago

0 Hi, I Have A Pre-Processing Steps Not Been Implemented In Python, But Being A Shell Script Calling Wget To Synchronize Data And Creating Intermediate Sqlite Dbs By A Script Been Implemented In 'R' And Would Like To Ask, If Trains Can Be Used Just To Trigg

WickedGoat98 if this is the case, you can check this example. Same idea only "manual":
https://github.com/allegroai/trains/blob/master/examples/automation/task_piping_example.py

5 years ago

0 Does Anyone Have Experience With Integrating Clearml And Slurm? If So, What Pattern Did You Use? (Did You Submit Tasks And Just Use Clearml As Tracker, Or Did You Start Agents With Slurm?) Would Love To Hear From The Community Before Trying To Diy

The difference is that running the agent in daemon mode means the "daemon" itself is a job in SLURM.
What I was saying is pulling jobs from the clearml queue and then pushing them as individual SLURM jobs, does that make sense ?

9 months ago

0 Anyone Deployed Trains On Azure, I Am Interested To Know About Your Experience.

For setting up trains-server I would recommend docker-compose; it is very easy to set up, and you just need a single fixed compute instance. Details: https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md
With regards to the "low prio clusters", are you asking how they could be connected with the trains-agent or if running code that uses trains will work on them?

5 years ago

0 Does Clearml-Session Work In A Kubernetes Environment?

Hi TrickySheep9
Long story short, clearml-session fully supports k8s (using k8s glue)
The --remote-gateway option alongside ports mode will basically allow you to set up a k8s service so that every session registers with a specific port, so k8s does the ingress for you and routes the SSH connection to the pod itself. Everything else is tunneled over the original SSH connection.
Make sense ?

4 years ago

0 Getting A Super Weird Error. Everything Works Fine On Local, When Trying To Run On Remote, Getting This Error Failing To Apply The Git Diff

WackyRabbit7 hmmm, seems like there is a non-regular character inside the diff.
Let me check something

5 years ago
