Hi DeliciousBluewhale87
You can achieve the same results programmatically with Task.create
https://github.com/allegroai/clearml/blob/d531b508cbe4f460fac71b4a9a1701086e7b6329/clearml/task.py#L619
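For example, something along these lines (a minimal sketch; the repo/branch/script values are placeholders, and the exact keyword set depends on your clearml version):
```python
from clearml import Task

# create a draft task directly from a repository (placeholder values)
task = Task.create(
    project_name='examples',
    task_name='my programmatic task',
    repo='https://github.com/your-org/your-repo.git',
    branch='master',
    script='train.py',
)
# then enqueue it for an agent to pick up ('default' is a placeholder queue name)
Task.enqueue(task, queue_name='default')
```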
Why? The task should have completed successfully, how is this aborting?
Early stopping by the HPO process, e.g. Hyperband: "this training run is going nowhere, let's stop it."
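For reference, this is roughly how those early-stopping knobs are wired when building the optimizer (a sketch based on the standard clearml HPO example; the task id, queue, metric names, and parameter range are all placeholders):
```python
from clearml.automation import HyperParameterOptimizer, UniformParameterRange
from clearml.automation.optuna import OptimizerOptuna

optimizer = HyperParameterOptimizer(
    base_task_id='<template_task_id>',  # placeholder: the task cloned per trial
    hyper_parameters=[
        UniformParameterRange('General/learning_rate', min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title='validation',  # placeholder metric
    objective_metric_series='accuracy',
    objective_metric_sign='max',
    optimizer_class=OptimizerOptuna,
    execution_queue='default',            # placeholder queue
    total_max_jobs=10,
    # under-performing trials may be aborted anywhere inside this iteration window
    min_iteration_per_job=1000,
    max_iteration_per_job=10000,
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```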
WickedGoat98
The trains-agent-services docker is always CPU; the idea is to put long-lasting services there (like auto cleanup, Slack integration, HPO, etc.)
To spin up an agent with GPU on any machine (regardless of where the trains-server is), check the trains-agent readme.
https://github.com/allegroai/trains-agent#running-the-trains-agent
Think I will have to fork and play around with it
NICE! (BTW: if you manage to get it working I'll be more than happy to help push the PR)
Maybe the quickest win is to store just the .py as a model?
Can you test with the credentials also in the global section?
key: "************"
secret: "********************"
Also, what's the clearml python package version?
Local changes are applied before installing requirements, right?
correct
BTW: I'm assuming that args is not the ArgumentParser object, as the ArgumentParser is automatically "connected"?
Hmm, you are correct
Which means this is some conda issue; basically, when installing from an env file, conda is not resolving the correct pytorch version 😞
Not sure why... Could you try to upgrade conda?
Hi ElegantCoyote26
is there a way to get a Task's docker container id/name?
you mean like `Task.get_task("task_id_here").get_base_docker()`?
Oh, a Task's results page also has a plot for this, but I guess it's at the machine level and not the task level?
This is actually on the container level, meaning checked from inside the container. It should be what you are looking for
It might be that the file upload was broken?
Still not supported 😞
That would be great! Might have to use `2>/dev/null` in some of my bash scripts
Feel free to test and PR :)
One other question regarding connecting: we have set up sshd inside the docker image we are using.
Actually the remote session opens port 10022 on the host machine (so it does not collide with the default ssh port)
It actually runs an additional sshd inside the docker, setting its port.
And the clearml-session will ssh directly into the container sshd...
This is what I just used:
```python
import os
from argparse import ArgumentParser

from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense, Softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint

from clearml import Task

parser = ArgumentParser()
parser.add_argument('--output-uri', type=str, required=False)
args = parser.parse_args()
```
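(The paste cuts off there; a guess at the continuation, assuming the argument feeds Task.init, with placeholder project/task names, not the original code:)
```python
# hypothetical continuation: use the CLI value as the task's output destination
task = Task.init(
    project_name='examples',   # placeholder
    task_name='keras mnist',   # placeholder
    output_uri=args.output_uri,
)
```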
or do you mean the machine I ran the experiment locally?
Yes this one
How so? Installing a local package should work, what am I missing?
It completed after the max_job limit (10)
Yep, this is Optuna "testing the water"
I could improve the cost-efficiency of my provisioned GCP A100 instances
But their pricing is linear; if you do not need an A100, get a cheaper instance, no?
WickedGoat98 Actually the fileserver replied, so it all looks fine to me.
Try to run the text example again, see if you are still getting the fileserver error.
BeefyHippopotamus73 this error seems like it is coming from boto3. Are you sure the credentials are properly configured and that you have read permission?
Okay. And `110` means 11.1 and not 11.0? (edited)
110 means 11.0; the odd thing is, it actually installed 11.1, and from the pytorch website this is exactly how they suggest installing with conda...
Let me know if forcing the CUDA version changes anything
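(e.g. by pinning it in the agent's clearml.conf; a sketch assuming you want 11.1:)
```
agent {
    # 0 = auto-detect; 111 forces package resolution for CUDA 11.1
    cuda_version: 111
}
```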
The problem is that clearml installs `cudatoolkit=11.0` but `cudatoolkit=11.1` is needed.
You suggested this fix earlier, but I am not sure why it didn't work then.
Hmm, could you test with clearml-agent 0.17.2? Making sure this actually solves the problem.
ComfortableShark77 it seems clearml-serving is trying to upload data to a different server (not download the model).
I'm assuming this has to do with CLEARML_FILES_HOST and missing credentials. It has nothing to do with downloading the model (which, as you posted, will be from the s3 bucket).
Does that make sense?
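If that's it, pointing the files server at the right host should fix it. A sketch of the relevant clearml.conf section (the host value is a placeholder; the CLEARML_FILES_HOST environment variable overrides it):
```
api {
    # placeholder: set to your actual fileserver address
    files_server: "http://your-clearml-server:8081"
}
```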
RoughTiger69
1. Move the files locally (i.e. based on the example, move folder b into folder a)
2. Create a new version with two parents ('a' and 'b')
3. Sync the local root folder ('a' in your case)

Only the meta-data should change (because the referenced files are already in one of the datasets), see the sketch below. wdyt?
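Something like this (a sketch with the Dataset API; the project, name, and ids are placeholders):
```python
from clearml import Dataset

# create the merged version with both existing datasets as parents (placeholder ids)
merged = Dataset.create(
    dataset_project='examples',
    dataset_name='a_plus_b',
    parent_datasets=['<dataset_a_id>', '<dataset_b_id>'],
)
# sync against the local root folder ('a', which now also contains 'b');
# files already referenced by a parent are not re-uploaded, only the meta-data changes
merged.sync_folder(local_path='/path/to/a')
merged.upload()
merged.finalize()
```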