Okay, found the issue. To disable SSL verification globally, add the following env variable:
CLEARML_API_HOST_VERIFY_CERT=0
(I will make sure we fix the actual issue with the config file)
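For reference, a minimal sketch of setting that variable from Python before the SDK creates its API session (setting it in the shell before launching works just as well; the project/task names below are placeholders):

import os

# must be set before the ClearML SDK initializes its API session
os.environ["CLEARML_API_HOST_VERIFY_CERT"] = "0"

from clearml import Task

task = Task.init(project_name="examples", task_name="ssl-verify-disabled")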
I see...
Currently (and this will change soon) the entire delta is stored in a single file, so there is no real way to download a "subset" of the data, only a parent version 🙂
Let's say that this small dataset has an ID ....
Yes this would be exactly the way to do so:
param = {'dataset': small_train_dataset_id_here}
task.connect(param)
dataset_folder = Dataset.get(param['dataset']).get_local_copy()
... Locally it will use the small_train_dataset_id_here, then whe...
Do people use ClearML with huggingface transformers? The code is std transformers code.
I believe they do 🙂
There is no real way to differentiate between "storing a model" using torch.save
and storing configuration ...
Hi StrangeStork48
I have good news, v1.0 is out with hashed passwords support.
Yes, that should work. The only thing is you need to call Task.init on the master process (and make sure you call Task.current_task() on the subprocesses, if you want the automagic to kick in). That said, usually there is no need, they are supposed to report everything back to the main one anyhow.
basically
@call_parse
def main(
    gpus:Param("The GPUs to use for distributed training", str)='all',
    script:Param("Script to run", str, opt=False)='',
    args:Param("Args to pass to script", nargs=...
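To illustrate the Task.init / Task.current_task() point above, here is a minimal sketch (assuming torch.multiprocessing spawns the workers; the project/task names are placeholders):

from clearml import Task
import torch.multiprocessing as mp


def worker(rank):
    # subprocesses pick up the task that was created by the master process
    task = Task.current_task()
    if task:
        task.get_logger().report_scalar("worker", "rank", value=rank, iteration=0)


if __name__ == "__main__":
    # Task.init is called once, on the master process only
    Task.init(project_name="examples", task_name="distributed-example")
    mp.spawn(worker, nprocs=2)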
(or woman or in between, we are supportive as long as code is working 🙂 )
but still clearml-agent will raise the same error
which one?
Hi WickedGoat98
Regardless of the ingress configuration (which it seems you have the hang of), the API instance itself needs to be configured with a persistent volume (the web / file server do not need direct access to the API server).
Can you get the API to run properly?
Regarding the trains-agent:
once you have the API/Web/File server configured, you can configure it like the trains-agent-services is configured inside the docker-compose (e.g. set the environment variable with the c...
For example, ServerA stores files at /opt/clearml but ServerB stores at /some_path/clearml
As long as you adjust your docker-compose yaml file, should be just fine
check the latest RC, it solved an issue with dataset uploading,
Let me check if it also solved this issue
Hi FlatOctopus65
You are almost there:
prev_task: Task = Task.get_task(task_id=<prev_task_id_here>)
model = prev_task.models['output'][-1]
my_check_point = model.get_local_copy()
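If the checkpoint was created with torch.save, loading it locally could look like this (a small sketch, not tied to any specific framework version):

import torch

# my_check_point is the local file path returned by model.get_local_copy()
state_dict = torch.load(my_check_point, map_location="cpu")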
Hi TenderCoyote78
I'm trying to run clearml-agent in my dockerfile,
I'm not sure I'm following. Are you trying to create a docker container containing the agent inside? For what purpose?
(notice that the agent can spin up any off-the-shelf container, there is no need to add the agent into the container, it will take care of itself when it is running it)
Specifically regarding your dockerfile:
RUN curl -sSL ... | sh
No need for this line
COPY clearml.conf ~/clearml.conf
Try the ab...
So agents on different nodes will probably require different cuda-version images.
That makes sense SarcasticSquirrel56
I would edit the helm chart (or deploy manually) based on a selector that will select the different nodes/gpus and assign the correct containers (i.e. matching CUDA versions to the different GPUs / drivers)
BTW: you can also play around with the k8s glue, which would dynamically spin up pods based on clearml Tasks.
wdyt?
So like a UI for creating pipelines doing different things on the different solutions?
SubstantialElk6 This seems to be the issue:
cp: failed to access '/root/default_clearml.conf': Permission denied
clearml_agent: ERROR: Could not find task id=024a421c0e174650a1c7ff64af756c26 (for host: )
Notice it seems it just cannot read the clearml.conf, wdyt?
Hi JoyousElephant80
Another possibility would be to run a process somewhere that periodically polls ClearML Server for tasks that have recently finished
this is the easiest way to implement what you are after, and have full control over the logic itself.
Basically you inherit from the Monitor class and implement the callback function:
https://github.com/allegroa...
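As a rough sketch of that pattern (assuming the Monitor base class from clearml.automation.monitor and its process_task callback, as in the linked example; the print is just a stand-in for your own logic):

from clearml.automation.monitor import Monitor


class FinishedTaskMonitor(Monitor):
    # called for each task the monitor detects as newly finished
    def process_task(self, task):
        print("Task finished:", task.id, task.name, task.status)


if __name__ == "__main__":
    # blocking loop, polls the ClearML Server every 60 seconds
    FinishedTaskMonitor().monitor(pool_period=60.0)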
This is the reason you are getting an error 🙂
Basically the session asks the agent to set up a new SSH server with credentials on the remote machine. This is not an issue inside a container, as it is an isolated environment, but when running in venv mode the user running the agent is not root, hence it cannot spin up / configure an SSH server.
Make sense ?
BattyLion34 is this running with an agent?
What's the comparison with a previously working Task (in terms of python packages) ?
Okay, that makes sense. If this is the case, I'm assuming you have set the files server to point to your S3 bucket, is that correct?
Could it be you are missing the credentials for that? (it is trying to upload the preprocessing code there, so the clearml-serving container would be able to pull it later)
LittleShrimp86 did you try to run the pipeline form the UI on remote machines (i.e. with the agents)? Did that work?
sets up the venv correctly, prints
Starting Task Execution:
then does nothing
Can you provide a log?
Do you see the code/git reference in the Pipeline Task details - Execution Tab ?
Hi ScaryKoala63
Sure, add the following to your clearml.conf:
sdk.storage.cache.default_cache_manager_size = 400
I think you are correct, it seems like for some reason you hit the cache limit, and a previous entry was deleted
RoughTiger69 the easiest thing would be to use the override option of Hydra:
parameter_override={'Args/overrides': '[the_hydra_key={}]'.format(a_new_value)}
wdyt?
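For context, a hedged sketch of one place such an override can be passed, assuming a PipelineController step (the project/task names, the_hydra_key, and a_new_value are placeholders):

from clearml import PipelineController

a_new_value = "some_value"  # placeholder for the value you want to inject

pipe = PipelineController(name="hydra-pipeline", project="examples", version="1.0.0")
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="hydra training task",
    # overrides the Hydra composition of the cloned step at runtime
    parameter_override={'Args/overrides': '[the_hydra_key={}]'.format(a_new_value)},
)
pipe.start()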
After testing the code again, I see the task parameter dictionary has been removed properly
Great!
However, I still have the same problem with duplicate tasks, as you can see in the image.
Any chance the pipeline script itself is running from the agent (as opposed to running the pipeline code locally, with the pipeline steps then executed on the agent)?
BTW: What's the TF / Keras version?