AgitatedDove14

48 Questions, 8051 Answers

Active since 10 January 2023

Last activity 8 months ago

Reputation

Badges 1

25 × Eureka!



0 Votes 0 Answers 1K Views

Is Your Server Using HTTPS?!

Is your server using HTTPS?!

clearml

4 years ago


0 Votes 0 Answers 1K Views

<!here> Gals/Guys/:robot_face: If you have ideas on improving the Slack Monitoring service, please add them on the dedicated Github Issue : <https://github.com/allegroai/trains/issues/161> For example: generate an alert if my experiment reaches a certain

Gals/Guys/ :robot_face: If you have ideas on improving the Slack Monitoring service, please add them on the dedicated Github Issue : https://github.com/alleg...

clearml

4 years ago


0 Votes 3 Answers 505 Views

These Are Xgboost Internal Metrics That Are Automatically Picked By Clearml

@<1523703325881536512:profile|ConvolutedSealion94> these are xgboost internal metrics that are automatically picked by clearml

xgboost

2 years ago


0 Votes 0 Answers 1K Views

Hi Guys/Gals, If You Want To Check Out The Latest RC We Have 0.15.0rc0 Out :

Hi Guys/Gals, If you want to check out the latest RC we have 0.15.0rc0 out : pip install trains==0.15.0rc0 pip install trains-agent==0.15.0rc0 Many of the impr...

clearml

4 years ago


0 Votes 1 Answers 601 Views

Lstmeow Is Back! Bots/Gals/Guys Feel Free To

LSTMeow is back! Bots/Gals/Guys feel free to 👍

clearml

4 years ago


0 Votes 0 Answers 1K Views

Well To Be Honest, We Kind Of Thought It's Redundant. Basically Storing Artifacts In Experiments And Having Them Retrieved Quickly From The Code Itself Was Way More Convenient For Us Than To Manually Have To Do Clone/Pull Of The Data... Example: Create Da

Well to be honest, we kind of thought it's redundant. Basically storing artifacts in experiments and having them retrieved quickly from the code itself was w...

clearml

4 years ago


0 Votes 10 Answers 611 Views

Happy Friday Everyone ! We Have A New Repo Release We Would Love To Get Your Feedback On

Happy Friday everyone ! We have a new repo release we would love to get your feedback on 🚀 🎉 Finally easy FRACTIONAL GPU on any NVIDIA GPU 🎊 Run our nvidi...

clearml

9 months ago


0 Votes 0 Answers 1K Views

@PunySquid88 I'm not very familiar with what they do, but it seems that although it has a backend server as an option, it will mostly target single users with what seems like an easy to use single app. From the Reddit thread it seems that it is still not

PunySquid88 I'm not very familiar with what they do, but it seems that although it has a backend server as an option, it will mostly target single users with...

clearml

4 years ago


0 Votes 0 Answers 1K Views

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of Trains :smile_cat: ) <https://twitter.com/PyTorch/status/1272919483980500999>

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of...

clearml

4 years ago


0 Votes 0 Answers 1K Views

<!channel> *important notice* : it seems Nvidia broke some of their PPA's security :confused: , causing `apt-get update` to fail inside containers. This in turn will cause `clearml-agent` to fail with specific Nvidia containers. _If you are seeing simila

important notice : it seems Nvidia broke some of their PPA's security 😕 , causing apt-get update to fail inside containers. This in turn will cause clearml...

clearml

2 years ago


0 Votes 0 Answers 1K Views

Hi Guys! I Have Great News, We Finally Fully Implemented Support For Continuing Previously Trained Models

Hi Guys! I have great news, we finally fully implemented support for continuing previously trained models 🎉 Here is a quick example (this is torch, but any ...

clearml

4 years ago


0 Votes 6 Answers 1K Views

Hi ! ClearML Server + SDK v1.9.0 is out! 🎉 🚀 🎊 Happy Holidays and Happy New Year! ❇️ 🎇 🎄

clearml

2 years ago


0 Votes 1 Answers 1K Views

Quick Note: V1.3.1 Caused Pipelinedecorator Tasks To By Default Disable The Automagic Frameworks Connection, This Bug Is Solved In The Latest Rc

Quick note: v1.3.1 caused PipelineDecorator Tasks to by default disable the automagic frameworks connection, this bug is solved in the latest RC pip install ...

clearml

2 years ago


0 Votes 2 Answers 1K Views

Hi ! trains 0.16.2 is finally out with the new pipelines interface! Check out the new example https://github.com/allegroai/trains/blob/master/examples/pipeli...

clearml

4 years ago


0 Votes 0 Answers 977 Views

<!everyone> Trains v0.14.2 is out (<https://github.com/allegroai/trains/releases/tag/0.14.2|Change log>) Highlights: <https://github.com/allegroai/trains/blob/master/trains/storage/manager.py#L13|trains.storage.StorageManager> - with caching for any http

Trains v0.14.2 is out ( https://github.com/allegroai/trains/releases/tag/0.14.2 ) Highlights: https://github.com/allegroai/trains/blob/master/trains/storage/...

clearml

4 years ago


0 Votes 0 Answers 1K Views

Hey <!here> Just a heads up, starting *Jan 25th*, the default <http://demoapp.demo.clear.ml/|ClearML demo server> will move to a *daily* reset cycle (replacing the current weekly cycle). Anybody needing more than 24h data retention is welcome to use our <

Hey Just a heads up, starting Jan 25th , the default http://demoapp.demo.clear.ml/ will move to a daily reset cycle (replacing the current weekly cycle). Any...

clearml

3 years ago


0 Votes 3 Answers 628 Views

We Recently Released A New Version Of

we recently released a new version of clearml-session with Persistent Workspace support! 🚀 🎉 Finally you can develop on remote machines with workspace fold...

remote-ssh

9 months ago


0 Votes 0 Answers 1K Views

YEY!!!! *Download as CSV* :exploding_head:

YEY!!!! Download as CSV 🤯

clearml

2 years ago


0 , This Is A Great Tool For Visualizing All Your Experiments. I Wanted To Know That When I Am Logging Scalar Plots With Title As Train Loss And Test Loss They Are Getting Displayed As Train Loss And Test Loss In The Scalar Tab. I Wanted That The Title Shoul

Create one experiment (I guess in the scheduler):
task = Task.init('test', 'one big experiment')
Then make sure the scheduler creates the "main" process as a subprocess (basically the default behavior).
Then the subprocess can call Task.init and it will get the scheduler Task (i.e. it will not create a new task). Just make sure they all call Task.init with the same task name and the same project name.

4 years ago
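
A minimal sketch of the pattern described above, not the author's actual code. It assumes the current `clearml` package name (the original thread dates from the `trains` era) and a configured ClearML server; the helper names `init_task`, `worker` and `main` are hypothetical. The import is done lazily inside `init_task` so the module itself loads without ClearML installed.

```python
import multiprocessing

# Every process must use the same project and task name (per the answer above).
PROJECT_NAME = "test"
TASK_NAME = "one big experiment"


def init_task():
    """Call Task.init with the shared project/task name.

    Assumption: in the main process this creates the Task; in a subprocess
    spawned from it, the same call returns the main Task instead of a new one.
    """
    from clearml import Task  # lazy import, requires `pip install clearml`

    return Task.init(project_name=PROJECT_NAME, task_name=TASK_NAME)


def worker(index):
    """Subprocess entry point: attach to the shared Task and log a line."""
    task = init_task()
    task.get_logger().report_text(
        "worker %d attached to task %s" % (index, task.id)
    )


def main(num_workers=4):
    """Create the Task in the main process first, then launch the workers."""
    init_task()
    procs = [
        multiprocessing.Process(target=worker, args=(i,))
        for i in range(num_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Calling `main()` needs a reachable ClearML server and valid credentials. If the main process skips `init_task()`, each worker would be expected to create its own task.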

0 Is It Necessary To Serve Keras Model Using Triton Engine? I'm Trying To Serve An Endpoint, And Trying To Debug, But The Error Given Not Helping Much. Is There A Flag I Can Pass To See More Logs?

Hi @<1567321739677929472:profile|StoutGorilla30>

Is it necessary to serve keras model using triton engine?

It is not, but it is the most efficient way to serve keras models, and this is why by default clearml-serving is using Nvidia Triton (we are talking 10x factors).
I would start with the keras example, see that it works, and then work your way into your example (notice you always need to provide the layers from the in/out of the model)
[None](https://github.com/allegroai/clearml-s...

one year ago

0 Help Please, After Creating My Data Drift Monitoring Dashboard Using Clearml Serving And Grafana, How Can I Configure My Alerts To Be Notified When The Distribution Of My Metrics (Variables) Changes On My Heatmaps?

I ran the test, but there was no result.

what do you mean by no result, no data after the new query?

7 months ago

0 How Can I Tell Clearml To Ignore Certain Submodules Existing In The Project? My Projects Consists Of Multiple Git Submodules And It Is Rather Annoying That The Task Always Tries To Fetch All Submodules, When They Are Not Even Necessary. I Don't Know How I

Hi @<1694157594333024256:profile|DisturbedParrot38>
You mean how to tell the agent to pull only some submodules of your git?
If this is the case you can actually remove them on your git branch, submodule is a file with a soft link. Wdyt?

8 months ago

That is quite neat! You can also put a soft link from the main repo to the submodule for better visibility

7 months ago

I double checked the code it's always being passed 😞

7 months ago

Yes I was thinking a separate branch.
The main issue with telling git to skip submodules is that it will be easily forgotten and will break stuff. BTW the git repo itself is cached so the second time there is no actual pull. Lastly it's not clear on where one could pass a git argument per task. Wdyt?

8 months ago

It will not create another 100 tasks, they will all use the main Task. Think of it as they "inherit" it from the main process. If the main process never created a task (i.e. no call to Task.init) then they will create their own tasks (i.e. each one will create its own task and you will end up with 100 tasks)

4 years ago

0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

PanickyMoth78 RC is out
pip install clearml==1.6.3rc1 🤞

2 years ago

0 How Can I Avoid

Hi TrickyRaccoon92
BTW: check out the HP optimization example, it might make things even easier 🙂 https://github.com/allegroai/trains/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py

4 years ago

0 Hello, Does Anybody Know What Triggers A New Model To Be Added In A Project (Working In Pytorch) ? I'm New To Trains And Adding It To My Script Generated A Huge Amount Of Models (Almost 1 Per Datapoint I Would Say) And It Would Also Prompt

You can disable it with:

Task.init('example', 'train', auto_connect_frameworks={'pytorch': False})

4 years ago

0 Hi, Currently It Seems That Trains-Agent Writes Files With The User "Nobody", Group "Nogroup" And Permissions 777 To Created Files. How Can I Change That? To The Very Least, Change The User Group It Uses? Running On Linux Ubuntu

nfs version 3

That's the thing, NFS will automatically set file access and flags based on the mount options; you cannot change them post mount.
How about creating a new user just for the agent? It makes sense from a security / credentials perspective.

4 years ago

SmarmySeaurchin8 what's the mount command you are using?

4 years ago

Hi MiniatureShells8
The torch.save triggers the model creation.
If you are using the same filename, then the same model in the system will be used.
New filenames will create new models.
What exactly is your use case ?

4 years ago

Correct 🙂

4 years ago

0 Hi, When A Step In A Pipeline Is Aborted, It Is Marked As Gracefully Finished (Painted In Blue) And The Other Steps That Depend On It Continue. I Believe This Is Not The Expected Behavior, I'd Expect It To Be Marked As Failed, So Other Tasks That Depend

Why? The task should have completed successfully, how is this aborting?

Early stopping by the HPO process, like hyper-band, e.g. this training model is going nowhere let's stop it.

4 years ago

You actually have to login/ssh under said user, have another dedicated mountpoint and spin the agent from that user.

4 years ago

Yes, it could, crontab uses the user it is running from (root if used with sudo)

4 years ago

They all "inherit" the same user / environment from one another

4 years ago

correct

4 years ago

0 Some Time Ago I Wrote A Simple Glue Code To Spin Slurm Workers (Clearml Agents) When There Are Tasks Enqueued. The Workers Are Killed When Idle For A Specific Amount Of Time In Order Not To Block The Gpus (Slurm Resources), This Code Is Not Polished, But

Thanks @<1523703472304689152:profile|UpsetTurkey67>
I'm pretty sure it has!
Let me check how we can merge it into the clearml-agent, sounds good?

2 years ago

why would root cause the user to become nobody with group nogroup?

It is exactly the case, they inherit the cron service user (uid/gid) which would look like nobody/nogroup

4 years ago
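
The uid/gid inheritance discussed above can be checked with a small standard-library sketch (POSIX only). It is an illustration, not part of the original thread: the helper names `identity` and `child_identity` are hypothetical, and the child process here is a plain Python interpreter rather than an agent started from cron.

```python
import os
import subprocess
import sys


def identity():
    """Return the (uid, gid) the current process is running as."""
    return os.getuid(), os.getgid()


def child_identity():
    """Spawn a child Python process and return the (uid, gid) it reports."""
    code = "import os; print(os.getuid(), os.getgid())"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    uid, gid = result.stdout.split()
    return int(uid), int(gid)


parent = identity()
child = child_identity()
print("parent uid/gid:", parent)
print("child  uid/gid:", child)
print("inherited:", parent == child)
```

Running the same script from a crontab entry would print the uid/gid of whichever user the cron job runs as, and files written by that process are owned accordingly.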

0 <image>

What's the OS (Windows/Mac/Linux)? What's the Chrome version?

3 years ago

0 Hello, We Are Currently Working On A Hyperparameter Tuning Job For Object Detection Following This Tutorial

. Is there any known issue with amazon sagemaker and ClearML

On the contrary it actually works better on Sagemaker...

Here is what I did on SageMaker:
Created a new SageMaker instance
Opened Jupyter notebook
Started a new notebook (conda_python3 / conda_py3_pytorch)
In it I then just did "!pip install clearml" and Task.init
Is there any difference?

3 years ago

DeterminedToad86 I suspect that since it was executed on SageMaker it registered a specific package that is unique to SageMaker (not to worry, installed packages can be edited after you clone/reset the Task)

3 years ago

Would it suffice to provide the git credentials ...

That should be enough, basically this is where they should be:
https://github.com/allegroai/clearml-agent/blob/0462af6a3d3ef6f2bc54fd08f0eb88f53a70724c/docs/clearml.conf#L18

3 years ago

Hmm that is odd, it seems to have missed the fact that this is a Jupyter notebook.
What's the clearml version you are using?

3 years ago

DeterminedToad86
So based on the log it seems the agent is installing:
torch from https://download.pytorch.org/whl/cu102/torch-1.6.0-cp36-cp36m-linux_x86_64.whl
and torchvision from https://torchvision-build.s3-us-west-2.amazonaws.com/1.6.0/gpu/cuda-11-0/torchvision-0.7.0a0%2B78ed10c-cp36-cp36m-manylinux1_x86_64.whl

See in the log:
Warning, could not locate PyTorch torch==1.6.0 matching CUDA version 110, best candidate 1.7.0
But torchvision is downloaded from the cuda 11 folder...
I...

3 years ago
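
The mismatch pointed out above (torch from a `cu102` folder, torchvision from a `cuda-11-0` folder) can be spotted by comparing the CUDA tags embedded in the two wheel URLs. The sketch below is only an illustration: `cuda_tag` is a hypothetical helper, and the two path patterns it recognizes are assumptions taken from the URLs quoted in the log, not a general rule for every wheel index.

```python
import re

# Wheel URLs exactly as they appear in the agent log quoted above.
TORCH_URL = (
    "https://download.pytorch.org/whl/cu102/"
    "torch-1.6.0-cp36-cp36m-linux_x86_64.whl"
)
TORCHVISION_URL = (
    "https://torchvision-build.s3-us-west-2.amazonaws.com/1.6.0/gpu/cuda-11-0/"
    "torchvision-0.7.0a0%2B78ed10c-cp36-cp36m-manylinux1_x86_64.whl"
)


def cuda_tag(url):
    """Return the CUDA version digits found in a wheel URL, or None.

    Recognizes two path styles (an assumption based on the URLs above):
    '/cu102/' gives '102' and '/cuda-11-0/' gives '110'.
    """
    match = re.search(r"/cu(\d{2,3})/", url)
    if match:
        return match.group(1)
    match = re.search(r"/cuda-(\d+)-(\d+)/", url)
    if match:
        return match.group(1) + match.group(2)
    return None


tags = {"torch": cuda_tag(TORCH_URL), "torchvision": cuda_tag(TORCHVISION_URL)}
mismatch = len(set(tags.values())) > 1
print(tags, "mismatch:", mismatch)
```

Here the two packages resolve to different CUDA builds, which matches the warning in the log.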

Nicely done DeterminedToad86 🙂
Wasn't this issue resolved by torch?

3 years ago

Hi DeterminedToad86
I just verified on a clean SageMaker instance that everything should just work, see here: https://demoapp.demo.clear.ml/projects/0e919ea1cc5c499b99e1ab85004b6e97/experiments/887edef09d4549e88b829a34c87d4d5b/output/execution
Yes, if you have more than one file (either notebook or Python script) then you must have a git repo in order to run the task using the Agent.

3 years ago
