
MelancholyChicken65 which clearml-serving version are you using ? (I believe this issue was fixed in 1.2)
GiganticTurtle0 is it in the same repository ?
If it is, it should have detected that it needs to analyze the entire repository (not just the standalone script) and then discover tensorflow
@<1699955693882183680:profile|UpsetSeaturtle37> good progress; regarding the error, 0.15.0 is supposed to be out tomorrow, it includes a fix for that one.
BTW: can you run with --debug ?
Hi @<1576381444509405184:profile|ManiacalLizard2>
Yeah that should work, assuming credentials are set in your clearml.conf
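For reference, the relevant clearml.conf section looks roughly like this (the server URLs below are the hosted defaults, swap in your own, and the keys are placeholders):
```
api {
    web_server: https://app.clear.ml
    api_server: https://api.clear.ml
    files_server: https://files.clear.ml
    credentials {
        "access_key" = "YOUR_ACCESS_KEY"
        "secret_key" = "YOUR_SECRET_KEY"
    }
}
```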
BoredHedgehog47 you need to make sure "<path here>/train.py" also calls Task.init (again no need to worry about calling it twice with different project/name)
The Task.init call will make sure the auto-connect works.
BTW: if you do os.fork there is no need for the Task.init call; the main difference is that Popen starts a whole new process, and we need to make sure the newly created process is auto-connected as well (i.e. calling Task.init)
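To illustrate, a minimal sketch of the Popen case (the file and project names here are just examples):
```
# parent.py
from clearml import Task
import subprocess

task = Task.init(project_name="examples", task_name="example")
# Popen starts a whole new process, so the spawned script must call Task.init itself
subprocess.Popen(["python", "train.py"]).wait()

# train.py (the spawned script)
from clearml import Task
# safe to call again, it attaches to the same run
task = Task.init(project_name="examples", task_name="example")
```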
Have to get glue setup, which I couldn't understand fully, so that's a different topic
I suggest using the apply-template setup (basically you provide a Job/Service template, and it uses that to set up k8s jobs based on the Tasks coming in from the specific queue)
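A rough sketch of what the glue looks like in code, based on clearml-agent's k8s glue example (the class name is real, but the argument names here are from memory, so double-check against k8s_glue_example.py in the clearml-agent repo):
```
from clearml_agent.glue.k8s import K8sIntegration

# template_yaml_file is assumed to point at your Job/Pod template
k8s = K8sIntegration(template_yaml_file="job_template.yaml")
# poll the queue and spawn a k8s job for every incoming Task
k8s.k8s_daemon("k8s_queue")
```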
Done!
Thanks
fatal: unable to find a suitable socket path; use --socket
I think that's the root cause, we should probably also add https://github.com/allegroai/trains-agent/issues/16
The only thing that's missing is some plots on the ClearML server (app): when I go to the details of the training I cannot see the confusion matrix, for example (but it exists on the bucket)
How do you report the "confusion matrix" ? (I might have an idea about what the difference is)
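For reference, the explicit way to report one is through the Logger (a sketch, the title/series/matrix here are made up):
```
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="confusion matrix")
matrix = np.random.randint(0, 100, size=(4, 4))  # stand-in for your real matrix
task.get_logger().report_confusion_matrix(
    title="confusion", series="val", matrix=matrix, iteration=0)
```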
SweetGiraffe8 no need to import it, any report to TB is automatically logged by ClearML 🙂
And if I create myself a Pro account
Then you have the UI and implementation of both AWS & GCP autoscalers, am I missing something?
I think what you need is to create an OutputModel, then call update_weights when you have a better model; this will also allow you to tag the model object. Would that help? Or would it make sense to use Task.models and count on the auto-logging?
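Something along these lines (a sketch; the file name and tag are placeholders, and the attribute names are from memory):
```
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="best model")
output_model = OutputModel(task=task, name="best")

# whenever you get a better checkpoint:
output_model.update_weights(weights_filename="best_model.pt")  # uploads & registers the file
output_model.tags = ["best"]  # tag the model object
```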
Hi QuaintPelican38
Assuming you have opened the default SSH port 10022 on the EC2 instance (and assuming the AWS permissions are set so that you can access it), you need to use the --public-ip
flag when running clearml-session. Otherwise it "thinks" it is running on a local network and registers itself with the local IP. With the flag on, it gets the public IP of the machine, and then the clearml-session running on your machine can connect to it.
Make sense ?
SubstantialElk6 could you add a GitHub issue to set the direct url for vscode as a parameter to clearml-session?
We already have --vscode-version
we could either extend it to include a direct url, or add a new argument.
wdyt ?
Hmm, any suggestion on making it more visible in the interface ? (I mean deleting the cache file is always a solution, but it sounded quite painful to debug, hence the question)
@<1538330703932952576:profile|ThickSeaurchin47> can you try the artifacts example:
None
and in this line do:
task = Task.init(project_name='examples', task_name='Artifacts example', output_uri="<your files server or storage URI>")
PanickyMoth78 'tensorboard_logger' is an old deprecated package that was meant to create TB events without TB; it was created before TB became a separate package. Long story short, it is not supported. That said, if you just run the same code and replace tensorboard_logger with tensorboard, you should see all scalars in the UI
Background:
ClearML logs TB events as they are created, in real time. tensorboard_logger is not TB; it creates events and dumps them directly into a TB-equivalent event file
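For example, a sketch of the swap (assuming the scalars were logged with tensorboard_logger's log_value):
```
from clearml import Task
from torch.utils.tensorboard import SummaryWriter  # or tensorboardX / tf.summary

task = Task.init(project_name="examples", task_name="tb scalars")
writer = SummaryWriter(log_dir="./runs")
for step in range(100):
    # was: tensorboard_logger.log_value("loss", loss, step)
    writer.add_scalar("loss", 1.0 / (step + 1), step)  # picked up automatically by ClearML
writer.close()
```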
Hi WackyRabbit7 ,
Running in Docker mode gives you greater flexibility in terms of environment control, from switching CUDA versions to pre-compiled packages that are needed (think apt-get), etc. Specifically for DL, if you are using multiple tensorflow versions, they are notorious for being compiled against a specific CUDA version, and the only easy way to switch between them is different dockers. If you are a PyTorch user, then you are in luck, they have all the pytorch ver...
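For example, spinning up the agent in docker mode with a default image (the image here is just an illustration, any docker image works):
```
clearml-agent daemon --queue default --docker nvidia/cuda:11.8.0-runtime-ubuntu22.04
```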
I did change the
instead of 8080?
So this is the issue
Hi BoredGoat1
from this warning: "TRAINS Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring" it seems trains failed to load the NVIDIA .so library that does the GPU monitoring:
This is based on pynvml, and I think it is trying to access "libnvidia-ml.so.1"
Basically saying, if you can run nvidia-smi from inside the container, it should work.
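A quick way to verify what pynvml sees from inside the container (a sketch):
```
import pynvml

pynvml.nvmlInit()  # raises NVMLError if libnvidia-ml.so.1 cannot be loaded
print(pynvml.nvmlDeviceGetCount(), "GPU(s) visible")
pynvml.nvmlShutdown()
```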
I get what you're saying. Only problem is that in the case of auto-logging, I don't have the model id for the model being saved.
Task.models['output'] should return all the model objects the autologging created
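i.e. something like (a sketch):
```
from clearml import Task

task = Task.current_task()
# auto-logging registers every saved checkpoint as an output model
for model in task.models["output"]:
    print(model.id, model.name)
```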
SillyPuppy19 yes you are correct, actually I can promise you the callback will be called from a different thread (basically the monitoring thread), so it's on the user to make sure the callback can handle it.
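For example, a minimal thread-safe pattern (the names here are illustrative):
```
import threading

_lock = threading.Lock()
_best = {}

def my_callback(*args, **kwargs):
    # may be invoked from the monitoring thread, so guard any shared state
    with _lock:
        _best["latest"] = args
```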
How about we move this discussion to GitHub?
You might only see it when the upload is done
2023-02-15 12:49:22,813 - clearml - WARNING - Could not retrieve remote configuration named 'SSH'
This is fine, it means it uses the default identity keys
The thing is - when I try to connect with normal SSH there are no issues
Now I'm lost, so when exactly do you see the issue ?