Hi @<1528908687685455872:profile|MassiveBat21>
However no useful template is created for downstream executions - the source code template is all messed up,
Interesting, could you provide the code that is "created", or even better some way to reproduce it? It sounds like some sort of a bug, or maybe support for a feature that is missing.
My question is - what is a best practice in this case to be able to run exported scripts (python code not made availa...
This should have worked with the latest clearml RC.
And you verified it is not working?
Thanks MagnificentSeaurchin79 !
Let me check what's the status with this one, could it be the same as this one?
https://github.com/allegroai/clearml/issues/322
should i only do mongodb
No, you should do all 3 DBs: ELK, Mongo, Redis
GiganticTurtle0 adding --stop to the exact daemon execution will stop it (meaning if you have multiple agents on the same machine launched with different parameters, just add the --stop to retire the specific one)
Hmmm that is a good use case to have (maybe we should have --stop get an argument ?)
Meanwhile you can do:
$ clearml-agent daemon --gpus 0 --queue default
$ clearml-agent daemon --gpus 1 --queue default
then to stop only the second one:
$ clearml-agent daemon --gpus 1 --queue default --stop
wdyt?
GiganticTurtle0 can you please add a github issue with feature request to clearml-agent? I think this is a great use case!
SmarmySeaurchin8 it could be a switch, the problem is that when you have automatic stopping flows, they will abort a task, which is legitimate (i.e. it should not be considered failed)
How come you have aborted tasks in the pipeline? If you want to abort the pipeline, you need to first abort the pipeline Task and then the tasks themselves.
Why? The task should have completed successfully, how is this aborting?
Early stopping by the HPO process, like HyperBand, e.g. "this training model is going nowhere, let's stop it".
JitteryCoyote63 are you calling:
my_task.output_uri = "s3://my-bucket"
in the code itself?
Why not set it with Task.init(output_uri=...)?
Also, since this is running remotely there is no need for that; use the Execution -> Output -> Destination field in the UI and put it there, it will do everything for you 🙂
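For reference, a minimal sketch of setting it from code with Task.init (bucket name and project/task names here are placeholders):
from clearml import Task

# output_uri tells ClearML where to upload models and artifacts
# (any s3://, gs://, azure:// URI or a shared folder path works)
task = Task.init(
    project_name="examples",
    task_name="output destination demo",
    output_uri="s3://my-bucket",
)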
The second seems like a botocore issue:
https://github.com/boto/botocore/issues/2187
Legit, if you have a cached_file (i.e. exists and accessible), you can return it to the caller
works seamlessly throughout and in our current on premise servers...
I'm assuming via something close to what I suggested above with .netrc ?
Hi @<1523702000586330112:profile|FierceHamster54>
I think I'm missing a few details on what is logged, and a reference to the git repo?
Hi @<1523702932069945344:profile|CheerfulGorilla72>
Please tell me what RAM metric is tracked by ClearML?
Free RAM is the entire machine's free RAM
Yeah, htop shows odd numbers as it doesn't "count" allocated buffers
specifically you can see the code here:
None
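If you want to see why the numbers differ from htop, a quick psutil check makes the gap between "free" and "available" visible (just an illustration, not the exact code the monitoring runs):
import psutil

vm = psutil.virtual_memory()
# "free" counts only completely unused pages, which is why htop looks odd;
# "available" also includes buffers/cache the kernel can reclaim
print(f"total     : {vm.total / 1024**3:.1f} GiB")
print(f"free      : {vm.free / 1024**3:.1f} GiB")
print(f"available : {vm.available / 1024**3:.1f} GiB")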
HealthyStarfish45 what exactly did you have in mind, in terms of the widget ?
Simple git clone on that repo works well
On the machine running the trains-agent ?
Could you right-click on the failed experiment, select reset, and send it again for execution?
Could that error be a random network issue ?
(Basically this seems like a generic network error not actually related to the trains-agent)
Is the trains-agent running in docker mode or venv mode?
CooperativeFox72 yes, 20 experiments in parallel means that you always have at least 20 connections coming from different machines, and then you have the UI adding on top of it. I'm assuming the sluggishness you feel is the requests being delayed.
You can configure the API server to have more process workers, you just need to make sure the machine has enough memory to support it.
Meanwhile you can just sleep for 24 hours and put it all on the services queue. It should work 🙂
Example here:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
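In case it helps, a minimal sketch of that pattern (project/task names are placeholders; the linked cleanup_service.py is the full version, and on the older trains package the import is from trains import Task):
from time import sleep
from clearml import Task

# enqueue this Task on the "services" queue and let it run forever
task = Task.init(project_name="DevOps", task_name="periodic maintenance")

while True:
    # ... do the periodic work here (e.g. clean up old experiments) ...
    sleep(60 * 60 * 24)  # wake up once every 24 hours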
command line 🙂
cmd.exe / bash
or shall I call the Task.init even from the agent
WorriedParrot51 I think something is lost here.
Task.init() is always called, even when the agent is executing the code. The difference is in what happens inside the Task.init() call. When the codebase itself is executed by the trains-agent, it signals through OS environment variables to Task.init() that instead of creating a new task, it should use the already created one. From this point all data flows from the trains-server back into the c...
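In other words, the exact same call covers both cases (a minimal sketch; names are placeholders, and on the older trains package the import is from trains import Task):
from clearml import Task

# local run: a new Task is created and registered on the server
# agent run: OS environment variables tell Task.init to attach to the
#            already-created Task instead of creating a new one
task = Task.init(project_name="examples", task_name="my experiment")

# from here on, parameters flow from the server back into the code
params = task.connect({"batch_size": 32, "lr": 0.001})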
Hi WorriedParrot51
Take a look at the Experiment execution section:
there is a script path
and a working directory
the working directory is the base of the git repository (which is cloned inside the docker)
So if for some reason trains did not properly detect the current working dir, here is what should solve the issue, without changing the PYTHONPATH:
script path: ./sub_folder/script.py
working directory: .
What do you think?
Hi WorriedParrot51
So I think what you need is to map your external code into the docker, is that correct?
Also you want to always set the PYTHONPATH.
You can achieve both by configuring the trains.conf:
Here you can always add a predefined environment variable and mount point, regardless of the docker image or other docker arguments:
https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf#L98
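For example, something along these lines in the agent section (a sketch only; paths are placeholders and the exact key name may differ between trains-agent/clearml-agent versions, so check the linked sample conf):
agent {
    # mount the external code into every docker and expose it on PYTHONPATH
    extra_docker_arguments: [
        "-v", "/home/user/my_external_code:/mnt/my_external_code",
        "-e", "PYTHONPATH=/mnt/my_external_code"
    ]
}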
Will this solve the issue?
WorriedParrot51 I now see ...
Two solutions that I can quickly think of:
1. In the code add:
import sys
sys.path.append('./my_sub_module')
Assuming you always have to add the sub-directories to make the code work, and assuming they are part of the repository, this is probably the stable solution.
2. In the UI, in the Docker base image field, add -e PYTHONPATH=/folder
or from code (which is exactly what you did), with a clean interface:
task.set_base_docker("nvidia/cuda -e PYTHONPATH=/folder")
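For completeness, a short sketch of that from-code approach (image name and path are placeholders; the single-string form matches the line above):
from clearml import Task

task = Task.init(project_name="examples", task_name="docker args demo")
# tell the agent which docker image and extra docker arguments to use
task.set_base_docker("nvidia/cuda:11.8.0-runtime-ubuntu22.04 -e PYTHONPATH=/folder")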