
JitteryCoyote63 fix should be pushed later today 🙂
Meanwhile you can manually add the Task.init() call at the top of the original script, it is basically the same 🙂
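e.g. a minimal sketch (project/task names here are just illustrative):
```python
from clearml import Task

# add this at the very top of the original script
task = Task.init(project_name="examples", task_name="my experiment")  # illustrative names

# ... rest of the original script unchanged ...
```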
Hi @<1547028116780617728:profile|TimelyRabbit96>
Notice that if you are running with docker compose you can pass an argument to the clearml triton container and use shared memory. You can do the same with the helm chart.
Hi HappyDove3
task.set_script is a great way to add the info (assuming the .git is missing)
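A rough sketch of what that could look like (the repo URL, entry point and the exact set_script argument names here are assumptions):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="manual repo info")  # illustrative names

# manually attach the git information that auto-detection could not find
task.set_script(
    repository="https://github.com/your-org/your-repo.git",  # placeholder URL
    branch="main",
    working_dir=".",
    entry_point="train.py",  # placeholder entry point
)
```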
Are you running it using PyCharm? (If so, use the clearml pycharm plugin; it basically passes the info from your local git to the remote machine via OS environment variables)
Could be nice to write some automation
Hi @<1618056041293942784:profile|GaudySnake67>
Task.create is designed to create an external Task, not one from the current running process. Task.init is for creating a Task from your current code, and this is why you have all the auto_connect parameters. Does that make sense?
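Roughly, a sketch of the difference (names, repo URL and arguments are illustrative):
```python
from clearml import Task

# Task.init: creates a Task from the code that is currently running,
# with auto-logging (the auto_connect_* parameters control what gets captured)
task = Task.init(
    project_name="examples",
    task_name="current process task",
    auto_connect_frameworks=True,
)

# Task.create: registers an external Task pointing at some repo/script,
# without executing anything in the current process
external_task = Task.create(
    project_name="examples",
    task_name="external task",
    repo="https://github.com/your-org/your-repo.git",  # placeholder repo
    branch="main",
    script="train.py",  # placeholder script
)
```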
WackyRabbit7 I might be missing something here, but the pipeline itself should be launched on the "pipelines" queue. Is the pipeline itself running, or is it the step itself that is stuck in the "queued" state?
It's relatively new and it is great: from the usage aspect it is exactly like a user/pass, only the password is the PAT. Really makes life easier.
instead of terminating them once they are inactive, so that they could be available immediately when they are needed.
JitteryCoyote63 I think you can increase the IDLE timeout on the autoscaler and achieve the same behavior, no?
Sure thing, any specific reason for asking about multiple pods per GPU?
Is this for remote development process ?
BTW: the funny thing is, on bare metal machines multi GPU works out of the box, and deploying it with bare metal clearml-agents is very simple
CurvedHedgehog15 is it plots or scalars you are after ?
understood, can you try Task.add_requirements("-e path/to/folder/")
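For reference, add_requirements has to be called before Task.init, e.g. (the path and names are placeholders):
```python
from clearml import Task

# must be called before Task.init so the requirement is recorded on the Task
Task.add_requirements("-e path/to/folder/")  # placeholder path to the local package
task = Task.init(project_name="examples", task_name="local package test")
```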
To clarify, there might be cases where we get helm charts / k8s manifests to deploy an inference service. A black box to us.
I see, in that event, yes, you could use clearml queues to do that; as long as you have the credentials, the "Task" is basically just a helm deployment task.
You could also have monitoring code there, so that the same Task is pure logic: spinning up the helm chart, monitoring the usage, and taking it down when it's done
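Something along these lines, as a rough sketch (the chart reference, release name and the "is it still needed" check are all placeholders for your own logic):
```python
import subprocess
import time

RELEASE = "my-inference"      # placeholder release name
CHART = "repo/chart"          # placeholder chart reference

def still_needed() -> bool:
    # placeholder for your own logic (e.g. check a queue, a metric, a flag)
    return False

# spin up the helm chart
subprocess.run(["helm", "install", RELEASE, CHART], check=True)

# monitor while it is needed
while still_needed():
    subprocess.run(["helm", "status", RELEASE], check=True)
    time.sleep(60)

# take it down when done
subprocess.run(["helm", "uninstall", RELEASE], check=True)
```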
Hi RipeAnt6
What would be the best way to add another model from another project say C to the same triton server serving the previous model?
You can add multiple calls to clearml-serving, each one with a new endpoint and a new project/model to watch; when you launch it, it will set up all the endpoints on a single Triton server (the model optimization/loading is taken care of by Triton anyhow)
AttractiveCockroach17 could it be Hydra actually kills these processes?
(I'm trying to figure out if we can fix something with the hydra integration so that it marks them as aborted)
I use torch.save to store some very large model, so it hangs forever when it uploads the model. Is there some flag to show a progress bar?
I'm assuming the upload is http upload (e.g. the default files server)?
If this is the case, the main issue is that we do not have callbacks on http uploads to update the progress (which I would love a PR for, but this is actually a "requests" issue)
I think we had a draft somewhere, but I'm not sure ...
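To illustrate why this needs support in the upload code itself: with requests you'd have to wrap the file object so every read reports progress, roughly like this (a sketch only, not ClearML's actual upload path; file name and URL are placeholders):
```python
import os
import requests

class ProgressFile:
    """File wrapper that reports how many bytes have been read (i.e. uploaded so far)."""

    def __init__(self, path, callback):
        self._f = open(path, "rb")
        self._size = os.path.getsize(path)
        self._sent = 0
        self._callback = callback

    def __len__(self):
        return self._size

    def read(self, size=-1):
        chunk = self._f.read(size)
        self._sent += len(chunk)
        self._callback(self._sent, self._size)
        return chunk

    def close(self):
        self._f.close()

def print_progress(sent, total):
    print(f"\ruploaded {sent / total:.1%}", end="")

wrapped = ProgressFile("large_model.pt", print_progress)  # placeholder file
requests.post("https://files.example.com/upload", data=wrapped)  # placeholder URL
wrapped.close()
```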
Yes clearml is much better 🙂
(joking aside, mlops & orchestration in clearml is miles better)
CheerfulGorilla72 What are you looking for?
is it planned to add a multicursor in the future?
CheerfulGorilla72 can you expand? what do you mean by multicursor ?
Hi @<1724235687256920064:profile|LonelyFly9>
So, I noticed that with the REST API, at least the /tasks.get_all endpoint appears to have an undocumented maximum page size of 500.
Yeah, otherwise the request size might be too big, but you have pagination:
page (integer, optional, minimum value: 0) - Page number, returns a specific page out of the resulting list of tasks
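For example, with the APIClient you can walk the pages (500 here is just the maximum page size mentioned above):
```python
from clearml.backend_api.session.client import APIClient

client = APIClient()

page = 0
tasks = []
while True:
    # page_size is capped at 500 on the server side
    batch = client.tasks.get_all(page=page, page_size=500)
    if not batch:
        break
    tasks.extend(batch)
    page += 1

print(f"fetched {len(tasks)} tasks")
```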
So it is the automagic that is not working.
Can you print the following before calling both Task.debug_simulate_remote_task and Task.init (notice it has to be called before Task.init):
print(os.environ)
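Something like this, as a sketch (the task id and names are placeholders, and the debug_simulate_remote_task argument name is an assumption):
```python
import os
from clearml import Task

# print the environment before ClearML is initialized
print(os.environ)

# simulate running as if the task was executed remotely
Task.debug_simulate_remote_task(task_id="<task_id>")  # placeholder task id
task = Task.init(project_name="examples", task_name="debug remote")  # illustrative names
```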
Hi JitteryCoyote63
cleanup_service task in the DevOps project: Does it assume that the agent in services mode is in the trains-server machine?
It assumes you have an agent connected to the "services" queue 🙂
That said, it also tries to delete the task's artifacts/models etc., you can see it here:
https://github.com/allegroai/trains/blob/c234837ce2f0f815d3251cde7917ab733b79d223/examples/services/cleanup/cleanup_service.py#L89
The default configuration will assume you are running i...
Hi GrievingTurkey78
Can you test with the latest clearml-agent RC (I remember a fix just for that):
pip install clearml-agent==1.2.0rc0
This is an odd error, could it be conda is not installed in the container (or in the Path) ?
Are you trying with the latest RC?
MistakenBee55 how about a Task doing the model quantization, then triggering it with a TriggerScheduler?
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
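Roughly along the lines of that example (ids, queue and project names are placeholders, and the exact argument names may differ slightly between versions):
```python
from clearml.automation import TriggerScheduler

# poll the backend every few minutes for new events
trigger = TriggerScheduler(pooling_frequency_minutes=3)

# when a model in the watched project is published,
# enqueue the quantization Task
trigger.add_model_trigger(
    name="quantize on publish",
    schedule_task_id="<quantization_task_id>",  # placeholder task id
    schedule_queue="default",
    trigger_project="examples",
    trigger_on_publish=True,
)

trigger.start()
```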
in ... issues a delete command to the ClearML API server,...
almost, it issues the boto S3 delete commands (directly to the S3 server, not through the clearml-server)
And that I need to enter an AWS key/secret in the profile page of the web app here?
correct
Hi ShinyRabbit94
system_site_packages: true
This is set automatically when running in "docker mode", no need to worry 🙂
What exactly is the error you are getting?
Could it be that the container itself has the python packages installed in a venv and not as "system packages"?
Hi SparklingElephant70
Anyone know how to solve? I tried git push before,
Can you send the entire log? Could it be that the requested commit ID does not exist on the remote git (for example, a force push deleted it)?