What about the epochs though? Is there a recommended number of epochs when you train on that new batch?
I'm assuming you are also using the "old" images ?
The main factor here is the ratio between the previously used data and the newly added data; you might also want to resample (i.e., train more on) the new data vs. the old data. Makes sense?
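A minimal sketch of the resampling idea above: repeat each newly added item a few times so the effective old/new ratio shifts toward the new batch. The function name and the oversampling factor are assumptions, not ClearML API.

```python
import random

def build_training_list(old_items, new_items, new_oversample=3, seed=0):
    """Sketch: oversample newly added data relative to the old pool.

    new_oversample controls how many times each new item is repeated,
    shifting the old/new ratio toward the new batch.
    """
    combined = list(old_items) + list(new_items) * new_oversample
    random.Random(seed).shuffle(combined)
    return combined
```

With two old images and one new image at `new_oversample=3`, the new image ends up in 3 of the 5 training entries.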
Now I can't download either of them.
It would be nice if the addresses of the artifacts (state and zips) were assembled on the fly and not hardcoded into the DB.
The idea is that this is fully federated; the server is not actually aware of it, so users can manage multiple storage locations in a transparent way.
If you have any tips on how to fix it in the MongoDB, that would be great...
Yes, that should be similar, but the links would be in the artifacts property on the Task object.
not exactly...
To automate the process, we could use a pipeline, but first we need to understand the manual workflow
I've tried setting up a ClearML application on OpenShift.
First, my condolences 🙂 OpenShift...
Second, what you need to make sure is that each container (i.e. ELK/Mongo etc.) has its own PV for persistent storage; I'm assuming this is the root cause of the error.
Makes sense to you?
Nice debugging experience
Kudos on the work !
BTW, I feel weird to add an issue on their github, but someone should, this generic setup will break all sorts of things ...
Hi VexedCat68
Could it be the Python version is not the same? (This is the only reason it would fail to find a specific Python package version.)
This task is picked up by the first agent; it runs the DDP launch script for itself, then creates clones of itself with task.create_function_task() and passes its address as an argument to the function.
Hi UnevenHorse85
Interesting use case, just for my understanding, the idea is to use ClearML for the node allocation/scheduling and PyTorch DDP for the actual communication, is that correct ?
passes its address as argument to the function
This seems like a great solution.
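A hedged sketch of the flow discussed above: the main task spawns helper tasks with create_function_task(), passing the master's address so each clone can join the PyTorch DDP process group. The worker function, host address, port, and task names here are all placeholders, not values from the thread.

```python
def ddp_worker(master_addr, master_port=29500, rank=1, world_size=2):
    """Runs inside the cloned task; builds the init string DDP needs."""
    init_method = "tcp://{}:{}".format(master_addr, master_port)
    # In the real worker you would now join the process group, e.g.:
    # torch.distributed.init_process_group("nccl", init_method=init_method,
    #                                      rank=rank, world_size=world_size)
    return init_method

if __name__ == "__main__":
    # Requires a reachable ClearML server; project/task names are placeholders.
    from clearml import Task

    task = Task.init(project_name="debug", task_name="ddp master")
    # Each call clones the current task and schedules ddp_worker to run
    # remotely with the given arguments (picked up by an agent).
    task.create_function_task(
        ddp_worker, func_name="worker_rank1", task_name="ddp worker 1",
        master_addr="10.0.0.5", rank=1, world_size=2)
```

The master resolves its own address first, then hands it to every clone so they all point at the same rendezvous endpoint.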
the queu...
Is there any way to make that increment from the last run?
` pipeline_task = Task.clone("pipeline_id_here", name="new execution run here")
Task.enqueue(pipeline_task, queue_name="services") `
wdyt?
There was a problem with the index order when converting from a PyTorch tensor to a NumPy array.
HealthyStarfish45 I'm assuming you are sending NumPy arrays to report_image (which makes sense). If you want to debug it, you can also test TensorBoard's add_image or matplotlib's imshow; both will send debug images.
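The index-order issue mentioned above is usually the channels-first vs. channels-last layout: PyTorch tensors are typically (C, H, W) while image reporting expects (H, W, C). A small sketch of the conversion (the helper name is mine, and the commented reporting call assumes an already-initialized Task):

```python
import numpy as np

def chw_to_hwc(img):
    """Convert a (C, H, W) tensor-style array to (H, W, C) for image reporting."""
    assert img.ndim == 3, "expected a 3-D array"
    return np.transpose(img, (1, 2, 0))

# Usage with ClearML (names are placeholders):
# Task.current_task().get_logger().report_image(
#     "debug", "sample", iteration=0,
#     image=chw_to_hwc(tensor.cpu().numpy()))
```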
https://stackoverflow.com/questions/60860121/plotly-how-to-make-an-annotated-confusion-matrix-using-a-heatmap
MagnificentSeaurchin79 see plotly example here:
https://allegro.ai/clearml/docs/docs/examples/reporting/plotly_reporting.html
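Tying the two links together, here is a hedged sketch of building an annotated confusion-matrix heatmap as a plain Plotly-figure dict (so it needs no plotly import) and handing it to ClearML's report_plotly. The helper name and the commented reporting call's title/series are assumptions; a `plotly.graph_objects.Figure` works there as well.

```python
import numpy as np

def confusion_heatmap_figure(cm, labels):
    """Build a Plotly-figure dict (heatmap + per-cell annotations)
    from a confusion matrix."""
    cm = np.asarray(cm)
    annotations = [
        {"x": labels[j], "y": labels[i], "text": str(cm[i, j]), "showarrow": False}
        for i in range(cm.shape[0])
        for j in range(cm.shape[1])
    ]
    return {
        "data": [{"type": "heatmap", "z": cm.tolist(), "x": labels, "y": labels}],
        "layout": {"title": "Confusion matrix", "annotations": annotations},
    }

# Reporting with ClearML (assumes an initialized Task):
# Task.current_task().get_logger().report_plotly(
#     title="confusion", series="val", iteration=0,
#     figure=confusion_heatmap_figure([[5, 1], [2, 7]], ["cat", "dog"]))
```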
Probably not the case the other way around.
Actually it's the other way around: the new pip version uses a new package dependency resolver that can conclude that a previous package setup is not supported (because of version conflicts) even though it worked...
It is tricky: pip is trying to get better at resolving package dependencies, but it means that old resolutions might not work, which would mean old environments cannot be restored (or are "broken" envs). This is the main reason not to move to p...
The file is never touched; nowhere in the process is that file deleted.
It should never have gotten there; this is not the git repo folder, it is one level above...
But I still need the load balancer...
No, you are good to go. As long as something registers the pods' IPs automatically on a DNS service (local/public), you can use the registered address instead of the IP itself (obviously with the port suffix).
Thanks for your support
With pleasure!
If you are using the "default" queue for the agent, notice you might need to run the agent with --services-mode to allow for multiple pipeline components on the same machine.
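A minimal sketch of the agent invocation described above; the queue name "services" is an assumption, substitute your own.

```shell
# Start an agent on the "services" queue in services mode, so several
# pipeline controllers/components can share one machine.
clearml-agent daemon --queue services --services-mode --detached
```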
Hi RobustGoldfish9 ,
I'd much rather just have trains-agent just automatically build the image defined there than have to build the image separately and make it available for all the agents to pull.
Do you mean there is no docker image in the artifactory built based on your Dockerfile ?
SmarmyDolphin68
BTW: there is no automatic reporting when you have task = Task.get_task(task_id='your_task_id')
It's only active when you have one "main" task.
You can also check the continue_last_task argument in Task.init; it might be a good fit for your scenario.
https://allegro.ai/docs/task.html#trains.task.Task.init
Hmm, what's the OS and Python version?
Is this simple example working for you?
Hi GreasyPenguin66
Is this for the client side? If it is, why not set them in the clearml.conf?
Hi PompousBeetle71
I remember it was an issue, but it was solved a while ago. Which Trains version are you using?
Okay, this is a bit tricky (and come to think about it, we should allow a more direct interface):
` pipe.add_step(
    name='train',
    parents=['data_pipeline', ],
    base_task_project='xxx',
    base_task_name='yyy',
    task_overrides={'configuration.OmegaConf': dict(value=yaml.dump(MY_NEW_CONFIG), name='OmegaConf', type='OmegaConf YAML')},
) `
Notice that if you had any other configuration on the base task, you should add them as well (basically it overwrites the configurati...
PipelineController works with the default image, but it incurs an overhead of 4-5 minutes.
You can try to spin up the "services" queue without docker support; if there is no need for containers, it will accelerate the process.
Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.
This error is about failing to clone the pipeline code repo; how is that connected to changing the container?!
Can you provide the full log?
This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
Correct 🙂
Hi DilapidatedDucks58
Is this something new?
Usually copy-pasting directly from the UI parses everything, no?
Done HandsomeCrow5 +1 added 🙂
BTW: if you feel you can share how your reports look (a screenshot is great), that would greatly help in supporting this feature. Thanks!
` from time import sleep
from clearml import Task
import tqdm

task = Task.init(project_name='debug', task_name='test tqdm cr cl')
print('start')
for i in tqdm.tqdm(range(100)):
    sleep(1)
print('done') `
The above example code will output a line every 10 seconds (with the default console_cr_flush_period=10); can you verify it works for you?