AgitatedDove14

49 Questions, 8124 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Questions 49
Answers 8124

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<!here> New video is out :slightly_smiling_face: Cloud Autoscalers are awesome <https://www.youtube.com/watch?v=j4XVMAaUt3E>

New video is out 🙂 Cloud Autoscalers are awesome https://www.youtube.com/watch?v=j4XVMAaUt3E

clearml

3 years ago

0 Votes

10 Answers

1K Views

0 Votes 10 Answers 1K Views

Happy Friday Everyone ! We Have A New Repo Release We Would Love To Get Your Feedback On

Happy Friday everyone ! We have a new repo release we would love to get your feedback on 🚀 🎉 Finally easy FRACTIONAL GPU on any NVIDIA GPU 🎊 Run our nvidi...

clearml

one year ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Hi Guys! I Have Great News, We Finally Fully Implemented Support For Continuing Previously Trained Models

Hi Guys! I have great news, we finally fully implemented support for continuing previously trained models 🎉 Here is a quick example (this is torch, but any ...

clearml

5 years ago

0 Votes

7 Answers

1K Views

0 Votes 7 Answers 1K Views

Thank You All For Taking The Time To Answer Our Survey (If You Haven'T Already, We Urge You To

Thank you all for taking the time to answer our survey (If you haven't already, we urge you to do so ). Your feedback has a major impact on what we build, do...

clearml

5 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Lstmeow Is Back! Bots/Gals/Guys Feel Free To

LSTMeow is back! Bots/Gals/Guys feel free to 👍 None

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

I Would Guess Connectivity Issues, The Tls Is Probably Python Inaccurate Response (I Mean In A Way, It Is Also A Tls Error, But I Would Imagine This Has More To Do With The Actual Network Connection)

I would guess connectivity issues, the TLS is probably python inaccurate response (I mean in a way, it is also a TLS error, but I would imagine this has more...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Well To Be Honest, We Kind Of Thought It'S Redundant. Basically Storing Artifacts In Experiments And Having Them Retrieved Quickly From The Code Itself Was Way More Convenient For Us Then To Manually Have To Do Clone/Pull Of The Data... Example: Create Da

Well to be honest, we kind of thought it's redundant. Basically storing artifacts in experiments and having them retrieved quickly from the code itself was w...

clearml

5 years ago

0 Votes

2 Answers

2K Views

0 Votes 2 Answers 2K Views

Hi ! trains 0.16.2 is finally out with the new pipelines interface! Check out the new example https://github.com/allegroai/trains/blob/master/examples/pipeli...

clearml

4 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

Hi , v0.15 is out, 🎉 🚀 Your feedback had a major influence on the features we added 🙂 thank you! A selected list of features: Column resizing / ordering /...

clearml

5 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Apparently Everyone Can ...

apparently everyone can ...

clearml

5 years ago

0 Votes

3 Answers

1K Views

0 Votes 3 Answers 1K Views

These Are Xgboost Internal Metrics That Are Automatically Picked By Clearml

@<1523703325881536512:profile|ConvolutedSealion94> these are xgboost internal metrics that are automatically picked by clearml

xgboost

2 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Hello Everyone!

clearml

5 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

This Will Close It

This will close it Task.current_task().close()I think we should rename completed() because it just marks the Task as completed on the backend but does not ac...

clearml

4 years ago

0 Votes

3 Answers

2K Views

0 Votes 3 Answers 2K Views

We Recently Released A New Version Of

we recently released a new version of clearml-session with Persistent Workspace support! 🚀 🎉 Finally you can develop on remote machines with workspace fold...

remote-ssh

one year ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

@YummyWhale40 you are saying the example code is not working when running with the demo server? Also I think I was able to view your experiment on the demo server, and do get the Scalars without any issues...

YummyWhale40 you are saying the example code is not working when running with the demo server? Also I think I was able to view your experiment on the demo se...

clearml

5 years ago

0 Votes

6 Answers

1K Views

0 Votes 6 Answers 1K Views

Hi :robot_face: , humans We have the new documentation site up and running 🎉 None 🎊 This is still a work in progress, so we keep the previous version alive...

clearml

4 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

YEY!!!! *Download as CSV* :exploding_head:

YEY!!!! Download as CSV 🤯

clearml

3 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

<!channel> *important notice* : it seems Nvidia broke some of their PPA's security :confused: , causing `apt-get updates` to fail inside containers. This in term will cause `clearml-agent` to fail with specific Nvidia containers. _If you are seeing simila

important notice : it seems Nvidia broke some of their PPA's security 😕 , causing apt-get updates to fail inside containers. This in term will cause clearml...

clearml

3 years ago

0 Votes

0 Answers

2K Views

0 Votes 0 Answers 2K Views

Hey <!here> Just a heads up, starting *Jan 25th*, the default <http://demoapp.demo.clear.ml/|ClearML demo server> will move to a *daily* reset cycle (replacing the current weekly cycle). Anybody needing more than 24h data retention is welcome to use our <

Hey Just a heads up, starting Jan 25th , the default http://demoapp.demo.clear.ml/ will move to a daily reset cycle (replacing the current weekly cycle). Any...

clearml

4 years ago

Show more results

0 Hello! I Try Add Dataset To Clearml Using Clearml-Data. All Images In One Folder, Size Around 5Gb. After Upload To Cloud I Get This Error Clearml.Metrics - Error - Failed Reporting Metrics: <400/0: Unknown (Error: Events.Add_Batch Request Exceeds Limit 75

Hurray!

4 years ago

0 Hi, I Failed To Update The "Started At" And The "Completed At" Attributes In The "Info" Tab. I Tried To Do So By The Following Steps:

what's the error/reply ?

4 years ago

0 Anyone Knows Why This Happens?

Can you also share the full log? the numbers seem of (and clearml cannot actually "invent" those numbers they are coming from somewhere...)

3 years ago

0 I Am Running Trains=0.16.4 Python==3.7.5 , And Notice That The "Log" Page Sometimes Didn'T Capture The Console Log From My Program. Is This A Known Issue, Anyone Have Experienced Similar Behavior?

Is this a logging issue, or clearml issue ?

4 years ago

0 Hey Trains Riders, This Must Be Something Simple I Am Missing, But Still I Couldn'T Realize What The Problem Is. I Am Trying To Run Trains-Agent On My Experiments. Setup Of The Server And The Agent Is Fine, But I Am Struggling To Run Real Experiments (Not

Hi ColossalDeer61 ,

Xxx is the module where my main experiment script resides.

So I think there are two options,
Assuming you have a similar folder structure-main_folder
--package_folder
--script_folder
---script.py
Then if you set the "working directory" in the execution section to "." and the entry point to "script_folder/script.py", then your code could do:
from package_folder import ABC
2. After cloning the original experiment, you can edit the "installed packages", and ad...

5 years ago

0 Hi All. I'M Setting Up An Model Export Script That Will Export Trained Models For Edge Deployment. I Initially Thought About Setting It Up As A Trigger Scheduler, And To Have It Trigger On Tags On A Published Model, But As Time Goes By The Trigger Schedul

Task.enqueue will execute immediately, i need execute task to spesific time

Oh I see what you mean, trigger -> scheduled (cron alike) -> Task executed.
Is that correct?

one year ago

0 Has Anyone Compared

Thank you so much!

3 years ago

0 I Am Getting This Specific Message When Trying To Run Hyper Parameters Optimization (Running Remotely My Task). Does It Affect My Flow? Do I Have Something To Worry About?

Although I didn't understand why you mentioned

torch

in my case?

Just a guess 🙂 other frameworks do multi-process as well,

I would guess it relates to parallelization of Tasks execution of the

HyperParameterOptimizer

class?

Yes that might be it, it's basically by product of using python "Process" class for multiprocessing. we are working on a fix, not a trivial unfortunately

3 years ago

0 Hi. Here'S A Question For

Hi FancyWhale93 , in your clear.conf configure default output uri, you can specify the file server as default, or any object storage:
https://github.com/allegroai/clearml-agent/blob/9054ea37c2ef9152f8eca18ee4173893784c5f95/docs/clearml.conf#L409

3 years ago

0 On The Clearml Web Interface You Obviously Need To Provide The Aws Credentials To Do Things Like Download Artifacts And Data Stored On Aws. One Thing I'M Curious About Is If You Do Provide The Credentials, When You Do Things Like Delete A Dataset Or Task,

Hi @<1545216070686609408:profile|EnthusiasticCow4>

will ClearML remove the corresponding folders and files on S3?

Yes and it will ask you for credentials as well. I think there is a way to configure it so that the backend has access to it (somehow) but this breaks the "federated" approach

2 years ago

0 Hi, I Am Saving Plt Chart To Clearml Using

MortifiedDove27
Nice!!!

4 years ago

0 We Are Facing Performance Issues Of Our Self-Hosted Clearml Server Looking At The Cpu Utilization \ Memory \ Networking We Couldn'T Identify A Bottleneck We Are At The Moment Using ~100 Workers For Some Hpo, And The Main Performance Issues We Observe Are

Hi DepressedChimpanzee34 , took me a while but I think there is a solution:
In your docker file, replace:
https://github.com/allegroai/clearml-server/blob/a64c4d264d00eadd2d11818b37151d3cc6266d99/docker/docker-compose.yml#L5
with
entrypoint: /bin/bash command: -c "mkdir -p /var/log/clearml && cd /opt/clearml/ && python3 -m apiserver.apierrors_generator && gunicorn -w 4 -t 600 --bind=0.0.0.0:8008 apiserver.server:app"

3 years ago

0 Hello! How Can I Use "Report_Scatter2D" In Order To Report Timestamp In The X-Axis?

but I cannot compare between them

I think we noticed it, and this will be fixed in the next server update (again, some plotly.js issue there)

4 years ago

0 Quick Question About The

ReassuredTiger98 I guess this is a plotly feature, none the less I think you can shift the Y axis manually (click and drag)

4 years ago

0 Hello, How Do I Deploy An Agent In

Hi IcySwallow94
Are you deploying the clearml server with the helm chart ?

2 years ago

0 Hello, Is It Possible To Run Trains Offline Where There'S No Http Connection Between The Node Running The Job And Where The Web Ui Runs? I See In Your Diagram The Connection Between Training Machine And Trains Server (Which Contains The Web Ui) Is Over Ht

not really 😞
Why would you want to set it up manually ? makes sense to have it in the cache folder, no?

5 years ago

0 If I Update The Annotation Files For Some Dataset And Upload It. Can It Call The Previous Version Of Dataset Using The Dataset Id ?

Hi WearyLeopard29
Yes 🙂 this is exactly how it should work

4 years ago

0 Hello! I Was Hoping I Could Get Some Debug Help. I'Ve Set Up A Clearml Pipeline Using The Pipelinecontroller, And When Running Through

It just seems frozen at the place where it should be spinning up the tasks within the pipeline

And is there an agent for those ? usually there is one agent for running logic tasks (like pipelines) running with --services-mode which means multiple Tasks can be executed by the same agent. And other agents for compute Tasks that are a signle Task per agent (but you can run multiple agents on the same machine)

2 years ago

0 Hi (Again... Sorry For Asking So Many Questions) Question About Using Google Cloud Storage In A Clearml Agent Running In Aws Ec2 Instance. My

Hi PanickyMoth78
Hmm yes, I think the StorageManager (i.e. the google storage pythonclinet) also needs a json file with the credentials.
Let me check something

3 years ago

0 Is There Any Examples Of Mounting An Aws Efs Mount To A Self Hosted K8 Agent Deploy?

EFS get downloaded to the k8 pod local volume?

EFS is an Amazon service that mounts a persistent FS into ec2 instances, I believe they have support for k8s as a service as well, which would make it kind of like a PV only as a service.
Does that make sense ?

2 years ago

Another (minor) issue is that all the packages that are installed using git+https are cloned and installed twice, immediately one after the other

Yes this is so that we can better log the installed package name, not a major issue, but we just fixed a bug with derivative packages from git packages.
https://github.com/allegroai/trains/issues/196

5 years ago

0 Is Anyone Also Experiencing Network Error During Every Clearml Dataset Download? It'S Been A While And Almost Every Download Fails...

Hi BitterStarfish58
Where are you uploading it to?

3 years ago

0 Hello! Since Today I Get

Okay this seems correct:

pytorch=1.8.0=py3.7_cuda11.1_cudnn8.0.5_0

I can't seem to find what's the diff between the two.
Give me a second let me check if I can reproduce it somehow.

4 years ago

0 Hi. I Have A Few Questions About The Snippet Attached

is some explanation of how functions become pipelines and components.

That is a good point! I'll make sure we mention it in the pipeline section of the docs

This whole experiment with random numbers started as my attempt at verifying that code in clearml.pipeline

Correct 🙂 BTW: this is why we added PipelineDecorator.run_locally() so it is easier to debug, you can also use PipelineDecorator.debug_pipeline() to run the entire pipeline in a single process as python ...

3 years ago

0 Hi, I Am New Here. I Was Wondering Where Can I Configure Which Machines Trains (Or Trains-Agent?) Use For Queueing Tasks, And How Do I Create Such Queues. Thanks.

But same server ?

4 years ago

It's the same but done from outside, you want the same and "offline" as well right?

3 years ago

0 Hi, I'M Following The Instructions For

OutrageousSheep60

I found the task in the UI -

and in the

UNCOMMITTED CHANGES

execution section there is

No changes logged

This is the issue.

and then run the

session

via docker

clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 \ --packages "clearml" "tensorflow>=2.2" "keras" \ --queue MY_QUEUE \ --verboseAre you running the "cleamrl-session" from your machine? (i.e. not from inside a docker) ?...

3 years ago

0 Hi, I Am Trying To Clone An Experiment. Using The Server Gui, I Select 'Clone' And Then 'Enqueue'. In The Console Window, I See That Clearml Makes Sure The Environment Is Installed, And Then It Goes Into A 'Completed' Status Although The Experiment Did N

Any chance your code needs more than the main script, but it is Not in a git repo? Because the agent supports either single script file, or a git repo with multiple files

2 years ago

0 Hi, I'M Following The Instructions For

clearml_agent: ERROR: Can not run task without repository or literalscript in script.diff

This is odd ...

OutrageousSheep60 when you launch clearml-session it tells you the session ID (which is also a Task ID), can you look for it in the UI and check there is something in the repo/uncommitted-changes section ?

3 years ago

0 Hello! I Have A Quick Question About The Clearml Hyperparameter Optimizations Module. Is It Possible To Use It Without Using The Clearml Agent System? In Other Words, Launch A Script From A Few Machines Manually But The Hyperparameters Are Given From Cle

That's not possible, right?

That's actually what the "start_locally" does, but the missing part is starting it on another machine without the agent (I mean it totally doable, and if important I can explain how, but this is probably not what you are after)

I really need to have a dummy experiment pre-made and have the agent clone the code, set up the env and run everything?

The agent caches everything, and actually can also just skip installing the env entirely. which would mean ...

2 years ago

Show more results