JitteryCoyote63 I think I failed to explain myself.
- I think the problem with the controller is that you are interacting (i.e. changing hyperparameters) with a Task created using the new SDK version, from an older SDK version. Specifically, we added section names to the hyperparameters, and only the new version of the SDK is aware of them.
Make sense? - Regarding the actual problem: it seems like this is somehow related to the first one, the task at runtime is using an older SDK version, and I t...
Nicely done DeterminedToad86 🙂
Wasn't this issue resolved by torch?
clearml will register conda packages that cannot be installed if clearml-agent is configured to use pip. So although it is nice that a complete package list is tracked, it makes it cumbersome to rerun the experiment.
Yes, mixing conda & pip is not supported by clearml (or by conda or pip, for that matter).
Even python package version numbers might not exist on both.
We could add a flag not to update back the pip freeze; it's an easy feature to add. I'm just wondering about the exact use case.
Hi WorriedParrot51
Assuming you run the code "manually" once (i.e. without the agent), then when you call Task.init it will register the argparser.
When running with the agent, the first time you call parse_args(), it will automatically override the argparse defaults with the values stored in the Task.
Make sense?
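A minimal sketch of that flow (names here are illustrative, assuming a standard argparse script):

from argparse import ArgumentParser
from clearml import Task

task = Task.init(project_name="examples", task_name="argparse demo")

parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
# Run manually once: Task.init registers the argparser and its defaults.
# Run by the agent: parse_args() returns the values stored in the Task instead.
args = parser.parse_args()
print(args.lr)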
I am getting None for Task.current_task() at the beginning of my script.
Task.init() is doing the magic; only after this call will you have a current_task (either running manua...
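Something like this (a hedged sketch, project/task names are placeholders):

from clearml import Task

print(Task.current_task())  # None - Task.init() was not called yet
task = Task.init(project_name="examples", task_name="demo")
print(Task.current_task())  # now returns the live Task object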
Hi ReassuredTiger98
Could you send the logs of both runs?
(I'm not sure if this is a bug or some misconfiguration, but the scenario should have worked...)
Hi SplendidToad10
In order to run a pipeline you first have to create the steps (i.e. Tasks).
This is usually done by running the code once (basically, running any code with a Task.init call will create a Task for that specific code, including the environment definition needed to reproduce it by the Agent).
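For example (a rough sketch; project/step names are placeholders, and it assumes the step code was already run once with Task.init):

from clearml.automation import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0.0")
# each step references a Task that was created by running the step code once
pipe.add_step(name="step_one", base_task_project="examples", base_task_name="step_one")
pipe.add_step(name="step_two", parents=["step_one"],
              base_task_project="examples", base_task_name="step_two")
pipe.start()  # launches the pipeline controller (by default on the services queue)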
FiercePenguin76 in the Task's execution tab, under "script path", change it to "-m filprofiler run catboost_train.py".
It should work (assuming "catboost_train.py" is in the working directory).
Questions
I want to trigger a retrain task when F1
That means that in inference you are reporting the F1 score, correct?
As part of the retraining I have to train all the models and then have to choose the best one and deploy it
Are you passing output_uri to Task.init? Are you storing the model as an artifact?
You can tag your model/task with a "best" tag (and untag the previous one). Then in production, look for the "best" task and get its model.
Thoughts?
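Roughly something like this (a sketch; the project name and task id are placeholders):

from clearml import Task

# during training, make sure the model is uploaded, e.g.
# Task.init(..., output_uri="s3://my-bucket/models")

# after retraining, move the "best" tag to the winning task
for t in Task.get_tasks(project_name="my_project", tags=["best"]):
    t.set_tags([tag for tag in t.get_tags() if tag != "best"])
Task.get_task(task_id="new_best_task_id").add_tags(["best"])  # placeholder id

# in production, fetch the "best" task and pull its model
best = Task.get_tasks(project_name="my_project", tags=["best"])[0]
model_path = best.models["output"][-1].get_local_copy()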
Hi JitteryCoyote63
Signal 9 is the kill signal. Could it be that someone killed the process? Do you have other logs to share? Is this reproducible?
HarebrainedBear62 this is what I have.
clearml-data will store all the files for you and version the entire thing, making it a breeze to abstract the dataset from the code. Querying data is available using Apache Drill (though currently it is still not built into the platform, we are planning to get there soon). Since this is image-based data/meta-data, I know the paid tier of ClearML has an additional dedicated data management solution specifically for images, with full ability to query m...
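If it helps, a minimal clearml-data sketch from Python (names/paths are placeholders):

from clearml import Dataset

ds = Dataset.create(dataset_name="my_images", dataset_project="datasets")
ds.add_files(path="/path/to/images")  # register the local image folder
ds.upload()
ds.finalize()

# anywhere else, get a local copy, fully abstracted from the code
local_copy = Dataset.get(dataset_name="my_images", dataset_project="datasets").get_local_copy()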
WickedGoat98 Nice!!!
BTW: The fix should solve both (i.e. no need to manually cast). I'll make sure the fix is on GitHub so you'll be able to verify 🙂
WickedGoat98 Actually the fileserver replied, so it all looks fine to me.
Try running the text example again and see if you are still getting the fileserver error.
My main query is do I wait for it to be a sufficient batch size or do I just send each image as soon as it comes to train
This is usually a cost-optimization issue. Generally speaking, if GPU up-time is not an issue, then since the process is stochastic anyhow, waiting for a batch or not is not the most important factor (unless you use a batchnorm layer, in which case batching is basically a must).
I would not be able to split the data into train test splits, and that it would be very expensiv...
What's the trains-server version?
You can see it if you go to the profile page
CourageousLizard33 specifically section (4) is the issue (and it's related to any elastic docker, nothing specific to trains-server):
echo "vm.max_map_count=262144" > /tmp/99-trains.conf
sudo mv /tmp/99-trains.conf /etc/sysctl.d/99-trains.conf
sudo sysctl -w vm.max_map_count=262144
sudo service docker restart
Did you try the above, and you are still getting the same error ?
I'm sorry, wrong line reference:
I'm assuming the error is due to a missing ulimit:
try adding 16777216 to both the soft and hard ulimits
https://github.com/allegroai/clearml-server/blob/09ab2af34cbf9a38f317e15d17454a2eb4c7efd0/docker/docker-compose.yml#L58
In the end it's just another env var.
It should work; GIT_SSH_COMMAND is used by pip.
EnviousStarfish54 Notice that you can configure it on the agent machine only, so in development you are not "wasting" storage when uploading debug checkpoints/models 🙂
In your trains.conf, change the value:
files_server: 's3://ip:port/bucket'
An upload of 11GB took around 20 hours which cannot be right.
That is very very slow, this is about 152KB/s ...
task = Task.get_task('task_id_here')
task.mark_started(force=True)  # force the task back into "started" state
task.upload_artifact(..., wait_on_upload=True)  # block until the upload finishes
task.mark_completed()
I think you can force it to be started, let me check (I'm pretty sure you can on an aborted Task).
Have you tried a context provider for the Task?
I guess that would only make sense inside notebooks?!
Hi ShallowArcticwolf27
from the command line to a remote machine while loading a local .env file as a configuration object?
Where would the ".env" go? Are we trying to pass it to the remote machine somehow?
Hi @<1533982060639686656:profile|AdorableSeaurchin58>
Notice the scalars and console output are stored in the elasticsearch DB; this is usually under /opt/clearml/data/elastic_7
So I checked the code, and the Pipeline constructor internally calls Task.init. That means that after you construct the pipeline object, Task.current_task() should return a valid object...
let me know what you find out
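A quick way to verify (a sketch; the constructor arguments are placeholders):

from clearml import Task
from clearml.automation import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0.0")
print(Task.current_task())  # should now return the pipeline's Task, not None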
JitteryCoyote63 no you should not (unless you already have the Task.init call in your code).
clearml-data will add the Task.init call at the beginning of the code in the entry point.
This means you should be able to call Task.current_task() and get back the object.
What do you have under the "uncommitted changes" on the Task that was created?
UnevenDolphin73 clearml.config.get_remote_task_id() will return the Task ID, not the Task object. In order to get the automagic to work, one h...
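i.e. something along these lines (a hedged sketch, based on the call above):

from clearml import Task
from clearml.config import get_remote_task_id

task_id = get_remote_task_id()         # just the ID string
task = Task.get_task(task_id=task_id)  # fetch the actual Task object from the ID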
JitteryCoyote63 I think I found the bug in clearml-task
it adds it at the end instead of before everything else