AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Hi, Is There A Way To Get The Quota Used By Each Task? My "Metrics" Quota Is Filling Up Very Quickly And I Would Like To Understand What'S Causing It.

I can definitely feel you!
(I think the implementation is not trivial, metrics data size is collected and stored as commutative value on the account, going over per Task is actually quite taxing for the backend, maybe it should be an async request ? like get me a list of the X largest Tasks? How would the UI present it? As fyi, keeping some sort of book keeping per task is not trivial either, hence the main issue)

one year ago

0 Hi, Is There A Way To Get The Quota Used By Each Task? My "Metrics" Quota Is Filling Up Very Quickly And I Would Like To Understand What'S Causing It.

Like get the tasks that uses the most metrics API?

one year ago

0 Hi ! I Have A Config Dictionary Which Is A Dot Dictionary ( A Dictionary That Supports Dot Notation As Well As Dictionary Access Notation Set Attributes: D.Val2 = 'Second' Or D['Val2'] = 'Second' Get Attributes: D.Val2 Or D['Val2'] ) I Ru

Hi @<1571308003204796416:profile|HollowPeacock58>

parameters = task.connect(config, name='config_params')

It seems that your DotDict does not support the python copy operator?
i.e.

from copy import copy
copy(DotDict())

fails ?

one year ago

0 Hey, I Was Wondering How Can I Do Hparams Tuning With Trains? Couldn'T Find Anything On The Documentation

Yes, this seems like the problem, you do not have an agent (trains-agent) connected to your server.
The agent is responsible for pulling the experiments and executing them.
pip install trains-agent trains-agent init trains-agent daemon --gpus all

3 years ago

0 Hey, I Was Wondering How Can I Do Hparams Tuning With Trains? Couldn'T Find Anything On The Documentation

What should have happened is the experiments should have been pending (i.e. in a queue)
(Not sure why they are not).
You can manually send them for execution , right click on an experiment in the able, select enqueue and select the default queue (This will be the one the trains-agent will pull from , by default)

3 years ago

0 Hey, I Was Wondering How Can I Do Hparams Tuning With Trains? Couldn'T Find Anything On The Documentation

A more detailed instructions:
https://github.com/allegroai/trains-agent#installing-the-trains-agent

3 years ago

0 Hey, I Was Wondering How Can I Do Hparams Tuning With Trains? Couldn'T Find Anything On The Documentation

ShaggyHare67 notice that the services queue is designed to run CPU based tasks like monitoring etc.
For the actual training you need to run your trains-agent on a GPU machine.
Did you run the trains-agent init ? it will walk you through the configuration (git user/pass) included.
If you want to manually add them, you can see an example of the configuration file in the link below.
You can find it on ~\trains.conf
https://github.com/allegroai/trains-agent/blob/master/docs/tr...

3 years ago

0 Hey, I Was Wondering How Can I Do Hparams Tuning With Trains? Couldn'T Find Anything On The Documentation

Hi ShaggyHare67 ,
Yes the trains.conf created by trains-agent is basically an extension of the trains usage (specifically it adds a section for the agent)
I'm assuming you are running the agent on the same development machine.
I guess the easiest is to rename the trains.conf to trains.conf.old and run trains-agent init
(No need to worry, the trains package supports it , so the new configuration file that will be generated will work just fine)

3 years ago

0 Hey, I Was Wondering How Can I Do Hparams Tuning With Trains? Couldn'T Find Anything On The Documentation

ShaggyHare67

Now the

trains-agent

is running my code but it is unable to import

trains

...

What you are saying is you spin the 'trains-agent' inside a docker? but in venv mode ?

On the server I have both python (2.7) and python3,

Hmm make sure that you run the agent with python3 trains-agent this way it will use the python3 for the experiments

3 years ago

0 Is It Possible To Embed A Streamlit App In A Clearml Report? Are There Other Ways To Integrate Streamlit Apps?

Hi @<1575656665519230976:profile|SkinnyBat30>
Streamlit apps are backend run (i.e. the python code drives the actual web app)
This means running your Tasks code and exposing the web app (i.e. http) streamlit.
This is fully supported with ClearML, but unfortunately only in the paid tiers 😞
You can however run your Task with an agent, make sure the agent's machine is accessible and report the full IP+URL as a hyper-parameter or property, and then use that to access your streaml...

one year ago

0 Is It Possible To Embed A Streamlit App In A Clearml Report? Are There Other Ways To Integrate Streamlit Apps?

Can the host server's service agent be used?

In theory yes, just make sure you expose the containers network (check the docker compose)

one year ago

0 Hi Again. As I Am Running My Experiment From Server Using Agent, I Am Failing On The Point, Where The Arguments Of Argparse Are Processed. When Is The Agent Task Registered. I Am Getting None For Task.Current_Task() At The Begining Of My Script.

or shall I call the Task.init even from the agent

WorriedParrot51 I think something is lost here.
Task.init() is always called, even when the agent is executing the code. The difference is in what happens inside the Task.init() call. When the codebase itself is executed by the trains-agent, it signals through OS environment to the task.init() that instead of a new created task, it should use the already created one. from this point all data flows from the trains-server back into the c...

4 years ago

0 Hi, I Am Quite Sure, That Someone Has Already Asked This Before, But I Suppose, That The Answer Will Be Simple: I Am Trying To Run Trains-Agent In Docker Mode, But I Need To Setup Pythonpath To Point To The Cloned Repo. I Was Trying To Add Following Arg:

Hi WorriedParrot51
Take a look at the Experiment execution section:
there is script and working directory
working directory is the base of the git repository (which is cloned into the docker file)
So if for some reason trains did not properly detect the current working dir here is what should solve the issue, without changing the PYTHONPATH

script path: ./sub_folder/scripy.py working directory: .
What do you think?

4 years ago

Oh, did you try task.connect_configuration
?
https://allegro.ai/docs/examples/reporting/model_config/#using-a-configuration-file

4 years ago

0 Hi All, Is There A Way To Schedule The Tasks From The Queue Onto The Gpu Instances Based On Factors Such As Gpu Utilisation, Number Of Cpu Cores Present, Free Memory Or Custom Parameters Such As Priority Of The Task, Estimated Time Etc?

I am trying to see if the user can submit a list of resource requirements (e.g 4GPUs, 12 cores, 100GB diskspace) for the task when queuing the task and the agents pick these tasks if they have the requested resources. With this, the user need not think about which queue to send the task to. The users just state what they need and the agents do the scheduling for them.

Can I assume we are talking Kubernetes under the hood for the resource allocation ?

3 years ago

0 Hi Team, How To Configure Gerrit To Clearml?

I'm answering on the original thread

one year ago

0 Is There An Enviroument Variable That I Can Use To Set The Trains.Conf File Path?

SlipperyDove40 Yes there is
TRAINS_CONFIG_FILEhttps://allegro.ai/docs/faq/faq/#trains-configuration

4 years ago

0 How Can I Integrate Trains-Server To Aws Ec2 Api

Sure :) https://allegro.ai/docs/examples/services/aws_autoscaler/aws_autoscaler/

4 years ago

0 Hi, I Am New Here. I Was Wondering Where Can I Configure Which Machines Trains (Or Trains-Agent?) Use For Queueing Tasks, And How Do I Create Such Queues. Thanks.

SmarmySeaurchin8 just so that I don't miss anything.
One machine, two trains-agents each one connected to a different trains-server, correct ?
from the trains-agent --help
trains-agent --config-file /home/user/my_trains_server1.conf daemon trains-agent --config-file /home/user/my_trains_server2.conf daemon

4 years ago

0 Hi, Can I Use Clearml As A Tool For Deploying Models In A Private Network? Did Not Manage To Understnd From The Docs.

does the clearml server is a worker i can serve on models?

The serving is done by one of the clearml-agents.
Basically you spin an agent, then this agent is spinning the model serving engine container (fully managed).
(1) install run run clearml-agent (2) run clearml-session CLI to configure and spin the serving engine

3 years ago

0 Hi, Is It Possible To Disable Some Of The System Metrics Monitored? And Also Downsample The Rate Of Logging?

Hi JitteryCoyote63
The easiest is to inherit the ResourceMonitor class and change the default logging rate (you could also disable some of the metrics).
https://github.com/allegroai/clearml/blob/701fca9f395c05324dc6a5d8c61ba20e363190cf/clearml/task.py#L565
Then pass the new class to Task.init as auto_resource_monitoring

3 years ago

0 Just Wanted To See Is Chatgpt Correct?

No way?! Is this real chatgpt answer?

one year ago

0 Hello! Getting Credential Errors When Attempting To Pip Install Transformers From Git Repo, On A Gpu Queue.

Hi SmallDeer34
I need some help what is the difference between the manual one and the automatic one ?
from your previous log, this is the bash command executed inside the container, can you try to "step by step" try to catch who/what is messing it up ?
` docker run -it --gpus "device=1" -e CLEARML_WORKER_ID=Gandalf:gpu1 -e CLEARML_DOCKER_IMAGE=nvidia/cuda:11.4.0-devel-ubuntu18.04 -v /home/dwhitena/.git-credentials:/root/.git-credentials -v /home/dwhitena/.gitconfig:/root/.gitconfig -v /tmp/...

3 years ago

0 Hi! I Am Trying To Provide A Custom

But a warning instead of an error would be good.

Yes, that makes sense, I'll make sure we do that

Does this sound like a reasonable workflow, or is there a better way maybe?

makes total sense to me, will be part of next RC 🙂

3 years ago

0 Hi Folks, I Have A Question Related To The Storage Of Artifacts, As It Is Not Entirely Clear To Me Where To Configure It. If I Read The Documentation

when I run it on my laptop...

Then yes, you need to set the default_output_uri on Your laptop's clearml.conf (just like you set it on the k8s glue)
Make sense ?

2 years ago

0 How Come

ShinyLobster84

fatal: could not read Username for '

': terminal prompts disabled

This is the main issue, it needs git credentials to clone the repo code, containing the pipeline logic (this is the exact same behaviour as pipeline v1 execute_remotely(), which is now the default, could it be that before you executed the pipeline logic, locally ?)
WackyRabbit7 could the local/remote pipeline logic could apply in your case as well ?

3 years ago

0 I Have Another Small Technical Question, I Am Trying To See The Workers Status Programatically Using The Folowing:

So basically the APIClient is a pythonic interface to the RestAPI, so you can do the following
See if this one works
# stats from he last 60 seconds for worker in workers: print(client.workers.get_stats(worker_ids=[worker.id], from_date=int(time()-60),to_date=int(time()), interval=60, ))

3 years ago

0 How Come

I started running it again and it seems to have passed the phase where it failed last time

Yey!

Yes it is a common case....

I have the feeling ShinyLobster84 WackyRabbit7 you are not alone in this one 🙂 let me make sure we change the default value of Yes it is a common case to False, so the code looks cleaner

3 years ago

0 How Come

Is this a common case? maybe we should change the run_pipeline_steps_locally argument to False?
(The idea of run_pipeline_steps_locally=True is that it will be easier to debug the entire pipeline on the same machine)

3 years ago

0 The Overview Panel Would Be Extremely Well Suited For The Task Of Selecting A Number Of Projects For Comparing Them. Another Useful Feature Would Be To Allow Adding Information (E.G. Metrics Or Metadata) To The Tooltip. Would You Consider Adding This

Yes, I find myself trying to select "points" on the overview tab. And I find myself wanting to see more interesting info in the tooltip.

Yep that's a very good point.

The Overview panel would be extremely well suited for the task of selecting a number of projects for comparing them.

So what you are saying, this could be a way to multi select experiments for detailed comparison (i.e. selecting the "dots" on the overview graph), is this what you had in mind?

3 years ago

Show more results