VivaciousWalrus99
Yes, this is odd: 1608392232071 spectralab:gpu0 DEBUG New python executable in /cs/usr/gal.hyams/.trains/venvs-builds/3.7/bin/python2
So it thinks it has Python 3.7, but it is using Python 2 in the venv...
In your trains.conf file, set agent.python_binary to the python3.7 binary. It should be something like: agent.python_binary=/path/to/python/python3.7
Let me check, it was supposed to be automatically aborted
Are these experiments logged too (with the train-valid curves, etc)?
Yes, every run is logged as a new experiment (with its own set of HP). Do notice that the execution itself is done by the trains-agent. Meaning the HP process creates experiments with a new set of HP and puts them into the execution queue, then trains-agent pulls them from the queue and starts executing them. You can have multiple trains-agent instances on as many machines as you like, with specific GPUs etc. for each one ...
Hmm, so the way the configuration works is: it loads the default configuration (equivalent to the example in the docs), then it adds the ~/clearml.conf on top. That means you can tell your users to just copy-paste the credentials from the UI into a template you make. How is that?
Thank you AttractiveWoodpecker16 !
Removing the uncommitted changes so that you can launch it from an agent? Or is it visual only?
I would like to be able to send a request to unload the model (because I cannot load all the models in GPU memory, only 7-8) ...
Hmm, is this part of the gRPC interface of Triton? If it is, we should be able to add that quite easily.
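For reference, Triton's model repository extension does expose explicit load/unload calls, so in principle this is doable. A minimal sketch of the HTTP variant (assuming the default port 8000, that Triton runs with explicit model control mode, and a placeholder model name):

import requests

# ask a running Triton server to unload a model via the model repository extension
# "my_model" and localhost:8000 are placeholders
resp = requests.post("http://localhost:8000/v2/repository/models/my_model/unload")
resp.raise_for_status()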
Well, it should work. Make sure the Task "holds" all the information needed (under the Execution tab): repo / uncommitted changes / python packages etc.
Then configure your agent (choose pip/conda/poetry as the package manager), and spin it up (by default in venv/conda mode, or in docker mode)
Should work 🙂
TRAINS_WORKER_NAME=first_agent trains-agent --gpus 0
and TRAINS_WORKER_NAME=second_agent trains-agent --gpus 0
What is the proper way to change a clearml.conf ?
Inside a container you can mount an external clearml.conf, or override everything with OS environment variables:
https://clear.ml/docs/latest/docs/configs/env_vars#server-connection
Hi WickedBee96
How can I do that?
clearml-task
https://clear.ml/docs/latest/docs/apps/clearml_task#what-is-clearml-task-for
I know this way to run it in the agent only by enqueuing the draft after running it on my local machine, so is there another way?
Or maybe you are looking for task.execute_remotely?
https://clear.ml/docs/latest/docs/references/sdk/task#execute_remotely
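Something along these lines (a minimal sketch; project, task and queue names are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")

# stop the local run here and enqueue this Task for an agent to execute
task.execute_remotely(queue_name="default", exit_process=True)

# everything below this point only runs on the agent
print("running on the agent")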
Hi UnsightlySeagull42
does anyone know how this works with git ssh credentials?
These will be taken from the host ~/.ssh folder
Hi AstonishingWorm64
Is this the same ?
https://github.com/allegroai/clearml-serving/issues/1
(I think it was fixed on the later branch, we are releasing 0.3.2 later today with a fix)
Can you try: pip install git+
MoodyCentipede68 could it be that the model is on one account (workspace) and your credentials (the ones provided to the docker compose) are from another workspace?
The error itself points to the triton helper failing to get the model ID from the backend. The models are uploaded to a specific workspace, and it looks like a mismatch (i.e. the model ID is nowhere to be found). wdyt?
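One quick way to verify would be to try resolving the model ID with the same credentials you gave the docker compose (a rough sketch; the model ID is a placeholder, and I'm assuming the clearml Model class here):

from clearml import Model

# if this fails or returns nothing, the model ID is not visible from this workspace,
# which would explain the triton helper not finding it
m = Model(model_id="<model_id_from_the_error>")
print(m.name)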
Would you have an example of this in your code blogs to demonstrate this utilisation?
Yes! I definitely think this is important, and hopefully we will see something there 🙂 (or at least in the docs)
However, that would mean passing back the hostname to the Autoscaler class.
Sorry, my bad, the agent does that automatically in real-time when it starts; no need to pass the hostname, it takes it from the VM (usually they have some random number/id)
The import process actually creates a new Task on every import. That said, if you take a look here:
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/task.py#L1733
you can pass a pre-existing Task ID to "import_task" https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/task.py#L1653
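Roughly like this (a sketch only, going by the linked source; the task IDs are placeholders):

from trains import Task

# export an existing task into a plain dict
exported = Task.export_task(task="<source_task_id>")

# import it into a pre-existing Task instead of creating a new one
Task.import_task(task_data=exported, target_task="<existing_task_id>", update=True)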
Hi @<1674588542971416576:profile|SmarmyGorilla62>
You mean on your elastic / mongo local disk storage ?
Hi @<1523706645840924672:profile|VirtuousFish83>
Hello, is it possible to disable lazy loading ?
You mean in the UI for loading the console ?
The logs can be huge, 10s and 100s of MB...
We have the same issue for hyperparameters even with only ~100 keys,
100+ parameters, that is quite a lot.
So are you saying the search in the UI only filters the lazily loaded elements and not the entire param list?
Hi LazyTurkey38
What do you mean the git repo is not recognized? When execute_remotely leaves, you should see on the Task a reference to the git repo with the exact commit ID you have locally pulled. Do you see it under the Execution tab?
It should preserve the order, as the order of the update back (i.e. when executed by the agent) is the same as the order of the keys (obviously py3.7+, because it creates a dict, not an OrderedDict)
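i.e. something like this (a minimal sketch; parameter names are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="ordered params")

# a plain dict keeps insertion order on py3.7+, so the agent writes values back in the same key order
params = {"learning_rate": 0.001, "batch_size": 32, "epochs": 10}
params = task.connect(params)
print(params)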
I think you are correct 😞 Let me make sure we add that (docstring and documentation)
Maybe we should do that automatically ? wdyt?
UpsetTurkey67 are you saying there is a symlink in the original repository, and when it copies it, it breaks the symlink?
Hi GiddyTurkey39 ,
When you say trains agent, are you referring to the trains agent command ...
I mean running the trains-agent daemon
on a machine. This means you have a daemon pulling jobs from the execution queue and executing them (either in virtual environment, or inside a docker)
You can read more here: https://github.com/allegroai/trains-agent and here: https://allegro.ai/docs/concepts_arch/concepts_arch/
Is it sufficient to queue the experiments
Yes there is no ne...
Switching to process Pool might be a bit of an overkill here (I think)
wdyt?
Can you put the task.connect line here? (btw: I would assume there is no need for an additional connect if using hydra+fire, no?)
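(For context, an explicit connect usually looks something like the sketch below; the names are placeholders. With hydra the composed config is normally picked up automatically, which is why the extra call may be redundant.)

from clearml import Task

task = Task.init(project_name="examples", task_name="hydra run")

# explicitly connecting an arguments dict; with hydra auto-logging this may not be needed at all
args = {"lr": 0.01, "epochs": 5}
args = task.connect(args)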