AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 What Could Be The Reason For My Package To Not Be Loading Under The "Installed Packages"? I Have A

So if everything works you should see "my_package" package in the "installed packages"
the assumption is that if you do:
pip install "my_package"
It will set "pandas" as one of its dependencies, and pip will automatically pull pandas as well.
That way we do not list the entire venv you are running on, just the packages/versions you are using, and we let pip sort the dependencies when installing with the agent
Make sense ?
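
If you want to check what pip would pull in for a package, you can look at the dependencies it declares. A minimal sketch using the standard library's importlib.metadata; the helper name declared_dependencies is only for illustration, and "my_package" is the same placeholder as above:

```python
from importlib import metadata


def declared_dependencies(package_name):
    """Return the requirement strings an installed package declares.

    Returns an empty list if the package is not installed or declares nothing.
    """
    try:
        requirements = metadata.requires(package_name)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment
        return []
    # requires() returns None when the package declares no dependencies
    return list(requirements or [])


if __name__ == "__main__":
    # "my_package" is a placeholder name; if it depends on pandas,
    # a "pandas..." entry should be printed here
    for requirement in declared_dependencies("my_package"):
        print(requirement)
```

If pandas is missing from that list, pip has no way of knowing it should install it together with the package.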

3 years ago

0 What Could Be The Reason For Fail Status Of A Task That Seems To Have Completed Correctly? No Information In The Log Whatsoever

Hmm should not make a diff.
Could you verify it still doesn't work with TF 2.4 ?

3 years ago

0 What Could Be The Reason For Fail Status Of A Task That Seems To Have Completed Correctly? No Information In The Log Whatsoever

YEY!

3 years ago

0 Hi Guys. Say That We Train A Model With 10 Epoch, And Suddenly Interruption Occur On Epoch 5. How Can We Continue The By Using Clearml?

Then running by using the

, am I right?

yep

I have put the

--save-period

while running Yolov5 and ClearML does not save the weight per epoch that I have trained. Why is this happened?

But do you still see it in the clearml UI ? do you see the models logged in the clearml UI ?

one year ago

0 Just Getting Started With Clearml, Any Recommended Videos On How To Get A Sample Project Up? I Am Using The One On Their Youtube Channel Right Now But I Am A Bit Confused As How To Use The Demoapp

Ohh you mean ~/clearml.conf ?

3 years ago

0 Hi Team, How To Configure Gerrit Details In Clearml So That Tasks Or Pipeline Will Be Executed Depends On Gerrit?

Hi @<1542316991337992192:profile|AverageMoth57>
Not sure I follow, what do you have in mind regarding the Gerrit integration?
Sounds interesting ...
wdyt?

one year ago

0 Hi Team, How To Configure Gerrit Details In Clearml So That Tasks Or Pipeline Will Be Executed Depends On Gerrit?

ssh: Could not resolve hostname

: Name or service not known

@<1542316991337992192:profile|AverageMoth57> so is this the main issue? this seems unrelated to the Gerrit thing, just missing configuration of the .ssh on the agent machine, is that correct?

one year ago

0 Hi Everyone, Quick Question: Is The Self Hosted Version Free For Big Teams Or The Pricing Shown On The Website Refers Also To The Self-Hosted Case?

Hi @<1552101458927685632:profile|FreshGoldfish34>
self-hosted, you mean the open source ? if so, then yes totally free 🙂
That said, I would recommend keeping the server inside your VPN, just in case, from a security perspective

one year ago

0 Hi, I'M Trying To Reproduce The Pipeline Example

BTW: there is a full Pipeline class that does everything for you, example here:
https://github.com/allegroai/clearml/tree/master/examples/pipeline

3 years ago

0 Hi, I'M Trying To Reproduce The Pipeline Example

Just run once (from your python console / pycharm etc.):
https://github.com/allegroai/clearml/blob/master/examples/automation/toy_base_task.py

3 years ago

0 Hi, I'M Trying To Reproduce The Pipeline Example

https://allegro.ai/clearml/docs/docs/examples/pipeline/pipeline_controller.html

3 years ago

0 Hello Again, How Can I Use The

Hi AgitatedTurtle16
You can find documentation here:
https://github.com/allegroai/clearml-session
Basically it uses the clearml-agents to launch a session on one of the machines in the cluster.
In the remote session itself it installs jupyterlab + vscode-server, then it connects to the remote session (running on the agent's machine) automatically over ssh and creates a tunnel to these services.

3 years ago

0 Hello Again, How Can I Use The

Hi JumpyDragonfly13

I don't know why I'm getting

172.17.0.2

I think it (the remote jupyter Task) fails to get the correct IP address of the server.
You can manually correct it by going to the DevOps project, looking for the running Task there, then under Configuration/Properties changing external_address to the actual IP 10.19.20.15
Once that is done, re-run the clearml-session , it will suggest to connect to the running session, it should work....

BTW:
I'd like...

3 years ago

0 Hello Again, How Can I Use The

Basically run the agent in virtual environment mode.
JumpyDragonfly13 try this one (notice no --docker flag):
clearml-agent daemon --queue interactive --create-queue
Then from the "laptop" try to get a remote session with:
clearml-session

3 years ago

0 Is There A Way To Interface With Clearml Agent (Cli?) To Handle Model Repositories And Data Versioning (But So, Not Experimentation, Tight Integration, Pipelining, Etc)?

UnevenDolphin73 FYI: clearml-data is documented , unfortunately only in GitHub:
https://github.com/allegroai/clearml/blob/master/docs/datasets.md

3 years ago

0 When I Run Experiments I Set

Hi IntriguedRat44
Sorry, I missed this message...
I'm assuming you are running in manual mode (i.e. not through the agent), in that case we do not change the CUDA_VISIBLE_DEVICES.
What do you see in the resource monitoring? Is it a single GPU or multiple GPUs?
(Check the :monitor:gpu in the Scalar tab under results)
Also what's the Trains/ClearML version you are using and the OS ?

3 years ago

0 When I Run Experiments I Set

IntriguedRat44 If the monitoring only shows a single GPU (the selected one) it means it reads the correct CUDA_VISIBLE_DEVICES (this is how it knows that you are only using a selected GPU not all of them).
There is nothing else in the code that will change the OS environment.
Could you print os.environ['CUDA_VISIBLE_DEVICES'] while running the code to verify ?
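
Something along these lines should do, a minimal sketch (the helper name visible_gpus is just for illustration). It uses os.environ.get so it does not raise a KeyError when the variable is not set at all:

```python
import os


def visible_gpus():
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if it is not set."""
    value = os.environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        # Variable not set: CUDA does not restrict which GPUs are visible
        return None
    # The value is a comma separated list of ids, e.g. "0,2"
    return [gpu.strip() for gpu in value.split(",") if gpu.strip()]


print("CUDA_VISIBLE_DEVICES:", visible_gpus())
```

Put the print right before the training starts, so you see the value the process actually runs with.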

3 years ago

0 When I Run Experiments I Set

IntriguedRat44 could I ask you to open a GitHub issue on it?
I really do not want it to slip through our fingers...
(BTW: meanwhile I was not able to reproduce it, what's the OS / nvidia drivers you are using?)

3 years ago

0 When I Run Experiments I Set

IntriguedRat44 how do I reproduce it ?
Can you confirm that commenting out the Task.init(..) call will fix it ?

3 years ago

0 I'M Experiencing Some Weird Behavior From The Automatic Logging Iterations. It Seem To Be Capped At The Number Of Batches Rather Than The Epochs. How Can I Control Which Variable The Logging Mechanism Tracks?

Interesting... TrickyRaccoon92 could it be the validation phase was creating a new Tensorboard file ?

3 years ago

0 When I Run Experiments I Set

Thanks IntriguedRat44 !
I'll follow up on GitHub 🙂

3 years ago

0 Hi I Wanted To Use Method Task.Reset() Or Task.Delete() However None Of That Seems To Be Able To Delete

Hi @<1523707131994312704:profile|CrabbyKoala94>

I wanted to use method Task.reset() or Task.delete() however none of that seems to be able to delete

only

the logs in the "console" section in the UI.

So Task.reset will reset the entire outputs of the Task (and the status), as you noticed. Why would you want to just remove the logs?
You can disable the auto logs altogether if you really want to, see Task.init [auto_connect_streams](https://github.com/allegroai/cl...

one year ago

LOL, no worries 🙂

3 years ago

0 Hi I Wanted To Use Method Task.Reset() Or Task.Delete() However None Of That Seems To Be Able To Delete

I want to be able to delete only the logs since they are taking a lot of space in my case.

I see... I do not think this is possible 😞
You can disable the auto logging though ... pass auto_connect_streams=False to Task.init
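
A minimal sketch of what that looks like, assuming clearml is installed and configured; the project/task names and the two helper functions are placeholders for illustration only:

```python
def build_init_kwargs(project_name, task_name):
    """Arguments for Task.init with the console (stdout/stderr/logging) capture turned off."""
    return {
        "project_name": project_name,
        "task_name": task_name,
        # Do not capture the console output into the Task
        "auto_connect_streams": False,
    }


def start_task_without_console_capture(project_name, task_name):
    # Imported here so the helper above can be used without clearml installed
    from clearml import Task

    return Task.init(**build_init_kwargs(project_name, task_name))


# Usage (needs a configured clearml.conf):
# task = start_task_without_console_capture("examples", "no console logs")
```

This only affects new runs, it does not remove console logs already stored for existing Tasks.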

one year ago

0 Hi All, I'M Trying To Create A Task In A Jupyter Notebook, And I Always Get This Warning:

SmugDog62 so on plain vanilla Jupyter/lab everything seems to work.
What do you think is different in your setup ?

3 years ago

0 I'M Trying To Spin Up A Task On An Agent And Inside The Task I Have Two Packages That I'Ve Created Custom Versions Of And Specified A Git Repo For In The Requirements.Txt. Example With Hydra-Core And Omegaconf:

@<1545216070686609408:profile|EnthusiasticCow4>
git+ssh:// will be converted automatically to git+https if you have user/pass configured in your clearml.conf on the agent machine.
Moreover, git packages are always installed after all other packages are installed (because pip cannot resolve the requirements inside the git repo in time)

one year ago

Could you also provide the full log?

one year ago

0 Hi, In My Setup I Run Multiple Experiments In Parallel From The Same Script. I Understand That There Can Only Be One Execution

Are these experiments logged too (with the train-valid curves, etc)?

Yes, every run is logged as a new experiment (with its own set of HP). Do notice that the execution itself is done by the "trains-agent". Meaning the HP process creates experiments with a new set of HP and puts them into the execution queue, then trains-agent pulls them from the queue and starts executing them. You can have multiple trains-agent on as many machines as you like with specific GPUs etc. each one ...
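
To make the clone-and-enqueue flow concrete, here is a rough sketch, not the actual HP optimizer code. It uses the clearml package (the current name of trains); the template task id, the queue name, the parameter names and both helper functions are assumptions for illustration:

```python
from itertools import product


def expand_grid(grid):
    """Turn {"name": [v1, v2], ...} into a list of {"name": value, ...} combinations."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def enqueue_experiments(template_task_id, grid, queue_name="default"):
    """Clone a template experiment once per HP combination and enqueue each clone."""
    # Imported here so expand_grid can be used without clearml installed
    from clearml import Task

    template = Task.get_task(task_id=template_task_id)
    for index, overrides in enumerate(expand_grid(grid)):
        cloned = Task.clone(source_task=template, name=f"{template.name} #{index}")
        # Read the cloned parameters, override the selected ones, write them back
        params = cloned.get_parameters()
        params.update(overrides)
        cloned.set_parameters(params)
        # An agent listening on this queue will pull the Task and execute it
        Task.enqueue(cloned, queue_name=queue_name)


# Usage (hypothetical task id and parameter names):
# enqueue_experiments("<template-task-id>", {"General/learning_rate": [0.1, 0.01]})
```

Each clone shows up as its own experiment, and whichever agent is free pulls the next one from the queue.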

3 years ago

0 Hi, In My Setup I Run Multiple Experiments In Parallel From The Same Script. I Understand That There Can Only Be One Execution

Well that depends on how you think about the automation. If you are running your experiments manually (i.e. you specifically call/execute them), then at the beginning of each experiment (or function) call Task.init and when you are done call Task.close . This can be done in parallel if you are running them from separate processes.
If you want to automate the process, you can start using the trains-agent which could help you spin those experiments on as many machines as you l...
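
For the manual option, a minimal sketch of one process per experiment, each with its own Task.init / close pair. It assumes the clearml package (the current name of trains) is installed and configured; the project name, experiment names, parameter and helper functions are placeholders:

```python
from multiprocessing import Process


def run_experiment(name, learning_rate):
    """Runs inside its own process, so it gets its own Task."""
    from clearml import Task

    task = Task.init(project_name="examples", task_name=name)
    task.connect({"learning_rate": learning_rate})
    # ... training code goes here ...
    task.close()


def build_processes(experiments):
    """Create (but do not start) one process per (name, learning_rate) tuple."""
    return [Process(target=run_experiment, args=args) for args in experiments]


def run_all(experiments):
    """Start all experiments in parallel and wait for them to finish."""
    processes = build_processes(experiments)
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return [process.exitcode for process in processes]


# Usage (needs a configured clearml.conf):
# if __name__ == "__main__":
#     run_all([("exp-a", 0.1), ("exp-b", 0.01)])
```

Keep the run_all call under the `if __name__ == "__main__":` guard, otherwise platforms that spawn new interpreters for child processes will re-run it in every child.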

3 years ago

0 Hi, In My Setup I Run Multiple Experiments In Parallel From The Same Script. I Understand That There Can Only Be One Execution

Hi SourSwallow36
What do you mean by Log each experiment separately ? How would you differentiate between them?

3 years ago
