GrievingTurkey78

34 Questions, 125 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

119 × Eureka!


0 Votes 2 Answers 940 Views

Hi! What Would Be The Way For Manually Uploading A Model? I Have Intermediate

Hi! What would be the way for manually uploading a model? I have intermediate .pt files which I don't want to upload. Is there a way to turn off clearml capt...

clearml

3 years ago

0 Votes 5 Answers 933 Views

Hi, With The Upcoming Version Of Hydra It Seems The Binding Breaks. Specifically In The

Hi, with the upcoming version of Hydra it seems the binding breaks. Specifically in the run_job function the argument order changed from https://github.com/f...

clearml

3 years ago

0 Votes 2 Answers 937 Views

Hi AgitatedDove14 ! Regarding the Hydra integration, which pattern should be used? Call the task inside the decorated function? Will this store the parameter...

clearml

3 years ago

0 Votes 7 Answers 1K Views

Hi! I Am Currently Using Hydra+Clearml And Wanted To Know If There Are Still Some Updates Coming. At The Moment, If I Change The Defaults Hydra Uses From The

Hi! I am currently using Hydra+ClearML and wanted to know if there are still some updates coming. At the moment, if I change the defaults hydra uses from the...

clearml

3 years ago


0 Hi! I Recently Updated My Server And My Clearml Version, Now When I Set A Task To Be Executed Remotely Its Default State Is Aborted Hence I Have To Reset And Enqueue, Is There Something I Am Doing Wrong (I Am Using Hydra Too)?

Thanks SuccessfulKoala55 !

3 years ago

0 Hi, I Was Getting A Really Weird Error Due To Mismatch On The Versions Between The Installed Libraries In My Environment And The Ones Ran In The Node (I Manually Changed The Installed Packages And Everything Worked). How Can I Force Trains To Use Exactly

Using detect_with_pip_freeze: true runs into a "package version not found" error for some of the packages I have locally.

4 years ago

0 Hi

Thanks!

3 years ago

Pigar is capturing different versions than the ones I have installed on my local machine (not a problem except for one). I just want to force the version of that package so that I don’t have to manually change it from the UI for every experiment.

4 years ago

0 Hi! I Have The Previous Trains Server Configured With Multiple Experiments; I Created It Using The Gcloud Images Provided. If I Want To Update The Server To The Newest Clearml Version Should I Follow These Steps

Thanks AgitatedDove14

3 years ago

0 Hi! I Am Having Some Problems With A Loss After A Good Amount Of Training, What Would Be The Best Way To Log A Value To Have A Better Idea Of What Is Happening?

AgitatedDove14 Well I have a loss function which is something like:
class MyLoss(...):
    def forward(...):
        weights = self.compute_weights(...)
        return (weights * (target - preds)).mean()

There seems to be a problem on a certain batch when computing the weights. What would be the best way to log the batch that causes the problem, along with the weights being computed?

2 years ago

0 Hi! I Am Getting The Following Error On An Agent:

It is the latest RC, I get the following:
` Executing Conda: /opt/conda/bin/conda install -p /home/ramon/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults 'pip<20.2' --quiet --json
Pass
Trying pip install: /home/ramon/.clearml/venvs-builds/3.8/task_repository/my-rep.git/requirements.txt
Executing Conda: /opt/conda/bin/conda install -p /home/ramon/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults numpy==1.20.3 --quiet --json
Pass
Warning, could not locate PyTorch to...

2 years ago

TimelyPenguin76 I found out it's just one package that is causing the error ( cloudpickle breaks everything). Is there a way to use Pigar but force a single package to have a specific version?

4 years ago

0 Hi! I Was Taking A Look At The

There are also ways to override the parameters as stated https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html#use-of-command-line-arguments .

3 years ago

0 Hi! I Am Trying To Download Data From Gs Using

Thanks AgitatedDove14 !

4 years ago

0 Hi! I Am Getting The Following Error On An Agent:

Not yet AgitatedDove14 , does the agent use by default the python version the command is run with? I installed conda and tried using package_manager.type=conda but then get an error:
clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'

2 years ago

0 Hi! I Have Some Agents On Gcp. Lately I Have Been Getting Some Experiments That Simply Stop Running (No Signs That The Experiment Crashed). Here Is A Plot That Shows The Resource Monitoring. Any Ideas On What Could Be Causing This?

I am using pytorch_lightning , I'll try to create a snippet I can share! Thanks 🙌

3 years ago

0 Hi! I Am Trying To Download Data From Gs Using

AgitatedDove14 update here! Something like this should work:
from trains import StorageManager
from trains.storage.helper import StorageHelper

bucket = 'gs://bucket'
helper = StorageHelper.get(bucket)
remote_files = helper.list('folder')
for f in remote_files:
    StorageManager.get_local_copy(bucket + "/" + f)

The * gives [] results since the list method uses startswith, which treats it as a literal string and not as a wildcard.
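The difference between prefix matching and wildcard matching can be illustrated with the standard library alone. This is only a sketch: the file names and the two helper functions are hypothetical and do not reproduce the actual trains implementation. It just shows why a literal `*` in a startswith-style filter returns nothing.

```python
from fnmatch import fnmatch

# Hypothetical object names, standing in for what a bucket listing might return.
remote_files = ["folder/a.csv", "folder/b.csv", "other/c.csv"]


def list_by_prefix(names, prefix):
    """Prefix filter: the prefix is compared literally, so '*' is just a character."""
    return [n for n in names if n.startswith(prefix)]


def list_by_glob(names, pattern):
    """Glob filter: '*' is expanded as a wildcard."""
    return [n for n in names if fnmatch(n, pattern)]


# No object name literally starts with "folder/*", so the result is empty.
prefix_with_star = list_by_prefix(remote_files, "folder/*")

# Dropping the '*' makes the prefix filter behave as intended.
prefix_plain = list_by_prefix(remote_files, "folder")

# A glob-aware filter accepts the same pattern that the prefix filter rejects.
glob_with_star = list_by_glob(remote_files, "folder/*")
```

So with a prefix-based listing, passing the folder name without the * is the simplest workaround.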

4 years ago

0 Hi! Is There A Way To Run A Task Without Reporting To The Server? For Example If I Want To Debug A Script By Running It Locally Without It Appearing On The Server

I feel it’s easier not to report than to clean up afterwards, but please correct me if I am overthinking it. I’ll check if I could wrap the code in something that calls Task.delete when debugging.

3 years ago

Hey CostlyOstrich36 ! I am using clearml==1.1.2 and clearml-agent==1.1.0 . Stopped is not the right word, more like frozen, it just froze at an epoch. The console on the agent shows epoch 33 first batch and the one at the server epoch 32 last batch. The experiment was running for ~6 hours.

3 years ago

0 Hi, Is There A Way To Force The Requirements.Txt? I Have A Package I Installed Directly From Github But The Version Is Always Wrong. Any Other Way To Do This?

Yes Martin! I have a package installed from GitHub, but it's using the PyPI version.

3 years ago

0 Hi! I Am Saving Some Intermediate

So I would have to disconnect pytorch? And then upload the model at the end

3 years ago

0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

AgitatedDove14 I filed an issue on the fire repo asking them to point us to the argument parsing method: https://github.com/google/python-fire/issues/291

4 years ago

0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

Yes AgitatedDove14, I added the git user name and password to the trains.conf file. On the results tab of the UI, the clone command in the logs shows the SSH command instead of the HTTPS one:
Repository cloning failed: Command ['clone', 'git@gitlab.com: ...

4 years ago

0 Hi! Regarding The

Thanks for the info AgitatedDove14 !

4 years ago

0 Hi! If I Have A Pipeline On Gitlab That Uses Clearml For Some Tests Is There Some Way To Setup The Credentials So That It Doesn’T Fail?

Thanks SuccessfulKoala55 !

3 years ago

0 Hi All! Currently I Am Trying To Create A Tool That Can Perform Certain Operations On Dataset Ids, This Is A Skeleton Of What I Have In Mind (Based On The Examples):

Thanks AgitatedDove14

3 years ago

0 Hello! There Is Great Alternative For Argparse Developed By Facebook For Ml Named

Best thing ever, thanks AgitatedDove14 !

4 years ago

0 Hello! There Is Great Alternative For Argparse Developed By Facebook For Ml Named

AgitatedDove14 from this thread I understand hydra is not supported and therefore overriding the parameters from the UI won't work, but is there still a way to track and add the parameters to the experiment? Will task.connect_configuration work with the yaml files?

4 years ago

0 Hello

Yes, I configured it that way 👌 Thanks! I'll use the flag!

one year ago

0 Hello

Managed to get:

clearml_agent: ERROR: Command '['/home/ramon/.clearml/venvs-builds/3.9/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/var/tmp/requirements_tb0x2i3j.txt', '--extra-index-url', '

 died with <Signals.SIGKILL: 9>.

while building the task with the id on the agent

one year ago

0 Hello

It is failing exactly when the download finishes. Not sure if it matters, but in ~/.clearml/pip-download-cache only an empty cu120 folder appears. Should the torch wheel be saved there?

one year ago

0 Hello

Sure! For torch I have:

torch==2.0.1
    # via
    #   monai
    #   pytorch-lightning
    #   torchio
    #   torchmetrics

one year ago

0 Hello

CostlyOstrich36 Thanks for the help! It ended up being a mistake on my side. I misconfigured the VM's memory and it had only 3.75 GB, so it failed when installing torch.

one year ago

0 Hello

What additional context do you need?

one year ago
