GrievingTurkey78

34 Questions, 125 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

119 × Eureka!

0 Votes

6 Answers

959 Views

Hi! I Have Some Agents On Gcp. Lately I Have Been Getting Some Experiments That Simply Stop Running (No Signs That The Experiment Crashed). Here Is A Plot That Shows The Resource Monitoring. Any Ideas On What Could Be Causing This?

Hi! I have some agents on GCP. Lately I have been getting some experiments that simply stop running (no signs that the experiment crashed). Here is a plot th...

clearml

3 years ago

0 Votes

8 Answers

818 Views

Hello

Hello 👋 I am using a self-hosted ClearML setup with the project's requirements file. When I run the task it fails and I get: Collecting torch==2.0...

clearml

one year ago

0 Votes

2 Answers

885 Views

Hi! Regarding The

Hi! Regarding the artifact.get_local_copy() method, since there is no way to specify the path where the artifact will be downloaded, I wanted to confirm that...

clearml

4 years ago

0 Votes

21 Answers

958 Views

Hi! Any Idea Why Clearml Fails To Detect Iteration Reporting?

Hi! Any idea why clearml fails to detect iteration reporting? ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-fr...

clearml

3 years ago

0 I Am Also Experiencing A Weird Behaviour When Running A Script Using The Module Flag. For Example I Run:

I’ll show you what I have through PM!

4 years ago

0 Hi! Is There A Way To Run A Task Without Reporting To The Server? For Example If I Want To Debug A Script By Running It Locally Without It Appearing On The Server

AgitatedDove14 task.set_archived(True) + the cleanup service should do it 👌 If we run in debug mode the experiment goes directly to the archive and gets cleaned and we don’t pollute the main experiment page.

3 years ago

0 Hi! I Am Getting The Following Error On An Agent:

I have the agent configured to force install requirements.txt

2 years ago

0 Hi, Is There A Way To Force The Requirements.Txt? I Have A Package I Installed Directly From Github But The Version Is Always Wrong. Any Other Way To Do This?

Yes Martin! I have a package installed from GitHub, but it’s using the PyPI version.

3 years ago

0 Hi! I Am Currently Using Hydra+Clearml And Wanted To Know If There Are Still Some Updates Coming. At The Moment, If I Change The Defaults Hydra Uses From The

AgitatedDove14 Thanks! I’ll give it a try! Makes sense 👌

3 years ago

0 Hi! I Am Having Some Problems With A Loss After A Good Amount Of Training, What Would Be The Best Way To Log A Value To Have A Better Idea Of What Is Happening?

Awesome AgitatedDove14 Thanks a lot 🙌

2 years ago

0 Hi! Is There Something Happening With The

Any idea why this could happen?

3 years ago

0 Hi

SuccessfulKoala55 on both 8080 and 8008 I get: Safari can’t open the page http://<External IP>:80XX because Safari can’t establish a secure connection to the server http://<External IP>:80XX.

4 years ago

0 Hi! I Am Using The Modelcheckpoint Callback From Tensorflow To Save The Best Model. When The Experiment Finishes If I Go On The Server To Experiment > Artifacts > Output Model I Can See The Model And Subsequently By Clicking On It The Weights. How Can I

On the server through the command line?

3 years ago

0 Hi! I Am Saving Some Intermediate

So I would have to disconnect PyTorch, and then upload the model at the end?

3 years ago

0 Hi! I Am Having Some Problems With A Loss After A Good Amount Of Training, What Would Be The Best Way To Log A Value To Have A Better Idea Of What Is Happening?

AgitatedDove14 Well, I have a loss function which is something like:
class MyLoss(...):
    def forward(...):
        weights = self.compute_weights(...)
        return (weights * (target - preds)).mean()

There seems to be a problem on certain batches when computing the weights. What would be the best way to log the batch that causes the problem, along with the weights being computed?

2 years ago

0 Hi, I Was Getting A Really Weird Error Due To Mismatch On The Versions Between The Installed Libraries In My Environment And The Ones Ran In The Node (I Manually Changed The Installed Packages And Everything Worked). How Can I Force Trains To Use Exactly

That’s really cool! But I would still prefer to avoid using pip_freeze. Is there a way?

4 years ago

Makes sense! Then where would I have to add output_uri to save the weights?

3 years ago

0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

Yes AgitatedDove14, I added the git user name and password to the trains.conf file. In the Results tab of the UI, the clone command in the logs shows the SSH command instead of the HTTPS one:
Repository cloning failed: Command ['clone', 'git@gitlab.com: ...

4 years ago

TimelyPenguin76 I found out it’s just one package that is causing the error (cloudpickle breaks everything). Is there a way to use Pigar but force a single package to have a specific version?

4 years ago

0 Hi All! Is There A Way For Trains To Recognize The Cli Arguments When Using

Sure, I’ll share it through a private message!

4 years ago

0 Hi! Is There Something Happening With The

It works perfectly! AgitatedDove14 There is something weird on my side 😢

3 years ago

0 Hi! Is There Something Happening With The

Hey AgitatedDove14 does this work for you?
` from argparse import ArgumentParser
from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint

import tensorflow as tf
from clearml import Task

class Linear(tf.keras.Model):
    def __init__(self, in_shape=(784,), num_classes=10):
        super().__init__()
        self.l...

3 years ago

0 Hi! Is There Something Happening With The

AgitatedDove14 it’s on the checkpoint

3 years ago

Works like a charm 👌 thanks!

4 years ago

0 Hi! I Am Getting The Following Error On An Agent:

It is the latest RC, I get the following:
` Executing Conda: /opt/conda/bin/conda install -p /home/ramon/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults 'pip<20.2' --quiet --json
Pass
Trying pip install: /home/ramon/.clearml/venvs-builds/3.8/task_repository/my-rep.git/requirements.txt
Executing Conda: /opt/conda/bin/conda install -p /home/ramon/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults numpy==1.20.3 --quiet --json
Pass
Warning, could not locate PyTorch to...

2 years ago

0 I Am Also Experiencing A Weird Behaviour When Running A Script Using The Module Flag. For Example I Run:

Is this caused by running the script with the arguments?

4 years ago

0 Hi! I Am Trying To Download Data From Gs Using

AgitatedDove14 update here! Something like this should work:
from trains import StorageManager
from trains.storage.helper import StorageHelper

bucket = 'gs://bucket'
helper = StorageHelper.get(bucket)
remote_files = helper.list('folder')
for f in remote_files:
    StorageManager.get_local_copy(bucket + "/" + f)

The * gives [] results since the list method uses startswith, which treats it as a literal string and not as a wildcard.
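The wildcard behaviour described in this post can be illustrated without any storage backend. The sketch below is plain Python, not the trains implementation: the object names and the helpers `list_by_prefix` and `list_by_glob` are hypothetical, and the only assumption taken from the post is that listing compares names with `str.startswith`.

```python
from fnmatch import fnmatch

# Hypothetical object names in a bucket (illustration only).
OBJECTS = ["folder/a.csv", "folder/b.csv", "other/c.csv"]


def list_by_prefix(prefix):
    """Prefix listing: '*' is compared literally, like any other character."""
    return [name for name in OBJECTS if name.startswith(prefix)]


def list_by_glob(pattern):
    """Glob listing: '*' is expanded as a wildcard by fnmatch."""
    return [name for name in OBJECTS if fnmatch(name, pattern)]


# No object name contains a literal '*', so the prefix match finds nothing.
prefix_with_star = list_by_prefix("folder/*")
# A plain prefix matches every object under that folder.
prefix_plain = list_by_prefix("folder")
# Glob matching treats '*' as a wildcard and finds the same objects.
glob_with_star = list_by_glob("folder/*")
```

Under that assumption, passing a plain folder prefix and iterating over the results, as in the snippet in the post, sidesteps the need for a wildcard.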

4 years ago

For option 2 do I have to configure it on all agents or on the server?

3 years ago

0 Hi! Is There Something Happening With The

Thanks Martin! I’ll keep checking 👌

3 years ago

0 Hi! Is There A Way To Run A Task Without Reporting To The Server? For Example If I Want To Debug A Script By Running It Locally Without It Appearing On The Server

Thanks Martin! 🙌

3 years ago

👌 Great

3 years ago

0 Hi! I Am Getting The Following Error On An Agent:

Let me double check!

2 years ago

0 Hi! Is There A Way To Run A Task Without Reporting To The Server? For Example If I Want To Debug A Script By Running It Locally Without It Appearing On The Server

AgitatedDove14 Downloading a dataset would not be possible using this, right? I want to be able to access the data and just avoid reporting the experiment results.

3 years ago

0 Hi! Is There Something Happening With The

Thanks AgitatedDove14 🙌

3 years ago
