Hi GreasyRaven35
You should set the output_uri in Task.init, it will auto-upload the model and register the remote location URL:
task = Task.init(..., output_uri=True)
You can also specify a target bucket, if you have configured credentials (e.g. output_uri="s3://bucket")
BoredHedgehog47
is this ( https://clearml.slack.com/archives/CTK20V944/p1665426268897429?thread_ts=1665422655.799449&cid=CTK20V944 ) the same issue (or solution)?
LazyTurkey38
The last part makes sense. Not sure I get the "if clone": we are calling execute_remotely, so I'm assuming we do not need to clone ourselves, but send the current Task.
Other than that yes, makes sense (BTW, assuming you have upgraded the server to >=1.0 you can just call mark_stopped, no need to reset)
See if this helps
Hi FierceHamster54
I'm sure this is solvable, get in touch with them either via the contact form on the website or by emailing support@clear.ml , it should not be complicated to fix 🙂
Oh that is odd... let me check something
I think this all ties into the non-standard git repo definition. I cannot find any other reason for it. Is it actually stuck for 5 min at the end of the process, waiting for the repo detection?
task.models["outputs"][-1].tags
(plural, a list of strings) and yes I mean the UI 🙂
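A quick sketch of checking those tags programmatically (the task ID is a placeholder, and the "outputs" key follows the snippet above):
from clearml import Task

# Fetch an existing task and inspect the tags of its latest output model
task = Task.get_task(task_id="<your-task-id>")  # placeholder task ID
print(task.models["outputs"][-1].tags)  # a list of strings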
I get the n_saved part.
What's missing for me is how you would tell the TrainsLogger/Trains that the current one is the best. Or are we assuming the last saved model is always the best? (in that case there is no need for a tag, you just take the last in the list)
If we are going with: "I'm only saving the model if it is better than the previous checkpoint", then just always use the same file name, i.e. " http:/...
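As a minimal sketch of that "same file name" approach (PyTorch assumed, and the metric here is a stand-in for a real validation score):
import torch
import torch.nn as nn
from clearml import Task

task = Task.init(project_name="examples", task_name="best-checkpoint", output_uri=True)
model = nn.Linear(4, 2)

best = float("inf")
for epoch in range(3):
    metric = 1.0 / (epoch + 1)  # stand-in for a real validation metric
    if metric < best:
        best = metric
        # Same file name every time -> ClearML keeps updating a single "best" output model
        torch.save(model.state_dict(), "best_model.pt")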
CostlyOstrich36 did you manage to reproduce it?
I tried conda w/ python3.9 on a clean Windows VM, and it worked as expected...
Okay, some progress, so what is the difference ?
Any chance the issue can be reproduced with a small toy code?
Can you run the tqdm loop inside the code that exhibits the CR issue? (maybe some initialization thing is causing it to ignore the value?!)
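Something like this minimal loop, dropped into the same process (a sketch):
from time import sleep
from tqdm import tqdm

# If this bar renders correctly here but not in your code, the problem is
# likely some earlier initialization rather than tqdm itself
for _ in tqdm(range(100)):
    sleep(0.01)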
DefiantHippopotamus88
HTTPConnectionPool(host='localhost', port=8081):
This will not work, because inside the container of the second docker-compose "fileserver" is not defined:
CLEARML_FILES_HOST="..."
You have two options:
1. Configure the docker-compose to use the host network on all containers (as opposed to the isolated mode they are running in now).
2. Configure all of the CLEARML_* variables to point to the host IP address (e.g. 192.168.1.55), then rerun the entire thing.
No, I mean actually compare using the UI, maybe the arguments are different or the "installed packages"
DefeatedOstrich93 many thanks I was able to reproduce it (basically newly added files caused git apply to fail)
Fix will be part of the next clearml-agent RC
Thanks DefeatedOstrich93
Let me check if I can reproduce it.
PungentLouse55 hmmm
Do you have an idea on how we could quickly reproduce it?
Hi @<1541954607595393024:profile|BattyCrocodile47>
I do have the SSH key placed at /root/.ssh/id_rsa on the machine,
Notice that the .ssh folder is mounted from the host (EC2 / GCP) into the container,
'-v', '/tmp/clearml_agent.ssh.cbvchse1:/.ssh'
This is odd; why is it mounting it to /.ssh and not /root/.ssh?
I do not think this is the upload timeout; it makes no sense to me for the GCP package (we do not pass any timeout, it's their internal default for the argument) to include a 60 sec timeout for upload...
I'm also not sure where the timeout originates (I'm assuming the initial GCP handshake connection could not actually time out, as the response should be relatively quick, so 60 sec is more than enough)
Isn't that risky? Not knowing you need a package?
How do you actually install it on the remote machine with the agent?
But I'm sure there is a cleaner way to proceed.
Maybe?!
path = task.get_output_destination().replace('file://', '', 1)
Hi ObnoxiousStork61
but unfortunately I can't fetch them from my local computer,
is this intended?
By default ClearML will only log the weights files.
It can also automatically upload them, if you pass a destination for storage at Task.init.
For example, to store on the files server:
Task.init(..., output_uri=True)
To store on S3 (sub-folders will be created automatically based on the Task ID):
Task.init(..., output_uri='s3://bucket')
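A minimal sketch of the auto-upload behavior (PyTorch assumed here; project/task names are placeholders):
import torch
import torch.nn as nn
from clearml import Task

# With output_uri set, weights files saved by the framework are both logged
# and uploaded automatically
task = Task.init(project_name="examples", task_name="auto-upload-demo", output_uri=True)
model = nn.Linear(8, 2)
torch.save(model.state_dict(), "model.pt")  # captured and uploaded by ClearML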
Hi SmugLizard24
The question is what is the reason of the issue?
That is a good question, could it be out of memory? (trying to compress or send the file in one chunk?)
The idea of queues is, on the one hand, not to let users have too much freedom, and on the other, to allow for maximum flexibility & control.
The granularity offered by K8s (and as you specified) is sometimes way too detailed for a user. For example: I know I want 4 GPUs, but 100GB disk-space? No idea, just give me 3 levels to choose from (if any; actually I would prefer a default that is large enough, since this is by definition for temp cache only). The same argument goes for the number of CPUs...
Ch...
TrickySheep9 Yes, let's do that!
How do you PR a change?
Hi MagnificentSeaurchin79
Could you test with the tensorflow toy example?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/tensorflow/tensorboard_toy.py
I want to inject a bash command after the repo has been cloned (and maybe even after the venv has been installed).
LazyTurkey38 the created venv inherits from the system environment, so in theory you can do all the installation on the system python and the created venv will just inherit the packages, no?
(btw: just to clarify, there is only one entry point for the custom bash script and that is before everything, so users can configure the container before the agent starts)
Can you verify it fixes the timeout issue as well? (or some insight on how to reproduce the issue?)
Hi ZippySheep23
Any ideas what might be happening?
I think you passed the upload limit (2.36 GB) 🙂