-- I've been running my script from VSCode for the first time,
In the initial Task (the one created when running inside VSCode) do you have all the packages listed in the "Installed Packages" section ?
Could it be there is a Task.init being called before this code snippet ?
RoundCat60 I'm assuming we are still talking about the S3 credentials, sadly no 😞
Are you familiar with boto and IAM roles ?
I suggest a bump in the GitHub issue
Still not supported 😞
As we can’t create keys in our AWS due to infosec requirements
Hmmm
Guys, any chance you can verify the RC solves the issue?
pip install clearml==1.0.2rc0
The problem comes from ClearML thinking it starts from iteration 420 and then adding the iteration number again (421), so it starts logging from 420+421=841
JitteryCoyote63 Is this the issue ?
So I shouldn’t even need to call the task.set_initial_iteration function
I think just removing this call should solve it. What's probably going on is that it's being called twice (once internally, once manually by your code)
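Roughly something like this (just a sketch, assuming you are using continue_last_task; the project/task names are made up) — let ClearML restore the iteration on its own:

from clearml import Task

# Continuing a previous run: ClearML restores the last iteration automatically
task = Task.init(project_name="examples", task_name="training", continue_last_task=True)

# If you also set the offset manually, it is added on top of the restored one,
# which is exactly how 420 + 421 = 841 happens:
# task.set_initial_iteration(420)  # <- remove this call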
Yes I was thinking a separate branch.
The main issue with telling git to skip submodules is that it will be easily forgotten and will break stuff. BTW the git repo itself is cached, so the second time there is no actual pull. Lastly, it's not clear where one could pass a git argument per task. Wdyt?
Hi DisturbedParrot38
You mean how to tell the agent to pull only some submodules of your git?
If this is the case you can actually remove them on your git branch; a submodule is just a file with a soft link. Wdyt?
I double checked the code it's always being passed 😞
Is it possible to launch a task from Machine C to the queue that Machine B's agent is listening to?
Yes, that's the idea
Do I have to have anything installed (aside from the trains PIP package) on Machine C to do so?
Nothing, pure magic 🙂
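For example, on Machine C (a minimal sketch; the queue/project names are just placeholders, Machine B's agent should be listening to that queue):

from trains import Task  # only the trains pip package is needed here

task = Task.init(project_name="examples", task_name="remote run")
# Stop local execution and enqueue the task; the agent on Machine B will pull it
# from whatever queue it is listening to ("default" is just an example name)
task.execute_remotely(queue_name="default")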
strange ...
Check the log to see exactly where it downloaded torch from. Just making sure it used the right repository and did not default to pip, where it might have gotten a CPU-only version...
See if this helps
Do you think this is better ? (the API documentation is coming directly from the python doc-string, so the code will always have the latest documentation)
https://github.com/allegroai/clearml/blob/c58e8a4c6a1294f8acec6ed9cba81c3b91aa2abd/clearml/datasets/dataset.py#L633
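For reference, the usage those doc-strings describe looks roughly like this (a minimal sketch; names and paths are illustrative):

from clearml import Dataset

dataset = Dataset.create(dataset_name="my_dataset", dataset_project="examples")
dataset.add_files(path="./data")   # register local files
dataset.upload()                   # upload the file contents to storage
dataset.finalize()                 # close this dataset version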
That is quite neat! You can also put a soft link from the main repo to the submodule for better visibility
Hi GleamingGrasshopper63
How well can the ML Ops component handle job queuing on a multi-GPU server?
This is fully supported 🙂
You can think of queues as a way to simplify resources for users (you can do more than that, but let's start simple)
Basically you can create a queue per type of GPU, for example a list of queues could be: on_prem_1gpu, on_prem_2gpus, ..., ec2_t4, ec2_v100
Then when you spin up the agents, you attach each agent to the "correct" queue for its machine type.
Int...
Hi DefeatedCrab47
You should be able to change the web server port, but the API port (8008) cannot be changed. If you can log in to the web app and create a project, it means everything is okay. Just make sure that when you configure trains (trains-init) the port numbers are correct 🙂
PompousBeetle71 cool, next RC will have the argparse exclusion feature :)
So a bit of explanation on how conda is supported. First, conda is not recommended; the reason is that it is very easy to create a setup with conda that conda itself cannot reproduce (yes, exactly that). So what trains-agent does is try to install all the packages it can with conda first (not one by one, because that would break conda's dependency resolution), and then install the packages that conda failed to install using pip.
Try adding this environment variable:export TRAINS_CUDA_VERSION=0
How can I turn off git diff uploading?
Sure, see here
Please send the full log, I just tested it here, and it seems to be working
Hmm what do you have here?
os.system("cat /var/log/studio/kernel_gateway.log")