It will also allow you to pass them to Hydra (either as overrides, or by directly editing the entire Hydra config)
This is what I just used:
` import os
from argparse import ArgumentParser
from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense, Softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from clearml import Task
parser = ArgumentParser()
parser.add_argument('--output-uri', type=str, required=False)
args =...
Hmm SuccessfulKoala55 what do you think?
I just cloned it from the examples that are available in the SaaS console upon account creation
Ohhh! that would explain it. Maybe it is broken there?! let me check a second
I wonder if I just need to join 2 docker-compose files to run everything in one session
Actually that could also work
But for reference, when I said IP I meant the actual host network IP, not 127.0.0.1 (which is the same as localhost)
Hi EagerOtter28
The agent knows how to do the http->ssh conversion on the fly; in your clearml.conf (on the agent's machine) set force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/42606d9247afbbd510dc93eeee966ddf34bb0312/docs/clearml.conf#L25
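For reference, this is roughly what that section of the agent's clearml.conf would look like (just a sketch, only the force_git_ssh_protocol key is taken from the linked config):
` agent {
    # convert http(s) git links to ssh on the fly
    force_git_ssh_protocol: true
} `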
Hi @<1569496075083976704:profile|SweetShells3>
Are you using the standard docker-compose ? are using the default elastic container ?
What exactly changed ?
The 'on-premise' server fails to connect to the ClearML server because of the VPN I think
I think you are correct.
You can quickly test it, try to run curl http://local-server:8008 and see if that works
Hi BoredSquirrel45
as of today, my required packages aren't being recognized in cloned
Are you saying you are editing the code directly in the cloned Task, then enqueue the Task and the agent does not "auto recognize" the packages ?
Hi @<1566959357147484160:profile|LazyCat94>
So it seems the arg parser is detecting the configuration YAML
The first thing I would suggest is changing it to a relative path (so that when launched on remote machines it will find the YAML file)
Regardless how are you launching the HPO ? are you spinning a new agent ?
(as background, argparse arguments are injected in real time by the agent, or by the HPO when running as subprocesses)
Now in case I needed to do it, can I add new parameters to cloned experiment or will these get deleted?
Adding new parameters is supported 🙂
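For example, from code it would look something like this (just a sketch, the task ID, section/parameter names and queue name are placeholders):
` from clearml import Task

# clone the original experiment
cloned = Task.clone(source_task='SOURCE_TASK_ID_HERE', name='clone with extra params')

# add / override parameters on the clone
cloned.set_parameters_as_dict({'General': {'new_param': 123}})

# send it to an agent
Task.enqueue(cloned, queue_name='default') `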
Hi AverageBee39
What's the clearml-server and clearml package version you are using ?
(It looks like some capability that is missing from the server, i.e. needs an upgrade ?!)
So without the flush I got the error apparently at the very end of the script -
Yes... it's a python thing, background threads might get killed in random order, so when something needs a background thread that was already killed you get this error, which basically means you need to do the work in the calling thread.
This actually explains why calling Flush solved the issue.
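(for anyone hitting the same thing, a minimal sketch of forcing the flush explicitly at the end of the script, project/task names are placeholders:)
` from clearml import Task

task = Task.init(project_name='examples', task_name='flush example')
# ... your code ...
# make sure all pending reports / uploads are sent before the interpreter shuts down
task.flush(wait_for_uploads=True) `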
Nice!
Do you happen to know if there are any plans for an implementation with the logger variable, so that, if needed, it would be possible to write to different tables?
CheerfulGorilla72 what do you mean by "an implementation with the logger variable" ? pytorch-lightning defaults to the TB logger, which clearml will automatically catch and log into the clearml-server. You can always add additional logs with the clearml interface, Logger.current_logger().report_???
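For example, a minimal sketch of adding an explicit scalar / table report next to whatever TB already logs (titles, series and values are placeholders):
` from clearml import Logger

logger = Logger.current_logger()
# explicit scalar report
logger.report_scalar(title='custom', series='my_metric', value=0.95, iteration=10)
# explicit table report (first row is the header)
logger.report_table(title='results', series='summary', iteration=10,
                    table_plot=[['name', 'value'], ['accuracy', 0.95]]) `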
What am I mis...
Is there a way to do this all elegantly?
Oh yes there is, this is how TaskB's code will look:
` task = Task.init(..., 'task b')
param = {'TaskA': 'TaskAs ID HERE'}
task.connect(param)
# grab TaskA's latest output model
taska_model = Task.get_task(param['TaskA']).models['output'][-1]
model = torch.load(taska_model.get_local_copy())
# ... train ...
torch.save(model, 'modelb') `I might have missed something there, but generally speaking this will let you:
Select TaskA as a parameter of TaskB's training process, and it will automagically register Task A's...
I reached over 1M API calls in about one week using clearml-serving
Oh that makes sense now 🙂
If I remember correctly, adding an additional model to a single clearml-serving instance should not actually change the number of API calls; they are mostly affected by the number of clearml-serving instances / containers, not by the number of models.
My only point is, if we have no force_git_ssh_port or force_git_ssh_user we should not touch the SSH link (i.e. less chance of us messing with the original URL if no one asked us to)
RipeGoose2 you mean to have the preview html on S3 work as expected (i.e. click on it, add credentials, open in a new tab) ?
My pleasure 🙂
Maybe we should do a webinar... I have a feeling the MLOps aspects are not as straight forward as we would like to think ...
Hi NonchalantDeer14
In multi-gpu, can you still see the logs on the local Tensorboard ?
Are you running manually or with an agent ?
Hmm that makes sense. BTW the PYTHONPATH set by the agent would be the working dir listed under the Task, but if you set agent.force_git_root_python_path the agent would also add the root of the git repo to the python path
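i.e. something like this in the agent's clearml.conf (sketch, only the force_git_root_python_path key name is the actual setting):
` agent {
    # add the git repository root to PYTHONPATH, not only the Task's working dir
    force_git_root_python_path: true
} `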
Gitlab has support for S3 based cache btw.
This might still be considered "slow" compared to local-dist/cluster mount
Would adding support for some sort of post task script help? Is something already there?
Interesting, can you expand on the use case? (currently there is only a pre-task script, for setup)
I'm checking the preview HTML and it seems like it was not uploaded...
Also, how do pipelines compare here?
Pipelines are a type of Task, so like Tasks you can clone and enqueue them, or set them as the target of the trigger.
the most flexible solution would be to have some way of triggering the execution of a script in the parent task environment,
This is the exact idea of the TriggerScheduler
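A minimal sketch of what that could look like (argument names follow the TriggerScheduler interface as far as I remember, worth double-checking against your clearml version; the project name and callback are placeholders):
` from clearml.automation import TriggerScheduler

# poll the server every few minutes for matching events
trigger = TriggerScheduler(pooling_frequency_minutes=3.0)
# call a function (in this process) whenever a task in the project completes
trigger.add_task_trigger(
    trigger_project='examples',
    trigger_on_status=['completed'],
    schedule_function=lambda task_id: print('triggered by task', task_id),
)
trigger.start() `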
What am I missing here?
RattySeagull0 I think you are correct, python 3.6 is the one installed inside the docker. Is it important to have 3.7 ? You might need another docker image (or change the installation script and install python 3.7 inside)