Hi @<1547028031053238272:profile|MassiveGoldfish6>
What is the use case? The gist is that you want each component running on a different machine, and you want ClearML to do the routing of data and logic between them.
How would that work in your use case?
LudicrousParrot69
Yes, please add to GitHub 🙂 The problem is, if this is on a single Task then we lose the nice interactive abilities (selecting diff scalars / parameters) etc...
... grab the model artifacts for each, put them into the parent HPO model as its artifacts, and then go through and archive everything.
Nice. Wouldn't it make more sense to "store" a link to the "winning" experiment, so you know how to reproduce it, and the set of HP that were chosen?
Not that the model is bad, but how would I know how to reproduce it, or retrain it when I have more data, etc.?
RoundMosquito25 actually you can 🙂
# check the state every minute
while an_optimizer.wait(timeout=1.0):
    running_tasks = an_optimizer.get_active_experiments()
    for task in running_tasks:
        task.get_last_scalar_metrics()
        # do something here
Baseline reference:
https://github.com/allegroai/clearml/blob/f5700728837188d7d6005726c581c9d74fd91164/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L127
LudicrousParrot69 ,
Are you trying to parse the attached table post-execution, and then put it into a CSV on the HPO Task?
I see now, give me a minute I'll check
LudicrousParrot69 I would advise the following:
- Put all the experiments in a new project
- Filter based on the HPO tag, and sort the experiments based on the metric we are optimizing (see adding custom columns to the experiment table)
- Select + archive the experiments that are not used
BTW: I think someone already suggested we do the auto-archiving inside the HPO process itself. Thoughts?
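A rough programmatic sketch of that cleanup, in case you want to script it. The project name, the "opt" tag, and the "validation"/"loss" scalar names below are placeholders (not values from this thread), and it assumes the optimized metric should be minimized:
from clearml import Task

# fetch the HPO child experiments (project name and tag are assumptions)
tasks = [
    t for t in Task.get_tasks(project_name="HPO project")
    if "opt" in (t.get_tags() or [])
]

def objective(t):
    # last reported value of the optimized scalar (title/series are assumptions)
    metrics = t.get_last_scalar_metrics()
    return metrics.get("validation", {}).get("loss", {}).get("last", float("inf"))

# sort ascending (minimization), keep the best run, archive the rest
for t in sorted(tasks, key=objective)[1:]:
    t.set_system_tags((t.get_system_tags() or []) + ["archived"])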
LudicrousParrot69 we are working on adding nested projects, which should help with the humongous mass of experiments the HPO can create. This is a more generic solution to the nesting issue (since nesting inside a table is probably not the best UX solution 🙂 )
Doesn't solve the issue if an HPO run is going to take a few days
The HPO Task has a table of the top performing experiments, so when you go to the "Plot" tab you get a summary of all the runs, with the Task ID of the top performing one.
No need to run through the details of the entire experiments, just look at the summary on the HPO Task.
Are tagging / archiving available in the API for a task?
Everything that the UI can do you can do programmatically 🙂
Tags:
task.add_tags / set_tags / get_tags
Archive:
task.set_system_tags(task.get_system_tags() + ['archived'])
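Putting those together, a minimal sketch for doing it on an existing task fetched by ID (the task ID and tag value are placeholders):
from clearml import Task

task = Task.get_task(task_id="<task-id>")   # fetch an existing task by its ID
task.add_tags(["hpo"])                       # add a user tag, filterable in the UI
task.set_system_tags((task.get_system_tags() or []) + ["archived"])  # archive it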
Hi LudicrousParrot69
I guess you are right, this is not a trivial distinction:
min: means we are looking for the minimum value of a specific scalar. Meaning 1.0, 0.5, 1.3 -> the optimizer will get these direct values and will optimize based on that
global min: means the optimizer is getting the minimum value so far of the specific scalar. With the same example: 1.0, 0.5, 1.3 -> the HPO optimizer gets 1.0, 0.5, 0.5
The same holds for max/global_max , make sense ?
Correct, which makes sense if you have a stochastic process and you are looking for the best model snapshot. That said I guess the default use case would be min/max (and not the global variant)
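To make the distinction concrete, a small plain-Python sketch of what the optimizer would see for the 1.0, 0.5, 1.3 example:
values = [1.0, 0.5, 1.3]

# "min": the optimizer sees each reported value as-is
min_objective = list(values)            # -> [1.0, 0.5, 1.3]

# "global min": the optimizer sees the best (lowest) value reported so far
global_min_objective = []
best = float("inf")
for v in values:
    best = min(best, v)
    global_min_objective.append(best)   # -> [1.0, 0.5, 0.5]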
I can see that the data is reloaded each time, even if the machine was not shut down in between.
You can verify by looking into the Task's Log, it will contain all the docker arguments, one of them should be the cache folder mount
Thanks SmallDeer34 !
This is exactly what I needed
I have a wrapper over Task to ensure S3 usage, tags, version number etc., and so that the project name can be skipped and picked up from an env var
Cool. Notice that when you clone the Task and the agents executes it, the project is already defined, so this env variable is meaningless, no ?
(BTW: draft means they are in edit mode, i.e. before execution, then they should be queued (i.e. pending) then running then completed)
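For reference, such a wrapper might look roughly like the sketch below. MY_CLEARML_PROJECT / MY_CLEARML_OUTPUT_URI and the bucket path are hypothetical names, not real ClearML variables; and as noted above, once the Task is cloned and executed by an agent the project is already set, so the env var only matters on the first (local) run.
import os
from clearml import Task

def init_task(task_name, **kwargs):
    # hypothetical wrapper: project name and output bucket come from env vars
    return Task.init(
        project_name=os.environ.get("MY_CLEARML_PROJECT", "default-project"),
        task_name=task_name,
        output_uri=os.environ.get("MY_CLEARML_OUTPUT_URI", "s3://my-bucket/clearml"),
        **kwargs,
    )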
Go to the Workers & Queues page, right side panel, 3rd icon from the top
Looking at the supervisor method of the base AutoScaler class, where are the worker IDs kept? Is it in the class attribute queues?
Actually the supervisor is passing a fixed prefix, then it asks the clearml-server for the workers whose names start with this prefix.
This way we can have a fixed init script for all agents, while we still can differentiate them from the other agent instances in the system. Make sense ?
ShaggyHare67 are you saying the problem is that trains fails to discover the packages in the manual execution?
Ohh, two options:
From the script itself you can do:
from clearml import Task

task = Task.init(...)
task.execute_remotely(queue='default')
Then run the script locally; it will run until the execute_remotely call, quit the process, and re-launch it on the "default" queue.
Option B:
Use the clearml-task CLI:
$ clearml-task --folder <where the script is> --project ...
See https://github.com/allegroai/clearml/blob/master/docs/clearml-task.md#launching-a-job-from-a-local-script
Hi RattySeagull0
I'm trying to execute trains-agent in docker mode with conda as package manager, is it supported?
It should, that said we really do not recommend using conda as the package manager (it is a lot slower than pip, and can create an environment that will be very hard to reproduce due to conda's internal "compatibility matrix", which might change from one conda version to another)
"trains_agent: ERROR: ERROR: package manager "conda" selected, but 'conda' executable...
The log is missing, but the Kedro logger is printing to sys.stdout in my local terminal.
I think the issue might be that it starts a new subprocess, and that subprocess is not "patched" to capture the console output.
That said if an agent is running the entire pipeline, then everything is logged from the outside, so whatever is written to stdout/stderr is captured.
Hmm... any idea on what's different with this one ?
Hmmm:
WOOT WOOT we broke the record! Objective reached 17.071016994817196
WOOT WOOT we broke the record! Objective reached 17.14302934610711
These two seems strange, let me look into it
Found it, definitely a bug in the callback; it has no effect on the HPO process itself
Bugs, definitely GitHub, this is the easiest to track.
Documentation, if these are small issues, Slack is fine, otherwise, GitHub issue.
Regarding the documentation, we are working on another iteration of improvement, but if you find inaccuracies/broken links please report 🙂
Hi MistakenDragonfly51
I'm trying to set default_output_uri in
This should be set either on your client side, or on the worker machine (running the clearml-agent).
Make sense ?
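On the client side this usually means the clearml.conf setting sdk.development.default_output_uri. A per-task equivalent, as a minimal sketch (the project/task names and S3 path are placeholders), is to pass output_uri directly:
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="output-uri-demo",
    output_uri="s3://my-bucket/clearml",   # acts as the default_output_uri for this task
)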
In that case, no, the helm chart does not spin up a default agent (you should however spin up a services-mode agent for running pipeline logic)
Hi JitteryCoyote63
If you want to stop the Task, click Abort (Reset will not stop the task or restart it, it will just clear the outputs and let you edit the Task itself).
I think we witnessed something like that due to DataLoader multiprocessing issues, and I think the solution was to add multiprocessing_context='forkserver' to the DataLoader:
https://github.com/allegroai/clearml/issues/207#issuecomment-702422291
Could you verify?
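Concretely, the suggested workaround would look roughly like this (the dummy dataset, batch size, and worker count are placeholders):
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))  # dummy data
loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=4,
    multiprocessing_context='forkserver',  # workaround suggested in the linked issue
)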
AttributeError: 'NoneType' object has no attribute 'base_url'
can you print the model object?
(I think the error is a bit cryptic, but generally it might be that the model is missing an actual URL link?)
print(model.id, model.name, model.url)