AgitatedDove14
Moderator
49 Questions, 8109 Answers
  Active since 10 January 2023
  Last activity 11 months ago

Reputation

0

Badges 1

25 × Eureka!
0 Hi, Similar To Task.Set_Offline(True), Is There A Way To Simulate An Execution In An Agent? (For Testing Purposes)

It's the same but done from outside, you want the same and "offline" as well right?

2 years ago
0 Hi All

CooperativeFox72 could you expand on "not working"?
If you have a yaml file, I would do:
` # local_path = './my_config.yaml'
path = task.connect_configuration(local_path, name=name)
if task.running_locally():
    with open(local_path, "r") as config_file:
        my_params_dict = yaml.load(config_file, Loader=yaml.FullLoader)
    my_params_dict['change_me'] = 'new value'
    my_params_text = yaml.dump(my_params_dict)

    # store back the change, my_params assumed to be the content of the param file (tex...

3 years ago
0 Unrelated Problem (Or Is It?) The Clearml'S Built In Cleanup Service Fails

Very odd, I still can't reproduce. This is just the cleanup service running without anything else ?
What's the clearml version it is using ?

3 years ago
0 Hi All, I'Ve Successfully Run A Task Locally, And Now I'M Trying To Clone It And Send It To A Queue. It Looks Like The Environment Is Built Successfully, But It Hangs Here:

Nope - confirmed to be running on the OS's Python environment,

okay so bare metal root is definitely not recommended.
I'm not sure how/why it gets stuck though 😞
Any chance you can run the agent as non-root?
Also maybe preferred in docker mode, so it is easier for you to control the environment of the Task

7 months ago
0 Hi, I Am Wondering Why Do I Need To Create Files Before Applying Diff ?

Thanks DefeatedOstrich93
Let me check if I can reproduce it.

4 years ago
0 Hello, In The Following Context:

task.wait_for_status()
task.reload()
task.artifacts["output"].get()
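
The three calls above can be wrapped in a small helper. This is only a minimal sketch: `fetch_artifact` is a hypothetical name, and it assumes `task` is a `clearml.Task` (for example one returned by `Task.get_task(task_id=...)`) whose artifact was registered under the name "output".

```python
def fetch_artifact(task, name="output"):
    """Wait for the task to finish, refresh it, then return the artifact payload."""
    # Block until the task reaches the status wait_for_status() waits for
    task.wait_for_status()
    # Refresh the local object so artifacts registered during the run are visible
    task.reload()
    # Download and deserialize the artifact stored under `name`
    return task.artifacts[name].get()
```

The reload step matters because the local Task object is a snapshot; without it the artifacts dictionary may still reflect the state from before the run finished.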

4 years ago
0 Hey Community! I Have A Question Regarding The Optuna Optimizer With Clearml. I'M Using A Config Yaml File That I'M Connecting Via

Well, it should work out of the box as long as you have the full route, i.e. Section/param
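
To make the "full route" idea concrete, here is a rough sketch that flattens a nested configuration into Section/param names. `full_routes` and the `General` section are made up for illustration; the assumption is that strings of this form are what you would give the optimizer as parameter names.

```python
def full_routes(config, prefix=""):
    """Flatten a nested dict into {'Section/param': value} pairs."""
    routes = {}
    for key, value in config.items():
        # Build the route by joining the parent path and the current key with '/'
        path = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-sections so every leaf gets its full route
            routes.update(full_routes(value, path))
        else:
            routes[path] = value
    return routes


if __name__ == "__main__":
    example = {"General": {"lr": 0.01, "batch_size": 32}}
    for route, value in full_routes(example).items():
        print(route, value)
```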

one year ago
0 Correct Way To Configure Ssh Authentication For Git In Agent With Docker Mode

and I run agent from local user and I would expect that settings to have effect -v /home/localuser/.ssh:/home/testuser/.ssh

It does not map it directly. It creates a temp copy of the entire ".ssh" folder in the host /tmp folder, then maps this copy inside the container:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/clearml_agent/commands/worker.py#L3422
Notice that the "docker_internal_mounts" section is nested inside the "agent" section ...

2 years ago
0 Hi, I Am Considering Different Plans Clearml Offers And It Would Be Great If Somebody Could Confirm If My Understanding Is Correct. So For Now What I Understood Is: All Of The Information Related To Tasks Like Artifacts, Scalars And Plots Are By Default U

Hi UpsetWalrus59
All correct, with the exception of "...or 1GB Metric": this is a limit, since metrics (and metadata) are always stored on the clearml-server, so they are metered. There is also an API limit, basically anti-abuse, which of course resets every month, but if you are running tens of experiments at the same time you will hit this limit. Make sense?

one year ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

why doesn't this happen on my other experiments?

same 100+ reports ?
(My new theory is that calling Task.reload() will fix it, and it might be called internally for the other experiments, like when reporting models/artifacts)
Could that be the case ?

3 years ago
0 Hey, I Want To Use The Aws Autoscaler With Spot Instances, And I Was Wondering How (Or If) You Handle Interruptions. What We Currently Implemented Is A Mechanism That On Spot Failure Reruns The Training With A Flag, And Our Code Knows To Search For The La

Are there any services OOB like this?

On the open-source, I can't recall any but will probably be easy to write. Paid tier might have an offering though, not sure 🙂

3 years ago
0 Hi, Similar To Task.Set_Offline(True), Is There A Way To Simulate An Execution In An Agent? (For Testing Purposes)

Because it lives behind a VPN and github workers don't have access to it

makes sense
If this is the case, I have to admit that combining offline-mode and remote execution makes sense, no?

2 years ago
0 Hi All, There Is A Way To Get From A Task-Object The Experiment Source Code? In Other Words, Assume I Have Access To A Specific Trains Server And Want To Store From A Particular Task The Experiment Source Code In A Temp File. There Is A Convenient Way To

SpotlessFish46
1. Yes, you can access the entire code in the uncommitted changes; you can test it with:
`task = Task.get_task(task_id='aabb')
task_dict = task.export_task()`
2. Correct, but then if you need the entire code base you need to clone the repo and apply the uncommitted changes. Basically trains-agent does that when executed with build:
`trains-agent build --id aabb --target ~/my_task_env`
3. See (2)
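
As a rough sketch of storing the code in a temp file: the helper below takes the dictionary returned by `task.export_task()` and writes the uncommitted changes to disk. `save_uncommitted_diff` is a hypothetical name, and the `script` / `diff` key layout is my assumption about the exported dictionary, so check it against your own export first.

```python
import tempfile


def save_uncommitted_diff(task_dict):
    """Write the uncommitted-changes diff from an exported task dict to a temp file."""
    # Assumed layout: task_dict["script"]["diff"] holds the uncommitted changes
    diff = (task_dict.get("script") or {}).get("diff") or ""
    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile("w", suffix=".diff", delete=False) as handle:
        handle.write(diff)
        return handle.name
```

Usage would be along the lines of `path = save_uncommitted_diff(task.export_task())`; for a script that lives outside a git repository the diff should then hold the whole file.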

4 years ago
0 Hey :wave: *Tensorboard Logs Overwhelming Elasticsearch* I am running a clear ml server, however when running experiments with tensorboard logging I am seeing the elastic indexing time increase drastically and in some cases I have also seen timeout erro

... training script was set to upload every epoch. Seems like this resulted in a torrent of metrics being uploaded.

oh that makes sense, so basically you were bombarding the server with requests, and ending with kind of denial of service
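
One way to avoid that kind of flood is to rate-limit reporting on the client side. The sketch below is not what the poster did, just an illustration: `ThrottledReporter` is a hypothetical wrapper, and `send` stands for whatever function actually uploads a metric.

```python
import time


class ThrottledReporter:
    """Forward at most one report per `min_interval_sec` and drop the rest."""

    def __init__(self, send, min_interval_sec=30.0, clock=time.monotonic):
        self._send = send                      # callable(name, value, step) doing the upload
        self._min_interval = min_interval_sec  # minimum seconds between two uploads
        self._clock = clock                    # injectable clock, handy for testing
        self._last_sent = None

    def report(self, name, value, step):
        """Return True if the report was forwarded, False if it was dropped."""
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self._min_interval:
            return False  # too soon since the last upload, skip this one
        self._send(name, value, step)
        self._last_sent = now
        return True
```

Dropping intermediate points loses resolution, so an alternative is to average the skipped values and send the mean with the next upload.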

one year ago
0 Hi

Hi PoisedElephant79

it's giving me no module name clearml

I'm assuming there is no clearml inside one of the pipeline components? Make sure you have all the imports inside the pipeline component function.
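
A minimal sketch of what "imports inside the function" looks like. `normalize` is a made-up component; in a real pipeline it would carry the pipeline component decorator, which is left out here so the snippet runs without clearml installed. Any `import clearml` or `from clearml import Task` the component needs would go in the same place as the import below.

```python
def normalize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    # Imports live inside the function body, so they are still available when
    # only this function is shipped to and executed on another machine
    import statistics

    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if not spread:
        # All values are identical, nothing to scale
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]
```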

one year ago
0 Hi, Similar To Task.Set_Offline(True), Is There A Way To Simulate An Execution In An Agent? (For Testing Purposes)

So there is a hack for it:
CLEARML_OFFLINE_MODE=1 python3 my_main.py
Which is the same as calling Task.set_offline
Then inside the code, after the Task.init call:
` task = Task.init(...)

# not sure what the if here is?!

Task.debug_simulate_remote_task(task_id="offline-1") `
This will make things act as if this is running remotely, i.e. your Task.running_remotely() logic will be called.
Do notice that in remote mode, all the arguments / data is read from the clearml-server into the cod...

2 years ago
0 Hi, I Have A Question About Clearml-Data. Clearml-Data Probably Does Well On Data Versioning, But When It Comes To Actual Loading Of Data, Are There Examples Of How It Can Make Use Of Advanced Features Such That Those In

tf datasets is able to handle batch downloading quite well.

SubstantialElk6 I was not aware of that, I was under the impression tf dataset is accessed on a file level, no?

3 years ago
0 I Have A Questions About Queue Priorities With Clearml-Agent. I Have Two Queues,

Yes, albeit not actually "intercept", as the user will be able to put Tasks directly in queues B_machine_a/B_machine_b, but any time the user pushes Tasks into queue B, this service will pull them and push them to the individual machines' queues.
what do you think?

3 years ago
0 Hi, Similar To Task.Set_Offline(True), Is There A Way To Simulate An Execution In An Agent? (For Testing Purposes)

ClearML maintains a github action that sets up a dummy clearml-server,

You have one, it's the http://app.clear.ml (not a dummy one, but for this purpose it will work)
thoughts ?

2 years ago
0 Hi There,

OK, no, it only helps as long as I don't log the figure.

You mean that if you create the matplotlib figure with no automagic connect, you still see the mem leak?

one year ago
0 Hi, I'M Trying To Clone And Queue Experiments For Running Them On My Workers. I Am Able To Successfully Clone And Queue The Task, But Seems Like The Task Does Not Pass The Correct Parameters To My Python Script On The Worker. We Use Hydra For Configuring

I just cloned it from the examples that are available in the SaaS console upon account creation

Ohhh! that would explain it. Maybe it is broken there?! let me check a second

3 years ago
0 Hi All, Playing Around With Hp Optimisation, And I Notice In The Hyperparameteroptimizer Class Itself, The

Correct, which makes sense if you have a stochastic process and you are looking for the best model snapshot. That said I guess the default use case would be min/max (and not the global variant)
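
To illustrate the difference, here is a small sketch of how I understand the two flavours to score a task: the plain min/max variants look at the last reported value, the global variants at the best value reported at any iteration. `objective_value` is a hypothetical helper, not the optimizer's own code.

```python
def objective_value(reported, sign):
    """Return the value a task would be scored on under the given objective sign."""
    if not reported:
        raise ValueError("no values reported")
    if sign in ("min", "max"):
        # Plain variants: score the task on its last reported value
        return reported[-1]
    if sign == "min_global":
        # Global variant: best (lowest) value seen at any iteration
        return min(reported)
    if sign == "max_global":
        # Global variant: best (highest) value seen at any iteration
        return max(reported)
    raise ValueError(f"unknown sign: {sign}")


if __name__ == "__main__":
    losses = [0.9, 0.4, 0.6]  # a noisy run where the best snapshot is not the last one
    print(objective_value(losses, "min"), objective_value(losses, "min_global"))
```

With a noisy run like the one above, the global variant picks up the best snapshot even though training ended on a worse value.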

4 years ago
0 Hi All, I Was Trying To Use Clearml-Task To Run A Custom Docker(With Poetry To Install All The Python Dependencies And Activated The Environment) Using Clearml Gpu, But It Seems Like Clearml Always Create A Virtual Environment And Run The Python Script Fr

but it is still not able to run any task after I abort and rerun another task

When you "run" a task you are pushing it to a queue, so how come a queue is empty? what happens after you push your newly cloned task to the queue ?

one year ago