AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 6 months ago

Reputation

0

Badges 1

25 × Eureka!
0 I'M Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

maybe I should use explicit reporting instead of Tensorboard

It will do just the same 😞

there is no method for setting last iteration, which is used for reporting when continuing the same task. maybe I could somehow change this value for the task?

Let me double check that...

overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...

That is a very good point

but for the metrics, I explicitly pass th...
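
A toy sketch of why the curve can look odd after resuming, assuming (as discussed above) that a continued task adds its stored last iteration as an offset to every newly reported step. The function name, the "+ 1" and the numbers are made up for illustration; this is not ClearML's code.

```python
def plotted_steps(last_iteration, passed_steps):
    """Return the x-axis values that would end up in the plot.

    Assumption: every step reported after resuming is shifted by
    last_iteration + 1 (a simplified model of the offset, not the real code).
    """
    offset = last_iteration + 1
    return [offset + step for step in passed_steps]


# The run crashed after epoch 4 and is resumed from the checkpoint.
relative = plotted_steps(4, [0, 1, 2])  # loop counts from 0 again
absolute = plotted_steps(4, [5, 6, 7])  # loop passes global_step=epoch

print(relative)  # [5, 6, 7]
print(absolute)  # [10, 11, 12]
```

Under this assumption, passing the absolute epoch on top of the offset shifts the points twice, which would leave a gap in the plot.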

2 years ago
0 In Pipelinev2, Is It Possible To Register Artifacts To The Pipeline Task? I See There Is A Private Variable

If this is the case I would do:

` # Add the collector steps (i.e. the 10 Tasks)
pipe.add_task(...
    post_execute_callback=Collector.collect_me
)

pipe.start()
pipe.wait()
Collector.process_results(pipe) `
wdyt?
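
A rough sketch of what such a Collector could look like. The class is hypothetical (it is not part of ClearML), and it assumes the callback is called with (pipeline, node), where node.name is the step name and node.executed is the ID of the Task that ran. Stand-in objects are used so it runs without a server.

```python
from types import SimpleNamespace


class Collector:
    """Hypothetical helper that remembers which Task each pipeline step ran."""

    executed = {}  # step name -> executed Task ID

    @classmethod
    def collect_me(cls, pipeline, node):
        # Assumed callback signature: (pipeline, node).
        # node.name / node.executed are assumptions about the node object.
        cls.executed[node.name] = node.executed

    @classmethod
    def process_results(cls, pipeline):
        # Placeholder: a real version might fetch each Task by ID and
        # register its outputs as artifacts on the pipeline Task.
        return sorted(cls.executed.items())


# Stand-in pipeline and nodes, only to show the flow.
fake_pipe = SimpleNamespace()
for i in range(3):
    fake_node = SimpleNamespace(name=f"step_{i}", executed=f"task_id_{i}")
    Collector.collect_me(fake_pipe, fake_node)

results = Collector.process_results(fake_pipe)
print(results)
```

Keeping the state on the class means the callbacks need no extra arguments, at the cost of sharing one registry per process.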

2 years ago
0 Hi, I Am Having Difficulties When Using The Dataset Functionality. I Am Trying To Create A Dataset With The Following Simple Code:

But what I get with get_local_copy() is the following path: ...

get_local_copy() returns an immutable copy of the dataset; by definition this is not the "source" storing the data.
(Also notice that the dataset itself is stored in zip files, and when you get the "local-copy" you get the extracted files)
Make sense?
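
A small sketch of the difference between the read-only cached copy and a copy you can edit. It only relies on get_local_copy() and get_mutable_local_copy(target_folder) being available on the dataset object; FakeDataset and its paths are made-up stand-ins so the snippet runs without a server.

```python
def fetch_copies(dataset, work_dir):
    """Return (read_only_path, editable_path) for a dataset-like object."""
    # Extracted files in the local cache; treat this folder as read-only.
    read_only = dataset.get_local_copy()
    # A separate copy in a folder of your choice that is safe to modify.
    editable = dataset.get_mutable_local_copy(target_folder=work_dir)
    return read_only, editable


class FakeDataset:
    """Hypothetical stand-in for a dataset object, with made-up paths."""

    def get_local_copy(self):
        return "/cache/datasets/ds_1234"

    def get_mutable_local_copy(self, target_folder):
        return target_folder


# With a real server you would pass e.g. Dataset.get(dataset_id="<dataset id>")
read_only, editable = fetch_copies(FakeDataset(), "./my_dataset")
print(read_only, editable)
```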

3 years ago
0 Hi All—First Off, Thanks For Being Such A Helpful And Thorough Group Of People. I Learn A Ton Just Searching Through The Channel For Problems. I’M Seeing A Weird Issue. I Have A Conda Env On My Linux Machine, And I Can Successfully Run A Training Script

not sure if this is considered a bug or not! but I’d happily make an issue on github if needed.

I think we should, at least for the sake of transparency and visibility 🙂

thanks again for all your help.

My pleasure 🙂

3 years ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

MuddySquid7 you mean you are creating them with TB? Or are you uploading them as debug images?
Specifically in the ClearML UI, do you have it under the "plots" tab or the "debug samples" tab?

3 years ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

MuddySquid7 I might have found something, and this is very very odd, it seems it will not upload any new images past the history size, which is very odd considering the number of users actively using this feature...
Do you want to try a hack to see if it solves your issue?

3 years ago
0 Hi! I Need Help Debugging The Following Issue Please. I'M Training A Cnn And Plotting The Confusion Matrices For Train And Val In Each Epoch. When I Get To Epoch 101, The Ui Kind Of Breaks..It Starts Showing Me The Images For Epoch 1. When I Right Click O

oh...so is this a bug?

It was always a bug, only an elusive one 😉
Anyhow, I'll make sure we push a fix to GitHub. An RC is planned for later this week and it will contain the fix.

3 years ago
0 Any Chance Storagemanager Could Re-Download Files Only If Their Size Is Different From File In Cache (As An Option)?

any chance StorageManager could re-download files only if their size is different from file in cache (as an option)?

I think there is a force argument, to force the download.
I think the main issue is getting the size from the different backends (i.e. S3 / https / etc.)
Maybe we should add it as a GitHub feature request issue?
The main limitation is that the driver "list()" does not return file size.
For example it might be an issue with the default http files-server.
wdyt?
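
Just to make the idea concrete, a minimal sketch of the size check itself, assuming the remote size can be obtained somehow (which is exactly the hard part mentioned above). needs_download is a hypothetical helper, not a StorageManager method.

```python
import os
import tempfile


def needs_download(cached_path, remote_size):
    """Return True when the cached file is missing or its size differs.

    remote_size is the size in bytes reported by the backend, or None when
    the backend cannot report it (then we re-download to stay safe).
    """
    if remote_size is None or not os.path.isfile(cached_path):
        return True
    return os.path.getsize(cached_path) != remote_size


# Small self-check with a temporary 5-byte "cached" file.
with tempfile.TemporaryDirectory() as tmp:
    cached = os.path.join(tmp, "weights.bin")
    with open(cached, "wb") as f:
        f.write(b"12345")
    same = needs_download(cached, 5)
    changed = needs_download(cached, 9)
    unknown = needs_download(cached, None)
    missing = needs_download(os.path.join(tmp, "absent.bin"), 5)

print(same, changed, unknown, missing)  # False True True True
```

Note that equal size does not prove equal content, so this would only ever be an opt-in shortcut.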

3 years ago
0 For Remote Execution Where The Queue Has

Hmm UnevenDolphin73 I think this is the reason, None
and this means that even without a full lock file poetry can still build an environment

one year ago
0 Anyone Doing Sagemaker With Clearml - Something Like The K8S Glue But The Tasks Are Pulled Into Sagemaker Training Jobs

The AWS autoscaler will work with IAM roles as long as you have them configured on the machine itself. For SageMaker job scheduling (I'm assuming this is what you are referring to, and not the notebook) you need to select the instance as well (basically the same as EC2). What do you mean by using the k8s glue? Like inheriting and implementing the same mechanism, but for SageMaker instead of kubectl?

3 years ago
0 Hello Clearml Ppl

Hi SmoggyGoat53
What do you mean by "feature store" ? (These days the definition is quite broad, hence my question)

2 years ago
0 Hi Team, Me Again! Im Curious If Someone Can Explain To Me Better How Task And Optimisers Integrate With Each Other. In The Example Hyperparameter Optimisation, There Is Both A Task Initialised With

, is the team open to PRs from external people?

Yes please do! PRs are welcomed! I thought we fixed the GitHub readme to reflect it, anyhow I'll make sure we do 🙂

3 years ago
0 Hi, Plotting A Debug Sample With A

Thanks VirtuousFish83 !
This is great

3 years ago
0 Hi Guys. Say That We Train A Model With 10 Epoch, And Suddenly Interruption Occur On Epoch 5. How Can We Continue The By Using Clearml?

Hi AttractiveFrog67

  • Make sure you store the model's checkpoint (either pass output_uri=True in Task.init or upload it manually)
  • When you call Task.init, pass continue_last_task=True
  • Now you can do last_checkpoint=task.models["output"][-1].get_local_copy() and all you need is to load last_checkpoint
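
Put together, the steps above could look roughly like this. The project and task names are placeholders, load_last_checkpoint_path is a hypothetical helper, and loading the file into the model is left to your framework.

```python
def load_last_checkpoint_path(task):
    """Return a local path to the newest output model, or None if there is none."""
    output_models = task.models["output"]
    if not output_models:
        return None  # nothing stored yet, start from scratch
    return output_models[-1].get_local_copy()


def main():
    # Imported here so the helper above can be used without clearml installed.
    from clearml import Task

    task = Task.init(
        project_name="examples",         # placeholder name
        task_name="resumable training",  # placeholder name
        output_uri=True,                 # store checkpoints, see the first step
        continue_last_task=True,         # continue the previous run
    )
    last_checkpoint = load_last_checkpoint_path(task)
    if last_checkpoint is not None:
        # Load the weights with your framework and continue training.
        print("resuming from", last_checkpoint)
```

Call main() from the training script; on the very first run there is no output model yet, so the helper returns None and training starts from scratch.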
one year ago
0 Hi Guys! Is There A Way To Tell An Agent To Run A Task In An Existing Venv (Without Creating A New One)?

Oh if this is the case you can probably do
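` import os
import subprocess
from time import sleep

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()

# Look up the queue to poll by name
queue_ids = client.queues.get_all(name="queue_name_here")

while True:
    # Pop the next Task from the queue (if any)
    result = client.queues.get_next_task(queue=queue_ids[0].id)
    if not result or not result.entry:
        # Queue is empty, wait and poll again
        sleep(5)
        continue
    task_id = result.entry.task
    # Mark the Task as started
    client.tasks.started(task=task_id)
    # Copy the current environment for the child process
    env = dict(**os.environ)
    env['CLEARML_TASK_ID'] = ta...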

2 years ago
0 I’M Getting These Errors When Using Agent In Docker Mode

it works if I run the same command manually.

What do you mean?
Can you do:
`docker run -it <my container here> bash`
Then immediately get an interactive bash?

3 years ago
0 Hi I Saw This On The Clearml-Agent Docs But Other Than The Docker Image, I'M Not Sure How To Integrate This With Clearml Py And Clearml-Server. Please Advise.

Hi SubstantialElk6
I'm not sure what you are asking 🙂
Basically the clearml-agent will pull a Task from an execution queue and execute it (based on the definition in the Task, i.e. git repo, python packages, docker image etc.)

3 years ago