JitteryCoyote63
Moderator
213 Questions, 1020 Answers
  Active since 10 January 2023
  Last activity 25 days ago

Reputation

0

Badges 1

978 × Eureka!
0 Votes
2 Answers
620 Views
Hi, a small bug (not really a bug) in the autoscaler: I have p3.2xlarge instances that take a long time to shut down. With polling_interval_time_min=1, the a...
2 years ago
0 Votes
2 Answers
588 Views
Hi, Is it still true that --services-mode only supports docker mode?
3 years ago
0 Votes
19 Answers
574 Views
I guess one experiment is running backwards in time 😄
2 years ago
0 Votes
9 Answers
579 Views
Hi, I want to upgrade clearml server from 1.1 to 1.2 (self hosted). I have the following setup: /dev/nvme0n1p1 30G 21G 8.9G 70% / <- This is where /opt/clear...
2 years ago
0 Votes
12 Answers
692 Views
Hi there! Is there an easy way to retrieve the site-package directory that was created by an agent from inside a task? Eg. task = Task.init(...) task.add_req...
one year ago
0 Votes
0 Answers
698 Views
Hi all, Would it be possible to make the aws autoscaler log each scale in/out operation in the console to help debugging/understanding the course of events?
3 years ago
0 Votes
1 Answers
576 Views
Hi, would it be possible to parse torch requirement when it’s part of the extras_require dict? In my code, I have the following: train_task._update_requireme...
2 years ago
0 Votes
2 Answers
604 Views
Looks like trains-agent 0.16 doesn't support --install-globally documented parameter -> Only available for trains-agent build command. Would it be possible t...
3 years ago
0 Votes
6 Answers
581 Views
Hi there, is it possible to configure the clearml-agent to run some commands before running each experiment it launches? Eg. echo "test" > "test.txt" && <-- ...
2 years ago
0 Votes
10 Answers
712 Views
Hey, what is the exact difference between agent.package_manager.system_site_packages and trains-agent --install-globally ?
3 years ago
0 Votes
6 Answers
667 Views
Hi, I am using the aws autoscaler and getting the following error while trying to spin up spot instances: 2021-08-16 17:18:48 Spinning new instance type=v100...
2 years ago
0 Votes
30 Answers
608 Views
Hi, is it possible to pass environment variables to agents created by the AWS AutoScaler service?
3 years ago
0 Votes
1 Answers
587 Views
Hi, is there a way to update the setup shell script via the SDK?
one year ago
0 Votes
2 Answers
602 Views
Hi, is it possible to start a clearml-agent (not in docker mode) on a machine with a gpu, but enforce the clearml-agent to not “see” the gpu? So that the exp...
2 years ago
0 Votes
7 Answers
671 Views
Hi, I deleted all archived experiments in a project and I just realized all experiments of all projects were deleted (clearml server v1.0.0) 🤔
3 years ago
0 Votes
18 Answers
543 Views
Hi Guys, several times now I have had the following errors popping up in agents while executing a task: trains_agent: ERROR: Failed applying git diff: I attached the ...
3 years ago
0 Votes
4 Answers
619 Views
2 years ago
0 Votes
3 Answers
554 Views
Hi, I have several long running experiments failing with Process failed, exit code -9 and no other error with clearml 1.0.4 and clearml-agent 1.0.0, what cou...
2 years ago
0 Votes
10 Answers
509 Views
Hi, just want to report a small bug in the clearml dashboard: after queuing an experiment, if I change the experiment queue, then go back to the experiment I...
2 years ago
0 Votes
29 Answers
589 Views
Hi, although https://github.com/allegroai/clearml/issues/181 is resolved, clearml-agent (0.17.2) still logs tqdm iterations as different lines, is there some...
3 years ago
0 Votes
3 Answers
559 Views
Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? Or should I make sure only the first process calls it first, then the others?
2 years ago
0 Votes
27 Answers
590 Views
Hi, similar to Task.set_offline(True), is there a way to simulate an execution in an agent? (for testing purposes)
one year ago
0 Votes
11 Answers
600 Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
2 years ago
0 Votes
4 Answers
541 Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
3 years ago
0 Votes
2 Answers
539 Views
Hi, how can I search for an old experiment based on its commit hash?
one year ago
0 Votes
23 Answers
609 Views
Hi, I started a trains-agent (0.15) in services mode (full command: trains-agent daemon --services-mode --detached --queue services --create-queue --docker u...
3 years ago
0 Votes
1 Answers
577 Views
3 years ago
0 Votes
3 Answers
680 Views
Hi, I am considering making automated backups of my clearml-server using Amazon EBS snapshots. Should I be concerned with the same problem described here > h...
3 years ago
0 Votes
2 Answers
544 Views
Hi, in the AWS AutoScaler, I am getting the following warning: Warning! exception occurred: APIError: code 400/1004: Worker is not registered: worker=aws:A10...
3 years ago
0 Votes
11 Answers
574 Views
Hi, some properties of the Task object are not listed in the documentation (such as task.parent, which is not clear whether it is the parent task object itse...
3 years ago
0 Hi, Is There A Way To Stop A Clearml-Agent From Within An Experiment? Or Block It To Prevent It Running Any Other Task?

The simple workaround I imagined (not tested) at the moment is to sleep 2 minutes after closing the task, to keep the clearml-agent busy until the instance is shut down:
self.clearml_task.mark_stopped()
self.clearml_task.close()
time.sleep(120)  # Prevent the agent from picking up new tasks

2 years ago
0 Hi, What Happens Exactly When I Execute The Following Command:

Thanks AgitatedDove14 !
What would be the exact content of NVIDIA_VISIBLE_DEVICES if I run the following command?
trains-agent daemon --gpus 0,1 --queue default &
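
One way to find out empirically, as a minimal sketch: run a small task on that agent which prints the GPU-visibility variables it sees. The helper name visible_gpu_settings is hypothetical, and the snippet makes no assumption about what the agent actually sets; it only reads the environment of the current process.

```python
import os


def visible_gpu_settings(environ=None):
    """Return the GPU-visibility variables as seen by the current process.

    `visible_gpu_settings` is a hypothetical helper, not part of trains/clearml.
    A value of None means the variable is not set at all.
    """
    environ = os.environ if environ is None else environ
    names = ("NVIDIA_VISIBLE_DEVICES", "CUDA_VISIBLE_DEVICES")
    return {name: environ.get(name) for name in names}


if __name__ == "__main__":
    # Run this inside the experiment executed by the agent and read the task log.
    for name, value in visible_gpu_settings().items():
        print(f"{name}={value!r}")
```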

3 years ago
0 Hi, I Would Like To Follow-Up In This

Another error that just popped up:

2 years ago
0 Hi, I Would Like To Follow-Up In This

Hi AppetizingMouse58 , I sent you the files in PM 🙂

one year ago
0 Hi Guys, I Got A Very Unexpected Error Today On In One Of My Agents:

Unfortunately this is difficult to reproduce... Nevertheless it would be important to me to be robust against it, because if this error happens in a task in the middle of my pipeline, the whole process fails.

This ties into another, wider topic I think: how to "skip" tasks if they have already run (a mechanism similar to what Luigi [ https://luigi.readthedocs.io/en/stable/ ] offers). That would allow restarting the pipeline and skipping tasks up to the point where the task failed.
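
To make the idea concrete, here is a minimal sketch of such a skip mechanism in plain Python. It is not a trains/clearml feature: run_once, marker_dir and the marker file layout are all hypothetical, and it assumes each step returns something JSON-serializable. A marker file is written only after a step succeeds, so a restarted pipeline skips finished steps and reruns the one that failed.

```python
import json
import tempfile
from pathlib import Path


def run_once(name, func, marker_dir):
    """Run func() unless a completion marker for `name` already exists.

    Hypothetical helper: the marker is a small JSON file holding the result.
    """
    marker = Path(marker_dir) / f"{name}.done.json"
    if marker.exists():
        # Step finished in an earlier run: reuse the recorded result.
        return json.loads(marker.read_text())["result"]
    result = func()
    marker.parent.mkdir(parents=True, exist_ok=True)
    # Written only after success, so a failed step is retried on restart.
    marker.write_text(json.dumps({"result": result}))
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        calls = []

        def train():
            calls.append("train")
            return {"accuracy": 0.9}

        run_once("train", train, tmp)  # executes train()
        run_once("train", train, tmp)  # skipped, marker already exists
        print("train() executed", len(calls), "time(s)")
```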

3 years ago
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Thanks for the hack! The use case is the following: I have a controller that creates training/validation/testing tasks by cloning (so that the parent task id is properly set to the controller). Otherwise I could simply create these tasks with Task.init, but then I would need to manually set the parent task for each one of these tasks, probably with a similar hack, right?

3 years ago
0 Hi Guys, I Got A Very Unexpected Error Today On In One Of My Agents:

mmmmh I just restarted the experiment and it seems to work now. I am not sure why that happened. From this SO it could be related to the size of the repo. Might be a good idea to clone with --depth 1 in the agents?
Or more generally, try to catch this error and retry a few times?
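
As a rough illustration of both suggestions, here is a standalone sketch: a generic retry helper plus a shallow clone via git clone --depth 1. This is not how the agent clones repositories; retry, shallow_clone and their parameters are hypothetical names used only for this example.

```python
import subprocess
import time


def retry(func, attempts=3, delay_sec=5.0, exceptions=(Exception,)):
    """Call func() up to `attempts` times, sleeping between failed tries."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_sec)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


def shallow_clone(repo_url, dest):
    """Clone only the latest commit to keep the transfer small."""
    subprocess.run(["git", "clone", "--depth", "1", repo_url, dest], check=True)
```

Usage would look like retry(lambda: shallow_clone(url, dest), exceptions=(subprocess.CalledProcessError,)), where url and dest are placeholders. Note that a partially created dest directory would have to be removed between attempts, which this sketch does not do.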

3 years ago
0 Hi, How Does

Ping CostlyOstrich36 AgitatedDove14 SuccessfulKoala55 Just making sure this wasn't missed 🙂

one year ago
0 Hi, Together With

Sure 🙂

3 years ago
0 Hi, Is There A Way To Stop A Clearml-Agent From Within An Experiment? Or Block It To Prevent It Running Any Other Task?

I want the clearml-agent/instance to stop right after the experiment/training is “paused” (experiment marked as stopped + artifacts saved)

2 years ago
0 Hi, I Would Like To Bring Awareness

🚀 Thanks AgitatedDove14 !

11 months ago
0 Hi,

Awesome, huge thanks to the team!

3 years ago
0 Got Some Errors While Running Migration Script From Es5 To Es7:

sure, will be happy to debug that 🙂

3 years ago
0 Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

That would be awesome, yes, only from my side I have 0 knowledge of the pip codebase 😄

3 years ago
0 Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

yes, the only thing I changed is:
install_requires=[ ... "my-dep @ git+ ]
to:
install_requires=[ ... "git+ "]

3 years ago
0 Hi, Together With

Exactly

3 years ago
0 Hi Guys, I Would Like To Start Using The Aws Autoscaler Shipped In Trains. I Need To Create A Iam User To Get And I Would Like To Know What Are The Minimal Permissions Required For The Autoscaler To Work?

Hey FriendlySquid61 ,
I ended up asking for full control of EC2 so as not to be blocked, so unfortunately I cannot give you a more precise list 😕

3 years ago
0 Hi There, I Used

and this works. However, without the trick from UnevenDolphin73 , the following won’t work (returns None):
if __name__ == "__main__":
    task = Task.current_task()
    task.connect(config)
    run()
from clearml import Task
Task.init()

2 years ago
0 Hi There, I Used

UnevenDolphin73 , task = clearml.Task.get_task(clearml.config.get_remote_task_id()) worked, thanks

2 years ago
0 Hi There, I Used

AgitatedDove14 So I copy-pasted https://github.com/pytorch-ignite/examples/blob/main/tutorials/intermediate/cifar10-distributed.py locally from the pytorch-ignite examples. Then I added a requirements.txt and called clearml-task to run it on one of my agents. I adapted the script a bit (removed python-fire since it’s not yet supported by clearml).

2 years ago
0 Hi There, I Used

So I guess the problem is that the following snippet:
from clearml import Task
Task.init()
should be added before the if __name__ == "__main__": ?

2 years ago
0 Hi There, Congrats For Releasing V1

I am using 0.17.5, it could be either a bug in ignite or indeed a delay on the send. I will try to build a simple reproducible example to understand the cause

3 years ago