Answered
Https://Clearml.Slack.Com/Archives/Ctk20V944/P1713357955958089



PricklyRaven28 thank you for the feedback. We will investigate this further

  
  
Posted 10 months ago

Interesting, i wasn't aware of this python module for executing accelerate. I'll try to use that.

It's essentially the cmd line:
None
None

  
  
Posted 11 months ago

AgitatedDove14
I only got some time to work on it now; I created a small reproducible example.
I also tried your suggestion of importing accelerate, and it also had issues.

Overall, both methods work OK when using debug_pipeline but fail without it. I think it has something to do with wrapping accelerate.

Problem with launching through the Python module (your suggestion): argparse breaks.
Problem with launching in a new process: the rank0 process hangs and never finishes.

Both work fine with debug_pipeline.

  
  
Posted 10 months ago

Thank you PricklyRaven28 !!!
Let me see if we can reproduce and how to solve it

  
  
Posted 10 months ago

Hi PricklyRaven28 ! Thank you for the example. We managed to reproduce. We will investigate further to figure out the issue

  
  
Posted 10 months ago

We used subprocess for it, ...

Popen? os.system? fork?

  
  
Posted 11 months ago

How does this work in the context of a pipeline? One of the steps is a multi gpu training that requires accelerate.

  
  
Posted 11 months ago

If nothing specific comes to mind i can try to create some reproducible demo code (after holiday vacation)

Yes please! 🙏
In the meantime, see if the workaround is a valid one

  
  
Posted 11 months ago

It's with decorators.

Interesting, i wasn't aware of this python module for executing accelerate. I'll try to use that.

We used subprocess for it, but for some reason, only when invoked in the pipeline, the process freezes and doesn't close the main accelerate process. It works fine outside of ClearML. Any idea?

  
  
Posted 11 months ago

We tried both subprocess.run and popen

  
  
Posted 11 months ago

SmugDolphin23 AgitatedDove14
Any updates? 🙂

  
  
Posted 10 months ago

PricklyRaven28 Can you please try clearml==1.16.2rc0 ? We have released a fix that will hopefully solve your problem
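
After installing the release candidate (for example with pip install clearml==1.16.2rc0), it can help to confirm which version the pipeline step actually imports, since the step may run in a different environment than your shell. A minimal sketch using only the standard library; the helper name installed_version is hypothetical:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Compare the printed value against the version suggested above.
    print("clearml:", installed_version("clearml"))
```

Calling installed_version("clearml") from inside the step and logging the result shows whether the fix is really in effect there.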

  
  
Posted 10 months ago

To make it very reproducible, I created a Docker file for it, so make sure to run build_docker.sh and then run.sh

  
  
Posted 10 months ago

Hi SmugDolphin23
Confirming that rank0 process does not hang with the new version!

The accelerate CLI problem does still reproduce though (it's in my demo)

  
  
Posted 10 months ago

Hi PricklyRaven28
Sorry, we missed that one

we need to invoke it with accelerate launch so we use subprocess.run

So you have two options: either change the script entry of the Task from "script.py" to "-m accelerate launch script.py",
or do that manually inside your entry point (i.e. call accelerate launch yourself).
BTW, I "think" we added an "auto detect" for it, so that if you launch it manually this way, it will know to register it as "-m accelerate launch ...".
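
A minimal sketch of the second option (calling accelerate launch from inside the entry point) could look like the following. It assumes the module form python -m accelerate.commands.launch, which Accelerate documents as equivalent to the accelerate launch CLI; the helper names build_accelerate_cmd and run_checked and the one-hour timeout are hypothetical, and "script.py" is the placeholder from the post above.

```python
import subprocess
import sys
from typing import List, Optional, Sequence


def build_accelerate_cmd(script: str,
                         script_args: Optional[Sequence[str]] = None) -> List[str]:
    """Build a command line equivalent to `accelerate launch script.py ...`."""
    # Using sys.executable keeps the child in the same Python environment.
    cmd = [sys.executable, "-m", "accelerate.commands.launch", script]
    cmd.extend(script_args or [])
    return cmd


def run_checked(cmd: Sequence[str], timeout: Optional[float] = None) -> int:
    """Run a command; raise on a non-zero exit code or when the timeout expires."""
    completed = subprocess.run(list(cmd), check=True, timeout=timeout)
    return completed.returncode


if __name__ == "__main__":
    # Placeholder entry point; replace with the real training script.
    run_checked(build_accelerate_cmd("script.py"), timeout=3600)
```

Passing a timeout means a stuck launch raises subprocess.TimeoutExpired instead of blocking the step indefinitely, which at least makes a hang visible in the logs.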

  
  
Posted 11 months ago

Glad to hear you were able to reproduce it! Waiting for your reply 🙏

  
  
Posted 10 months ago

If nothing specific comes to mind i can try to create some reproducible demo code (after holiday vacation)

  
  
Posted 11 months ago

How does this work in the context of a pipeline?

Is your pipeline from functions / decorators, or is it from Tasks?
(If it is from Tasks, then just change the entry point in the overrides.)
In the case of functions or decorators, you have to do that manually (i.e. your function needs to do "accelerate launch"):

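from accelerate.commands.launch import launch_command, launch_command_parser

# Build the same argument parser that the "accelerate launch" CLI uses
parser = launch_command_parser()
# Replace the placeholder string with your actual launch arguments
args = parser.parse_args("-command -here".split())
# Run the launch in-process
launch_command(args)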
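
As a sketch of how the snippet above might be wrapped for use inside a pipeline function step: the helper names build_launch_argv and launch_with_accelerate, the script name train.py, and its --epochs flag are all hypothetical, while --num_processes is a standard accelerate launch option. The accelerate import is deferred so the module still loads where accelerate is not installed.

```python
from typing import List, Optional, Sequence


def build_launch_argv(script: str, num_processes: int = 1,
                      script_args: Optional[Sequence[str]] = None) -> List[str]:
    """Build the argument list that would follow `accelerate launch` on the CLI."""
    argv = ["--num_processes", str(num_processes), script]
    argv.extend(script_args or [])
    return argv


def launch_with_accelerate(script: str, num_processes: int = 1,
                           script_args: Optional[Sequence[str]] = None) -> None:
    """Run `accelerate launch` in-process with an explicit argument list."""
    # Deferred import: accelerate is only needed when the step actually runs.
    from accelerate.commands.launch import launch_command, launch_command_parser

    parser = launch_command_parser()
    # An explicit list means the launcher's parser does not read sys.argv.
    args = parser.parse_args(build_launch_argv(script, num_processes, script_args))
    launch_command(args)


if __name__ == "__main__":
    # Hypothetical training script and flag, for illustration only.
    print(build_launch_argv("train.py", num_processes=2, script_args=["--epochs", "1"]))
```

A decorated pipeline function would then call launch_with_accelerate("train.py", num_processes=2) in its body.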
  
  
Posted 11 months ago