Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Unanswered
Hi! I'M Running Launch_Multi_Mode With Pytorch-Lightning


@<1523701435869433856:profile|SmugDolphin23> I added os.environ["NCCL_SOCKET_IFNAME" and I managed to run on nccl
But it seems that workaround that you said do not run 2 processes on 2 nodes, but 4 processes on 4 different nodes
current_conf = task.launch_multi_node(args.nodes*args.gpus)
os.environ["NODE_RANK"] = str(current_conf.get("node_rank", ""))
os.environ["NODE_RANK"] = str(current_conf["node_rank"] // args.gpus)
os.environ["LOCAL_RANK"] = str(current_conf["node_rank"] % args.gpus)
And when I set args.nodes=2, args.gpus=2, I have 4 tasks:

  1. first host, global rank = 0, local rank = 0
  2. second host, global rank = 1, local rank = 1
  3. third host, global rank = 2, local rank = 0
  4. fourth host, global rank = 3, local rank = 1
    How do I fix this?
  
  
Posted 5 months ago
49 Views
0 Answers
5 months ago
5 months ago