Unanswered
Hey, we've experienced some issues with the ClearML trigger schedulers we were playing with in the last few days. This is what happened:


Unfortunately, no, I can't paste the whole code. In a nutshell, the trigger spawns a new GCE instance running a clearml-agent, which then schedules the experiments in the cloud.
This is an excerpt:

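from clearml import Task
from clearml.automation import TriggerScheduler

# extract_config, ensure_queue, name_generator, create_gpus, create_from_machine_type,
# delete_instance, delete_disk, GOOGLE_PROJECT and GOOGLE_ZONE are defined elsewhere (not shown)

def gcp_start_trigger(task_id: str):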
    curr_task = Task.get_task(task_id)
    #curr_task.reset(force=True)
    config = extract_config(curr_task)
    machine_type = config.get('machine-type')
    queue_name = f"gcp/{machine_type}"
    ensure_queue(queue_name)  # creates a new queue if it doesn't exist
    instance_name = name_generator(task_id)
    print(config)  # debug print
    gpus = create_gpus(config)  # define gpus
    create_from_machine_type(
        project_id=GOOGLE_PROJECT,
        zone=f"{GOOGLE_ZONE}",
        instance_name=instance_name,
        machine_type=machine_type,
        accelerators=gpus,
        queue_name=queue_name
    )
    Task.dequeue(curr_task)  # remove from an empty queue
    Task.enqueue(curr_task, queue_name=queue_name)  # put the task in a particular queue
    return

def gcp_stop_trigger(task_id):
    instance_name = name_generator(task_id)
    delete_instance(
        project_id=GOOGLE_PROJECT,
        zone=f"{GOOGLE_ZONE}",
        machine_name=instance_name
    )
    delete_disk(
        project_id=GOOGLE_PROJECT,
        zone=f"{GOOGLE_ZONE}",
        machine_name=f"{instance_name}",
    )
    return

trigger = TriggerScheduler(pooling_frequency_minutes=10/60)
trigger.add_task_trigger(
    trigger_required_tags=['google'],
    schedule_function=gcp_start_trigger,
    trigger_on_status=['queued'],
    name="job_start",
)
trigger.add_task_trigger(
    trigger_required_tags=['google'],
    schedule_function=gcp_stop_trigger,
    trigger_on_status=['failed', 'completed', 'stopped', 'closed'],
    name="job_end",
)
trigger.start_remotely()

However, I don't think it's our code, since the trigger does not fire at all unless a new task is created :((

As for the ClearML versions, they differ:

  • the clearml server we self-host shows this: WebApp: 1.7.0-232 • Server: 1.7.0-232 • API: 2.21
  • the installed clearml in a trigger task shows clearml==1.8.2
  • the installed clearml in the experiment task that attempts to trigger is 1.9.0
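
In case it helps to reproduce the comparison, this is a minimal sketch of how the client-side versions above can be read from inside each environment. It only uses the standard library; the helper name installed_version is made up for illustration and is not part of clearml.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or 'not installed'."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


if __name__ == "__main__":
    # Run inside each environment (trigger task, experiment task) and compare.
    print(f"clearml=={installed_version('clearml')}")
```

The server-side numbers (WebApp / Server / API) come from the web UI and are not covered by this sketch.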
  
  
Posted one year ago
99 Views
0 Answers