Hey, we are using ClearML 1.9.0 with transformers 4.25.1… and we started getting errors that do not reproduce in earlier versions (it only works in 1.7.2, all 1.8.x versions don't work):


Hey @PricklyRaven28, so as discussed above, there were 2 issues. The first one is still waiting on the second, it's on the backlog of our devs and should be done soon(tm).

That said, in the meantime I also wanted to do fun stuff with transformers, so I've written a quick hack that deals with the bug. It's basically 2 functions: one casts non-string dict keys to strings and keeps track of the original keys, the other casts them back.

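def cast_keys_to_string(d, changed_keys=None):
    # Use None instead of a mutable default so the mapping is not shared between calls
    if changed_keys is None:
        changed_keys = dict()
    nd = dict()
    for key in d.keys():
        if not isinstance(key, str):
            # Remember the original key so it can be restored later
            casted_key = str(key)
            changed_keys[casted_key] = key
        else:
            casted_key = key
        if isinstance(d[key], dict):
            nd[casted_key], changed_keys = cast_keys_to_string(d[key], changed_keys)
        else:
            nd[casted_key] = d[key]
    return nd, changed_keys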

def cast_keys_back(d, changed_keys):
    nd = dict()
    for key in d.keys():
        # Restore the original (non-string) key if this one was casted earlier
        if key in changed_keys:
            original_key = changed_keys[key]
        else:
            original_key = key
        if isinstance(d[key], dict):
            nd[original_key], changed_keys = cast_keys_back(d[key], changed_keys)
        else:
            nd[original_key] = d[key]
    # Returns a tuple, so callers need [0] to get the restored dict
    return nd, changed_keys
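
To see what the two helpers do in isolation, here is a minimal round-trip sketch. It uses condensed restatements of the functions above under hypothetical names (`to_str_keys`, `from_str_keys`), and the `config` dict is a made-up example with integer keys, not something taken from the actual error.

```python
def to_str_keys(d, changed=None):
    """Condensed restatement of cast_keys_to_string (hypothetical name)."""
    changed = {} if changed is None else changed
    out = {}
    for key, value in d.items():
        str_key = key
        if not isinstance(key, str):
            str_key = str(key)
            changed[str_key] = key  # remember the original key
        if isinstance(value, dict):
            out[str_key], _ = to_str_keys(value, changed)
        else:
            out[str_key] = value
    return out, changed


def from_str_keys(d, changed):
    """Condensed restatement of cast_keys_back (hypothetical name)."""
    out = {}
    for key, value in d.items():
        original = changed.get(key, key)  # fall back to the key itself
        if isinstance(value, dict):
            value = from_str_keys(value, changed)
        out[original] = value
    return out


# Made-up nested dict with integer keys, for illustration only
config = {"id2label": {0: "NEGATIVE", 1: "POSITIVE"}, "learning_rate": 2e-5}

as_strings, changed = to_str_keys(config)
restored = from_str_keys(as_strings, changed)

print(as_strings)          # all keys are strings now
print(changed)             # maps each casted key back to its original
print(restored == config)  # the round trip restores the original dict
```

Note that the mapping is shared across all nesting levels, so this approach assumes no dict contains both a non-string key and its string form (for example `1` and `"1"`); in that case the keys would collide.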

You can then use them like this:

        training_args = TrainingArguments(
            output_dir="my_awesome_model",
            learning_rate=2e-5,
            per_device_train_batch_size=16,
            per_device_eval_batch_size=16,
            dataloader_num_workers=0,
            num_train_epochs=2,
            weight_decay=0.01,
            evaluation_strategy="epoch",
            save_strategy="epoch",
            load_best_model_at_end=True
        )

        # Allow ClearML access to the training args and allow it to override the arguments for remote execution
        args_class = type(training_args)
        args, changed_keys = cast_keys_to_string(training_args.to_dict())
        training_args = args_class(**cast_keys_back(args, changed_keys)[0])

        self.trainer = Trainer(
            model=self.model,
            args=training_args,
            train_dataset=tokenized_dataset["train"],
            eval_dataset=tokenized_dataset["test"],
            tokenizer=self.tokenizer,
            data_collator=data_collator,
            compute_metrics=self.compute_metrics,
        )

        self.trainer.train()

This "hack" in combination with the patch to Huggingface from above should work 🙂 That said, it is a hack, so a production version of this should be there soon. I'll let you know when that happens!

  
  
Posted one year ago