Answered
Hello! I’m Using ClearML on a Kubernetes Cluster and Have Encountered Strange Behavior When Training a Model From a Non-Main (Master) Branch. In My Code (

Hello!
I’m using ClearML on a Kubernetes cluster and have encountered strange behavior when training a model from a non-main (master) branch. In my code (train.py + Hydra), I use task.set_script to specify the “repository” and “branch,” with the branch being a separate experimental branch. Everything was working fine until I changed the versions of some packages, and then the training started failing.
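For reference, the setup described above might look roughly like the sketch below. It assumes `task` is a `clearml.Task` object (for example, the one returned by `Task.init`); the helper name `pin_task_to_branch`, the repository URL, and the branch name are hypothetical placeholders, not values from my actual project.

```python
from typing import Any


def pin_task_to_branch(task: Any, repository: str, branch: str) -> dict:
    """Record the repository and branch on a ClearML task.

    `task` is assumed to be a clearml.Task; only its set_script()
    method is used, with the two arguments mentioned in the post.
    """
    script_info = {"repository": repository, "branch": branch}
    # Task.set_script stores the source-control details on the task,
    # which the agent later uses when it clones the code.
    task.set_script(**script_info)
    return script_info


# Hypothetical usage inside train.py (names are placeholders):
# from clearml import Task
# task = Task.init(project_name="my-project", task_name="train")
# pin_task_to_branch(task, "https://example.com/myrepo.git", "experimental-branch")
```

The helper only forwards the two values, so it can be checked with a stand-in object before running it against a real task.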
In the training logs, I can see that packages with the new versions specified in setup.cfg are installed first, but then older packages from the setup.cfg file of the master branch begin to install. Additionally, when I connected to the Kubernetes pod, I saw that the code under /root/.clearml/vcs-cache/myrepo.git.bab58651b8533039258495e21cb16e0f/myrepo.git/ was also on the master branch (see attached screenshot).
Could you please advise what might be causing this issue? I’ve tried clearing the pip cache, but it didn’t help. Perhaps I’ve missed something?
I’d appreciate any guidance!
Thanks in advance!
[Attached screenshot]

  
  
Posted 24 days ago

Answers 3


Sorry, I got sick and only just got back to my laptop. I will try to find the logs tomorrow (if they are still there, because this is a test environment and my teammate may have already deleted them).

  
  
Posted 18 days ago

Hi @HighKitten20, can you please provide the log of the job itself in ClearML (the Console section of the experiment)?

  
  
Posted 23 days ago

In short: the issue is that during model training, ClearML installs packages from both the experimental (non-master) branch and the master branch, leading to version conflicts 🙂

  
  
Posted 24 days ago