Unanswered
Hi, I've been getting the following error when running training code through an agent,


@<1523701087100473344:profile|SuccessfulKoala55> and @<1523701070390366208:profile|CostlyOstrich36> OK, so I found the problem, but it's weird:
when the agent sets up the environment, it installs torch=1.11.0 rather than the version pinned in the requirements, which is torch=1.11.0+cu113.
I've checked clearml.conf and I do have this flag set:

force_repo_requirements_txt: true

I also have a local whl of torch=1.11.0+cu113, with the path to it set in requirements.txt, but the agent is not installing the local whl. It uses a cached one without CUDA instead.
I do know that there is a mismatch between the installed CUDA (12.0) and the one stated in the requirements (11.3), and I noticed that the log says the following:

Torch CUDA 118 index page found

And yet when I run locally, it uses my conda env with torch 1.11.0+cu113 perfectly.
Can an agent running on a machine with a higher CUDA version run an application that requires a lower version?
Why, when running from the agent, is it not installing my requirements and caching them into an env?
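
In case it helps to compare the two environments, here is a minimal sketch of a check that can be run both locally and inside the agent's environment. The helper name `cuda_tag` is hypothetical (it is not part of ClearML or torch), and the sketch assumes the CUDA build is encoded in the version string as a `+cuNNN` suffix, as in `1.11.0+cu113`:

```python
import re
from typing import Optional


def cuda_tag(version: str) -> Optional[str]:
    """Return the CUDA build tag (e.g. 'cu113') from a torch version string.

    Hypothetical helper: returns None when the version has no '+cuNNN'
    suffix, e.g. '1.11.0' or '1.11.0+cpu'.
    """
    match = re.search(r"\+(cu\d+)", version)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        # Compare these lines between the local conda env and the agent's env.
        print("torch version:", torch.__version__)
        print("build tag:", cuda_tag(torch.__version__))
        print("built for CUDA:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
```

If the build tag printed in the agent's environment differs from the one printed locally, that would confirm the agent resolved a different torch build than the one in requirements.txt.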

  
  
Posted one year ago