Answered

Hi everyone,
I am having problems getting PyTorch Nightly (for torch 2.0 preview) to run on clearml-agent. Here is my log. Maybe someone sees what the issue is. I don't get it. It all runs fine locally!

  
  
Posted one year ago

Answers 28


Maybe if you have time you can take a look at the log I posted in the beginning. I think I have the same extra_index_url and the nightly flag activated 😕

  
  
Posted one year ago

And I use... I think Python 3.8

  
  
Posted one year ago

It seems like clearml removes the dev... from torch == 1.14.0.dev20221205+cu117 in the /tmp/ cached requirements.txt
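One way to confirm this is to compare the pin requested for the Task with the pin that ends up in the cached file. The following is a minimal sketch, not part of ClearML: `pinned_version` and `is_dev_build` are hypothetical helpers, and the `cached` text is an assumed example of what a stripped pin could look like.

```python
import re


def pinned_version(requirements_text, package):
    """Return the version pinned with '==' for `package`, or None if it is not pinned."""
    pattern = re.compile(r"^\s*" + re.escape(package) + r"\s*==\s*(\S+)", re.IGNORECASE)
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0]  # ignore trailing comments
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def is_dev_build(version):
    """True if the version carries a developmental segment such as '.dev20221205'."""
    return re.search(r"\.dev\d+", version) is not None


# Pins as requested in the thread.
requested = "torch == 1.14.0.dev20221205+cu117\ntorchvision == 0.15.0.dev20221205+cpu\n"
# Assumed example of a cached file in which the dev segment was dropped.
cached = "torch == 1.14.0+cu117\n"

for name, text in (("requested", requested), ("cached", cached)):
    version = pinned_version(text, "torch")
    print(name, version, "dev build" if is_dev_build(version) else "NOT a dev build")
```

If the cached pin no longer reports a dev build, pip will look for a stable 1.14.0 release instead of the nightly wheel.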

  
  
Posted one year ago

Can you maybe also tell me which docker image you used? Unfortunately, none of this is working for me.

  
  
Posted one year ago

Alright, thank you. I will try to debug further

  
  
Posted one year ago

Sorry, not of the script, of the Task. I just added --extra-index-url to the "Installed Packages" section, and it worked.

  
  
Posted one year ago

Why not add the extra_index_url to the installed packages part of the script? Worked for me 😄

  
  
Posted one year ago

I didn't run inside a docker

  
  
Posted one year ago

Can you tell me which Python version is running on the agent/docker, and which docker image you use?

  
  
Posted one year ago

You mean I can add exactly what you wrote
--extra-index-url
clearml
torch == 1.14.0.dev20221205+cu117
torchvision == 0.15.0.dev20221205+cpu
to the installed packages section?

  
  
Posted one year ago

Yeah! I think maybe we don't parse the build number. Let me try 🙂

  
  
Posted one year ago

Ran on an agent, of course 🙂

  
  
Posted one year ago

Yes

  
  
Posted one year ago

Bonus question: Is there some clearml-agent mode that does not do "some magic" and instead just installs exactly what is shown in the "INSTALLED PACKAGES" editor in the web UI?

  
  
Posted one year ago

Let me try

  
  
Posted one year ago

Do you have the same python version locally as remotely?
Some ways you could continue now:
you can reuse an existing python virtual environment: https://clear.ml/docs/latest/docs/clearml_agent/#virtual-environment-reuse

You can also run the agent in docker mode: https://clear.ml/docs/latest/docs/clearml_agent/#docker-mode

I'll have a look at the differences concerning the dev disappearing.

  
  
Posted one year ago

ReassuredTiger98, PyTorch installations are a sore point 🙂 Can you maybe try to specify a specific build and see if it works?

  
  
Posted one year ago

What do you mean by "Why not add the extra_index_url to the installed packages part of the script?"?

  
  
Posted one year ago

Is it possible to set extra-index-url on a per-task basis? Just asking because of the way you wrote it with the two dashes 🙂

  
  
Posted one year ago

Thanks :)

  
  
Posted one year ago

What I am trying to do is install this:
torch == 1.14.0.dev20221205+cu117
torchvision == 0.15.0.dev20221205+cpu
Is this what you mean by a specific build?

  
  
Posted one year ago

Hi TimelyMouse69 Thank you for your answer.
I use 3.10.8 locally and 3.10.6 remotely. Everything is run in a docker container, locally and remotely on the docker-agent (exactly the same docker image).
Thank you for looking into the disappearing dev. This is probably why pip tries to install a stable version of 1.14, which only exists as a nightly build.
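To rule out an interpreter mismatch, the version can be printed on both sides and compared at the major.minor level, since wheels are built per minor version. This is a small sketch; `major_minor` and `same_minor` are hypothetical helper names, and the version strings are the ones mentioned in this thread.

```python
import sys


def major_minor(version_text):
    """Parse '3.10.8' into the tuple (3, 10)."""
    parts = version_text.strip().split(".")
    return int(parts[0]), int(parts[1])


def same_minor(local_version, remote_version):
    """True if both interpreters share the same major.minor version."""
    return major_minor(local_version) == major_minor(remote_version)


# Run this on both machines to get the strings to compare.
print("this interpreter:", "{}.{}.{}".format(*sys.version_info[:3]))

print("3.10.8 vs 3.10.6 compatible:", same_minor("3.10.8", "3.10.6"))
print("3.8.2 vs 3.10.6 compatible:", same_minor("3.8.2", "3.10.6"))
```

A patch-level difference such as 3.10.8 versus 3.10.6 should not change which wheels pip selects, so it is unlikely to explain the failure on its own.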

  
  
Posted one year ago

I only added
# Python 3.8.2 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
--extra-index-url
clearml
torch == 1.14.0.dev20221205+cu117
torchvision == 0.15.0.dev20221205+cpu
and I used an amd64/ubuntu:20.04 docker image with python3.8. Same error. If it is not too much to ask, could you try to run it with this docker image?

  
  
Posted one year ago

ReassuredTiger98 I think it works for me 🙂
I added this to the requirements (You can put the extra-index-url in the clearml.conf), and I've enabled the torch nightly flag:
--extra-index-url https://download.pytorch.org/whl/nightly/cu117
clearml
torch == 1.14.0.dev20221205+cu117
torchvision == 0.15.0.dev20221205+cpu

  
  
Posted one year ago

Also, clearml-agent at version 1.5 does not look for nightly builds at the correct indexes, even if torch_nightly is set to true in clearml.conf:

Looking in indexes: https://pypi.org/simple , https://download.pytorch.org/whl/cu117/
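A quick way to check this across a long log is to extract the index URLs from pip's "Looking in indexes:" line and test for the nightly path. This is a minimal sketch with hypothetical helper names (`parse_indexes`, `has_nightly_index`); it assumes the nightly index URL contains `/whl/nightly/`, as in the URL quoted elsewhere in this thread.

```python
def parse_indexes(log_line):
    """Return the index URLs from a pip 'Looking in indexes:' log line."""
    prefix = "Looking in indexes:"
    stripped = log_line.strip()
    if not stripped.startswith(prefix):
        return []
    rest = stripped[len(prefix):]
    return [url.strip() for url in rest.split(",") if url.strip()]


def has_nightly_index(urls):
    """True if any URL points at a PyTorch nightly wheel index (assumed path pattern)."""
    return any("/whl/nightly/" in url for url in urls)


line = "Looking in indexes: https://pypi.org/simple , https://download.pytorch.org/whl/cu117/"
urls = parse_indexes(line)
print(urls)
print("nightly index present:", has_nightly_index(urls))
```

For the log line above, no nightly index is found, which matches the observation that pip only searches the stable cu117 index.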

  
  
Posted one year ago

By the way, could you check whether agent.package_manager.system_site_packages is true or false in your config and in the summary that the agent gives before execution?
I start my agent in --foreground mode for debugging and it clearly shows false, but the summary that the agent gives before the task is executed shows true.

  
  
Posted one year ago