Answered
Do I understand correctly that Python versions must match between client (my Mac, sends task for remote execution) and clearml-agent? I don’t really get how the environments are managed. All I want to do is take my code and execute it on the agent machin

Do I understand correctly that Python versions must match between the client (my Mac, which sends the task for remote execution) and clearml-agent?

I don’t really get how the environments are managed. All I want to do is take my code and execute it on the agent machine in a predefined agent environment. But it seems to take the packages from my local env and try to install them in an agent venv; it then runs into some issue with incompatible versions and crashes.

What happens if my local Python env has pytorch for CPU and I want to send something to be executed on a GPU?

  
  
Posted one year ago

Answers 21


On the agent side it’s trying to install different pytorch versions (even though the env already has it all configured), then fails with "torch_<something>.whl is not a valid wheel for this system".

  
  
Posted one year ago

What I am seeing is that the agent always fails while trying to install some packages that I never asked it to install.

  
  
Posted one year ago

I have no idea what it is doing

  
  
Posted one year ago

Is there a reason it is requiring pytorch?
The script you provided has only clearml as a requirement.

  
  
Posted one year ago

What version of Python is the agent machine running locally?
Does it support torch == 1.12.1?

  
  
Posted one year ago

I think you can either add the requirement manually through code ( https://clear.ml/docs/latest/docs/references/sdk/task#taskadd_requirements ) or force the agent to use the requirements.txt when running remotely.
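A minimal sketch of those two options, in case it helps. `Task.add_requirements` is the SDK call from the link above; `Task.force_requirements_env_freeze` is, as far as I know, the SDK call that makes the task record a requirements file instead of the auto-detected packages. Both are meant to be called before `Task.init`. The helper name `init_task_with_pins`, the `PINS` list and the file name are made up for illustration, not something from this thread.

```python
# Hypothetical pins for illustration; use whatever the agent machine should install.
PINS = [("torch", "1.12.1"), ("torchvision", "0.13.1")]


def init_task_with_pins(project_name, task_name, pins=(), requirements_file=None):
    """Register package requirements, then create the task.

    Assumption: clearml is installed and configured on the machine running this.
    """
    # Imported lazily so the module can be loaded without clearml present.
    from clearml import Task

    if requirements_file is not None:
        # Option 2: record the given requirements file instead of
        # the packages ClearML detects from the imports.
        Task.force_requirements_env_freeze(force=True, requirements_file=requirements_file)
    else:
        # Option 1: pin individual packages by hand.
        for package_name, package_version in pins:
            Task.add_requirements(package_name, package_version)

    # Requirements have to be registered before the task is initialised.
    return Task.init(project_name=project_name, task_name=task_name)
```

Usage would be something like `init_task_with_pins("Adhoc", "Dataset test", pins=PINS)` or `init_task_with_pins("Adhoc", "Dataset test", requirements_file="requirements.txt")`, followed by `task.execute_remotely(...)` as in the original script.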

  
  
Posted one year ago

Well I don’t want that! My local machine is a Mac with no GPU. But I want to execute my code on a server with GPUs. I don’t want my local environment, I want the one configured for the agent!

  
  
Posted one year ago

Here’s the error I get:
https://justpaste.it/7aom5

It’s trying to downgrade pytorch to 1.12.1 for some reason (why?), using a build for an outdated CUDA (I have 11.7, it tries to use pytorch for CUDA 11.6). Then it crashes.

  
  
Posted one year ago

The failure is that it does not even run

  
  
Posted one year ago

Let me get the exact error for you

  
  
Posted one year ago

Here’s the agent config. It’s basically default
https://justpaste.it/4ozm3

  
  
Posted one year ago

Hi AdventurousButterfly15 ,

When running code locally, how are the installed packages detected? Does it detect your entire venv or does it detect only the packages that were used?

  
  
Posted one year ago

Locally I have a conda env with some packages and a basic requirements file.
I am running this thing:
` from clearml import Task, Dataset

# Register the task, then stop local execution and enqueue it for the agent
task = Task.init(project_name='Adhoc', task_name='Dataset test')
task.execute_remotely(queue_name="gpu")

# Everything below this point runs only on the agent machine
from config import DATASET_NAME, CLEARML_PROJECT
print('Getting dataset')

dataset_path = Dataset.get(
    dataset_name=DATASET_NAME,
    dataset_project=CLEARML_PROJECT,
).get_local_copy()  # .get_mutable_local_copy(DATASET_NAME)

print('Dataset path', dataset_path) `
Then on the server side I have clearml-agent running in default (venv) mode, started from a conda env with the same Python version. Then it does something to the packages and crashes.

  
  
Posted one year ago

(agent) adamastor@adamastor:~/clearml_agent$ python -c "import torch; print(torch.__version__)"
1.12.1

  
  
Posted one year ago

CostlyOstrich36 in installed packages it has:
` # Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:41:22) [Clang 13.0.1 ]

Pillow == 9.2.0
clearml == 1.7.1
minio == 7.1.12
numpy == 1.23.1
pandas == 1.5.0
scikit_learn == 1.1.2
tensorboard == 2.10.1
torch == 1.12.1
torchvision == 0.13.1
tqdm == 4.64.1 `
Which is the same as I have locally and on the server that runs clearml-agent
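One way to double-check that a given environment really matches a list like the one above is to compare it against what is installed there. This is only a sketch using the standard library; `RECORDED`, `parse_pins` and `find_mismatches` are illustrative names, and the sample list is shortened.

```python
from importlib import metadata

# Shortened copy of the recorded list, in the same "name == version" format.
RECORDED = """\
# Python 3.10.6 | packaged by conda-forge
Pillow == 9.2.0
clearml == 1.7.1
torch == 1.12.1
torchvision == 0.13.1
"""


def parse_pins(text):
    """Return {package: version} from 'name == version' lines, skipping comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (wanted, installed)} for every package that differs.

    installed is None when the package is not present in this environment.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(parse_pins(RECORDED)).items():
        print(f"{name}: recorded {wanted}, installed {installed}")
```

Running it in both the local env and the agent's env shows where they differ. Note that this is a strict string comparison, so a CUDA build whose installed version carries a local suffix (for example `1.12.1+cu116`) would be reported as a mismatch against a plain `1.12.1`.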

  
  
Posted one year ago

So I think I'm missing something. What is the point of failure?
ClearML tries to detect the packages you used during the code execution. It will then try to install those packages when running remotely.

  
  
Posted one year ago

Can you post the agent section of your ~/clearml.conf here?

  
  
Posted one year ago

When you look at the original task that appears in the UI, what are the requirements shown in the 'execution' tab?

  
  
Posted one year ago

So how do you attach the pytorch requirement?

  
  
Posted one year ago

Yeah, pytorch is a must. This script is a testing one, but after this I need to train stuff on GPUs

  
  
Posted one year ago

Pytorch is configured on the machine that’s running the agent. It’s also in requirements

  
  
Posted one year ago