Yes, the example works. As in the example, in my code I am basically starting with the following; wasn't that supposed to work?
` import hydra
from omegaconf import DictConfig


@hydra.main(config_path="config", config_name="config")
def main(cfg: DictConfig):
    import os
    import pytorch_lightning as pl
    import torch
    import yaml
    import clearml

    pl.seed_everything(cfg.seed)
    task = clearml.Task.init(
        project_name=cfg.project_name,
        task_name=cfg.task_name,
    ) `
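For context, the config/config.yaml that @hydra.main resolves would look roughly like this; the keys mirror the cfg fields used above, but the values are placeholders I'm assuming, not my real config:
` # config/config.yaml -- assumed layout, values are placeholders
seed: 42
project_name: my-project      # read as cfg.project_name
task_name: my-experiment      # read as cfg.task_name
model:                        # section later passed to task.set_model_config(cfg.model)
  hidden_dim: 128 `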
I think not; I have not set any env variable. I just went to the web UI, added an autoscaler, filled in the data in the UI, and launched the autoscaler.
Inspecting the scaler task shows it is running the following Docker image: allegroai/clearml-agent-services-app:app-1.1.1-47
This is being started as a command line script.
Also tried saving the model with:
` task.set_model_config(cfg.model)
task.update_output_model("best_model.onnx") `
But got the same exception.
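For reference, here is a minimal sketch of the same model registration done through the OutputModel API instead of the Task shortcuts; the file name, framework and config dict are assumptions for illustration, not what my script actually does:
` from clearml import Task, OutputModel

task = Task.init(project_name="my-project", task_name="model-upload-test")

# Register an exported weights file as the task's output model
# ("best_model.onnx" is assumed to exist locally at this point).
output_model = OutputModel(task=task, framework="ONNX")
output_model.update_weights(weights_filename="best_model.onnx")

# Attach the model configuration (a plain dict works here as well as text).
output_model.update_design(config_dict={"layers": 4, "hidden_dim": 128}) `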
So downgrading to python 3.8 would be a workaround?
I get the same error:
` ⋊> /d/c/p/c/e/reporting on master ◦ python model_config.py (longoeixo) 17:48:14
ClearML Task: created new task id=xxx
ClearML results page: xxx
Any model stored from this point onwards,... `
Yes, I tried with Python 3.8 and now it works.
It's an S3 bucket, and it is working: I am able to upload models before this call and also custom artifacts in the same script.
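For completeness, the artifact uploads that do work against the same bucket are plain calls along these lines (the task names, artifact name and contents below are just placeholders):
` from clearml import Task

task = Task.init(project_name="my-project", task_name="artifact-upload-test")

# Custom artifacts upload fine to the configured S3 output bucket;
# the artifact name and dictionary here are illustrative only.
task.upload_artifact(name="training-stats",
                     artifact_object={"epochs": 10, "best_loss": 0.123}) `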
Ubuntu 18.04
Python: 3.9.5
ClearML: 1.0.4
Tried using a custom Python version:
` FROM nvidia/cuda:11.7.0-runtime-ubuntu22.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
    build-essential libssl-dev zlib1g-dev libbz2-dev \
    libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
    xz-utils tk-dev libffi-dev liblzma-dev git \
    && rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
RUN /root/.pyenv/bin/pyenv install 3.10.6
ENV PATH="/root/.pyenv/versions/3.10....
I just called the script with:
` task.set_base_docker(
    docker_image="nvidia/cuda:11.7.0-runtime-ubuntu22.04",
    # docker_arguments="--privileged -v /dev:/dev",
)
task.execute_remotely(queue_name="default") `
Then in the console:
` Exception: Command '['/usr/bin/python3', '-m', 'poetry', 'config', '--local', 'virtualenvs.in-project', 'true']' returned non-zero exit status 1.
Error: Failed configuring Poetry virtualenvs.in-project
failed installing poetry requirements: Comman...
I edited the clearml.conf on the agent and set the package manager to poetry. Do I need to have Poetry installed on the agent beforehand, considering that I am using Docker?
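For reference, the relevant clearml.conf section on the agent should look something like this (only the package-manager part shown; the rest of my config is unchanged):
` agent {
    package_manager {
        # install the task's dependencies with poetry instead of pip
        type: poetry,
    }
} `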
Do I have to have the lock file in the root, or can it be in the working dir?
So it could be launched with the ClearML CLI? I can also try that.
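If I go that route, I'd try something along these lines with the clearml-task CLI (project, task name, script and queue are placeholders):
` clearml-task \
    --project my-project \
    --name poetry-remote-test \
    --script train.py \
    --queue default \
    --docker nvidia/cuda:11.7.0-runtime-ubuntu22.04 `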
Hi, any update on that?
AgitatedDove14, here is the full log: