Eureka! Thank you. Now I am getting `AttributeError: 'DummyModel' object has no attribute 'model_design'` when calling `task.update_output_model("best_model.onnx")`. I checked the code; I thought it was related to the model not having a config defined, so I tried to set it with `task.set_model_config(cfg.model)`, but I still get the error.
I do not recall the older version (it is from a couple of months ago), but the new version is WebApp: 1.2.0-153 • Server: 1.2.0-153 • API: 2.16
Does the lock file have to be in the root, or can it be in the working dir?
Thank you. I was importing everything inside the function so that the hydra autocomplete would run faster.
Yes, the example works. Like the example, my code basically starts by doing the following; wasn't that supposed to work?
` @hydra.main(config_path="config", config_name="config")
def main(cfg: DictConfig):
    import os
    import pytorch_lightning as pl
    import torch
    import yaml
    import clearml

    pl.seed_everything(cfg.seed)
    task = clearml.Task.init(
        project_name=cfg.project_name,
        task_name=cfg.task_name,
    ) `
Hydra params are still not uploaded on 1.0.4
With the account admin email. The one in which I got the receipt.
I edited the clearml.conf on the agent and set the manager to poetry, do I need to have poetry installed on the agent beforehand, considering that I am using docker?
Here follows the failing part of the log:
` Requirement already satisfied: pip in /root/.clearml/venvs-builds/3.1/lib/python3.10/site-packages (22.2.2)
Collecting Cython
Using cached Cython-0.29.32-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
Installing collected packages: Cython
Successfully installed Cython-0.29.32
Collecting boto3==1.24.59
Using cached boto3-1.24.59-py3-none-any.whl (132 kB)
ERROR: Could not find a version that satisfies the requ...
Yes, but the issue is caused because rmdatasets is installed in the local environment; I needed it installed in order to test the code locally, so it gets caught in the package list.
I will probably stop installing the sibling packages and add them manually to sys.path instead.
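A minimal sketch of what I mean by the sys.path workaround (the repo layout in the comments is an assumption, adjust to the real checkout):

```python
import os
import sys

# Assumed layout:
#   repo/
#     training/train.py   <- this script
#     rmdatasets/         <- sibling local package
repo_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, repo_root)

# `import rmdatasets` now resolves from the checkout, without
# pip-installing the package (so it never shows up in pip freeze).
```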
Actually, this error happens when I launch the autoscaler from the Web UI. When I enqueue a task, it launches an EC2 instance whose "Status Check" stays "Pending" for over 15 minutes; then the instance is terminated by the scaler, which launches another one, in a loop.
SuccessfulKoala55 , how do I set the agent version when creating the autoscaler?
Basically, I am following the steps in this video:
https://www.youtube.com/watch?v=j4XVMAaUt3E
AgitatedDove14 , here follows the full log:
Hi, any update on that?
I think not; I have not set any env variable. I just went to the web UI, added an autoscaler, filled in the data, and launched it.
Inspecting the scaler task, it is running the following Docker image: `allegroai/clearml-agent-services-app:app-1.1.1-47`
I am launching through the UI, "XXX workspace / https://app.clear.ml/applications / AWS Autoscaler".
So it could be launched via the clearml CLI? I can also try that.
Hi, AgitatedDove14
How do I set the version to 1.5.1? When I launch the autoscaler, version 1.5.0 is picked by default.
Thank you. After running the script, do I run `docker-compose -f /opt/clearml/docker-compose.yml up -d`?
Solved by removing default parts.
Now I get a strange behavior: I have 2 tasks in the queue, the autoscaler fires two EC2 instances and then turns them off without running the tasks, then it fires two new instances again, in a loop.
Thank you. I have defined the AMI manually instead of using the default; now I am getting the following error:
Error: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes
I am using hydra to configure my experiments. Specifically, I want to retrieve the OmegaConf data created by hydra. `config = task.get_configuration_objects()` returns a string with those values, but I do not know how to parse it, or whether I can get this data as a nested dict.
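From what I can tell, the returned string is just the YAML dump of the config, so something like `yaml.safe_load` should give a nested dict back. A minimal sketch (the hard-coded string below stands in for the value from `task.get_configuration_objects()`, and the "OmegaConf" section name is an assumption):

```python
import yaml  # PyYAML, assumed available

# Stand-in for: config_str = task.get_configuration_objects()["OmegaConf"]
config_str = """
model:
  name: resnet18
  lr: 0.001
seed: 42
"""

cfg = yaml.safe_load(config_str)  # plain nested dict
print(cfg["model"]["name"])  # resnet18
```

`OmegaConf.create(config_str)` should also work if an OmegaConf object is preferred over a plain dict.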
I have a task that is already completed and, in another script, I am trying to load it and analyse the results.
Hi, sorry for the delay. rmdatasets == 0.0.1 is the name of the local package that lives in the same repo as the training code; it gets listed as a pinned requirement instead of the relative path to the package.
As a workaround, I set the option that forces the use of requirements.txt, and I am using this script to generate it:
` import os
import subprocess

output = subprocess.check_output(["pip", "freeze"]).decode()
with open("requirements.txt", "w") as f:
    for line in output.split("\n"):
        if " @" in line... `
So downgrading to python 3.8 would be a workaround?
Yes, tried with python 3.8, now it works.
I get the same error:
` ⋊> /d/c/p/c/e/reporting on master ◦ python model_config.py (longoeixo) 17:48:14
ClearML Task: created new task id=xxx
ClearML results page: xxx
Any model stored from this point onwards,... `
This is being started as a command line script.
Also tried saving the model with `task.set_model_config(cfg.model)` followed by `task.update_output_model("best_model.onnx")`, but got the same exception.