I have a task that is already completed, and in another script I am trying to load it and analyse the results.
I am using Hydra to configure my experiments. Specifically, I want to retrieve the OmegaConf data created by Hydra. Calling config = task.get_configuration_objects() returns a string with those values, but I do not know how to parse it, or whether I can get this data as a nested dict.
Thank you. Now I am getting AttributeError: 'DummyModel' object has no attribute 'model_design' when calling task.update_output_model("best_model.onnx"). I checked the code and thought it was related to the model not having a config defined, so I tried to set it with task.set_model_config(cfg.model), but I am still getting the error.
I get the same error:
⋊> /d/c/p/c/e/reporting on master ◦ python model_config.py (longoeixo) 17:48:14
ClearML Task: created new task id=xxx
ClearML results page: xxx
` Any model stored from this point onwards,...
So downgrading to python 3.8 would be a workaround?
Thank you. After running the script, should I run docker-compose -f /opt/clearml/docker-compose.yml up -d?
Yes, but the issue arises because rmdatasets is installed in my local environment. I needed it installed in order to test the code locally, so it gets picked up in the package list.
I will probably stop installing the sibling packages and add them manually to sys.path instead.
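A minimal sketch of that sys.path approach follows. The helper name add_sibling_packages and the repository layout (each sibling project in its own directory under the repo root) are assumptions for illustration, not something confirmed in the thread.

```python
import sys
from pathlib import Path


def add_sibling_packages(repo_root, project_dirs):
    """Prepend sibling project directories to sys.path so they import without pip.

    Each entry in project_dirs is the directory that *contains* the importable
    package (assumed layout), given relative to repo_root.
    """
    added = []
    for name in project_dirs:
        path = str((Path(repo_root) / name).resolve())
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


# Example call (hypothetical layout): run this before importing rmdatasets.
# add_sibling_packages(Path(__file__).resolve().parent.parent, ["rmdatasets"])
```

Because the package is never installed, it would not show up in pip freeze, which is the point of the workaround.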
Thank you. I have defined the AMI manually instead of using the default, and now I am getting the following error:
Error: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes
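A quick way to see how close a startup script is to that limit is to measure its size in bytes. The sketch below assumes the limit applies to the raw text before base64 encoding; the function names are hypothetical.

```python
USER_DATA_LIMIT_BYTES = 16384  # value quoted in the error message above


def user_data_size(script: str) -> int:
    """Size of the script in bytes when encoded as UTF-8."""
    return len(script.encode("utf-8"))


def fits_user_data_limit(script: str, limit: int = USER_DATA_LIMIT_BYTES) -> bool:
    """True if the script is within the limit (assumed to apply to raw bytes)."""
    return user_data_size(script) <= limit
```

Anything embedded in the script, such as a full configuration file, counts toward the total, so trimming embedded content is the usual way to get under the limit.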
Solved by removing default parts.
Now I am seeing strange behavior: I have 2 tasks in the queue, the autoscaler fires up two EC2 instances and then turns them off without running the tasks, then it fires up two new instances again, in a loop.
I used the autogenerated clearml.conf; I will try erasing the unnecessary parts.
I just called the script with:
task.set_base_docker(
    docker_image="nvidia/cuda:11.7.0-runtime-ubuntu22.04",
    # docker_arguments="--privileged -v /dev:/dev",
)
task.execute_remotely(queue_name="default")
Then in the console:
` Exception: Command '['/usr/bin/python3', '-m', 'poetry', 'config', '--local', 'virtualenvs.in-project', 'true']' returned non-zero exit status 1.
Error: Failed configuring Poetry virtualenvs.in-project
failed installing poetry requirements: Comman...
The failing part of the log follows:
` Requirement already satisfied: pip in /root/.clearml/venvs-builds/3.1/lib/python3.10/site-packages (22.2.2)
Collecting Cython
Using cached Cython-0.29.32-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
Installing collected packages: Cython
Successfully installed Cython-0.29.32
Collecting boto3==1.24.59
Using cached boto3-1.24.59-py3-none-any.whl (132 kB)
ERROR: Could not find a version that satisfies the requ...
Hi, sorry for the delay. rmdatasets == 0.0.1 is the name of the local package that lives in the same repo as the training code. It is listed by name and version instead of by the relative path to the package.
As a workaround, I enabled the setting that forces the use of requirements.txt, and I am using this script to generate it:
` import os
import subprocess
output = subprocess.check_output(["pip", "freeze"]).decode()
with open("requirements.txt", "w") as f:
    for line in output.split("\n"):
        if " @" in line...
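The script above is cut off, so what it does with the matching lines is not shown. As a separate sketch, and only under the assumption that the goal is to keep locally installed packages out of requirements.txt, a filter over pip freeze output might look like the following. The function name and the sample file path are hypothetical.

```python
def filter_freeze_output(freeze_text, local_packages=()):
    """Return pip-freeze lines, skipping direct references and named local packages.

    Assumption: lines of the form 'name @ file:///...' or '-e ...' point at
    local or direct installs that a remote worker cannot resolve.
    """
    skip = {name.lower() for name in local_packages}
    kept = []
    for line in freeze_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if " @ " in line or line.startswith("-e "):
            continue
        name = line.split("==")[0].strip().lower()
        if name in skip:
            continue
        kept.append(line)
    return kept


# Hypothetical sample input; the file path is made up for illustration.
sample = (
    "boto3==1.24.59\n"
    "rmdatasets @ file:///home/user/repo/rmdatasets\n"
    "Cython==0.29.32\n"
)
print(filter_freeze_output(sample))
```

Note that skipping every " @ " line also drops packages pinned to a URL, so a stricter check may be needed if those are used.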