
ahhh okay the logs are in a closed environment but i will try to extract what i can 🙏
SuccessfulKoala55 SweetBadger76 hey guys i tried to run this line task.set_base_docker("<image> -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/opt/conda/envs/rapids/bin/python -e CLEARML_AGENT__AGENT__PACKAGE_MANAGER__TYPE=conda -e CLEARML_AGENT__VENV_DIR=/opt/conda/envs")
but it is throwing a conda DirectoryNotACondaEnvironmentError, expecting a python 3.8 environment. Am i missing something here?
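For reference, the flags string passed to set_base_docker above can be assembled like this (a minimal sketch; the env-var names come from this thread, "<image>" is a placeholder, and note the double underscore before TYPE, which is easy to lose when the line wraps):

```python
# Sketch: building the docker args string used with task.set_base_docker()
docker_image = "<image>"  # placeholder -- replace with your actual image
env_vars = {
    "CLEARML_AGENT_SKIP_PIP_VENV_INSTALL": "/opt/conda/envs/rapids/bin/python",
    "CLEARML_AGENT__AGENT__PACKAGE_MANAGER__TYPE": "conda",
    "CLEARML_AGENT__VENV_DIR": "/opt/conda/envs",
}
# each env var becomes a "-e KEY=VALUE" docker flag
env_flags = " ".join(f"-e {k}={v}" for k, v in env_vars.items())
base_docker = f"{docker_image} {env_flags}"
# task.set_base_docker(base_docker)  # would be called on a clearml Task
```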
Hmm that's strange... AgitatedDove14 i'm using the 1.0.5 pypi package as well as the most recent server from this command - curl https://raw.githubusercontent.com/allegroai/trains-server/master/docker/docker-compose.yml -o docker-compose.yml (iirc it should be 1.1.1)
AgitatedDove14 i'm still getting this error in 1.0.6rc2 tho
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/aaron/.pyenv/versions/3.7.8/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/aaron/.pyenv/versions/3.7.8/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aaron/.pyenv/versions/dev_env/lib/python3.7/site-packages/clearml/a...
oh okay i assumed the new docker-compose would pull the latest image. erm i pulled the new image from docker and restarted the server. But it still seems to be sending me the same error tho
AgitatedDove14 yup it's with target project, the code was really just...
from clearml import PipelineController

# Creating the pipeline
pipe = PipelineController(target_project="pipeline-demo", default_execution_queue='default', add_pipeline_tags=False)
pipe.add_step(name='model-predict', base_task_id='17cc79f0dae0426d9354ds08d979980g')
pipe.start()
# Wait until pipeline terminates
pipe.wait()
# cleanup everything
pipe.stop()
print('pipeline completed')
yes it is running the parent hpo task.
oops sorry i found the repo in the .clearml/venv-builds/ folder but i'm not sure why the remainder of the code still isn't executed
Yes actually, i'm trying to access the cudf/cuml libraries from rapids, and the official guide insists that these libraries within the image have to be used with conda
Hey SuccessfulKoala55 , i figured out a workaround to the problem and just wanted to close the loop. Rapids requires c++ code to be integrated into their package along with auxiliary packages inside their prebuilt image, and the pip ecosystem currently doesn't support their requirements https://medium.com/rapids-ai/rapids-0-7-release-drops-pip-packages-47fc966e9472 (hence the need to use conda). Instead of trying to run conda with clearml-agent i figured it might be possible to pass the PYTHON...
oh ahahah you meant the sdk right? yea i noticed some new pipeline functionalities...was gonna wait for an official release but yea sure i will try it. Thanks mate!
within the docker image and the conda environment
Hey AgitatedDove14 , so i have gotten the latest server version (as shown in the image from the bottom right of the user profile page) and still no luck with a simple test example like this using clearml-1.0.6.rc2
from clearml import Task, StorageManager, Dataset, PipelineController
# Creating the pipeline
pipe = PipelineController(target_project="pipeline-demo", default_execution_queue='128RAMv100', add_pipeline_tags=False)
pipe.add_step(name='predict', base_task_id...
SuccessfulKoala55 Yes, i believe if it's within the .set_base_docker(...) method i should be able to? Is there a specific env variable i can set?
Oops just closing the loop here, it turned out it was a permissions error on the .clearml cache that was blocking. All's well now hahah thanks!
Same error without the VENV_DIR variable.
oops sorry typo it was already double spaced
task.set_base_docker("<image> -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/opt/conda/envs/rapids/bin/python -e CLEARML_AGENT__AGENT__PACKAGE_MANAGER__TYPE=conda")
Erm i'm not sure if the pics show what you mean. but essentially the hpo is running on the hpopt queue and my worker "fixitfelix" is assigned to the queue and is supposedly running the experiment already
yup yup the code runs good locally
It intermittently reads the requirements.txt from either my repo or the cache. I'm wondering if there is any way to circumvent the cache?
CostlyOstrich36 ahhh i suspect the error might be coming from using a cached repository? e.g. Using cached repository in "/root/.clearml/vcs-cache/<my repository>"
it seems like it is trying to install a requirements.txt that was cached but isn't available anymore, and there are occasions where the installed packages do not reflect the complete list of what was specified in the repository's requirements.txt. Could this be a possibility for the error (either not detecting the complete list...
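On the cache question: if reconfiguring the agent ever becomes an option, the vcs cache can be switched off in the agent's clearml.conf. A sketch, assuming the agent's standard vcs_cache setting (check your agent's config reference before relying on it):

```
# clearml.conf fragment on the agent machine (assumed standard option):
# disabling the vcs cache makes every run re-clone the repository
# instead of reusing /root/.clearml/vcs-cache
agent {
    vcs_cache {
        enabled: false
    }
}
```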
CostlyOstrich36 I'm using clearml_agent v1.1.2 on multiple agents in the same machine
Ahh okay this was the specific replication of the environment: task.set_base_docker("rapidsai/rapidsai-dev:21.10-cuda11.0-devel-ubuntu18.04-py3.8 -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/opt/conda/envs/rapids/bin/python -e CLEARML_AGENT__AGENT__PACKAGE_MANAGER__TYPE=conda")
ideally the code should be able to import cuml
CostlyOstrich36 hmmm i doubt so, i'm the only one using the machine for this particular experiment at the moment.
mmm are there any methods to approach this (toggling between pip and conda mode) at the code level? i'm actually not allowed to reconfigure the agents as a developer-user.
it was pre-built by rapidsai themselves
hmmm unfortunately it isn't as straightforward...installing it via pip throws this exception - Exception: Please install cuml via the rapidsai conda channel. See https://rapids.ai/start.html for instructions.
This is the parent task...shown in the pic of the experiment list above
ahhhh i see...so if i were to logically split a single agent into multiple queues would that allow me to run more tasks?
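The splitting idea above can be sketched as shell commands (assuming clearml-agent is installed and configured; the queue names are taken from this thread). Note that concurrency comes from running more agent daemons, not from adding queues to a single daemon -- each daemon services one task at a time:

```
# two daemons on one machine, each watching its own queue,
# allow two tasks to run concurrently (resources permitting)
clearml-agent daemon --queue default --detached
clearml-agent daemon --queue hpopt --detached
```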