where is the dataset stored? maybe you deleted the credentials by mistake? or maybe you are not installing the libraries needed (for example, if using AWS you need boto3; if GCP, you need google-cloud-storage)
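a quick sanity check for the AWS case could look like this (just a sketch; the bucket name is hypothetical):
```python
# verify that boto3 is installed and the credentials actually resolve
import boto3

s3 = boto3.client("s3")
s3.head_bucket(Bucket="my-clearml-datasets")  # raises if the credentials or permissions are wrong
```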
ok, but if you were to run it from a different machine (or a different user!) it wouldn’t work
Thanks TimelyPenguin76 for your answer! So indeed it was mounting it. And how do I check that CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL is actually taking effect in my agent running in Docker?
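(one quick way to check, as a sketch: since the agent passes the flag into the container environment, you can print it from inside the running task)
```python
# run inside the task the agent launched in the container
import os
print(os.environ.get("CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL"))  # expect "1" if the flag reached the container
```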
great! and I saw that there were some system packages needed for opencv that were installed automatically and could be turned off. Now I'm just wondering if I could remove the pip install at the very beginning, so it starts straight away
the problem was the Docker image, which had as its entrypoint a bash script with `python train.py --epochs=300` hardcoded, so I guess it was never actually running the task setup from ClearML.
Hi AgitatedDove14, I'm talking about the following pip install. After that pip install, it displays the agent's conf, shows the installed packages, and launches the task (no installation):
```
Running in Docker mode (v19.03 and above) - using default docker image: spoter ['-e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1', '-e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1']
Running task '3ebb680b17874cda8dc7878ddf6fa735'
Storing stdout and stderr log to '/tmp/.clearml_agent_out.tsu2tddl.txt', '/tmp/.clearml_agent_o...
```
you would, but I’d advise against it, since that is not the intended way
ClearML downloads/caches datasets to the ~/.clearml/ folder, so yes, you need to modify your code:
```python
import os
from clearml import Dataset

dataset_folder = Dataset.get(dataset_project="...", dataset_name="...", dataset_version="...").get_local_copy()
file_json_path = os.path.join(dataset_folder, 'file.json')
```
I'm afraid I don't think there is a way around this without modifying your code.
just do:
```python
import os.path as op
from clearml import Dataset

dataset_folder = Dataset.get(dataset_id="...").get_local_copy()
csv_file = op.join(dataset_folder, 'salary.csv')
```
ok, I entered the container, replaced all occurrences of 8081 with 8085 in every file, committed the container, and changed the docker-compose.yml to use that image instead of allegroai/clearml:latest
and now it works 🙂
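for reference, the docker-compose.yml change would look something like this (a sketch; the service shown and the committed image tag are assumptions, 8081 being the default fileserver port):
```yaml
services:
  fileserver:
    image: clearml-8085:latest   # the committed image, instead of allegroai/clearml:latest
    ports:
      - "8085:8085"              # was "8081:8081"
```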
so when inside the Docker container, I don't see the git repo, and that's why ClearML doesn't see it
that depends…would that only keep the latest version of each file?
if I squash, this will rewrite the datasets, right? I want a new dataset, but I want to keep the existing ones as well
not that much, I was just wondering if it was possible :-)
right, I'm saying I had to do that on my Mac. In your case you would have to point it somewhere else. Please check where openblas is installed on your Ubuntu machine
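(something like this should show where it lives; just a sketch)
```
ldconfig -p | grep openblas
```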
mmm, can you try the following:
1. create a new folder with no git repo, and copy those two notebooks
2. launch the notebook with the base task and copy the task ID
3. launch the notebook with the hyperopt task, modifying the TEMPLATE_TASK_ID variable accordingly
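(step 3 would look roughly like this inside the hyperopt notebook; a minimal sketch, where the project/metric names, the parameter range, and running locally are placeholders/assumptions, not your actual setup)
```python
from clearml import Task
from clearml.automation import HyperParameterOptimizer, RandomSearch, UniformIntegerParameterRange

task = Task.init(project_name="examples", task_name="hyperopt controller")

TEMPLATE_TASK_ID = "..."  # paste the task ID copied from the base-task notebook

optimizer = HyperParameterOptimizer(
    base_task_id=TEMPLATE_TASK_ID,  # clones of this task are launched with new hyperparameters
    hyper_parameters=[
        UniformIntegerParameterRange("General/epochs", min_value=10, max_value=100),
    ],
    objective_metric_title="validation",  # placeholder metric title/series
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=RandomSearch,
)
optimizer.start_locally()  # run the clones locally instead of enqueueing them
optimizer.wait()
optimizer.stop()
```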
would it be possible to change the dataset.add_files to some function that moves your files to a common folder (local or cloud), and then use the last step in the DAG to create the dataset using that folder?
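that last step could then look something like this (a sketch; the project/dataset names and the staging folder are assumptions):
```python
from clearml import Dataset

# final DAG step: build a single dataset from the shared staging folder
ds = Dataset.create(dataset_project="my_project", dataset_name="merged_dataset")
ds.add_files(path="/mnt/shared/staging")  # the common folder the earlier steps wrote to
ds.upload()
ds.finalize()
```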
so I removed the entrypoint, and now I can see that it tries to install the packages, but it fails because it can’t download the repo
you can either add it manually to the installed packages, or remove the installed packages and use a setup.py file to manage the installation process
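a minimal setup.py for that second route might look like this (a sketch; the package name and dependency list are placeholders):
```python
from setuptools import setup, find_packages

setup(
    name="my_package",    # hypothetical package name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "opencv-python",  # list your actual runtime dependencies here
    ],
)
```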
Hey! When you say it wasn’t enough, what do you mean? Can you launch the web UI?
also, I suggested changing the TMPDIR env variable, since /tmp/ didn't have a lot of space:
agent.environment.TMPDIR = ****
is it ok to see **** instead of the actual path?
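(for reference, the unmasked setting in clearml.conf would look something like this; the path is just an example)
```
agent {
    environment {
        TMPDIR: "/data/tmp"   # point to a volume with more free space
    }
}
```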
oh ok, I was wondering if this could have been an issue: agent.venvs_cache.free_space_threshold_gb = 2.0
Right, but there is a lot of free space (257 GB) in the home folder