the problem was docker, which had as entrypoint a bash script with python train.py --epochs=300 hardcoded, so I guess it was never actually running the task setup from clearml.
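roughly, the entrypoint was something like this, so whatever command the agent tried to run was never used:
```
#!/bin/bash
# hardcoded training command baked into the image - the ClearML task setup never runs
python train.py --epochs=300
```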
not that much, I was just wondering if it was possible :-)
but would installing git+<user>/rmdatasets install rmdatasets == 0.0.1?
Aren’t they redundant?
please remove rmdatasets == 0.0.1
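i.e. in the requirements it would look something like this (the repo URL is just a placeholder):
```
git+<repo_url>/rmdatasets    # already installs rmdatasets from the repo
rmdatasets == 0.0.1          # redundant, this line can be removed
```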
would it be possible to change the dataset.add_files to some function that moves your files to a common folder (local or cloud), and then use the last step in the DAG to create the dataset using that folder?
yes, I’m suggesting MagnificentWorm7 do that, instead of adding the files to a ClearML dataset in each step
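something along these lines is what I have in mind (folder, project and dataset names are just placeholders):
```
import os
import shutil
from clearml import Dataset

STAGING_DIR = "/mnt/shared/staging"  # common folder all steps write to (local or cloud-synced)

def step_save_files(files):
    # each pipeline step just drops its files into the common folder
    os.makedirs(STAGING_DIR, exist_ok=True)
    for f in files:
        shutil.copy(f, STAGING_DIR)

def final_step_create_dataset():
    # the last step in the DAG creates a single dataset from that folder
    ds = Dataset.create(dataset_project="my_project", dataset_name="my_dataset")
    ds.add_files(STAGING_DIR)
    ds.upload()
    ds.finalize()
    return ds
```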
can you share your clearml.conf file? it should do that automatically if you set the development.default_output_uri key to “s3://{your_bucket}”
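for reference, that key lives under the sdk section of clearml.conf; something like this (the bucket is just an example):
```
sdk {
    development {
        default_output_uri: "s3://my-bucket/clearml"
    }
}
```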
if I squash, this will rewrite the datasets, right? I want a new dataset, but to keep those there
that depends…would that only keep the latest version of each file?
Thanks for the answer. You’re right. I forgot to add that this task runs inside a docker container and I’m currently only mapping the $PWD (the ml folder) into the /app folder in the container.
I could map the root folder of the repo into the container, but that would mean everything ends up in there
so when inside the docker, I don’t see the git repo and that’s why ClearML doesn’t see it
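i.e. roughly the difference between these two (image name and paths are placeholders):
```
# current: run from the ml folder, so only that subfolder is visible and .git is not
docker run -v "$PWD":/app my_image

# alternative: mount the repo root, so the git repo is detectable, but everything ends up in the container
docker run -v /path/to/repo:/app my_image
```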
oh ok, I was wondering if this could have been an issue: agent.venvs_cache.free_space_threshold_gb = 2.0
That’s why I’m suggesting he do that 🙂
right, I’m saying I had to do that on my Mac. In your case you would have to point it somewhere else. Please check where OpenBLAS is installed on your Ubuntu machine
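e.g. one of these should show where it is (assuming it was installed through apt):
```
ldconfig -p | grep -i openblas
dpkg -L libopenblas-dev | grep "\.so"
```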
before, the repo was already inside the docker image, but now it is running the agent inside the docker (so setting up a virtualenv, cloning the repo, and installing the packages)
in Linux you can run in a terminal: export CLEARML_CONFIG_FILE=/new/path/to/conf
but it would only affect that terminal session… so you would want to add it to your .bashrc
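i.e. something like:
```
# keep it for every new shell
echo 'export CLEARML_CONFIG_FILE=/new/path/to/conf' >> ~/.bashrc
source ~/.bashrc
```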
line 120 says unmark to enable venv caching (it comes commented out by default, but since I’m copying my conf it isn’t commented there)
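roughly, the section I mean looks like this once the path line is uncommented (the path is the default one):
```
agent {
    venvs_cache: {
        free_space_threshold_gb: 2.0
        # unmark to enable virtual environment caching
        path: ~/.clearml/venvs-cache
    }
}
```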
Hi! What the error is saying is that it is looking for the ctbc/image_classification_CIFAR10.py file in your repo.
So when you created the task you were inside a git repo, and ClearML assumed that all your files in it were committed and pushed. However your repo https://github.com/gradient-ai/PyTorch.git doesn’t contain these files
also I suggested changing the TMPDIR env variable, since /tmp/ didn’t have a lot of space.
agent.environment.TMPDIR = ****
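in the conf that would look something like this (the path here is just an example, the real one is the masked value above):
```
agent {
    environment {
        TMPDIR: "/mnt/data/tmp"
    }
}
```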
is it ok to see **** instead of the actual path?
ClearML downloads/caches datasets to the ~/.clearml/ folder, so yes, you need to modify your code:
dataset_folder = Dataset.get(dataset_project=..., dataset_name=..., dataset_version=...).get_local_copy()
file_json_path = os.path.join(dataset_folder, 'file.json')
It’s not directly cached in the ~/.clearml folder. There are some directories inside (one for storage, one for pip, another for venvs, etc.). So in your case it would be stored in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json
I’m afraid there is no way to get around this without modifying your code.
you would, but I’d advise against it, since that is not the intended way