
just do:

```python
from clearml import Dataset
import os.path as op

dataset_folder = Dataset.get(dataset_id="...").get_local_copy()
csv_file = op.join(dataset_folder, 'salary.csv')
```
you would, but I’d advise against it, since that is not the intended way
would it be possible to change the dataset.add_files to some function that moves your files to a common folder (local or cloud), and then use the last step in the DAG to create the dataset using that folder?
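something like this for that final step (a minimal sketch; the project/dataset names and the shared folder path are placeholders):

```python
# Minimal sketch of the "create the dataset in the last step" idea.
# "my_project", "merged_dataset" and the shared folder path are placeholders.
from clearml import Dataset

dataset = Dataset.create(dataset_name="merged_dataset", dataset_project="my_project")
dataset.add_files(path="/mnt/shared/pipeline_output")  # the common folder every step wrote into
dataset.upload()    # push the files to the configured storage
dataset.finalize()  # close this dataset version
```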
I’m suggesting MagnificentWorm7 do that, yes, instead of adding the files to a ClearML dataset in each step
That’s why I’m suggesting he do that 🙂
It’s not cached directly in the ~/.clearml
folder. There are some directories inside (one for storage, one for pip, another for venvs, etc.)
So in your case it would be stored in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json
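a small sketch to check it yourself (the dataset_id is a placeholder):

```python
# Sketch: print where the cached copy of a dataset actually lands on disk.
import os
from clearml import Dataset

local_folder = Dataset.get(dataset_id="...").get_local_copy()
print(local_folder)              # typically under ~/.clearml/cache/storage_manager/datasets/
print(os.listdir(local_folder))  # the files inside the cached dataset folder
```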
ok, but in that case wouldn’t it be the clearml-server’s job to distribute the credentials to each user internally?
it would be easier for a sysadmin to centralize the bucket credentials in the clearml-server, without the need to distribute them… every user on the server has the same credentials, and they don’t need to know them… makes sense?
no, because every user trying to write to the bucket has the same credentials
I’m afraid there isn’t a way to get around this without modifying your code.
you can either add it manually to the installed packages, or remove the installed packages and use a setup.py file to manage the installation process
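if you go the setup.py route, something like this would do (a minimal sketch; the package name and dependencies are placeholders):

```python
# setup.py – minimal sketch; replace the name and requirements with your own.
from setuptools import setup, find_packages

setup(
    name="my_package",
    version="0.0.1",
    packages=find_packages(),
    install_requires=[
        "pandas",  # whatever your code actually imports
    ],
)
```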
exactly, somewhere in the running docker container
where is the dataset stored? maybe you deleted the credentials by mistake? or maybe you are not installing the libraries needed (for example if using AWS you need boto3, if GCP you need google-cloud-storage)
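a quick way to check that last point (a sketch assuming the dataset lives on S3; for GCS it would be google-cloud-storage instead of boto3):

```python
# Sketch: make sure the storage client library is importable where the agent runs.
try:
    import boto3  # needed for s3:// dataset storage
except ImportError as err:
    raise SystemExit(f"missing storage library: {err}")
```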
not that much, I was just wondering if it was possible :-)
that depends…would that only keep the latest version of each file?
I also changed the permissions of /usr/share/elasticsearch according to this post: https://techoverflow.net/2020/04/18/how-to-fix-elasticsearch-docker-accessdeniedexception-usr-share-elasticsearch-data-nodes/ , but I’m getting the same error
before, the repo was already in the docker image, but now the agent is running inside the docker container (so it sets up a virtualenv, clones the repo, and installs the packages)
if I were to run an agent that would need to install pandas at some point, I’d run it like this:

```
OPENBLAS="$(brew --prefix openblas)" clearml-agent daemon --queue default
```
but would installing git+ <user>/rmdatasets install rmdatasets == 0.0.1?
Aren’t they redundant?
please remove rmdatasets == 0.0.1
I could map the root folder of the repo into the container, but that would mean everything ends up in there
oh but docker ps shows me 8081 ports for the webserver, apiserver and fileserver containers
```
CONTAINER ID   IMAGE                      COMMAND                   CREATED         STATUS         PORTS                                                             NAMES
0b3f563d04af   allegroai/clearml:latest   "/opt/clearml/wrappe…"    7 minutes ago   Up 7 minutes   8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp   clear...
```