Could I just reference the files by name and pass in a string such as ~/.clearml/my_file.json?
ClearML downloads/caches datasets to the ~/.clearml/ folder, so yes, you need to modify your code:
dataset_folder = Dataset.get(dataset_project=, dataset_name=, dataset_version=).get_local_copy()
file_json_path = os.path.join(dataset_folder, 'file.json')
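The fetch described above can be sketched roughly as follows. This is a minimal sketch, assuming the `clearml` package is installed and configured; the helper names `file_in_dataset` and `fetch_dataset_file` are hypothetical, and the project, dataset, and file names are placeholders you would replace with your own.

```python
import os


def file_in_dataset(dataset_folder: str, filename: str) -> str:
    """Build the path to one file inside a local dataset copy."""
    return os.path.join(dataset_folder, filename)


def fetch_dataset_file(dataset_project: str, dataset_name: str, filename: str) -> str:
    """Fetch a dataset (or reuse its cached copy) and return the path to one file."""
    # Imported lazily so the path helper above works without clearml installed.
    from clearml import Dataset

    dataset = Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)
    # get_local_copy() returns the local cache folder, so the cache path
    # never has to be hardcoded in the script.
    dataset_folder = dataset.get_local_copy()
    return file_in_dataset(dataset_folder, filename)
```

A call such as `fetch_dataset_file("my_project", "my_dataset", "file.json")` (placeholder names) would then give the script a path it can open, regardless of which machine or user runs it.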
You can save it as a dataset and then fetch it at run time, or am I missing something?
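Saving the file as a dataset could look something like this. It is only a sketch, assuming the `clearml` package is installed and configured against a server; the function name `upload_file_as_dataset` is hypothetical, and the project and dataset names are whatever you choose.

```python
import os


def upload_file_as_dataset(file_path: str, dataset_project: str, dataset_name: str) -> str:
    """Register a single local file as a ClearML dataset and return the dataset ID."""
    # Fail early, before contacting the server, if the file is missing.
    if not os.path.isfile(file_path):
        raise FileNotFoundError(file_path)

    # Imported lazily so the check above works without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(dataset_name=dataset_name, dataset_project=dataset_project)
    dataset.add_files(path=file_path)  # stage the file in the new dataset version
    dataset.upload()                   # push the staged file to storage
    dataset.finalize()                 # close the version so it can be fetched
    return dataset.id
```

This would be run once, from the machine that has the file; the training code then fetches the dataset by project and name at run time.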
Sure. My git repo myProject.git does not have file.json checked into VCS. I'd like to add this file at experiment runtime or equivalent.
After proving we can run our training, I would then advise we update our code base.
I wouldn't be able to pass in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json as an argument?
Do I have to fetch it via code? I was hoping not to modify my scripts.
This would be a short-term solution as we build a proof of concept.
I’m afraid I don’t think there is a way around this without modifying your code.
It is not cached directly in the ~/.clearml folder. There are some directories inside (one for storage, one for pip, another for venvs, etc.).
So in your case it would be stored in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json
you would, but I’d advise against it, since that is not the intended way
so it caches to ~/.clearml/ any files that are under the same project name?
I assumed I would need to upload it and then reference it somehow?
ok, but if you were to run it from a different machine (or a different user!) it wouldn’t work