You can save it as a dataset and then fetch it at run time, or am I missing something?
Could I simply reference the files by name and pass in a string such as ~/.clearml/my_file.json?
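A rough sketch of the "save it as a dataset" half, in case it helps. The helper name and its arguments are made up for illustration. It assumes the clearml package is installed and configured, and uses Dataset.create, add_files, upload and finalize from the ClearML SDK.

```python
import os


def upload_file_as_dataset(file_path, dataset_project, dataset_name):
    """Upload one local file as a ClearML dataset and return the dataset id.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Fail early if the file is not there, before touching ClearML at all.
    if not os.path.isfile(file_path):
        raise FileNotFoundError(file_path)

    # Imported lazily so the check above works even without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_project=dataset_project,
        dataset_name=dataset_name,
    )
    dataset.add_files(path=file_path)  # register the file with the dataset
    dataset.upload()                   # push the file to the configured storage
    dataset.finalize()                 # close the dataset so it can be fetched
    return dataset.id
```

This only needs to run once, from any machine that has file.json; the training code then fetches the dataset by project and name.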
After proving we can run our training, I would then advise we update our code base
This would be a short term solution as we build a proof of concept
ok, but if you were to run it from a different machine (or a different user!) it wouldn’t work
Sure. My git repo myProject.git
does not have file.json
checked into VCS. I'd like to add this file at experiment runtime or equivalent.
I assumed I would need to upload it and then reference it somehow?
do I have to fetch it via code? I was hoping to not modify my scripts
you would, but I’d advise against it, since that is not the intended way
ClearML downloads/caches datasets to ~/.clearml/
folder, so yes, you need to modify your code:
dataset_folder = Dataset.get(dataset_project=, dataset_name=, dataset_version=).get_local_copy()
file_json_path = os.path.join(dataset_folder, 'file.json')
I’m afraid I don’t think there is a way around this without modifying your code.
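For reference, a minimal sketch of what that code change could look like. The two function names are made up, and the project and dataset names are whatever you used when uploading. It assumes clearml is installed and uses Dataset.get(...).get_local_copy() from the SDK.

```python
import os


def resolve_dataset_file(dataset_folder, file_name="file.json"):
    """Return the path of file_name inside a local dataset copy."""
    path = os.path.join(dataset_folder, file_name)
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    return path


def fetch_file_json(dataset_project, dataset_name):
    """Download (or reuse the cached copy of) the dataset and locate file.json.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Imported lazily so resolve_dataset_file can be used without clearml.
    from clearml import Dataset

    dataset_folder = Dataset.get(
        dataset_project=dataset_project,
        dataset_name=dataset_name,
    ).get_local_copy()
    return resolve_dataset_file(dataset_folder)
```

The rest of the script can keep taking a plain file path; only the place where that path comes from changes.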
so it caches to ~/.clearml/ any files that are under the same project name?
I wouldn't be able to pass in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json
as an argument?
It is not directly cached in the ~/.clearml
folder. There are some directories inside (one for storage, one for pip, another for venvs, etc.).
So in your case it would be stored in ~/.clearml/cache/storage_manager/datasets/ds_{ds_id}/my_file.json
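To illustrate why passing that path as a plain string is fragile, here is a small sketch that builds it from the layout quoted above. The helper name and the example id are made up, and it assumes the default cache location; it may differ if the cache folder is configured elsewhere.

```python
import os


def cached_dataset_file(ds_id, file_name, home=None):
    """Build the cache path for a dataset file, following the layout above.

    Hypothetical helper: assumes the default ~/.clearml cache location.
    """
    # "~" expands to a different directory for every user and machine.
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(
        home, ".clearml", "cache", "storage_manager",
        "datasets", f"ds_{ds_id}", file_name,
    )


# Same dataset id, two different users: the resulting paths do not match.
alice = cached_dataset_file("abc123", "my_file.json", home="/home/alice")
bob = cached_dataset_file("abc123", "my_file.json", home="/home/bob")
```

The path also only exists after the dataset has been downloaded on that machine, which is why fetching it with Dataset.get is the safer route.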