Answered
Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

Hello, I am trying to retrieve a simple dict artifact uploaded in a previous task with task.upload_artifact("my_dict", dict(foo="bar")) in a second task. I tried:
previous_task = Task.get_task(task.parent)
my_dict = previous_task.get_registered_artifacts().get("my_dict")
But this gives me an empty JSON. How should I do that?

  
  
Posted 4 years ago

Answers 17


So when I create a task locally using `task = Task.init(project_name=config.get("project_name"), task_name=config.get("task_name"), task_type=Task.TaskTypes.training, output_uri="s3://my-bucket")`, the artifact is correctly logged remotely, but when I create the task remotely (from an agent), the artifact is logged locally (on the agent machine, not on S3).

  
  
Posted 4 years ago

JitteryCoyote63 okay... but let me explain a bit so you get a better intuition for next time 🙂
The Task.init call, when running remotely, assumes the Task object already exists in the backend, so it ignores whatever was in the code and uses the data stored on the trains-server, similar to what's happening with Task.connect and the argparser.
This gives you the option of adding/changing the "output_uri" for any Task regardless of the code. In the Execution tab, change the "Output Destination"; this is the same as changing the "output_uri", so even if you did not provide it in the original experiment, you can run it remotely and have your artifact uploaded to your S3.
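
To make the precedence described above concrete, here is a toy model in plain Python. It is not the trains implementation: the function name `effective_output_uri` and its parameters are hypothetical, and it only illustrates the rule that, when running remotely, the value stored on the server wins over the one written in the code.

```python
def effective_output_uri(code_value, stored_value, running_remotely):
    """Toy model (hypothetical, not library code) of which output_uri is used."""
    if running_remotely:
        # Remote run: the Task already exists in the backend, so the value
        # passed in the code is ignored in favor of the stored one.
        return stored_value
    # Local run: the value passed to Task.init in the code is used.
    return code_value


# Local run: the code value is used.
print(effective_output_uri("s3://my-bucket", None, running_remotely=False))
# Remote run with nothing stored on the server: the code value is ignored.
print(effective_output_uri("s3://my-bucket", None, running_remotely=True))
```

Under this model, a remote run with no "Output Destination" set on the server ends up with no remote destination, which would match artifacts landing on the agent machine.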

Make sense?

  
  
Posted 4 years ago

awesome! Unfortunately, calling artifact["foo"].get() gave me:
Could not retrieve a local copy of artifact foo, failed downloading file:///checkpoints/test_task/test_2.fgjeo3b9f5b44ca193a68011c62841bf/artifacts/foo/foo.json
It tries to get it from the local storage, but the JSON is stored in S3 (it does exist) and I did create both tasks specifying the correct output_uri (to S3)

  
  
Posted 4 years ago

Oh, the object is actually available in previous_task.artifacts

  
  
Posted 4 years ago

So previous_task actually ignored the output_uri

  
  
Posted 4 years ago

thanks for your help!

  
  
Posted 4 years ago

Hi JitteryCoyote63 ,
upload_artifact was designed to upload pre-made artifacts, which actually covers everything.
With register_artifact we tried to have something that constantly logs a pandas DataFrame artifact. The use case was the examples used for training and their order, so we could compare the execution of two different experiments and detect dataset contamination, etc.
Not sure it is actually useful though ...

Retrieving an artifact from a Task is done by:
Task.get_task(task_id='aaa').artifacts['foot'].get()
or if you want the file itself and not the object:
Task.get_task(task_id='aaa').artifacts['foot'].get_local_copy()

  
  
Posted 4 years ago

So get_registered_artifacts() only works for dynamic artifacts right? I am looking for a download_artifacts() which allows me to retrieve static artifacts of a Task

  
  
Posted 4 years ago

JitteryCoyote63 with pleasure 🙂
BTW: the Ignite TrainsLogger will be fixed soon (I think it's on a branch already by SuccessfulKoala55 ) to fix the bug ElegantKangaroo44 found. should be RC next week

  
  
Posted 4 years ago

Yes, thanks! In my case, I was actually using TrainsSaver from pytorch-ignite with a local path. Looking at the code, I then understood that under the hood it actually changed the output_uri of the current task, that's why my previous_task.output_uri = "s3://my_bucket" had no effect (it was placed BEFORE the training)

  
  
Posted 4 years ago

Oops, I spoke too fast, the JSON is actually not saved in S3

  
  
Posted 4 years ago

Great, looking forward!

  
  
Posted 4 years ago

It seems that around here, a Task that is created using init remotely in the main process gets its output_uri parameter ignored

  
  
Posted 4 years ago

nvm, the bug might be on my side. I will open an issue if I find an easily reproducible example

  
  
Posted 4 years ago

Setting it after the training correctly updated the task and I was able to store artifacts remotely

  
  
Posted 4 years ago

and saved locally, which is why the second task, not executed in the same machine, cannot access the file

  
  
Posted 4 years ago

even if I explicitly use previous_task.output_uri = "s3://my_bucket", it is ignored and still saves the JSON file locally

  
  
Posted 4 years ago