This is my workflow; code to reproduce:
import pandas as pd
from clearml import Task


def test_something():
    # 1. Create a new task
    task = Task.create(
        project_name="Playground",
        task_name="test",
    )

    # 2. Create a new pandas data frame and upload it as an artifact
    test_df = pd.DataFrame(
        {
            "col1": [1],
            "col2": [2],
        }
    )
    task.upload_artifact("test_df", test_df)
    task_id = task.task_id
    task.close()

    task = task.init(
        project_name="Playground",
        task_name="test",
        reuse_last_task_id=task_id,
        continue_last_task=True,
    )

    # 3. Download the pandas data frame
    downloaded_df = task.artifacts["test_df"].get()

    # 4. Add a new row to the data frame and upload it again under the same name
    #    (the docs say this then counts as an update)
    new_row = {
        "col1": 3,
        "col2": 4,
    }
    updated_df = downloaded_df.append(new_row, ignore_index=True)
    task.upload_artifact("test_df", updated_df)
    task.close()

    task = task.init(
        project_name="Playground",
        task_name="test",
        reuse_last_task_id=task_id,
        continue_last_task=True,
    )

    # 5. Download the pandas data frame again -> it still has one row
    #    (the first version is loaded from the cache!)
    downloaded_df = task.artifacts["test_df"].get()
Also, make sure you use Task.init instead of task.init.
Hey @ThickSnake12, how exactly do you access the artifact next time? Can you provide a code sample?
You can try adding the force_download=True flag to .get() to ignore the locally cached content. Let me know if it helps.
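A minimal sketch of that suggestion, assuming `task` is a ClearML task object like the one in the reproduction code. The helper name `get_fresh_artifact` is made up for illustration and is not part of ClearML; the only library call used is `.get(force_download=True)` on the artifact, as mentioned above.

```python
def get_fresh_artifact(task, name):
    """Return the artifact `name` from `task`, skipping the local cache.

    `get_fresh_artifact` is a hypothetical helper, not a ClearML function.
    `task` is assumed to expose `task.artifacts[name]` as in the thread.
    """
    # force_download=True asks .get() to fetch the stored artifact again
    # instead of reusing the locally cached copy.
    return task.artifacts[name].get(force_download=True)
```

In step 5 of the reproduction code this would be called as `downloaded_df = get_fresh_artifact(task, "test_df")`, or simply inlined as `task.artifacts["test_df"].get(force_download=True)`.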
Yes, thanks a lot, it works! I had not seen the parameter, and the Task.init is correct…