BTW:
Task.add_requirements('tensorflow', '2.2') will make sure you get the specified version 🙂
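For example, something like this (just a sketch; as far as I know add_requirements has to be called before Task.init so the agent picks it up):

from clearml import Task

# Pin the version the agent will install; call this before Task.init
Task.add_requirements('tensorflow', '2.2')

task = Task.init(project_name='AI Calibration', task_name='Pipeline step 1 dataset artifact')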
If I have 2.4 or 2.2 in both, there is no issue
It's my error: I have tensorflow==2.2 in my venv, but added Task.add_requirements('tensorflow') without a version, which forces tensorflow==2.4:
Storing stdout and stderr log into [/tmp/.clearml_agent_out.kmqde7st.txt]
Traceback (most recent call last):
File "aicalibration/generate_tfrecord_pipeline.py", line 15, in <module>
task = Task.init(project_name='AI Calibration', task_name='Pipeline step 1 dataset artifact')
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/task.py", line 536, in init
TensorflowBinding.update_current_task(task)
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/binding/frameworks/tensorflow_bind.py", line 36, in update_current_task
PatchKerasModelIO.update_current_task(task)
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/binding/frameworks/tensorflow_bind.py", line 1412, in update_current_task
PatchKerasModelIO._patch_model_checkpoint()
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/binding/frameworks/tensorflow_bind.py", line 1450, in _patch_model_checkpoint
from tensorflow.python.keras.engine.network import Network # noqa
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/binding/import_bind.py", line 59, in __patched_import3
level=level)
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 83, in <module>
class Network(base_layer.Layer):
File "/home/username/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 379, in Network
@trackable_layer_utils.cache_recursive_attribute('dynamic')
AttributeError: module 'tensorflow.python.training.tracking.layer_utils' has no attribute 'cache_recursive_attribute'
Hmm, that shouldn't make a difference.
Could you verify it still doesn't work with TF 2.4?
I had to downgrade tensorflow from 2.4 to 2.2 though... any idea why?
Okay, could you try to run again with the latest clearml package from GitHub?
pip install -U git+
it would be completed right after the upload
Yes
Are you trying to upload_artifact to a Task that is already completed?
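You can check the Task's state right before the call, e.g. (a quick sketch; I'd expect get_status to return something like 'in_progress' or 'completed'):

from clearml import Task

task = Task.current_task()
# upload_artifact can only edit a Task that is still 'created' or 'in_progress'
print(task.get_status())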
Is it ok?
Task init
params setup
task.execute_remotely()
real_code here
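i.e. roughly this (a sketch of that flow; the parameter and file names are placeholders):

from clearml import Task

task = Task.init(project_name='AI Calibration', task_name='Pipeline step 1 dataset artifact')

params = {'batch_size': 32}      # placeholder params
params = task.connect(params)    # params setup

task.execute_remotely()          # stop local execution here; the rest runs on an agent

# real code here
fn_train = 'train.tfrecord'      # placeholder path
task.upload_artifact('train_tfrecord', artifact_object=fn_train)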
File "aicalibration/generate_tfrecord_pipeline.py", line 30, in <module>
task.upload_artifact('train_tfrecord', artifact_object=fn_train)
File "/home/usr_341317_ulta_com/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/task.py", line 1484, in upload_artifact
auto_pickle=auto_pickle, preview=preview, wait_on_upload=wait_on_upload)
File "/home/usr_341317_ulta_com/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/binding/artifacts.py", line 560, in upload_artifact
self._task.set_artifacts(self._task_artifact_list)
File "/home/usr_341317_ulta_com/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/backend_interface/task/task.py", line 1201, in set_artifacts
self._edit(execution=execution)
File "/home/usr_341317_ulta_com/.clearml/venvs-builds/3.7/lib/python3.7/site-packages/clearml/backend_interface/task/task.py", line 1771, in _edit
raise ValueError('Task object can only be updated if created or in_progress')
ValueError: Task object can only be updated if created or in_progress
2021-03-05 22:36:13,111 - clearml.Task - INFO - Waiting to finish uploads
2021-03-05 22:36:13,578 - clearml.Task - INFO - Finished uploading
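Side note: the upload itself finishes (the "Finished uploading" line); it's the artifact edit on the already-completed Task that raises. Unrelated to that status error, the traceback also shows a wait_on_upload flag on upload_artifact; a sketch of using it so the call blocks until the upload is done:

from clearml import Task

task = Task.current_task()
fn_train = 'train.tfrecord'  # placeholder
task.upload_artifact('train_tfrecord', artifact_object=fn_train, wait_on_upload=True)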
I commented the upload_artifact at the end of the code and it finishes correctly now
upload_artifact caused the "failed" issue?
Your example with the absl package worked.
Well, let me try to execute one of your samples.
Hmm... any idea what's different with this one?
Just this one... it's marked as completed when executed locally.
This is odd, and it is marked as failed?
Are all the Tasks marked failed, or is it just this one?
where can I find more info about why it failed?
Could you download and send the entire log?