TimelyPenguin76, any news regarding this?
TimelyPenguin76, thank you for being willing to help. A small project is attached. load_mnist.py generates a dataset; model_train.py is the script in question (it uses the dataset generated by load_mnist.py).
` Current configuration (clearml_agent v0.17.2, location: /home/olga/clearml.conf):
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.api_server =
api.web_server =
api.files_server =
api.credentials.access_key = G3TPYELRSKPW2HAHRDN1
api.host =
agent.worker_id =
agent.worker_name = anyclt104
agent.force_git_ssh_protocol = false
agent.python_binary =
agent.package_manager.type = pip
agent.package_manager.pip_version = <20.2
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = defaults
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = pytorch
agent.package_manager.torch_nightly = false
agent.package_manager.force_repo_requirements_txt = true
agent.venvs_dir = /home/olga/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/olga/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/olga/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/olga/.clearml/pip-cache
agent.docker_apt_cache = /home/olga/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:10.1-runtime-ubuntu18.04
agent.enable_task_env = false
agent.git_user = <git_username>
agent.default_python = 3.6
agent.cuda_version = 0
agent.cudnn_version = 0
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key = AKIAQ2LTCCTRIL5XE2ND
sdk.aws.s3.region = us-east-1
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
sdk.development.worker.console_cr_flush_period = 10
Executing task id [af179f16f48240fb9b9b7d79612e1626]:
repository = git@bitbucket.org:olga-gorun/clearml_examples_pipeline.git
branch = master
version_num = 5d6c20ead331356ea9e573eaa4e8184c7a63814d
tag =
docker_cmd = None
entry_point = model_test.py
working_dir = .
[package_manager.force_repo_requirements_txt=true] Skipping requirements, using repository "requirements.txt"
Using real prefix '/usr'
Path not in prefix '/home/olga/.virtualenv/clearml/include/python3.6m' '/usr'
New python executable in /home/olga/.clearml/venvs-builds/3.6/bin/python3.6
Also creating executable in /home/olga/.clearml/venvs-builds/3.6/bin/python
Installing setuptools, pip, wheel...
done.
Using cached repository in "/home/olga/.clearml/vcs-cache/clearml_examples_pipeline.git.af2cd3e0a9acd50768408558dd07db2e/clearml_examples_pipeline.git"
Note: checking out '5d6c20ead331356ea9e573eaa4e8184c7a63814d'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 5d6c20e first clearml pipeline
type: git
url: git@bitbucket.org:olga-gorun/clearml_examples_pipeline.git
branch: HEAD
commit: 5d6c20ead331356ea9e573eaa4e8184c7a63814d
root: /home/olga/.clearml/venvs-builds/3.6/task_repository/clearml_examples_pipeline.git
Collecting pip<20.2
Using cached pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 21.1.2
Uninstalling pip-21.1.2:
Successfully uninstalled pip-21.1.2
Successfully installed pip-20.1.1
Collecting Cython
Using cached Cython-0.29.23-cp36-cp36m-manylinux1_x86_64.whl (2.0 MB)
Installing collected packages: Cython
Successfully installed Cython-0.29.23
Adding venv into cache: /home/olga/.clearml/venvs-builds/3.6
Running task id [af179f16f48240fb9b9b7d79612e1626]:
[.]$ /home/olga/.clearml/venvs-builds/3.6/bin/python -u model_test.py
Summary - installed python packages:
pip:
- Cython==0.29.23
Environment setup completed successfully
Starting Task Execution:
Storing stdout and stderr log into [/tmp/.clearml_agent_out.3nkzl06j.txt]
Traceback (most recent call last):
File "model_test.py", line 2, in <module>
from keras.models import model_from_json
ModuleNotFoundError: No module named 'keras'
Leaving process id 9078 `
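Note on the log above: with `agent.package_manager.force_repo_requirements_txt = true`, the agent reports installing only what the repository's "requirements.txt" lists, and the package summary shows just Cython, so `keras` was most likely never listed there. Below is a minimal sketch, not part of ClearML, for checking a script's imports against a requirements file before enqueuing. The helper names are hypothetical, and it assumes the import name matches the package name, which is not always true (for example `sklearn` vs `scikit-learn`).

```python
import ast
import re
import sys


def imported_top_level_modules(source: str) -> set:
    """Return the top-level module names imported by the given Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Skip relative imports (level > 0); they refer to local code.
            modules.add(node.module.split(".")[0])
    return modules


def required_packages(requirements_text: str) -> set:
    """Return normalized package names listed in a requirements.txt body."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments, and pip options such as "-r"
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower().replace("_", "-"))
    return names


def missing_from_requirements(source: str, requirements_text: str) -> list:
    """List imported modules that have no same-named requirements entry."""
    # sys.stdlib_module_names exists on Python 3.10+; on older versions
    # standard-library imports are not filtered out and may be reported too.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    required = required_packages(requirements_text)
    return sorted(
        name
        for name in imported_top_level_modules(source)
        if name not in stdlib and name.lower().replace("_", "-") not in required
    )
```

Running `missing_from_requirements` on the contents of model_test.py and requirements.txt would flag imports such as `keras` that the agent's venv will not contain. Adding the missing package to requirements.txt, or turning off `force_repo_requirements_txt`, are the two usual ways out.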
Where did you add the task.execute_remotely command? Do you have sample code I can run?
Hi HelpfulHare30, can you try upgrading to the latest ClearML agent?
pip install clearml-agent==1.0.0