Do you want to use https or ssh to do git clone? Setting up both at the same time is confusing
the weird thing is, if it's an Azure ACA issue, it would be known, right? There are so many people who use ACA and have ACA apps talking to each other.
this is really weird ...
Nevermind.
By default, the File Server is not secured even if Web Login Authentication has been configured. Using an object storage solution that has built-in security is recommended.
My bad
right, in which case you want to change it dynamically from your code, not with the config file. This is where Logger.set_default_upload_destination comes in
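Roughly, a minimal sketch (the project/task names and the azure:// URI are placeholders, not from the original thread) of switching the destination from code instead of clearml.conf:
from clearml import Task

task = Task.init(project_name="demo", task_name="dynamic-output")
# override the default upload destination at runtime rather than via the config file
task.get_logger().set_default_upload_destination("azure://mycontainer/uploads")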
Just keep in mind your bottleneck will be the transfer rate. So mounting will not save you anything, as you still need to transfer the whole dataset to your GPU instance sooner or later.
One solution is as Jake suggests. The other can be to pre-download the data to your instance with a cheap CPU-only instance type, then restart the instance with a GPU.
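If the data is registered as a ClearML Dataset, that pre-download step could look roughly like this (the dataset project and name are made up for illustration):
from clearml import Dataset

# run this on the cheap CPU-only instance so the data is already on local disk
# before the instance is restarted with a GPU
ds = Dataset.get(dataset_project="my-project", dataset_name="my-dataset")
local_path = ds.get_local_copy()
print(local_path)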
oh ... maybe the bottleneck is augmentation on the CPU!
But is it normal that the agent doesn't detect the GPU count and type properly?
oh ..... did not know about that ...
So the question is really: how do we know when there is a new ClearML version so that the sysadmin can update?
Maybe follow the GitHub releases?
all good. Just wanted to know in case I missed it
You can try to make your own docker image with CMake and even dlib installed manually.
Then run clearml-agent inside your container, without docker mode.
wow, did not know that vscode has an http "interface"!!! Makes kind of sense, as vscode is just Chromium rendering a webpage behind the scenes?
Actually, I can set agent.package_manager.pip_version="" in the clearml.conf
And after reading the doc 4x, I can use the env var: CLEARML_AGENT__AGENT__PACKAGE_MANAGER__PIP_VERSION
I don't think agents are aware of each other. Which means that you can have as many agents as you want, and depending on your task usage, they will be fighting over CPU and GPU ...
Is this the same issue as per here?
In which case, can you make your script run using that docker container nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04, manually, without ClearML?
What change did you make to get it working? Was updating Python enough?
So I tried:
import livsdk.livbatch
import clearml
clearml.Task.add_requirements("livsdk","
")
task = clearml.Task.init(project_name="hieu-test", task_name='base_config')
print("Done")
Which gives me this list of Packages Installed:
# Python 3.10.10 (main, Mar 05 2023, 19:07:49) [GCC]
# Local modules found - skipping:
# livsdk == ../[REDACTED]/livsdk/__init__.py
Augmentor == 0.2.10
Pillow == 9.2.0
PyYAML == 6.0
albumentations == 1.2.1
azure_storage_blob == 12.1...
Maybe create a feature request on GitHub?
so the issue is that, for some reason, the pip install done by the agent doesn't behave the same way as your local pip install?
Have you tried to manually install your module_b with pip install inside the machine that is running clearml-agent? From your example, it looks like you are even running inside docker?
So we have 3 python packages, stored on github.com
On the dev machine, the data scientist (DS) will add his local ssh key to his github account as an authorized ssh key, at the account level.
With that, the DS can run git clone git@github.com:org/repo1 then install that python package via pip install -e .
Do that for all 3 python packages, each in its own repo1, repo2 and repo3. All 3 can be cloned using the same key that the DS added to his account.
The DS run a tra...
nope, we are self-hosted in Azure
When I set the output uri in the client, artifacts are sent to blob storage
When file_server is set to azure:// then models/checkpoints are sent to blob storage
But there are still plot and metrics folders that are stored on the server's local disk. Is that correct?
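For the client side, a hedged sketch (the account and container names are made up) of sending artifacts and checkpoints to blob storage at init time:
from clearml import Task

# output_uri redirects artifacts and model checkpoints to Azure blob storage
# instead of the ClearML fileserver
task = Task.init(project_name="demo", task_name="azure-output",
                 output_uri="azure://myaccount.blob.core.windows.net/mycontainer")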
From the log, I can see that it failed to install python packages like dlib because CMake is missing, which is not available in the stock docker image nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04
So if I spin up a new clearml server in the cloud and use the same file server mount point, will I see all the tasks and experiments that I had on the on-prem server in the cloud server?
following your example, if the seeds are hard coded in the code, then the git hash will detect whether a change happened and whether the step needs to be re-run or not
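For illustration, a rough sketch (the step, pipeline and project names are made up) of a cached pipeline step; if the step's code and inputs are unchanged, ClearML can reuse the previous result instead of re-running it:
from clearml.automation.controller import PipelineDecorator

# cache=True: reuse the cached output when the step code and arguments are identical
@PipelineDecorator.component(return_values=["data"], cache=True)
def preprocess(seed=42):
    return [seed * i for i in range(10)]

@PipelineDecorator.pipeline(name="cache-demo", project="demo", version="0.0.1")
def run_pipeline():
    data = preprocess(seed=42)
    return data

if __name__ == "__main__":
    PipelineDecorator.run_locally()
    run_pipeline()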
I am more curious about how to migrate all the information stored in the local clearml server to the clearml server in the cloud
Last time I tried docker compose, elastic took a lot of RAM!!
You need to limit its RAM usage with mem_limit:
[...]
  elasticsearch:
    networks:
      - backend
    container_name: clearml-elastic
    mem_limit: 2g
    environment:
      bootstrap.memory_lock: "true"
      cluster.name: clearml
[...]
even if it's just a local image? You need a docker repository even if it will only be on the local PC?
with
import pandas as pd
import clearml

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])

task = clearml.Task.current_task()
task.get_logger().report_table(title='table example', series='pandas DataFrame', iteration=0, table_plot=df)
# logger.report_table(title='table example',series='pandas DataFrame',iteration=0,tabl...
you can use a docker image that already has those packages and dependencies, then have clearml-agent running inside it or launching the docker container
Ok. Found the solution.
The important thing is to use this:
Task.add_requirements("requirements.txt")
task = Task.init(project_name='hieutest', task_name='foo',reuse_last_task_id=False)
And not:
task = Task.init(project_name='hieutest', task_name='foo',reuse_last_task_id=False)
task.add_requirements("requirements.txt")