You mean to spin a pod with the agent inside it (daemon in services mode)?
Or connect the services queue to the k8s cluster (i.e. define a pod template that uses CPU with not a lot of RAM)?
This part is odd:
SCRIPT PATH: tmp.7dSvBcyI7m
How did you end up with this random filename? How are you running this code?
Hi ConvolutedSealion94
Just making sure, did you spin up the docker-compose of clearml-serving as well?
ClearML automatically picks up these reported metrics from TB. Since you mentioned seeing the scalars, I assume HuggingFace reports to TB. Could you verify? Is there a quick code sample to reproduce?
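For reference, a minimal sketch of the auto-capture behavior (the project/task names and scalar tag are placeholders):
```
from clearml import Task
from torch.utils.tensorboard import SummaryWriter

# placeholder project/task names
task = Task.init(project_name='examples', task_name='tb-autolog')

# anything reported through TensorBoard is picked up automatically by ClearML
writer = SummaryWriter()
for step in range(10):
    writer.add_scalar('train/loss', 1.0 / (step + 1), step)
writer.close()
```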
Hi SmallDeer34
Can you try with the latest RC? I think we fixed something with the jupyter/colab/vscode support!
pip install clearml==1.0.3rc1
I aborted the task because of a bug on my side
🙂
Following this one, is treating abort as failed a must-have feature for the pipeline (in your case), or is it sort of a bug in your opinion?
Hi CluelessElephant89
When you edit the args (General section) in the UI, you are editing the args for "remote execution" (i.e. when executed by the agent, the args dict will get the values from the UI, as opposed to "manual execution", where the UI gets the values from the code).
In order to simulate "remote execution" inside your development environment, try:
```
from clearml import Task

# simulate remote execution of a specific Task instance
Task.debug_simulate_remote_task(task_id='R...
```
which to my understanding has to be given before a call to an argparser,
SmarmySeaurchin8 You can call argparse before Task.init, no worries, it will catch the arguments and trains-agent will be able to override them :)
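To illustrate, a minimal sketch (the argument and project/task names are just placeholders):
```
from argparse import ArgumentParser
from clearml import Task

# argparse runs first -- Task.init will still catch the arguments
parser = ArgumentParser()
parser.add_argument('--lr', type=float, default=0.001)  # placeholder arg
args = parser.parse_args()

# when running under an agent, the UI values override the defaults above
task = Task.init(project_name='examples', task_name='argparse-example')
print(args.lr)
```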
Hi MagnificentSeaurchin79
This means tensorflow was not directly imported in the repository (which is odd; it might point to the auto package analysis failing to find the package, and if this is the case please let me know).
Regardless, if you need to make sure a package is listed in the requirements, either import it or use:
Task.add_requirements('tensorflow')
or
Task.add_requirements('tensorflow', '2.3.1')
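Note that, to my understanding, add_requirements should be called before Task.init. A minimal sketch (project/task names are placeholders):
```
from clearml import Task

# call add_requirements before Task.init so the package lands in the requirements
Task.add_requirements('tensorflow', '2.3.1')
task = Task.init(project_name='examples', task_name='explicit-requirements')
```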
Hmm makes sense. Then I would call export_task once (kind of the easiest way to get the entire Task object description pre-filled for you); with that, you can create as many as needed by calling import_task.
Would that help?
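A minimal sketch of the export/import pattern (the task id and the new name are placeholders):
```
from clearml import Task

# export the template task's full description once
template = Task.get_task(task_id='<template_task_id>')  # placeholder id
task_data = template.export_task()

# create as many copies as needed, tweaking fields in between
task_data['name'] = 'cloned-task-01'
new_task = Task.import_task(task_data)
```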
There is a version coming out next week; the one after it (probably 2-3 weeks later) will have this feature.
If you spin two agents on the same GPU, they are not aware of one another... so this is expected behavior...
Make sense?
Okay that might explain the issue...
MysteriousBee56 so what you are saying is
python3 -m trains-agent --help
does NOT work, but
trains-agent --help
does work?
This is why we recommend using pip and not conda ...
PunySquid88 after removing the "//gihub" package, is it working?
When I do the port forward on my own using ssh -L, it also seems to fail for jupyterlab and vscode, which I find odd
The only external port exposed is the SSH one (10022); the client then forwards it locally (so you, the user, always have the same connection, i.e. ssh root@localhost -p 8022).
If you need to expose an additional port, open another terminal while the clearml-session is running and do:
ssh root@localhost -p 8022 -L 10123:localhost:6666
This should po...
VexedCat68 are you manually creating the OutputModel object?
yes they do 🙂
How can I turn off git diff uploading?
Sure, see here
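In case that link is unavailable, a minimal sketch of the relevant clearml.conf entry, assuming the store_uncommitted_code_diff setting (verify the exact key against the docs for your version):
```
# ~/clearml.conf -- assumed setting name, check your clearml version's docs
sdk {
    development {
        # do not collect/upload the uncommitted git diff with the task
        store_uncommitted_code_diff: false
    }
}
```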
One last question: Is it possible to set the pip_version task-dependent?
No... but why would it matter on a per-Task basis? (meaning, what would be a use case for changing the pip version per Task?)
No worries 🙂 glad it worked
Can you try to manually install it and see what you are getting?
python3.10 -m pip install /home/boris/.clearml/pip-download-cache/cu117/torch-1.12.1+cu116-cp310-cp310-linux_x86_64.whl
Disable automatic model uploads
Disable the auto upload:
task = Task.init(..., auto_connect_frameworks={'pytorch': False})
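Spelled out as a full call (the project/task names are placeholders; other frameworks stay auto-logged):
```
from clearml import Task

# disable only the PyTorch model auto-upload; everything else stays on
task = Task.init(
    project_name='examples',           # placeholder
    task_name='no-model-autoupload',   # placeholder
    auto_connect_frameworks={'pytorch': False},
)
```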
I'm not sure if https will work because I want to use ssh keys for creds.
BTW: I was not aware GitHub provides a PyPI-like artifactory, do they?
Regarding SSH keys, they are passed from the host machine (i.e. in venv mode it will use the SSH keys of the user running the agent, and in docker mode they are automatically mapped into the container).
Hi ColossalAnt7
Following on SuccessfulKoala55's answer:
I saw that there is a config file where you can specify specific users and passwords, but it currently requires mounting the configuration file (the one holding the user/pass) into the pod from a persistent volume.
I think the k8s way to do this would be to use mounted config maps and secrets.
You can use ConfigMaps to make sure the routing is always correct, then add a load-balancer (a.k.a. a fixed IP) for the users a...