Besides that, what are your impressions of these serving engines? Are they much better than just creating my own API + ONNX, or even my own API + normal PyTorch inference?
I would separate ML frameworks from DL frameworks.
With ML frameworks, the main advantage is multi-model serving on a single container, which is more cost-effective when serving multiple models, as well as the ability to quickly update models from the clearml model repository (just tag + publish and the end...
Weird issue, I'll make sure we fix compatibility with python 3.9
Maybe this one?
https://github.com/allegroai/clearml/issues/448
I think it is already there (i.e. 1.1.1)
ThickDove42 you need the latest clearml-agent RC for the docker setup script (next version due next week):
pip install clearml-agent==0.17.3rc0
This is sitting on top of the serving engine itself, acting as a control plane.
Integration with GKE is being worked on (basically KFServing as the serving engine)
I see, that means xarray is not an actual package but a folder added to the python path.
This explains why Task.add_requirements fails, as it is supposed to add python packages to the equivalent of "requirements.txt" ...
Is the folder part of the git repository? How would you pass it to the remote machine the clearml-agent is running on?
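If you want to double-check how xarray resolves on your machine, here is a quick sketch (stdlib only; resolve_kind is just an illustrative helper, not a clearml API, and it assumes the distribution name matches the module name, which is true for xarray):

```python
import importlib.util
from importlib import metadata


def resolve_kind(module_name: str) -> str:
    """Return 'installed' if the name maps to an installed distribution,
    'local' if it is importable but has no package metadata (e.g. a plain
    folder on the python path), 'missing' otherwise."""
    try:
        metadata.version(module_name)
        return "installed"
    except metadata.PackageNotFoundError:
        if importlib.util.find_spec(module_name) is not None:
            return "local"
        return "missing"
```

If this prints "local" for xarray, Task.add_requirements has nothing to add to requirements.txt, which matches what you are seeing.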
Hi ShallowArcticwolf27
Does the
clearml-task
cli command currently support remote repositories that are intended to be used with ssh
It does 🙂
but the
git@
prefix used for gitlab's ssh, it seems to default to looking for the repository locally
git@ is always the prefix for SSH repositories (it does not actually mean it uses it, it's what git will return when asked for the origin of the repository). The agent knows (if SSH credentials ...
Yes, albeit not actually "intercept", as the user will be able to directly put Tasks in queues B_machine_a / B_machine_b, but any time the user pushes Tasks into queue B, this service will pull them and push to the individual machine queues.
what do you think?
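Just to make the idea concrete, a minimal sketch of the routing logic only (pure Python; the queue names and route_tasks are hypothetical, the real service would pull/push Tasks via the ClearML queue APIs):

```python
from itertools import cycle

# Hypothetical per-machine queues; the real service would pull Task IDs
# from queue "B" via the ClearML API and re-enqueue them into these.
MACHINE_QUEUES = ["B_machine_a", "B_machine_b"]


def route_tasks(task_ids):
    """Round-robin Task IDs pulled from queue B into the per-machine
    queues. Returns a mapping of queue name -> list of task IDs."""
    routing = {queue: [] for queue in MACHINE_QUEUES}
    for task_id, queue in zip(task_ids, cycle(MACHINE_QUEUES)):
        routing[queue].append(task_id)
    return routing
```

The dispatch policy (round-robin here) is of course up to the service; it could just as well pick the machine with the shortest queue.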
Thanks GreasyPenguin66
How about: !curl
BTW, no need to rebuild the docker, next time you can always do !apt update && apt install -y <package here>
🙂
I don't think so. It is solved by installing openssh-client in the docker image or by adding a deploy token to the cloning url in the web UI
You can also have the token (token == password) configured as the default user/pass in your agent's clearml.conf
https://github.com/allegroai/clearml-agent/blob/73625bf00fc7b4506554c1df9abd393b49b2a8ed/docs/clearml.conf#L19
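For reference, that section of clearml.conf looks roughly like this (values are placeholders; with token auth, the token goes in git_pass):

```
agent {
    # Credentials the agent uses when cloning over HTTPS
    git_user: "my-git-user"
    git_pass: "my-deploy-token"
}
```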
Although it's still really weird how it was failing silently
Totally agree. I think the main issue was that the agent had the correct configuration, but the container / env the agent was spinning up was missing it.
I'll double check how come it did not print anything
I'm not sure I'm the right person to answer that, but yes my understanding is that this is a Scale/Enterprise tier feature, at least for the time being.
VexedCat68 actually a few users already suggested we auto log the dataset ID used as an additional configuration section, wdyt?
Hi WackyRabbit7 ,
Regarding git credentials, see here in the trains.conf https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf#L18
Trains assumes one of two (almost three) possible setups
Your code/script is in a git repository. Then when executing manually, all the git references incl. uncommitted changes are stored. Then when executing with the trains-agent, it will clone the code based on these references, apply the uncommitted changes, and run your code. To do that the ...
Hi @<1673501397007470592:profile|RelievedDuck3>
how can I configure my alerts to be notified when the distribution of my metrics (variables) changes on my heatmaps?
This can be done inside grafana, here is a simple example:
None
Specifically you need to create a new metric that is the distance of the current distribution (i.e. heatmap) from the previous window, then on the distance metric, ...
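To make "distance from the previous window" concrete, here is a toy sketch of one possible distance (plain Python; l1_distance is just an illustrative name, the actual metric would be computed by your Grafana query):

```python
def l1_distance(prev_hist, curr_hist):
    """L1 distance between two histograms over the same bins,
    normalized so each histogram sums to 1; result is in [0, 2],
    where 0 means identical distributions."""
    def normalize(hist):
        total = sum(hist) or 1.0
        return [v / total for v in hist]

    p = normalize(prev_hist)
    q = normalize(curr_hist)
    return sum(abs(a - b) for a, b in zip(p, q))
```

The alert then simply fires when this distance metric crosses a threshold you choose.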
I think the real issue is that I am not able to specify a platform for the model,
None
there is no need to specify it, remove it from the config.pbtxt - clearml-serving will automatically add the backend
BTW from the log you attached:
File "/root/.clearml/venvs-builds/3.6/lib/python3.6/site-packages/clearml/storage/helper.py", line 218, in StorageHelper
_gs_configurations = GSBucketConfigurations.from_config(config.get('google.storage', {}))
This means it tries to remove an artifact from a Task; that artifact is probably in GS (I'm assuming so because it is using the GS api), and the cleanup service is missing the GS configuration.
WackyRabbit7 is that possible ?
I see it's a plotly plot, even though I report a matplotlib one
ClearML tries to convert matplotlib figures into plotly objects so they are interactive; if it fails, it falls back to a static image, as in matplotlib
Hi @<1697056701116583936:profile|JealousArcticwolf24>
Can you run your pipeline on an agent (i.e. remotely), launching it from the UI (not the TaskScheduler)?
Yes, because when a container is executed, the agent creates a new venv and inherits from the system-wide installed packages, but it cannot inherit or "understand" that there is an existing venv, and where it is.
Can you send the console output of this entire session please ?
It does not use key auth, instead sets up some weird password and then fails to auth:
AdventurousButterfly15 it SSHs into the container; inside the container it sets up a new daemon with a new random, very long password.
It will not ssh to the host machine (i.e. the agent needs to run in docker mode, not venv mode). Make sense?
Thanks @<1719524641879363584:profile|ThankfulClams64> having a code that can reproduce it is exactly what we need.
One thing I might have missed, and it is very important: what is your tensorboard package version?
I think it is on the JWT token the session gets from the server
a bit of a hack but should work 🙂
session = task.session # or Task._get_default_session()
my_user_id = session.get_decoded_token(session.token)['identity']['user']
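If you're curious what get_decoded_token does under the hood: a JWT payload is just the base64url-encoded middle segment of the token. Here's a self-contained sketch with a hand-built token (no real session involved; the payload shape mirrors the identity.user field above):

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (middle segment) of a JWT without verifying
    the signature -- enough to read fields like identity.user."""
    payload_b64 = token.split(".")[1]
    # Restore the '=' padding that base64url encoding strips
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


# Build a fake token (header.payload.signature) just to demonstrate
payload = {"identity": {"user": "user-123"}}
fake_token = b".".join([
    b"eyJhbGciOiJIUzI1NiJ9",  # dummy header segment
    base64.urlsafe_b64encode(json.dumps(payload).encode()).rstrip(b"="),
    b"sig",  # dummy signature segment
]).decode()
```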