Hi ImpressionableRaven99
Yes, it is.
Call this one before Task.init, and it will run offline (at the end of the execution, you will get a link to the local zip file of the execution):
Task.set_offline(True)
Then later you can import it to the system with:
Task.import_offline_session('./my_task_aaa.zip')
Hi DilapidatedCow43
I'm assuming the returned object cannot be pickled (which is ClearML's way of serializing it)
You can upload it as a model with
` uploaded_model_url = Task.current_task().update_output_model(model_path="/path/to/local/model")
...
return uploaded_model_url `
wdyt?
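To confirm the pickling assumption before changing anything, you can try serializing the returned object locally. This is a minimal sketch using only the standard library; `is_picklable` is a hypothetical helper name, not part of ClearML.

```python
import pickle


def is_picklable(obj) -> bool:
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        # Lambdas, open file handles, locks, etc. typically end up here.
        return False


# A plain dict is expected to pickle; a lambda is not.
print(is_picklable({"weights": [0.1, 0.2]}))
print(is_picklable(lambda x: x))
```

If the check fails for your object, uploading it as a model file and returning the URL (as above) sidesteps the serialization step.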
Are you getting the error from boto failing to launch additional EC2 instances?
Actually what my service does is collect stdout/stderr from the Docker socket
That's exactly how the agent works, it cannot really filter it, it logs everything by default for full visibility ...
Hi @<1569858449813016576:profile|JumpyRaven4> could you test the fix? just pull & run
allegroai/clearml-serving-triton:1.3.1
allegroai/clearml-serving-inference:1.3.1
send the agent's logs to log management and monitoring service,
These are stored into ELK, it was built to store large amounts of logs, I cannot see any reason why one would want to remove it?
Maybe if there would be a way to change their format, it could also help filtering them from my side.
You mean in the UI?
In the agent, no, it pipes stdout/stderr of the container and logs everything.
to get a json or something like that?
There is an api to get all the console logs, is this what you are after?
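Since the agent itself does not filter or reformat its output, one option is to do that downstream, in the service that collects the lines. A minimal sketch, assuming the collector already yields raw text lines; `to_json_records` and the drop patterns are hypothetical and not a ClearML API:

```python
import json
import re
from typing import Iterable, Iterator, Sequence


def to_json_records(lines: Iterable[str], drop_patterns: Sequence[str] = ()) -> Iterator[str]:
    """Yield one JSON string per kept line, skipping empty lines and pattern matches."""
    compiled = [re.compile(p) for p in drop_patterns]
    for line in lines:
        text = line.rstrip("\n")
        if not text or any(rx.search(text) for rx in compiled):
            continue
        yield json.dumps({"message": text})


# Example input: one noisy line, one useful line, one empty line.
raw = ["Downloading package foo\n", "epoch 1 loss=0.5\n", "\n"]
records = list(to_json_records(raw, drop_patterns=[r"^Downloading "]))
print(records)
```

This keeps the full log in ClearML for debugging while the monitoring service only stores what it needs.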
To clarify, there might be cases where we get helm charts / k8s manifests to deploy an inference service. A black box to us.
I see, in that event, yes you could use clearml queues to do that, as long as you have the credentials the "Task" is basically just a deployment helm task.
You could also have a monitoring code there so that the same Task is pure logic, spinning the helm chart, monitoring the usage, and when it's done taking it down
Yeah that sounds about right, also you can put the helm chart file as a configuration on the Task when creating it, see https://clear.ml/docs/latest/docs/references/sdk/task#set_configuration_object
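Storing the chart on the Task could look roughly like this. A minimal sketch: `attach_helm_values`, the file path, and the project/task names are hypothetical. The only ClearML call used is `set_configuration_object` from the link above, and the `task` argument is anything exposing that method (normally the object returned by `Task.init`).

```python
from pathlib import Path


def attach_helm_values(task, values_path: str, name: str = "helm_values") -> str:
    """Read a YAML file and store its text as a configuration object on `task`."""
    text = Path(values_path).read_text()
    # config_type is a free-text label describing the stored configuration.
    task.set_configuration_object(name=name, config_text=text, config_type="yaml")
    return text


# Typical use (requires a configured ClearML environment):
#   from clearml import Task
#   task = Task.init(project_name="deployments", task_name="deploy inference chart")
#   attach_helm_values(task, "values.yaml")
```

Passing the task in as an argument keeps the helper easy to exercise without a server.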
Hi SmugTurtle78
Unfortunately there is no actual filtering for these logs, because they are so important for debugging and visibility. I have to ask, what's the use case for removing some of them?
For the on-prem you can check the k8s helm charts, they can spin agents for you (static agents).
For the GKE the best solution is the k8s glue:
https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py
FYI matplotlib imshow will create a debug image, and on complex plots the plot might get converted to an image (but shown under the plots section). All in all you might not be aware of it, but you are uploading images to your files server
Hmm HandsomeGiraffe70
This seems like a bug, let me see what we can do about that.
could it be the parent version was created with an older version of clearml sdk ?
Can clearml-serving do a helm install or upgrade?
Not sure I follow, how would a helm chart install be part of the ML run? I mean clearml-serving is installed via helm chart, but this is a "one time" thing, i.e. you install clearml-serving and then you can send models to be served there via CLI / python. This is not a "deployed per model" scenario, but a deployment for multiple models, dynamically loaded
This can be as simple as a pod or a more complete helm chart.
True, and this could be good for batch processing, but if you want a REST API service then clearml-serving is probably a better fit
does that make sense ?
And the agent continues running.
oh just kill all the processes with clearml-agent in the cmd line
pkill -9 -f clearml-agent
In the UI you can see all the agents and their IDs
Then you can do
clearml-agent daemon --stop <agent id>
exactly! it is very cool to see it in action, and it really works very well, kudos to these guys
Something like the TYPE_STRING that Triton accepts.
I saw the github issue, this is so odd, look at the triton python package:
https://github.com/triton-inference-server/client/blob/4297c6f5131d540b032cb280f1e[…]1fe2a0744f8e1/src/python/library/tritonclient/utils/init.py
Nice! So out of curiosity why didn't it work this time and you had to do it manually?
Hi ContemplativeCockroach39
Assuming you wrap your model with a flask app (or using any other serving solution), usually you need:
1. Get the model
2. Add some metrics on runtime performance
3. Package in a docker
Getting a pretrained model is straightforward once you know either the creating Task or the Model ID
` from clearml import Task, Model
model_file_from_task = Task.get_task(task_id).models['output'][-1].get_local_copy()
or
model_file_from_model = Model(model_id=<model_id>).get_local_copy()...
Hi @<1657918706052763648:profile|SillyRobin38>
I have included some print statements
you should see those under the Task of the inference instance.
You can also do:
import clearml
...
def preprocess(...):
    clearml.Logger.current_logger().report_text(...)
    clearml.Logger.current_logger().report_scalar(...)
, specifically within the containers where the inferencing occurs.
it might be that fastapi is capturing the prints...
[None](https://github.com/tiangolo/uvicor...
then when we triggered an inference deploy it failed
How would you control it? Is it based on a Task ? like a property "match python version" ?
Hi @<1569496075083976704:profile|SweetShells3>
These environment variables are injected into the new process, are you passing them on the vault?
But how do you specify the data hyperparameter input and output models to use when the agent runs the experiment?
They are autodetected if you are using Argparse / Hydra / python-fire / etc.
The first time you are running the code (either locally or with an agent), it will add the hyper parameter section for you.
That said you can also provide it as part of the clearml-task command with --args
(btw: clearml-task --help will list all the options, https://clear.ml/docs/...
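For the Argparse case, a script whose arguments would be picked up can look like the sketch below. The argument names, project name, and task name are hypothetical. The `Task.init` call is shown as a comment because it needs a configured ClearML environment.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Ordinary argparse setup; nothing ClearML-specific is needed here."""
    parser = argparse.ArgumentParser(description="toy training script")
    parser.add_argument("--data", default="data/train.csv", help="input dataset path")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs")
    return parser


def main(argv=None):
    # In a real script the task is typically initialised before parsing:
    #   from clearml import Task
    #   task = Task.init(project_name="examples", task_name="argparse demo")
    return build_parser().parse_args(argv)


# Parse an explicit argument list instead of sys.argv for the demo.
args = main(["--lr", "0.01"])
print(args.lr, args.epochs, args.data)
```

After the first run, these values should appear in the hyperparameter section of the Task, where they can be edited before the agent runs a clone.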
Do people generally update the same model “entry”? That feels so wrong to me… how do you reproduce an older model version or do a rollback etc?
Correct, they do not. On the Task itself the output models will reflect the different filenames you saved, usually people just add a running number.
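The running-number convention can be as simple as the following sketch. `next_model_filename` and the `model_NNN.pt` naming scheme are hypothetical; the point is only that each save gets a distinct filename.

```python
import re
from typing import Iterable


def next_model_filename(existing: Iterable[str], prefix: str = "model", ext: str = ".pt") -> str:
    """Return `<prefix>_<NNN><ext>` with NNN one above the highest number already used."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+){re.escape(ext)}$")
    # Collect the numbers of files that follow the naming scheme; ignore everything else.
    numbers = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{prefix}_{max(numbers, default=0) + 1:03d}{ext}"


print(next_model_filename([]))
print(next_model_filename(["model_001.pt", "model_002.pt", "notes.txt"]))
```

Saving under the returned name each time keeps older versions available for a rollback.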