After removing the task.connect lines, it encountered another error related to 'einops' not being recognized. It does exist in my environment file but was not installed by the agent (according to what I see under 'Summary - installed python packages'). Should I add it manually?
Yes, I'm assuming this is a dependency that is needed by one of your packages? You can add it manually:
```
Task.add_requirements("einops")
task = Task.init(...)
```
DeliciousSeal67 the agent will use the "install packages" section in order to install packages for the code. If you clear the entire section (you can do that in the UI or programmatically) then it will revert to requirements.txt
Make sense ?
Hmm I see your point.
Any chance you can open a github issue with a small code snippet to make sure we can reproduce and fix it?
Hi FierceHamster54
Sure, just do:
```
dataset = Dataset.get(dataset_project="project", dataset_name="name")
```
This will fetch the latest version by default.
See the log:
Collecting keras-contrib==2.0.8
File was already downloaded c:\users\mateus.ca\.clearml\pip-download-cache\cu0\keras_contrib-2.0.8-py3-none-any.whl
so it did download it, but it failed to pass it correctly ?!
Can you try with clearml-agent==1.5.3rc2 ?
Also, on the ClearML dashboard, I can see the clearml-agent log:
Is the clearml-agent running in docker mode ?
Hi @<1547028116780617728:profile|TimelyRabbit96>
You are absolutely correct, we need to allow overriding the configuration
The code you want to change is here:
None
You can try:
channel = self._ext_grpc.aio.insecure_channel(triton_server_address, options=dict([('grpc.max_send_message_length', 512 * 1024 * 1024), ('grpc.max_receive_message_len...
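For context, the options list itself is plain data; here's a minimal sketch of what gets passed to the channel (the address and the plain `grpc` usage are assumptions, mirroring the snippet above):

```python
# Sketch: gRPC channel options raising the default ~4 MB message limits.
MAX_MSG_BYTES = 512 * 1024 * 1024  # 512 MB

channel_options = [
    ("grpc.max_send_message_length", MAX_MSG_BYTES),
    ("grpc.max_receive_message_length", MAX_MSG_BYTES),
]

# With grpcio installed, this would be passed along the lines of:
# channel = grpc.aio.insecure_channel("triton:8001", options=channel_options)
```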
Sorry @<1657918706052763648:profile|SillyRobin38> I missed this reply
Is ClearML-Serving using either System or CUDA shared memory?
This needs to be set on the docker-compose:
and I think this line actually includes `ipc: host`, which means there is no need to set `shm_size`, but you can play around with it and let me know if you see a difference
[None](https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/docker/docker-compose-triton-gpu.yml#L1...
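For reference, the two knobs in a compose file look roughly like this (service name and size are illustrative, not the exact file contents):

```yaml
services:
  clearml-serving-triton:
    # sharing the host IPC namespace removes the shm ceiling entirely...
    ipc: host
    # ...or alternatively give the container a fixed shared-memory size:
    # shm_size: 1gb
```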
Thanks DilapidatedDucks58 ! We ❤ suggestions for improvements 🙂
Did you try to print the page using the browser? (I think they can all store it as PDF these days.) Yes I agree, it would 🙂 We have some thoughts on creating plugins for the system, and I think this could be a good use-case. Wait a week or two ;)
Hi @<1547028116780617728:profile|TimelyRabbit96>
Notice that if running with docker compose you can pass an argument to the clearml triton container and use shared memory. You can do the same with the helm chart
Hi @<1547028116780617728:profile|TimelyRabbit96>
Trying to do model inference on a video, so the first step in the Preprocess class is to extract frames.
Basically this depends on the RestAPI; usually you would be sending a link to the data to be processed and returned synchronously
What you should have is a custom endpoint doing the extraction, sending the raw data into another endpoint doing the model inference; basically think "pipeline" of endpoints:
[None](https://github.com/allegro...
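The "pipeline of endpoints" idea, reduced to a toy structural sketch (none of these names are the clearml-serving API; frame extraction and inference are stubbed out):

```python
# Endpoint 1 (stub): split the raw video bytes into fixed-size "frames".
def extract_frames(video_bytes: bytes) -> list:
    return [video_bytes[i:i + 4] for i in range(0, len(video_bytes), 4)]

# Endpoint 2 (stub): stand-in for the actual model inference call.
def run_inference(frame: bytes) -> int:
    return len(frame)

# The "pipeline": the extraction endpoint feeds raw data into the inference endpoint.
def prediction_pipeline(video_bytes: bytes) -> list:
    return [run_inference(f) for f in extract_frames(video_bytes)]
```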
can we use a currently setup virtualenv by any chance?
You mean, does the clearml-agent need to set up a new venv each time? Are you running in docker mode?
(by default it is caching the venv so the second time it is using a precached full venv, installing nothing)
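The caching is controlled from the agent's clearml.conf; a sketch of the relevant section (key names from memory, values illustrative — verify against your actual clearml.conf):

```
agent {
    venvs_cache: {
        # how many full venvs to keep around
        max_entries: 10
        # stop caching when free disk space drops below this
        free_space_threshold_gb: 2.0
        # the cache lives here (in the default conf this line ships commented out)
        path: ~/.clearml/venvs-cache
    }
}
```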
So actually while we’re at it, we also need to return back a string from the model, which would be where the results are uploaded to (S3).
Is this being returned from your Triton Model? or the pre/post processing code?
One issue that I see is that the Dockerfile inside the agent container
Not sure I follow, these are settings for the default container to be used when the agent spins a Task for you.
How are you running the agent itself ?
Hi WackyRabbit7
First always check the functions on the Task object, they are the most straight forward access to the system.
Then if you need general-purpose API calls, currently they are only documented in the doc-strings of the API schema (that said, it should be quite well documented)
You can check all the endpoints https://github.com/allegroai/trains/tree/master/trains/backend_api/services/v2_8
And finally if you want to easily use the RestAPI :
` from trains.backend_api.session.client impo...
DistressedGoat23
We are running a hyperparameter tuning (using some cv) which might take a long time and might be even aborted unexpectedly due to machine resources.
We therefore want to see the progress
On the HPO Task itself (not the individual experiments, but the one controlling it all) there is the global progress of the optimization metric. Is this what you are looking for? Am I missing something?
but can it NOT use /tmp for this? I'm merging about 100GB
You mean to configure your Temp folder for when squashing ?
you can hack the following:
```
import tempfile
tempfile.tempdir = "/my/new/temp"
# Dataset squash
tempfile.tempdir = None
```
But regardless, I think this is worth a GitHub issue with a feature request to set the temp folder.
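An end-to-end sketch of that hack (paths are illustrative; the squash call itself is elided):

```python
import shutil
import tempfile

# Redirect Python's default temp folder before the temp-heavy operation...
new_tmp = tempfile.mkdtemp(prefix="squash-tmp-")  # any disk with enough space
tempfile.tempdir = new_tmp
assert tempfile.gettempdir() == new_tmp

# ... Dataset.squash(...) would run here, writing its temp files to new_tmp ...

# ...then revert to the platform default and clean up.
tempfile.tempdir = None
shutil.rmtree(new_tmp)
```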
Thanks GrievingTurkey78
Sure, just PR (should work with any Python/Hydra version):
```
kwargs['config'] = config
kwargs['task_function'] = partial(PatchHydra._patched_task_function, task_function,)
result = PatchHydra._original_run_job(*args, **kwargs)
```
ExcitedFish86 this is a general "dummy agent" that pulls Tasks and executes them (no env created, no code cloned, as you suggested)
hows does this work with HPO?
The HPO clones Tasks, changes arguments, pushes them into a queue, and monitors the metrics in real time. The missing part (from my understanding) was that the execution of the Tasks themselves required setup, and that you wanted multiple-machine support; in order to overcome it, I posted a dummy agent that just runs the Tasks.
(Notice...
Hey GiganticTurtle0 ,
So basically the issue is that the pipeline function (prediction_service) is getting a dict as input, while it is expecting basic types... If you were to do the following, it would have worked as expected:
```
prediction_service(**default_config)
```
I will make sure we flatten any dictionary so that we end up with config/start, instead of a serialized version of the dict.
wdyt?
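The flattening I'm describing could be sketched like this (a generic helper, not the actual ClearML implementation):

```python
def flatten_dict(d: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into 'config/start'-style keys with basic-type values."""
    flat = {}
    for key, value in d.items():
        full_key = prefix + key
        if isinstance(value, dict):
            flat.update(flatten_dict(value, full_key + "/"))
        else:
            flat[full_key] = value
    return flat

default_config = {"config": {"start": 0, "end": 10}}
print(flatten_dict(default_config))  # {'config/start': 0, 'config/end': 10}
```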
Hi WickedGoat98
I am trying to write an article on Medium about ClearML and I am facing a problem with plotly figures.
This is awesome !
I ran the plotly_reporting.py example locally and the uploaded plot was ok.
So are you saying the same example code from the repository worked okay on your server but showed nothing on the hosted server ?
Yeah the hack would work but I'm trying to use it from the command line to put it in Airflow. I'll post on GH
Oh, then set the TMP/TMPDIR environment variable; it should have the same effect
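From the command line (e.g. inside an Airflow bash task) that could look like this sketch; the path is illustrative and the clearml-data invocation is left commented out as an assumption:

```shell
# Point temp usage at a disk with enough free space for the merge.
export TMPDIR=/data/tmp
# clearml-data ...   # the actual CLI call would go here
echo "temp dir for this run: $TMPDIR"
```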
Yes, found the issue :) I'll see to it there is a fix in the next RC. ETA early next week
GrittyStarfish67
I do not wish for data duplication. Any Idea how to do this with clearml-data CLI/GUI/python?
At least in theory creating a new version with parents from multiple Datasets should just work out of the box.
wdyt?
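A CLI sketch of that idea (flag names are from memory and the ids are placeholders — double-check against `clearml-data --help`):

```shell
# Create a child version that references two existing datasets as parents,
# so their files are linked rather than re-uploaded.
PARENT_A="<dataset-id-1>"   # placeholder id
PARENT_B="<dataset-id-2>"   # placeholder id
# clearml-data create --project my_project --name merged --parents "$PARENT_A" "$PARENT_B"
# clearml-data close
echo "parents: $PARENT_A $PARENT_B"
```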
JitteryCoyote63 I meant to store the parent ID as another "hyper-parameter" (under its own section name) not the data itself.
Makes sense ?
PompousBeetle71 so basically exclude parameters that are considered "local" only, so that other people will not accidentally use them?
Make sense. BTW: you can manually add data visualization to a Dataset with dataset.get_logger().report_table(...)