if I use automatic code analysis it will not find all packages because of importlib.
But you can manually add them with Task.add_requirements, no?
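For example, a minimal sketch (the package names are just placeholders; note that add_requirements must be called before Task.init):

from clearml import Task

# manually add packages the automatic analysis misses
# (e.g. modules loaded via importlib)
Task.add_requirements("tensorflow", "2.4.0")  # pin a specific version
Task.add_requirements("scikit-learn")         # no version pin

task = Task.init(project_name="examples", task_name="manual requirements")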
MinuteGiraffe30 if you are running the following command while your current directory is where your code is, what are you getting?
$ git ls-remote --get-url origin
is the model overridden, or is its version automatically increased?
You will have another model, with the same name (assuming the second Task has the same name), but a new ID. So if I understand you correctly, we have auto-versioning :)
I prepared my own image and want to use this venv
No worries, it creates a "transparent" venv that uses everything from the docker (the penalty of creating a new venv is negligible 🙂, you end up with the exact same set of packages)
Too late for what?
To update the task.requirements before it actually creates it (the requirements are created in a background thread)
(once you verify the fix in the PR, I'll make sure it is merged)
docstring?
Usually the preferred way is StorageManager
https://clear.ml/docs/latest/docs/references/sdk/storage
https://clear.ml/docs/latest/docs/integrations/storage
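For example, a minimal sketch (the bucket and paths are placeholders):

from clearml import StorageManager

# download a remote object into the local cache, returns the local path
local_path = StorageManager.get_local_copy(remote_url="s3://my-bucket/data/file.zip")

# upload a local file to remote storage
StorageManager.upload_file(local_file="/tmp/results.csv", remote_url="s3://my-bucket/results/results.csv")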
You can do that programmatically: clone the pipeline Task (a pipeline is also a Task) and change the Args section of that Task, wdyt?
Example:
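A minimal sketch (the task ID, parameter name, and queue are placeholders):

from clearml import Task

# a pipeline is also a Task: clone it, override its Args section, enqueue the clone
pipeline_task = Task.get_task(task_id="<pipeline_task_id>")
cloned = Task.clone(source_task=pipeline_task, name="pipeline with new args")
cloned.set_parameter("Args/learning_rate", 0.001)
Task.enqueue(cloned, queue_name="default")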
Which version? Is this reproducible in this example?
(can you try with the latest clearml version 1.13.2?)
like, what are all the important metric monitoring queries w.r.t. the serving tasks that can be visualized and shown in Grafana?
Basically latency and requests per minute are automatically reported. Additional reports are based on your RestAPI in/out.
Imagine the following RestAPI request JSON payload
{"x": 123, "y": 456}
and a return JSON of
{"z": 789}
The metrics you can add to the monitoring are the keys in both these JSONs, i.e. "x", "y", "z"
These metrics can be both log...
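If I remember the clearml-serving CLI correctly, registering those keys for monitoring looks something like the following (the service ID, endpoint name, and value buckets are placeholders, so double-check against the clearml-serving docs):

$ clearml-serving --id <service_id> metrics add --endpoint "my_endpoint" --variable-scalar x=0,250,500,1000 y=0,250,500,1000 z=0,250,500,1000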
Notice you should be able to override them in the UI (under the Args section)
Hi LazyLeopard18
I remember someone deploying, specifically on the Azure k8s (can't remember now what they call it).
What is exactly the feedback you are after?
Hi GreasyLeopard35
I try to resume a stopped or aborted parameter optimization experiment,
How are you continuing the HPO? Are you running everything locally? Is this with an agent? Are you seeing the '[0, 0]' value in the configuration when launching the HPO or when continuing it?
Could you clarify the question for me, please?
...
Could you please point me to the piece of ClearML code related to the downloading process?
I think I mean this part:
https://github.com/allegroai/clearml/blob/e3547cd89770c6d73f92d9a05696018957c3fd62/clearml/datasets/dataset.py#L2134
Wow, thank you very much. And how would I bind my code to the task?
you mean the code that creates the pipeline Tasks?
(remember the pipeline itself is a Task in the system; basically, if your pipeline code is a single script it will pack the entire thing)
as I also noticed that uploads are sometimes slow, and I see here max_connections=2
Makes sense to me, please go ahead and add that as well (basically the same thing on _AzureBlobServiceStorageDriver.upload_object, and an additional variable on the AzureContainerConfigurations class).
Could you PR a tested draft? We will be able to take it from there
What's the python, torch, clearml version?
Any chance this is reproducible?
What's the full error trace/stack you are getting?
Can you try to debug it to where exactly it fails here?
https://github.com/allegroai/clearml/blob/86586fbf35d6bdfbf96b6ee3e0068eac3e6c0979/clearml/binding/import_bind.py#L48
RoughTiger69 wdyt?
Hi @<1713001673095385088:profile|EmbarrassedWalrus44>
So Triton has load/unload model, but these are slowwww, meaning you cannot use them inside a request (you'll just hit the request timeout every time it tries to load the model)
as you can see this is classified as "wish-list", this is not trivial to implement and requires large CPU RAM to store the entire model, so "loading" becomes moving the model from CPU to GPU memory (which is also not the fastest, but the best you can do). As far as I understand ...
I see,
@<1571308003204796416:profile|HollowPeacock58> can you please send the full log?
(The odd thing is it is trying to install the python 3.10 version of torch, when your command line suggests it is running python 3.8)
quick update: 1.0.2 will be ready in an hour, apologies 🙂
to get all the image metrics: client.events.get_task_metrics(tasks=['6adb929f66d14731bc76e3493ab89d80'], event_type='training_debug_image')
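For example, a minimal sketch using the APIClient (the task ID is from the call above):

from clearml.backend_api.session.client import APIClient

client = APIClient()
# returns the debug-image events reported by the given task(s)
res = client.events.get_task_metrics(
    tasks=["6adb929f66d14731bc76e3493ab89d80"],
    event_type="training_debug_image",
)
print(res)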
Could you post what you see under "installed packages" in the UI ?
if it ain't broke, don't fix it
🙂
Up to you, just a few features & nicer UI.
BTW: everything is backwards compatible, there is no need to change anything; all the previous trains/trains-agent packages will work without changing anything 🙂
(This even includes the configuration file, so you can keep the current ~/trains.conf and work with whatever combination you like of trains/clearml on the same machine)
Everything seems correct...
Let's try to set it manually.
create a file ~/trains.conf, then copy paste the credentials section from the UI, it should look something like:

api {
    web_server: http://127.0.0.1:8080
    api_server: http://127.0.0.1:8008
    files_server: http://127.0.0.1:8081
    credentials {
        "access_key" = "access"
        "secret_key" = "secret"
    }
}
Let's see if that works