Hi there! There are several services that need persistent storage, check here for an overview diagram.
If I'm not mistaken, there are the fileserver, elastic, mongo and redis. The info is spread across these (e.g. model files on the fileserver, logs in elastic), so there is no single service holding everything.
I'm not a k8s expert, but I think that even a dynamic PVC should not delete itself. Just to be sure though, you can indee...
FierceHamster54 I saw you saying the YOLOv5 project and name are hardcoded in there. Fixed that for ya 😉 https://github.com/ultralytics/yolov5/pull/10100
Yes you can! The filter syntax can be quite confusing, but for me it helps to print task.__dict__ on an existing task object to see what options are available. You can get values in a nested dict by joining the keys into a single string with a . in between.
Example code:
```python
from clearml import Task

task = Task.get_task(task_id="17cbcce8976c467d995ab65a6f852c7e")
print(task.__dict__)

list_of_tasks = Task.query_tasks(task_filter={
    "all": dict(fields=['hyperparams.General.epochs.value'], p...
```
Oohh interesting! Thanks for the minimal example though. We might want to add it to the docs as an example of dynamic DAG creation 🙂
If that's true, the error should be on the combine function, no? Do you have a more detailed error log or minimal reproducible example?
Not exactly sure what is going wrong without an exact error or reproducible example.
However, passing around the dataset object is not ideal: passing info from one step to another in a pipeline requires ClearML to pickle said object, and I'm not exactly sure a Dataset object is picklable.
Besides that, running get_local_copy() in the first step does not guarantee that you can access that data from the other step. Both might be executed in different docker containers or even on different...
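A pattern that usually sidesteps both issues is to pass the dataset ID (a plain string) between steps and fetch a local copy inside each step that needs the data. A rough sketch using the pipeline decorator (the dataset, project and function names are made up):

```python
from clearml import Dataset
from clearml.automation.controller import PipelineDecorator


@PipelineDecorator.component(return_values=["dataset_id"])
def create_dataset():
    # Register the dataset and return only its ID, which is pickle-safe.
    dataset = Dataset.create(dataset_name="my_dataset", dataset_project="examples")
    dataset.add_files("data/")
    dataset.upload()
    dataset.finalize()
    return dataset.id


@PipelineDecorator.component()
def train(dataset_id):
    # Fetch a fresh local copy inside the step, wherever it runs.
    local_path = Dataset.get(dataset_id=dataset_id).get_local_copy()
    print(f"Data available at: {local_path}")


@PipelineDecorator.pipeline(name="dataset_passing", project="examples")
def my_pipeline():
    train(create_dataset())
```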
I'm able to reproduce, but your workaround seems to be the best one for now. I tried launching with the clearml-task command as well, but we have the same issue there: only argparse arguments are allowed.
AgitatedDove14 any better workaround for this, other than waiting for the jsonargparse issue to be fixed?
HomelyShells16 Thanks for the detailed write-up and minimal example. I'm running it now too
Unfortunately no, ClearML Serving does not infer input or output shapes from the saved models as of today. Maybe you could open an issue on the ClearML Serving GitHub to request it? Preferably with a clear, minimal example; that would be awesome! We'd take it into account for upcoming releases.
No inputs and outputs are ever set automatically 🙂 For e.g. Keras you'll have to specify them using the CLI when creating the endpoint, so Triton knows how to optimize, and also set them correctly in your preprocessing so Triton receives the format it expects.
Just to be sure I understand you correctly: you're saving/dumping an sklearn model in the ClearML experiment manager, then want to serve it using ClearML Serving, but you do not wish to specify the model input and output shapes in the CLI?
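For reference, setting the shapes when adding an endpoint looks roughly like this (adapted from the Keras example in the clearml-serving README; the service ID, model ID, tensor names and sizes are placeholders for your own model):

```
clearml-serving --id <service_id> model add \
    --engine triton --endpoint "test_model_keras" \
    --model-id <model_id> \
    --input-size 1 784 --input-name "dense_input" --input-type float32 \
    --output-size -1 10 --output-name "activation_2" --output-type float32
```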
Most likely you are running a self-hosted server. External embeds are not available for self-hosted servers due to difficult network routing and safety concerns (the embeds need to be accessible from the public internet). The free hosted server at app.clear.ml does have them.
In the meantime, it might help to limit the number of jobs using the advanced settings. If you know the exact number and want to run every single one for sure, just set it that way 🙂
1. Can you give a little more explanation about your use case? It seems I don't fully understand it yet. So you have multiple endpoints, but always the same preprocessing script to go with them? And you need to gather a different threshold for each of the models?
2. Not completely sure of this, but I think an AMD APU simply won't work. ClearML serving uses Triton as the inference engine for GPU-based models, and Triton is written by NVIDIA, specifically for NVIDIA hardware. I don't think Triton will ...
You can apply git diffs by copying the diff to a file and then running git apply <file_containing_diff>
But check this thread to make sure you dry-run first (e.g. with git apply --check), so you can see what the patch will do before you overwrite anything:
https://stackoverflow.com/questions/2249852/how-to-apply-a-patch-generated-with-git-format-patch
If you didn't use git, then ClearML saves your .py script completely in the uncommitted changes section, like you say. You should be able to just copy-paste it to get the code. In what format are your uncommitted changes logged? Can you paste a screenshot, or the contents of the uncommitted changes section?
Check your agent logs (through the ClearML console tab) and see whether any errors are thrown there.
What is probably happening is that your agent tries to upload the model but fails due to some kind of networking/firewall/port issue. For example: make sure your self-hosted server is bound to host 0.0.0.0 so it can accept external connections, not just localhost.
Hi @<1523701062857396224:profile|AttractiveShrimp45> , I'm checking your issue myself. Do you see any duplicate experiments in the summary table?
I agree, I came across the same issue too. But your post helps make it clear, so hopefully it can be pushed! 🙂
It depends on how complex your configuration is, but if config elements are all that will change between versions (i.e. not the code itself) then you could consider using parameter overrides.
A ClearML Task can have a number of "hyperparameters" attached to it. But once that task is cloned and in draft mode, one can EDIT these parameters. If the task is then queued, the new parameter values will be injected into the code itself.
A pipeline is no different, it can have pipeline par...
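As a rough SDK sketch of that clone-edit-enqueue flow (the task ID, parameter name and queue name are placeholders):

```python
from clearml import Task

# Clone an existing (template) task; the clone starts out in draft mode.
template = Task.get_task(task_id="<template_task_id>")
cloned = Task.clone(source_task=template, name="run with new config")

# Override a hyperparameter on the draft; the new value is injected at runtime.
cloned.set_parameter("General/epochs", 20)

# Enqueue the draft so an agent picks it up and runs it.
Task.enqueue(cloned, queue_name="default")
```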
RoundMosquito25 it is true that the TaskScheduler requires a task_id, but that does not mean you have to run the pipeline every time 🙂
When setting up, you indeed need to run the pipeline once, to get it into the system. But from that point on, you should be able to just use the task_scheduler on the pipeline ID. The scheduler should automatically clone the pipeline and enqueue it. It will basically use the 1 existing pipeline as a "template" for subsequent runs.
That's what happens in the background when you click "new run". A pipeline is simply a task in the background. You can find the task by querying, and you can clone it too! It is placed in a "hidden" folder called .pipelines, as a subfolder of your main project. Check out the settings, you can enable "show hidden folders".
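Putting it together, scheduling the existing pipeline could look something like this (the pipeline task ID, queue names and schedule are placeholders):

```python
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()

# Use the existing pipeline task as a template: on every trigger it is
# cloned and enqueued, just like clicking "new run" in the UI.
scheduler.add_task(
    schedule_task_id="<pipeline_task_id>",
    queue="default",
    hour=6, minute=0,  # run daily at 06:00
)

# Run the scheduler itself as a service.
scheduler.start_remotely(queue="services")
```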
Do you have a screenshot of what happens? Have you checked the console when pressing f12?
Is it not filled in by default?
projects/debian-cloud/global/images/debian-10-buster-v20210721
Are you running a self-hosted/enterprise server or on app.clear.ml? Can you confirm that the field in the screenshot is empty for you?
Or are you using the SDK to create an autoscaler script?
Could you use tags for that? In that case you can easily filter on which group you're interested in, or do you have a more impactful UI change in mind to implement groups? 🙂
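For example (the tag and project names are made up):

```python
from clearml import Task

# Tag the tasks that belong to the same group...
task = Task.get_task(task_id="<task_id>")
task.add_tags(["group-a"])

# ...then filter on that tag later on.
grouped = Task.get_tasks(project_name="examples", tags=["group-a"])
```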
Hi Jax! We have a blog post explaining how to use it almost ready to go. I'll ping you here when it's out.
In the meantime you can check out the TAO getting-started resources here: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/resources/tao-getting-started . Download the zip file with examples, and under notebooks > tao_launcher_starter_kit > detectnet_v2 you'll find a notebook with an example of how to use the integration.
Hmm, I think we might need to make it clearer in the documentation then? How would you have been helped before you figured it out? (Great job BTW, thanks for the updates on it :))
I'll update you once I have more!