Ok I check 3: The command
clearml-serving --id <your_id> model add --engine triton --endpoint "test_model_keras" --preprocess "examples/keras/preprocess.py" --name "train keras model" --project "serving examples" --input-size 1 784 --input-name "dense_input" --input-type float32 --output-size -1 10 --output-name "activation_2" --output-type float32
should be
clearml-serving --id <your_id> model add --engine triton --endpoint "test_model_keras" --preprocess "examples/keras/preprocess.py" ...
I'm able to reproduce, but your workaround seems to be the best one for now. I tried launching with the clearml-task command as well, but we have the same issue there: only argparse arguments are allowed.
AgitatedDove14 any better workaround for this, other than waiting for the jsonargparse issue to be fixed?
As you can see in the issue comments, for now you can report it using iteration=0 and then add the value as a field in the experiment overview (as a leaderboard). This will give you a quick overview of your metrics per experiment in the main experiment list.
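A minimal sketch of the iteration=0 workaround, in case it helps. It assumes `logger` is a ClearML logger (for example the one returned by `task.get_logger()`); the helper name `report_summary_metric` and the metric names are made up for illustration.

```python
def report_summary_metric(logger, title, series, value):
    """Report a single summary value as a scalar pinned to iteration 0."""
    # Assumption: `logger` exposes ClearML's
    # Logger.report_scalar(title, series, value, iteration).
    logger.report_scalar(title=title, series=series, value=float(value), iteration=0)


# Hypothetical usage inside a tracked script (needs the clearml package
# and a configured server, so it is left as a comment here):
#   from clearml import Task
#   task = Task.init(project_name="examples", task_name="summary metric")
#   report_summary_metric(task.get_logger(), "summary", "test_accuracy", 0.93)
```

Once the scalar exists on the task, it can be added as a custom column in the experiment table.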
Most likely you are running a self-hosted server. External embeds are not available for self-hosted servers due to difficult network routing and safety concerns (need access from the public internet). The free hosted server at app.clear.ml does have it.
Unfortunately no, ClearML serving does not infer input or output shapes from the saved models as of today. Maybe you could open an issue on the ClearML serving GitHub repository to request it? Preferably with a clear, minimal example, that would be awesome! We'd take it into account for upcoming releases.
Hi @<1533257278776414208:profile|SuperiorCockroach75>
I must say I don't really know where this comes from. As far as I understand, the agent should install the packages exactly as they are saved on the task itself. Can you go to the original experiment of the pipeline step in question (you can do this by selecting the step and clicking on "Full Details" in the info panel)? There, under the execution tab, you should see which version the task detected.
The task itself will try to autodetect t...
Ok, no problem! Take your time, I think I can help you, but I don't understand yet.
Hmm, I can't really follow your explanation. The removed file SHOULD not exist, right? And what exactly do you mean by the last sentence? An artifact is an output generated as part of a task. Can you show me what you mean with screenshots, for example?
Indeed that should be the case. By default Debian is used, but it's good that you ran with a custom image, so now we know it's not clear that more permissions are needed.
Great! Please let me know if it works when adding this permission, we'll update the docs in a jiffy!
I'm using image and machine image interchangeably here. It is quite weird that it is still giving the same error, the error clearly asked for "Required 'compute.images.useReadOnly' permission for 'projects/image-processing/global/images/image-for-clearml'"
🤔
Also, now I see your credentials even have the role of compute admin, which I would expect to be sufficient.
I see 2 ways forward:
- Try running the autoscaler with the default machine image and see if it launches correctly
- Dou...
Here is an example of deploying an sklearn model using ClearML serving.
However, please note that sklearn-like models don't have input and output shapes in the same sense as deep learning models have. Setting the I/O shapes using the CLI is usually meant for GPU-based deep learning models that need to know the sizes for better GPU allocation. In the case of sklearn on CPU, all you have to do is set up your preprocess...
Are you running a self-hosted/enterprise server or on app.clear.ml? Can you confirm that the field in the screenshot is empty for you?
Or are you using the SDK to create an autoscaler script?
With what error message did it fail? I would expect it to fail, because you finalized this version of your dataset by uploading it. You'll need a mutable copy of the dataset before you can remove files from it, I think, or you could always remove the file on disk and create a new dataset with the uploaded one as a parent. That way, ClearML will keep track of what changed between versions.
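Roughly, the "new dataset with the uploaded one as a parent" route could look like the sketch below. It assumes the ClearML `Dataset` SDK calls `create`, `remove_files`, `upload` and `finalize`; the function name, its arguments and the `dataset_cls` hook (only there so the sketch can be exercised with a stub) are hypothetical.

```python
def create_child_without_file(parent_id, path_to_remove, name, project, dataset_cls=None):
    """Create a new dataset version on top of a finalized parent, minus one file."""
    if dataset_cls is None:
        # Imported lazily so the sketch can be read without ClearML installed.
        from clearml import Dataset as dataset_cls
    # The child inherits the parent's file list; the parent itself stays untouched.
    child = dataset_cls.create(
        dataset_name=name,
        dataset_project=project,
        parent_datasets=[parent_id],
    )
    # Drop the unwanted file from the child version only.
    child.remove_files(dataset_path=path_to_remove)
    # Persist the change, then lock the new version.
    child.upload()
    child.finalize()
    return child
```

The parent version remains finalized and unchanged; only the child records the removal.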
This looks to me like a permission issue on the GCP side. Do your GCP credentials have the compute.images.useReadOnly permission set? It looks like the worker needs that permission to be able to pull the images correctly.
It looks like you need to add the compute.imageUser role to your credentials.
Did you by any chance set up the autoscaler to use a custom image? It's trying to use "projects/image-processing/global/images/image-for-clearml", which is a path I don't recognise. Is this your own, custom image? If so, we can add this role to the documentation as required when using a custom image.
No inputs and outputs are ever set automatically. For Keras, for example, you'll have to specify them using the CLI when making the endpoint, so Triton knows how to optimise, and also set them correctly in your preprocessing so Triton receives the format it expects.
Can you share the exact error message? That will help a ton!
Hi @<1533257278776414208:profile|SuperiorCockroach75> , the clearml experiment manager will try to detect your package requirements from its original environment. Meaning that if you run the code and it imports e.g. SQLAlchemy, then it will log the exact version of SQLAlchemy you have installed locally.
When you run only get_data.py locally and have the experiment manager track it, can you then look at the task that is made in the ClearML web UI and check the installed packages section? ...
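To compare against what the task logged, you can check which version is installed in your local environment. This is a small standard-library sketch, not part of ClearML; the helper name `installed_version` is made up.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the locally installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Compare this value with the version listed in the task's installed packages.
print("SQLAlchemy:", installed_version("SQLAlchemy"))
```

If the two versions differ, the task was most likely created from a different environment than the one you are checking.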
I see. Are you able to manually boot a VM on GCP, SSH into it, and run the docker login command from there? Just to be able to rule out networking or permissions as possible issues.
Isitdown seems to be reporting it as up. Any issues with other websites?
The built-in HPO uses tags to group experiment runs together and actually uses the original optimizer task ID as a tag, so you can quickly go back and see where they came from. You can find an example in the ClearML Examples project.
Did you by any chance save the checkpoint without any file extension? Or with a weird name containing slashes or dots? The error seems to suggest the content type was not properly parsed.
AdventurousButterfly15 The fact that it tries to ping localhost means you are running the ClearML server locally, right? In that case, it is a Docker thing: it cannot access localhost because localhost inside a Docker container is not the same as localhost on your machine itself. They're isolated.
That said, adding --network=host to the docker command usually fixes this by connecting the container to the host's network instead of the internal Docker one.
You can add a custom argument either i...
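If you want to see the localhost isolation for yourself, a small standard-library check like the one below can be run once on the host and once inside the container. The helper name `can_reach` is made up, and port 8008 in the usage comment is an assumption (it is the usual default for the ClearML API server; adjust to your setup).

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Hypothetical usage, run on the host and again inside the container:
#   print(can_reach("localhost", 8008))
```

Getting True on the host but False inside the container points at the network isolation rather than at the server.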
Does it help to also run docker login in the init bash script?
You should be able to access your AWS credentials from the environment (the agent will inject them based on your config)
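A quick way to check this from your script is sketched below. The variable names are the standard AWS SDK ones and are an assumption here; the names actually set in your environment may differ, and `missing_aws_env` is a made-up helper.

```python
import os

# Assumption: standard AWS SDK variable names, not confirmed for every agent setup.
AWS_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_env(environ=None):
    """Return the names of the expected AWS variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in AWS_ENV_VARS if not env.get(name)]


# Prints only variable names, never the secret values themselves.
print("Missing AWS variables:", missing_aws_env())
```

An empty list means all three variables are visible to the process.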
Just to be sure I understand you correctly: you're saving/dumping an sklearn model in the ClearML experiment manager, then want to serve it using ClearML serving, but you do not wish to specify the model input and output shapes in the CLI?
It is not filled in by default?
projects/debian-cloud/global/images/debian-10-buster-v20210721