Well, that's what open source is for 😉 code borrowing is like 90% of the job of software engineers 😄
Hi CurvedHedgehog15, so my previous reply does assume you have reported a scalar for each individual FAR level. Then you can add individual levels as shown in the GIF. But like you said, that might actually cause you to lose your overview in the scalars tab.
So I don't think there's an immediate way to do this in ClearML right now, but would you mind opening an issue on GitHub for it? It might be interesting to add it to the tool!
Hi @<1533257278776414208:profile|SuperiorCockroach75>
I must say I don't really know where this comes from. As far as I understand, the agent should install the packages exactly as they are saved on the task itself. Can you go to the original experiment of the pipeline step in question? (You can do this by selecting the step and clicking on "Full Details" in the info panel.) There, under the Execution tab, you should see which version the task detected.
The task itself will try to autodetect t...
Hi EmbarrassedSpider34, would you mind showing us a screenshot of your machine configuration? Can you check for any output logs that ClearML might have given you? Depending on the region, maybe there were no GPUs available, so could you also check whether you can manually spin up a GPU VM?
Thanks! I've asked the autoscaler devs about this and it might be a possible bug; you're the second one to report it. He's checking and we'll get back to you!
I see. Are you able to manually boot a VM on GCP, then manually SSH into it and run the docker login command from there? Just to rule out networking or permissions as possible issues.
Does it help to also run `docker login` in the init bash script?
You should be able to access your AWS credentials from the environment (the agent will inject them based on your config)
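For example, a minimal sketch of checking that the injected credentials are visible from Python (the function name is mine, just for illustration):

```python
import os

def has_aws_credentials(env=None):
    # The clearml-agent injects the standard AWS variables based on your
    # clearml.conf, so anything using the default credential chain sees them.
    env = os.environ if env is None else env
    return bool(env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"))
```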
Nice! Well found and thanks for posting the solution!
May I ask out of curiosity, why mount X11? Are you planning to use a GUI app on the k8s cluster?
Hey @<1539780305588588544:profile|ConvolutedLeopard95> , unfortunately this is not built into the YOLOv8 tracker. Would you mind opening an issue on the YOLOv8 GitHub page and tagging me? (I'm thepycoder on GitHub.)
I can then follow up on its progress, because it makes sense to expose this parameter through the YAML.
That said, to help you right now, please change [this line](https://github.com/ultralytics/ultralytics/blob/fe61018975182f4d7645681b4ecc09266939dbfb/ultralytics/yolo/uti...
With what error message did it fail? I would expect it to fail, because you finalized this version of your dataset by uploading it 🙂 You'll need a mutable copy of the dataset before you can remove files from it, I think. Or you could always remove the file on disk and create a new dataset with the uploaded one as a parent; that way, ClearML will keep track of what changed between versions.
Ok, no problem! Take your time, I think I can help you, but I don't understand yet 🙂
That would explain why it reports the task id to be 'a' in the error. It tried to index the first element in a list, but took the first character of a string instead.
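To illustrate the bug pattern (the id value here is made up):

```python
# A single task id accidentally passed as a bare string...
task_ids = "a1b2c3"
print(task_ids[0])   # 'a' -- the first character, not the first task id

# ...versus wrapped in a list, as the code expected:
task_ids = ["a1b2c3"]
print(task_ids[0])   # 'a1b2c3'
```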
I still have the tasks I ran remotely and they don't show any uncommitted changes. @<1540142651142049792:profile|BurlyHorse22> are you sure the remote machine is running transformers from the latest GitHub branch, instead of from the package?
If it all looks fine, can you please install transformers from this repo (branch main) and rerun? It might be that not all my fixes came through.
Hi @<1540142651142049792:profile|BurlyHorse22> I think I know what is happening. ClearML does not support dict keys of any type other than string. This is why I made these functions to cast the dict keys to string and back after we connect them to ClearML.
What happens I think is that [id2label](https://github.com/thepycoder/sarcasm_detector/blob/dbbddec35...
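A minimal sketch of that cast-and-restore idea (the helper names are mine, not the actual functions in the repo):

```python
def keys_to_str(d):
    # ClearML only supports string dict keys, so cast
    # e.g. integer label ids to strings before connecting.
    return {str(k): v for k, v in d.items()}

def keys_to_int(d):
    # Restore the integer keys after reading the dict back from ClearML.
    return {int(k): v for k, v in d.items()}

id2label = {0: "not_sarcastic", 1: "sarcastic"}
restored = keys_to_int(keys_to_str(id2label))
print(restored == id2label)  # True
```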
Wow, awesome! Really nice find! Would you mind compiling your findings into a GitHub issue? Then we can help you search better :) This info is enough to get us going at least!
Doing this might actually help with the previous issue as well, because when there are multiple docker containers running they might interfere with each other 🙂
Yes, with Docker, auto-starting containers is definitely a thing 🙂 We set the containers to restart automatically (a reboot will do that too), so that when a container crashes it immediately restarts, e.g. in a production environment.
So the best thing to do there is to use `docker ps` to get all running containers and then kill them using `docker kill <container_id>`. ChatGPT tells me this command should kill all currently running containers: `docker rm -f $(docker ps -aq)`
And I...
Wow! Awesome to hear :D
Hey! Sorry, I didn't fully read your question and missed that you already did it. It should not be done inside the `clearml-serving-triton` service but instead inside the `clearml-serving-inference` service. That is where the preprocessing script is run, and it seems to be where the error is coming from.
Hi! You should add extra packages in your docker-compose through your env file; they'll get installed when building the serving container. In this case you're missing the transformers package.
You'll also get the same explanation here.
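As a concrete sketch, the env file line would look something like this (the variable name is from memory, so double-check it against the env file in your clearml-serving docker-compose setup):

```shell
# Extra pip packages to install into the serving container at build/start.
# Verify the exact variable name against your clearml-serving env file.
CLEARML_EXTRA_PYTHON_PACKAGES="transformers"
```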
Hey @<1526371965655322624:profile|NuttyCamel41> Thanks for coming back on this, and sorry for the late reply. This does look like a bug indeed, especially because it seems to work when coming from the ClearML servers.
Would you mind copy-pasting this info into a GitHub issue on the clearml-serving repo? Then we can track the progress we make in fixing it 🙂
What might also help is to look inside the Triton docker container while it's running. You can check the example; there should be a pbtxt file in there. Just to double-check that it is also in your own folder.
Hi PanickyMoth78 ,
I've just recreated your example and it works for me on `clearml==1.6.2`, but indeed not on `clearml==1.6.3rc1`, which means we have some work to do before the full release 🙂 Can you try on `clearml==1.6.2` to check that it does work there?
AgitatedDove14 I was able to recreate the error, simply by running Lavi's example on `clearml==1.6.3rc1` in a fresh env. I don't know what is unique to the flow itself, but it does seem reproducible.
Also a big thank you for so thoroughly testing the system and providing this amount of feedback, it really does help us make the tool better for everyone! 😄
Can you please post the result of running `df -h` in this chat? Chances are quite high your actual machine does indeed have no more space left 🙂