Also, what's the difference between Finalize and Publish?
I think I get what you're saying, yeah. I don't know how I would give each server a different cookie name. I can see this problem being resolved by clearing cookies or by manually appending /login to the end of the URL.
So in my case, where I schedule a task every time I publish a dataset: when I publish my dataset once, it triggers and starts a new task.
I just assumed it should only be triggered by dataset-related events, but after a lot of experimenting I realized it's also triggered by tasks if the only condition passed is dataset_project, with no other specific trigger condition like trigger_on_publish or trigger_on_tags.
Basically, there is an agent still listening to a queue on a machine, which I might've started at some point, but I can't seem to stop it.
Understandable. My main concern was that I needed initial requirements for experimentation.
Agreed. The issue does not occur when I set trigger_on_publish to True, or when I use tag matching.
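For reference, a minimal sketch of that fix, assuming ClearML's `TriggerScheduler` API; the project name, queue name, and task ID are placeholders:

```python
from clearml.automation import TriggerScheduler

# Sketch: fire only when a dataset in the project is published,
# instead of on every event that merely matches dataset_project.
scheduler = TriggerScheduler(pooling_frequency_minutes=3)
scheduler.add_dataset_trigger(
    schedule_task_id="<task-id-to-clone>",  # placeholder task ID
    schedule_queue="default",               # assumed queue name
    trigger_project="my_dataset_project",   # assumed dataset project
    trigger_on_publish=True,                # only published datasets trigger
)
scheduler.start()
```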
It works; however, it shows the task as enqueued and pending. Note I am using .start() and not .start_remotely() for now.
Alright, so is there no way to kill it using the worker ID or worker name?
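If the agent was started as a daemon, stopping it from the same machine may work with the `--stop` flag; the queue name here is an assumption:

```shell
# Run on the machine where the agent daemon was started:
clearml-agent daemon --queue my_queue --stop
```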
Also, since I plan to not train on the whole dataset and instead only on a subset of the data, I was thinking of making each batch of data a new dataset and then just merging the subset of data I want to train on.
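That merge step might look like the following sketch, assuming the ClearML `Dataset` API; the names and IDs are placeholders:

```python
from clearml import Dataset

# Sketch: create one training dataset whose parents are the selected
# batch datasets, so their files are inherited rather than re-uploaded.
merged = Dataset.create(
    dataset_name="train_subset_v1",        # assumed name
    dataset_project="my_dataset_project",  # assumed project
    parent_datasets=["<batch_id_1>", "<batch_id_2>"],  # placeholder IDs
)
merged.finalize()  # close the dataset version once the contents are set
```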
I'm kind of new to developing end-to-end applications, so I'm also learning how the predefined pipelines work. I'll take a look at the ClearML custom pipelines.
The situation is that I needed a continuous training pipeline to train a detector, the detector being Ultralytics YOLOv5.
To me, it made sense that I would have a training task. The whole training code seemed complex to me, so I only modified it slightly so that it fetches the dataset and model from ClearML. Nothing more.
I then created a task using clearml-task and pointed it at the repo I had created. The task runs fine.
I am unsure about the details of the training code...
Yes it works, thanks for the overall help.
I've never done something like this before, and I'm unsure about the whole process, from successfully serving the model to sending inference requests to it. Is there a tutorial or example for it?
You can see there's no task bar on the left. Basically, I can't get any credentials for the server, or check queues, or anything.
Thank you, I'll start reading up on this once I've finished setting up the basic pipeline
I just made a custom repo from the ultralytics yolov5 repo, where I fetch the data and model using a dataset ID and model ID.
Tagging AgitatedDove14 SuccessfulKoala55 For anyone available right now to help out.
OK, I'm a bit confused now. Suppose I have an agent listening to some Queue X. If someone else on another machine enqueues their task on Queue X, will my agent run it?
I'm getting this error:
clearml_agent: ERROR: Failed cloning repository.
- Make sure you pushed the requested commit:
- Check if remote worker has valid credentials
I did this, but it gets me an InputModel. I went through the InputModel class, but I'm still unsure how to get the actual TensorFlow model.
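One way to go from the `InputModel` to a usable TensorFlow model is to download its stored file and load it with Keras. This is a sketch: the model ID is a placeholder, and it assumes the registered model is in a format `load_model` understands (SavedModel/H5):

```python
import tensorflow as tf
from clearml import InputModel

input_model = InputModel(model_id="<model-id>")  # placeholder model ID
local_path = input_model.get_local_copy()        # downloads the stored model file
model = tf.keras.models.load_model(local_path)   # assumes a Keras-loadable format
```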
This is the simplest I could get for the inference request. The model and input and output names are the ones that the server wanted.
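For what it's worth, here is a minimal standard-library sketch of building such a request; the endpoint URL, payload shape, and the input/output names are all assumptions that must match what your serving endpoint actually reports:

```python
import json
from urllib import request

def build_inference_request(url, input_name, output_name, batch):
    """Build a JSON POST request for a serving endpoint.
    The payload shape is an assumption -- match your server's schema."""
    payload = {"inputs": {input_name: batch}, "outputs": [output_name]}
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical endpoint and tensor names:
req = build_inference_request(
    "http://localhost:8080/serve/detector",
    "images", "detections", [[0.0, 0.1, 0.2]],
)
# urllib.request.urlopen(req) would send it once the server is up.
```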
I think maybe it does this because of caching. Maybe it keeps a record of an older login, and when you restart the server it keeps trying to use the old details.
Alright, but is it saved as a text file or a pickle file?
Basically, since I want to train AI models, I'm trying to set up an architecture where I can automate the process from data fetching to model training, and I need a GPU for training.
My current approach is: watch a folder; when there are sufficient data points, move N of them into another folder, create a raw dataset, and call the pipeline with this dataset.
It gets downloaded, preprocessed, and then uploaded again.
In the final step, the preprocessed dataset is downloaded and is used to train the model.
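The folder-watching step above can be sketched with the standard library; the directory names are placeholders, and the dataset-creation call is only indicated in a comment:

```python
import shutil
from pathlib import Path

def take_batch(watch_dir, staging_dir, n):
    """Move up to n files from the watched folder into a staging folder
    and return the moved paths -- the batching step before dataset creation."""
    watch, staging = Path(watch_dir), Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(p for p in watch.iterdir() if p.is_file())[:n]:
        dest = staging / entry.name
        shutil.move(str(entry), str(dest))
        moved.append(dest)
    return moved

# After this, a raw dataset would be created from staging_dir
# (e.g. with ClearML's Dataset API) and the pipeline launched with its ID.
```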
I do, however, have another problem. I have a dataset trigger with a scheduled task.