So I got my answer for the first one. I found where the data is stored on the server.
def watch_folder(folder, batch_size):
    count = 0
    classes = os.listdir(folder)
    class_count = len(classes)
    files = []
    dirs = []
    for cls in classes:
        class_dir = os.path.join(folder, cls)
        fls = os.listdir(class_dir)
        count += len(fls)
        files.append(fls)
        dirs.append(class_dir)
    if count >= batch_size:
        dataset = Dataset.create(project='data-repo')
        dataset.add_files(folder)
        dataset.upload()
        dataset.final...
There are other parameters for add_task as well. I'm just curious how I can pass the folder and batch size in the schedule_fn=watch_folder part.
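One general Python way to hand arguments to a callback that gets invoked later is functools.partial. The sketch below assumes the scheduler calls the function with no arguments, which is not confirmed anywhere in this thread. The watch_folder body here is a simplified stand-in, and the path and batch size are made-up values.

```python
from functools import partial


def watch_folder(folder, batch_size):
    # Simplified stand-in for the real function: it only reports its inputs.
    return f"watching {folder} with batch_size={batch_size}"


# Bind the arguments up front. The result is a callable that needs no arguments.
# "/path/to/folder" and 100 are placeholder values for illustration.
bound_watch = partial(watch_folder, "/path/to/folder", 100)

# Whoever holds bound_watch can now call it without knowing the arguments.
result = bound_watch()
print(result)
```

The bound callable could then be passed wherever watch_folder itself was being passed, if the receiving side really does call it without arguments.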
I'm using clearml-agent right now. I just upload the task inside a project. I've used argparse as well; however, so far I have not been able to find writable hyperparameters in the UI. Is there a tutorial video you can recommend that deals with this? I was following this one on YouTube, https://www.youtube.com/watch?v=Y5tPfUm9Ghg&t=1100s, but I can't seem to recreate his steps as he sifts through his code.
Wrong image. Lemme upload the correct one.
It seems that is the case. Thank you for all your help guys.
CostlyOstrich36
Would you know what the advantages of learning online are, other than the fact that the incoming data is as close as possible to the current distribution of data over time? Also, would those benefits make it worth training online?
I've finally gotten the Triton engine to run. I'll be going through the NVIDIA Triton docs to find out how to make an inference request. If you have an example inference request, I'd appreciate it if you could share it with me.
I basically go to the model from the experiment first. Then, once I'm in the model, I try to download it but can't. I've screenshotted the situation.
Shouldn't I get redirected to the login page if I'm not logged in, instead of the dashboard? 😞
I'll read the 3 examples now. Am I right to assume that I should drop Pipeline_Controller.py?
I'm still unsure between finalize and publish, since upload should already upload the data to the server.
Basically, right now when I save the model, it just goes into draft mode. What I want is to save the model only if it is better than the previous one and, once it is saved, publish it with a name and tags that I choose.
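The "save only if better" part can be sketched in plain Python, independent of any ClearML call. Everything here is hypothetical: the helper names, the accuracy values, and the save_fn callback, which stands in for whatever actually saves, publishes, and tags the model.

```python
def is_improvement(new_score, best_score, higher_is_better=True):
    """Return True if new_score beats best_score, or if there is no previous score."""
    if best_score is None:
        return True
    if higher_is_better:
        return new_score > best_score
    return new_score < best_score


def maybe_save(new_score, best_score, save_fn, higher_is_better=True):
    """Call save_fn only on improvement and return the best score seen so far."""
    if is_improvement(new_score, best_score, higher_is_better):
        save_fn()  # placeholder for the real save / publish / tag step
        return new_score
    return best_score


# Made-up accuracies for three epochs; only improving epochs get "saved".
saved_epochs = []
best = None
for epoch, accuracy in enumerate([0.71, 0.69, 0.75]):
    best = maybe_save(accuracy, best, lambda e=epoch: saved_epochs.append(e))
```

With these example values, the second epoch is skipped because 0.69 does not beat 0.71.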
I don't think I changed anything.
My main question is: do I wait until there is a sufficient batch size, or do I just send each image to training as soon as it comes in?
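A rough sketch of the batching option, assuming images arrive one at a time as file names. BatchBuffer is a hypothetical helper, not part of any library. Setting batch_size to 1 gives the "send each image as it comes" behaviour, so the same code covers both choices.

```python
class BatchBuffer:
    """Collect incoming items and release them in fixed-size batches."""

    def __init__(self, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self._items = []

    def add(self, item):
        """Store item; return a full batch when one is ready, else None."""
        self._items.append(item)
        if len(self._items) >= self.batch_size:
            batch, self._items = self._items, []
            return batch
        return None


# Made-up file names; the fourth image stays buffered until more arrive.
buffer = BatchBuffer(batch_size=3)
batches = []
for name in ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]:
    batch = buffer.add(name)
    if batch is not None:
        batches.append(batch)  # placeholder for "send this batch to training"
```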
Thus I wanted to pass the model id from the prior step to the next one.
'dataset' is the name of my Dataset object.
they're also enqueued
I actually just asked about this in another thread. Here's the link. It asks about the usage of upload_artifact.
OK, since it's my first time working with pipelines, I wanted to ask: does the pipeline controller run endlessly, or does it run from start to end, with me telling it when to start based on a trigger?
For example, say there are files in a specific folder on Machine A. A script on Machine A creates a Dataset, adds the files located in that folder, and publishes it. Can you then look at that dataset on the server machine? Not from the ClearML interface, but inside normal directories, like /opt/clearml. The directory mentioned is just an example.
It'll be labeled in the folder I'm watching.
I'll test it with the updated one.
Can you take a look here?
https://clearml.slack.com/archives/CTK20V944/p1637914660103300
This is where I mentioned the anonymous task spawn issue. I'd like to understand what's causing the problem, if it is a problem at all.
And multiple agents can listen to the same queue, right?
Basically, the environment/container the agent is running in needs to have a specific CUDA version installed. Is that correct, CostlyOstrich36?
