They're also enqueued.
But what's happening is that I only publish a dataset once, yet on every polling cycle the trigger fires and enqueues another task, even though there was just that single publish event.
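For context, my trigger setup looks roughly like this (a minimal sketch; the project, queue, and task ID are placeholders):

```python
from clearml.automation import TriggerScheduler

# minimal sketch; project/queue/task-ID values are placeholders
trigger = TriggerScheduler(pooling_frequency_minutes=3.0)
trigger.add_dataset_trigger(
    name='retrain-on-publish',
    schedule_task_id='<task-id-to-clone-and-enqueue>',
    schedule_queue='default',
    trigger_project='datasets',
    trigger_on_publish=True,  # fire when a dataset in the project is published
)
trigger.start()
```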
Thank you for the help.
CostlyOstrich36 I'm observing some weird behavior. Previously, when I added tags to the model before publishing it, it worked fine and I could see the tags in the UI.
Now when I do it this way, the tags aren't set. If I then run a separate script that fetches the model by ID and sets the tags, it works fine. Let me share the code.
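Here's roughly what the two snippets look like (a minimal sketch; the model name, weights file, and ID are placeholders, and I'm assuming the tags setter on a fetched Model works the way I expect):

```python
from clearml import Task, OutputModel, Model

# First approach: tags set on the output model before publishing
task = Task.init(project_name='examples', task_name='train')
output_model = OutputModel(task=task, name='my_model', tags=['v1', 'staging'])
output_model.update_weights(weights_filename='model.pt')
output_model.publish()  # tags don't show up in the UI this way

# Second approach: fetch the model by ID afterwards and set the tags
model = Model(model_id='<model-id>')
model.tags = ['v1', 'staging']  # done like this, the tags do appear
```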
So in my head, every time I publish a dataset, the trigger should fire once and run that task.
I think it downloads from the curl command.
Also, my execution just completed, and so far I can only see the hyperparameters as a report, not in a configurable form. I've just started with ClearML and am running into these issues.
I'll look into those three. Do those files use the step 1, step 2, and step 3 files though?
Because those spawned processes come from a file called register_dataset.py; I'm not using any file like that myself, so I think it's a file from the library.
Are there packages other than venv required on the agent? I'm not sure exactly what packages I need on the agent, since the function itself normally wouldn't need venv; it just increments a number by 1.
Basically, since I want to train AI models, I'm trying to set up an architecture where I can automate the process from data fetching to model training, and I need a GPU for the training step.
It works; however, it shows the task as enqueued and pending. Note that I am using .start() and not .start_remotely() for now.
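For reference, the difference between the two as I understand it (a sketch, assuming the scheduler object is called trigger):

```python
# runs the polling loop inside the current local process (blocking)
trigger.start()

# vs. enqueues the scheduler itself as a task, so a clearml-agent
# listening on the 'services' queue runs the polling loop instead
trigger.start_remotely(queue='services')
```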
So it won't work without clearml-agent? Sorry for the barrage of questions. I'm just very confused right now.
Set the host variable to the IP address assigned to my laptop by the network.
There are other parameters for add_task as well; I'm just curious how I pass the folder and batch size in the schedule_fn=watch_folder part.
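One way I could see to do this is binding the extra arguments with functools.partial (a sketch: watch_folder, the folder path, and the batch size are my own placeholders, and I'm assuming the scheduler calls the function with the triggering dataset/task ID as its first argument; I'm showing add_dataset_trigger, but the same trick should apply to add_task):

```python
from functools import partial

def watch_folder(dataset_id, folder, batch_size):
    # dataset_id is passed in by the trigger when it fires;
    # folder and batch_size are pre-bound via partial below
    ...

trigger.add_dataset_trigger(
    schedule_function=partial(watch_folder, folder='/data/incoming', batch_size=32),
    trigger_project='datasets',
    name='watch-folder-trigger',
)
```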
Okay, so when I add trigger_on_tags, the repetition issue is resolved.
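That is, the same sketch as before but with the tag condition added (the tag name is my own placeholder):

```python
trigger.add_dataset_trigger(
    schedule_function=watch_folder,
    trigger_project='datasets',
    trigger_on_tags=['ready'],  # only fires when a dataset gets this tag
    name='dataset-ready-trigger',
)
```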
How would the two be different, other than that I can pass the directory for the local mutable copy?
Thank you, I'll take a look
Let me share the code with you, and how I think the pieces interact with each other.
And then also write down my git username and password.
Lastly, I have asked this question multiple times, but since the MLOps process is so new, I want to learn from others' experience regarding evaluation strategies. What would be a good evaluation strategy? Splitting each batch into train/test? That would mean less data for training, but we could test it right away. Another idea I had was training on the current batch, then evaluating it on incoming batches. Any other ideas?
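To make the second idea concrete, here's a rough sketch of "train on the current batch, evaluate on the next one" (the synthetic data and sklearn's SGDClassifier are just for illustration):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# hypothetical stream of (X, y) batches
rng = np.random.default_rng(0)
batches = [(rng.normal(size=(100, 5)), rng.integers(0, 2, 100)) for _ in range(10)]

model = SGDClassifier()
scores = []
for i in range(len(batches) - 1):
    X_train, y_train = batches[i]
    X_next, y_next = batches[i + 1]
    model.partial_fit(X_train, y_train, classes=[0, 1])  # train on current batch
    scores.append(accuracy_score(y_next, model.predict(X_next)))  # test on the next, unseen batch
```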
AgitatedDove14 I'm also trying to understand why this is happening. Is this normal and how it should be, or am I doing something wrong?
Let me tell you what I think is happening and you can correct me where I'm going wrong.
Under certain conditions at certain times, a Dataset is published, and that activates a Dataset trigger. So if I publish one dataset every day, I activate the Dataset Trigger once that day, when the dataset is published.
N publishes = N Triggers = N Anonymous Tasks, right?
I'll read the 3 examples now. Am I right to assume that I should drop Pipeline_Controller.py?
Wait, so the pipeline step only runs if the pre_execute_callback returns True? Does the pipeline stop at that step if it doesn't run?
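Just to check my understanding, something like this is what I mean (a sketch; the project and task names are placeholders):

```python
from clearml import PipelineController

def skip_if_not_ready(pipeline, node, params):
    # returning False skips this step; returning True lets it run
    return params.get('run_step', True)

pipe = PipelineController(name='demo', project='examples', version='1.0')
pipe.add_step(
    name='train',
    base_task_project='examples',
    base_task_name='train task',
    pre_execute_callback=skip_if_not_ready,
)
pipe.start()
```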