Just to be absolutely clear.
An agent on Machine A (with a GPU) is listening to Queue X.
A task is enqueued onto Queue X from Machine B, which has no GPU.
The task runs on Machine A and the experiment gets published to the server?
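In code terms, I imagine it's something like this minimal sketch, where execute_remotely() enqueues the task from Machine B (project, task and queue names are placeholders):

```python
from clearml import Task

# On Machine B (no GPU): create the task locally, then hand it off to Queue X
task = Task.init(project_name="examples", task_name="remote run")  # placeholder names
task.execute_remotely(queue_name="X")  # stops local execution and enqueues the task

# On Machine A (with the GPU), an agent pulls from the same queue, e.g.:
#   clearml-agent daemon --queue X --gpus 0
# The experiment results are then reported back to the ClearML server.
```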
Takes in a name and an artifact object.
Sorry for the late reply. The situation is that when I ran the task initially, it took in arguments using argparse, and there were a lot of them. Now, my understanding is that add_step() clones that task. I want that to happen, but I would also like to be able to modify some of the values of the args, e.g. epochs or some other argument.
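Something like this sketch is what I'm after, assuming the argparse arguments end up under the Args section of the cloned task (the task ID and argument names are placeholders):

```python
from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="0.1")

# add_step clones the base task; parameter_override changes values on the clone.
# Argparse arguments live under the "Args/" section of the task's hyperparameters.
pipe.add_step(
    name="train",
    base_task_id="<base_task_id>",   # placeholder
    parameter_override={
        "Args/epochs": 10,           # assuming "epochs" was one of the argparse arguments
        "Args/lr": 0.001,            # any other argument can be overridden the same way
    },
)
```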
So I just published a dataset once, but it keeps scheduling tasks.
I'm still a bit confused: since my function runs once per hour, why do anonymous tasks keep growing indefinitely, even after I've closed the main schedulers?
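For context, the scheduler is set up roughly like this minimal sketch (the function body is a placeholder, and the exact schedule arguments are my assumption):

```python
from clearml.automation import TaskScheduler

def publish_if_enough_images():
    # check the watched folder and publish a new dataset version when there are enough images
    pass

scheduler = TaskScheduler()
# assumed semantics: run the function once an hour
scheduler.add_task(schedule_function=publish_if_enough_images, hour=1)
scheduler.start_remotely(queue="services")
```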
Is the only way to get a specific node to use get_running_nodes or get_processed_nodes and then check every node in the list to see if its name matches the one we're looking for?
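In other words, is the expected pattern something like this sketch, assuming both methods return node objects with a name attribute?

```python
# Assuming get_processed_nodes() and get_running_nodes() return node objects
# that carry a .name attribute.
def find_node(pipe, node_name):
    for node in pipe.get_processed_nodes() + pipe.get_running_nodes():
        if node.name == node_name:
            return node
    return None
```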
Alright. Anyway, I'm practicing with the pipeline. I have an agent listening to the queue. The only problem is that it fails because of requirement issues, but I don't know how to pass requirements in this case.
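Is the idea to declare the requirements per step, e.g. roughly like this sketch (pipe is the PipelineController instance; the function and package list are placeholders)?

```python
def preprocess_data():
    pass  # placeholder step body

pipe.add_function_step(
    name="preprocess",
    function=preprocess_data,
    packages=["pandas==1.3.5", "numpy>=1.21"],  # installed by the agent for this step
    # packages can also point to a requirements file, e.g. "./requirements.txt"
)
```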
Thanks for the help. I'll try to continue working on the VM for now.
I'm dumping a dict to JSON; how can I register that dict as an artifact?
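Is it something like this sketch, where upload_artifact handles the serialization itself (name and contents are placeholders)?

```python
from clearml import Task

my_dict = {"epochs": 10, "lr": 0.001}  # placeholder content

# upload_artifact can take the dict directly, so there's no need to dump it to JSON first
Task.current_task().upload_artifact(name="config", artifact_object=my_dict)
```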
Can you take a look here?
https://clearml.slack.com/archives/CTK20V944/p1637914660103300
This is where I've mentioned the anonymous task spawn issue. I kind of want to understand what's causing the problem, and whether it is a problem at all.
Finalize locks the model, and publish, I assume, publishes it to the server?
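Assuming this is the same finalize/publish pattern as for a ClearML Dataset, a minimal sketch of what I mean (project, name and path are placeholders):

```python
from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="my dataset")  # placeholders
ds.add_files("/path/to/images")  # placeholder path
ds.upload()

ds.finalize()  # locks this version so it can no longer be modified
ds.publish()   # publishes it on the server
```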
How would the two be different, other than that I can pass a directory for the local mutable copy?
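A minimal sketch of the two calls as I understand them (project, dataset name and target folder are placeholders):

```python
from clearml import Dataset

ds = Dataset.get(dataset_project="examples", dataset_name="my dataset")  # placeholder names

read_only_path = ds.get_local_copy()  # cached, managed, read-only copy
mutable_path = ds.get_mutable_local_copy(
    target_folder="/data/my_dataset_copy"  # placeholder; a writable copy is extracted here
)
```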
I've never done something like this before, and I'm unsure about the whole process, from successfully serving the model to sending requests to it for inference. Is there any tutorial or example for it?
Also, the repository is on Bitbucket, which is why I set git_host to that.
So the minimum would be 2 cores with 8 GB of RAM. I'm going to assume 4 cores and 16 GB would be recommended.
I normally just upload the data to the ClearML server and then remove it locally from my machine, but I understand that isn't what you want. A quick hack was the only thing I could come up with at the moment. Anyway, you're welcome. Hope you find a solution.
I'm not using decorators. I have a bunch of function steps followed by a normal task step, where I've passed a base_task_id.
I want to check the value of one of the function steps, and if it holds true, I want to execute the task step; otherwise I want the pipeline to end there, since the task step is the last one.
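One option I'm considering is deciding in a pre_execute_callback, roughly like this sketch, assuming that returning False skips the step (names are placeholders and pipe is the PipelineController):

```python
# The callback runs just before the step is launched; returning False makes
# the pipeline skip the step, and since it's the last step the pipeline ends there.
def run_final_step(pipeline, node, param_override):
    should_run = True  # placeholder: derive this from the result of the earlier check step
    return should_run

pipe.add_step(
    name="final_task_step",
    base_task_id="<base_task_id>",
    parents=["check_step"],
    pre_execute_callback=run_final_step,
)
```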
The setup is on a single machine. I have a NAS mounted where I'm watching a folder; if there are sufficient images, it should publish the data. But since I was using start_remotely, the code was running somewhere else and couldn't access the folder.
I think it downloads from the curl command.
It'll be labeled in the folder I'm watching.
I just want to be able to pass output from one step as input to another step.
AgitatedDove14 Can you help me with this? Maybe something like storing the returned values in a variable outside the pipeline?
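For example, roughly like this sketch based on the pipeline-from-functions pattern (step names, functions and pipe are placeholders):

```python
def step_one():
    return {"a": 1}  # placeholder return value

def step_two(data_frame):
    print(data_frame)

pipe.add_function_step(
    name="step_one",
    function=step_one,
    function_return=["data_frame"],  # stored as an artifact of step_one
)
pipe.add_function_step(
    name="step_two",
    function=step_two,
    parents=["step_one"],
    function_kwargs={"data_frame": "${step_one.data_frame}"},  # wired in as step_two's input
)
```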
My use case is that the code, which uses PyTorch, saves additional info like the state dict when saving the model. I'd like to save that information as an artifact as well, so that I can load it later.
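Roughly like this sketch: upload the state dict as an artifact during training and pull it back later (the model, file name and task ID are placeholders):

```python
import torch
from clearml import Task

model = torch.nn.Linear(4, 2)  # placeholder model

# During training: save the extra info and attach it as an artifact
torch.save(model.state_dict(), "state_dict.pt")
Task.current_task().upload_artifact(name="state_dict", artifact_object="state_dict.pt")

# Later, from another script: pull the artifact back and load it
prev_task = Task.get_task(task_id="<training_task_id>")  # placeholder id
local_path = prev_task.artifacts["state_dict"].get_local_copy()
model.load_state_dict(torch.load(local_path))
```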
I just shared the logs manually because the complete logs had an email address and other details in them. If it helps, I'll share the logs as soon as I can.
If it helps, I can try and record my steps in a video.