Found it.
https://clear.ml/docs/latest/docs/guides/clearml-task/clearml_task_tutorial/
The second example here, executing a local script. I think that was it. Thank you for the help.
Should I just train for 1 epoch, or multiple epochs, given I'm only training on the new batch of data and not the whole dataset?
Thank you. I'll forward these requirements and wait for a response.
With online learning, my main concerns are that the training would be completely stochastic in nature, that I would not be able to split the data into train/test splits, and that it would be very expensive and inefficient to train online.
Understandable. I mainly have regular image data, not video sequences, so I can do the train/test splits normally, like you mentioned. What about the epochs, though? Is there a recommended number of epochs when you train on that new batch?
But what's happening is that I only publish a dataset once, yet every time the scheduler polls, it gets triggered and enqueues a task, even though the dataset was published only once.
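For context, the trigger setup looks roughly like this (a sketch; the project, queue, and task ID are placeholders):

```
from clearml.automation import TriggerScheduler

# Poll for new events once per hour (note: "pooling" is the SDK's spelling)
trigger = TriggerScheduler(pooling_frequency_minutes=60)

# Fire when a dataset in the project is published, enqueueing a clone
# of an existing task. The expectation: one publish -> one trigger.
trigger.add_dataset_trigger(
    schedule_task_id="<task-id-to-clone>",
    schedule_queue="default",
    trigger_project="Datasets",
    trigger_on_publish=True,
)
trigger.start()
```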
When you connect to the server properly, you're able to see the dashboard like this, with menu options on the side.
Currently a checkpoint is saved every 2000 iterations; that's just part of the code. Since output_uri=True, it gets uploaded to the ClearML server.
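The relevant part of the code is roughly this (a sketch; the model and names are stand-ins):

```
import torch
import torch.nn as nn
from clearml import Task

# output_uri=True tells ClearML to upload saved checkpoints to the server
task = Task.init(project_name="my_project", task_name="training", output_uri=True)

model = nn.Linear(10, 2)  # stand-in for the real model

for iteration in range(10000):
    # ... training step ...
    if iteration % 2000 == 0:
        # ClearML hooks torch.save, so each checkpoint is registered
        # as an output model and uploaded automatically
        torch.save(model.state_dict(), "checkpoint.pt")
```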
The scheduler is set to run once per hour, but I've already got 40+ anonymous running tasks.
Tagging AgitatedDove14 SuccessfulKoala55 For anyone available right now to help out.
I'm on Windows right now, and I work with ClearML on Ubuntu. I think it's 1.1.5rc4
let me check
It works; however, it shows the task as enqueued and pending. Note I am using .start() and not .start_remotely() for now
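For reference, the pattern is roughly this (a sketch; as I understand it, .start() enqueues the controller task to the services queue by default, which would explain the pending state):

```
from clearml.automation import PipelineController

pipe = PipelineController(name="my_pipeline", project="my_project", version="1.0.0")
# ... pipe.add_step(...) calls go here ...

# Enqueues the controller task (default queue: "services"); it stays
# "pending" until a clearml-agent serving that queue pulls it
pipe.start()

# Alternative: run the controller logic in the local process instead
# pipe.start_locally()
```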
Thank you for the help with that.
I checked that the value is being returned, but I'm having issues accessing merged_dataset_id in the pre_execute_callback the way you showed me.
I then did what MartinB suggested and got the ID of the task from the pipeline DAG, and then it worked.
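Roughly what that looks like (a sketch; "merge_step" is a placeholder for my step name, and _nodes is an internal attribute, so it may change between versions):

```
from clearml import Task

def pre_execute_callback(a_pipeline, a_node, current_param_override):
    # Look up the already-executed step in the pipeline DAG by name;
    # .executed holds the task ID of the finished step
    merge_task_id = a_pipeline._nodes["merge_step"].executed
    merged_task = Task.get_task(task_id=merge_task_id)
    # ... read artifacts off merged_task here ...
    return True  # returning False would skip this node
```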
AnxiousSeal95 I'm trying to access the specific value. I checked the type of task.artifacts and it's a ReadOnlyDict. Given that the return value I'm looking for is called merged_dataset_id, how would I go about doing that?
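My guess is it should be something like this (a sketch):

```
# task.artifacts is a read-only mapping of artifact name -> Artifact;
# indexing by name and calling .get() returns the deserialized value
merged_dataset_id = task.artifacts["merged_dataset_id"].get()
```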
Thank you, this is a big help. I'll give this a go now.
I initially wasn't able to get the value this way.
AnxiousSeal95 I just have a question: can you share an example of accessing an artifact of a previous step in the pre_execute_callback?
From what I recall, resume was set to false both originally and in the cloned task.
I'm curious whether this is buggy behavior or expected.
There's a whole task bar on the left in the server. I only get this page when I use the IP 0.0.0.0
You can see there's no task bar on the left. Basically, I can't get any credentials for the server or check queues or anything.