We want to get a clearer picture here to compare versioning with ClearML Data vs. our own custom versioning.
Then I can use ClearML-Data with it properly.
Can you give me an example URL for the API call to stop_many?
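For reference, a minimal sketch of what that call could look like against a self-hosted server. The host, port, and key pair are placeholders; the real values come from your clearml.conf (api.api_server and api.credentials):

```python
# Hypothetical sketch of calling tasks.stop_many on a self-hosted API server.
import requests

API = "http://localhost:8008"  # placeholder; use your api_server address
ACCESS_KEY, SECRET_KEY = "<access_key>", "<secret_key>"

# auth.login exchanges the key pair for a short-lived bearer token
token = requests.get(f"{API}/auth.login",
                     auth=(ACCESS_KEY, SECRET_KEY)).json()["data"]["token"]

# tasks.stop_many takes the list of task ids to stop in the request body
resp = requests.post(
    f"{API}/tasks.stop_many",
    headers={"Authorization": f"Bearer {token}"},
    json={"ids": ["<task-id-1>", "<task-id-2>"]},
)
print(resp.json())
```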
Basically, I don't want the storage on the ClearML Server machine to fill up.
So in my head, every time I publish a dataset, it should get triggered and run that task.
I'd like to add an update to this: when I use schedule_function instead of schedule_task with the dataset trigger scheduler, it works as intended. It runs the desired function when triggered, then goes back to sleep until another trigger fires.
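For context, a minimal sketch of the schedule_function variant; the project name, trigger name, and polling interval below are placeholders:

```python
# Sketch of a dataset trigger that runs a local function when a dataset
# is published, instead of cloning and enqueuing a base task.
from clearml.automation import TriggerScheduler

def watch_folder(task_id):
    # called once per fired trigger, with the id of the dataset that fired it
    print(f"dataset published, id: {task_id}")

scheduler = TriggerScheduler(pooling_frequency_minutes=3)
scheduler.add_dataset_trigger(
    schedule_function=watch_folder,     # instead of schedule_task_id=...
    name="watch buffer folder",
    trigger_project="datasets/buffer",  # placeholder project
    trigger_on_publish=True,            # fire only on publish, not on every new version
)
scheduler.start()  # or start_remotely(queue="services") to run on an agent
```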
I'll try to see how to use the SDK method you just shared.
There's data when I manually went there; the directory was originally hidden, my bad.
{"meta":{"id":"c3edee177ae348e5a92b65604b1c7f58","trx":"c3edee177ae348e5a92b65604b1c7f58","endpoint":{"name":"","requested_version":1.0,"actual_version":null},"result_code":400,"result_subcode":0,"result_msg":"Invalid request path /","error_stack":null,"error_data":{}},"data":{}}
I have a lot of anonymous tasks running which I would like to close immediately.
Let me tell you what I think is happening and you can correct me where I'm going wrong.
Under certain conditions at certain times, a Dataset is published, which activates a Dataset Trigger. So if I publish one dataset every day, I activate a Dataset Trigger that day once it's published.
N publishes = N Triggers = N Anonymous Tasks, right?
Can you guys let me know what the finalize and publish methods do?
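My rough understanding, as a sketch with placeholder names: finalize() closes the dataset version so no more files can be added, and publish() then marks the finalized version as published, which is also what fires dataset triggers.

```python
# Rough sketch of the dataset lifecycle; project, name, and path are placeholders.
from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="my-dataset")
ds.add_files("/path/to/new/batch")  # stage local files into this version
ds.upload()                         # push the staged file contents to storage
ds.finalize()                       # close the version: no more add_files()
ds.publish()                        # mark it published; this fires dataset triggers
```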
Additional error info:
Launching job: ScheduleJob(name='Watch Checkbox Detector Buffer Folder', base_task_id='', base_function=<function watch_folder at 0x7f6e308b6840>, queue=None, target_project=None, single_instance=False, task_parameters=None, task_overrides=None, clone_task=True, _executed_instances=['140111227815680', '140111227815680', '140111227815680', '140111227815680'], execution_limit_hours=None, recurring=True, starting_time=datetime.datetime(2021, 11, 25, 9, 45, 41, 175873), min...
Here they are. I've created and published the dataset. Then when I try to get a local copy, the code works, but I'm not sure how to proceed to actually use that data.
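In case it helps, a sketch of consuming the local copy once you have it; the names and the image use case are placeholders for whatever the dataset actually contains:

```python
# Sketch of using a dataset's cached local copy as plain files on disk.
from pathlib import Path
from clearml import Dataset

ds = Dataset.get(dataset_project="examples", dataset_name="my-dataset")
local_root = ds.get_local_copy()  # read-only cached copy on this machine

# from here it's ordinary files, usable by any loader or framework
images = list(Path(local_root).rglob("*.jpg"))
print(f"{len(images)} images under {local_root}")
```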
the one mentioned on the page.
So I had an issue where it didn't add the tags for some reason. There was no error; there were just no tags on the model.
Anyway, in the resume argument there is default=False but also const=True. What's up with that? Or is const a separate parameter?
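For what it's worth, that combination is standard argparse behavior rather than anything ClearML-specific, assuming the argument is declared with nargs="?": default applies when the flag is absent, const when the flag is given without a value. A quick illustration:

```python
# Plain argparse semantics: with nargs="?", `default` is used when the
# flag is absent and `const` when the flag is present without a value.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--resume", nargs="?", default=False, const=True)

print(parser.parse_args([]))                       # Namespace(resume=False)
print(parser.parse_args(["--resume"]))             # Namespace(resume=True)
print(parser.parse_args(["--resume", "ckpt.pt"]))  # Namespace(resume='ckpt.pt')
```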
Found it.
https://clear.ml/docs/latest/docs/guides/clearml-task/clearml_task_tutorial/
The second example here, executing a local script. I think that was it. Thank you for the help.
Should I just train for one epoch, or multiple epochs, given that I'm only training on the new batch of data and not the whole dataset?
Thank you. I'll forward these requirements and wait for a response.
With online learning, my main concerns are that the training would be completely stochastic in nature, that I would not be able to split the data into train/test splits, and that it would be very expensive and inefficient to train online.
Understandable. I mainly have regular image data, not video sequences, so I can do the train/test splits like you mentioned normally. What about the epochs, though? Is there a recommended number of epochs when you train on that new batch?
But what's happening is that I publish the dataset only once, yet every time the scheduler polls, the trigger fires and enqueues a task, even though the dataset was published just that one time.
