If I understood this correctly: in the case where we have defined steps in order as parent and child, if the parent's pre-execute callback returns False, will all subsequent child nodes/steps not execute, or will they ignore it and still execute?
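For reference, a minimal sketch of the setup I mean (project and task names are made up), with the pre_execute_callback attached to the parent step:

```python
from clearml import PipelineController

pipe = PipelineController(name="demo-pipeline", project="demo", version="1.0")

def maybe_skip(pipeline, node, parameters):
    # Returning False skips this node; my question is whether the
    # dependent child step below is then skipped too, or still runs.
    return False

pipe.add_step(
    name="parent",
    base_task_project="demo",
    base_task_name="parent task",
    pre_execute_callback=maybe_skip,
)
pipe.add_step(
    name="child",
    parents=["parent"],
    base_task_project="demo",
    base_task_name="child task",
)
pipe.start()
```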
After the previous code ran, I fetched the model it uploaded using its ID. When I added tags at that point, they were visible in the UI.
CostlyOstrich36 I'm observing some weird behavior. Before, when I added tags to the model before publishing it, it worked fine and I could see the tags in the UI.
Now when I do it this way, the tags aren't set. If I then run another script that gets the model by its ID and sets the tags, it works fine. Let me share the code.
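Roughly what the second script does (model ID redacted, tags made up; I'm assuming the tags setter pushes the change to the server):

```python
from clearml import Model

# Fetch the already-uploaded model from the server by its ID
model = Model(model_id="<model id>")

# Setting tags this way works for me and they show up in the UI;
# doing the same before publishing did not stick
model.tags = ["checkbox-detector", "v2"]
```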
Basically, I want the model to be uploaded to the server alongside the experiment results.
I basically go to the model from the experiment first; then, from the model page, I try to download it but can't. I've screenshotted the situation.
Since I want to save the model to the ClearML server, what should the port be alongside the URL?
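For what it's worth, my current guess based on the defaults (host name made up): the ClearML fileserver listens on 8081, next to the web UI on 8080 and the API server on 8008, so something like:

```python
from clearml import Task

task = Task.init(
    project_name="demo",
    task_name="upload-model",
    # 8081 is the default ClearML fileserver port
    # (web UI is 8080, API server is 8008)
    output_uri="http://my-clearml-server:8081",
)
```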
And in that case, if I do model.save('test'), will it also save the model to the ClearML server?
With TensorFlow's model.save, it saves the model locally in SavedModel format.
Thank you, I found the solution to my issue when I started reading about the default output URI setting.
The server is on a different machine. I'm experimenting on the same machine though.
AgitatedDove14 CostlyOstrich36 I think that is the approach that'll work for me. I just need to be able to remove checkpoints I don't need, given I know their names, from both the UI and storage.
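Something like this sketch is what I'm after, assuming the SDK's Model.query_models and Model.remove do what I think (project and checkpoint names made up):

```python
from clearml import Model

# Find the checkpoint models by name within the project
models = Model.query_models(
    project_name="Cassava Training",  # hypothetical project
    model_name="epoch_03",            # hypothetical checkpoint name
)

for m in models:
    # delete_weights_file=True should also remove the weights file from
    # storage, not just the entry in the UI (assuming this SDK version
    # exposes Model.remove)
    Model.remove(m, delete_weights_file=True)
```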
For anyone facing a similar issue to mine and wanting the model to be uploaded just like data is uploaded:
in Task.init, set output_uri=True.
This basically makes it use the default ClearML file server that you define in the clearml.conf file. Ty.
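i.e. something like:

```python
from clearml import Task

task = Task.init(
    project_name="demo",  # hypothetical names
    task_name="train",
    # True = upload models/artifacts to the default files_server
    # configured in clearml.conf, instead of only saving locally
    output_uri=True,
)
```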
This shows my situation. You can see the code on the left and the tasks called 'Cassava Training' on the right. They keep getting enqueued even though I only fired the trigger once; by that I mean I only published a dataset once.
Okay, so when I add trigger_on_tags, the repetition issue is resolved.
However, since a new task then starts in the project, the trigger fires again and starts yet another task.
and then also enter my Git username and password.
If people feel it helps them, why not.
So I took the dataset trigger from this and added it to my own test code, which needs to run a task every time the trigger is activated.
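My test code is roughly this (base task ID redacted, project and queue names made up):

```python
from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)
trigger.add_dataset_trigger(
    name="run-on-new-dataset",
    schedule_task_id="<base task id>",  # task to clone and enqueue
    schedule_queue="default",
    trigger_project="datasets",         # project to watch for new datasets
)
trigger.start()
```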
I'm using clearml installed via pip in a conda env. Do I find this file inside the environment directory?
Would you know what the pros of online learning would be, other than the fact that the incoming data is as close as possible to the current distribution over time? Also, would those benefits be worth it to train online?
There are other parameters for add_task as well; I'm just curious how I pass the folder and batch size in the schedule_fn=watch_folder part.
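One generic way I'm considering is binding the arguments up front so the scheduler can call the function with no parameters (folder and batch size values are made up, and I'm assuming the keyword is schedule_function in this version):

```python
from functools import partial
from clearml.automation import TaskScheduler

def watch_folder(folder, batch_size):
    # hypothetical body: scan `folder` and process files in batches
    ...

scheduler = TaskScheduler()
scheduler.add_task(
    schedule_function=partial(
        watch_folder, folder="/data/buffer", batch_size=32
    ),
    minute=30,  # hypothetical recurring schedule
)
scheduler.start()
```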
I think it may be doing this because of a cache or something. Maybe it keeps a record of an older login, and when you restart the server it keeps trying to use the old credentials.
Additional error info:
Launching job: ScheduleJob(name='Watch Checkbox Detector Buffer Folder', base_task_id='', base_function=<function watch_folder at 0x7f6e308b6840>, queue=None, target_project=None, single_instance=False, task_parameters=None, task_overrides=None, clone_task=True, _executed_instances=['140111227815680', '140111227815680', '140111227815680', '140111227815680'], execution_limit_hours=None, recurring=True, starting_time=datetime.datetime(2021, 11, 25, 9, 45, 41, 175873), min...
We want to get a clearer picture here to compare versioning with ClearML Data vs. our own custom versioning.
Also, what's the difference between finalize and publish?
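From what I can tell so far (a sketch, names made up): finalize closes the dataset version so no more files can be added, while publish additionally marks it read-only/published, like publishing an experiment:

```python
from clearml import Dataset

ds = Dataset.create(dataset_project="demo", dataset_name="cassava")
ds.add_files("/data/cassava")  # hypothetical local folder
ds.upload()
ds.finalize()  # closes this version: no further add/remove of files
# publishing additionally locks it as read-only in the UI; I'm not sure
# whether my SDK version exposes that directly or only via the web UI
```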
Agreed. The issue does not occur when I set the trigger_on_publish to True, or when I use tag matching.
Here they are. I've created and published the dataset. Then when I try to get a local copy, the code works, but I'm not sure how to proceed to actually use that data.
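The bit that works (names made up); my question is what to do with the returned path:

```python
from clearml import Dataset

ds = Dataset.get(dataset_project="demo", dataset_name="cassava")
local_path = ds.get_local_copy()  # read-only cached folder with the files

# presumably from here they are ordinary files on disk:
import os
print(os.listdir(local_path))
```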