DilapidatedDucks58 I'm assuming clearml-server 1.7 ?
I think both are fixed in 1.8 (due to be released either next week, or the one after)
? Do you have a link how to setup a task scheduler to run in service mode in k8s?
basically spin up the agent pod and add an argument to the agent itself (this is the --services-mode)
https://clear.ml/docs/latest/docs/clearml_agent#services-mode
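For example, a minimal sketch of the agent command (the queue name here is just an example):
clearml-agent daemon --queue services --services-mode --docker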
Thread is discussed here: None
I'm assuming some package imports absl (the TF define package) and that's the reason you see the TF defines. Does that make sense?
Actually unless you specifically detached the matplotlib automagic, any plt.show() will be automatically reported.
Okay so my thinking is, on the PipelineController / decorator we will have: abort_all_running_steps_on_failure=False
(if True, on step failing it will abort all running steps and leave)
Then per step / component decorator we will have: continue_pipeline_on_failure=False
(if True, on step failing, the rest of the pipeline DAG will continue)
GiganticTurtle0 wdyt?
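To make it concrete, a minimal sketch of where the proposed flags would sit on the decorators (both argument names are the proposal above, not a released API, so they appear only as comments here):

from clearml.automation.controller import PipelineDecorator

# Proposed per-step flag: continue_pipeline_on_failure=False
# (if True, the rest of the DAG keeps running when this step fails)
@PipelineDecorator.component()
def step_a():
    ...

# Proposed pipeline-level flag: abort_all_running_steps_on_failure=False
# (if True, a failing step aborts all running steps and exits)
@PipelineDecorator.pipeline(name='failure handling example', project='examples', version='0.1')
def pipeline_logic():
    step_a()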
Hmmm, I'm not sure that you can disable it. But I think you are correct it should be possible. We will add it as another argument to Task.init. That said, FriendlyKoala70 what's the use case for disabling the code detection? You don't have to use it later, but it is always nice to know :)
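If/when it lands, usage would presumably look something like this (the argument below is purely hypothetical, it does not exist yet, which is why it is commented out):

from clearml import Task

task = Task.init(
    project_name='examples',
    task_name='no code detection',
    # detect_repository=False,  # hypothetical name for the proposed switch
)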
2, 3) the question is whether the serving changes from one tenant to another, does it?
Exactly !
EnviousStarfish54
it seems that if I don't use plt.show() it won't show up in Allegro, is this a must?
Yes, at plt.show / plt.savefig Trains will capture the plot and send it to the backend.
BTW: when you hover over the empty plot area, do you see the plotly objects, or is it all blank ?
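For reference, a minimal sketch of the automagic (project/task names are just examples; with the older trains package the import would be from trains import Task):

from clearml import Task
import matplotlib.pyplot as plt

task = Task.init(project_name='examples', task_name='matplotlib demo')

plt.plot([1, 2, 3], [4, 5, 6])
plt.title('auto-captured plot')
plt.show()  # captured here and reported to the backend as a plot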
Hi @<1541954607595393024:profile|BattyCrocodile47>
Does clearML have a good story for offline/batch inference in production?
Not sure I follow, you mean like a case study ?
Triggering:
We'd want to be able to trigger a batch inference:
- (rarely) on a schedule
- (often) via a trigger in an event-based system, like maybe from an AWS Lambda function
(2) Yes, there is a great API for that; check out the GitHub Actions integration, it is essentially the same idea (a REST API is also available), see the sketch below ...
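A minimal sketch of the API side (the task ID and queue name are placeholders), e.g. callable from a Lambda handler or a scheduled job:

from clearml import Task

# clone a pre-configured "template" inference task and enqueue it for execution
template = Task.get_task(task_id='<template_task_id>')
new_task = Task.clone(source_task=template, name='batch inference run')
Task.enqueue(new_task, queue_name='default')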
This really makes little sense to me...
Can you send the full clearml-session --verbose console output ?
Something is not working as it should obviously, console output will be a good starting point
JitteryCoyote63 could you test the latest RC? :)
pip install clearml-agent==0.17.2rc4
when you are running the n+1 epoch you get the 2*n+1 reported
RipeGoose2 like twice the gap, i.e. internally it adds an offset of the last iteration... is this easily reproducible ?
I'd prefer to use config_dict, I think it's cleaner
I'm definitely with you
Good news:
"new best_model is saved, add a tag best" is already supported (you just can't see the tag, but it is there :))
My question is, what do you think would be the easiest interface to tell (post/pre) store: tag/mark this model as the best so far? (btw, obviously if we know it's not good, why do we bother storing it in the first place...)
Weird that this code is also uploading to the 'Plots'. I replicated the same thing as my main script, but main script is still uploading to Debug Samples.
SmarmyDolphin68 are you saying the same code behaves differently ?
Hi PanickyMoth78
dataset name is ignored if use_current_task=True
Kind of, it stores the Dataset on the Task itself (then dataset.name becomes the Task name), actually we should probably deprecate this feature, I think this is too confusing?!
What was the use case for using it ?
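For context, a minimal sketch of the mode in question (project/task names are examples; it assumes the script already called Task.init):

from clearml import Dataset, Task

task = Task.init(project_name='examples', task_name='dataset on task')
ds = Dataset.create(
    dataset_project='examples',
    dataset_name='this-name-is-ignored',  # the Task name is used instead
    use_current_task=True,  # store the Dataset on the currently running Task
)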
What if I register the artifact manually?
task.upload_artifact('local folder', artifact_object='...')
This one should be quite quick, it's updating the experiment
HealthyStarfish45 could you take a look at the code, see if it makes sense to you?
What I'm getting to, is maybe we build a template, then you could fill in the gaps ?
What do you already have working from the above steps ? and which parts are missing or we can think of automating ?
Hmm, that sounds like the agent needs to access a vault with credentials per user; unfortunately this is not covered in the open-source :) I "think" this is supported in the enterprise version as part of the permission management
So like a UI for creating pipelines doing different things on the different solutions ?
I ended up using task = Task.init(continue_last_task=task_id) to reload a specific task and it seems to work well so far.
Exactly, this will initialize and auto-log the current process into the existing task (task_id). Without the continue_last_task argument it will just create a new Task and auto-log everything to it :)
Can I change the parameters before executing the draft task?
Yes you can, after you clone the experiment everything becomes editable, so you can edit the config in the UI.
For example, let's assume I have config.yml, and in my code I do:
my_file = task.connect_configuration('config.yml')
with open(my_file, 'rt') as f:
    ...
Then after I clone it in the UI and edit the configuration, when it is executed remotely, my_file will contain the content of the configuration as s...
I would clone the first experiment, then in the cloned experiment, I would change the initial weights (assuming there is a parameter storing that) to point to the latest checkpoint, i.e. provide the full path/link. Then enqueue it for execution. The downside is that the iteration counter will start from 0 and not from the previous run.
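Programmatically, a minimal sketch of the same flow (the task ID, the parameter name 'Args/weights', the checkpoint path, and the queue name are all assumptions for illustration):

from clearml import Task

source = Task.get_task(task_id='<first_experiment_id>')
cloned = Task.clone(source_task=source, name='continue from checkpoint')
# point the (assumed) weights parameter at the latest checkpoint
cloned.set_parameter('Args/weights', '/path/to/last_checkpoint.pt')
Task.enqueue(cloned, queue_name='default')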
Hi GrotesqueDog77
and after some time I want to delete artifact with
You can simply upload with the same local file name and same artifact name; it will override the target storage. wdyt?
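Something like this sketch (the task ID, artifact name, and file path are examples):

from clearml import Task

task = Task.get_task(task_id='<task_id>')
# re-uploading under the same artifact name overwrites the copy in the target storage
task.upload_artifact(name='my_artifact', artifact_object='/path/to/new_version.txt')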