Could you send the "installed packages" section of the Task that was created in the notebook?
SlipperyDove40 following up on the missing section name, this seems like a backwards-compatibility issue. Try calling with backwards_compatibility=False:
my_params = Task.get_parameters(backwards_compatibility=False)
This should always add the section name prefix.
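A minimal sketch of what the prefixed result looks like (parameter names and values here are illustrative):
from clearml import Task

task = Task.init(project_name="examples", task_name="params demo")
my_params = task.get_parameters(backwards_compatibility=False)
# every key now carries its section prefix, e.g.
# {"Args/batch_size": "32", "Args/learning_rate": "0.001"}
print(my_params)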
Out of interest, is there a reason these are read-only?
Yes, we should probably change that... they are designed to be pre-populated, but there should not be any reason you could not remove them
The code for these tasks is on github right?
Correct
Hmm that is a good idea, and I think you are correct, it cannot support it. But it will be easy to do, maybe by adding an argument trigger_on_archive? wdyt?
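To make the idea concrete, a rough sketch of how that could look; note that trigger_on_archive is only the proposed argument (it does not exist yet), and the rest follows the TriggerScheduler interface as I understand it:
from clearml.automation import TriggerScheduler

def on_archived(task_id):
    # hypothetical callback: notify / clean up when a task gets archived
    print(f"task {task_id} was archived")

trigger = TriggerScheduler()
trigger.add_task_trigger(
    schedule_function=on_archived,
    trigger_project="examples",
    trigger_on_archive=True,  # hypothetical, the argument proposed above
)
trigger.start()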
I suspect it failed to create one on the host and then mount it into the docker container
orchestration module
When you previously mentioned cloning the Task in the UI and then running it, how do you actually run it?
regarding the exception stack
It's pointing to a stdout that was closed?! How could that be? Any chance you can provide a toy example for us to debug?
Hi QuaintJellyfish58
Is there a way or a trigger to detect when the number of workers in a queue reaches zero?
You mean to spin them down? What's the rationale?
I'd like to implement a notification system that alerts me when there are no workers left in the queue.
How are they "dropping"?
Specifically to your question, let me check; I'm sure there is an API that gets that data because you can see it in the UI 🙂
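In the meantime, a stateless check along these lines should be close (a sketch using the APIClient; the exact response fields may differ slightly):
from clearml.backend_api.session.client import APIClient

client = APIClient()
workers = client.workers.get_all()

queue_name = "default"  # illustrative queue name
# count workers currently listening on the queue we care about
active = [w for w in workers if any(q.name == queue_name for q in (w.queues or []))]
if not active:
    print(f"no workers left on queue '{queue_name}' - send alert")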
And another question: is clearml-serving ready for serious use?
Define serious use? KFserving support is in the pipeline, if that helps.
Notice that clearml-serving is basically a control plane for the serving engine; not to neglect its importance, but the heavy lifting is done by Triton 🙂 (or any other backend we will integrate with, maybe Seldon)
So General would have created a General instead of Args?
yes,
This is a must, you have to specify the hyperparameters section you are referencing.
https://github.com/allegroai/clearml/blob/5a9155b2039413280f13dfded1121470c4c4323d/examples/pipeline/step2_data_processing.py#L21
This is actually:
task.connect(args, name='General')
Basically there is no "random_state", only "General/random_state"
Make sense?
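In other words, a minimal sketch (values are illustrative):
from clearml import Task

task = Task.init(project_name="examples", task_name="data processing")
args = {"random_state": 42, "test_size": 0.2}
# connected under the "General" section, so the parameter is addressed
# as "General/random_state", not plain "random_state"
task.connect(args, name='General')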
BTW: in your code, you should probably replace
dataset_task = Task.get_task(task_id=dataset.id)
with:
dataset_task = dataset._task
Do you happen to know if there are any plans for an implementation with the logger variable, so that, if needed, it would be possible to write to different tables?
CheerfulGorilla72 what do you mean by "an implementation with the logger variable"? pytorch-lightning defaults to the TB logger, which clearml will automatically catch and log into the clearml-server; you can always add additional logs with the clearml interface Logger.current_logger().report_???
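For example, a minimal sketch of reporting an extra scalar manually (title/series/values are illustrative):
from clearml import Logger

# report an additional scalar on top of whatever the TB logger already captures
Logger.current_logger().report_scalar(
    title="custom", series="my_metric", value=0.93, iteration=7
)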
What am I mis...
Could that be the proper way to install ?
https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md#3-install-wheels-for-linux
OddAlligator72 FYI you can also import / export an entire Task (basically allowing you to create it from scratch/json, even without calling Task.create):
Task.import_task(...)
Task.export_task(...)
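A rough sketch of the round trip (the task ID is a placeholder; assuming export_task accepts a task object or ID):
from clearml import Task

# export the full task definition into a json-serializable dict
task_data = Task.export_task(task="<source_task_id>")
# ...optionally tweak task_data (name, parameters, requirements, etc.)...
# create a brand new task from that definition
new_task = Task.import_task(task_data)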
No, clearml uses boto; this is an internal boto error, which points to a bucket size limit, see the error itself
why would root cause the user to become nobody with group nogroup?
It is exactly the case, they inherit the cron service user (uid/gid) which would look like nobody/nogroup
It's stored on the Task, you can see it under the execution tab in the UI
Found the issue, fix in the next RC (soon to be out)
BroadMole98 Awesome, can't wait for your findings 🙂
Hi OutrageousGiraffe8
I was not able to reproduce
Python 3.8 Ubuntu + TF 2.8
I get both metrics and model stored and uploaded
Any idea?
[Assuming the above is what you are seeing]
What I "think" is happening is that the Pipeline creates it's own Task. When the pipeline completes, it closes it's own Task, basically making any later calls to Tasl.current_task() return None, because there is no active Task. I think this is the reason that when you are calling process_results(...) you end up with None.
For a quick fix, you can do:
pipeline = Pipeline(...)
MedianPredictionCollector.process_results(pipeline._task)
Maybe we should...
Which would also mean that the system knows which datasets are used in which pipelines etc
Like input artifacts per Task?
Actually scikit implies joblib 🙂 (so you should use scikit; anyhow I'll make sure we add joblib as it is more explicit)
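Assuming this refers to the auto-logging switches in Task.init, a sketch of "use scikit" would be:
from clearml import Task

# enabling "scikit" also covers the joblib-based model save/load hooks
task = Task.init(
    project_name="examples",
    task_name="sklearn demo",
    auto_connect_frameworks={"scikit": True},
)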
Hi DilapidatedDucks58
apologies, this thread slipped away.
I double checked, the server will not allow you to overwrite it (meaning to have it fixed we will need to release a server version, which usually takes longer)
That said maybe we can pass an argument to the "Task.init" so it ignores it? wdyt?
Can I get GPU usage over a time frame via the API also?
task.get_reported_scalars
But this will get you all the scalars; I think the next version of the server supports asking for a specific one as well.
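For example (a sketch; ":monitor:gpu" is the title the machine-statistics reporting typically uses, but verify against your own task):
from clearml import Task

task = Task.get_task(task_id="<your_task_id>")  # placeholder ID
scalars = task.get_reported_scalars()
# machine monitoring is reported like any other scalar; GPU usage is
# typically found under the ":monitor:gpu" title
for series, points in scalars.get(":monitor:gpu", {}).items():
    print(series, list(zip(points["x"], points["y"]))[:5])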
How are you implementing the alert monitoring?
Is it a stateless process starting every X min, or is it a stateful process running and monitoring?
WackyRabbit7 I do 'pkill -f trains' but it's the same... If you need to debug and test, run with --foreground and just hit ctrl-c to end the process (it will never switch to background...). Helps?
It's dead simple to install:
pip install trains-agent
then you can simply do:
trains-agent execute --id myexperimentid
ThickDove42 If you need the name itself:
events.plots[0]['metric']
events.plots[0]['variant']
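For reference, a sketch of how those plot events might be fetched (assuming the APIClient route; the exact response shape may vary):
from clearml.backend_api.session.client import APIClient

client = APIClient()
events = client.events.get_task_plots(task="<your_task_id>")  # placeholder ID
# each plot event carries the metric/variant names referenced above
print(events.plots[0]['metric'], events.plots[0]['variant'])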