I thought this is the issue on the thread you linked, did I miss something?
At the top there should be the URL of the notebook (I think)
Is this some sort of polling ?
yes
At the end of the day, we are just worried whether this will hog resources compared to a webhook? Any ideas? (edited)
No need to worry, it pulls every 30 sec, and this is negligible (as a comparison any task will at least send a write request every 30 sec, if not more)
Actually webhooks might be more taxing on the server, as you need to always have a webhook up (i.e. wasting a socket ...)
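The polling approach described above can be sketched generically in plain Python (nothing ClearML-specific; the 30-second interval is taken from the message above, and the function names are illustrative):

```python
import time

def poll(check, interval_sec=30, max_polls=None):
    """Call `check()` every `interval_sec` seconds until it returns a
    truthy value, or until `max_polls` attempts are exhausted."""
    polls = 0
    while max_polls is None or polls < max_polls:
        result = check()
        if result:
            return result
        polls += 1
        time.sleep(interval_sec)
    return None

# Example: a condition that becomes true on the third check.
state = {"n": 0}

def check_done():
    state["n"] += 1
    return "done" if state["n"] >= 3 else None

print(poll(check_done, interval_sec=0, max_polls=10))  # -> done
```

The point of the comparison in the chat: a sleeping loop like this costs almost nothing between checks, whereas a webhook requires a listening socket to be held open the whole time.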
based on this:
https://clear.ml/docs/latest/docs/references/api/endpoints#post-debugping
http://localhost:8080/debug.ping
btw: what's the usage scenario?
JitteryCoyote63 it should just "freeze" after a while as it will constantly try to resend logs. Basically you should be fine 🙂
(If for some reason something crashed, please let me know so we can fix it)
Now I am passing it the same way you have mentioned, but my code still gets stuck as in above screenshot.
The screenshot shows a warning from pyplot (matplotlib), not ClearML, or am I missing something?
My guess is that it can't resolve credentials. It does not give me any pop up to login also
If it fails, you will get an error; there will never be a popup from code 🙂
... We need a more permanent place to store data
FYI you can store the "Dataset" itself on GS (instead of...
Yes, in the UI: clone or reset the Task, then you can edit the installed packages section under the Execution tab.
TrickyRaccoon92 I'm not sure I follow: the TB plots do show, and you want to add an additional plotly plot?
MysteriousBee56 what do you mean "delete a worker"
stop the agent running remotely ?
Hmm I see your point.
Any chance you can open a github issue with a small code snippet to make sure we can reproduce and fix it?
Does a pipeline step behave differently?
Are you disabling it in the pipeline step ?
(disabling it for the pipeline Task has no effect on the pipeline steps themselves)
The task pod (experiment) started reaching out to an IP associated with malicious activity. The IP was associated with 1000+ domain names. The activity was identified in AWS guard duty with a high severity level.
BoredHedgehog47 What is the pod container itself ?
EDIT:
Are you suggesting the default "ubuntu:18.04" is somehow contaminated ?
https://hub.docker.com/layers/library/ubuntu/18.04/images/sha256-d5c260797a173fe5852953656a15a9e58ba14c5306c175305b3a05e0303416db?context=explore
I prefer serving my models in-house and only performing the monitoring via ClearML.
clearml-serving is an infrastructure for you to run models 🙂
to clarify, clearml-serving is running on your end (meaning this is not SaaS where a 3rd party is running the model)
By the way, I saw there is a project dashboard app which might support the visualization I am looking for. Is it suitable for such use case?
Hmm interesting, actually it might; it does collect metrics over time ...
Sure: `Dataset.create(..., use_current_task=True)` — this will basically attach/make the main Task the Dataset itself (a Dataset is a type of Task, with logic built on top of it).
wdyt ?
Hi @<1631102016807768064:profile|ZanySealion18>
sorry missed that one
The cache doesn't work, it attempts to download the dataset every time.
just making sure the dataset itself contains all the files?
Once I used the clearml-data add --folder * CLI, everything worked correctly (though all files recursively ended up in the root; luckily they were all named differently).
Not sure I follow here: is the problem the creation of the dataset or fetching it? Is this a single version or multi...
Hi MagnificentSeaurchin79
This sounds like a deeper bug (of a sort), I think the best approach is to open a GitHub issue with some code that can reproduce this behavior, or at least enough information so that we could try to catch the bug.
This way we will make sure it is not forgotten.
Sounds good ?
Quite hard for me to try this right
🙂
How do I reproduce it ?
Hi @<1526371965655322624:profile|NuttyCamel41>
I think that the only way to actually get huge number of api calls is with a lot of machines.
For example, regardless of the number of console logs you print, it will only be a single call, as these are batched every 2-10 seconds. The same goes for metric reporting etc.
On the free tier you can already test the number of API calls; I think the mechanism is exactly the same.
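The batching behavior described above (many console lines collapsing into one API call every few seconds) can be sketched in plain Python; the class and attribute names here are illustrative, not ClearML's actual internals:

```python
import time

class BatchedReporter:
    """Buffer log lines and send them as a single 'API call' when the
    flush interval elapses. Illustrative sketch only."""

    def __init__(self, flush_interval_sec=2.0):
        self.flush_interval_sec = flush_interval_sec
        self._buffer = []
        self._last_flush = time.monotonic()
        self.api_calls = 0  # how many batched calls were actually "sent"

    def log(self, line):
        self._buffer.append(line)
        if time.monotonic() - self._last_flush >= self.flush_interval_sec:
            self.flush()

    def flush(self):
        if self._buffer:
            self.api_calls += 1  # one call for the whole batch
            self._buffer.clear()
        self._last_flush = time.monotonic()

# 1000 printed lines, but only one batched call goes out on flush.
reporter = BatchedReporter(flush_interval_sec=60)
for i in range(1000):
    reporter.log(f"line {i}")
reporter.flush()
print(reporter.api_calls)  # -> 1
```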
fyi: I would put this question in the channel
That experiment says it's completed, does it mean that the autoscaler is running or not?
Not running, it will be "running" if actually being executed
Glad to hear that! 🙂
Hi @<1655744373268156416:profile|StickyShrimp60>
The best way is through the APIs: you can query all the Tasks and then, one by one, use task.export_task together with task.get_reported_scalars, task.get_reported_plots, and task.get_reported_console_output to get the details. After that you can recreate the Task with import_task and manually report the scalars/plots/console output.
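A rough sketch of the export step, assuming the Task methods named above (`export_task`, `get_reported_scalars`, etc.). The helper function, the dict layout, and the `FakeTask` stub are illustrative; with a real server you would operate on actual `clearml.Task` objects instead:

```python
def export_task_snapshot(task):
    """Collect a task's definition and reported data into one dict,
    using the method names mentioned in the chat above."""
    return {
        "definition": task.export_task(),
        "scalars": task.get_reported_scalars(),
        "plots": task.get_reported_plots(),
        "console": task.get_reported_console_output(),
    }

# Stub standing in for a real clearml Task, so the sketch runs offline.
class FakeTask:
    def export_task(self):
        return {"name": "demo"}

    def get_reported_scalars(self):
        return {"loss": [0.5, 0.3]}

    def get_reported_plots(self):
        return []

    def get_reported_console_output(self):
        return ["line 1"]

snapshot = export_task_snapshot(FakeTask())
print(sorted(snapshot))  # -> ['console', 'definition', 'plots', 'scalars']
```

The import side would mirror this: recreate the Task from `snapshot["definition"]` via `import_task`, then re-report the scalars/plots/console entries one by one.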
btw: is self hosted server cheaper than the 15$ a month hos...
Great! btw: final v1.2.0 should be out after the weekend
You mean parameters of the pipeline? Is this a pipeline from Tasks or from function decorator?
That didn't give useful info; the issue was that docker was not installed on the agent machine x)
JitteryCoyote63 you mean "docker" was not installed and it did not throw an error ?
(2) yes, weekdays with a specific hour should do exactly that :)
(3) yes I see your point, maybe we should add boolean allowing you to run immediately?
Back to (1) , let me see if I can reproduce, anything specific I need to add to the schedule call?
Hi EnthusiasticCoyote38
But once one process finished, it changed the task status to completed. Maybe you know some safe way to deal with such a situation? Or maybe the best way is to check the task status before uploading the object?
Well, you can actually forcefully set the state of the Task to running, then add artifacts, then close it?
would that work?
```python
my_other_task.reload()
my_other_task.mark_started(force=True)
my_other_task.upload_artifact(...)
my_other_task.flush(wait_for_uploads=True)
my_othe...
```
ShakyJellyfish91 what exactly are you passing to Task.create?
Could it be you are only passing `script=` and leaving `repo=None`?