cannot schedule new futures after interpreter shutdown
This implies the process is shutting down.
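For context, here is a minimal standard-library sketch of the closely related executor-level error. It is not ClearML code, and it shuts the executor down explicitly, whereas the message above is raised when the interpreter itself is already exiting. The function name is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_after_shutdown():
    """Return the error message raised when submitting to a closed executor."""
    executor = ThreadPoolExecutor(max_workers=1)
    # Work submitted while the executor is alive runs normally.
    assert executor.submit(lambda: 1 + 1).result() == 2
    executor.shutdown(wait=True)
    try:
        # Any submit after shutdown is rejected with a RuntimeError.
        executor.submit(lambda: 0)
    except RuntimeError as err:
        return str(err)
    return ""


message = submit_after_shutdown()
print(message)
```

If a background upload is still being scheduled when the script ends, waiting for it to finish before exiting is the usual way to avoid this class of error.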
Where are you uploading the model? Which clearml version are you using? Can you check with the latest version (1.10)?
@<1532532498972545024:profile|LittleReindeer37> nice!!!
Do you want to PR? It will be relatively easy to merge and test, and I think they might even push it to the next version (or, worst case, a quick RC).
This is not very clear from the documentation
ElegantCoyote26 which documentation are you referring to?
Yes, this seems like it is stuck. Could you test with the demo server?
(basically, remove the clearml.conf and it will connect automatically)
My task starts up and checks the mounted EFS volume for x data, if x data does not exist there, it then pulls x data from S3.
BoredHedgehog47 you can just use StorageManager and configure the clearml cache to point at the EFS; it will essentially do the same.
Regarding the helm chart with EFS,
you need to configure the clearml-glue pod template with the EFS mount
Example:
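The "check the cache, otherwise download" flow described above can be sketched with the standard library alone. This is only an illustration of the pattern: `get_cached_or_fetch` and `fake_fetch` are hypothetical names, and the temporary directory stands in for the mounted EFS path. With ClearML, StorageManager is expected to do this check for you once its cache points at the EFS mount.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def get_cached_or_fetch(cache_dir: Path, name: str, fetch: Callable[[Path], None]) -> Path:
    """Return cache_dir/name, calling fetch(target) only if the file is missing."""
    target = cache_dir / name
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        fetch(target)  # hypothetical downloader, e.g. a pull from S3
    return target


calls: List[str] = []


def fake_fetch(target: Path) -> None:
    # Stand-in for the real download; records that it was invoked.
    calls.append(target.name)
    target.write_text("x data")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "efs"  # stand-in for the mounted EFS path
    first = get_cached_or_fetch(cache, "x.bin", fake_fetch)
    second = get_cached_or_fetch(cache, "x.bin", fake_fetch)
    same_path = first == second
    cached_text = second.read_text()
```

The second call finds the file already in the cache, so the downloader runs only once.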
https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/e7f647f4e6fc76f983d61522e635353005f1472f/examples/kubernetes/volu...
Yes, actually the first step would be a toggle button for regexp in the search, the second will be even more advanced search.
May I suggest you post it on the UI suggestion issue https://github.com/allegroai/trains/issues/81 ?
Is it CLEARML_CONFIG_FILE? (I had to dig this from the GH code)
Yes it is!
https://clear.ml/docs/latest/docs/faq#clearml-configuration
(I will make sure we add it to https://clear.ml/docs/latest/docs/configs/env_vars#server-connection as well)
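A minimal sketch of setting the variable from Python rather than from the shell. The path is a made-up placeholder, and setting it before importing clearml is an assumption made here to be safe, not something stated in the thread.

```python
import os
from pathlib import Path

# Hypothetical location; replace with the real path to your config file.
config_path = Path.home() / "configs" / "clearml.conf"

# Assumption: set this before `import clearml` so the package sees it
# when it loads its configuration.
os.environ["CLEARML_CONFIG_FILE"] = str(config_path)

print(os.environ["CLEARML_CONFIG_FILE"])
```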
Should pass only_published:
https://github.com/allegroai/clearml/blob/071caf53026330f3bb8019ee5db3d039562072f3/clearml/model.py#L444
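A short sketch of what passing the flag could look like, assuming the linked line is the `Model.query_models` signature. `list_published_models` is a hypothetical wrapper, and the import is done lazily so the snippet loads even where clearml is not installed.

```python
def list_published_models(project_name):
    """Return only the published models of a project (requires the clearml package)."""
    # Imported lazily so defining this function does not require clearml.
    from clearml import Model

    return Model.query_models(project_name=project_name, only_published=True)
```

Calling `list_published_models("my project")` needs a configured ClearML server; the project name here is a placeholder.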
I think you are correct, we should move the definition so you can control it from the clearml.conf. Does that make sense to you?
It's more or less here:
https://github.com/allegroai/clearml-session/blob/0dc094c03dabc64b28dcc672b24644ec4151b64b/clearml_session/interactive_session_task.py#L431
I think that just replacing the package would be enough (I mean you could choose hub/lab, which makes sense to me)
ReassuredTiger98 I ❤ the DAG in ASCII!!!
port = task_carla_server.get_parameter("General/port")
This looks great! And it will achieve exactly what you are after.
BTW: when you are done you can do: task_carla_server.mark_aborted(force=True)
And it will shut down the Carla Task
task = Task.get_task('task_id_here')
task.mark_started(force=True)
task.upload_artifact(..., wait_on_upload=True)
task.mark_completed()
Hi SlipperyDove40
plotly is about 4MB... trains is about 0.5MB. What's the breakdown of the packages? This seems far from the 250MB limit.
a task of queue B if the next task is of type A it will have to wait,
It seems you imply there are two types of Tasks and they need to be executed one after the other?
It's a good abstraction for monitoring the state of the platform and for callbacks, if this is what you are after.
If you just need "simple" cron, then you can always just loop/sleep
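A minimal sketch of the loop/sleep approach, using only the standard library. `run_every` and its `max_runs` parameter are illustrative names; `max_runs` exists so the example terminates, and a real "cron" loop would leave it as None.

```python
import time
from typing import Callable, List, Optional


def run_every(interval_sec: float, job: Callable[[], None], max_runs: Optional[int] = None) -> int:
    """Call job() repeatedly, sleeping interval_sec between consecutive runs."""
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Skip the final sleep once the requested number of runs is reached.
        if max_runs is None or runs < max_runs:
            time.sleep(interval_sec)
    return runs


ticks: List[float] = []
completed = run_every(0.01, lambda: ticks.append(time.monotonic()), max_runs=3)
```

Note that this does not compensate for the time the job itself takes, so the period drifts; that is usually acceptable for a "simple" cron.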
Thanks @<1689446563463565312:profile|SmallTurkey79> !
JuicyFox94 maybe you can help here?
Martin I told you I can't access the resources in the cluster unfortunately
so it seems there is some misconfiguration of the k8s glue, because we can see it can "talk" to the clearml-server, but it seems it fails to actually create the k8s pod/job. I would start with debugging the k8s glue (not the services agents). Regardless, I think the next step is to get a log of the k8s glue pod, and better understand the issue.
wdyt?
SourLion48 you mean the wraparound?
https://github.com/allegroai/clearml/blob/168074acd97589df58436a3ec122a95a077620c2/docs/clearml.conf#L33
WickedGoat98 did you set up a machine with trains-agent pulling from the "default" queue?
BoredHedgehog47 could it be that "python" points to python 2.7 inside your container, as opposed to python3 on your machine?
(this error is python2 trying to run python 3 code)
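One quick way to confirm which interpreter "python" resolves to inside the container is to run a tiny check like the one below with that same command. This is a sketch using only the standard library; it deliberately avoids Python 3-only syntax so that it also runs (and reports the problem) under Python 2.

```python
import sys


def describe_interpreter():
    """Return the path and major.minor version of the running interpreter."""
    version = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    return "%s (Python %s)" % (sys.executable, version)


info = describe_interpreter()
print(info)

# Fail loudly if the command resolved to a Python 2 interpreter.
if sys.version_info[0] < 3:
    raise SystemExit("Python 3 is required, but this is " + info)
```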
https://stackoverflow.com/questions/20555517/using-multiple-versions-of-python
"Training classifier with command:\n python -m sfi.imagery.models.bbox_predictorv2.train
Hi HealthyStarfish45
- is there an advantage in using tensorboard over your reporting?
Not unless your code already uses TB or has some built in TB loggers.
html reporting looks powerful, can one inject some javascript inside?
As long as the JS is self contained in the html script, anything goes :)
Hmm, can you try with an additional setting? Next to "secure: true" in your clearml.conf, can you add "verify: false"?
First I would check the CLI command, it will basically prefill it for you:
https://clear.ml/docs/latest/docs/apps/clearml_task
Specifically to your question, working directory "." is the root of the git repo
But I would avoid adding it manually. Use the CLI: it will either ask you to provide the info or take the git repo details from the local copy.
I want to store only my raw data in my blob storage, and I want to create a Hyperdataset with all the artifacts, metrics, frames,
Yes that's exactly how it works.
This line adds a reference to raw file (local/remote)
[https://github.com/allegroai/clearml/blob/1b474dc0b057b69c76bc2daa9eb8be927cb25efa[…]es/hyperdatasets/data-registration/register_dataset_wit...
BTW: the agent will resolve pytorch based on the installed CUDA version.