Hi @<1666253626772819968:profile|SmoggyDog77> , this is indeed a very interesting and somewhat complicated use-case! I think ClearML can support all of your current needs and any that might arise in the future.
I would suggest contacting ClearML directly to get a better idea of how ClearML can assist your MLOps efforts.
In my experience, the best way would be via the contact form - None
Hi @<1632913939241111552:profile|HighRaccoon77> , the most 'basic' solution would be adding a piece of code at the end of your script to shut down the machine, but obviously that would be unpleasant to run locally without `Task.execute_remotely()` - None
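A minimal sketch of that "shut down at the end" pattern. The helper name and the exact shutdown command are hypothetical; in a real ClearML script you would typically call `Task.init()` and `task.execute_remotely(queue_name=...)` first, so the shutdown only ever runs on the remote worker:

```python
import subprocess

def train_then_shutdown(train_fn, runner=subprocess.run):
    """Run the training function, then power the machine off.

    `runner` is injectable so the shutdown call can be stubbed out
    when testing locally. On a ClearML task you would call
    Task.init() and task.execute_remotely(queue_name="default")
    before this, so the shutdown only happens remotely.
    """
    try:
        train_fn()
    finally:
        # Assumes the user running the agent has shutdown rights.
        runner(["sudo", "shutdown", "-h", "now"], check=False)
```

Locally you can pass a no-op `runner` so your machine stays up while you debug.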
Are you specifically using SageMaker? Do you have any API interface you could use to manage shutting down machines?
And if you run the same code locally everything is reported correctly?
Hi @<1584716355783888896:profile|CornyHedgehog13> , you can only see a list of files inside a dataset/version. I'm afraid you can't really pull individual files since everything is compressed and chunked. You can download individual chunks.
Regarding the second point - there is nothing out of the box, but you can get a list of files in all datasets and then check whether a given file exists in the others.
Does that make sense?
Although I think a problem would be syncing the databases on different servers
Hi @<1697056708469198848:profile|HollowPeacock63> , not sure I understand. What exactly are you trying to do?
Hi @<1644147961996775424:profile|HurtStarfish47> , do you have a basic code snippet that reproduces this behavior?
Hi @<1529271085315395584:profile|AmusedCat74> , the agent technically has two modes: `daemon` and `execute` (`clearml-agent daemon` / `clearml-agent execute`). When in daemon mode, the agent will, for example, start the Docker container, install the agent inside it, and that inner agent will run in `execute` mode.
I'd suggest running `Task.init` first and then exposing the dataset name via argparse afterwards
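A minimal sketch of that ordering. The project/task names are placeholders, and `Task.init` is left as a comment so the snippet runs standalone; the point is that `Task.init()` comes before the argparser, which lets ClearML hook argparse and allows the agent to override the values on remote runs:

```python
import argparse

# Call Task.init() BEFORE building/parsing the argparser:
# from clearml import Task
# task = Task.init(project_name="examples", task_name="dataset-demo")

parser = argparse.ArgumentParser()
parser.add_argument("--dataset-name", default="my_dataset")
args = parser.parse_args([])  # empty argv so the sketch is self-contained
print(args.dataset_name)
```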
Hi @<1593051292383580160:profile|SoreSparrow36> , can I assume you're running a self-hosted server? Is there any chance you were using either a very old SDK or an old backend?
The default behavior now is to create pipeline tasks as hidden and only show them as part of the pipelines UI section.
Think of it this way. You have the pipeline controller which is the 'special' task that manages the logic. Then you have the pipeline steps. Both the controller and the steps need some agent to execute them. So you need an agent to execute the controller and also you need another agent to run the steps themselves.
I would suggest clicking on 'task_one' and going into full details. My guess is it's in the 'enqueued' state, probably in the 'default' queue.
@<1523704089874010112:profile|FloppyDeer99> , can you try upgrading your server? It appears to be a pretty old version.
When looking at the user in MongoDB, is it some special user or just a regular one?
@<1792727007507779584:profile|HollowKangaroo53> , is it a self hosted server?
Hi @<1686184974295764992:profile|ClumsyKoala96> , you can set `CLEARML_API_DEFAULT_REQ_METHOD` to `POST` and that should work - None
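For example, from Python the variable has to be in place before `clearml` is imported so the SDK picks it up (a sketch; you can equally export it in your shell or agent environment):

```python
import os

# Set the request method BEFORE importing clearml,
# so the SDK reads it when it loads its configuration.
os.environ["CLEARML_API_DEFAULT_REQ_METHOD"] = "POST"

# import clearml  # imported only after the variable is set
```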
Hi @<1702492411105644544:profile|YummyGrasshopper29> , it looks like the controller is running, but is there any agent listening to the queue the tasks are being pushed to?
What settings do you have in this section of your clearml.conf?
None
Hi @<1585441140176326656:profile|StrongDove49> , what do you mean by finding the credentials for each user? Do you mean in the databases?
Can you please add a stand alone code snippet that reproduces this?
Just to make sure I understand the flow: you run an experiment and create it inside the project 'my_example'. Afterwards you run a pipeline and specify 'my_example' for the controller. This will turn 'my_example' into a hidden project. Am I getting it right?
Yes, but then you need to manually inject those environment variables when running the agent
Hi @<1698868530394435584:profile|QuizzicalFlamingo74> , try `compression=False`
It should be at the top level, not under `environment` or `agent`
Hi @<1627478122452488192:profile|AdorableDeer85> , can you provide a code snippet that reproduces this?
What happens if you remove the `run_locally()` call?
And just making sure - the first pipeline step isn't even pushed into a queue? It remains in 'draft' mode?
Hi @<1755401041563619328:profile|PungentCow70> , currently you can filter only by tags and project title/dataset name. But I think it would be a cool capability - maybe open a GitHub feature request for it?
You also need to spin up the ClearML server...
None
top right corner
I would suggest adding print-outs throughout the code to better understand when this happens
Hi @<1554638179657584640:profile|GlamorousParrot83> , yes it is supported just like in the AWS autoscaler 🙂
MortifiedDove27 , in the `docker ps` output you added, everything seems to be running fine