my use case is more like the 1st one, where i run the training on a given schedule
trains is run using docker-compose with allegroai/trains-agent-services:latest and allegroai/trains:latest
TimelyPenguin76 is there any way to do this using UI directly or as a schedule... otherwise i think i will run the cleanup_service as given in docs...
also one thing i noticed.. when i report a confusion matrix and some other plots, e.g. seaborn with matplotlib.. on the server side i can see the plots are there, but they are not visible at all
seems like port forwarding had an issue.. fixed that.. now running the test again to see if things work out as expected
ok will give it a try and let you know
i think for now it should do the trick... was just thinking about the roadmap part
not so sure.. ideally i was looking for some function calls which enable me to create a sort of DAG which gets scheduled at a given interval, where the DAG has status checks on upstream tasks... so if an upstream task fails.. downstream tasks are not run
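To make the DAG idea above concrete, here is a minimal plain-Python sketch. It does not use any Trains API: `run_dag`, the task names, and the status strings are all hypothetical, chosen only for illustration. Each task runs only if every one of its upstream tasks completed; otherwise it is marked as skipped.

```python
from collections import deque


def run_dag(tasks, deps):
    """Run callables in dependency order and return a name -> status dict.

    tasks: dict mapping a task name to a zero-argument callable.
    deps:  dict mapping a task name to a list of upstream task names.
    (Hypothetical helper for illustration; not part of Trains.)
    """
    status = {}
    # Count unfinished upstream tasks and record downstream links.
    pending = {name: len(deps.get(name, [])) for name in tasks}
    children = {name: [] for name in tasks}
    for name, upstreams in deps.items():
        for upstream in upstreams:
            children[upstream].append(name)

    ready = deque(name for name in tasks if pending[name] == 0)
    while ready:
        name = ready.popleft()
        # Status check: run only if every upstream task completed.
        if any(status[u] != "completed" for u in deps.get(name, [])):
            status[name] = "skipped"
        else:
            try:
                tasks[name]()
                status[name] = "completed"
            except Exception:
                status[name] = "failed"
        # Release downstream tasks once all their upstreams are resolved.
        for child in children[name]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    return status
```

Running this at a given interval could be as simple as calling `run_dag` from a loop with `time.sleep`, or from cron; that part is left out of the sketch.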
you replied to it already.. it was execute_remotely called with the exit_true argument
thanks... i was just wondering if i overlooked any config option for that... as cpu_set might be a possibility for cpu
looking at the above link, it seems i might be able to create it with some boilerplate, as it has the concept of parent and child... but not sure how status checks and dependencies get sorted out
this looks good... also do you have any info/eta on the next controller/service release you mentioned
couldn't find the licensing price for enterprise version
i ran it this week
i guess i was not so clear maybe.. say e.g. you are running lightgbm model training: by default it will take all the cpus available on the box and run that many threads. now another task gets scheduled on the same box, so you have 2x the threads with the same amount of CPU to schedule on. So yes the jobs will progress, but not at the same rate, because context switches will happen far more often than if we had allowed 1/2x threads for each job
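The arithmetic behind the "1/2x threads per job" point can be sketched as below. `threads_per_job` is a hypothetical helper, not a Trains or agent option; the resulting number would be passed to the training library yourself, for example as LightGBM's `num_threads` parameter.

```python
import os


def threads_per_job(concurrent_jobs, total_cpus=None):
    """Split the box's CPUs evenly across jobs, with at least 1 thread each.

    Hypothetical helper for illustration; total_cpus defaults to the
    number of CPUs Python reports for this machine.
    """
    cpus = total_cpus if total_cpus is not None else (os.cpu_count() or 1)
    return max(1, cpus // max(1, concurrent_jobs))


# Example: two jobs sharing a 16-CPU box get 8 threads each, so the
# total thread count matches the CPU count instead of doubling it.
lgbm_params = {
    "objective": "regression",
    "num_threads": threads_per_job(2, total_cpus=16),
}
```

This only caps what the job asks for; it does not pin the job to specific cores.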
AgitatedDove14 no it doesn't work
so as you say.. i don't think the issue i am seeing is due to this error
this is when executed directly with task.init()
any logs i can check to debug on my side?
thanks for letting me know.. but it turns out after i have recreated my whole system environment from scratch, trains agent is working as expected..
thanks for the update... it seems currently i can not pass the http/s proxy parameters: when the agent creates a new env and tries to download some package, it's being blocked by our corp firewall... all outgoing connections need to pass through a proxy.. so is it possible to specify that, or environment variables, to the agent?
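One generic way to approach the proxy question is to set the conventional `HTTP_PROXY` / `HTTPS_PROXY` / `NO_PROXY` environment variables for the process that launches the agent. The sketch below only builds such an environment; `env_with_proxy` and the proxy URL are hypothetical, and whether the agent forwards these variables into the environments it creates is an assumption that would need checking.

```python
import os


def env_with_proxy(proxy_url, no_proxy="localhost,127.0.0.1", base_env=None):
    """Return a copy of the environment with proxy variables set.

    Hypothetical helper for illustration. Both upper- and lower-case
    names are set because tools differ in which spelling they read.
    """
    env = dict(os.environ if base_env is None else base_env)
    for key in ("HTTP_PROXY", "HTTPS_PROXY"):
        env[key] = proxy_url
        env[key.lower()] = proxy_url
    env["NO_PROXY"] = no_proxy
    env["no_proxy"] = no_proxy
    return env


# Placeholder proxy address; replace with the corporate proxy.
proxy_env = env_with_proxy("http://proxy.example.com:3128", base_env={})
```

The resulting dict can be passed as `env=` to `subprocess.run` or `subprocess.Popen` when starting the agent process, so that package downloads started by child processes inherit the proxy settings.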
an example achieving what i propose would be greatly helpful