AgitatedDove14 So what you are saying is that since I have trains-server 0.16.1, I should use trains>=0.16.1? And what about trains-agent? Only version 0.16 is released atm, and this is the one I use
AgitatedDove14 yes but I don't see in the docs how to attach it to the logger of the earlystopping handler
super, thanks SuccessfulKoala55 !
I didn't use ignite callbacks, for future reference:
` early_stopping_handler = EarlyStopping(...)

# Report the early-stopping patience counter as a scalar after each epoch
def log_patience(_):
    clearml_logger.report_scalar("patience", "early_stopping", early_stopping_handler.counter, engine.state.epoch)

engine.add_event_handler(Events.EPOCH_COMPLETED, early_stopping_handler)
engine.add_event_handler(Events.EPOCH_COMPLETED, log_patience) `
Hi TimelyPenguin76 ,
trains-server: 0.16.1-320
trains: 0.15.1
trains-agent: 0.16
Ok, this I cannot locate
So previous_task actually ignored the output_uri
SuccessfulKoala55 I deleted all :monitor:machine and :monitor:gpu series, but that only removed ~20M documents out of the 320M documents in events-training_debug_image-xyz . I would now like to understand which experiments contain most of the documents so I can delete them. I would like to aggregate the number of documents per experiment. Is there a way to do that using the ES REST api?
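A minimal sketch of how such a per-experiment count could be requested with an Elasticsearch terms aggregation. It assumes the experiment ID is stored in a keyword field named `task` (an assumption, check the index mapping first); the helper names and the bucket limit are hypothetical:

```python
import json


def build_docs_per_task_query(top_n=20):
    """Build a search body that counts documents per experiment.

    Assumption: the experiment ID lives in a keyword field called "task".
    """
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            "docs_per_task": {
                "terms": {
                    "field": "task",
                    "size": top_n,  # number of buckets (experiments) to return
                    "order": {"_count": "desc"},  # largest experiments first
                }
            }
        },
    }


def summarize_buckets(response):
    """Turn the aggregation part of a search response into (task_id, count) pairs."""
    buckets = response["aggregations"]["docs_per_task"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


if __name__ == "__main__":
    # Print the body so it can be sent with curl or any HTTP client
    print(json.dumps(build_docs_per_task_query(), indent=2))
```

The printed body would be sent as a POST to the `_search` endpoint of the events-training_debug_image-xyz index, and `summarize_buckets` applied to the parsed JSON response.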
ubuntu18.04 is actually 64 MB, I can live with that
That's why I suspected trains was installing a different version than the one I expected
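One way to confirm which version actually ended up in an environment is to query the installed package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the package names are the ones mentioned in this thread:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the thread; adjust to whatever is expected
    for name in ("trains", "trains-agent"):
        print(name, installed_version(name) or "not installed")
```

Running this inside the environment the agent created shows the versions that were really installed there, which can then be compared with the expected ones.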
correct, you could also use
Task.create
that creates a Task but does not do any automagic.
Yes, I didn't use it so far because I didn't know what to expect since the doc states:
"Create a new, non-reproducible Task (experiment). This is called a sub-task."
In my github action, I should just have a dummy clearml server and run the task there, connecting to this dummy clearml server
is there a command / file for that?
What I put in the clearml.conf is the following:
agent.package_manager.pip_version = "==20.2.3"
agent.package_manager.extra_index_url: [" "]
agent.python_binary = python3.8
Hi SuccessfulKoala55 , will I be able to update all references to the old s3 bucket using this command?
BTW, is there any specific reason for not upgrading to clearml?
I just didn't have time so far
I don't think it is, I was rather wondering how you handled it, to understand potential sources of slowdown in the training code
very cool, good to know, thanks SuccessfulKoala55
Ha I see, it is not supported by the autoscaler > https://github.com/allegroai/clearml/blob/282513ac33096197f82e8f5ed654948d97584c35/trains/automation/aws_auto_scaler.py#L120-L125
Hi AgitatedDove14 , initially I was doing this, but then I realised that with the approach you suggest all the packages of the local environment also end up in the "installed packages", while in reality I only need the dependencies of the local package. That's why I use _update_requirements : with this approach only the required packages will be installed in the agent
I also tried setting ebs_device_name = "/dev/sdf" - didn't work