When you say "configuration files", are you referencing the dict in the mock example?
Just making sure, can the machine that you were running "trains-init" on access the API server?
BeefyHippopotamus73 this error seems like it is coming from boto3, are you sure the credentials are properly configured and that you have read permission ?
Hi @<1523706645840924672:profile|VirtuousFish83>
Hmm, so generally I think the answer is no... I mean, you can download all scalars and re-report them with a different title/series, but I don't think you will be able to delete a specific set; the only way would be to reset the entire Task.
I'm curious what's the scenario here? is it like a typo you want to fix?
I presume it is via the `project_name` and `task_name` parameters.
You are correct in your assumption; it only happens when you call Task.init, but with two distinctions:
- ArgParser arguments are overridden (with trains-agent) even before Task.init is called.
- Task.init, when running under trains-agent, will totally ignore the project/task name; it receives a pre-made task id and uses it. So the project name and experiment are meaningless if you are running the tas...
JoyousKoala59 which Trains server version do you have? The link you posted is for upgrading from v0.15 to v0.16, not from Trains to ClearML.
(This is why we recommend using pip, because it is stable and clearml-agent takes care of pytorch/cuda versions.)
Is there a way to connect to the task without initiating a new one without overriding the execution?
You can, but not with the automagic; you can manually send metrics/logs...
Does that help? or do we need the automagic?
I wonder if the try/except approach would work for XGboost load, could we just try a few classes one after the other?
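To sketch the try/except idea (a minimal generic pattern, not actual XGBoost code; the helper name `try_loaders` is mine, and with XGBoost you would pass in loaders built around the real classes, e.g. `xgboost.Booster` or `xgboost.XGBClassifier`):

```python
from typing import Any, Callable, Iterable


def try_loaders(loaders: Iterable[Callable[[str], Any]], path: str) -> Any:
    """Try each loader in turn; return the first successful result.

    If every loader raises, re-raise with all the collected errors so
    the caller can see why each candidate class failed.
    """
    errors = []
    for loader in loaders:
        try:
            return loader(path)
        except Exception as exc:  # each candidate class may fail differently
            errors.append((getattr(loader, "__name__", repr(loader)), exc))
    raise RuntimeError(f"no loader could open {path!r}: {errors}")
```

The order of the loaders matters: put the most specific class first so a more permissive loader does not silently accept a file meant for another class.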
right now I can't figure out how to get the session in order to get the notebook path
you mean the code that fires "HTTPConnectionPool" ?
Where did you add the Task.init call ?
YEY! 🎉🎉
Hi VivaciousWalrus99
Could you attach the log of the run ?
By default it will use the python it is running with.
Any chance the original experiment was executed with python2 ?
Hi RobustRat47
What do you mean by "log space for hyperparameter"? What would be the difference? (Notice that on the graph itself you can switch to log scale when viewing in the UI.)
Or are you referring to hyperparameter optimization, allowing you to add a log-space range?
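To illustrate what a log-space hyperparameter range means (a minimal generic sketch, not the ClearML optimizer API; the function name is made up): sampling uniformly in log10 space gives each decade equal weight, instead of clustering samples near the upper end of the range.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Sample uniformly in log10 space between low and high (both > 0)."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


rng = random.Random(0)
# e.g. a learning-rate search over 1e-5 .. 1e-1
samples = [sample_log_uniform(1e-5, 1e-1, rng) for _ in range(5)]
```

This is typically what you want for learning rates or regularization strengths, where the interesting values span several orders of magnitude.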
And is this repo installed on the pipeline creating machine ?
Basically I'm asking how come it did not automatically detect it?
try Hydra/trainer.params.batch_size
hydra separates nesting with "."
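To illustrate the dot-separated nesting (a minimal sketch of resolving such a key against a nested dict; this is not Hydra's actual implementation, and the helper name is made up):

```python
def get_by_dotted_path(cfg: dict, dotted: str):
    """Resolve a 'trainer.params.batch_size'-style key against nested dicts."""
    node = cfg
    for part in dotted.split("."):
        node = node[part]
    return node


cfg = {"trainer": {"params": {"batch_size": 32}}}
value = get_by_dotted_path(cfg, "trainer.params.batch_size")
```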
Hi Guys, just curious here, what's was the final issue?
Also out of curiosity, what does that mean? "1.12.2 because some bug that make fastai lag 2x" ?
Would an implementation of this kind be interesting for you, or do you suggest forking?
You mean adding a config map storing a default trains.conf for the agent?
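If that is the direction, a minimal sketch of such a ConfigMap (the name, hosts, and ports here are placeholders I made up, not values from this thread):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: trains-agent-conf   # placeholder name
data:
  trains.conf: |
    api {
      # placeholder endpoints; point these at your actual trains-server
      web_server: "http://trains-server:8080"
      api_server: "http://trains-server:8008"
    }
```

The agent pod would then mount this ConfigMap as a file at the path where it expects its trains.conf.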
HurtWoodpecker30
The agent uses the `requirements.txt`.
what do you mean by that? aren't the package listed in the "Installed packages" section of the Task?
(or is it empty when starting, i.e. it uses the requirements.txt from the github, and then the agent lists them back into the Task)
Checkout the trains-agent repo https://github.com/allegroai/trains-agent
It is fairly straightforward.
Hi DangerousDragonfly8
, is it possible to somehow extract the information about the experiment/task of which status has changed?
From the docstring of `add_task_trigger`:
```python
def schedule_function(task_id):
    pass
```
This means you are getting the Task ID that caused the trigger; now you can get all the info that you need with `Task.get_task(task_id)`:
```python
def schedule_function(task_id):
    the_task = Task.get_task(task_id)
    # now we have all the info on the Task tha...
```
Thanks ShallowCat10 !
I'll make sure we fix it 🙂
the issue was related to task.connect being called multiple times I guess.
This is odd?! How would that affect the crash?
Do notice that when you connect objects, each time you call connect you are basically deserializing the configuration from the backend into the code; maybe this somehow affected the object?
Expected behaviour is that it reads the last iteration correctly. At least that is what is stated in the docs.
This is exactly what should happen, are you saying that for some reason it fails?
Hi @<1562610699555835904:profile|VirtuousHedgehong97>
I think you need to upgrade your self-hosted clearml-server, could that be the case?