Reputation
Badges 1
981 × Eureka!To clarify: trains-agent run a single service Task only
Isn't it overkill to run a whole ubuntu 18.04 just to run a dead simple controller task?
not really, because it is in the middle of the controller task, there are other things to be done afterwards (retrieving results, logging new artifacts, creating new tasks, etc)
AppetizingMouse58 the events_plot.json template misses the plot_len declaration, could you please give me the definition of this field? (reindexing with dynamic: strict fails with: "mapping set to strict, dynamic introduction of [plot_len] within [_doc] is not allowed )
Ok, I won't have time to venture to check the different database components, the first option (shuting down the server) sounds like the easiest option for me, I would then run manually the script once a month or so
That would be awesome, yes, only from my side I have 0 knowledge of the pip codebase ๐
Yes, in the Task being executed in the agents, I have:from trains import Task task = Task.init(...) task.get_logger().report_text(str(task.get_parameters()))
Installing collected packages: my-engine Attempting uninstall: my-engine Found existing installation: my-engine 1.0.0 Uninstalling my-engine-1.0.0: Successfully uninstalled my-engine-1.0.0 Successfully installed my-engine-1.0.0
I can also access these files directly if I enter the url in the browser
Note: I can verify that post_packages is well picked up by the trains-agent, since in the experiment log I see:agent.package_manager.type = pip agent.package_manager.pip_version = \=\=20.2.3 agent.package_manager.system_site_packages = true agent.package_manager.force_upgrade = false agent.package_manager.post_packages.0 = PyJWT\=\=1.7.1
Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion
Yes AnxiousSeal95 , stopped instance meaning you donโt pay for it, but just its storage, as described https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html . So AgitatedDove14 increasing the IDLE timeout would still make me pay for the instance while they are idle.
Do you get stopped instances instantely when you ask for them?
Well thatโs a good question, thatโs what I observed some time ago, but according to their https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/...
Interesting! Something like that would be cool yes! I just realized that custom plugins in Mattermost are written in Go, could be a good hackday for me ๐ to learn go
can it be that the merge op takes so much filesystem cache that the rest of the system becomes unresponsive?
Yes, super thanks AgitatedDove14 !
Iโll definitely check that out! ๐คฉ
No worries! I asked more to be informed, I don't have a real use-case behind. This means that you guys internally catch the argparser object somehow right? Because you could also simply use sys argv to find the parameters, right?
but the post_packages does not reinstalls the version 1.7.1
ha sorry itโs actually the number of shards that increased
I just checked if something changed in https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_config.html#web-login-authentication
AgitatedDove14 I cannot confirm at 100%, the context is different (see previous messages) but it could be the same bug behind the scene...
Sure! Here are the relevant parts:
` ...
Current configuration (clearml_agent v1.2.3, location: /tmp/.clearml_agent.3m6hdm1_.cfg):
...
agent.python_binary =
agent.package_manager.type = pip
agent.package_manager.pip_version = ==20.2.3
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 ...