Awesome, thanks WackyRabbit7, AgitatedDove14!
AgitatedDove14 This looks awesome! Unfortunately it would require a lot of changes in my current code, so for that project I found a workaround 🙂 But I will surely use it for the next pipelines I build!
when can we expect the next self-hosted release, btw?
So two possible cases for trains-agent-1, either:
- it picks a new experiment -> it randomly shows one of the two experiments in the "workers" tab
- there is no new experiment in the default queue to start -> it randomly shows either no experiment or the one it is currently running
AgitatedDove14 So I'll just replace `task = clearml.Task.get_task(clearml.config.get_remote_task_id())` with `Task.init()` and wait for your fix 🙂
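i.e. something along these lines, as a minimal sketch (project/task names are placeholders, and the old pattern is only shown as a comment):
```python
from clearml import Task

# old pattern (as described above):
# from clearml.config import get_remote_task_id
# task = Task.get_task(task_id=get_remote_task_id())

# replacement: when running under an agent, Task.init() should re-attach to the
# remote task instead of creating a new one (placeholder names below)
task = Task.init(project_name="my_project", task_name="my_task")
```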
In my CI tests I want to reproduce a run in an agent, because the environment changes and some things break in agents but not locally
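For example, a rough, untested sketch of such a CI check (the task ID and queue name are placeholders, and it assumes a clearml-agent is serving that queue):
```python
import time
from clearml import Task

REFERENCE_TASK_ID = "<reference-task-id>"  # placeholder: the task to reproduce
AGENT_QUEUE = "ci"                         # placeholder: queue served by the CI agent

# clone the reference task and send the clone to the agent queue
cloned = Task.clone(source_task=REFERENCE_TASK_ID, name="CI reproduction run")
Task.enqueue(cloned, queue_name=AGENT_QUEUE)

# poll until the agent finishes executing the clone
while True:
    cloned.reload()
    if cloned.get_status() in ("completed", "failed", "stopped"):
        break
    time.sleep(30)

assert cloned.get_status() == "completed", "agent run did not complete"
```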
Hi SuccessfulKoala55, super, that's what I was looking for
to pass secrets to each experiment
Note: I can verify that post_packages is well picked up by the trains-agent, since in the experiment log I see:
```
agent.package_manager.type = pip
agent.package_manager.pip_version = ==20.2.3
agent.package_manager.system_site_packages = true
agent.package_manager.force_upgrade = false
agent.package_manager.post_packages.0 = PyJWT==1.7.1
```
yes, what happens in the case of installation with pip wheel files?
Basically what I did is:
```python
if task_name is not None:
    project_name = parent_task.get_project_name()
    task = Task.get_task(project_name=project_name, task_name=task_name)
    if task is not None:
        return task
# otherwise I create the Task here
```
with what I shared above, I now get: `docker: Error response from daemon: network 'host' not found.`
yes but they are in plain text and I would like to avoid that
I also ran `sudo apt install nvidia-cuda-toolkit`
Hi PompousParrot44 , you could have a Controller task running in the services queue that periodically schedules the task you want to run
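Something along these lines, as a minimal untested sketch (the task ID, queue name and period are placeholders):
```python
import time
from clearml import Task

# the controller itself runs as a task in the "services" queue
Task.init(project_name="controllers", task_name="periodic scheduler")

TEMPLATE_TASK_ID = "<template-task-id>"  # placeholder: the task you want to re-run
EXECUTION_QUEUE = "default"              # placeholder: queue served by an agent
PERIOD_SEC = 60 * 60                     # schedule once an hour

while True:
    # clone the template task and enqueue the clone for execution by an agent
    cloned = Task.clone(source_task=TEMPLATE_TASK_ID, name="scheduled run")
    Task.enqueue(cloned, queue_name=EXECUTION_QUEUE)
    time.sleep(PERIOD_SEC)
```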
and with this setup I can use the GPU without any problem, meaning that the wheel does contain the CUDA runtime
nvm, the bug might be on my side. I will open an issue if I find an easily reproducible example
Thanks! Corrected both, now it's building
The simple workaround I imagined (not tested) at the moment is to sleep for 2 minutes after closing the task, to keep the clearml-agent busy until the instance is shut down:
```python
self.clearml_task.mark_stopped()
self.clearml_task.close()
time.sleep(120)  # prevent the agent from picking up new tasks
```
ok, but will it install the engine and its dependencies as expected?
Hi TimelyPenguin76 , I guess it tries to spin them down a second time, hence the double print
I also discovered https://h2oai.github.io/wave/ last week, would be awesome to be able to deploy it in the same manner
Hi AnxiousSeal95, I hope you had nice holidays! Thanks for the update! I discovered h2o when looking for ways to deploy dashboards with apps like streamlit. Most likely I will use either streamlit deployed through clearml or h2o as standalone if ClearML won't support deploying apps (which is totally fine, no offense there 🙂)
AgitatedDove14 yes! I now realise that the ignite event callbacks don't seem to be fired (I tried to print a debug message from a custom Events.ITERATION_COMPLETED handler) and I cannot see it logged
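For reference, the kind of debug handler I mean, as a minimal self-contained sketch (the dummy train step stands in for the real one):
```python
from ignite.engine import Engine, Events

def train_step(engine, batch):
    # dummy training step standing in for the real one
    return batch

trainer = Engine(train_step)

@trainer.on(Events.ITERATION_COMPLETED)
def debug_iteration(engine):
    # if this never prints, the ITERATION_COMPLETED callback is indeed not fired
    print(f"[debug] iteration {engine.state.iteration} completed")

trainer.run(range(3), max_epochs=1)
```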