Hi PompousParrot44, you could have a Controller task running in the services queue that periodically schedules the task you want to run
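Something like this is what I have in mind - just a sketch with the clearml SDK, where the project/queue names, the template task id and the one-hour interval are all placeholders:
```
from time import sleep
from clearml import Task

# sketch of a controller running in the services queue: it periodically
# clones a template task and enqueues the clone for an agent to pick up
controller = Task.init(
    project_name="examples",             # placeholder
    task_name="periodic scheduler",      # placeholder
    task_type=Task.TaskTypes.controller,
)

TEMPLATE_TASK_ID = "<template-task-id>"  # placeholder: the task you want to re-run

while True:
    template = Task.get_task(task_id=TEMPLATE_TASK_ID)
    cloned = Task.clone(source_task=template, name="scheduled run")
    Task.enqueue(cloned, queue_name="default")  # placeholder queue
    sleep(60 * 60)  # once an hour
```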
Hi CostlyOstrich36! No, I am running in venv mode
AgitatedDove14 So what you are saying is that since I have trains-server 0.16.1, I should use trains>=0.16.1? And what about trains-agent? Only version 0.16 is released atm, which is the one I use
I finally found a workaround using cache, will detail the solution in the issue 👍
and saved locally, which is why the second task, which is not executed on the same machine, cannot access the file
So the new EventsIterator is responsible for the bug.
Is there a way for me to easily force the WebUI to always use the previous endpoint (v1.7)? I saw in the changes between v1.1.0 and v1.2.0 that the ES version was bumped to 7.16.2. I am using an external ES cluster, and its version is still 7.6.2. Could the incompatibility come from there? I’ll update the cluster to make sure it’s not the case
Thanks AgitatedDove14 !
Could we add this task.refresh() to the docs? Might be helpful for other users as well 🙂 OK! Maybe there is a middle ground: for artifacts already registered, simply return the entry; for artifacts not yet registered, contact the server to retrieve them
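Something like this hypothetical helper is what I mean by middle ground (get_artifact is made up, reload() is just how I'd hit the server):
```
from clearml import Task

def get_artifact(task: Task, name: str):
    # hypothetical helper illustrating the middle ground:
    # already registered -> return the entry without any server call
    if name in task.artifacts:
        return task.artifacts[name]
    # not registered -> refresh the task from the server and look again
    task.reload()
    return task.artifacts.get(name)
```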
It indeed has the old commit, so they match, no problem actually 🙂
yes, here is the error (the space at the end of the line is there)
` Applying uncommitted changes
Executing: ('git', 'apply'): b'error: corrupt patch at line 13\n'
Failed applying diff
trains_agent: ERROR: Failed applying git diff:
diff --git a/configs/2.2.2_from_scratch.yaml b/configs/2.2.2_from_scratch.yaml
index 9fece48..5816f78 100644
--- a/configs/2.2.2_from_scratch.yaml
+++ b/configs/2.2.2_from_scratch.yaml
@@ -136,7 +136,7 @@ data_processing:
optimizer:
type: 'RMSprop'
args:
- lr: 2.5e...
Oops, I spoke too fast, the json is actually not saved in s3
Still getting the same error, it is not taken into account 🤔
Yea, again I am trying to understand what I can do with what I have 😄 I would like to export the venv the agent installs into as an environment variable, so that an app I am using inside the Task can use the python packages installed by the agent, and I can easily control the packages through clearml
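Something along these lines is what I'm after - just a sketch, the variable names are made up:
```
import os
import sys

# when this runs inside the venv the agent built, sys.executable and
# sys.prefix point at that interpreter / venv, so they can be exported
# for the app launched from within the Task to pick up
os.environ["AGENT_PYTHON"] = sys.executable   # made-up variable name
os.environ["AGENT_VENV_PREFIX"] = sys.prefix  # made-up variable name
```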
AgitatedDove14 I eventually found a different way of achieving what I needed
in the UI the value is the correct one (not empty, a string)
Hi TimelyPenguin76 ,
trains-server: 0.16.1-320
trains: 0.15.1
trains-agent: 0.16
SuccessfulKoala55
In the docker-compose file, the apiserver service has environment settings for the ES host and port (CLEARML_ELASTIC_SERVICE_HOST and CLEARML_ELASTIC_SERVICE_PORT) - changing those will allow you to point the server to another ES service
The ES cluster is running on another machine, how can I set its IP in CLEARML_ELASTIC_SERVICE_HOST? Would I need to add the host to the networks of the apiserver service somehow? How can I do that?
ha wait, I removed the http:// from the host value and it worked 🎉
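For the record, the working bit of the docker-compose file looks roughly like this (the IP is a placeholder, and the host value has no scheme):
```
services:
  apiserver:
    environment:
      CLEARML_ELASTIC_SERVICE_HOST: 10.0.0.12  # external ES cluster, no http://
      CLEARML_ELASTIC_SERVICE_PORT: 9200
```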
Oof, now I cannot start the second controller in the services queue on that same second machine, it fails with
` Processing /tmp/build/80754af9/cffi_1605538068321/work
ERROR: Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/cffi_1605538068321/work'
clearml_agent: ERROR: Could not install task requirements!
Command '['/home/machine/.clearml/venvs-builds.1.3/3.6/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r'...
So two possible cases for trains-agent-1, either:
- It picks a new experiment -> the "workers" tab randomly shows one of the two experiments
- There is no new experiment in the default queue to start -> the "workers" tab randomly shows no experiment, or the one it is running
AgitatedDove14 So in the https://pytorch.org/ignite/_modules/ignite/handlers/early_stopping.html#EarlyStopping class I see that some info is logged (in the __call__ function), and I would like to have this info logged by clearml
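Something like this is the kind of thing I'm trying - a sketch assuming EarlyStopping logs through the standard logging module under the ignite.handlers namespace (the handler class is made up):
```
import logging
from clearml import Task

task = Task.init(project_name="examples", task_name="ignite early stopping")  # placeholders

class ClearMLTextHandler(logging.Handler):
    # made-up handler: forward log records to the ClearML console log
    def emit(self, record):
        task.get_logger().report_text(self.format(record))

# EarlyStopping's logger lives under ignite.handlers, so attaching the
# handler there should capture its messages through propagation
ignite_logger = logging.getLogger("ignite.handlers")
ignite_logger.setLevel(logging.INFO)
ignite_logger.addHandler(ClearMLTextHandler())
```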
Also tried task.get_logger().report_text(str(task.data.hyperparams)) -> AttributeError: 'Task' object has no attribute 'hyperparams'
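For context, this is roughly what I'm trying to get at (sketch, the task id is a placeholder), going through get_parameters() rather than task.data:
```
from clearml import Task

task = Task.get_task(task_id="<task-id>")  # placeholder id

# read the hyperparameters through the public API instead of task.data
params = task.get_parameters()  # flat dict, e.g. {"General/lr": "0.001", ...}
task.get_logger().report_text(str(params))
```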
AgitatedDove14 awesome! by "include it all" do you mean wizard for azure and gcp?
Opened an issue with the logs here > None
btw task._get_task_property('hyperparams') also gives me ValueError: Task has no hyperparams section defined