I will check that. Do you think we could bypass it by using Task.create and passing all the needed params?
I tried playing with those, but I did not manage to influence the source code detection. I can modify the env variables, but unfortunately nothing changes on the ClearML server.
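Try these values:
import os
from clearml import Task

# Set the overrides before Task.init so they are in place when the repository is detected
os.environ.update({
    'CLEARML_VCS_COMMIT_ID': '<commit_id>',
    'CLEARML_VCS_BRANCH': 'origin/master',
    'CLEARML_VCS_DIFF': '',
    'CLEARML_VCS_STATUS': '',
    'CLEARML_VCS_ROOT': '.',
    'CLEARML_VCS_REPO_URL': '<repo_url>',  # placeholder: the URL is missing from the original message
})
task = Task.init(...)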
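If hard-coding these values is not practical, they can be read from the local checkout instead. The following is a minimal sketch, not part of ClearML: `build_vcs_env` and `_git` are hypothetical helper names, and whether ClearML accepts these exact value formats (for example `master` rather than `origin/master` for the branch) is an assumption to verify against your server.

```python
import subprocess
from typing import Callable, Dict, List


def _git(args: List[str], cwd: str) -> str:
    """Run a git command in `cwd` and return its stripped stdout ('' on failure)."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, cwd is invalid, or the command failed
        return ""
    return result.stdout.strip()


def build_vcs_env(
    repo_root: str, run_git: Callable[[List[str], str], str] = _git
) -> Dict[str, str]:
    """Collect values for the CLEARML_VCS_* variables from a local git checkout."""
    return {
        "CLEARML_VCS_ROOT": repo_root,
        "CLEARML_VCS_REPO_URL": run_git(["config", "--get", "remote.origin.url"], repo_root),
        "CLEARML_VCS_COMMIT_ID": run_git(["rev-parse", "HEAD"], repo_root),
        # Yields the local branch name (e.g. 'master'), not 'origin/master'
        "CLEARML_VCS_BRANCH": run_git(["rev-parse", "--abbrev-ref", "HEAD"], repo_root),
        "CLEARML_VCS_STATUS": run_git(["status", "-s"], repo_root),
        "CLEARML_VCS_DIFF": run_git(["diff"], repo_root),
    }


# Intended usage (requires clearml and a configured server, so left commented out):
#   import os
#   from clearml import Task
#   os.environ.update(build_vcs_env("."))
#   task = Task.init(...)
```

The `run_git` parameter exists only so the git calls can be swapped out, for instance when the process runs somewhere git is not installed.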
No problem. I guess this might be a small visualisation bug, but I really have the impression that these workers still pick up tasks, which is strange. I should test again to be sure.
The flask command is run inside the git project, which is what makes this behavior strange. It is executed in ~/code/repo/ as flask train ...
Hi @<1556812486840160256:profile|SuccessfulRaven86>
it does not when I run a flask command inside my codebase. Is this expected behavior? Do you have any workarounds for this?
Hmm, where do you have your Task.init?
(btw: what's the use case of a flask app tracking?)
Then I deleted those workers,
How did you delete those workers? The autoscaler is supposed to spin the EC2 instances down when they are idle, so in theory there is no need for a manual spin-down.
No, Task.create is for creating an external Task, not for logging your own process.
That said, you can probably override the git repo with env vars.
I cannot modify an autoscaler currently running
Yes, this is a known limitation, and I know they are working on fixing it for the next version.
We basically have flask commands that trigger specific behaviors. ...
Oh I see now, I suspect the issue is that the flask command is not executed from within the git project?!
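One way to check this suspicion is to print where the process actually runs from and where its entry script lives. The sketch below uses only the standard library; `find_git_root` and `describe_launch_context` are hypothetical names, and the idea that ClearML's repository detection depends on the entry script location is an assumption, not something confirmed here. With `flask train ...`, `sys.argv[0]` usually points at the `flask` executable installed in the Python environment, which may sit outside the repository even when the working directory is inside it.

```python
import os
import sys
from pathlib import Path
from typing import Optional


def find_git_root(start: str) -> Optional[Path]:
    """Walk up from `start` and return the first directory containing `.git`."""
    current = Path(start).resolve()
    if current.is_file():
        current = current.parent
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


def describe_launch_context() -> dict:
    """Report the working directory, the entry script, and their git roots (if any)."""
    entry_script = sys.argv[0] or ""
    return {
        "cwd": os.getcwd(),
        "cwd_git_root": find_git_root(os.getcwd()),
        "entry_script": entry_script,
        "entry_script_git_root": find_git_root(entry_script) if entry_script else None,
    }


if __name__ == "__main__":
    # Call this at the top of the flask command, just before Task.init
    for key, value in describe_launch_context().items():
        print(f"{key}: {value}")
```

If `cwd_git_root` is set but `entry_script_git_root` is `None`, the command is launched from inside the project while its entry point lives elsewhere, which would be consistent with the missing repository information.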
@<1556812486840160256:profile|SuccessfulRaven86> is the issue with flask reproducible? If so, could you open a GitHub issue so we do not forget to look into it?
I have my Task.init inside a train() function inside the flask command. We basically have flask commands that trigger specific behaviors. When running it locally, everything works properly except the repository information. The use case is linked to the way our codebase works. For example, I run flask train {arguments} and it triggers the training of a model (that I want to track).
I stopped the autoscaler and deleted it manually. I did it because I want to test and debug multiple configurations of the autoscaler and I cannot modify an autoscaler currently running (maybe because I am not the manager of the workspace).
@<1523701205467926528:profile|AgitatedDove14> If you have any other insights, please do not hesitate! Thanks a lot.