Thanks! I'll wait for the release notes/docs update
It's missing the repository information, of course, but the 'configuration/Args' were logged. So something weird is happening in identifying the repository.
Because setting env vars and ensuring they exist on the remote machine during execution, etc., is more complicated.
There are always ways around it; I was just wondering what the expected flow is.
Setting the endpoint will not be the only thing missing, though, so unfortunately that's insufficient.
` # test_clearml.py
import shutil

import pytest

import clearml


@pytest.fixture
def clearml_task():
    # run ClearML in offline mode so the tests do not need a server
    clearml.Task.set_offline_mode(True)
    task = clearml.Task.init(project_name="test", task_name="test")
    yield task
    # close the task, clean up the offline session folder, and restore online mode
    task.close()
    shutil.rmtree(task.get_offline_mode_folder())
    clearml.Task.set_offline_mode(False)


# pytest only collects test classes whose names start with "Test"
class TestClearML:
    def test_something(self, clearml_task):
        assert True `
Run with:
` pytest test_clearml.py `
UPDATE: Apparently the quotation type matters for furl? I switched the ' to \" and it seems to work now.
Since the additional credentials are available to the autoscaler when it boots up (via the config file), I thought it could use those natively?
Does that make sense CostlyOstrich36? Any thoughts on how to treat this? For the time being I'm also perfectly happy to include something specific in extra_clearml_conf, but I'm not sure how to set sdk.aws.s3.credentials to be a list of dictionaries as needed.
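Something like this is what I'm imagining for the extra_clearml_conf content (a sketch with placeholder bucket names and keys; I'm not sure the field names are exactly right):
` sdk {
  aws {
    s3 {
      # one entry per bucket; all values below are placeholders
      credentials: [
        {
          bucket: "my-bucket"
          key: "AWS_ACCESS_KEY_ID"
          secret: "AWS_SECRET_ACCESS_KEY"
        },
      ]
    }
  }
} `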
Would be great if it is. We have a few files that change frequently and are quite large, and it would be quite a storage hit to save all of them.
I also tried setting agent.python_binary: "/usr/bin/python3.8", but it still uses Python 2.7?
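For context, this is roughly how I set it in clearml.conf (assuming the key lives under the agent section; the interpreter path is just my local one):
` agent {
  # full path to the interpreter the agent should use when building virtualenvs
  python_binary: "/usr/bin/python3.8"
} `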
I believe it is maybe a race condition that's tangential to clearml now...
Aw you deleted your response fast CostlyOstrich36 xD
Indeed it does not appear in ps aux, so I cannot simply kill it (or at least, find it).
I was wondering if it's maybe just a zombie in the server API or similar.
Yes, exactly. I have not yet had a chance to try this out -- should it work?
TimelyPenguin76 CostlyOstrich36 It seems a lot of manual configuration is required to get the EC2 instances up and running.
Would it not make sense to update the autoscaler (and example script) so that the config.yaml that's used for the autoscaler service is implicitly copied to the EC2 instances, and then anything in extra_clearml_conf is used to overwrite it?
I'm not sure I follow; what would that solution look like?
SuccessfulKoala55 could this be related to the monkey patching for the logging platform? We have our own logging handlers that we use in this case.
Oh and clearml-agent==1.1.2
That's what I thought too; it should only look for the CLEARML_TASK_ID environment variable?
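A quick sanity check from inside the process; only the environment variable name comes from the discussion above, the rest is a throwaway snippet:
` import os

# if this prints None, clearml should not think it is running inside an existing task
print("CLEARML_TASK_ID =", os.environ.get("CLEARML_TASK_ID")) `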
I'm running tests with pytest; it consumes/owns the stream.
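One thing I can try is disabling pytest's output capturing to see whether the stream ownership is really the problem (a standard pytest flag, nothing ClearML-specific):
` pytest -s test_clearml.py `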
Same result. This is frustrating, wtf happened :shocked_face_with_exploding_head:
This is also specifically the services queue worker I'm trying to debug.
CostlyOstrich36 I'm not sure what is holding it from spinning down. Unfortunately I was not around when this happened. Maybe it was AWS taking a while to terminate, or maybe it was just taking a while to register in the autoscaler.
The logs looked like this:
1. Recognizing an idle worker and spinning it down:
` 2022-09-19 12:27:33,197 - clearml.auto_scaler - INFO - Spin down instance cloud id 'i-058730639c72f91e1' `
2. Recognizing a new task is available, but the worker is still idle:
` 2022-09...
Is Task.create the way to go here?
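For context, something along these lines is what I mean; the project, repo and script values are just placeholders:
` from clearml import Task

# create a task entry that points at code to run later, without executing anything locally
task = Task.create(
    project_name="my-project",                 # placeholder
    task_name="created-not-executed",          # placeholder
    repo="https://github.com/me/my-repo.git",  # placeholder
    script="train.py",                         # placeholder
) `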
AgitatedDove14
I'll make a PR for it now, but long story short: you have the full log, yet the virtualenv version is not logged anywhere (the usual output from virtualenv just says which Python version is used, etc.).
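The PR is essentially about recording something like the output below alongside the rest of the environment info (a sketch, not the actual patch):
` import subprocess

# virtualenv --version prints virtualenv's own version, which is otherwise absent from the task log
version = subprocess.check_output(["virtualenv", "--version"], text=True).strip()
print("virtualenv:", version) `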
Could also be related to K8s, so pinging JuicyFox94 just in case.
I know, that should indeed be the default behaviour, but at least in my tests the use of --python ... was consistent, whereas for some reason this old virtualenv decided to use Python 2.7 otherwise.
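In other words, things were only reliable when the interpreter was pinned explicitly, i.e. the generic form of:
` virtualenv --python /usr/bin/python3.8 .venv `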