2023-01-28 21:30:02
task 5b32a1cae2fb446b8ebdc6e9e33f3c77 pulled from fb3733fc812e48389850cdf0104782be by worker jktblrlap446:0
2023-01-28 21:30:09
Running Task 5b32a1cae2fb446b8ebdc6e9e33f3c77 inside docker: python3.7 arguments: []
2023-01-28 21:30:10
Executing: ['docker', 'run', '-t', '--gpus', 'all', '-v', '/run/user/29999/keyring/ssh:/run/user/29999/keyring/ssh', '-e', 'SSH_AUTH_SOCK=/run/user/29999/keyring/ssh', '-l', 'clearml-worker-id=jktblrlap446:0', '-l', 'clearml-parent-worker-id=jk...
CostlyOstrich36 I have cloned the asteroid code from https://github.com/thepycoder/asteroid_example.git and am trying to run get_data.py, process_data.py, and model_training.py. I have connected to the ClearML server; the status has been showing "running" for around 3 hours.
AgitatedDove14 Where do I get the queue name from? Can you please tell me? I am a beginner with ClearML.
I followed the steps below:
pip install clearml-agent
clearml-agent init
clearml-agent execute --id <taskid>
@<1523701070390366208:profile|CostlyOstrich36> caught exception in thread [main]",
"stacktrace": ["org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];",
"at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-7.16.2.jar:7.16.2]",
"at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-7.16.2.jar:7.16.2]",
"at or...
I went through the docs and did the same, but I am still getting an error.
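The AccessDeniedException on /usr/share/elasticsearch/data/nodes in the stack trace above usually points to a host directory mounted into the container that the Elasticsearch process cannot write to. Below is a minimal sketch for checking ownership on the host. It assumes the container runs as uid 1000 and that the data directory is mounted from /opt/clearml/data/elastic_7. Both are assumptions based on the default ClearML server setup, not details from this thread, and `find_foreign_owned` is a hypothetical helper name.

```python
import os

# Assumption: the Elasticsearch process inside the container runs as uid 1000.
EXPECTED_UID = 1000


def find_foreign_owned(root, expected_uid=EXPECTED_UID):
    """Return every path under root (root included) not owned by expected_uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then each file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched


# Example call (the path is an assumption, adjust it to your own mount):
# print(find_foreign_owned("/opt/clearml/data/elastic_7"))
```

If the list is not empty, changing the owner of the mounted directory to the uid the container uses is the usual next step. An empty list means ownership is probably not the cause.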
Hi @<1523701087100473344:profile|SuccessfulKoala55>, no, it is picking up different package versions compared with local execution.
Hi CostlyOstrich36, I have already given an alias.
dataset = Dataset.get(
    dataset_project=global_config.PROJECT_NAME,
    dataset_name='raw_asteroid_dataset',
    alias='my_raw_dataset',
)
@<1523701087100473344:profile|SuccessfulKoala55> I want the agent to use the packages that are installed on my system. How can I do that?
@<1523701087100473344:profile|SuccessfulKoala55> I have copied code from None. There is a file, pipeline.py, that has two calls: pipe.start_locally(run_pipeline_steps_locally=True) and pipe.start(). When I run the code with pipe.start_locally, it works fine with SQLAlchemy version 1.4.39. When I try to run with pipe.start() while the agent is running, it creates a virtual environment with SQLAlchemy of versi...
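To see exactly which packages differ between the local run and the agent's virtual environment, the two package lists can be compared directly. The sketch below assumes both lists are available as `pip freeze` style text, for example the local output of `pip freeze` and the package list the agent prints in the task console. `parse_freeze` and `diff_versions` are hypothetical helper names, not part of ClearML.

```python
def parse_freeze(text):
    """Parse `pip freeze` style lines ("name==version") into a dict."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries without a pinned version.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Package names are case-insensitive, so normalise to lower case.
        versions[name.strip().lower()] = version.strip()
    return versions


def diff_versions(local_text, agent_text):
    """Return {package: (local_version, agent_version)} for mismatches."""
    local = parse_freeze(local_text)
    agent = parse_freeze(agent_text)
    shared = sorted(local.keys() & agent.keys())
    return {
        name: (local[name], agent[name])
        for name in shared
        if local[name] != agent[name]
    }
```

Once the mismatched packages are known, pinning them (for instance in the repository's requirements.txt) is one way to make the agent install the same versions as the local environment.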
No worries, the issue got resolved.
My Elasticsearch container is restarting.
AgitatedDove14 I have cloned the code from https://github.com/thepycoder/asteroid_example.git and am running the pipeline.py file using the above command. In the ClearML app console I am getting this error: " clearml.automation.job - WARNING - Error enqueuing Task <clearml.task.Task object at 0x7fc815fadfa0> to CPU Queue: Could not find queue named "CPU Queue". Please see the attached snapshot.
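The warning above says the server has no queue named "CPU Queue", so a quick check is to list the queues that do exist and compare names. The sketch below does that. `match_queue` and `list_queue_names` are hypothetical helper names. `list_queue_names` assumes the `clearml` package is installed and configured, and that `APIClient().queues.get_all()` returns objects with a `name` attribute, which should be verified against the installed ClearML version.

```python
def match_queue(available, wanted):
    """Return the name in `available` equal to `wanted`, ignoring case and
    surrounding whitespace, or None if there is no such queue."""
    wanted_key = wanted.strip().lower()
    for name in available:
        if name.strip().lower() == wanted_key:
            return name
    return None


def list_queue_names():
    """Fetch queue names from the ClearML server (assumption: clearml is
    installed and clearml.conf points at a reachable server)."""
    # Imported lazily so match_queue stays usable without clearml installed.
    from clearml.backend_api.session.client import APIClient

    client = APIClient()
    return [queue.name for queue in client.queues.get_all()]


# Example call against a live server:
# print(match_queue(list_queue_names(), "CPU Queue"))
```

If the result is None, the two usual options are to create a queue with that exact name in the web UI, or to point the pipeline at a queue that already exists and that an agent is listening on.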
Does that mean ClearML and Docker should be in the same path?