I Want To Run My Clearml Task On An Agent In K8S Together With A Memory Profiler (Maybe

Answered

I want to run my clearml task on an agent in k8s together with a memory profiler (maybe https://github.com/plasma-umass/scalene or https://github.com/pythonspeed/filprofiler). The problem is that both of them have to be the entry point: you run scalene my_clearml_task.py or fil-profile run my_clearml_task.py. Any ideas on how to do this?

  				
Posted 
	3 years ago

		
					FiercePenguin76
				


Answers 30

the task is running, but there is no log output from fil-profiler (when run entirely locally, it does some logging at the very beginning)

  				
FiercePenguin76

Hmm, that is odd.
Can you send the full log?

  				
AgitatedDove14

AgitatedDove14 I did exactly that.

  				
FiercePenguin76

for some reason, the previous time I ran it, the repo, commit and working dir were all empty

  				
FiercePenguin76

but this time they were all present, and the command was run as expected:

  				
FiercePenguin76

=fil-profile= Preparing to write to fil-result/2021-08-19T20:23:30.905
=fil-profile= Wrote memory usage flamegraph to fil-result/2021-08-19T20:23:30.905/out-of-memory.svg
=fil-profile= Wrote memory usage flamegraph to fil-result/2021-08-19T20:23:30.905/out-of-memory-reversed.svg

  				
FiercePenguin76

FiercePenguin76 in the Task's EXECUTION tab, under "script path", change the value to "-m filprofiler run catboost_train.py".
It should work (assuming "catboost_train.py" is in the working directory).

  				
AgitatedDove14

Adding venv into cache: /root/.clearml/venvs-builds/3.8
Running task id [aa2aca203f6b46b0843699d1da373b25]:
[.]$ /root/.clearml/venvs-builds/3.8/bin/python -u '/root/.clearml/venvs-builds/3.8/code/-m filprofiler run catboost_train.py'

  				
FiercePenguin76

I got it working!

  				
FiercePenguin76

Adding venv into cache: /root/.clearml/venvs-builds/3.8
Running task id [8c65e88253034bd5a8dba923062066c1]:
[pipelines]$ /root/.clearml/venvs-builds/3.8/bin/python -u -m filprofiler run catboost_train.py

  				
FiercePenguin76

“So maybe the path is related to the fact I have venv caching on?”

Hmm, could be...
Can you quickly disable the caching and try?

  				
AgitatedDove14

“but this will be invoked before fil-profiler starts generating them”

I thought it would flush in the background 😞
You can, however, configure the profiler to write to a specific folder and then mount that folder to the host machine:
in the "base docker args" section, add -v /host/folder/for/profiler:/inside/container/profile

  				
AgitatedDove14
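
A rough sketch of how the two settings from the reply above could be put together. It assumes Fil accepts an output directory through `-o` placed before `run`; that is worth checking against `fil-profile --help` for the installed version. The helper names `volume_arg` and `script_path_entry` are made up for illustration, while the paths and `catboost_train.py` are the ones quoted in this thread.

```python
from pathlib import PurePosixPath


def volume_arg(host_dir: str, container_dir: str) -> str:
    """Build a `-v host:container` bind-mount argument for "base docker args"."""
    host = PurePosixPath(host_dir)
    container = PurePosixPath(container_dir)
    # Docker bind mounts are normally given as absolute paths on both sides.
    if not (host.is_absolute() and container.is_absolute()):
        raise ValueError("both paths must be absolute")
    return f"-v {host}:{container}"


def script_path_entry(script: str, container_dir: str) -> str:
    """Build a "script path" value that points Fil at the mounted folder.

    ASSUMPTION: `-o <dir>` selects Fil's output directory and goes before `run`.
    """
    return f"-m filprofiler -o {PurePosixPath(container_dir)} run {script}"


docker_args = volume_arg("/host/folder/for/profiler", "/inside/container/profile")
script_path = script_path_entry("catboost_train.py", "/inside/container/profile")
print(docker_args)
print(script_path)
```

With this layout the reports would be written inside the container to the mounted folder, so they should remain on the host after the container exits.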

Did it have any errors in the local run up to the task.execute_remotely call?

You can try to hack it: in the UI, under the EXECUTION tab, add the prefix -m scalene to the script path, something like: -m scalene my_clearml_task.py. Can you try that? (Make sure scalene is installed, or listed under your installed packages.)

  				
TimelyPenguin76 (Administrator)

btw, you can also run using python -m filprofiler run my_clearml_task.py

  				
FiercePenguin76
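
As a small illustration of the `python -m` form mentioned above, the sketch below builds the argument list for either profiler. The function name `profiled_argv` is hypothetical; the two command shapes are the `python -m filprofiler run <script>` form from this thread and Scalene's `python -m scalene <script>` form.

```python
import sys
from typing import List


def profiled_argv(script: str, profiler: str = "filprofiler") -> List[str]:
    """Return an argv that runs `script` under a profiler via `python -m`."""
    if profiler == "filprofiler":
        # Fil expects its `run` subcommand before the script name.
        return [sys.executable, "-m", "filprofiler", "run", script]
    if profiler == "scalene":
        # Scalene takes the script directly after the module name.
        return [sys.executable, "-m", "scalene", script]
    raise ValueError(f"unknown profiler: {profiler}")


print(profiled_argv("my_clearml_task.py"))
print(profiled_argv("my_clearml_task.py", profiler="scalene"))
```

The part after the interpreter is what would go into the task's "script path" field when using the `-m` prefix trick discussed in this thread.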

[.]$ /root/.clearml/venvs-builds/3.8/bin/python -u '/root/.clearml/venvs-builds/3.8/code/-m filprofiler run catboost_train.py'

  				
FiercePenguin76

now the problem is: fil-profiler persists the reports and then exits

  				
FiercePenguin76

and I have no way to save those as clearml artifacts

  				
FiercePenguin76

I guess that’s the only option, thanks for your help

  				
FiercePenguin76

how do you run it locally? the same?

  				
TimelyPenguin76 (Administrator)

[.]$ /root/.clearml/venvs-builds/3.8/bin/python -u '/root/.clearml/venvs-builds/3.8/code/-m filprofiler run catboost_train.py' doesn’t look good

  				
FiercePenguin76
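
To see why that command line "doesn't look good", the sketch below splits the two command lines logged in this thread the way a POSIX shell would. It only uses Python's standard `shlex` module; the variable names are illustrative.

```python
import shlex

# Command line from the failing run: the whole "script path" value was
# appended to the code directory and quoted as a single argument.
FAILED = (
    "/root/.clearml/venvs-builds/3.8/bin/python -u "
    "'/root/.clearml/venvs-builds/3.8/code/-m filprofiler run catboost_train.py'"
)

# Command line from the run that worked: `-m filprofiler run ...` reaches
# the interpreter as separate arguments.
WORKED = (
    "/root/.clearml/venvs-builds/3.8/bin/python -u "
    "-m filprofiler run catboost_train.py"
)

failed_argv = shlex.split(FAILED)
worked_argv = shlex.split(WORKED)

# In the failing form Python receives one path-like argument after `-u`,
# so it looks for a file with that literal name instead of running a module.
print(len(failed_argv), repr(failed_argv[-1]))
print(len(worked_argv), worked_argv[2:])
```

In the failing form the interpreter never sees a `-m` option at all, which matches the absence of any fil-profiler output in that run.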

yes, same

  				
FiercePenguin76

So maybe the path is related to the fact I have venv caching on?

  				
FiercePenguin76

“and I have no way to save those as clearml artifacts”

You could do this at the end of the code:
task.upload_artifact('profiler', Path('./fil-result/'))
wdyt?

  				
AgitatedDove14

But here you can see why it didn’t succeed

  				
FiercePenguin76

No worries, I'll see if I can replicate it anyhow

  				
AgitatedDove14

nope, I need to contact the devops team for that, and that cannot happen earlier than Monday

  				
FiercePenguin76

but this will be invoked before fil-profiler starts generating them

  				
FiercePenguin76
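
One possible way around this ordering problem, sketched under assumptions rather than tested on an agent: run the profiled script as a child process from a small launcher, wait for it to exit, and only then pick up the reports. The function `run_then_collect` and its `upload` parameter are hypothetical; with ClearML, `upload` might wrap the `task.upload_artifact` call suggested in this thread, but how the launcher obtains its Task on the agent is left open here.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, List, Sequence


def run_then_collect(
    command: Sequence[str],
    report_dir: Path,
    upload: Callable[[str, Path], object],
) -> List[Path]:
    """Run `command` to completion, then pass every report file to `upload`."""
    # Block until the profiled process has exited, so that any reports the
    # profiler writes on exit are already on disk.
    completed = subprocess.run(list(command), check=False)

    # The Fil log lines in this thread show .svg flamegraphs below the
    # report directory; collect whatever is there after the run.
    reports = sorted(report_dir.rglob("*.svg")) if report_dir.is_dir() else []
    for report in reports:
        # Artifact name (illustrative choice): path relative to report_dir.
        upload(str(report.relative_to(report_dir)), report)

    if completed.returncode != 0:
        print(
            f"profiled command exited with code {completed.returncode}",
            file=sys.stderr,
        )
    return reports
```

A launcher could call it with `[sys.executable, "-m", "filprofiler", "run", "catboost_train.py"]` as the command and `Path("fil-result")` as the report directory. Reports are collected even when the child exits with a non-zero code, since the out-of-memory flamegraphs shown in this thread are produced in exactly that situation.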

not a full log yet (will have to inspect it to not have any non-public info), but something potentially interesting

  				
FiercePenguin76

so probably, my question can be transformed into: “Can I have control over what command is used to start my script on clearml-agent”

  				
FiercePenguin76

“assuming the “catboost_train.py” is in the working directory” - maybe I get this part wrong?

  				
FiercePenguin76
