Hi guys, just curious here, what was the final issue?
Also out of curiosity, what does that mean: "1.12.2 because of some bug that makes fastai lag 2x"?
"regular" worker will run one job at a time, services worker will spin multiple tasks at the same time But their setup (i.e. before running the actual task) is one at a time..
in my case using self-hosted and agent inside a docker container:
47:45 : task foo pulled
[git clone, pip install, check that all requirements are satisfied, nothing new is downloaded]
48:16 : start training
a minute of silence between the first two messages, then two more minutes until a flood of logs. Basically 3 minutes total before this task (which does almost nothing; I'm just using it for testing) starts.
sometimes I get "lucky" and see something more like what I expect: total experiment time < 1 min (and I have evidence of this happening: logs start-to-finish in under a minute). But other times the same task will take 5-10 minutes.
same worker, same queue, just one worker serving it... I am so utterly perplexed by the variation in how long things take. my clearml API server is running on a beefy 32 core machine and not much else is happening right now...
apologies - just trying to keep sensitive data out of screenshot
are you on clearml agent 1.8.0?
(I'm noticing sometimes I'm just missing logs such as "Running task id.." entirely)
oooh thank you, i was hoping for some sort of debugging tips like that. will do.
from a speed-of-clearing-a-queue perspective, is a services-mode queue better or worse than having many workers "always up"?
from task pick-up to "git clone" is now ~30s, much better.
This is "spent" calling apt update && update install && pip install clearml-agent
if you have those preinstalled it should be quick
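Baking those into the image is straightforward; a sketch, assuming a plain Python base image (the image name/tag are made up):
```
# preinstall git + clearml-agent so the container entrypoint
# doesn't have to run apt/pip on every task start
docker build -t my-preloaded-image - <<'EOF'
FROM python:3.10-slim
RUN apt-get update && apt-get install -y git \
 && pip install clearml-agent
EOF
```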
though as far as I understand, the recommendation is still to not run workers-in-docker like this:
if you do not want it to install anything and just use the existing venv (leaving the venv as is), and if something is missing then so be it, then yes, sure, that's the way to go
I think a proper screenshot of the full log with some information redacted is the way to go. Otherwise we are just guessing in the dark
i would love some advice on that though - should I be using services mode + docker with some max # of instances to spin up multiple tasks instead?
my thinking was to avoid some of the docker overhead. but i did try this approach previously and found that the container limit wasn't exactly respected.
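fwiw, if you do try services mode again, the concurrency cap is an optional argument to --services-mode; a sketch (the limit of 5 and the queue name are just examples):
```
# serve the queue with at most 5 concurrent task containers
clearml-agent daemon --queue services --services-mode 5 --docker
```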
1.12.2 because of some bug that makes fastai lag 2x
1.8.1rc2 because it fixes an annoying git clone bug
fwiw - i'm starting to wonder if there's a difference between me "resetting the task" vs cloning it.
ha! yup. that was it exactly. I posted about it too lol
BTW: you can also just add -e "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1" to the docker args (under the Execution tab) to override the setting of the docker.
you can also add "export;" to the docker startup bash script section (do not add "#!/bin/bash", just the actual script) to get a list of all the environment variables inside the docker, just in case
I know that the git clone and the pip verify-all-installed steps are normal. But for some reason, in Michael's screenshot, I don't see those steps ...
normally when a new package needs to be installed, it shows up in the Console tab
yeah... still seeing variance from 1m to 10m for the same task. been testing parallel execution for hours.
starting to make sense. thanks for your explanation.
would those containers best be started from something in services mode? or is it possible to get no-overhead with my approach of worker-inside-docker?
i designed my tasks as different functions, based mostly on what metrics to report and which artifacts are best cached (and how to best leverage comparisons of tasks). they do require cpu, but not a ton.
I'm now experimenting with lumping a lot of stuff into one big task and seeing how this goes instead. i have to be more selective in the reporting of metrics and plots though.
there is almost zero overhead if your docker container already has everything (including the agent) preinstalled and you set it with CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
it should then basically just run the code.
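concretely, something like this (a sketch; the image name and queue are assumptions, and clearml.conf is assumed to hold your credentials):
```
# worker-inside-docker with everything preinstalled:
# the env var tells the agent to skip venv creation / pip install entirely
docker run -d \
  -e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1 \
  -v "$HOME/clearml.conf:/root/clearml.conf" \
  my-preloaded-image \
  clearml-agent daemon --queue default
```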
i just need to understand what I should be expecting. I thought going from putting it into the queue in the UI to "running my code remotely" (esp. with packages preloaded) should be a fairly fast turnaround - certainly not three minutes... (i'll have to change my whole pipeline design if this is the case)
I can see all the steps like git clone,
git clone has nothing to do with "env setup"; this is bringing in the code, and you cannot skip that one. That said, this is why the git repo itself is cached on the host machine, so it is fast
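you can see that cache yourself; a sketch, assuming default clearml.conf settings (the location is configurable via agent.vcs_cache.path):
```
# default location of the agent's git clone cache on the host
ls ~/.clearml/vcs-cache
```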
... There may be some odd package that needs to be installed because one of our DS is experimenting ... But in all those cases we can see what is happening.
even if everything is preinstalled, it verifies the packages match, and this might take a long time. It's just pip being pip (if you want the extreme case, try the same with conda; that one is even slower)
the output of that verification stage is that no new packages are installed (otherwise, good thing we checked 🙂)
bottom line: if you want to skip the pip verification/installation, pass CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
btw: i'm checking regarding the GH issue
okay that's a similar setup to mine... that's interesting.
much more in line with my expectation.
hard to see with your crop-outs here and there ...