Answered

Hi again!
I am doing batch inference from a parent task (that defines a base docker image). However, I've encountered an issue where the task takes several minutes (approximately 3-5 minutes) specifically when it reaches the stage of
"Environment setup completed successfully
Starting Task Execution:"
I'm uncertain about the cause of this delay. Could you advise if this is considered normal?
Thank you in advance.

  
  
Posted 4 months ago

Answers 11


Hi @<1529633468214939648:profile|CostlyElephant1>
what seems to be the issue? I could not locate anything in the log

"Environment setup completed successfully
Starting Task Execution:"

Do you mean it takes a long time to set up the environment inside the container?

CLEARML_AGENT_SKIP_PIP_VENV_INSTALL and CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL,

It seems to be working: as you can see, no virtual environment is created, and the only thing that is installed is the clearml-agent that is missing from the base container.

  
  
Posted 2 months ago

Hello @<1523701205467926528:profile|AgitatedDove14> , thank you for addressing my concern. It seems that skipping the venv is working correctly, and everything within the container is properly configured to start. However, there is still a delay of approximately 2 minutes between the completion of setup, when the console log shows "Starting Task Execution", and the actual start of the inference logic. During this period, no additional logs are generated.

  
  
Posted 2 months ago

However, there is still a delay of approximately 2 minutes between the completion of setup,

Where is that delay in the log?
(btw: it seems your container is missing clearml-agent & git, installing those might add some time)

  
  
Posted 2 months ago

Hi @<1523701205467926528:profile|AgitatedDove14> !
Thanks again for following up on this thread.

Perhaps the delay is not easy to read in the log, but it occurs just after "Starting Task Execution:"

Environment setup completed successfully
Starting Task Execution:

Here this new entry in the log is 2 min after env completed =>
1702378941039 box132 DEBUG 2023-12-12 11:02:16,112 - clearml.model - INFO - Selected model id: 9be79667ca644d7dbdf26732345f5415

So, the environment is created OK, but then it takes a consistent 2 min before actually starting the task.
No additional info is printed in the log during this wait time.

No worries about the missing clearml-agent & git; we managed to avoid the venv creation, as the docker container already has the environment set up. So I think our issue is not related to the venv setup itself, but to the start after it.
We are looking into the "clearml_agent execute" command, as it is called right after the env setup...?

  
  
Posted 2 months ago

Hi @<1523701205467926528:profile|AgitatedDove14>
Yes, it was indeed in our code! After looking in depth, we found that the loading of .cu and .cpp files was the root of the issue, slowing down the batch inference. Thanks a lot for your support!!
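
One way to narrow down a slow stage like this is to time each step of the script separately. The following is only a minimal sketch, not the code used in this thread: the `stage` helper and the stage names are hypothetical, and the `time.sleep` calls are placeholders for the real work (for example, loading the .cu and .cpp files).

```python
import time
from contextlib import contextmanager


@contextmanager
def stage(name, timings=None):
    """Print (and optionally record) how long the wrapped block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if timings is not None:
            timings[name] = elapsed
        # flush=True so the line reaches the task console log immediately
        print(f"[{name}] {elapsed:.2f}s", flush=True)


timings = {}

# Hypothetical stage: placeholder for the real extension loading
with stage("load_extensions", timings):
    time.sleep(0.05)

# Hypothetical stage: placeholder for the real inference call
with stage("inference", timings):
    time.sleep(0.01)

slowest = max(timings, key=timings.get)
print("slowest stage:", slowest, flush=True)
```

Each stage then shows up as its own line in the console log, so a silent gap can be attributed to a specific step.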

  
  
Posted 2 months ago

Hi @<1523701205467926528:profile|AgitatedDove14> , would you be so kind as to take a look at this issue?
We still have 2 minutes between the log line
"Starting Task Execution:"
and the actual start of our inference logic. The log gives us no extra information to help improve this slow task start time.
Thanks a lot for any feedback!

  
  
Posted 2 months ago

Here this new entry in the log is 2 min after env completed =>

1702378941039 box132 DEBUG 2023-12-12 11:02:16,112 - clearml.model - INFO - Selected model id: 9be79667ca644d7dbdf26732345f5415

This seems to be something in your code. Just add print("starting") in your entry Python file, before any imports (because they might actually do something).
From the agent's perspective, after printing "Starting Task Execution:" it literally calls the Python script, nothing else ...

  
  
Posted 2 months ago

Also, as can be seen in the docker args, I tried using CLEARML_AGENT_SKIP_PIP_VENV_INSTALL and CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL to avoid installing packages, as the container contains everything needed to run the task, but I'm not sure it had any effect.

  
  
Posted 4 months ago

Hi @<1523701070390366208:profile|CostlyOstrich36>
Did you have the time to check the full log I sent you?
Most appreciated!

  
  
Posted 4 months ago

Hi @<1529633468214939648:profile|CostlyElephant1> , it looks like that's the environment setup. Can you share the full log?

  
  
Posted 4 months ago

Here it is @<1523701070390366208:profile|CostlyOstrich36>
Thanks for your feedback

  
  
Posted 4 months ago