
Hey, guys! I have a problem. I launched ClearML Server and am trying to run a worker on another machine. When I run clearml-agent init, it can't verify the credentials or create a clearml.conf file. However, the clearml-init command successfully accepted my creds and created a clearml.conf file. Then, when I try to run clearml-agent daemon -d, it just sits there with no output, and it doesn't appear in the workers section of the web UI.

How do I make my worker run properly? Can I see daemon output logs somewhere?
Any help is much appreciated. Thank you!

  
  
Posted one year ago

Answers 39


@<1523701070390366208:profile|CostlyOstrich36>
Should I leave it as is or fill in the values in docker-compose for agent-services? I set it to localhost since agent-services is running together with the other clearml containers on one machine. I'm not sure why you have to fill in those values.
CLEARML_HOST_IP: "<my_clearml_server_ip>"
CLEARML_WEB_HOST: " None "
CLEARML_API_HOST: " None "
CLEARML_FILES_HOST: " None "

  
  
Posted one year ago

@<1523701087100473344:profile|SuccessfulKoala55>
I managed to create a clearml.conf file with clearml-agent init after fixing a proxy problem. Now I'm trying to run the daemon with this conf file. I suspect something is missing from it, since the request validator fails with a missing attribute.

  
  
Posted one year ago

Also, the previous problem was an incorrect proxy configuration on the agent machine.

  
  
Posted one year ago

@<1523701087100473344:profile|SuccessfulKoala55>
So, I ran it with debug and got this stack trace:
type_checker=validator.TYPE_CHECKER.redefine_many({
AttributeError: type object 'Draft4Validator' has no attribute 'TYPE_CHECKER'
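
An AttributeError like this usually points at the installed jsonschema package rather than at clearml.conf. The sketch below is one way to check that from the same Python environment the agent runs in. It assumes (this is not stated in the thread) that the TYPE_CHECKER attribute only exists on validator classes in newer jsonschema releases, roughly 3.0 and later; the helper names are made up for illustration.

```python
from importlib import metadata


def has_type_checker(validator_cls) -> bool:
    """Return True if the validator class exposes a TYPE_CHECKER attribute."""
    return hasattr(validator_cls, "TYPE_CHECKER")


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("jsonschema")
    print("jsonschema version:", version)
    if version is not None:
        # Imported lazily so the script still runs when jsonschema is missing.
        from jsonschema import Draft4Validator

        print("Draft4Validator has TYPE_CHECKER:", has_type_checker(Draft4Validator))
```

If the attribute is reported as missing, upgrading jsonschema in that environment is a reasonable thing to try before touching the agent configuration.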

  
  
Posted one year ago

Can you try running clearml-agent --debug daemon --foreground ?

  
  
Posted one year ago

But from what you're saying, it seems the agent simply cannot communicate with the server, and what you see is the agent waiting indefinitely.

  
  
Posted one year ago

@<1526734383564722176:profile|BoredBat47> the agent-services is probably not configured (it needs a key and secret for the ClearML server to be configured in the docker-compose)

  
  
Posted one year ago

@<1523701070390366208:profile|CostlyOstrich36>
What does agent-services do on startup? It seems something is preventing it from working properly. I already added a command to the entrypoint to configure pip.conf, since we have to use a trusted mirror to download Python packages. I also managed to connect a local agent to the ClearML server by using the 127.0.0.1 host in the credentials. Still no luck with the remote agent.

  
  
Posted one year ago

I looked through the agent-services logs and found a new error I haven't seen before:
clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://<my_ip>:8008 ?
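
A quick way to narrow down an error like this is to check, from the machine or container where the agent runs, whether the server ports accept a TCP connection at all. The following is a minimal sketch, not part of ClearML: the function name and the 127.0.0.1 host in the usage section are placeholders, and the ports 8080, 8008 and 8081 are the ones quoted elsewhere in this thread. A successful connection only shows that something is listening, not that it is the ClearML API server.

```python
import socket
from urllib.parse import urlparse


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url.strip())
    if parsed.hostname is None:
        return False
    try:
        # Fall back to the scheme's default port when none is given.
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
    except ValueError:
        # Raised by urlparse for a malformed port number.
        return False
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder host: replace with the address of your ClearML server.
    for url in ("http://127.0.0.1:8080", "http://127.0.0.1:8008", "http://127.0.0.1:8081"):
        print(url, "reachable" if can_connect(url) else "NOT reachable")
```

If the ports are reachable from the server machine but not from the worker, the problem is more likely a proxy, firewall or routing issue than the credentials.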

  
  
Posted one year ago

The terminal hangs on the command

  
  
Posted one year ago

Console output of clearml-agent init with no clearml.conf:
...
ClearML Hosts configuration:
Web App: None
API: None
File Store: None

Verifying credentials ...
Error: could not verify credentials: key=ak secret=sk
...
There is no console output from clearml-agent daemon --foreground with the clearml.conf created by clearml-init.
...

  
  
Posted one year ago

Console output of clearml-agent daemon --foreground ?

  
  
Posted one year ago

Can you please attach the console output again?

  
  
Posted one year ago

clearml-agent daemon --foreground

  
  
Posted one year ago

What command did you use?

  
  
Posted one year ago

It works like I mentioned before: the terminal jumps to a new line and sits there, with no output after that. Nothing is happening in the console. But if you go to the UI, you see that "Last used" is updating.

  
  
Posted one year ago

Actually, the agent will use the default values for the agent section if you have a clearml.conf file - what do you get if you run the agent like that?

  
  
Posted one year ago

@<1523701087100473344:profile|SuccessfulKoala55>
When I run clearml-agent init, I don't have a file beforehand. I tried running the agent daemon with the clearml.conf created by clearml-init, but that doesn't work since it has no agent section, right? I know I can add it myself, but I think clearml-agent init should work too.

  
  
Posted one year ago

Do you have this file in your home folder?

  
  
Posted one year ago

Does clearml-init also have to connect to the ClearML server to finish successfully?

Yes, it verifies the credentials in the same way, and creates a clearml.conf file when done

  
  
Posted one year ago

Hi, sorry for the delay 😞

  
  
Posted one year ago

Sorry for bothering, but I am really lost. I think I've exhausted all my options. I really have no clue what is going on.

  
  
Posted one year ago

@<1523701087100473344:profile|SuccessfulKoala55> I provided the following env vars:
CLEARML_HOST_IP: "<my_ip>"
CLEARML_WEB_HOST: "http://<my_ip>:8080"
CLEARML_API_HOST: "http://<my_ip>:8008"
CLEARML_FILES_HOST: "http://<my_ip>:8081"
CLEARML_API_ACCESS_KEY: <my_access_key>
CLEARML_API_SECRET_KEY: <my_secret_key>
Also, I changed the IP in the entrypoint from apiserver:8008 to <my_ip>:8008

Yes, I run both commands from the same place: a dedicated user on my worker machine. Does clearml-init also have to connect to the ClearML server to finish successfully?
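
Host values like these are easy to get subtly wrong, for example with a stray space inside the quotes or a missing http:// prefix. Below is a small, hedged sketch of a sanity check that could be run inside the container. It is not a ClearML tool; check_host_value is a hypothetical helper, and it only looks at the three host variables named above.

```python
import os
from urllib.parse import urlparse


def check_host_value(name: str, value: str) -> list:
    """Return a list of human-readable problems found in one host setting."""
    problems = []
    if value != value.strip():
        problems.append(f"{name}: leading or trailing whitespace")
    parsed = urlparse(value.strip())
    if parsed.scheme not in ("http", "https"):
        problems.append(f"{name}: missing http:// or https:// scheme")
    if not parsed.hostname:
        problems.append(f"{name}: no host found")
    return problems


if __name__ == "__main__":
    for var in ("CLEARML_WEB_HOST", "CLEARML_API_HOST", "CLEARML_FILES_HOST"):
        value = os.environ.get(var)
        if value is None:
            print(f"{var}: not set")
            continue
        for problem in check_host_value(var, value) or [f"{var}: looks OK"]:
            print(problem)
```

This only validates the shape of each value; whether the server actually answers at that address still has to be checked separately.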

  
  
Posted one year ago

BoredBat47 what did you provide in the docker-compose to the services agent?
Also, you said that clearml-init worked but clearml-agent init did not - did you run both from the same place?

  
  
Posted one year ago

CostlyOstrich36 Any thoughts?

  
  
Posted one year ago

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100k  100  100k    0     0  10236      0  0:00:10  0:00:10 --:--:-- 21354
Warning: Transient problem: HTTP error Will retry in 10 seconds. 10 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 21345
Warning: Transient problem: HTTP error Will retry in 10 seconds. 9 retries
Warning: left.
100  100k  100  100k    0     0  10238      0  0:00:10  0:00:10 --:--:-- 21345
Warning: Transient problem: HTTP error Will retry in 10 seconds. 8 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
Warning: Transient problem: HTTP error Will retry in 10 seconds. 7 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26958
Warning: Transient problem: HTTP error Will retry in 10 seconds. 6 retries
Warning: left.
100  100k  100  100k    0     0  10236      0  0:00:10  0:00:10 --:--:-- 26951
Warning: Transient problem: HTTP error Will retry in 10 seconds. 5 retries
Warning: left.
100  100k  100  100k    0     0  10236      0  0:00:10  0:00:10 --:--:-- 26958
Warning: Transient problem: HTTP error Will retry in 10 seconds. 4 retries
Warning: left.
100  100k  100  100k    0     0  10235      0  0:00:10  0:00:10 --:--:-- 26951
Warning: Transient problem: HTTP error Will retry in 10 seconds. 3 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
Warning: Transient problem: HTTP error Will retry in 10 seconds. 2 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
Warning: Transient problem: HTTP error Will retry in 10 seconds. 1 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead:

  
  
Posted one year ago

I think so, yes

  
  
Posted one year ago

CostlyOstrich36 Am I right that I should also provide these URLs in the agent-services section of the docker-compose file?
CLEARML_HOST_IP: ${CLEARML_HOST_IP:-}
CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
CLEARML_API_HOST: http://apiserver:8008

  
  
Posted one year ago

Also, the services agent is not related to regular agent executions.

  
  
Posted one year ago

BoredBat47 , can you add the logs?

  
  
Posted one year ago