
Hey, guys! I have a problem. I launched ClearML Server and am trying to run a worker on another machine. When I run clearml-agent init, it can't verify the credentials or create a clearml.conf file. However, the clearml-init command successfully accepted my creds and created a clearml.conf file. Then, when I try to run clearml-agent daemon -d, it just sits there with no output, and the worker doesn't appear in the Workers section of the web UI.

How do I make my worker run properly? Can I see daemon output logs somewhere?
Any help is much appreciated. Thank you!

  
  
Posted one year ago

Answers 39


Console output of clearml-agent daemon --foreground ?

  
  
Posted one year ago

do you have this file in your home folder?

  
  
Posted one year ago

CostlyOstrich36 Any thoughts?

  
  
Posted one year ago

clearml-agent daemon --foreground

  
  
Posted one year ago

I looked through the agent-services logs and found a new error I haven't seen before:
clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://<my_ip>:8008 ?
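
One way to narrow down an error like this is to check, from the machine where the agent runs, whether the API server's host and port are reachable at all. Below is a minimal sketch using only the Python standard library. The address is a placeholder and `can_connect` is a hypothetical helper, not part of ClearML.

```python
import socket
from urllib.parse import urlparse


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        return False
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder: replace with the api_server value from your clearml.conf.
    api_server = "http://127.0.0.1:8008"
    print(api_server, "reachable:", can_connect(api_server))
```

This only tests a direct TCP connection. It says nothing about credentials, and it does not go through an HTTP proxy, so a failure here mostly suggests a network, firewall, or address problem rather than a ClearML one.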

  
  
Posted one year ago

SuccessfulKoala55
I don't have a clearml.conf file before I run clearml-agent init. I tried running the agent daemon with the clearml.conf created by clearml-init, but that doesn't work since it has no agent section, right? I know I can add it myself, but I think clearml-agent init should work too.

  
  
Posted one year ago

Hi, sorry for the delay 😞

  
  
Posted one year ago

CostlyOstrich36 Yep, it seems that was the case. I had not provided the API credentials in docker-compose. I did that, but now agent-services just keeps restarting. I looked into the container's logs and it seems to be a proxy error. Why is this container trying to connect somewhere?

  
  
Posted one year ago

Can you please attach the console output again?

  
  
Posted one year ago

What versions of clearml and clearml-agent are you using, and on what OS? Can you add the command line you're running for the agent?

  
  
Posted one year ago

Can you try running clearml-agent --debug daemon --foreground ?

  
  
Posted one year ago

It behaves like I mentioned before: the terminal jumps to a new line and sits there, with no output after that and nothing happening in the console. But if you go to the UI, you see that "Last used" is updating.

  
  
Posted one year ago

What command did you use?

  
  
Posted one year ago

CostlyOstrich36 Am I right that I should also provide these URLs in the agent-services section of the docker-compose file?
CLEARML_HOST_IP: ${CLEARML_HOST_IP:-}
CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
CLEARML_API_HOST: http://apiserver:8008

  
  
Posted one year ago

I think so, yes

  
  
Posted one year ago

The other strange thing is that I can see the credentials being used in the web UI: the "Last used" timestamp is constantly updated to the present time. So apparently the daemon is trying to do something but can't launch properly all the way.

  
  
Posted one year ago

But from what you're saying it seems like the agent simply cannot communicate with the server and what you see is simply the agent waiting indefinitely

  
  
Posted one year ago

SuccessfulKoala55
So, I ran it with --debug and got this stack trace:
type_checker=validator.TYPE_CHECKER.redefine_many({
AttributeError: type object 'Draft4Validator' has no attribute 'TYPE_CHECKER'
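
A note on this error: as far as I know, the `TYPE_CHECKER` attribute on jsonschema validator classes was introduced in jsonschema 3.0, so this message would be consistent with an older jsonschema being installed in the Python environment the agent runs in. That is an assumption, not something confirmed in this thread. A small sketch to check it (the helper names are made up for illustration):

```python
from importlib import metadata


def has_type_checker(validator_cls) -> bool:
    """Return True if the validator class exposes the TYPE_CHECKER attribute."""
    return hasattr(validator_cls, "TYPE_CHECKER")


def installed_version(package: str):
    """Return the installed version string of a package, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("jsonschema version:", installed_version("jsonschema"))
    try:
        from jsonschema import Draft4Validator
    except ImportError:
        print("jsonschema is not importable in this environment")
    else:
        print("Draft4Validator has TYPE_CHECKER:", has_type_checker(Draft4Validator))
```

Run it with the same Python interpreter that runs clearml-agent. If the attribute turns out to be missing, upgrading jsonschema in that environment is the likely next step, though I have not verified which versions clearml-agent requires.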

  
  
Posted one year ago

Also, the previous problem was caused by an incorrect proxy configuration on the agent machine.
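
For anyone hitting the same thing, here is a quick way to see which proxy settings Python-based tools would pick up on the agent machine, and whether the server's host is excluded from proxying. It is a sketch using only the standard library. The server address is a placeholder and `proxy_report` is a hypothetical helper.

```python
import urllib.request
from urllib.parse import urlparse


def proxy_report(url: str) -> dict:
    """Collect the proxies Python detects and whether the URL's host bypasses them."""
    host = urlparse(url).hostname or ""
    return {
        # Proxies detected from the environment (http_proxy, https_proxy, ...)
        # or, on some platforms, from system settings.
        "proxies": urllib.request.getproxies(),
        # True if the host matches a bypass rule such as no_proxy.
        "bypassed": bool(urllib.request.proxy_bypass(host)),
    }


if __name__ == "__main__":
    # Placeholder: replace with the address of your ClearML API server.
    report = proxy_report("http://127.0.0.1:8008")
    print("Detected proxies:", report["proxies"])
    print("Server host bypasses proxy:", report["bypassed"])
```

If a proxy is listed and the server host is not bypassed, requests to the server may be routed through the proxy, which would be one thing to rule out.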

  
  
Posted one year ago

BoredBat47 what did you provide in the docker-compose to the services agent?
Also, you said that clearml-init worked but clearml-agent init did not - did you run both from the same place?

  
  
Posted one year ago

but without -d

  
  
Posted one year ago

BoredBat47 the agent-services container is probably not configured (it needs a key and secret for the ClearML server, set in the docker-compose file)

  
  
Posted one year ago

Sorry for bothering but I am really lost, I think I exhausted all my options. I really have no clue what is going on.

  
  
Posted one year ago

CostlyOstrich36
Should I leave the values as they are or fill them in for agent-services in docker-compose? I set them to localhost since agent-services runs together with the other ClearML containers on one machine. I'm not sure why you have to fill in those values.
CLEARML_HOST_IP: "<my_clearml_server_ip>"
CLEARML_WEB_HOST: " None "
CLEARML_API_HOST: " None "
CLEARML_FILES_HOST: " None "

  
  
Posted one year ago

Console output of clearml-agent init with no clearml.conf:
...
ClearML Hosts configuration:
Web App: None
API: None
File Store: None

Verifying credentials ...
Error: could not verify credentials: key=ak secret=sk
...
Console output of clearml-agent daemon --foreground with clearml.conf created by clearml-init is missing. No output.
...

  
  
Posted one year ago

Sorry, forgot to mention. I used the command with the --foreground flag. It is the same: the terminal just sits at a new line, with no logs and no worker in the UI.

  
  
Posted one year ago

CostlyOstrich36 It seems the agent-services container is missing on my server. It's not running. Could that be the issue?

  
  
Posted one year ago

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100k  100  100k    0     0  10236      0  0:00:10  0:00:10 --:--:-- 21354
Warning: Transient problem: HTTP error Will retry in 10 seconds. 10 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 21345
Warning: Transient problem: HTTP error Will retry in 10 seconds. 9 retries
Warning: left.
100  100k  100  100k    0     0  10238      0  0:00:10  0:00:10 --:--:-- 21345
Warning: Transient problem: HTTP error Will retry in 10 seconds. 8 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
Warning: Transient problem: HTTP error Will retry in 10 seconds. 7 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26958
Warning: Transient problem: HTTP error Will retry in 10 seconds. 6 retries
Warning: left.
100  100k  100  100k    0     0  10236      0  0:00:10  0:00:10 --:--:-- 26951
Warning: Transient problem: HTTP error Will retry in 10 seconds. 5 retries
Warning: left.
100  100k  100  100k    0     0  10236      0  0:00:10  0:00:10 --:--:-- 26958
Warning: Transient problem: HTTP error Will retry in 10 seconds. 4 retries
Warning: left.
100  100k  100  100k    0     0  10235      0  0:00:10  0:00:10 --:--:-- 26951
Warning: Transient problem: HTTP error Will retry in 10 seconds. 3 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
Warning: Transient problem: HTTP error Will retry in 10 seconds. 2 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
Warning: Transient problem: HTTP error Will retry in 10 seconds. 1 retries
Warning: left.
100  100k  100  100k    0     0  10237      0  0:00:10  0:00:10 --:--:-- 26965
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead:

  
  
Posted one year ago

Hi BoredBat47, use the --foreground flag to see the logs 🙂

  
  
Posted one year ago

Also, the services agent is not related to regular agent executions.

  
  
Posted one year ago