
Console output of clearml-agent init
with no clearml.conf:
...
ClearML Hosts configuration:
Web App: None
API: None
File Store: None
Verifying credentials ...
Error: could not verify credentials: key=ak secret=sk
...
Console output of clearml-agent daemon --foreground with clearml.conf created by clearml-init is missing. No output.
...
The strange thing is also that I can see the credentials being used in the web UI: the "last used" timestamp is constantly updated to the present time. So apparently the daemon is trying to do something but can't launch properly all the way.
Also, the previous problem was caused by an incorrect proxy configuration on the agent machine.
@<1523701070390366208:profile|CostlyOstrich36> Yes, I know. Above I posted a link where there's a solution: a DB request to elastic to change those URLs. My question is: where do I send this DB request? What endpoint? The request provided in the FAQ is incomplete. It lacks the URL to send the request to.
curl --header "Content-Type: application/json" \
--request POST \
--data '{
"script": {
"source": "ctx._source.url = ctx._source.url.replace('
.<OLD_ADDRESS>', '
...
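On the "where to send it" question: Elasticsearch exposes scripted bulk updates through its _update_by_query endpoint, addressed as POST <elastic_host>/<index>/_update_by_query. The sketch below only assembles such a request so the target URL is visible. The Elasticsearch address (localhost:9200) and the index pattern (events-*) are assumptions for illustration, not values taken from the FAQ, and build_update_by_query is a hypothetical helper name. Check the actual host, port and index names of your own deployment before sending anything.

```python
import json
import urllib.request


def build_update_by_query(es_base_url, index_pattern, old_address, new_address):
    """Build (but do not send) an Elasticsearch _update_by_query request.

    es_base_url and index_pattern are assumptions for illustration; use the
    real Elasticsearch address and index names of your deployment.
    """
    body = {
        "script": {
            # Painless script: rewrite the stored URL in every matched document.
            "source": "ctx._source.url = ctx._source.url.replace(params.old, params.new)",
            "lang": "painless",
            "params": {"old": old_address, "new": new_address},
        },
        # Only touch documents that actually have a `url` field.
        "query": {"exists": {"field": "url"}},
    }
    url = f"{es_base_url.rstrip('/')}/{index_pattern}/_update_by_query"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_update_by_query(
    "http://localhost:9200",  # assumed Elasticsearch address
    "events-*",               # hypothetical index pattern
    "<OLD_ADDRESS>",          # placeholder, as in the FAQ snippet
    "<NEW_ADDRESS>",          # placeholder
)
print(request.get_method(), request.full_url)
# To actually send it (after taking a backup): urllib.request.urlopen(request)
```

Passing the old and new addresses as script params avoids the nested-quote escaping that makes the curl version easy to get wrong.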
Try running docker ps and check whether all of your ClearML containers are up and running (there should be 8 in total)
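The check above can be scripted. This is a minimal sketch, assuming the docker CLI is on the PATH: it lists every container with its status and reports any whose status does not start with "Up". The helper names (parse_docker_ps, not_running, docker_ps) are made up for this example, and the third sample row is invented to show what a failing container looks like.

```python
import subprocess


def parse_docker_ps(output):
    """Parse lines of 'name<TAB>status' into a {name: status} dict."""
    containers = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        containers[name.strip()] = status.strip()
    return containers


def not_running(containers):
    """Names whose status does not start with 'Up' (e.g. 'Restarting', 'Exited')."""
    return sorted(name for name, status in containers.items() if not status.startswith("Up"))


def docker_ps():
    """Run `docker ps -a` and return its parsed output (requires the docker CLI)."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_docker_ps(result.stdout)


# Sample input: the first two rows echo the listing in this thread;
# the third row is invented to show a container that is not up.
sample = (
    "async_delete\tUp 7 weeks\n"
    "clearml-webserver\tUp 16 months\n"
    "example-service\tRestarting (1) 5 seconds ago\n"
)
containers = parse_docker_ps(sample)
print(len(containers), "containers; not running:", not_running(containers))
```

On the server itself, replace the sample with containers = docker_ps() and compare len(containers) with the expected total.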
@<1722061389024989184:profile|ResponsiveKoala38> Sure, I'll get back to you as it finishes
@<1523701070390366208:profile|CostlyOstrich36>
What is agent-services doing on startup? It seems that something is preventing it from working properly. I already added a command to the entrypoint to configure pip.conf, since we have to use a trusted mirror to download Python packages. I also managed to connect a local agent to the ClearML server by using the 127.0.0.1 host in the credentials. Still no luck with the remote agent.
Sorry for bothering but I am really lost, I think I exhausted all my options. I really have no clue what is going on.
Console output of clearml-agent daemon --foreground?
~/.local/bin/clearml-agent daemon --foreground
@<1523701070390366208:profile|CostlyOstrich36> Old debug samples. The URL of my files server has changed, and old debug samples are not shown.
clearml-agent daemon --foreground
@<1722061389024989184:profile|ResponsiveKoala38> Thanks a lot for the help. Keep up the good work!
482e96243041 allegroai/clearml:latest "python3 -m jobs.asy…" 18 months ago Up 7 weeks 8008/tcp, 8080-8081/tcp async_delete
26c677f2b70f allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 16 months 8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp clearml-webserver
7e2cf4462f44 allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 7 months 0.0.0.0:8008->8008/tcp, :::8008->8008/tcp, 8080-8081/tcp clearml-apiserv...
@<1722061389024989184:profile|ResponsiveKoala38> There were errors everywhere, in almost every container. I decided to manually move the working clearml image from the old installation and run from it. It worked. It's not the latest ClearML, but it'll do for now. Thanks for your help! 💪
@<1722061389024989184:profile|ResponsiveKoala38> Hello. It seems that it didn't work for me. I made a backup, moved it to another machine and tried to run the ClearML service (latest docker compose). Now async-delete, apiserver, mongo, fileserver and elastic are constantly restarting.
Thanks a lot. I see that the ClearML apiserver has been up for 7 months. Could it be that it is running a version that was recent 7 months ago?
@<1722061389024989184:profile|ResponsiveKoala38> Now I can see the images where previously there were placeholders with the text "Unable to upload the images"
@<1523701087100473344:profile|SuccessfulKoala55>
So, I ran it with debug and got this stack trace error:
type_checker=validator.TYPE_CHECKER.redefine_many({
AttributeError: type object 'Draft4Validator' has no attribute 'TYPE_CHECKER'
It's the same request you provided just without "case_sensitive" option and with my endpoints @<1722061389024989184:profile|ResponsiveKoala38>
I don't think so. Just some info about cluster state in there
All log entries have "level": "INFO"
So, right now I have the old deployment. It's working well and is not corrupted. I shared the service versions above (output of docker ps). My goal is to move everything to another machine. Yes, I want a new deployment with all the previous data. Basically, it's a backup-and-restore task. The problem was that the old docker compose file doesn't work as is, maybe because when I run it on a new machine, clearml:1 pulls the latest version and the elastic version is set to one that is no longer supported.
@<1523701087100473344:profile|SuccessfulKoala55> I figured out where to find a region, but we don't have an AWS dashboard. We have a custom S3 solution for our own enterprise servers, like many companies do; the data is not stored on Amazon servers. That is why we have an endpoint, which is a URL starting with http://
If I were to connect to our bucket via boto3, I would pass the endpoint to the client with endpoint_url
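For a non-AWS endpoint like this, the ClearML SDK configuration (as far as I understand the sdk.aws.s3.credentials section of clearml.conf) takes a host:port value plus a secure flag instead of a full URL. The sketch below shows how an http:// endpoint maps onto those fields. The helper name s3_endpoint_to_clearml_entry, the example hostname and the placeholder credentials are all assumptions for illustration; verify the field names against the clearml.conf reference for your version.

```python
from urllib.parse import urlparse


def s3_endpoint_to_clearml_entry(endpoint_url, key, secret):
    """Map an S3-compatible endpoint URL to a host/secure credentials entry.

    The returned keys mirror what is assumed to be the layout of an
    sdk.aws.s3.credentials entry in clearml.conf; confirm before use.
    """
    parsed = urlparse(endpoint_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"expected an http(s) endpoint URL, got {endpoint_url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return {
        "host": f"{parsed.hostname}:{port}",
        "key": key,
        "secret": secret,
        # An http:// endpoint means plain, non-TLS connections.
        "secure": parsed.scheme == "https",
    }


entry = s3_endpoint_to_clearml_entry(
    "http://s3.example.internal:9000",  # hypothetical in-house endpoint
    "<ACCESS_KEY>",
    "<SECRET_KEY>",
)
print(entry)
# The boto3 equivalent mentioned above would be:
#   boto3.client("s3", endpoint_url="http://s3.example.internal:9000", ...)
```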
SmugDolphin23 That fixed the issue, thank you very much!
@<1722061389024989184:profile|ResponsiveKoala38> Thanks a lot! I am going to upgrade ClearML using this link: None
I looked through the agent-services logs and found a new error I haven't seen before:
clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://<my_ip>:8008 ?
CostlyOstrich36
The error appears regardless of the --foreground flag. This is not the full stack trace; I will provide it in the next message.
clearml 1.9.0
clearml-agent 1.5.1
Ubuntu 18.04.6 LTS