
Traceback (most recent call last):
File "/home/<home>/.local/bin/clearml-agent", line 8, in <module>
sys.exit(main())
File "/home/<home>/.local/lib/python3.8/site-packages/clearml_agent/__main__.py", line 83, in main
return run_command(parser, args, command_name)
File "/home/<home>/.local/lib/python3.8/site-packages/clearml_agent/__main__.py", line 46, in run_command
return func(**args_dict)
File "/home/<home>/.local/lib/python3....
@<1523701087100473344:profile|SuccessfulKoala55>
When I run clearml-agent init, I don't have a clearml.conf file prior to this. I tried running the agent daemon with the clearml.conf created by clearml-init, but that doesn't work since it has no agent section, right? I know I can add it myself, but I think clearml-agent init should work too.
It's the same request you provided, just without the "case_sensitive" option and with my endpoints @<1722061389024989184:profile|ResponsiveKoala38>
Console output of clearml-agent init with no clearml.conf:
...ClearML Hosts configuration:
Web App: None
API: None
File Store: None
Verifying credentials ...
Error: could not verify credentials: key=ak secret=sk
...
Console output of clearml-agent daemon --foreground with the clearml.conf created by clearml-init: there is no output at all.
...
@<1523701087100473344:profile|SuccessfulKoala55> I figured out where to find a region, but we don't have an AWS dashboard. We have a custom S3 solution on our own enterprise servers, like many companies do; the data is not stored on Amazon servers. That is why we have an endpoint, which is a URL starting with http://
If I were to connect to our bucket via boto3, I would pass the endpoint to a client session with endpoint_url.
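Something along these lines (a minimal sketch with placeholder endpoint, keys and bucket, not our real values):
```python
# Minimal sketch: talking to a custom S3-compatible server with boto3.
# <ENDPOINT_HOST>, the port, the keys and the bucket name are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://<ENDPOINT_HOST>:9000",  # our custom endpoint URL
    aws_access_key_id="mykey",
    aws_secret_access_key="mysecret",
)
print(s3.list_objects_v2(Bucket="mybucket").get("KeyCount"))
```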
CostlyOstrich36 Am I right that I should also provide these URLs in the agent-services section of the docker-compose file?
CLEARML_HOST_IP: ${CLEARML_HOST_IP:-}
CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
CLEARML_API_HOST: http://apiserver:8008
curl -XPOST -H 'Content-Type: application/json' 'localhost:9200/events-training_debug_image-*/_update_by_query?conflicts=proceed' -d'{
  "script": {
    "source": "ctx._source.url = ctx._source.url.replace(\"http://<MY_OLD_ADDRESS>\", \"<NEW_ADDRESS>\")",
    "lang": "painless"
  },
  "query": {"prefix": {"url": {"value": "http://<MY_OLD_ADDRESS>"}}}
}'
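The same update, sketched with the Python requests library instead of curl (placeholders kept as-is; assumes Elasticsearch is reachable on localhost:9200):
```python
# Sketch of the _update_by_query call above, using requests instead of curl.
import requests

payload = {
    "script": {
        "source": "ctx._source.url = ctx._source.url.replace('http://<MY_OLD_ADDRESS>', '<NEW_ADDRESS>')",
        "lang": "painless",
    },
    "query": {"prefix": {"url": {"value": "http://<MY_OLD_ADDRESS>"}}},
}

resp = requests.post(
    "http://localhost:9200/events-training_debug_image-*/_update_by_query",
    params={"conflicts": "proceed"},
    json=payload,
    timeout=60,
)
print(resp.json())
```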
s3 {
    # S3 credentials, used for read/write access by various SDK elements
    # default, used for any bucket not specified below
    key: "mykey"
    secret: "mysecret"
    region: ""
    credentials: [
        {
            bucket: "mybucket"
            key: "mykey"
            secret: "mysecret"
            region: ""
        },
    ]
}
@<1722061389024989184:profile|ResponsiveKoala38> It fixed the issue!
Yeah, I mean a fresh installation using the old docker compose file, just without the backups (/clearml/data). So it seems the solution for me should be:
- Migrate to the latest version of elastic on the old installation
- Make a backup
- Deploy the latest ClearML installation with that backup
It works like I mentioned before: the terminal jumps to a new line and sits there, there is no output after that, and nothing is happening in the console. But if you go to the UI, you see that "Last used" is updating.
CostlyOstrich36
The error appears regardless of the --foreground flag. This is not the full stacktrace; I will provide it in the next message.
clearml 1.9.0
clearml-agent 1.5.1
Ubuntu 18.04.6 LTS
@<1523701087100473344:profile|SuccessfulKoala55> Hey, Jake, getting back to you. I haven't been able to resolve my issue. I can access my bucket by any means just fine, e.g. with an S3 CLI client. All the tools I use require 4 params: AK, SK, endpoint, bucket. I wonder why ClearML doesn't have an explicit endpoint parameter and you have to use output_uri for it, and why there is a region when other tools don't require it.
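For reference, this is roughly how I understand it is supposed to work today: the endpoint host and port go into output_uri itself, while key/secret sit in the s3 section of clearml.conf (a sketch only; the host, port, project/task names and bucket below are placeholders):
```python
# Sketch: pointing task output at a custom (non-AWS) S3 endpoint via output_uri.
from clearml import Task

task = Task.init(
    project_name="my-project",   # placeholder
    task_name="my-task",         # placeholder
    output_uri="s3://<ENDPOINT_HOST>:9000/mybucket",  # host:port of the custom endpoint + bucket
)
```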
SmugDolphin23 Got it. Now I am a bit confused about the region parameter in the s3 section. Amazon docs say that a region could be a regular URL with a protocol, like https://etc.etc, which my endpoint actually is. I plugged it into the s3 section in clearml.conf. Should it stay that way?
Hi @<1722061389024989184:profile|ResponsiveKoala38>, I am using those specific versions because my previous ClearML installation runs with them; they are in the docker compose file. The version of the ClearML image is 1; afaik the latest is 1.16.2. My goal is to move ClearML to a different machine, so I need to stick to those versions.
@<1722061389024989184:profile|ResponsiveKoala38> There were errors everywhere, in almost every container. I decided to manually move the working clearml image from the old installation and run from it. It worked. It's not the latest ClearML, but it'll do for now. Thanks for your help! 💪
Should I remove the "case_sensitive" option from the query?
SmugDolphin23 That fixed the issue, thank you very much!
So, right now I have the old deployment. It's working well, it's not corrupted. The service versions are the ones I shared above (output of docker ps). My goal is to move everything to another machine. Yes, I want to have a new deployment with all the previous data. Basically, it's a backup-and-restore task. The problem was that the old docker compose file doesn't work as-is. Maybe because when I run it on a new machine, clearml:1 pulls the latest version, while the elastic version is set to one that is no longer supported.
I looked through the agent-services logs and found a new error I haven't seen before:
clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://<my_ip>:8008 ?
@<1523701070390366208:profile|CostlyOstrich36>
@<1523701070390366208:profile|CostlyOstrich36> You mean using the port in credentials.host?
session = boto3.Session(
    aws_access_key_id=self.access_key,
    aws_secret_access_key=self.secret_key)
My question would be this: what gets plugged into endpoint_url in the boto3 client inside ClearML?
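Just to illustrate what I'm asking: in plain boto3 the endpoint is normally supplied when the client is created from the session, something like this (a sketch only, not ClearML's actual code; the endpoint URL and keys are placeholders):
```python
# Sketch of plain boto3 usage, not ClearML internals: the session only carries
# the key/secret, and the endpoint is given when the S3 client is created.
import boto3

session = boto3.Session(
    aws_access_key_id="mykey",         # placeholder
    aws_secret_access_key="mysecret",  # placeholder
)
s3_client = session.client(
    "s3",
    endpoint_url="http://<ENDPOINT_HOST>:9000",  # placeholder custom endpoint
)
```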
@<1523701087100473344:profile|SuccessfulKoala55>
So, I ran it with debug and got this stacktrace error:
type_checker=validator.TYPE_CHECKER.redefine_many({
AttributeError: type object 'Draft4Validator' has no attribute 'TYPE_CHECKER'
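For what it's worth, Draft4Validator.TYPE_CHECKER only exists in jsonschema 3.0 and later, so a quick check like this (a diagnostic sketch, nothing ClearML-specific) shows whether an older jsonschema is being picked up:
```python
# Diagnostic sketch: print the installed jsonschema version and whether the
# Draft4Validator.TYPE_CHECKER attribute (added in jsonschema 3.0) is present.
from importlib.metadata import version

import jsonschema

print("jsonschema version:", version("jsonschema"))
print("has TYPE_CHECKER:", hasattr(jsonschema.Draft4Validator, "TYPE_CHECKER"))
```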
@<1523701070390366208:profile|CostlyOstrich36>
What is agent-services doing on start-up? It seems like something is preventing it from working properly. I already added a command to the entrypoint to configure pip.conf, since we have to use a trusted mirror to download python packages. Also, I managed to connect a local agent to the ClearML server by using the 127.0.0.1 host in the credentials. Still no luck with the remote agent.
Could the problem be that the backups were made while ClearML was running, not stopped as the docs suggest? @<1523701070390366208:profile|CostlyOstrich36>