Reputation
Badges 1
282 × Eureka!Ok. Problem was resolved with latest version of clearml-agent and clearml.
Ah ok. So it will be fixed on the ClearML server web UI as well? (See screenshots).
Hi SuccessfulKoala55 , thanks. Opened issue on the CLearml-Agent GH at https://github.com/allegroai/clearml-agent/issues/67
Can i dig into the mongodb or ES to pull these data?
Thanks TimelyPenguin76 , is there an env var for the S3 connection as well?
Hi,
I'm running on Dell ECS storage appliance, which offers S3 compatibility.
yes http://ECS.ai is the DNS name of the server.
ClearML-models is the bucket.
Let me try with ip:port.
Can this issue be solved with vault? It doesn't make sense to expose secrets like that.
[root@2c7498711bef elasticsearch]# curl
`
{
"index" : "events-training_stats_scalar-d1bd92a3b039400cbafc60a7a5b1e52b",
"shard" : 0,
"primary" : false,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "CLUSTER_RECOVERED",
"at" : "2021-05-22T11:33:38.932Z",
"last_allocation_status" : "no_attempt"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisi...
Hi TimelyPenguin76 ,
If you notice in the last screenshot, it state the bucket name to be http://ecs.ai . It then it tries to open http://s3.amazonaws.com/ecs.ai/clearml-models/artifact/uploading_file?X-Amz-Algorithm= ....
I also see this on my logs, noting that the config is read in but its still printing the supposedly hidden keys on the logs and UI.agent.hide_docker_command_env_vars.enabled = true agent.hide_docker_command_env_vars.extra_keys.0='TRAINS_AGENT_GIT_USER' ..... docker_cmd=harbor.ai/public/detectron2:v3 --env TRAINS_AGENT_GIT_USER=gituser
Hi SuccessfulKoala55 ,i managed to install clearml-agent==1.0.1rc5. However, the same issues occur.
Its 1.0.0. As printed on the top of the logs in ClearML Server UI.
Hi, by agent logs i suppose you meant the logs from the ClearML server console panel?
Ok thanks.
I'm also noticing a lot of this while the k8s glue is running.Ex: Expecting value: line 1 column 1 (char 0) K8S Glue pods monitor: Failed parsing kubectl output:
Do you mean this?Removing containers section: [{'image': 'clearml-agent:latest"', 'env': [{'name': 'PIP_INDEX_URL', 'value': '
'},
What's the diff between template-yaml and --overrides-yaml? I used the latter to ensure the gpu is passed in.
I think in general, the 'published' action can be considered an 'approval'. The question is, how do we control who has the authority to 'publish'? The Web UI today does not support any uploads outside of the coding environment, would be nice it would be supported. But for now, the only workaround is to include parameters that stores document urls in the user properties.
Hi SuccessfulKoala55 , just to add, my clearml.conf (client) and clearml.agent.conf (agent) can have differing values. I'm not sure which one takes precedence and if this could be the cause.
Hi, i changed it, but it still point to https://files.pythonhosted.org/packages .
Hi. nice read. Your permalink is wrong though, here's the right one.
https://cpatrickalves.com/mlops-what-it-is-and-why-does-it-matter
Is there anyway to see an error log from that?
Create immutable and differentiable versions on-prem or in the cloud with our data agnostic solution.
Hi AgitatedDove14 , i've got the same error. It would appear that the code references clearml_agent/helper/base.py
which i believe is part of clearml-agent v0.17.1. Could that be the issue?
Yes! I definitely think this is important, and hopefully we will see something there
(or at least in the docs)
Hi AgitatedDove14 , any updates in the docs to demonstrate this yet?
Ok thanks, looking forward to it. Would you advise on the bug you encountered?
Hi, i tried the k8s-glue on my k8s setup and needed some clarifications on some of the arguments.
--queue. Does this only refer to default and service? How can i create new queue to which it can sync with the ClearML server? --ports-mode. I'm not sure what ports mode does. doc says "add a label to the pod which can be used as service". Which pod is it referring to in the first place? All args pertaining to --ports-mode. (E.g. base-pod-num, gateway-address...etc) --overrides-yaml. What is the ...
Sorry i don't quite understand this. The task itself was submitted as I run the code on the client. I suppose the dependancies requirements would be copied over as the experiment is cloned?