GreasyPenguin66

3 Questions, 17 Answers

Active since 10 January 2023

Last activity 2 years ago

Reputation

Badges 1

17 × Eureka!

0 Votes

15 Answers

2K Views

Hey! I Stumbled Upon Some Errors With My Workers Monitoring. I Checked Logs In My K8S Pods For Apiserver And Elasticsearch And It Seems The Problem Is There. These Are The Logs: Apiserver Logs [2021-04-23 06:19:50,209] [9] [Error] [Trains.Service_Repo] Re

Hey! I stumbled upon some errors with my workers monitoring. I checked logs in my k8s pods for apiserver and elasticsearch and it seems the problem is there....

clearml

4 years ago

0 Votes

8 Answers

2K Views

Hi! I Have A Question Concerning Dynamic Environment Variables. I Managed To Create Some Env Variables From The Apiserver.Conf And Now I Would Like To Set Some Env Variables For My Main Clearml.Conf File. However I Am Not Sure What Is The Proper Way. I T

Hi! I have a question concerning dynamic environment variables. I managed to create some env variables from the apiserver.conf and now I would like to set so...

clearml

4 years ago

0 Votes

14 Answers

2K Views

Hi! I Deployed Clearml Server Along With Jupyterhub On Azure K8S (Aks). The Way It Works Is That Every User Is Assigned A New Pod That Is Spawned With A Docker Image Of A Choice (One Of Them With Clearml Sdk Installed). I Managed To Configure Most Of The

Hi! I deployed clearml server along with jupyterhub on Azure K8s (AKS). The way it works is that every user is assigned a new pod that is spawned with a dock...

clearml

4 years ago

0 Hey! I Stumbled Upon Some Errors With My Workers Monitoring. I Checked Logs In My K8S Pods For Apiserver And Elasticsearch And It Seems The Problem Is There. These Are The Logs: Apiserver Logs [2021-04-23 06:19:50,209] [9] [Error] [Trains.Service_Repo] Re

Another question also came to mind. When changing any clearml deployment, such as apiserver, mongo or elasticsearch, do I have to redeploy everything from scratch? I had some problems previously when changing something in apiserver forced me to redeploy everything for clearml to work properly. I am wondering whether you have any guidelines for that.

4 years ago

0 Hi! I Have A Question Concerning Dynamic Environment Variables. I Managed To Create Some Env Variables From The Apiserver.Conf And Now I Would Like To Set Some Env Variables For My Main Clearml.Conf File. However I Am Not Sure What Is The Proper Way. I T

Great! And the container name would be inferred from the default_output_uri?

4 years ago

Hi AgitatedDove14. I am using jupyterhub on k8s and I spawn a pod for every single user. I have a custom dockerfile with clearml installed, however I don't want to copy the clearml.conf file in the dockerfile and would prefer to pass the necessary configuration as ENV variables instead. Is it possible?
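
A minimal sketch of passing the configuration through environment variables instead of a copied clearml.conf. The CLEARML_* names are the variables the ClearML SDK reads for server endpoints and credentials; the host URLs, the keys, and the helper names `clearml_env` and `apply_env` are hypothetical placeholders, not part of ClearML.

```python
import os


def clearml_env(api_host, web_host, files_host, access_key, secret_key):
    """Collect the ClearML settings as environment variable names and values."""
    return {
        "CLEARML_API_HOST": api_host,
        "CLEARML_WEB_HOST": web_host,
        "CLEARML_FILES_HOST": files_host,
        "CLEARML_API_ACCESS_KEY": access_key,
        "CLEARML_API_SECRET_KEY": secret_key,
    }


def apply_env(values, environ=os.environ):
    """Set each variable unless it is already defined, so values injected
    by the pod spec take precedence over these defaults."""
    for name, value in values.items():
        environ.setdefault(name, value)


if __name__ == "__main__":
    # Hypothetical endpoints and credentials, for illustration only.
    apply_env(
        clearml_env(
            "http://clearml-apiserver:8008",
            "http://clearml-webserver:8080",
            "http://clearml-fileserver:8081",
            "EXAMPLE_ACCESS_KEY",
            "EXAMPLE_SECRET_KEY",
        )
    )
```

In a k8s setup the same variable names could instead be set in the pod spec, which keeps credentials out of the image.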

4 years ago

perfect. Thank you very much!

4 years ago

Unfortunately the problem was not resolved, neither by changing the VM memory settings back to 2 GB nor by going back from azurefiles persistent volumes to hostPath. It seems odd, as I did not have any of these issues before. I thought it might come from the changes in PV and elasticsearch settings, but going back to the original settings did not resolve the issue. Shouldn't I be using the latest tag for clearml?

4 years ago

As for the clearml server version, by latest tag I meant v0.17.0.

4 years ago

Yes they are. With mongo I had a problem with azurefiles: mongo could not initialize when azurefiles was mounted under /data/db. The solution was to mount azurefiles under a different path and then specify a command for mongo with the path to the data so that it could initialize properly. However, when I deleted the kubernetes cluster, created a new one and redeployed clearml, I had issues coming not from mongo anymore but from apiserver, which was failing with...

4 years ago

Hi SuccessfulKoala55, thanks for the response. For elastic I am using the image http://docker.elastic.co/elasticsearch/elasticsearch:7.6.2, the one that is in the manifests in the clearml repo. As for the clearml images, I am using the latest tags everywhere. Let me restore the VM settings for elastic and I'll let you know ;)

4 years ago

0 Hi! I Deployed Clearml Server Along With Jupyterhub On Azure K8S (Aks). The Way It Works Is That Every User Is Assigned A New Pod That Is Spawned With A Docker Image Of A Choice (One Of Them With Clearml Sdk Installed). I Managed To Configure Most Of The

Yes. So jupyter lab is run as a tini command inside a pod. I attach the output of a ps aux command from my pod.

4 years ago

Hi AgitatedDove14. I'm just writing to explain what the problem was. Basically our setup, jupyterhub on k8s with kubespawner spawning a pod for each single-user notebook, uses docker images that are based on jupyter/docker-stacks.

The problem was that the token for jupyterhub api was not propagated to the spawned pod so whenever clearml was trying to access jupyter/user/api/sessions endpoint it would be redirected for authorization to jupyterhub api and then fail due to the lack ...
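
A minimal sketch of the kind of authenticated request to the sessions endpoint described above. It assumes the pod exposes the standard JupyterHub single-user variables JUPYTERHUB_API_TOKEN and JUPYTERHUB_SERVICE_PREFIX; the helper name `build_sessions_request` and the base URL are hypothetical, and this is not how the clearml SDK itself builds the call.

```python
import os
import urllib.request


def build_sessions_request(base_url, environ=os.environ):
    """Build a request for <base_url>/<user prefix>/api/sessions.

    The Authorization header is added only when a token is present;
    without it the server may redirect to the hub login instead.
    """
    prefix = environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")
    token = environ.get("JUPYTERHUB_API_TOKEN", "")

    # Join the pieces without producing doubled slashes.
    parts = [base_url.rstrip("/")]
    parts += [p for p in (prefix.strip("/"), "api/sessions") if p]
    request = urllib.request.Request("/".join(parts))

    if token:
        request.add_header("Authorization", f"token {token}")
    return request


if __name__ == "__main__":
    # Hypothetical local address of the single-user server.
    req = build_sessions_request("http://127.0.0.1:8888")
    print(req.full_url, "token set:", req.has_header("Authorization"))
```

Checking whether the header is present before the call is a quick way to tell a missing token apart from a wrong endpoint.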

4 years ago

Hey SuccessfulKoala55, thank you for your answers, I really appreciate it. As for elasticsearch, it was indeed the index error that was created before. The reason is that I was trying to set up a backup for elasticsearch and mongodb using azurefiles. So the scenario is that I'm using persistent volumes on k8s that use azure file shares as storage. Then I can rebuild my cluster and use the exact same storage so that the data is persistent and I can restore my application from the last ...

4 years ago

Sorry, I had to rebuild my image to install curl. Yes, the command works.

4 years ago

It works as well. As for rebuilding the image, I was neither root nor a sudoer, so I had to either rebuild the docker image and set it to root or install the package while rebuilding 😉

4 years ago

yes

4 years ago

Those were my thoughts too. But the jupyter/base_notebook image from docker stacks, which they recommend and from which my image inherits, did not include the token in the jupyter lab run command. I don't know whether it was a bug or an intentional choice, but I was going to either change the base image or add a token in a postStart hook. I decided to go with the second option 😉

4 years ago

Anyway, thank you so much for the help. It seems the problem is not connected with clearml. I'll post my solution once I solve the problem 😉

4 years ago

I'm sorry, I was wrong. Neither of the commands gives a positive response. I actually get a 404 page ... Sorry, I assumed that getting a lot of data back meant it was OK. But now I have read into it.

4 years ago