nvm I fixed it thank you @<1523701087100473344:profile|SuccessfulKoala55>
Everything works fine when I access the /files endpoint from a browser after logging in via the webserver. However, when I run an experiment using the ClearML Python package and it tries to upload the artifacts to the /files endpoint, it gets blocked with a 401 Unauthorized error.
I've checked the ClearML Python package and it seems to be attaching the auth headers correctly. However, the JWT token it's using is different from the one used by the browser, and it's not being accepted by the s...
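One way to see how the two tokens differ is to decode their payloads and compare the claims. The following is a minimal sketch, assuming both tokens are standard three-part JWTs; `decode_jwt_payload` and the demo token are hypothetical names made up for illustration, and the signature is deliberately not verified, so this is only useful for inspection.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    # Drop an optional "Bearer " prefix as sent in Authorization headers.
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical demo token; a real one would come from the browser or the SDK request.
demo_token = ".".join(
    [_b64url({"alg": "HS256", "typ": "JWT"}), _b64url({"sub": "demo-user"}), "signature"]
)
demo_claims = decode_jwt_payload("Bearer " + demo_token)
```

Running this on the browser token and on the token sent by the Python package would show which claims differ between the two.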
When I try to run clearml-init on another device:
Verifying credentials ...
Error: could not verify credentials
Has anyone encountered this error and knows how to fix it?
Yes @<1523701087100473344:profile|SuccessfulKoala55> , I'm using a Lua script which makes a light API call to /users.get_current_user to check the validity of the JWT, not just its existence, before accessing the /files endpoint. This way, trying to access before logging in (no token or an invalid token) gets blocked, and logging in (valid token in the request header) grants access to the files. Works all fine and dandy in the browser, but I'm now realizing the application itself ...
Yes @<1523701087100473344:profile|SuccessfulKoala55> , basically like this:
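    -- Take the token from the Authorization header, falling back to the session cookie
    local token = ngx.req.get_headers()["Authorization"] or ngx.var.cookie_clearml_token_basic
    -- Reject the request outright when no token is present
    if not token then
        ngx.exit(ngx.HTTP_UNAUTHORIZED)
        return
    end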
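The gating logic described above (take a token from the header or the cookie, then check that it is actually valid) can be sketched in Python for testing outside nginx. This is only an illustration: `check_access` and `extract_token` are hypothetical names, and `is_valid` is a stand-in for the real call to /users.get_current_user, which is not reproduced here.

```python
from typing import Callable, Mapping, Optional

HTTP_OK = 200
HTTP_UNAUTHORIZED = 401


def extract_token(headers: Mapping[str, str], cookies: Mapping[str, str]) -> Optional[str]:
    """Mirror the Lua snippet: Authorization header first, then the cookie."""
    # Note: a plain dict lookup is case-sensitive, unlike ngx.req.get_headers().
    return headers.get("Authorization") or cookies.get("clearml_token_basic")


def check_access(
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    is_valid: Callable[[str], bool],
) -> int:
    """Return the HTTP status the proxy would answer with."""
    token = extract_token(headers, cookies)
    if not token:
        # No token at all: block before contacting the API server.
        return HTTP_UNAUTHORIZED
    # Token present: let the validator (assumed to wrap the API call) decide.
    return HTTP_OK if is_valid(token) else HTTP_UNAUTHORIZED
```

Keeping the validator injectable makes it easy to try the same check against a browser token and a token sent by the Python package.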
Is there any information on how the ClearML Python package authenticates itself with the ClearML server? I am almost certain that the endpoint I am using to check the token validity can only handle the browser session, and fails when the package tries to access files, i.e. the server isn't correctly set up to validate the JWT from the ClearML Python package.
So I read in the ClearML docs that the file server has no security whatsoever and that you guys recommend using object storage (s3/azure/etc). Sadly, I currently do not have the resources to use those. However, I do know that the webserver has JWT auth, and so the proxy uses some Lua scripting to verify a valid token before accessing the /files endpoint.
Looking at this for reference:
None
Yea I ran a system reset that removes all pods, containers, images, networks, volumes. I was wondering if it was possible to repopulate MongoDB if I had the fileserver intact. If it's not possible to repopulate MongoDB, it's not the end of the world as I only had a few experiments on it.
Does ClearML have configuration settings or environment variables that allow you to specify additional headers for HTTP requests?
Unfortunately I don't think I can run in --docker mode, I'm on a RHEL 8 system which only supports podman (sucks I know but it's a university thing), are there any other options? I was thinking if there was some way to pull the python version from task metadata I could dynamically create/reuse conda envs with the right environment.
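The create/reuse idea can be sketched as a mapping from a Python version string to a conda env name and a `conda create` command. This is a minimal sketch, assuming the version string has already been obtained somehow (the thread does not establish where it would come from); the function names and the `clearml-py` prefix are hypothetical.

```python
import re
from typing import List, Tuple


def major_minor(python_version: str) -> Tuple[str, str]:
    """Extract ("3", "9") from strings such as "3.9.16"."""
    match = re.match(r"^(\d+)\.(\d+)", python_version.strip())
    if not match:
        raise ValueError(f"unrecognised Python version: {python_version!r}")
    return match.group(1), match.group(2)


def conda_env_name(python_version: str, prefix: str = "clearml-py") -> str:
    """One env per major.minor version, so patch releases share an env."""
    major, minor = major_minor(python_version)
    return f"{prefix}{major}.{minor}"


def conda_create_command(python_version: str) -> List[str]:
    """Build (but do not run) the command that would create the env."""
    major, minor = major_minor(python_version)
    return [
        "conda", "create", "--yes",
        "--name", conda_env_name(python_version),
        f"python={major}.{minor}",
    ]
```

Whether an env already exists could be checked before running the command, so that existing envs are reused rather than recreated.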
Hi @<1523701087100473344:profile|SuccessfulKoala55> How necessary is it to backup redis? What's it used for in the application? I found it quite simple to backup mongo using mongodump, but I'm having some issues with gracefully creating a redis backup.
It works fine in the browser when you try to access None , it gets blocked when you haven't logged in via the /login page first. I didn't foresee how the package may need to access this endpoint, meaning it'll also need a valid JWT in its header.
So I've figured out the issue. After pasting the configuration credentials (the api {} section in the conf), I am assuming that the clearml service needs to connect to the api service somehow. However, because I've deployed it under the umbrella of my university, trying to access our self-hosted ClearML takes you to a None MFA page. I think the clearml service is getting blocked here. Is there any workaround for this?
Hey @<1523701205467926528:profile|AgitatedDove14> , I think the 'execute' function from the clearml-agent is great. I've been testing/using it for a few days, and, while it's a little more hands-on, it has been an amazing workaround for us uni students who have no budget. That said, I've been using clearml-agent execute <job_id> to run jobs on an HPC node. That sa...
Oh I see, I was confused about what "Agent Orchestration" meant on the website. Is the clearml-agent queue not available in the open source?
I see that you can do clearml-agent daemon --queue, so do you think it would be possible to spin up another daemon, which listens to this daemon, which then runs a slurm job?
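The Slurm half of that idea can be sketched as a function that wraps a task ID into an `sbatch` submission. This is only a sketch under assumptions: how the task ID is obtained from the queue is not covered, `sbatch_command` and the `gpu` partition are hypothetical, and the `--id` flag is my assumption about the `clearml-agent execute` CLI (the thread itself writes `clearml-agent execute <job_id>`).

```python
import shlex
from typing import List


def sbatch_command(task_id: str, partition: str = "gpu") -> List[str]:
    """Build (but do not run) an sbatch call that executes one ClearML task."""
    # Quote the task ID so it is safe inside the wrapped shell command.
    inner = f"clearml-agent execute --id {shlex.quote(task_id)}"
    return [
        "sbatch",
        f"--job-name=clearml-{task_id}",
        f"--partition={partition}",  # hypothetical partition name
        f"--wrap={inner}",           # sbatch wraps the string in a shell script
    ]
```

The returned list could be passed to `subprocess.run` by whatever process watches the queue.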
I want to emphasize that I do not mean to undermine your enterprise tier, but I am just trying to work with the limitations of my university's resources, which means I have to use...
Hey @<1523701070390366208:profile|CostlyOstrich36> , do you know of a way to use Podman instead of Docker?
Hey @<1612982606469533696:profile|ZealousFlamingo93> , I had a similar problem with Gitlab tokens not working with the Agent. My issue was slightly different, with the error clearly being a permissions issue with no alternative options, but I see that your output suggests checking that your remote worker has valid credentials, along with making sure you have the right commit.
I resolved the issue by making a gitlab token with a developer role. I found that with private Gitlab r...
Hi @<1523701087100473344:profile|SuccessfulKoala55> thanks for the response. Now my question is more about how the ClearML agent knows which env to use. Will it make envs using conda as an exe if the packages it needs for a project aren't fulfilled by an existing env?
Ok thanks @<1523701070390366208:profile|CostlyOstrich36> . When I launched the docker-compose on my Ubuntu machine, I never even set the variable. We've since migrated to RHEL 8 so I'm using podman to launch it and this is the first time I'm seeing this error. I found docs on it, so I set it using export ClEARML_FILES_HOST="None". For whatever reason that didn't work so I've just resorted to hardcoding...
Hi @<1523701070390366208:profile|CostlyOstrich36> , upon taking a closer look at the documentation, I think I should be using None . I'm going to take a deeper look into the dev tools to see what's up, thanks for the pointer.
Hi @<1523701070390366208:profile|CostlyOstrich36> , is there a public endpoint available for token validation for users? The endpoint I am trying to call is intended only for system/root. When I try to validate a token which is stored in my request header, it is returning error 403. Is there a different way to use this endpoint, or a different one that doesn't require...
Hi @<1523701087100473344:profile|SuccessfulKoala55> , thanks for the heads up. I was able to get my script running yesterday using that exact endpoint. Took a while to find it in the dev tools. Thanks for all your help!
No worries, using this lesson to put backup procedures in place haha
@<1576381444509405184:profile|ManiacalLizard2> Thought so, thanks for the clarification!
Ok thanks I'll look into that! I'm still wondering though, is there no endpoint that returns info on the python executable? I went through ClearML Task endpoints and ClearML api client for tasks as well and I couldn't find one that returns the environment, e.g. the python executable or like a requirements.txt of the experiment. Is there something I am overlooking? Or are is ...
I'm also using these versions of ClearML
I'm using the latest ClearML image from docker; I'm running it on a linux machine and followed those steps for the most part when deploying. Here is the stack trace from the client:
Hey @<1523701087100473344:profile|SuccessfulKoala55> thanks for the reply and apologies for the delayed response. I added an nginx webserver on top of the default open-source clearml to route requests from external clients to the host machine's ports 8080, 8008, and 8081. To browse to the UI, I use my personal domain (which resolves to the machine's static IP), e.g. None (I've redacted it here for privacy if you don't mind).
The only port exposed to the internet is p...
Sounds good. I've looked into the configuration settings information, is this a problem because I didn't set the CLEARML_FILES_HOST env variable in an env file somewhere? With the autogenerated credentials, the api_server key gets generated, but it points to None instead, so I have to fix it. Whereas the files_server key does not even appear.
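A quick check of which host variables are actually visible to the process can help narrow this down. The following is a minimal sketch; `missing_host_vars` is a hypothetical helper, and it assumes CLEARML_API_HOST, CLEARML_WEB_HOST and CLEARML_FILES_HOST are the variables of interest. Lookups are case-sensitive, so a misspelled or differently cased name counts as unset.

```python
import os
from typing import List, Mapping

# Host-related environment variables to check (exact, case-sensitive names).
HOST_VARS = ("CLEARML_API_HOST", "CLEARML_WEB_HOST", "CLEARML_FILES_HOST")


def missing_host_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the host variables that are unset or blank in `env`."""
    return [name for name in HOST_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    for name in missing_host_vars():
        print(f"{name} is not set in this environment")
```

Running it inside the same shell or container that launches the server shows whether an `export` actually reached that process.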