No worries, I'm using this lesson to put backup procedures in place haha
No, I'm using a conda environment for execution. I've set the package manager to conda and provided the path to a pre-built conda environment, so I've also set conda_env_as_base_docker to true.
Makes sense. However, I'm not able to get Docker on my distro; I'm using Apptainer instead. Do you know of any alternatives?
Hey @<1523701087100473344:profile|SuccessfulKoala55> thanks for the reply and apologies for the delayed response. I added an nginx webserver on top of the default open-source ClearML to route requests from external clients to the host machine's ports 8080, 8008, and 8081. To browse to the UI, I use my personal domain (which resolves to the machine's static IP), e.g. None (I've redacted it here for privacy, if you don't mind).
The only port exposed to the internet is p...
Yes @<1523701087100473344:profile|SuccessfulKoala55> , I'm using a Lua script which makes a light API call to /users.get_current_user to check the validity of the JWT, not just its existence, before allowing access to the /files endpoint. This way, trying to access it before logging in (no token or an invalid token) gets blocked, and logging in (a valid token in the request header) grants access to the files. Works all fine and dandy in the browser, but I'm now realizing the application itself ...
Is there any information on how the ClearML Python package authenticates itself with the ClearML server? I am almost certain that the endpoint I am using to check the token validity can only handle the browser session and fails when the package tries to access files, i.e. the server isn't set up correctly to validate the JWT from the ClearML Python package.
Sounds good. I've looked into the configuration settings information. Is this a problem because I didn't set the CLEARML_FILES_HOST env variable in an env file somewhere? With the autogenerated credentials, the api_server key gets generated, but it points to None instead, so I have to fix it, whereas the files_server key does not even appear.
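For reference, here is a minimal sketch of supplying all three endpoints through environment variables instead of relying on the autogenerated credentials. CLEARML_WEB_HOST, CLEARML_API_HOST, and CLEARML_FILES_HOST are the variable names from the ClearML docs; the helper names and the example.com URLs are hypothetical placeholders, and I'm assuming the variables need to be in place before the clearml package is imported.

```python
import os


def clearml_host_env(web_url, api_url, files_url):
    """Map the three server URLs onto the ClearML host environment variables."""
    return {
        "CLEARML_WEB_HOST": web_url.rstrip("/"),
        "CLEARML_API_HOST": api_url.rstrip("/"),
        "CLEARML_FILES_HOST": files_url.rstrip("/"),
    }


def apply_env(values, environ=None):
    """Write the values into the process environment (or a dict, for testing)."""
    target = os.environ if environ is None else environ
    target.update(values)
    return target


if __name__ == "__main__":
    # Hypothetical proxied endpoints; replace with the real ones.
    apply_env(
        clearml_host_env(
            "https://clearml.example.com",
            "https://clearml.example.com/api",
            "https://clearml.example.com/files",
        )
    )
    print(os.environ["CLEARML_FILES_HOST"])
```

The point is only that the files server URL is set explicitly, so a missing files_server key in the generated credentials no longer matters.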
So I read in the ClearML docs that the file server has no security whatsoever and that you guys recommend using object storage (S3/Azure/etc.). Sadly, I currently don't have the resources to use those. However, I do know that the webserver has JWT auth, so the proxy uses some Lua scripting to verify a valid token before allowing access to the /files endpoint.
It works fine in the browser: when you try to access None , it gets blocked if you haven't logged in via the /login page first. I didn't foresee that the package may need to access this endpoint too, meaning it'll also need a valid JWT in its header.
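The check the proxy performs can be sketched in Python roughly as follows. This is not the actual Lua script, just the same idea under stated assumptions: the token is forwarded as an `Authorization: Bearer` header, the API server is reachable at a base URL such as http://localhost:8008, and the function name `token_is_valid` and the injectable `opener` parameter are hypothetical.

```python
import json
import urllib.error
import urllib.request


def token_is_valid(token, api_base, opener=urllib.request.urlopen):
    """Return True only if the API server accepts the token for users.get_current_user."""
    if not token:
        return False  # no token at all: block without calling the API
    request = urllib.request.Request(
        api_base.rstrip("/") + "/users.get_current_user",
        data=json.dumps({}).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,  # assumed header format
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with opener(request, timeout=5) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False  # e.g. 401/403 for an expired or rejected token
    except urllib.error.URLError:
        return False  # API server unreachable: fail closed
```

A proxy doing this would let the /files request through only when the function returns True. Whether the token sent by the Python package passes the same check is exactly the open question here.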
Hi @<1523701070390366208:profile|CostlyOstrich36> , thanks for replying. I'm referring to the option to upload from a local device. When I click this button a file explorer pops up, so I'm assuming it's intended to allow image uploads from my local machine.
I'm running the default ports, 8080, 8081, and 8008 for the webserver, fileserver, and apiserver, respectively. I just modified the deployment and put a proxy in front of it so I could expose it to the internet and use it elsewhere. My current configuration involves changing the values of api_server and files_server to use the URL endpoints instead of the ports directly:
api {
web_server:
api_server:
files_server:
``...
No I don't. The key files_server doesn't exist in the credentials. When I create credentials I've been correcting the endpoints for the api_server and files_server each time.
I'm using the latest ClearML image from Docker; I'm running it on a Linux machine and followed those steps for the most part when deploying. Here is the stack trace from the client:
I'm also using these versions of ClearML
@<1576381444509405184:profile|ManiacalLizard2> Thought so, thanks for the clarification!
Hi @<1523701070390366208:profile|CostlyOstrich36> , is there a public endpoint available for token validation for users? The endpoint I am trying to call is intended only for system/root. When I try to validate a token which is stored in my request header, it is returning error 403. Is there a different way to use this endpoint, or a different one that doesn't require...
Hi @<1523701087100473344:profile|SuccessfulKoala55> thanks for the response. Now my question is more about how the ClearML agent knows which env to use. Will it create envs using conda as an executable if the packages it needs for a project aren't fulfilled by an existing env?
Hey @<1523701070390366208:profile|CostlyOstrich36> , do you know of a way to use Podman instead of Docker?
Hey @<1612982606469533696:profile|ZealousFlamingo93> , I had a similar problem with GitLab tokens not working with the Agent. My issue was slightly different, with the error clearly being a permissions issue with no alternative options, but I see that your output suggests checking that your remote worker has valid credentials, as well as making sure you have the right commit.
I resolved the issue by making a GitLab token with a developer role. I found that with private Gitlab r...
Ok thanks I'll look into that! I'm still wondering though, is there no endpoint that returns info on the python executable? I went through ClearML Task endpoints and ClearML api client for tasks as well and I couldn't find one that returns the environment, e.g. the python executable or like a requirements.txt of the experiment. Is there something I am overlooking? Or are is ...
So I've figured out the issue. After pasting the configuration credentials (the api {} section in the conf), I am assuming that the clearml service needs to connect to the API service somehow. However, because I've deployed it under the umbrella of my university, trying to access our self-hosted ClearML takes you to a None MFA page. I think the clearml service is getting blocked here. Is there any workaround for this?
Looking at this for reference:
None
When I try to run clearml-init on another device:
Verifying credentials ...
Error: could not verify credentials
Has anyone encountered this error and knows how to fix it?
Unfortunately I don't think I can run in --docker mode. I'm on a RHEL 8 system which only supports podman (sucks I know, but it's a university thing). Are there any other options? I was thinking that if there was some way to pull the Python version from the task metadata, I could dynamically create/reuse conda envs with the right environment.
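To make the create/reuse idea concrete, here is a rough sketch of the decision logic only. How the Python version is obtained from the task is not shown and is still the open question; the `clearml-py<major>_<minor>` naming scheme and both function names are hypothetical. The generated command is a plain `conda create` invocation.

```python
import re


def parse_major_minor(python_version):
    """Extract (major, minor) from a version string such as '3.10.12'."""
    match = re.match(r"^(\d+)\.(\d+)", python_version.strip())
    if not match:
        raise ValueError(f"unrecognized Python version: {python_version!r}")
    return match.group(1), match.group(2)


def plan_conda_env(python_version, existing_env_names):
    """Return (env_name, create_command); create_command is None if the env exists."""
    major, minor = parse_major_minor(python_version)
    env_name = f"clearml-py{major}_{minor}"  # hypothetical naming scheme
    if env_name in existing_env_names:
        return env_name, None  # reuse the existing environment
    command = [
        "conda", "create", "--yes",
        "--name", env_name,
        f"python={major}.{minor}",
    ]
    return env_name, command


if __name__ == "__main__":
    name, command = plan_conda_env("3.10.12", {"base"})
    print(name, command)
```

The returned command list could be handed to `subprocess.run` once the version lookup is solved; the list of existing env names could come from `conda env list --json`.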
Oh I see, I was confused by what "Agent Orchestration" meant on the website. Is the clearml-agent queue not available in the open source version?
I see that you can do clearml-agent daemon --queue, so do you think it would be possible to spin up another daemon, which listens to this daemon, which then runs a slurm job?
I want to emphasize that I do not mean to undermine your enterprise tier, but I am just trying to work within the limited resources of my university, which means I have to use...
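As a rough illustration of the last hop in that idea, the sketch below builds an `sbatch` command that runs a single task through `clearml-agent execute --id`. Everything around it is an assumption: how the listener learns about queued task IDs is not shown, and the function name, job-name prefix, and optional partition argument are hypothetical.

```python
import shlex


def sbatch_command_for_task(task_id, partition=None):
    """Build an sbatch argv that runs one ClearML task inside a Slurm job."""
    # Command executed inside the Slurm allocation.
    inner = "clearml-agent execute --id " + shlex.quote(task_id)
    command = ["sbatch", "--job-name=clearml-" + task_id[:8]]
    if partition:
        command.append("--partition=" + partition)
    command.append("--wrap=" + inner)
    return command


if __name__ == "__main__":
    # Hypothetical task ID, for illustration only.
    print(sbatch_command_for_task("0123456789abcdef", partition="gpu"))
```

The returned list can be passed to `subprocess.run` on a node where `sbatch` is available; whether this fits the queue semantics of the open source server is untested.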
Does ClearML have configuration settings or environment variables that allow you to specify additional headers for HTTP requests?
Everything works fine when I access the /files endpoint from a browser after logging in via the webserver. However, when I run an experiment using the ClearML Python package and it tries to upload the artifacts to the /files endpoint, it gets blocked with a 401 Unauthorized error.
I’ve checked the ClearML Python package and it seems to be attaching the auth headers correctly. However, the JWT token it’s using is different from the one used by the browser, and it’s not being accepted by the s...
Hi @<1523701070390366208:profile|CostlyOstrich36> , upon taking a closer look at the documentation, I think I should be using None . I'm going to take a deeper look into the dev tools to see what's up, thanks for the pointer.
Ok thanks @<1523701070390366208:profile|CostlyOstrich36> . When I launched the docker-compose on my Ubuntu machine, I never even set the variable. We've since migrated to RHEL 8, so I'm using podman to launch it, and this is the first time I'm seeing this error. I found docs on it, so I set it using export ClEARML_FILES_HOST="None". For whatever reason that didn't work so I've just resorted to hardcoding...