I guess I am out of ideas. The config is wrong somewhere. Maybe double check all the configs? It's taking the value from somewhere!
I don't know how to get past this. My k8 pods shouldn't need to reach out to the public file server URL.
URLs that it was uploaded with, as that URL could change.
How would that change, the actual files are there ?
Just curious, if https://github.com/allegroai/clearml-helm-charts/blob/19a6785a03b780c2d22da1e79bcd69ac9ffcd839/charts/clearml-agent/values.yaml#L50 is a value I can set, where is it used? It would be great if it overrides the Dataset.get embedded url parsed from my clearml conf file
file and redirect the public url to k8 dns url?
Yes! that would work, Nice!
You can add it into the extra_docker_shell_script
it will be executed in any pod the clearml-glue will spin (obviously this needs to be configured on the pod running the clearml k8s glue)
https://github.com/allegroai/clearml-agent/blob/ba2db4e727b90e595df2b13f458d9580659bf12e/docs/clearml.conf#L152
is this a config file on your side or something I can change, if we had enterprise version?
Yes, this is one of the things you can configure
So this is an additional config file with enterprise?
Extension to the "clearml.conf" capabilities
Is this new config file deployable via helm charts?
Yes, you can also set it company/user wide using the clearml Vault feature (again enterprise, sorry)
BoredHedgehog47 that actually depends on the container, are you running as root inside the container ?
if not I think the easiest hack is to always map /etc/hosts as a k8s secret file?
Okay so I just tried this and immediately I'm getting errors Failed to establish a new connection:
because the file server URL in my clearml.conf is the k8 dns name. So I'm sort of stuck because if I revert it to the public DNS name, then upon Dataset.get
I will get same failure.
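For anyone debugging the same "Failed to establish a new connection" error, a quick way to see which file server URL is actually reachable from inside a pod is a plain TCP check. This is only a rough sketch using the Python standard library, not anything ClearML provides; the helper names and the example URLs in the usage note are made up for illustration.

```python
import socket
from urllib.parse import urlparse


def host_and_port(url: str) -> tuple[str, int]:
    """Split a URL into (hostname, port), falling back to the scheme default."""
    parsed = urlparse(url)
    if not parsed.hostname:
        raise ValueError(f"no hostname in URL: {url!r}")
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Running `can_connect("https://files.example.com")` and `can_connect("http://fileserver.clearml.svc:8081")` (both placeholder URLs) from a shell inside the pod should show whether the public name, the internal name, or neither is reachable from there.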
ahhh it's possible my clearml.conf was using the public urls when I made it. Let me try this
Is this a config file on your side or something I can change, if we had enterprise version?
The pods should be able to use internal DNS names
When a remote task runs
Dataset.get()
it is not using the correct URL
BoredHedgehog47 it will get the link the data was Registered with, when creating the Dataset.
This has Nothing to do with the local configuration, it can point to any arbitrary file location on the internet.
It was created there because, at the time of the dataset creation, someone (manually or via the config) set a specific host as the file location, and the files were uploaded to that host (again, you can have a mixture of hosts; the backend is not aware of them, it just keeps them as strings)
Make sense ?
Just curious, if
is a value I can set, where is it used?
It is used when Creating a dataset from inside the cluster (i.e. when launching using the clearml k8s glue),
it will have No effect on what users have on their local machines
i.e. they can always point to a diff server.
That said, when users create their initial clearml.conf and copy-paste the info from the web UI, this value (or it might be another one, I'll double check later) sets the initial configuration they copy-paste into their local clearml.conf.
I think what happened is that you had a different one set when you first spun it up, and this was the configuration you used to create your local clearml.conf; then you changed it, but your clearml.conf stayed the same.
wdyt?
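To check the theory above, it may help to look at which files_server value a local clearml.conf really contains. The snippet below is a hedged sketch: it assumes the usual `files_server: <url>` line inside the `api` section and uses a simple regex rather than a real HOCON parser, so treat it as a quick inspection aid only. The sample hosts are placeholders.

```python
import re

# Matches lines like `files_server: https://host` or `files_server = "https://host"`.
# Commented-out lines (starting with '#') are not matched.
FILES_SERVER_RE = re.compile(
    r"""^\s*files_server\s*[:=]\s*["']?([^"'\s#]+)""", re.MULTILINE
)


def find_files_server(conf_text: str):
    """Return the first files_server value found in the config text, or None."""
    match = FILES_SERVER_RE.search(conf_text)
    return match.group(1) if match else None


# Hypothetical config fragment, for illustration only.
SAMPLE = """
api {
    web_server: https://app.example.com
    api_server: https://api.example.com
    files_server: https://files.example.com
}
"""

print(find_files_server(SAMPLE))
```

Pointing it at the real file would be something like `find_files_server(open(os.path.expanduser("~/clearml.conf")).read())`, assuming the config lives at that default location.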
IMO, the dataset shouldn't be tied to the clearml.conf URLs that it was uploaded with, as that URL could change. It should respect the file server URL the agent has.
If you submit your task using the CLI does it still work?
You could change infrastructure or hosting, and now your data is associated with the wrong URL
Yeah that makes sense, so have it on a specific dns name? (this is usually the case with k8s deployments)
AgitatedDove14 Will I need sudo permissions if I add this script to extra_docker_shell_script
echo "192.241.xx.xx venus.example.com venus" >> /etc/hosts
When I exec into the pod, it says I need sudo, but wondering if extra_docker_shell_script
is executed as sudo already?
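Whether the `echo ... >> /etc/hosts` line works depends on the user the container runs as, so one way to find out is to check from inside the pod before relying on it. A small sketch, assuming a Linux container with Python available; the function names are hypothetical and the IP in the usage note is a documentation placeholder, not the real address.

```python
import os


def hosts_line(ip: str, *names: str) -> str:
    """Build an /etc/hosts entry: the IP followed by one or more hostnames."""
    if not names:
        raise ValueError("at least one hostname is required")
    return " ".join((ip, *names))


def running_as_root() -> bool:
    """True if the current process has an effective UID of 0 (Unix only)."""
    return os.geteuid() == 0


def can_append_to_hosts(path: str = "/etc/hosts") -> bool:
    """True if the current user is allowed to write to the hosts file."""
    return os.access(path, os.W_OK)
```

For example, `hosts_line("192.0.2.10", "venus.example.com", "venus")` gives the same kind of line as the echo command above, and `running_as_root()` / `can_append_to_hosts()` tell you whether appending it would need elevated permissions in that container.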
Are there any work arounds to this issue? Our team is evaluating this product to potentially buy enterprise license. If we can't fetch data this is a problem.
The problem is due to tight security on this k8 cluster, the k8 pod cannot reach the public file server url which is associated with the dataset.
Understood, that makes sense, if this is the case then the path_substitution
feature is exactly what you are looking for
When I deployed the webserver, I changed the value https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml/values.yaml#L36 to be the public file server URL. Then in the UI, I copied the blob from the settings/API keys. Which had the public URLs. After that I did my data uploads which worked fine as they used public URLs. The problem is due to tight security on this k8 cluster, the k8 pod cannot reach the public file server url which is associated with the dataset.
I think the best change would be to respect the value set at https://github.com/allegroai/clearml-helm-charts/blob/19a6785a03b780c2d22da1e79bcd69ac9ffcd839/charts/clearml-agent/values.yaml#L50 so you could change it down the road if infra/hosting changes. Also in this case, I'm uploading the data to the public file server URL, but my k8 pod can't reach that for security reasons.
So this is an additional config file with enterprise? Is this new config file deployable via helm charts?
So you could change it down the road if infra/hosting changes.
Internally this is doable and Enterprise edition supports it, at the end this is stored in DBs
Also in this case, I'm uploading the data to the public file server URL, but my k8 pod can't reach that for security reasons.
Yes, this is solvable as well (again sorry for pointing it out, but only in the enterprise version), where you can specify per client or globally:
```
path_substitution = [
    # Replace registered links with local prefixes,
    # Solve mapping issues, and allow for external resource caching.
    # {
    #     registered_prefix = " "
    #     local_prefix = "file:///mnt/shared/bucket/research"
    # },
    # {
    #     registered_prefix = "file:///mnt/shared/folder/"
    #     local_prefix = "file:///home/user/shared/folder"
    # }
]
```
Wdyt?
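To make the idea behind path_substitution concrete: a registered link whose prefix matches a rule gets that prefix swapped for a local one before the file is fetched. The snippet below is only a conceptual sketch of that prefix rewrite, not ClearML's actual implementation; the function name and the example hosts are assumptions for illustration.

```python
def substitute_prefix(url: str, rules: list[tuple[str, str]]) -> str:
    """Rewrite url using the first (registered_prefix, local_prefix) rule that matches."""
    for registered_prefix, local_prefix in rules:
        if url.startswith(registered_prefix):
            # Keep everything after the registered prefix unchanged.
            return local_prefix + url[len(registered_prefix):]
    # No rule matched: the registered link is used as-is.
    return url


# Hypothetical mapping from a public file server to an in-cluster service name.
RULES = [("https://files.example.com", "http://clearml-fileserver:8081")]

print(substitute_prefix("https://files.example.com/project/dataset.zip", RULES))
```

With a rule like this, the dataset keeps the public URL it was registered with, while a pod that can only reach the internal service fetches the same path from there.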
You could change infrastructure or hosting, and now your data is associated with the wrong URL
I suppose a short term hack would be to just edit the /etc/hosts
file and redirect the public url to k8 dns url?