I see...
Currently (and this will change soon) the entire delta is stored in a single file, so there is no real way to download a "subset" of the data, only a parent version.
Let's say that this small dataset has an ID ....
Yes this would be exactly the way to do so:
`param = {'dataset': small_train_dataset_id_here}
task.connect(param)
dataset_folder = Dataset.get(param['dataset']).get_local_copy()`
... Locally it will use the small_train_dataset_id_here, then whe...
Hi @<1614069770586427392:profile|FlutteringFrog26>
Since you have the Task ID, you do:
task = Task.get_task("task id here")
Then to get the models
models = task.models["output"]
`models` is both a list and a dict: if you want the last one, do `last_model = models[-1]`; if you know the best model's name, do `model = models["best model"]` (notice the model name is the exact one you see in the UI). Once you have the model object you can get a copy with `model.get_lo...
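Put together, the lookup described above might look like the sketch below. `pick_output_model` and `download_output_model` are hypothetical helper names, and `get_local_copy()` is the ClearML model method I am assuming the truncated sentence refers to. The `clearml` import is kept inside the function so the selection logic can be tried without a server.

```python
def pick_output_model(models, name=None):
    """Return a model by its UI name, or the most recent one if no name is given.

    `models` is expected to behave like task.models["output"], i.e. to
    support both integer indexing and lookup by model name.
    """
    if name is not None:
        return models[name]
    return models[-1]


def download_output_model(task_id, name=None):
    """Fetch the task, pick one of its output models, and return a local file path."""
    from clearml import Task  # imported lazily; needs a configured clearml setup

    task = Task.get_task(task_id=task_id)
    model = pick_output_model(task.models["output"], name)
    return model.get_local_copy()
```

Calling `download_output_model("task id here")` would return the last output model; passing `name="best model"` selects by the name shown in the UI.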
so i end up having to clone the other ones manually in my code
Hi ConvolutedChicken69
Yes the problem is that there is no standard for multi repo environments
The best solution I can come up with is using git-submodules or packaging the auxiliary repo as wheels. wdyt?
Hmm I see, add this for example
extra_docker_shell_script: ["rm ~/.bashrc", "echo removed bashrc"]
@<1787653555927126016:profile|SoggyDuck67> notice the binary field in the Task "execution" tab; if for some reason it says "python3.10", it will try to use Python 3.10 when running it.
That said, if it does not find the requested Python version, it should output a warning and default to the Python installed.
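To make the fallback behavior concrete, here is a minimal sketch of "use the requested binary if it exists, otherwise warn and use the installed Python". This is not the agent's actual code; `resolve_python` is a hypothetical name, and looking the binary up on `PATH` is an assumption.

```python
import shutil
import sys
import warnings


def resolve_python(requested_binary):
    """Return the path of the requested interpreter, or fall back to the current one."""
    found = shutil.which(requested_binary)
    if found:
        return found
    # Requested version is missing: warn and default to the running interpreter.
    warnings.warn(
        f"{requested_binary!r} not found on PATH, falling back to {sys.executable}"
    )
    return sys.executable
```

If the log shows a warning of this kind, the binary field and the Python versions installed on the machine (or in the container) are the first things to compare.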
If you can provide the full log it will be helpful to see what happened there
Hi PerplexedCow66
I would like to know how to serve a model, even if I do not use any serving engine
What do you mean no serving engine, i.e. custom code?
Besides that, how can I manage authorization between multiple endpoints?
Are you referring to limiting access to all the endpoints?
How can I manage API keys to control who can access my endpoints?
Just to be clear, accessing the endpoints has nothing to do with the clearml-server credentials, so are you asking how to...
ShallowGoldfish8 this call does that:
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/examples/advanced/execute_remotely_example.py#L127
I "think" you are referring to the venvs cache, correct?
If so, then you have to set it in the clearml.conf on the host (agent) machine. Make sense?
I can then programmatically choose which file to import with importlib. Is there a way to tell clearml programmatically to analyze the files, so it can build up the requirements correctly?
Sadly no
It analyzes the running code, then if it decides it is not a self contained script it will analyze the entire repo ...
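As an illustration of why dynamic imports are hard to pick up, here is a generic sketch of a static import scan. It is not ClearML's analyzer, and `static_imports` is a hypothetical name. It only sees plain `import` statements, so a module chosen at runtime through `importlib.import_module` never shows up.

```python
import ast


def static_imports(source):
    """Collect top-level package names from plain import statements in source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


# The source is only parsed, never executed, so undefined names are fine here.
example = '''
import importlib
import numpy as np
from os import path

backend = importlib.import_module("torch" if use_gpu else "sklearn")
'''

print(sorted(static_imports(example)))  # ['importlib', 'numpy', 'os']
```

Neither `torch` nor `sklearn` appears in the result, because their names exist only as string arguments.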
I just saw that Task.create takes
Task.create is not Task.init. It is meant to allow you to create new Tasks (think Jobs) from ...
Hi FierceHamster54
Thanks for bringing it up
... in terms of secret management/key-value stores
Currently the open-source version does not include Vault support (e.g. secret management). This is something they added to the enterprise version a few versions ago, and as far as I understand it is a feature with per user/project/company granularity (i.e. company-wide settings merged with project settings merged with user-specific ones).
Is this what you are looking for, or am I missing something?
Also, what's the additional p doing in the last line of the screenshot?
Have to get glue setup, which I couldn't understand fully, so that's a different topic
I suggest using the apply template setup (basically you provide a Job/Service template, and it uses that to set up k8s jobs based on the Tasks coming in from the specific queue)
I can see all the steps like git clone,
git clone has nothing to do with "env setup"; this is bringing the code, and you cannot skip that one. That said, this is why the git repository itself is cached on the host machine, so it is fast
... There may be some odd package that need to be installed because one of our DS is experimenting ... But all that we can see what is happening.
Even if everything is preinstalled, it verifies that the packages match, and this might take a long time. It's just pip being ...
This code will give you one graph titled "loss" with two series: (1) trains (2) loss
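The code being discussed is not shown here. For comparison, a minimal sketch of explicit reporting that produces this layout: `title` names the graph and `series` names each line on it. `report_losses` and the sample values are hypothetical. `report_scalar(title, series, value, iteration)` is the logger method assumed, and the logger is passed in (e.g. from `Task.current_task().get_logger()`) so the function can be exercised with a stand-in object.

```python
def report_losses(logger, values_by_series, iteration):
    """Report several values onto the single graph titled "loss"."""
    for series, value in values_by_series.items():
        # Same title -> same graph; a different series -> a separate line on it.
        logger.report_scalar(
            title="loss", series=series, value=value, iteration=iteration
        )
```

For example, `report_losses(logger, {"trains": 0.5, "loss": 0.4}, iteration=3)` would add one point to each of the two series.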
Long story short, work in progress.
BTW: are you referring to manual execution or trains-agent?
Hi TrickyRaccoon92
If you are reporting to tensor-board, then "iteration" equals step. Is this the case?
So the docker didn't use the DNS of the host?
I'm assuming it is not configured on your DNS, otherwise it would have been resolved...
Sounds great! Let me know what you find out
Hmm maybe we should add a test once the download is done, comparing the expected file size and the actual file size, and if they are different we should redownload ?
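A minimal sketch of that idea, assuming the expected size is known up front (for example from stored metadata). `download_with_size_check` and its parameters are hypothetical names, and `download` stands for whatever callable actually fetches the file.

```python
import os


def download_with_size_check(download, destination, expected_size, max_attempts=3):
    """Call download(destination) until the file on disk has the expected size.

    `download` is any callable that writes the file to `destination`.
    Raises IOError if the size still differs after max_attempts.
    """
    actual_size = None
    for _ in range(max_attempts):
        download(destination)
        actual_size = os.path.getsize(destination)
        if actual_size == expected_size:
            return destination
        # Size mismatch: drop the partial file and try again.
        os.remove(destination)
    raise IOError(
        f"{destination}: expected {expected_size} bytes, "
        f"last attempt produced {actual_size}"
    )
```

A size comparison only catches truncated downloads; detecting corrupted content of the right length would need a checksum instead.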
DefeatedCrab47 yes that is correct. I actually meant if you see it on the tensorboard's UI
Anyhow, if it is there, you should find it in the Task's Results, under Debug Samples
I don't see any requests
This points to configuration, specifically maybe it is directed to a different server?!
BTW copying the cmd line assumes that you are running it on the same machine...
Yes, that means the nvidia drivers are present (as you mentioned the GPU seems to be detected).
Could you check you have libnvidia-ml.so.1 inside the container ?
For example in /usr/lib/nvidia-XYZ/
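One way to run that check from inside the container is a small recursive search. This is a sketch: `find_library` is a hypothetical helper, and the default search roots are assumptions about where the driver libraries might be mounted.

```python
import glob
import os


def find_library(
    name="libnvidia-ml.so.1",
    roots=("/usr/lib", "/usr/lib64", "/usr/local/nvidia"),
):
    """Return every path under the given roots whose file name equals `name`."""
    matches = []
    for root in roots:
        # "**" with recursive=True also matches files directly inside `root`.
        matches.extend(glob.glob(os.path.join(root, "**", name), recursive=True))
    return sorted(matches)
```

An empty list from `find_library()` would suggest the library is not visible inside the container, even though the drivers are present on the host.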
JitteryCoyote63
are the calls from the agents made asynchronously/in a non blocking separate thread?
You mean, are requests on the apiserver processed multi-threaded / multi-processed?
ReassuredTiger98 you mean when calling clearml-init? Or the default value?
Hi VexedCat68
Can you supply more details on the issue? (Probably best to open a GitHub issue with all the details there, so we have better visibility.)
wdyt?
Hi @<1526371965655322624:profile|NuttyCamel41>
How are you creating the model? Specifically, what do you have in "config.pbtxt"?
Specifically, any Python code should be in the pre/post processing code (which is actually not running on the GPU instance)
Bugs, definitely GitHub, this is the easiest to track.
Documentation, if these are small issues, Slack is fine, otherwise, GitHub issue.
Regarding the documentation, we are working on another iteration of improvement, but if you find inaccuracies/broken links please report