Hi @<1546303293918023680:profile|MiniatureRobin9> could it be that the pipeline logic was created via the clearml-task CLI? If so, I think this is an edge case we should fix. Basically it creates a Task instead of a pipeline, which in essence only affects the UI. To solve it, just run the pipeline locally. Notice that by default, when you start it, it will actually stop the local run and relaunch itself on an agent.
Also, could you open a GitHub issue so we add a flag for it?
what do you have here in your docker compose :
you need to set
CLEARML_DEFAULT_BASE_SERVE_URL:
So it knows how to access itself
See: Add an experiment hyperparameter, and add gpu : True
Glad to hear that! 🙂
I can but that is not a configuration we would want to run with in production
Agreed, I just want to isolate the issue. I think this is the bottom python interface missing some configuration or environment variables
HugeArcticwolf77 from the CLI you cannot control it (but we could probably add that), from code you can:
https://github.com/allegroai/clearml/blob/d17903d4e9f404593ffc1bdb7b4e710baae54662/clearml/datasets/dataset.py#L646
pass compression=ZIP_STORED
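For anyone unsure what ZIP_STORED changes, here is a minimal sketch using only Python's standard zipfile module (it does not touch ClearML). ZIP_STORED writes the bytes into the archive uncompressed, while ZIP_DEFLATED compresses them. The helper name zip_size and the sample payload are made up for illustration.

```python
import io
import zipfile


def zip_size(payload: bytes, compression: int) -> int:
    """Return the size in bytes of an in-memory zip archive holding `payload`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=compression) as archive:
        archive.writestr("data.bin", payload)
    return len(buffer.getvalue())


# Hypothetical, highly repetitive payload, so it compresses well.
payload = b"clearml " * 10_000

stored_size = zip_size(payload, zipfile.ZIP_STORED)      # no compression
deflated_size = zip_size(payload, zipfile.ZIP_DEFLATED)  # zlib compression

print(f"ZIP_STORED: {stored_size} bytes, ZIP_DEFLATED: {deflated_size} bytes")
```

The trade-off is the usual one: storing is faster and costs no CPU, but the archive is at least as large as the raw data. On the ClearML side, per the message above, the same constant would be passed as the compression argument of the call at the linked line.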
Hi UnevenDolphin73
This differentiable storage - does it only work on file additions/removal, or also on intra-file changes?
This is on a file level, meaning that if you change a single byte in a file, the entire file will be packaged in the new version.
Makes sense?
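To make the file-level behaviour concrete, here is a rough sketch of the idea, not ClearML's actual implementation. It assumes change detection by content hash (SHA-256 here is my assumption), and the names snapshot and files_to_package are hypothetical. Flipping one byte in a file marks the whole file for packaging, while untouched files are skipped.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def snapshot(folder: Path) -> Dict[str, str]:
    """Map each file's relative path to the SHA-256 digest of its content."""
    return {
        str(path.relative_to(folder)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def files_to_package(parent: Dict[str, str], child: Dict[str, str]) -> List[str]:
    """Files that are new, or whose content differs from the parent version."""
    return sorted(name for name, digest in child.items() if parent.get(name) != digest)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.bin").write_bytes(b"\x00" * 1024)
    (root / "b.bin").write_bytes(b"\x01" * 1024)
    parent_version = snapshot(root)

    # Flip a single byte in a.bin; b.bin is left untouched.
    data = bytearray((root / "a.bin").read_bytes())
    data[0] = 0xFF
    (root / "a.bin").write_bytes(bytes(data))
    child_version = snapshot(root)

changed = files_to_package(parent_version, child_version)
print(changed)  # only a.bin is listed; it would be re-packaged as a whole file
```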
That wasn't scheduled by ClearML).
This means that from ClearML's perspective they are "manual", i.e. the job itself (by calling Task.init) creates the experiment in the system and fills in all the fields.
But for a k8s job, I'm still unsuccessful.
HelpfulDeer76 When you say "unsuccessful" what exactly do you mean ?
Could it be they are reported to the clearml demo server (the default server if no configuration is found) ?
In the UI you can edit the base container image + add "SETUP SHELL SCRIPT", with any missing "apt update && apt-get install -y ..."
Hi ShinyWhale52
Every execution of the pipeline (by definition) will create a new job based on the pipeline steps
This is the reason you see all the steps twice (the default assumption is that you wish to re-run the step, as this is part of the processing workflow, e.g. training a model).
the model has been overwritten. I guess this is due to this instruction:
This is because you are storing it locally to the same path, it just reflects the fact you just overwrote your model.
To create a...
WackyRabbit7 hmmm, seems like there is a non-regular character inside the diff.
Let me check something
So we basically have two options. One is that when you call Dataset.get_local_copy(), we register it on the Task automatically. The other is more explicit, with something like:
ds = Dataset.get(...)
folder = ds.get_local_copy()
task.connect(ds, name="train")
...
ds_val = Dataset.get(...)
folder = ds_val.get_local_copy()
task.connect(ds_val, name="validate")
wdyt?
Hi ReassuredTiger98
I do not want to share with the clearml-agent workstations.
Long story short, no 😞
The agent is responsible for spinning up all jobs, regardless of user, so basically it has to have a read-only user for all the repositories. I "think" the enterprise version has a vault feature that allows you to store these kinds of secrets on the User itself.
What exactly is the use case?
${PWD} works!
This will be resolved on every call to Task.init (so I would recommend against it), how about "$HOME/" ?
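A small sketch of why a value based on ${PWD} is fragile. It uses Python's os.path.expandvars as a stand-in (ClearML's own resolution code is not shown in this thread), and all paths are made up. The same template expands to a different location whenever the environment changes, while a $HOME-based one stays put.

```python
import os

# Hypothetical templates, standing in for a path value in a config file.
pwd_template = "${PWD}/cache"
home_template = "$HOME/cache"

os.environ["HOME"] = "/home/user"  # made-up home directory for the demo

# Simulate launching the same script from two different working directories.
os.environ["PWD"] = "/home/user/project_a"
first = os.path.expandvars(pwd_template)
home_first = os.path.expandvars(home_template)

os.environ["PWD"] = "/home/user/project_b"
second = os.path.expandvars(pwd_template)
home_second = os.path.expandvars(home_template)

print(first, second)            # two different locations
print(home_first, home_second)  # the same location both times
```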
Do we support GPUs in a) docker mode b) k8s glue?
yes on both
Is there a good reference to get started with k8s glue?
A few folks here already set it up, do you have a k8s cluster with GPU support ?
Hi SubstantialElk6
I can't see that it was removed, could you send the full log ?
WackyRabbit7 How do I reproduce it ?
See the docker_setup_bash_script argument.
It will be executed (no need for the #!/bin/bash, btw) before the env setup starts inside the container, so apt-get and the like can be executed if needed. Notice that if this is something that always needs to be executed, you can put the same list of commands here: [None](https://github.com/allegroai/clearml-agen...
My question is what happens if I launch in parallel multiple doit commands that create new Tasks.
Should work out of the box.
I would like to confirm that current_task ...
Correct.
If you passed the correct path it should work (if it fails it would have failed right at the beginning).
BTW: I think it is clearml-agent --config-file <file here> daemon ...
Hi ShortElephant92
This isn't an issue if the user is using a Service Account JSON Key,
Are you saying that when you are using GS python sdk directly it works?
For context, the google cloud storage SDK allows an authorized user credentials.
ClearML actually uses the google python SDK; the JSON is just a way to pass the credentials to the google SDK. I'm not sure it has to point to a "service account". Where did that requirement come from ?
is it from here ` Service account info was n...
okay, let me know if it works
I am creating this user
Please explain, I think this is the culprit ...