So the way it works, anything in the "extra_docker_shell_script" section is executed inside the container every time the container spins up. I'm thinking that the extra_docker_shell_script will pull the environment file from an S3 bucket and apply all "secrets" (or the secrets are embedded into the startup bash script, like "export AWS_SECRET=abcdef"). That said, this will not be on a per-user basis 🙂
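For example, a rough sketch of what that could look like in the agent's clearml.conf (the bucket name, the env-file path, and the assumption that the image already has the aws CLI are all hypothetical):

agent {
    extra_docker_shell_script: [
        # pull a KEY=value env file from a (hypothetical) bucket
        "aws s3 cp s3://my-bucket/secrets.env /tmp/secrets.env",
        # export every entry so the task process sees them
        "export $(grep -v '^#' /tmp/secrets.env | xargs)",
    ]
}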
Does that help?
GiganticTurtle0 I'm not sure I follow, what do you mean by indexing the arguments? Can you post a short usage example ?
Hi IrateBee40
What do you have in your ~/clearml.conf ?
Is it pointing to your clearml-server ?
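For reference, the api section of ~/clearml.conf should point at your server, roughly like this (the IP is a placeholder):

api {
    web_server: http://<your-server-ip>:8080
    api_server: http://<your-server-ip>:8008
    files_server: http://<your-server-ip>:8081
    credentials {
        access_key: "your-access-key"
        secret_key: "your-secret-key"
    }
}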
An upload of 11GB took around 20 hours which cannot be right.
That is very, very slow, this is roughly 152 KB/s ...
task.wait_for_status()
task.reload()
task.artifacts["output"].get()
in order to work with ssh cloning, one has to manually install openssh-client to the docker image, looks like that
Correct, you have to have SSH inside the container so that git can use it.
You can always install with the following setup inside your agent's clearml.conf:
extra_docker_shell_script: ["apt-get install -y openssh-client", ]
https://github.com/allegroai/clearml-agent/blob/73625bf00fc7b4506554c1df9abd393b49b2a8ed/docs/clearml.conf#L145
Hi JealousParrot68
You mean by artifact names ?
it fails during add_step stage for the very first step, because task_overrides contains invalid keys
I see, yes I guess it makes sense to mark the pipeline as Failed 🙂
Could you add a GitHub issue on this behavior, so we do not miss it ?
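For context, a minimal sketch of the kind of call that can trigger it (the id and the override key here are hypothetical); an invalid key path inside task_overrides is what makes add_step fail:

from clearml import PipelineController

pipe = PipelineController(name="example pipeline", project="examples", version="1.0.0")
pipe.add_step(
    name="stage_one",
    base_task_id="base_task_id_here",          # hypothetical task id
    task_overrides={"script.branch": "main"},  # keys must be valid task field paths
)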
If this doesn't help:
Go to your ~/clearml.conf file, at the bottom of the file you can add agent.python_binary and change it to the location of python3.6 (you can run which python3.6 to get the full path):
agent.python_binary: /full/path/to/python3.6
@<1546303293918023680:profile|MiniatureRobin9>
, not the pipeline itself. And that's the last part I'm looking for.
Good point, any chance you want to PR this code snippet ?
def add_tags(self, tags):
    # type: (Union[Sequence[str], str]) -> None
    """
    Add Tags to this pipeline. Old tags are not deleted.
    When executing a Pipeline remotely (i.e. launching the pipeline from the UI/enqueuing it), this method has no effect.
    :param tags: A li...
GiddyTurkey39
I would guess your VM cannot access the trains-server, meaning an actual network configuration issue.
What are the VM IP and the trains-server IP? (the first two numbers are enough, e.g. 10.1.X.Y, 174.4.X.Y)
Hmm, it might be sub-sampling on large scalar plots (so that we do not "kill" the ui), but I remember that it only happens above 50k samples. (when you zoom in, do you still get the 0.5 values?)
I'm assuming those errors are from the Triton containers? Were you able to run the simple PyTorch MNIST serving example from the repo?
Hi @<1560798754280312832:profile|AntsyPenguin90>
The image itself is uploaded in a background process, flush just triggers the start of the process.
Could it be that it shows up a few seconds later?
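For example, a minimal sketch (the project/task names and image path are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="image upload")  # hypothetical names
logger = task.get_logger()
logger.report_image("debug", "sample", iteration=0, local_path="sample.png")
# flush() only kicks off the background upload, it does not wait for it to finish
logger.flush()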
You cannot change the user once you have mounted the shared folder with either CIFS or NFS.
current task fetches the good Task
Assuming you fork the process, the "global instance" is passed to the subprocess. Assuming the sub-process was spawned (e.g. Popen), then an environment variable with the Task's unique ID is passed. Then when you call Task.current_task it "knows" the Task was already created and it will fetch the state from the clearml-server and create a new Task object for you to work with.
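Something along these lines, as a sketch of the sub-process case (project/task names are placeholders):

from multiprocessing import Process
from clearml import Task

def worker():
    # with a spawned process the Task's unique ID arrives via an environment variable,
    # with fork the global instance is inherited directly; either way current_task()
    # gives you a Task object backed by the same Task on the clearml-server
    task = Task.current_task()
    task.get_logger().report_text("reporting from the sub-process")

if __name__ == "__main__":
    Task.init(project_name="examples", task_name="subprocess demo")  # hypothetical names
    p = Process(target=worker)
    p.start()
    p.join()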
BTW: please use the latest RC (we fixed an issue with exactly this...
Hi @<1707565838988480512:profile|MeltedLizard16>
Maybe I'm missing something, but just add to your YOLO code:
from clearml import Dataset
my_files_folder = Dataset.get("dataset_id_here").get_local_copy()
what am I missing?
check if the fileserver docker is running with docker ps
Clearml 1.13.1
Could you try the latest (1.16.2)? I remember there was a fix specific to Datasets
Hi @<1730033904972206080:profile|FantasticSeaurchin8>
Does this only relate to this:
https://github.com/coqui-ai/Trainer/issues/7
Or is it a ClearML SDK issue?
WickedGoat98 Same for me, let me ask the UI guys, I think this is a UI bug.
Also maybe before you post the article we could release a fix to both, what do you think?
EDIT:
Never mind 🙂 I just saw the Medium link, very cool!!!
SmallDeer34 the function Task.get_models() incorrectly returned the input model "name" instead of the object itself. I'll make sure we push a fix.
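To illustrate the expected behavior, a quick sketch (the task id is a placeholder):

from clearml import Task

task = Task.get_task(task_id="your_task_id_here")  # hypothetical id
models = task.get_models()
print(models["output"][0])  # output models come back as Model objects
print(models["input"][0])   # input models should also be Model objects, not plain name strings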
I found a different solution (hardcoding the parent tasks by hand),
I have to wonder, how does that solve the issue ?
Thanks SmallDeer34 !
This is exactly what I needed
quick question:
CLEAR_DATA="./data/dataset_for_modeling"
Should I pass the folder of the extracted zip file (assuming train.txt is the training dataset) ?
Hi SmallDeer34
Can you see it in TB ? and if so where ?
Yeah I think this is a UI bug, any chance you mind opening a GitHub issue ?
the parent task ids is what I originally wanted, remember?
ohh I missed it π
Thanks SmallDeer34 , I think you are correct: the "output" model is returned properly, but the "input" models are returned as the model name, not the model object.
Let me check something
WackyRabbit7 I might be missing something here, but the pipeline itself should be launched on the "pipelines" queue. Is the pipeline itself running? Or is it the step that is stuck in the "queued" state?
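Just to clarify the distinction, a rough sketch (queue, project and task names are placeholders): the controller itself runs on one queue, while each step is enqueued on its own queue:

from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0.0")
pipe.set_default_execution_queue("default")  # queue the steps are enqueued on
pipe.add_step(name="step_one", base_task_project="examples", base_task_name="step one task")
pipe.start(queue="pipelines")                # queue the pipeline controller itself runs on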