Hmm let me check something
what if the preexisting venv is just the system python? my base image is python:3.10.10 and i just pip install all requirements in that image. Does that not avoid venv still?
it will basically create a new venv inside the container that inherits the existing preinstalled packages (i.e. the new venv already sees everything the system Python has preinstalled)
then it will call "pip install" on all the "installed packages" of the Task.
Which should just check everything is there and install nothing...
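Conceptually the agent's venv behaves like one created with system site-packages enabled. A minimal stdlib sketch of that idea (not the agent's actual code, just an illustration):

```python
import tempfile
import venv
from pathlib import Path

# Create a venv that inherits the interpreter's preinstalled packages,
# similar in spirit to what the agent sets up inside the container.
target = tempfile.mkdtemp()
venv.EnvBuilder(system_site_packages=True, with_pip=False).create(target)

# The generated pyvenv.cfg records the inheritance flag.
cfg = (Path(target) / "pyvenv.cfg").read_text()
print(cfg)
```

Because the venv already sees the base image's packages, a follow-up "pip install" of the same requirements is effectively a no-op.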
Hi ScaryKoala63
Which versions are you using (clearml / lightning) ?
but I think they did it for a reason, no?
Not a very good one, they just installed everything under the user and used --user for the pip.
It really does not matter inside a docker, the only reason one might want to do that is if you are mounting other drives and you want to make sure they are not accessed with "root" user, but with 1000 user id.
Thanks JitteryCoyote63 let me double check if there is a reason for that (there might be one, not sure)
You can check the keras example, run it twice, on the second time it will continue from the previous checkpoint and you will have input and output model.
https://github.com/allegroai/clearml/blob/master/examples/frameworks/keras/keras_tensorboard.py
Yey! MysteriousBee56 kudos on keep trying!
I'll make sure we report those errors, because this debug process should have been much shorter 🙂
Thanks GrievingTurkey78 , this is exactly what I was looking for!
Any chance you can open a GitHub issue (jsonargparse + lightning support)?
I really want to make sure this issue is addressed 🙂
BTW: this is only if jsonargparse is installed:
https://github.com/PyTorchLightning/pytorch-lightning/blob/368ac1c62276dbeb9d8ec0458f98309bdf47ef41/pytorch_lightning/utilities/cli.py#L33
WackyRabbit7 the auto detection will only import direct packages you import (so that we do not end up with bloated venvs)
It seems that the transformers library does not have it as a requirement, otherwise it would have pulled it...
In your code you can always do either:
import torch
or:
Task.add_requirements('torch')
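To illustrate why only direct imports get picked up, here is a hypothetical stdlib sketch of the general idea (not ClearML's actual detector): scanning the script's AST only finds packages the script itself imports, so transitive dependencies never show up:

```python
import ast

# Hypothetical sketch of direct-import detection: only packages the
# script imports are found; their own dependencies are not.
source = """
import torch
from transformers import AutoModel
"""

tree = ast.parse(source)
found = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Import):
        found.update(alias.name.split(".")[0] for alias in node.names)
    elif isinstance(node, ast.ImportFrom) and node.module:
        found.add(node.module.split(".")[0])

print(sorted(found))
```

This is why adding an explicit import (or Task.add_requirements) is the way to force a package into the requirements list.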
Maybe that's the issue :
https://github.com/googleapis/python-storage/issues/74#issuecomment-602487082
Will this still be considered as global site-packages ?
This is a pip setting, I "think" it inherits from the local user's installation, but I would actually install with "sudo pip", that will definitely be "inherited"
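If it helps, you can check both locations from Python itself (paths vary per system, this is just a quick way to see where each install mode writes):

```python
import site
import sysconfig

# Where "pip install --user" puts packages
print(site.getusersitepackages())
# The global site-packages that "sudo pip install" writes into
print(sysconfig.get_paths()["purelib"])
```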
GentleSwallow91 how come it does not already find the correct pytorch version inside the docker ? whats the clearml-agent version you are using ?
JitteryCoyote63 not yet 😞
I actually wonder how popular https://github.com/pallets/click is ?
EnviousStarfish54
and the 8 charts are actually identical
Are you plotting the same plot 8 times?
EnviousStarfish54 thanks again for the reproducible code, it seems this is a Web UI bug, I'll keep you updated.
PS. I just noticed that this function is not documented. I'll make sure it appears in the doc-string.
What's the host you have in the clearml.conf ?
is it something like " http://localhost:8008 " ?
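For reference, the api section of ~/clearml.conf looks roughly like this on a default local deployment (the hosts/ports here are the usual defaults, adjust to your setup):

```
api {
    web_server: http://localhost:8080
    api_server: http://localhost:8008
    files_server: http://localhost:8081
}
```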
Hi @<1523701504827985920:profile|SubstantialElk6>
I would split the first stage into two. The first one passing data to the others, the second as "monitoring". Wdyt?
Hi SillyPuppy19
I think I lost you half way through.
I have a single script that launches training jobs for various models.
Is this like the automation example on the Github, i.e. cloning/enqueue experiments?
flag which is the model name, and dynamically loading the module to train it.
a Model has a UUID in the system as well, so you can use that instead of name (which is not unique), would that solve the problem?
This didn't mesh well with Trains, because the project a...
@<1556812486840160256:profile|SuccessfulRaven86> is the issue with flask reproducible? If so, could you open a github issue, so we do not forget to look into it?
Also there was a trick that worked in the previous bug: could you zoom out in the browser, and see if you suddenly get the plot?
Verified, and already fixed with 1.0.6rc2
agentservice...
Not related, the agent-services job is to run control jobs, such as pipelines and HPO control processes.
Hi MuddySquid7 issue is verified, v1.1.1 will be released in a few hours with a fix.
Thank you for noticing!
I mean test with:
pipe.start_locally(run_pipeline_steps_locally=False)
This actually creates the steps as Tasks and launches them on remote machines