FiercePenguin76 the git repo should detect only clearml as a required python package
Basically the steps are:
- Decide if the initial python entry script is a standalone script (i.e. no local imports) in the git repo (in your example "task_with_deps.py")
- If this is a "standalone script", only look for imports inside the calling python script, and list those packages under "installed packages"
- If this is not a standalone script, go over all the python files inside the repository, look for "import" statements, and list those packages under "installed packages"
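For example, a minimal standalone entry script, just a rough sketch to illustrate the first case (project/task names are placeholders):
```python
# a standalone entry script: it has no local (in-repo) imports, so only the
# packages imported here (clearml, numpy) should end up under "installed packages"
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="standalone deps demo")
print(np.arange(5).sum())
```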
Do note that the needed module is just a local folder with scripts.
Oh that is the issue, is it in the git repo ?
Hi OutrageousSheep60
AS-IS
- without compressing or breaking it up into chunks.
So for that I would suggest manually archiving it, and uploading it as an external link?
Or are you saying you want to control the compression used by Dataset class ?
https://github.com/allegroai/clearml/blob/72d9b22e0d27f317a364acfeacbcf5c70f852e8c/clearml/datasets/dataset.py#L603
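If the goal is just to track the archive as-is, something along these lines should work (a rough sketch; the dataset names and S3 path are placeholders):
```python
from clearml import Dataset

# register an already-archived file as an external link, so it is tracked
# as-is and clearml does not compress it or break it into chunks
ds = Dataset.create(dataset_name="my_archive", dataset_project="datasets")
ds.add_external_files(source_url="s3://my-bucket/data/archive.tar.gz")
ds.upload()
ds.finalize()
```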
If it helps, you can override it on the clients with an OS environment variable, CLEARML_FILES_HOST
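For example, a minimal sketch of setting it on the client side before Task.init (the URL and names are placeholders):
```python
import os

# override the files server configured in clearml.conf, for this process only
os.environ["CLEARML_FILES_HOST"] = "https://files.my-clearml.example.com"

from clearml import Task

task = Task.init(project_name="examples", task_name="custom files host")
```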
(basically python abusing types/casting where the value can be both str/bool on the same argparser argument)
Hi LethalDolphin75
I think you are right there isn't one (although I remember a discussion about it...)
Anyhow it will be very easy to implement, just inherit from:
https://github.com/allegroai/clearml/blob/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/clearml/automation/parameters.py#L111
And return the power of the parent value here:
https://github.com/allegroai/clearml/blob/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/clearml/automation/parameters.py#L146
And
https://github.com/allegroai/...
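Something along these lines, a rough sketch (the class name and base are placeholders), assuming you want a log-uniform range built on top of the existing UniformParameterRange:
```python
from clearml.automation.parameters import UniformParameterRange


class LogUniformParameterRange(UniformParameterRange):
    # sample uniformly in exponent space, then return base ** exponent
    def __init__(self, name, min_value, max_value, base=10, step_size=None, include_max_value=True):
        super().__init__(name, min_value, max_value, step_size=step_size, include_max_value=include_max_value)
        self.base = base

    def get_value(self):
        # parent returns {name: sampled_exponent}; raise it to the power here
        return {name: self.base ** v for name, v in super().get_value().items()}

    def to_list(self):
        return [{self.name: self.base ** v[self.name]} for v in super().to_list()]
```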
Yes it does, but these files must be committed to begin with; basically think of it as the 'git diff' output being stored, and then the agent applies it
Hi HandsomeCrow5 .
Remember the debug images are events with links to the actual images, so you first have to get the events, and then you can download the images with https://allegro.ai/docs/examples/examples_storagehelper/#storagemanager (which by definition has the credentials, because it was able to upload them 🙂)
To get the events:
```python
from trains.backend_api.session.client import APIClient

client = APIClient()
client.events.debug_images(task='aabbcc')
```
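Once you pull an image URL out of the returned events, something like this should fetch it locally (a minimal sketch; the URL below is a placeholder, and the exact response structure may differ between versions):
```python
from trains.storage import StorageManager

# each debug-image event points at a stored image; download a local copy of it
local_path = StorageManager.get_local_copy(remote_url="s3://my-bucket/debug/samples/image_001.jpg")
print(local_path)
```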
with ?
```
multipart: false
secure: false
```
If so, can you post here your aws.s3 section of the clearml.conf? (of course replacing the actual sensitive information with *s)
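For reference, something along these lines is what I'd expect to see there (a sketch with placeholder values and endpoint):
```
aws {
    s3 {
        # default credentials
        key: "****"
        secret: "****"

        credentials: [
            {
                host: "my-minio.example.com:9000"  # placeholder endpoint
                key: "****"
                secret: "****"
                multipart: false
                secure: false
            }
        ]
    }
}
```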
No it will not 🙂 the closer is closer to the actual print.
That said, I'm sure it would not be complicated to add.
But I have to wonder, this will really create a mess in the console log. So if someone wants it, it will be global (i.e. also in the visible console, not only in the backend), so the case where the console on the machine itself is "clean" but the backend log is full of debug stuff is not clear to me
VictoriousPenguin97 I'm not sure there is an easy solution, basically you have to edit both MongoDB (artifacts) and Elastic (think debug samples) 🙂
if the first task failed - then the remaining tasks are not scheduled for execution, which is what I expect.
agreed
I'm just surprised that if the first task is instead aborted by the user,
How is that different from failed? The assumption is if a component depends on another one it needs its output, if it does not then they can run in parallel. What am I missing?
Hi @<1542316991337992192:profile|AverageMoth57>
Not sure I follow what you have in mind regarding the Gerrit integration
Sounds interesting ...
wdyt?
but I need to dig deeper into the architecture to understand what we need exactly from k8s glue.
Once you do, feel free to share. Basically there are two options: use the k8s scheduler with dynamic pods, or spin the trains-agent as a service pod and let it spin the jobs
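For the first option, a very rough sketch loosely based on the k8s_glue_example.py from the clearml-agent repo (the queue name is a placeholder, and the exact constructor arguments may differ between versions):
```python
from clearml_agent.glue.k8s import K8sIntegration

# listen on a queue and spin every enqueued task up as a k8s pod
k8s = K8sIntegration()
k8s.k8s_daemon("default")
```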
SubstantialElk6 This seems to be the issue:
```
cp: failed to access '/root/default_clearml.conf': Permission denied
clearml_agent: ERROR: Could not find task id=024a421c0e174650a1c7ff64af756c26 (for host: )
```
Notice it seems it just cannot read the clearml.conf, wdyt?
I don't have the compose file, or at least can't seem to find it in /opt
you can manually take down all dockers with:
```
docker ps
```
then run docker stop <container id> for each container id
RipeGoose2
HTML file is not standalone and has some dependencies that require networking...
Really? I thought that when jupyter converts its own notebook it packages everything into a single html, no?
Okay this is very close to what the agent is building:
Could you start a new conda env, then install cudatoolkit=11.1, then run:
```
conda env update -p <conda_env_path_here> --file the_env_yaml.yml
```
Glad to hear that! 🙂
Hi FierceHamster54
Do I need to instantiate a task inside my component ? Seems a bit redundant....
Yes, so the idea is that the Task (along with the code) will be automatically linked with the output model, for better traceability.
That said you can "import" a model into the system (i.e. it was created somewhere else, and you want to register it) with InputModel.import_model
https://clear.ml/docs/latest/docs/clearml_sdk/model_sdk#importing-models
I guess "Input" from that perspecti...
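For example, a minimal sketch of registering an externally created model (the name and weights URL are placeholders):
```python
from clearml import InputModel

# register a model that was trained and stored outside of clearml
model = InputModel.import_model(
    name="pretrained-resnet50",
    weights_url="s3://my-bucket/models/resnet50.pth",  # placeholder location of the weights
    framework="PyTorch",
)
```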
...I'm not sure I follow, the clearml-task is designed to always be used so that at the end the agent will be running the Task. What am I missing?
Hi @<1547028074090991616:profile|ShaggySwan64>
I have to admit that personally I do not know pdm , could you share links, and help us understand what is the value over pip/poetry/conda ?
... these nested components are not tagged with 'pipe: <pipeline_task_id>'. I assume this should not be like that, right?
Helper functions are not "components", they are actually files that will be accessible when running the component itself.
am I missing something ?
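For example, a rough sketch of passing a helper along with a component (the function names are placeholders):
```python
from clearml.automation.controller import PipelineDecorator


def normalize(values):
    # a plain helper: not a component by itself, just code the component needs
    top = max(values)
    return [v / top for v in values]


# helper_functions makes `normalize` part of the component's standalone script,
# so it can be called when the component runs on the agent
@PipelineDecorator.component(return_values=["result"], helper_functions=[normalize])
def preprocess(values):
    return normalize(values)
```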
error [Errno 13] Permission denied:
Seems like a permission issue ?
Try to remove your entire clearml cache folder (by default ~/.clearml/cache)
If you have an idea on where to start looking for a quick win, I'm open to suggestions 🙂
BTW: the cloning error is actually the wrong branch. If you take a look at your initial screenshot, you can see the line before last, branch='default', which I assume should be branch='master' (The error itself is still weird, but I assume that this is what git is returning)
CloudyHamster42 what's the trains-server version ?
RipeGoose2 models are automatically registered
i.e. added to the models artifactory, but it only points to where the files are stored
They will only actually be uploaded if you pass the output_uri argument to Task.init.
If you want to disable this behavior you can pass:
```python
Task.init(..., auto_connect_frameworks={'pytorch': False})
```
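For example, a minimal sketch putting the two together (project/task names are placeholders):
```python
from clearml import Task

# with output_uri set, pytorch checkpoints are uploaded to the files server
# (or any storage URI) instead of only being registered as local paths
task = Task.init(
    project_name="examples",
    task_name="model registration demo",
    output_uri=True,  # or e.g. "s3://my-bucket/models" to upload to a specific location
    # auto_connect_frameworks={'pytorch': False},  # uncomment to disable automatic model registration
)
```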
Does Task.connect send each element of the dictionary as a separate api request? Has anyone else encountered this issue?
Hi SuperiorPanda77
the task.connect ends up as a single call with all the data being sent on a single request.
That said, maybe the connected dict is not the best solution for a thousand-key dictionary ...
Maybe an artifact, or connect_configuration, would be better suited?
wdyt?
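For example, a minimal sketch of the two alternatives (names and the dict are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="large dict demo")

huge_dict = {f"key_{i}": i for i in range(100_000)}  # stand-in for your thousands-of-keys dict

# option 1: connect it as a configuration object (single call, shows up under CONFIGURATION)
task.connect_configuration(huge_dict, name="my_big_config")

# option 2: store it as an artifact (kept as a single object, not individual parameters)
task.upload_artifact(name="my_big_config_artifact", artifact_object=huge_dict)
```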