MelancholyElk85 if you are manually adding models with OutputModel, then when you call update_weights(...)
the upload will start in the background (if the process ends, it will wait until the upload is completed). You can also specify auto_delete_file=True,
which will delete the local copy once the upload completes
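For example, a minimal sketch (project/file names here are placeholders):
` from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="manual model upload")
model = OutputModel(task=task)
# upload starts in the background; the local copy is removed once the upload completes
model.update_weights(weights_filename="my_model.pt", auto_delete_file=True)
`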
So `from foo.mod import`
"translates" to `foo-mod @ git+...` ?
Also, how do I make files other than the entry script visible to the job?
The assumption for clearml (regardless of how you create a Task) is that your code is either a standalone script (or Jupyter notebook) or inside a git repository. In the latter case, clearml-agent will clone the git repository of the code, apply the uncommitted changes, and run your code.
Hi LudicrousDeer3
I have to admit I cannot remember one in the wild (I might be wrong though).
What's the specific use case you had in mind ?
error [Errno 13] Permission denied
Seems like a permission issue?
Try to remove your entire clearml cache folder (by default under ~/.clearml)
Yes, no reason to attach the second one (imho)
you can also just create a venv and run the tests there (with the latest python package) ?
Was wondering how it can handle 10s, 100s of models.
Yes, it supports dynamically loading/unloading models based on requests
(load balancing multiple nodes is disconnected from it, but assuming they are under diff endpoints, the load balancer can be configured to route accordingly)
You can run this code from anywhere. The 'base_task_id' is actually the pipeline controller Task ID.
BTW: Next version will have a nicer interface to query it, but this code will work on the current version
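For reference, a minimal sketch of such a query (the 'parent' filter key and the placeholder ID are my assumptions here):
` from clearml import Task

# minimal sketch: list the Tasks spawned by the pipeline controller
# the task_filter dict is passed through to the server-side tasks.get_all call
controller_id = "<pipeline-controller-task-id>"  # i.e. the 'base_task_id'
children = Task.get_tasks(task_filter={"parent": controller_id})
print([t.id for t in children])
`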
BTW: what's the use case? Why do you need to open two Tasks in the same code/script ?
Hi CourageousDove78
Not the cleanest, but you can basically pass everything here:
https://allegro.ai/clearml/docs/rst/references/clearml_api_ref/index.html#post--tasks.get_all
Reasoning is that it is passed almost as is to the server for the actual query.
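For instance, a minimal sketch (the filter values are just examples):
` from clearml import Task

# minimal sketch: task_filter is passed (almost) as-is to the server's tasks.get_all
tasks = Task.get_tasks(
    project_name="examples",
    task_filter={"status": ["completed"], "order_by": ["-last_update"]},
)
`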
Hi VivaciousWalrus21 I tested the sample code, and the gap was evident in Tensorboard as well. It is not clearml generating this jump; it is internal to the code base (i.e. the auto de/serialization and continuation of the run)
Hi UpsetCrocodile10
First, I perform many experiments in one process, ...
How about this one:
https://github.com/allegroai/trains/issues/230#issuecomment-723503146
Basically you could utilize create_function_task
This means you have Task.init() on the main "controller" and each "train_in_subset" as a "function_task". Then the controller can wait on them and collect the data (like the HPO does).
Basically:
` controller_task = Task.init(...)
children = []
for i, s in enumerate(subsets):  # 'subsets' here stands for your list of data subsets
    children.append(controller_task.create_function_task(train_in_subset, func_name=f"train_{i}", subset=s))
`
It’s the correct way to do it, right?
Yep 🙂 That said, this is not running as a service, so you will need to spin it up on your machine. You can definitely connect it with the free SaaS server and spin up the serving on your machine with docker-compose.
Why can't it be updated after creation?
You can, but then you have to rerun it again. Technically this is obviously solvable, but the idea was to make it simple to use, and since we "assume" in most cases there is a single Task per execution, it made sense. wdyt?
Hi ShallowArcticwolf27
Does the
clearml-task
cli command currently support remote repositories that are intended to be used with ssh
It does 🙂
but the
git@
prefix used for GitLab's SSH, it seems to default to looking for the repository locally
git@ is always the prefix for SSH repositories (it does not actually mean it uses it; it's what git returns when asked for the origin of the repository). The agent knows (if SSH credentials ...
Hi @<1724960475575226368:profile|GloriousKoala29>
Is there a way to aggregate the results, such as defining an iteration as the accuracy of 100 samples
Hmm, i'm assuming what you actually want is to store it with the actual input/output and a score, is that correct?
That is awesome!
If you feel like writing a bit about the use-case and how you solved it, I think AnxiousSeal95 will be more than happy to publish something like that 🙂
UnevenDolphin73 I have a suspicion we have a few terms mixed:
hyperparameters:
These are essentially key/value pairs.
When you call task.connect(dict_with_params), clearml will flatten the dict and you end up with key/value pairs.
configuration objects:
These are actually blobs of text, which the UI shows as-is.
When you call my_local_file = task.connect_configuration("path/to/config/file", name=name),
the entire content of the config file is stored on the Task object itself.
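To make the distinction concrete, a minimal sketch (names and paths are placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")

# hyperparameters: the dict is flattened into key/value pairs on the Task
params = {"lr": 0.001, "batch_size": 32}
task.connect(params)

# configuration object: the whole file content is stored as a text blob on the Task
local_copy = task.connect_configuration("path/to/config.yaml", name="my config")
`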
Back to the use case, instead ...
Could you right click on the failed experiment, select reset, and send it again for execution?
Could that error be a random network issue ?
(Basically this seems like a generic network error not actually related to the trains-agent)
Is the trains-agent
running in docker mode or venv mode?
Hi @<1610083503607648256:profile|DiminutiveToad80>
Request Entity Too Large
What's the size of the file? how are you running your clearml-server?
Hi CooperativeFox72
But my docker image has all my code and all the packages it needs; I don't understand why the agent needs to install all of those again?
So based on the docker file you previously posted, I think all your python packages are actually installed on the "appuser" and not as system packages.
Basically remove the "add user" part and the --user flag from the pip install.
For example:
` FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
# install python and your packages system-wide (no extra user, no --user flag)
RUN apt-get update && apt-get install -y git python3-pip
RUN pip3 install <your packages here>
`
Hmm, you mean like overrides?
Maybe store both the before and after resolving versions?
(Although that might be confusing, as the before-resolve version should actually be read-only.)
if you have an automation process, then you should have the Task object, no?
then you have task.id
What am I missing here?
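i.e. something like this minimal sketch (project/task names are placeholders):
` from clearml import Task

# minimal sketch: in an automation flow you usually hold the Task object already
task = Task.get_task(project_name="examples", task_name="my experiment")
print(task.id)
`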