@<1710827340621156352:profile|HungryFrog27> the venv-build folder is supposed to be deleted after each task is done. How did you end up with leftovers? Could it be that Windows was failing to delete it for some reason? That actually connects with your initial issue, no?
Hi GreasyPenguin14
Yes, I think you are right, the series name should be next to the title. Let me check it...
I'm trying to queue a task in python but I'd like to reuse the prior task ID.
Is it your own Task, i.e. you want it to enqueue itself? If so, use task.execute_remotely, it will do just that.
If this is another Task and it was aborted, you can just enqueue it; by definition it will continue with the same Task ID.
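A minimal sketch of the second case (re-enqueueing an existing Task so it keeps its ID), assuming the clearml package is installed and configured. The task ID and queue name are placeholders, and the `task_cls` parameter is a hypothetical addition so the helper can be exercised without a server:

```python
def requeue_task(task_id, queue_name="default", task_cls=None):
    """Put an existing (e.g. aborted) Task back on a queue, keeping its ID."""
    if task_cls is None:
        # Imported lazily so the sketch can be read and tested without clearml.
        from clearml import Task as task_cls
    task = task_cls.get_task(task_id=task_id)
    # Enqueue the same Task object; no clone is created, so the ID is unchanged.
    task_cls.enqueue(task, queue_name=queue_name)
    return task.id
```

For the first case (your own script), calling task.execute_remotely(queue_name="default") after Task.init should cover it.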
SubstantialElk6 feel free to tweet them on their very inaccurate comparison table
Hi @<1730033904972206080:profile|FantasticSeaurchin8>
You mean in the UI, or when reporting with the SDK?
save off the "best" model instead of the last
Should be relatively easy to update on the main Task the model with the best performance, no?
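A rough sketch of the "keep the best, not the last" idea, independent of any framework. `BestModelTracker` and its callback are hypothetical names for illustration; it only decides when a new checkpoint beats the previous best and then calls whatever upload function you give it:

```python
class BestModelTracker:
    """Remember the best metric seen so far and fire a callback on improvement."""

    def __init__(self, on_improve, higher_is_better=True):
        self.on_improve = on_improve          # called with the weights path
        self.higher_is_better = higher_is_better
        self.best = None                      # no metric seen yet

    def update(self, metric, weights_path):
        """Return True (and trigger the callback) if `metric` is a new best."""
        if self.best is None:
            improved = True
        elif self.higher_is_better:
            improved = metric > self.best
        else:
            improved = metric < self.best
        if improved:
            self.best = metric
            self.on_improve(weights_path)
        return improved
```

With ClearML the callback could be something like OutputModel(task=task).update_weights, so the main Task's model is only replaced when the metric improves; that wiring is an assumption and is not tested here.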
- Artifacts and models will be uploaded to the output URI; debug images are uploaded to the default file server. This can be changed via the Logger.
- Hmm, is this like a configuration file?
You can do:
local_text_file = task.connect_configuration('filenotingit.txt')
Then open 'local_text_file'; this creates a local copy of the data at runtime, and the content is stored on the Task itself.
- This is how the agent installs the python packages, but if the docker already contains th...
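For the connect_configuration suggestion above, a small sketch of the "connect, then open the returned path" pattern. `read_tracked_text` is a hypothetical helper; `task` is assumed to be the object returned by Task.init:

```python
def read_tracked_text(task, path, name=None):
    """Register a text file on the Task and return its content."""
    # connect_configuration hands back the path of a local copy; when the Task
    # runs remotely, that copy is filled from the content stored on the Task.
    local_copy = task.connect_configuration(path, name=name)
    with open(local_copy, "r") as handle:
        return handle.read()
```

Reading through the returned path (rather than the original one) is the point: it is what lets a remote run work even if the file is not in git.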
Simple git clone on that repo works well
On the machine running the trains-agent ?
What exactly do you mean by docker run permissions?
IrritableJellyfish76 point taken, suggestions on improving the interface ?
You need to mount it to ~/clearml.conf (i.e. /root/clearml.conf)
Hi SkinnyPanda43
Let's say that I install the shared libs with pip in editable mode on my development environment, how will the clearml-agent handle those libraries if I submit a job?
So installing packages from local folders with "-e" is in general ill-advised.
But using a full git path should work out of the box. For example, if you install with pip install https://github.com/user/repo/repo.git then the agent will be able to install it on the remote machine. The main challenge...
Yes, this is definitely the issue, the agent assumes the docker user is "root".
Let me check something
Hi FloppyDeer99
What is the meaning of no real scheduling
I think the meaning is that from the moment a k8s job is created, the k8s is in charge of actually spinning the container. Since k8s has no real priority/order, the scheduling order is not guaranteed from this point.
The idea of the clearml k8s-glue is that the glue will launch a job on the k8s cluster only if it is sure there are enough resources to actually spin the job now (as opposed to, sometime in the future), this mea...
JitteryCoyote63 okay... but let me explain a bit so you get a better intuition for next time
The Task.init call, when running remotely, assumes the Task object already exists in the backend, so it ignores whatever was in the code and uses the data stored on the trains-server, similar to what's happening with Task.connect and the argparser.
This gives you the option of adding/changing the "output_uri" for any Task regardless of the code. In the Execution tab, change the "Output Destina...
Hi SmarmyDolphin68
You have two options:
- Automatically upload the models when training: pass output_uri to Task.init. For example, output_uri=True will upload to the clearml-server, output_uri='s3://bucket/folder' will upload to S3, etc.
- Manually upload a model that you have locally: https://github.com/allegroai/clearml/blob/9ff52a8699266fec1cca486b239efa5ff1f681bc/examples/reporting/model_config.py#L37
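A minimal sketch of the automatic option, assuming clearml is installed and configured. The project and task names are placeholders, and `task_cls` is a hypothetical parameter that only exists so the helper can be tried without a server:

```python
def start_task_with_upload(project_name, task_name, destination=True, task_cls=None):
    """Start a Task whose output models are uploaded to `destination`.

    destination=True means the clearml-server; a string such as
    's3://bucket/folder' points the upload at that storage instead.
    """
    if task_cls is None:
        # Imported lazily so the sketch can be read and tested without clearml.
        from clearml import Task as task_cls
    return task_cls.init(
        project_name=project_name,
        task_name=task_name,
        output_uri=destination,
    )
```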
It only happens in the clearml environment, works fine locally.
Hi BoredHedgehog47
what do you mean by "in the clearml environment" ?
seems to run properly now
Are you saying the problem disappeared ?
Hi ElegantCoyote26
is there a way to get a Task's docker container id/name?
you mean like Task.get_task("task_id_here").get_base_docker() ?
ow a Task's results page also has a plot for this, but I guess it's at the machine level and not the task level?
This is actually on the container level, meaning checked from inside the container. It should be what you are looking for
Hmm this is odd, could you provide the pipeline code maybe?
currently I'm doing it by fetching the latest dataset, incrementing the version and creating a new dataset version
This seems like a very good approach, how would you improve it?
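The "fetch latest, increment, create a new version" flow hinges on bumping a version string. A minimal sketch of just that step, assuming dotted numeric versions such as 1.0.3 (the format is an assumption, and `next_version` is a hypothetical helper, not a ClearML function):

```python
def next_version(version):
    """Increment the last numeric component of a dotted version string."""
    parts = version.split(".")
    # Only the final component is bumped; earlier components stay untouched.
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)
```

The result would then be passed to whatever creates the new dataset version; how that is wired to the Dataset API is left out here.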
If it cannot find the Task ID I'm guessing it is trying to connect to the demo server and not your server (i.e. configuration is missing)
Hi @<1663354518726774784:profile|CrookedSeal85>
I am trying to optimize storage on my ClearML file server when doing a lot of experiments.
This is not straightforward, you will need to get a list of all the events via
None
filter on image events
and then delete the URL you are getting via the StorageManager.
But to be honest, why not just direct it to S3 or something like that ?
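A sketch of the "direct it to S3" alternative, assuming `task` is the object returned by Task.init and that credentials for the destination are already configured. The bucket URI is a placeholder and `redirect_debug_samples` is a hypothetical helper name:

```python
def redirect_debug_samples(task, destination):
    """Send debug samples (e.g. images) to `destination` instead of the file server."""
    logger = task.get_logger()
    # From here on, media reported through this logger is uploaded to `destination`.
    logger.set_default_upload_destination(destination)
    return logger
```

For example, redirect_debug_samples(task, "s3://bucket/folder") right after Task.init, so cleanup becomes a bucket lifecycle rule rather than event-by-event deletion.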
So "wait" is a better metaphor for me
So I would do something like (I might have a few typos but that's the gist):
from clearml import Model

def post_execute_callback_example(a_pipeline, a_node):
    # type: (PipelineController, PipelineController.Node) -> None
    print('Completed Task id={}'.format(a_node.executed))
    # wait until model is tagged, then pass it as argument
    while True:
        found = Model.query_models(...)  # model filter here, including tag and project
        if found:
            ...
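The loop above polls with no pause and no way out. A hedged sketch of the same "wait until something shows up" idea with a sleep between attempts and an optional timeout; `wait_for_result` and its parameters are hypothetical, and the query is passed in as a callable so nothing here depends on ClearML:

```python
import time


def wait_for_result(query, interval_sec=30.0, timeout_sec=None,
                    sleep=time.sleep, clock=time.monotonic):
    """Call `query()` repeatedly until it returns something truthy.

    Raises TimeoutError if `timeout_sec` is set and passes without a result.
    `sleep` and `clock` are injectable only to make the helper easy to test.
    """
    deadline = None if timeout_sec is None else clock() + timeout_sec
    while True:
        found = query()
        if found:
            return found
        if deadline is not None and clock() >= deadline:
            raise TimeoutError("no result before the timeout")
        # Pause between attempts so the server is not queried in a tight loop.
        sleep(interval_sec)
```

Inside the callback it could be used as wait_for_result(lambda: Model.query_models(project_name="my_project", tags=["best"])), where the project name and tag are placeholders.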