Reputation
Badges 1
25 × Eureka!no, using the system drivers
I see now.
Let's assume you know which snapshot that was:
` prev_task = Task.get_task(task_id='the_first_training_task_id')
get the second from last checkpoint
task.models['output'][-2].url
prev_scalars = prev_task.get_reported_scalars()
new_task = Task.init('example', 'new task')
logger = new_task.get_logger()
do some fpr loop and report the prev_scalars with logger.report_scalars
new_task.flush(wait_for_uploads=True)
new_task.set_initial_iteration(22000)
start the train `
OddAlligator72 sure thing π
This should sort it out:Task.init('examples', 'train', continue_last_task=True)
If you want to continue a specific Task:continue_last_task='task_id_here'
Getting the previous model:last_checkopoint = task.models['output'][-1]
What do you think?
OddAlligator72 let's separate the two issues:
Continue reporting from a previous iteration Retrieving a previously stored checkpointNow for the details:
Are you referring to a scenario where you execute your code manually (i.e. without the trains-agent) ?
OutrageousSheep60 before I can answer, maybe you can explain why "zipping" them does not fit your workfow ?
Hi UnevenDolphin73
If you "remove" the lock file the agent will default to pip.
You can hack it with uncommitted changes section?
Hi ShinyPuppy47 ,
Yes that is correct. Use Task.init for automagic logging
Hi WorriedParrot51
Let me shed some light on this complicated mechanism, because this is not very straight forward.
Basically the agent signals the trains package it should ignore the code calls, and use a specific Task in the backend (i.e. if in manual mode, the trains package logs the data into the trains-server, in agent mode (remote mode), it does the opposite and takes the data from the trains-server "into" the code)
Specifically, just like in manual mode, calling argparse.parse is be...
Do you have a specific numpy version you are installing ? why is it trying to install the wheel from code?
Hi @<1562610699555835904:profile|VirtuousHedgehong97>
I think you need to upgrade your self-hosted clearml-server, could that be the case?
Hi RipeGoose2
Yes the slide feature is definelty on the do do list (a lot of users asked for it).
Unfortunately other than actually PR-ing to the UI repo, there is no easy way to add customization (If you have an idea on how we could have an easy interface, that would be great.)
I'll check what's the status with the slider, maybe we will be lucky enough to see it in he next update π
The base task is self-contained i.e. it downloads training/eval directly data and has direct access to it
I think this is the main issue, how come it does not catch it? Are you using argparser ?
Hi UnsightlyBeetle11
Is it possible to report the model's architecture (PyTorch model) automatically on ClearML, as we do it via Netron or other neural network visualisation tools?You mean like the actual network layout? Unfortunately, there is currently no option to do that, you can however manually store a plot/image that represents it
BTW:I think that at the beginning Netron was somehow integrated, but it was rarely used and support for it was not trivial so it was phased out. You can ho...
Hi @<1539055479878062080:profile|FranticLobster21>
Like this?
https://github.com/allegroai/clearml/blob/4ebe714165cfdacdcc48b8cf6cc5bddb3c15a89f[β¦]ation/hyper-parameter-optimization/hyper_parameter_optimizer.py
[https://github.com/allegroai/clearml/blob/4ebe714165cfdacdcc48b8cf6cc5bddb3c15a89f[β¦]ation/hyper-parameter-opt...
Thanks, yes you are correct the color is derived from the series name, so I guess the issue is the name+Id is not kept in full screen
Damn, okay I'll make sure we fix the order.
Could you verify the ~= works as intended (if the order id correct)
but now sinceΒ
Task.current_task()
Β doesn't work on the pipeline object we have a serious problem
How is that possible ?
Is there a small toy code that can reproduce it ?
Hi OutrageousSheep60
AS-IS
- without compressing or breaking it up into chunks.
So for that I would suggest to manually archive it, and upload as external link?
Or are you saying you want to control the compression used by Dataset class ?
https://github.com/allegroai/clearml/blob/72d9b22e0d27f317a364acfeacbcf5c70f852e8c/clearml/datasets/dataset.py#L603
OutrageousSheep60 so this should work, no?ds.upload(output_url='gs://<BUCKET>/', compression=0, chunk_size=100000000000)
Notice the chunk size is the maximum size (in bytes) per chunk, so it should basically very large
OutrageousSheep60 so if this is the case I think you need to add "external links" i.e. upload the individual files to GCS, then register the links to GCS, does that make sense ?
Hmm let me check first when it is going to upgraded and if there is a workaround
StraightDog31 can you elaborate? where are the parameters stored? who is trying to access them, and maybe for what purpose ?
So the thing is clearml
automatically detects the last iteration of the previous run, my assumption you also add it hence the double shift.
SourOx12 could that be it?
@<1523701868901961728:profile|ReassuredTiger98> in the UI can you see it in the "installed packages" section under the Execution Tab ?
@<1523710674990010368:profile|GreasyPenguin14> what do you mean "but I do I get the... " ?
Configuring git user/pass will allow you to launch Tasks from private repositories on the services queue (the agent is part of the docker-compose).
That said, this is not a must, worst case you'll get an error when git fails to clone your repo :)
Hi @<1697056701116583936:profile|JealousArcticwolf24>
Awesome deployment π€©
Yes if you need another scalable model serving you can just run another instance of the clearml-serving-inference
https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/docker/docker-compose.yml#L77
So you end up with two of them, one per models environ...