AgitatedDove14 are models technically Tasks, and can they be treated as such? If not, how do I delete a model permanently (both from the server and from AWS storage)?
so the way of doing it would be like this:
all_models = Model.query_models(project_name=..., task_name=..., tags=['running-best-checkpoint'])
all_models = sorted(all_models, key=lambda x: extract_epoch(x))
for model in all_models[:-num_to_preserve]:
    Model.remove(model, delete_weights_file=True)
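A minimal sketch of the selection step in the snippet above, kept free of ClearML calls so it can be run on its own. The snippet does not define extract_epoch, so extract_epoch_from_name and select_stale below are hypothetical helpers, and the naming scheme (an "epoch" marker followed by a number) is an assumption. One thing the sketch handles explicitly: a slice like all_models[:-num_to_preserve] is empty when num_to_preserve is 0, so nothing would be deleted in that case.

```python
import re
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def extract_epoch_from_name(name: str) -> int:
    """Hypothetical helper: parse the epoch from names such as 'run_epoch_12'."""
    match = re.search(r"epoch[_=-]?(\d+)", name)
    if match is None:
        raise ValueError(f"no epoch number found in {name!r}")
    return int(match.group(1))


def select_stale(
    items: Sequence[T], key: Callable[[T], int], num_to_preserve: int
) -> List[T]:
    """Return every item except the `num_to_preserve` items with the highest key."""
    if num_to_preserve < 0:
        raise ValueError("num_to_preserve must be non-negative")
    ordered = sorted(items, key=key)
    # Compute the cut-off index explicitly: ordered[:-0] would be an empty list.
    cutoff = max(len(ordered) - num_to_preserve, 0)
    return ordered[:cutoff]
```

With model objects, the key could be something like lambda m: extract_epoch_from_name(m.name), assuming the epoch is encoded in the model name; each returned model would then go to Model.remove(model, delete_weights_file=True) as in the snippet above.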
SmugDolphin23 could you please review it further? Is it acceptable to be merged?
I just happened to spawn multiple OutputModels within a single script that runs in a single task. That is, I see dozens of models in the Models tab in the web UI. What I want is to delete most of them (along with the files in S3) while preserving the spawning task
Searching by model ID is a good idea, but how do I fetch it from the code? In principle, InputModels are rarely defined automatically, so I could look up the ID manually...
yeah, I mean I need to get the model to get its ID, but I need to get the ID to get the model
So, to summarize:
PipelineController works with the default image, but it incurs an overhead of 4-5 min
It doesn't work with any other image
I can open an issue on GitHub
I found out this happens with any other image except the default one, regardless of whether I set it with pipe._task.set_base_docker
The image is not needed to run the pipeline logic, I do it just to reduce overhead. Otherwise it would take too long to just build the default image on every launch
If I keep track of 3 OutputModels simultaneously, the weights would need to shift between them every epoch (like, updated weights for top-1, then top-1 becomes top-2, top-2 becomes top-3 etc)
if I just use plain boto3 to sync weights to/from S3, I just check how many files are stored in the location, and clear up the old ones
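A rough sketch of that boto3-style cleanup, assuming the checkpoints live under a single key prefix and that "old" means least recently modified. The function name prune_old_checkpoints and its parameters are made up for illustration; the client is passed in as an argument, so any object exposing list_objects_v2 and delete_objects (for example boto3.client("s3")) would work.

```python
from typing import Any, Dict, List


def prune_old_checkpoints(s3_client: Any, bucket: str, prefix: str, keep: int) -> List[str]:
    """Delete all but the `keep` most recently modified objects under `prefix`.

    Returns the keys that were deleted, oldest first.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")

    # Collect every object under the prefix, following continuation tokens.
    objects: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        objects.extend(page.get("Contents", []))
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]

    # Oldest first, so everything before the cut-off index is stale.
    objects.sort(key=lambda obj: obj["LastModified"])
    stale_keys = [obj["Key"] for obj in objects[: max(len(objects) - keep, 0)]]

    # delete_objects accepts at most 1000 keys per request.
    for start in range(0, len(stale_keys), 1000):
        batch = stale_keys[start : start + 1000]
        s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch]},
        )
    return stale_keys
```

A call might look like prune_old_checkpoints(boto3.client("s3"), "my-bucket", "checkpoints/", keep=3), where the bucket and prefix are placeholders.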
CostlyOstrich36 thank you for the answer! Maybe I can just delete old models along with the corresponding tasks, seems to be easier
is there some sort of OutputModel.remove method? Docs say there isn't
(this is an answer to the previous message)
I see the task in the web UI, but get "Fetch experiment failed" when I click on it, as I described. It even fetches the correct ID by its name. I'm almost sure it will be present in MongoDB
And can I store models with no attachment to tasks? For example, original pretrained checkpoints
What exactly do we need to copy? I believe we have already copied everything, but it keeps throwing the "Fetch experiment failed" error
SuccessfulKoala55 Turns out we have copied the Elasticsearch database as well. Also, it seems that the error is thrown only for experiments with artifacts
AgitatedDove14 not exactly:
input: just a checkpoint file
output: a clearml model entity + stored weights on S3
And I don't see any new projects / subprojects where that dataset creation Task is stored
Previously I had a separate, manually created project where I stored all newly created datasets for my main project. Very neat
Now the task is visible only in the "All Experiments" section, but there is no separate project in the web UI where I could see it...