SubstantialElk6 if you call Task.init with continue_last_task=<task_id>, it will automatically add the last_iteration of the previous run to any logging/report, so you never overwrite the previous reports
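To make the offsetting idea concrete, here is a toy sketch in plain Python. It is not ClearML code: `OffsetLogger` and its `previous_last_iteration` argument are hypothetical names, and the exact offset the library applies is not stated above, so treat the arithmetic as an assumption.

```python
class OffsetLogger:
    """Toy logger that shifts reported iterations by a fixed offset."""

    def __init__(self, previous_last_iteration=0):
        # Assumption: the offset equals the last iteration of the previous run.
        self.offset = previous_last_iteration
        self.records = []

    def report_scalar(self, title, value, iteration):
        # Store the value at the shifted iteration so earlier points are kept.
        self.records.append((title, self.offset + iteration, value))


# The previous run ended at iteration 99; the continued run counts from 1 again.
logger = OffsetLogger(previous_last_iteration=99)
logger.report_scalar("loss", 0.42, iteration=1)
```

The new point lands after the previous run's last report instead of overwriting an earlier iteration.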
Thanks GorgeousMole24
That is a very good point! passing to product guys
This means that you guys internally catch the argparser object somehow, right?
Correct, this is how you get the type checking and casting abilities, and a few other perks
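As a rough illustration of why having the parser object helps, the sketch below casts string overrides using the types declared on an `argparse` parser. This is not how ClearML does it internally; `cast_overrides` is a hypothetical helper, and it reads the private `parser._actions` attribute purely for demonstration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.1)
parser.add_argument("--epochs", type=int, default=10)


def cast_overrides(parser, overrides):
    """Cast string overrides using the types declared on the parser."""
    # _actions is private to argparse; used here only for illustration.
    types = {a.dest: a.type for a in parser._actions if a.type is not None}
    return {k: types[k](v) if k in types else v for k, v in overrides.items()}


# Values edited outside the script (e.g. in a UI) typically arrive as strings.
casted = cast_overrides(parser, {"lr": "0.01", "epochs": "20"})
```

With the declared types available, `"20"` becomes the integer `20` and `"0.01"` becomes a float, without the script doing any extra parsing.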
Hi TrickyRaccoon92
... would any running experiment keep a cache of to-be-sent-data, fail the experiment, or continue the run, skipping the recordings until the server is back up?
Basically they will keep trying to send data to the server until it is up again (you should not lose any of the logs)
Is there any clever functionality for dumping experiment data to external storage to avoid filling up the server?
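The keep-trying behaviour can be pictured with a small buffered sender. This is only a sketch of the general pattern, not the actual client implementation; `BufferedSender`, `fake_send` and the `state` flag are made-up names for the example.

```python
from collections import deque


class BufferedSender:
    """Queue events locally and resend them until the server accepts them."""

    def __init__(self, send):
        self.send = send        # callable that raises ConnectionError when the server is down
        self.pending = deque()  # events not yet delivered

    def report(self, event):
        self.pending.append(event)
        self.flush()

    def flush(self):
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                return False    # keep the event queued and try again later
            self.pending.popleft()
        return True


state = {"up": False}  # simulated server availability
received = []


def fake_send(event):
    if not state["up"]:
        raise ConnectionError("server unreachable")
    received.append(event)


sender = BufferedSender(fake_send)
sender.report({"loss": 0.5})  # server down: event stays queued
sender.report({"loss": 0.4})
state["up"] = True
sender.flush()                # server back: queued events are delivered in order
```

Nothing is dropped while the server is unreachable; the events are delivered in their original order once a send succeeds.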
You mean artifacts or the database ?
Hmm TrickyRaccoon92 take a look at the cleanup service, I think you can hack it so instead of deleting the artifacts, it will archive them somewhere (also you can change the filter, maybe only perform on experiments with specific user tag)
What do you think?
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
I pull all the parameters, and then manually filter on the HP keys (manually=I have to plug them in, they are not part of optimizer object)
So is this an improvement to the optimizer._get_child_tasks_ids(...) interface?
e.g. return a structure like: [ { 'id': task_id, 'hp1': value, 'hp2': value, 'hp3': value, 'objective': dict(title='title', series='series', value=42) }, ]
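A minimal sketch of building that structure from per-trial data, in plain Python. `summarize_trials` and the shape of the input `trials` list are assumptions made for the example; they are not part of the optimizer object.

```python
def summarize_trials(trials, hp_keys, title, series):
    """Flatten each trial into one dict: id, selected hyperparameters, objective."""
    rows = []
    for trial in trials:
        row = {"id": trial["id"]}
        for key in hp_keys:
            # Keep only the hyperparameters being optimized.
            row[key] = trial["params"].get(key)
        row["objective"] = dict(title=title, series=series, value=trial["value"])
        rows.append(row)
    return rows


# Hypothetical input: one finished trial with its parameters and objective value.
trials = [{"id": "task_a", "params": {"hp1": 1, "hp2": 2, "other": 0}, "value": 42}]
rows = summarize_trials(trials, ["hp1", "hp2"], "title", "series")
```

The manual filtering on HP keys then happens in one place instead of in every caller.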
Hi @<1634001100262608896:profile|LazyAlligator31>
Is this because the code repo is being recreated in this directory?
Yes this is correct
Basically the entire code base + venv is installed there, to make sure it does not interfere with the "system" preinstalled environment
(it also allows for caching on the host machine)
You can always log it manually:
from clearml import InputModel
input_model = InputModel.import_model(weights_url='/tmp/keras_example/weight.6.hdf5')
Let me check something
Welp, it's been a day with the new settings, and stats went up 140K for API calls
... going to check again tomorrow to see if any of that was spill over from yesterday
140K calls a day, how often are you sending scalars ? how long is it running? how many experiments are running ?
Tried context provider for Task?
I guess that would only make sense inside notebooks ?!
Any chance you can zip the entire folder? I can't figure out what's missing, specifically "from config_files" , i.e. I have no packages nor file named config_files
Hmm check if this one works:
optimizer._get_child_tasks_ids(
    parent_task_id=optimizer._job_parent_id or optimizer._base_task_id,
    order_by=optimizer._objective_metric._get_last_metrics_encode_field(),
    additional_filters={'page_size': int(top_k), 'page': 0},
)
If it does, let's PR it as a dedicated function
ColossalDeer61 FYI all is fixed now
Which means you currently save the argument after resolving and I'm looking to save them explicitly so the user will not forget to change some dependencies.
That is correct
I'm looking to save them explicitly so the user will not forget to change some dependencies.
Hmm interesting point. What's the use case for storing the values before the resolving ?
Do we want to store both ?
The main reason for storing the post resolve values, is that you have full visibility to the actual...
(I'll make sure it is added to the docstring because apparently it was not there)
RattySeagull0 I think you are correct, Python 3.6 is the one installed inside the docker. Is it important to have 3.7? You might need another docker (or change the installation script and install Python 3.7 inside)
I keep getting a "failed getting token" error
MiniatureCrocodile39 what's the server you are using ?
Notice the order here:
Task.add_requirements("tensorflow")
task = Task.init(...)
clearml doesn't do any "magic" in regard to this for tensorflow, pytorch etc right?
No, and if you have an idea on how, that would be great.
Basically the problem is that there is no "standard" way to know which layer is in/out
let me check a sec
using this is it possible to add to requirements of task with task_overrides?
Correct, but you will be replacing (not adding) requirements
ChubbyLouse32 could it be the configuration file is not passed to the agent machine itself ?
(were you able to run anything against this internal server? I mean to connect to it from code, clearml/clearml-agent) ?
The latest RC (0.17.5rc6) moved all logs into a separate subprocess to improve speed with pytorch dataloaders