My agents are started through systemd, so maybe I should specify the env in the service file. The clearml.conf file looks like it has a section to do it properly (see 2nd point above)
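For what it's worth, the systemd route would be roughly a drop-in like this (a sketch only; the unit name, path and variable are placeholders, assuming the agent runs as clearml-agent.service):
`
# /etc/systemd/system/clearml-agent.service.d/env.conf  -- hypothetical drop-in path
[Service]
# Example variable only; whatever the agent / its tasks need at startup goes here
Environment="MY_ENV_VAR=some-value"
`
Then systemctl daemon-reload && systemctl restart clearml-agent to pick it up.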
And this is a standard Pro SaaS deployment; the autoscaler scale-up was triggered by the remote execution attempt of a pipeline
Okay, thank you for the explanations!
Well, aside from the obvious removal of the line PipelineDecorator.run_locally() on both our sides, the decorator arguments seem to be the same:
`
@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=True,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB',
    repo=False
)
`
And my pipeline controller:
`
@PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1",
    pipeline_execution_queue="Quad_V...
`
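For completeness, the overall shape is roughly this (a sketch; the controller queue name is a placeholder since it is truncated above, and the step body is illustrative):
`
from clearml import TaskTypes
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    return_values=['dataset_id'],
    cache=True,
    task_type=TaskTypes.data_processing,
    execution_queue='Quad_VCPU_16GB',  # worker queue for this step
    repo=False,
)
def generate_dataset(start_date, end_date):
    # Illustrative body; the real step builds the dataset for the date range
    return "some-dataset-id"

@PipelineDecorator.pipeline(
    name="VINZ Auto-Retrain",
    project="VINZ",
    version="0.0.1",
    pipeline_execution_queue="services",  # placeholder; real queue name truncated above
)
def retrain_pipeline(start_date, end_date):
    dataset_id = generate_dataset(start_date, end_date)
    return dataset_id

if __name__ == "__main__":
    # PipelineDecorator.run_locally()  # removed so the pipeline executes remotely
    retrain_pipeline(start_date="2022-09-01", end_date="2022-10-01")
`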
print(f"start_date: {start_date} end_date: {end_date}") time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
Ah thank you I'll try that ASAP
In the meantime, is there some way to set a retention policy for the dataset versions?
Nice, thank you for the reactivity ❤
As specified in the initial message, the instance type used is e2-standard-4
Or do I have to add a pipeline step to prune ancestors that are too old?
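If it comes to that, the kind of pruning step I have in mind is roughly this (a sketch, assuming the entries returned by Dataset.list_datasets() expose 'id' and 'created' fields; the keep-last-N policy is only an example):
`
from clearml import Dataset

def prune_old_dataset_versions(project: str, name: str, keep_last: int = 5):
    # Sketch of a retention step: keep only the newest `keep_last` versions
    versions = Dataset.list_datasets(
        dataset_project=project,
        partial_name=name,
        only_completed=True,
    )
    # Sort newest first by creation time, then delete everything past keep_last
    versions.sort(key=lambda d: d["created"], reverse=True)
    for old in versions[keep_last:]:
        Dataset.delete(dataset_id=old["id"])
`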
Nope same result after having deleted .clearml
Okay thanks! Please keep me posted when the hotfix is out on the SaaS
Hey CostlyOstrich36, did you find anything of interest on the issue?
Well at this point I might as well try to write a PR implementing the behavior I described above
Would gladly try to run it on a remote instance to verify the thesis on some local cache acting up but unfortunately also ran into an issue with the GCP autoscaler https://clearml.slack.com/archives/CTK20V944/p1665664690293529
Have you identified yet if it was a strictly internal issue or should I continue my investigation on my side ?
Old tags are not deleted. When executing a Task (experiment) remotely, this method has no effect.
This description in the add_tags() doc intrigues me though; I would like to remove a tag from a dataset version and add it to another (e.g. a used_in_last_training tag), and this method seems to only add new tags.
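What I'd like to end up with is roughly this (just a sketch going through the backing Task; it assumes each dataset version's ID maps to its Task and that the tags property / set_tags() are available, which I'd still need to verify against the SDK version):
`
from clearml import Task

def move_tag(tag: str, from_dataset_id: str, to_dataset_id: str):
    # Sketch only: remove `tag` from one dataset version and add it to another
    src = Task.get_task(task_id=from_dataset_id)
    dst = Task.get_task(task_id=to_dataset_id)
    # Overwrite the source tags without the moved tag, then append it on the target
    src.set_tags([t for t in (src.tags or []) if t != tag])
    dst.add_tags([tag])

# e.g. move_tag("used_in_last_training", old_version_id, new_version_id)
`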
Oh wow, okay I'll test it with another type
Thanks a lot @<1523701435869433856:profile|SmugDolphin23> ❤
AnxiousSeal95 Okay, it seems to work with a compute-optimized c2-standard-4 instance
I checked the 'CPU-only' option in the auto-scaler config, but that seemed logical at the time
This is an instance that I launched like last week and it was running fine until now; the version is v1.6.0-335
CostlyOstrich36 Having the same issue running on a remote worker, even though the line works correctly in the Python interpreter and the component runs correctly in local debug mode (but not in standard local mode):
` File "/root/.clearml/venvs-builds/3.10/code/generate_dataset.py", line 18, in generate_dataset
time_range = pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
File "/root/.clearml/venvs-builds/3.10/lib/python3.10/site-packages/pandas/core/indexes/date...
`
Oct 24 12:12:51 clearml-worker-446f930fe7ce4aabb597c73b3d98c837 google_metadata_script_runner[1473]: startup-script: (Reading database ... #015(Reading database ... 5%#015(Reading database ... 10%#015(Reading database ... 15%#015(Reading database ... 20%#015(Reading database ... 25%#015(Reading database ... 30%#015(Reading database ... 35%#015(Reading database ... 40%#015(Reading database ... 45%#015(Reading database ... 50%#015(Reading database ... 55%#015(Reading database ... 60%#015(Rea...
The value of start_date and end_date seems to be None
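A plain guard at the top of the component would at least fail with a readable message instead of the pandas traceback (nothing ClearML-specific, just a sanity-check sketch):
`
import pandas as pd

def generate_dataset(start_date, end_date):
    # Fail fast with a clear error instead of letting pd.date_range raise deep in pandas
    if start_date is None or end_date is None:
        raise ValueError(
            f"generate_dataset got start_date={start_date!r}, end_date={end_date!r}; "
            "both must be provided when the component runs remotely"
        )
    print(f"start_date: {start_date} end_date: {end_date}")
    return pd.date_range(start=start_date, end=end_date, freq='D').to_pydatetime().tolist()
`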
Fix confirmed on our side CostlyOstrich36 thanks for everything!
Oh wow, would definitely try it out if there were an Autoscaler App integrating it with ClearML