A few more details on the new RC (1.1.2rc0) change set:
Dataset upload now supports a chunk size, for multi-part upload/download (useful with large datasets)
Backwards compatible, i.e. parent datasets do not have to support multi-part datasets
Note: multi-part datasets should be accessed with the latest RC
    clearml-data upload --chunk-size
    Dataset().upload(..., chunk_size=None)
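To make the chunk-size idea concrete, here is a minimal sketch of how files might be grouped into size-capped parts before upload. This is not ClearML's actual packing logic: `group_into_chunks`, the file names, and the sizes are hypothetical, and the cap is assumed to be expressed in megabytes.

```python
from typing import Dict, List


def group_into_chunks(file_sizes_mb: Dict[str, float], chunk_size_mb: float) -> List[List[str]]:
    """Greedily pack files, in the given order, into chunks of at most chunk_size_mb.

    A single file larger than the cap ends up in a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0.0
    for name, size in file_sizes_mb.items():
        # Close the current chunk when adding this file would exceed the cap.
        if current and current_size + size > chunk_size_mb:
            chunks.append(current)
            current, current_size = [], 0.0
        current.append(name)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


# Hypothetical file sizes in MB, packed with a 512 MB cap.
files = {"a.bin": 300, "b.bin": 250, "c.bin": 100, "d.bin": 700}
chunks = group_into_chunks(files, chunk_size_mb=512)
print(chunks)
```

Smaller chunks mean more, lighter parts to upload and download; larger chunks mean fewer round trips but coarser partial downloads.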
Dataset get now supports partial download (e.g. for debugging, or for more efficient multi-node support)
Note: by default the total number of parts equals the total number of chunks (including parent versions)
    clearml-data get --num-parts X --part y
    Dataset().get_local_copy(..., part=None, num_parts=None)
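For the multi-node case, a rough sketch of how stored chunks could be split across the requested parts is shown here. It assumes a round-robin assignment of chunk indices to parts; the notes above do not state the actual scheme, and `chunks_for_part` is a hypothetical helper, not a ClearML function.

```python
from typing import List


def chunks_for_part(part: int, num_parts: int, total_chunks: int) -> List[int]:
    """Return the chunk indices a given part would fetch under round-robin assignment."""
    if num_parts < 1 or not 0 <= part < num_parts:
        raise ValueError("part must satisfy 0 <= part < num_parts")
    # Part p takes chunks p, p + num_parts, p + 2 * num_parts, ...
    return list(range(part, total_chunks, num_parts))


# Assumed example: 8 stored chunks (parents included) split over 5 nodes.
for part in range(5):
    print(part, chunks_for_part(part, num_parts=5, total_chunks=8))
```

Each node would then pass its own index as `part` and the node count as `num_parts` to `Dataset().get_local_copy(...)`, so that no two nodes download the same chunk.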
Nested pipeline decorators, i.e. pipeline steps can call other pipeline steps.
Class methods that can be used inside pipelines to access the pipeline Task (logging/artifacts)
    Pipeline.get_logger()
    Pipeline.upload_artifact()
Added configuration_objects to the pipeline step override options:
    pipeline.add_step(..., configuration_overrides={'General': dict(key='value'), 'extra': 'raw text here, like YAML'})
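The override mapping in the call above mixes two kinds of values: a dict of key/value pairs and a raw text blob. A small sketch of building and sanity-checking such a mapping before handing it to `add_step` might look like this; `check_overrides` is a hypothetical helper for illustration, not part of ClearML.

```python
from typing import Dict, Union

ConfigOverride = Union[dict, str]


def check_overrides(overrides: Dict[str, ConfigOverride]) -> Dict[str, str]:
    """Report whether each override is a 'dict' or raw 'text'; reject anything else."""
    kinds: Dict[str, str] = {}
    for name, value in overrides.items():
        if isinstance(value, dict):
            kinds[name] = "dict"
        elif isinstance(value, str):
            kinds[name] = "text"
        else:
            raise TypeError(f"override {name!r} must be a dict or a string")
    return kinds


# Same shape as the example in the release note.
configuration_overrides = {
    "General": dict(key="value"),
    "extra": "raw text here, like YAML",
}
kinds = check_overrides(configuration_overrides)
print(kinds)
```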
Automatically log each step's metrics/artifacts/models on the pipeline itself
    pipeline.add_step(..., monitor_metrics, monitor_artifacts, monitor_models)