Hi @<1523702000586330112:profile|FierceHamster54>
Nope, nothing to worry about.
That said, do notice that the open-source file server is not secure. This does not mean it will spill data on the server, but it does mean you should probably put it behind a VPN, or use S3/GCP/Azure if it is open to the public internet.
Hi @<1727497172041076736:profile|TightSheep99>
I think you are correct! It will use the internal per-file upload retry, but it does not let you control it.
Could you please open a GitHub issue so that we do not forget to add it?
SubstantialElk6 I just realized 3 weeks passed, wow!
So the good news is we have some new examples:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_functions.py
The bad news is that the documentation was postponed a bit, as we are still massaging the interface (the community keeps pushing great ideas and use cases, and they are just too good to miss out on)...
I see... let me check something
Any chance you have a toy pytest that replicates it?
is there a way to visualize the pipeline such that this step is "stuck" in executing?
Yes there is, the pipeline plot (see the Plots section on the Pipeline Task) will show the current state of the pipeline.
But I have a feeling you have something else in mind?
Maybe add a Tag on the pipeline Task itself (then remove it when it continues)?
I'm assuming you need something that is quite prominent in the UI, so someone knows?
(BTW I would think of integrating it with the slack monitor, to p...
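If it helps, a minimal sketch of the tag idea (assuming the standard clearml tag API; the task id and tag name are just illustrative):
```python
from clearml import Task

# illustrative: grab the pipeline controller Task by id
pipeline_task = Task.get_task(task_id="<pipeline_task_id>")

# make the "stuck"/waiting state prominent in the UI
pipeline_task.add_tags(["waiting"])

# ...later, once the step continues, remove the marker again
remaining = [t for t in pipeline_task.get_tags() if t != "waiting"]
pipeline_task.set_tags(remaining)
```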
Thanks FrothyShark37
I just verified, this would work as well. I suspect what was missing is the plt.show() call; this is the actual call that triggers clearml.
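For reference, a minimal sketch of the flow (project/task names are illustrative; assumes the default automatic matplotlib binding from Task.init):
```python
import matplotlib.pyplot as plt
from clearml import Task

# illustrative project/task names
task = Task.init(project_name="examples", task_name="matplotlib auto-logging")

plt.plot([0, 1, 2], [0, 1, 4])
plt.title("demo")
# plt.show() is the call that lets clearml capture and report the figure
plt.show()
```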
Seems like Task.create is the correct use-case then, since again this is about testing flows using e.g. pytest.
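A rough sketch of that pattern in a pytest, assuming Task.create as above (project/test names are illustrative):
```python
from clearml import Task

def test_task_flow():
    # create a Task object without making it the "current" running task
    task = Task.create(project_name="tests", task_name="flow-under-test")
    try:
        assert task.id
        # ...exercise whatever flow expects a Task object here...
    finally:
        task.close()
```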
Make sense
This seems to be fine for now, ...
Sounds good! thanks UnevenDolphin73
Hi UpsetCrocodile10
execute them and return scalars.
This should be a good start (I hope)
```python
for child in children:
    # put the Task into an execution queue
    Task.enqueue(child, queue_name='my_queue_here')
    # wait for the task to finish
    child.wait_for_status(status=['completed'])
    # reload all the metrics
    child.reload()
    # get the metrics
    print(child.get_last_scalar_metrics())
```
Hi @<1556812486840160256:profile|SuccessfulRaven86>
Every clearml-serving session (you can have multiple different "sessions") is assumed to be homogeneous, meaning it will serve the same models on as many nodes as possible, supporting multiple models per pod.
In your example I think the easiest is to create two serving sessions: one with a node selector for the 24GB node and another for the 16GB node, wdyt?
What is the Model url?
print(model.url)
Would you have an example of this in your code blogs to demonstrate this utilisation?
Yes! I definitely think this is important, and hopefully we will see something there (or at least in the docs)
SubstantialElk6 this is odd, how are they passed? What's the exact setup?
you can run md5 on the file as stored in the remote storage (nfs or s3)
S3 is implementation specific (i.e. MinIO, Weka, Wasabi etc. might not support it), and I'm actually not sure regarding NFS (I mean you can run it, but it actually means you are reading the data; that said, NFS by definition I'm assuming is relatively fast access)
wdyt?
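As a rough sketch, assuming the file is reachable as a local path (e.g. an NFS mount; the path is illustrative):
```python
import hashlib

def file_md5(path, chunk_bytes=8 * 1024 * 1024):
    """Stream the file and compute its md5 without loading it all into memory."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_bytes), b""):
            md5.update(chunk)
    return md5.hexdigest()

print(file_md5("/mnt/nfs/datasets/my_file.bin"))  # illustrative path
```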
So I had to add it explicitly via a docker init script
Oh yes, that makes sense, can't think of a better hack other than sys.path.append(os.path.join(os.path.dirname(__file__), "src"))
Hmm yes, this is exactly what should not happen
Let me check it
The --template-yaml allows you to use a full k8s YAML template (the overrides option is just overrides, which do not include most of the configuration options; we should probably deprecate it)
LOL that's the spirit, making your team happy is key to success in adoption
Can you see that the environment is actually being passed ?
@PipelineDecorator.component(repo="..")
The imports are not recognized - they are not on the pythonpath of the task that the agent starts.
RoughTiger69 add the imports inside the function itself; you can also specify them on the component: @PipelineDecorator.component(..., packages=["package", "package==1.2.3"])
or do the import at the top of the decorated function body: import pandas as pd  # noqa (see the sketch below)
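A minimal sketch of the import-inside-the-component pattern (function name, package list and csv path are illustrative):
```python
from clearml import PipelineDecorator

@PipelineDecorator.component(packages=["pandas"])
def load_data(csv_path: str):
    # the import lives inside the component, so the agent-side task can resolve it
    import pandas as pd  # noqa
    return pd.read_csv(csv_path)
```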
Hi RoughTiger69
A. Yes, makes total sense. Basically you can use Task.export / Task.import to achieve this process (notice we assume the dataset artifact links are available on both; usually this is the case)
B. The easiest way would be to use Process: one subprocess exports from dev, with the credentials and configuration passed via os environment, and another subprocess imports it into the prod server (again with os environment pointing to the prod server). Make sense? See the rough sketch below.
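A rough sketch of the idea, assuming the SDK exposes Task.export_task() / Task.import_task() for this and that the CLEARML_API_* environment variables select the server (all hosts, keys and ids here are placeholders):
```python
import os
from multiprocessing import Process, Queue

# placeholder credentials for the two servers
DEV_ENV = {"CLEARML_API_HOST": "https://dev-api.example.com",
           "CLEARML_API_ACCESS_KEY": "<dev_key>",
           "CLEARML_API_SECRET_KEY": "<dev_secret>"}
PROD_ENV = {"CLEARML_API_HOST": "https://prod-api.example.com",
            "CLEARML_API_ACCESS_KEY": "<prod_key>",
            "CLEARML_API_SECRET_KEY": "<prod_secret>"}

def export_from_dev(q, task_id):
    os.environ.update(DEV_ENV)
    from clearml import Task  # import only after the environment points at dev
    q.put(Task.get_task(task_id=task_id).export_task())

def import_to_prod(task_data):
    os.environ.update(PROD_ENV)
    from clearml import Task  # import only after the environment points at prod
    Task.import_task(task_data)

if __name__ == "__main__":
    q = Queue()
    exporter = Process(target=export_from_dev, args=(q, "<dev_task_id>"))
    exporter.start(); exporter.join()
    importer = Process(target=import_to_prod, args=(q.get(),))
    importer.start(); importer.join()
```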
BitterLeopard33
How to create a parent-child Dataset with a same dataset_id and only access the child?
Dataset ID is unique, the child will have a different UID. The name of the Dataset can be the same though.
Specifically to create a child Dataset:
https://clear.ml/docs/latest/docs/clearml_data#datasetcreate
child = Dataset.create(..., parent_datasets=['parent_dataset_id'])
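A minimal sketch of creating and filling a child dataset (dataset name, project and paths are illustrative):
```python
from clearml import Dataset

child = Dataset.create(
    dataset_name="my-dataset-v2",
    dataset_project="datasets/example",
    parent_datasets=["<parent_dataset_id>"],
)
child.add_files(path="./new_or_changed_files")  # only the delta vs. the parent is stored
child.upload()
child.finalize()
```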
Are there any ways to access the parent dataset (assuming it's large and I don't want to download it)
...
Hi RoughTiger69
I'm actually not sure about DVC support either; see these links: syncing and registering is a link, not creating an immutable copy.
And the sync between the local and remote seems like it is downloading the remote and comparing to the local copy.
Basically adding a remote source does not mean DVC will create an immutable copy of the content; it's just a pointer to a bucket (feel free to correct me if I misunderstood their capability)
https://dvc.org/doc/command-reference/...
When I say accessing, it means I want to use the data for training (without actually getting a local copy of it).
How can you "access" it without downloading it?
Do you mean train locally on a subset, then on the full dataset remotely?
WackyRabbit7 I'll make sure it is fixed
Actually this is the default for any multi-node training framework (torch DDP / OpenMPI etc.).
OutrageousSheep60 so this should work, no?
ds.upload(output_url='gs://<BUCKET>/', compression=0, chunk_size=100000000000)
Notice the chunk size is the maximum size (in bytes) per chunk, so it should basically be very large
I found something btw, let me check...