clearml==0.17.4
Great, and as you pointed out, changes are detected.
If I make some changes in the original file it gets tracked (picture below)... but what if I use a completely new and git-untracked py script?
Basically it captures the output of "git diff", so if you have a new file that is not added to git, no changes will appear.
That said, this is usually the exception: most files that are not added to git should not be logged, hence the logic.
Anyhow the idea is that t...
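To make the "git diff" point concrete, here is a minimal sketch that only uses plain git through Python's subprocess module (no ClearML involved). It creates a throwaway repo, writes a brand-new script, and compares `git diff` before and after `git add -N` (intent to add). The helper names and the file name are made up for illustration, and whether ClearML's capture treats intent-to-add entries the same way is an assumption I have not verified.

```python
import pathlib
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def diff_before_and_after_intent_to_add(filename="new_script.py"):
    """Return (diff_while_untracked, diff_after_add_N) for a brand-new file."""
    repo = tempfile.mkdtemp()
    try:
        git(repo, "init", "-q")
        pathlib.Path(repo, filename).write_text("print('hello')\n")
        untracked = git(repo, "diff")     # untracked file: git diff reports nothing
        git(repo, "add", "-N", filename)  # register the path without staging content
        tracked = git(repo, "diff")       # the new file now appears in the diff
        return untracked, tracked
    finally:
        shutil.rmtree(repo, ignore_errors=True)
```

So if you want a new script to show up as an uncommitted change, adding it to git first (even just with `git add`) is the thing to try.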
For example: examples/k8s_glue_example.py --queue k8s_gpu - --namespace pod-clearml-conf ~/trains.conf --template-yaml example/base.yml
Any chance @<1578918150261444608:profile|RoundJellyfish71> you can open a GitHub issue so that we can track it? (I think this is indeed a good idea)
Hi UpsetBlackbird87
This is an Optuna decision on how many concurrent tests to run simultaneously.
You limited it to 100, but remember that Optuna runs a Bayesian optimization process: it picks the next set of arguments based on the performance of the previous ones. This means it will first try X trials, then decide on the next batch.
That said, you can add a pruner to Optuna specifying how it should start:
https://optuna.readthedocs.io/en/v1.4.0/reference/pruners.html#optuna.pruners.Median...
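For intuition on what a median-style pruner does with its startup setting, here is a toy sketch in plain Python. It is not Optuna's code and it simplifies the real rule; the function name and arguments are made up for illustration. In Optuna itself the corresponding knob is `n_startup_trials` on `optuna.pruners.MedianPruner`.

```python
from statistics import median


def should_prune(current_value, previous_values, n_startup_trials=5):
    """Toy median rule for a minimization objective.

    previous_values: intermediate values that earlier trials reported
    at the same step.
    """
    # Never prune until enough earlier trials exist to compare against.
    if len(previous_values) < n_startup_trials:
        return False
    # Prune when the current trial is worse (higher) than the median.
    return current_value > median(previous_values)


history = [0.9, 0.7, 0.8, 0.6, 0.75]
print(should_prune(0.95, history[:3]))  # fewer than 5 earlier trials
print(should_prune(0.95, history))      # compared against the median of 5
```

The point is that the first few trials always run to completion, and only after that does the pruner start cutting the ones that look worse than what it has already seen.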
BitterLeopard33
How to create a parent-child Dataset with the same dataset_id and only access the child?
Dataset ID is unique, the child will have a different UID. The name of the Dataset can be the same though.
Specifically to create a child Dataset:
https://clear.ml/docs/latest/docs/clearml_data#datasetcreate
` child = Dataset.create(..., parent_datasets=['parent_dataset_id']) `
Are there any ways to access the parent dataset (assuming it's large and I don't want to download it)?
...
It would be nice to have some documentation proclaiming how randomness behaves when running tasks (in all their variations). E.g. Should I trust seeds to be reset or should I not assume anything and do my own control over seeds.
That is a good point, I'll make sure we mention it somewhere in the docs. Any thoughts on where?
I located the issue, I'm assuming the fix will be in the next RC 🙂
(probably tomorrow or before the weekend)
Create a new version of the dataset by choosing what increment in SEMVER standard I would like to add for this version number (major/minor/patch) and upload
Oh this is already there
` from clearml import Dataset
cur_ds = Dataset.get(dataset_project="project", dataset_name="name")
# if version is not given it will auto increase based on semantic versions, incrementing the last number 1.2.3 -> 1.2.4
new_ds = Dataset.create(dataset_project="project", dataset_name="name", parent_datasets=[cur_ds.id]) `
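If you want to choose the increment yourself (major/minor/patch) instead of relying on the automatic 1.2.3 -> 1.2.4 bump, a small helper like the one below can compute the next version string. This is a plain-Python sketch, not ClearML's implementation; `bump_version` is a made-up name. I believe recent clearml versions accept a `dataset_version` argument on `Dataset.create`, but treat that as an assumption and check your installed version.

```python
def bump_version(version, part="patch"):
    """Return `version` (MAJOR.MINOR.PATCH) with one part incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"          # reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"    # reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


next_version = bump_version("1.2.3", part="minor")
# Assumption: pass it on explicitly, e.g.
# Dataset.create(..., parent_datasets=[cur_ds.id], dataset_version=next_version)
```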
SmarmySeaurchin8 could you test with the latest RC?
pip install clearml==0.17.5rc2
This is very odd ... let me check something
What's the clearml-server version ?
Does clearml resolve the CUDA Version from driver or conda?
Actually it starts with the default CUDA based on the host driver, but when it installs the conda env it takes it from the "installed packages" (i.e. the one you used to execute the code in the first place)
Regarding the link, I could not find the exact version but this is close enough I guess:
None
ReassuredTiger98 maybe we should add an option to send a text next to the abort?
(Actually it is just a matter of passing the argument)
wdyt?
@<1532532498972545024:profile|LittleReindeer37> nice!!! 😍
Do you want to open a PR? It will be relatively easy to merge and test, and I think that they might even push it to the next version (or worst case a quick RC)
Pycharm does get confused sometimes
Yep I think you are correct, you should have had the same output as a local jupyter notebook, and it seems that in sagemaker studio it is not working 😞
Let me check something
Hi @<1523701066867150848:profile|JitteryCoyote63>
Thank you for bringing it up! Can you verify with the latest clearml-agent, 1.5.3rc2?
Hi BroadMole64
'from X import Y', which says that there isn't such module X. any help? thanks.
can you see package X under the "Execution" tab "Installed Packages" section ?
(think of this section as the requirements.txt of the task: for the agent to install the package on the remote machine, it should be listed there)
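If you copy the "Installed Packages" text out of the UI, a quick check like the one below tells you whether a package is listed. This is only a sketch in plain Python: the helper names are made up, and it only understands simple `name==version` style lines (no URLs, editable installs, or name normalization). If the package turns out to be missing, one option to look at is calling `Task.add_requirements("X")` before `Task.init`.

```python
import re


def listed_packages(requirements_text):
    """Return the lower-cased package names found in requirements-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # The name is whatever comes before a version specifier or extra.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def is_listed(package, requirements_text):
    """True if `package` appears by name in the requirements-style text."""
    return package.lower() in listed_packages(requirements_text)


installed = "clearml==0.17.4\nnumpy>=1.19  # pinned loosely\n"
print(is_listed("numpy", installed), is_listed("X", installed))
```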
SSH is used to access the actual container; all other communication is tunneled on top of it. What exactly is the reason to bind to 0.0.0.0? Maybe it could be a flag that you set, but I'm not sure what the scenario is or what we are solving. Thoughts?
Good question 🙂
` from clearml import Task
Task.init('examples', 'test') `
Hi ZippySheep23
Any ideas what might be happening?
I think you passed the upload limit (2.36 GB) 🙂
Hi SubstantialElk6
Generically, we would 'export' the preprocessing steps, set up an inference server, and then pipe data through the above to get results. How should we achieve this with ClearML?
We are working on integrating the OpenVino serving and Nvidia Triton serving engines into ClearML (both will be available soon)
Automated retraining
In cases of data drift, retraining of models would be necessary. Generically, we pass newly labelled data to fine...
Hi SparklingHedgehong28
What would be the use for "end of docker hook" ? is this like an abort callback? completion ?
instance protection
Do you mean like when the instance just died (like spot in AWS)?
Hi @<1726047624538099712:profile|WorriedSwan6>
On a different issue, do you have any solution for making the agent listen to multiple queues?
Each agent is connected to one type of queue, which represents the job that agent will create. You can connect multiple queues to it, and it will pull from all of them, creating the same "type" of job regardless of where it's coming from. If you want another type of job to be created, just spin up another agent; there is no limit to the number of agents you can spin ...
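As a mental model of one agent serving several queues, here is a toy simulation in plain Python. It is not the agent's code: `pull_next` and the queue names are made up, and checking the queues in the order they were listed is an assumption about the default behaviour. On the real CLI I believe this corresponds to passing several names to `--queue` (something like `clearml-agent daemon --queue q1 q2`), so please verify against your agent version.

```python
from collections import deque


def pull_next(queues, queue_order):
    """Return (queue_name, job) from the first non-empty queue, else None.

    queues: dict mapping queue name -> deque of pending jobs.
    queue_order: queue names in the order the agent should check them.
    """
    for name in queue_order:
        pending = queues.get(name)
        if pending:  # skips missing and empty queues
            return name, pending.popleft()
    return None


queues = {"gpu": deque(["job-a"]), "cpu": deque(["job-b", "job-c"])}
while (item := pull_next(queues, ["gpu", "cpu"])) is not None:
    print(item)  # every job is handled the same way, whatever queue it came from
```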
for example, if I somehow start the execution of an agent task in a specific docker container?)
You mean to specify the container from code? or to make sure the agent can access private docker container registry ? Or is it for private pypi container repository ?