
The answer is simple but not completely obvious to someone new to the platform. You can inject new command-line args that Hydra will recognize; this is what the Hydra section of args is for. However, if you enable `_allow_omegaconf_edit_: True`, I think ClearML will "inject" the OmegaConf saved under the configuration object of the prior run, overwriting the overrides. I'll experiment with this behavior a bit more to be sure.
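A sketch of what I mean, driven from code rather than the UI (the exact key layout under the Hydra section is my assumption; the task ID and override name are placeholders):

```python
from clearml import Task

# Clone a previous Hydra-based run (the task ID is a placeholder)
template = Task.get_task(task_id="<source_task_id>")
cloned = Task.clone(source_task=template, name="rerun with new Hydra override")

# Add or change a Hydra command-line style override; the "Hydra/<override>" key
# naming here is an assumption about how the section is stored
cloned.set_parameter("Hydra/trainer.max_epochs", "20")

# Leave this False so the overrides above are applied; setting it to True makes
# ClearML inject the saved OmegaConf from the original run instead
cloned.set_parameter("Hydra/_allow_omegaconf_edit_", "False")

Task.enqueue(cloned, queue_name="default")
```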
I might have found the answer. I'll reply if it works as expected.
Thanks for your reply @<1523701070390366208:profile|CostlyOstrich36> Is there an example where a pipeline is built from existing tasks? I'd like to experiment with it, and I don't see any examples of what you describe with my (clearly lacking) google-fu. What happens if you wrap a function that calls `task.init()` with a pipeline decorator, or is that the process you're speaking of?
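In the meantime, here's the kind of thing I'm imagining (a sketch only; the project, task, and queue names are placeholders, not anything from an actual setup):

```python
from clearml import PipelineController

pipe = PipelineController(
    name="pipeline from existing tasks",
    project="examples",
    version="1.0.0",
)
pipe.set_default_execution_queue("default")

# Each step clones an existing (template) task and enqueues the clone
pipe.add_step(
    name="stage_data",
    base_task_project="examples",
    base_task_name="data preparation",
)
pipe.add_step(
    name="stage_train",
    parents=["stage_data"],
    base_task_project="examples",
    base_task_name="training",
    # pass the previous step's task ID into the cloned task's hyperparameters
    parameter_override={"General/dataset_task_id": "${stage_data.id}"},
)

pipe.start(queue="pipelines")
```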
Thanks, that's exactly what I was looking for.
Hi again @<1523701435869433856:profile|SmugDolphin23> ,
The approach you suggested seems to be working albeit with one issue. It does correctly identify the different versions of the dataset when new data is added, but I get an error when I try and finalize the dataset:
Code:
if self.task:
    # get the parent dataset from the project
    parent = self.clearml_dataset = Dataset.get(
        dataset_name="[LTV] Dataset",
        dataset_project=...
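For context, a minimal sketch of the flow I'm describing (the project name and file path are placeholders, not my exact code):

```python
from clearml import Dataset

# Get the latest finalized parent version
parent = Dataset.get(
    dataset_name="[LTV] Dataset",
    dataset_project="my_project",
    only_completed=True,
)

# Create a child version that inherits the parent's files
child = Dataset.create(
    dataset_name="[LTV] Dataset",
    dataset_project="my_project",
    parent_datasets=[parent.id],
)
child.add_files(path="data/new_batch")
child.upload()
child.finalize()  # this is the call that raises the error for me
```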
Interesting approach. I'll give that a try. Thanks for the reply!
@<1523701070390366208:profile|CostlyOstrich36> ClearML: 1.10.1, I'm not self-hosting the server so whatever the current version is. Unless you mean the operating system?
@<1523701435869433856:profile|SmugDolphin23> Good to know.
Hi again Eugen,
If I use the hyperparameter tool in ClearML, won't that create a different experiment for every step of the hyperparameter optimizer? So this would run across experiments. I could do something with pipelines, but since the metrics are already available in the ClearML hyperparameter/metric tables, I thought it would make sense to be able to plot against those values.
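For reference, this is roughly how I understand the optimizer being driven (a sketch; the metric names, parameter range, and queue are made up):

```python
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformParameterRange
from clearml.automation.optuna import OptimizerOptuna

# Controller task; every trial below becomes its own cloned experiment
task = Task.init(
    project_name="examples",
    task_name="HPO controller",
    task_type=Task.TaskTypes.optimizer,
)

optimizer = HyperParameterOptimizer(
    base_task_id="<template_task_id>",  # experiment to clone per trial
    hyper_parameters=[
        UniformParameterRange("General/learning_rate", min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="validation",
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=OptimizerOptuna,
    execution_queue="default",
    max_number_of_concurrent_tasks=2,
    total_max_jobs=20,
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```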
So far when I delete a task or dataset using the web interface that has artifacts on S3 it doesn't prompt me for credentials.
I figured as much. This is basically what I was planning to do otherwise. I have a couple of questions around that:
- It appears that the 'extra' config is displayed in plain text on the web app and is downloadable as JSON. I was just curious whether this is best practice (a sketch of the kind of content I mean is after this list).
- I noticed that in the AWS instance that's spun up when starting the autoscaler there are 3 settings in the config:
`use_credentials_chain: false`, `use_iam_instance_profile: false`, `use_owner_token: false`
are these strictly for the credentials t...
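For reference, this is the kind of snippet I'm talking about pasting into the autoscaler's extra configuration (placeholder values, assuming the standard clearml.conf `sdk.aws.s3` layout):

```
sdk {
    aws {
        s3 {
            key: "<AWS_ACCESS_KEY_ID>"
            secret: "<AWS_SECRET_ACCESS_KEY>"
            region: "us-east-1"
        }
    }
}
```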
Yes, it indeed appears to be a regex issue. If I run:
import re
from clearml import Dataset

Dataset.list_datasets(
    dataset_project=self.task.get_project_name(),
    partial_name=re.escape('[LTV] Dataset Test'),
    only_completed=True,
)
It works as expected. I'm not sure how raw you want to leave the partial_name feature. I could create a PR to fix this, but would you want me to re.escape at the list_datasets() level? Or go deeper and do it at `Task._query_task...
I'd like to provide the credentials to any ec2 instances that are spun up.
What version of ClearML server are you using?
You might want to start with the first steps guide then:
None
Hyperdatasets are the only ones that require a premium. If you're using normal datasets it should be fine.
Are you self hosting a ClearML server?
Ah, I think I see the issue. In my head I was crossing ID with URL.
If I wanted to do this with the ID, how would I approach it?
In this case it's the ID of the "output" model from the first task.
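Something like this is what I have in mind (a sketch; the task IDs and names are placeholders):

```python
from clearml import InputModel, Task

# First task that produced the model (ID is a placeholder)
first_task = Task.get_task(task_id="<first_task_id>")

# Grab the ID of its most recent output model
output_model_id = first_task.models["output"][-1].id

# Register that model as an input of the second task
second_task = Task.init(project_name="examples", task_name="second task")
model = InputModel(model_id=output_model_id)
second_task.connect(model)

# Local copy of the weights file
weights_path = model.get_weights()
```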
I'm using Pro. Sorry for the delay, I didn't notice I never sent the response.
The plot thickens. It seems like there's something odd going on with the interaction between `[LTV]` and additional text. If I just search `[LTV]` it works, if I just search `Dataset Test` it works, but if I put them together it breaks the search. Now that I think about it, there are other oddities that seem to happen in the web interface that might be explained by some bugs around using brackets in names.
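To illustrate what I suspect is happening (a plain-Python sketch of how an unescaped pattern behaves, not ClearML's actual query code):

```python
import re

name = "[LTV] Dataset Test"

# "[LTV]" is read as a character class (one of L, T, or V), so the combined
# pattern never matches the literal bracketed name
print(re.search(r"[LTV] Dataset Test", name))             # None

# The pieces on their own still match, which is why the partial searches work
print(re.search(r"[LTV]", name))                          # matches the "L"
print(re.search(r"Dataset Test", name))                   # matches literally

# Escaping makes the full name match again
print(re.search(re.escape("[LTV] Dataset Test"), name))   # match object
```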
Depending on the framework you're using, it'll just hook into the save-model operation, which will probably happen every epoch for some subset of the training. If you want to do it with the existing framework, you could change the checkpointing so that it only keeps the best model in memory and saves the write operation for last. The risk with this is that if the training crashes, you'll lose your best model.
Optionally, you could also disable the ClearML integration with...
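One way to do that last part, assuming the framework auto-logging is what you want to turn off (a sketch using the `auto_connect_frameworks` argument; adjust the framework key to whatever you're using):

```python
from clearml import Task

# Disable automatic model capture for a specific framework (PyTorch here as an
# example) while keeping the rest of the ClearML integration active
task = Task.init(
    project_name="examples",
    task_name="training without auto model upload",
    auto_connect_frameworks={"pytorch": False},
)
```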
Yeah, it's because it's just hooking into the save operation and capturing the output, regardless of the parent call.
I see. Thanks for the insight. That seems to be the case. I'm struggling a bit with datasets, for example tracing the genealogy of a dataset that's used by traditional tasks and pipelines. I'll try and write something up about the challenges around that when I get the chance. But your comment revealed another issue:
It appears that the partial name matching isn't going well. I'm unclear why this wouldn't be matching. In the attached photo you can see the input for `partial_nam...
Alright, I'll try and put that together for Monday.
@<1523701070390366208:profile|CostlyOstrich36> Just pinging you 😄
Could you provide a bit more detail? What framework are you using?