(including caching, even if the number of elements in the list of vals changes)
So the DAG is getting confused when bringing the results of the Tasks together
Is there a rule whereby only Python native datatypes can be used as the “outer” variable?
I have a dict of numpy np.arrays elsewhere in my code, and that works fine with caching.
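For context on why the element count matters: step caching is typically keyed on a hash of the serialized inputs, so a cache hit depends only on the inputs serializing to the same bytes. This is a minimal stdlib sketch of that idea (the function name cache_key is hypothetical, not ClearML's actual implementation):

```python
import hashlib
import pickle

def cache_key(*args, **kwargs):
    """Build a cache key by hashing the pickled arguments.

    Any picklable object works (native types, numpy arrays, dicts of
    arrays), but the key only stays stable if the object's pickle
    representation is deterministic.
    """
    payload = pickle.dumps((args, sorted(kwargs.items())), protocol=4)
    return hashlib.sha256(payload).hexdigest()

# Identical inputs -> identical key (cache hit)
assert cache_key([1, 2, 3]) == cache_key([1, 2, 3])
# Different number of elements -> different key (cache miss, step re-runs)
assert cache_key([1, 2, 3]) != cache_key([1, 2, 3, 4])
```

Under this model a changed list length simply produces a new key, which should invalidate the cache for that step rather than confuse the DAG.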
I basically just mean having a date input like you would in excel where it brings up a calendar and a clock if it’s time – and defaults to “now”
my colleague, @<1534706830800850944:profile|ZealousCoyote89>, has been looking at this – I think he has used the relevant kwarg in the component decorator to specify the packages, and I think it worked, but I’m not 100% sure. Connah?
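For reference, this is roughly the shape of what's being described – the `packages` kwarg on `PipelineDecorator.component` lets you pin the packages installed in the step's environment. A hedged sketch (the step name, package pins, and CSV path are assumptions, not from the thread):

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    return_values=["df"],
    cache=True,
    packages=["pandas>=1.5", "numpy"],  # installed for this step's run
)
def load_data(csv_path):
    # Imports live inside the component so they resolve in the step's env
    import pandas as pd
    return pd.read_csv(csv_path)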
The Dataset object itself is not being passed around. The point of showing you that was to say that the Dataset may change, and therefore the number of objects loaded from it (e.g. a number of pandas DataFrames that were CSVs in the dataset) could change
Yep, that’s it. Obviously it would be nice to not have to go via the shell, but that’s by the by (edit: I don’t know of a way to build or run a new version of a pipeline without going via the shell, so this isn’t a big deal).
Basically, for a bit more context, this is part of an effort to incorporate ClearML Pipelines in a CI/CD framework. Changes to the pipeline script create_pipeline_a.py that are pushed to a GitHub master branch would trigger the build and testing of the pipeline.
And I’d rather the testing/validation etc lived outside of the ClearML Pipeline itself, as stated earlier – and that’s what your pseudo code allows, so if it’s possible that would be great. 🙂
from tempfile import mkdtemp
new_folder = with_feature.get_mutable_local_copy(mkdtemp())

It’s the get_mutable_local_copy line that causes the issue
Ah ok. I’m guessing the state file is auto-uploaded in the background? I haven’t kicked that off “intentionally”