AFAIK that's the only way right now (see my comment here - https://clearml.slack.com/archives/CTK20V944/p1657720159903739?thread_ts=1657699287.630779&cid=CTK20V944 )
Or, if you have the ClearML paid service, I believe there is a "vaults" service, right AgitatedDove14?
This is with:
Task.set_offline_mode(True)
task = Task.init(..., auto_connect_streams=False)
The only thing I could think of is that the output of pip freeze would be a URL?
Yes that's what I thought, thanks for confirming.
-ish, still debugging some weird stuff. Sometimes ClearML picks ip and sometimes ip2, and I can't tell why 🤔
I wouldn't put it past ClearML automation (a lot of stuff depends on certain suffixes), but I don't think that's the case here, hmm
That would be nice :)
So a missing bit of information that I see I forgot to mention is that we named our package foo-mod in pyproject.toml. That hyphen then gets rewritten as foo_mod.x.y.z-distinfo.
foo-mod @ git+
Thanks Alon. In the full/official documentation the clearml-data CLI is not mentioned anywhere, so perhaps it should be refreshed 😉
I think we're referring to different things here.
I won't be using the UI (and neither will my team).
But as mentioned, we've used DVC before and it adds a lot of junk metadata files to each GitHub PR (many dvc.yaml, dvc.lock and .gitignore files). We're trying to avoid that as much as possible, hence my question about GitHub pull...
Thanks @<1537605940121964544:profile|EnthusiasticShrimp49>! That's definitely the route I was hoping to go, but create_function_task is still a bit of a mystery, as I'd like to use an entire class with relevant logic and proper serialization for inputs, and I'll potentially need to add more "helper functions" (as in the case of DataTransformationStep, for example). Any thoughts on that? 🤔
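For what it's worth, here's a minimal sketch of what I had in mind (the DataTransformationStep class and the my_pipeline.steps module are hypothetical names of mine), wrapping the class logic in a thin function so create_function_task can pick it up:
from clearml import Task

def run_transformation(config: dict):
    # hypothetical helper class, imported inside the function so the
    # generated task can resolve it when it runs remotely
    from my_pipeline.steps import DataTransformationStep
    step = DataTransformationStep(**config)
    return step.run()

task = Task.init(project_name="demo", task_name="controller")
# keyword arguments become the function task's inputs
fn_task = task.create_function_task(run_transformation, config={"scale": True})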
My suspicion is that this relates to https://clearml.slack.com/archives/CTK20V944/p1643277475287779 , where the config file is loaded prematurely (upon import), so our dotenv.load_dotenv() call has not yet taken effect.
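In the meantime, a workaround sketch (assuming the premature loading really happens on import): call load_dotenv() before anything imports clearml, so the variables are already in os.environ when the config is read:
from dotenv import load_dotenv
load_dotenv()  # must run before the clearml import below

from clearml import Task
task = Task.init(project_name="demo", task_name="env-check")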
I think also the script path in the created task will cause some issues, but let’s see…
Basically when running remotely, the first argument to any configuration (whether object or string, or whatever) is ignored, right?
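To make sure we're talking about the same thing, a sketch of what I mean (assuming connect_configuration; project and task names are made up):
from clearml import Task

task = Task.init(project_name="demo", task_name="config-demo")
# locally, the dict literal below is used and stored on the task;
# when executed remotely, the returned values come from the server,
# so the first argument is effectively ignored
config = task.connect_configuration({"lr": 0.001}, name="train-config")
print(config["lr"])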
Yes exactly that AgitatedDove14
Testing that our logic maps correctly, etc., for everything related to ClearML
I'm trying to decide if ClearML is a good use case for my team 🙂
Right now we're not looking for a complete overhaul into new tools, just some enhancements (specifically, model repository, data versioning).
We've been burnt by DVC and the likes before, so I'm trying to minimize the pain for my team before we set out to explore ClearML.
Yeah 🤔 🤔 they did. I'll give your suggested fix a go on Monday!
Any thoughts @<1523701070390366208:profile|CostlyOstrich36> ?
I wouldn’t want to run the entire notebook, just a specific part of it.
I understand, but then the TOML file needs to be parsed to determine whether poetry is used. It's just a tool entry in the pyproject.toml.
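Something like this is what I mean by "needs to be parsed" (just a sketch; tomllib is stdlib from Python 3.11):
import tomllib

with open("pyproject.toml", "rb") as f:
    pyproject = tomllib.load(f)

# poetry shows up as a [tool.poetry] table
uses_poetry = "poetry" in pyproject.get("tool", {})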
Is Task.create the way to go here? 🤔
Yes. Though again, just highlighting that the naming of foo-mod is arbitrary. The actual module simply has a folder structure with an implicit namespace:
foo/
mod/
__init__.py
# stuff
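For context, the rewriting I'm describing is (I believe) the standard wheel/.dist-info name escaping, where runs of "-", "_", "." collapse into a single underscore; a tiny sketch:
import re

def escape_dist_name(name: str) -> str:
    # PEP 427-style escaping of the distribution name
    return re.sub(r"[-_.]+", "_", name)

print(escape_dist_name("foo-mod"))  # -> foo_mod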
FWIW, for the time being I'm just setting the packages to all the packages the pipeline task sees, with:
# collect installed packages as (name, version) pairs and render pip specs
packages = get_installed_pkgs_detail()
packages = [f"{name}=={version}" if version else name for name, version in packages.values()]
packages = task.data.script.require...
Last but not least - can I cancel the offline zip creation if I'm not interested in it? 🤔
EDIT: I see it's not possible, guess one has to patch ZipFile ...
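Just to spell out the hack I mean (purely a sketch; this stubs out zipfile.ZipFile globally, so it would affect any other zip usage in the same process):
import zipfile

class _NoopZipFile:
    # no-op stand-in so no archive is actually written
    def __init__(self, *args, **kwargs): ...
    def write(self, *args, **kwargs): ...
    def writestr(self, *args, **kwargs): ...
    def close(self): ...
    def __enter__(self): return self
    def __exit__(self, *exc): return False

zipfile.ZipFile = _NoopZipFile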
packages an entire folder as zip
What if I have multiple files that are not in the same folder? (That is the current use-case)
It otherwise makes sense I think 🙂
Our current workaround for using a Dataset the way we do is to store the dataset ID as a configuration parameter, so it's always included too 😉
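Roughly like this (names and the dataset ID are placeholders):
from clearml import Task, Dataset

task = Task.init(project_name="demo", task_name="train")
params = {"dataset_id": "<dataset-id>"}  # placeholder
task.connect(params)  # recorded on the task, overridable remotely

local_path = Dataset.get(dataset_id=params["dataset_id"]).get_local_copy()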
We're still working these quirks out. But one issue after we changed the AMI was that the VPC (SubnetId?) was missing from the instance, so it could not reach the ClearML API server.
I think maybe the autoscaler service is missing some additional settings...
Sorry to keep this up - what about support for MinIO using the environment variable? Do I set CLEARML_FILES_HOST to the endpoint instead of an S3 bucket?
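i.e., something like this, if I'm reading it right (the s3://<host>:<port>/<bucket> form for MinIO is my assumption here):
import os
# hypothetical MinIO host and bucket
os.environ["CLEARML_FILES_HOST"] = "s3://my-minio:9000/clearml-bucket"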
And agent too, I hope..?
I'd be happy to join a #releases channel just for these!
Just randomly decided to check and saw there's a server 1.4 ready 🎉
I realized it might work too, but I'm looking for a more definitive answer 😄 Has no one attempted this? 🤔
The deferred_init input argument to Task.init is a bool by default, so checking type(deferred_init) == int makes no sense to begin with, and it alters the flow.
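Quick illustration of why that check is odd (bool is a subclass of int in Python, so type() never matches int for an actual bool):
deferred_init = True
print(type(deferred_init) == int)      # False
print(isinstance(deferred_init, int))  # True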
It does (root in a docker container); it shouldn't touch /run/systemd/generator/systemd-networkd.service anyway though
Would be great if it is 😍 We have a few files that change frequently and are quite large, so saving every version would be quite a storage hit.