Seems like you're missing an image definition (AMI or otherwise)
https://github.com/allegroai/clearml-agent/pull/98 AgitatedDove14 🙂
Hurrah! Added git config --system credential.helper 'store --file /root/.git-credentials' to the extra_vm_bash_script and now it works
(logs the given git credentials in the store file, which can then be used immediately for the recursive calls)
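For context, this is roughly where that credential-helper line lives; a minimal sketch, assuming the layout of the AWS autoscaler example config (everything except extra_vm_bash_script is a placeholder):

```yaml
configurations:
  extra_vm_bash_script: |
    # persist the injected git credentials so that recursive git calls
    # (e.g. submodules, pip git+ installs) can reuse them
    git config --system credential.helper 'store --file /root/.git-credentials'
```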
@<1539780258050347008:profile|CheerfulKoala77> you may also need to define subnet or security groups.
Personally I do not see the point in Docker over EC2 instances for CPU instances (virtualization on top of virtualization).
Finally, just to make sure, you only ever need one autoscaler. You can monitor multiple queues with multiple instance types with one autoscaler.
This happened again 🤔
How many files does ClearML touch? :shocked_face_with_exploding_head:
Looks great, looking forward to all the new treats 🙂
Happy new year! 🎉
I'll have some reports tomorrow I hope, TimelyPenguin76 SuccessfulKoala55!
I can elaborate in more detail if you have the time, but generally the code is just defined in some source files.
I've been trying to play around with pipelines for this purpose, but as suspected, it fails finding the definition for the pickled object…
Honestly I wouldn't mind building the image myself, but the glue-k8s setup is missing some documentation so I'm not sure how to proceed
The tl;dr is that some of our users like poetry and others prefer pip. Since pip install git+... stores the git data, it seems trivial to first try installing with pip, and only fall back to poetry afterwards, since pip would crash on poetry projects as they store the git data elsewhere (in poetry.lock)
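A hypothetical sketch of that ordering, pip first and poetry as fallback (the function names and heuristics are made up for illustration, not the agent's actual code):

```python
def looks_like_pip_vcs(requirement_line: str) -> bool:
    # pip records a git install with the URL embedded in the requirement,
    # e.g. "pkg @ git+https://github.com/org/pkg.git@<sha>" or a bare "git+..." line
    line = requirement_line.strip()
    return line.startswith("git+") or " @ git+" in line


def pick_installer(requirements: list[str], has_poetry_lock: bool) -> str:
    # try pip first: if the requirements already carry the git data, pip can
    # recreate the environment on its own; otherwise fall back to poetry,
    # which keeps the git data in poetry.lock instead
    if any(looks_like_pip_vcs(r) for r in requirements):
        return "pip"
    return "poetry" if has_poetry_lock else "pip"
```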
That's fine as well - the code simply shows the name of the environment variable, not its value, since that's taken directly from the agent listening to the services queue (and which then runs the scaler)
Ah, uhhhh whatever is in the helm/glue charts. I think it's the allegroai/clearml-agent-k8s-base, but since I hadn't gotten a chance to try it out, it's hard to say with certainty which would be best for us 🙂
More experiments @<1537605940121964544:profile|EnthusiasticShrimp49> - the core issue with create_function_step seems to be that the chosen executable will be e.g. IPython or some notebook, and not e.g. python3.10, so it fails running it as a task… 🤔
I have seen this quite frequently as well tbh!
I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
Or is that functionality provided by setting offline mode and then importing an offline task?
It's okay 🙂 I was originally hoping to delete my "initializer" task, but I'll just archive it if someone is interested in the worker data etc. Setting the queue is quite nice.
I think this should get my team excited enough 🙂
I can only say I've found ClearML to be very helpful, even given the documentation issue.
I think they've been working on upgrading it for a while, hopefully something new comes out soon.
Maybe @<1523701205467926528:profile|AgitatedDove14> has further info 🙂
For example, can't interact with these two tasks from this view (got here from searching in the dashboard view; they're in different projects):
Thanks! To clarify, all the agent does is then spawn new nodes to cover the tasks?
Yes, exactly! I've added instructions for the users on creating their account and running clearml-init , and then they run the snippet that updates the api and sdk sections.
Or did you mean I can couple a short "mini config" with the package and redirect clearml to use this local one (instead of the one at ~/clearml.conf)?
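On the second option: a sketch of such a "mini config", assuming the standard api/sdk layout of clearml.conf and the CLEARML_CONFIG_FILE environment variable for pointing clearml at it instead of ~/clearml.conf (all hosts and values below are placeholders):

```
# team.conf - shipped alongside the package; selected via
#   export CLEARML_CONFIG_FILE=/path/to/team.conf
api {
    web_server: "https://app.example.com"
    api_server: "https://api.example.com"
    files_server: "https://files.example.com"
    # per-user credentials can instead be injected via the
    # CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY environment variables
}
sdk {
    # team-wide sdk overrides go here
}
```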
SmugDolphin23 I think you can simply change not (type(deferred_init) == int and deferred_init == 0) to deferred_init is True?
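Worth noting the two conditions aren't strictly equivalent; a quick sanity check (the function names are made up for illustration):

```python
def original_check(deferred_init):
    # current condition: passes for anything except the int 0
    # (note type(False) is bool, not int, so False also passes here)
    return not (type(deferred_init) == int and deferred_init == 0)

def suggested_check(deferred_init):
    # proposed simplification: passes only for the literal True
    return deferred_init is True

# the two disagree e.g. on False, 1, and None
for value in (True, False, 0, 1, None):
    print(value, original_check(value), suggested_check(value))
```

So the simplification also changes behavior for non-bool truthy values (e.g. deferred_init=1), which may or may not be intended.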
Ah okay 🙂 Was confused by what you quoted haha 🙂
SuccessfulKoala55 That string was autogenerated by pyhocon and matches their documentation too - https://github.com/lightbend/config/blob/master/HOCON.md#substitutions
The first example won't work (it will treat ${...} as a string literal and won't replace it). The second does work, but as mentioned anyway, these were not hand typed, but rather generated from pyhocon, so I don't think that's the issue 🤔
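For reference, the behavior being described, per the linked HOCON spec (the paths below are made up):

```
base_dir = "/opt/app"
# unquoted substitution: resolved at parse time to "/opt/app/data"
data_dir = ${base_dir}"/data"
# inside a quoted string the marker is left as-is:
# this stays the literal string "${base_dir}/data"
literal_dir = "${base_dir}/data"
```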
Or do you mean the contents of the configuration, probably :face_palm: ... one moment
I'm not sure what you mean by "entity", but honestly anything works. We're already monkey-patching our way 🙂
That's a nice workaround of course - I'm sure it works and I'll give it a shot momentarily. I'm just wondering if ClearML could automatically recognize image files (and other well-known suffixes) in upload_artifact and do that for me.
I see! The Hyper Datasets don't really fit our use case - they seem really focused on CNNs and image-based data, while lacking support for database-oriented tabular data.
So for now we mainly work with parquet and CSV files, and I was hoping there'd be an easy way to view those... I'll make a workaround with a "Datasets" project I suppose!
TimelyPenguin76 CostlyOstrich36 It seems a lot of manual configuration is required to get the EC2 instances up and running.
Would it not make sense to update the autoscaler (and example script) so that the config.yaml used for the autoscaler service is implicitly copied to the EC2 instances, with any extra_clearml_conf values then applied/overwritten on top?
Those are cool and very welcome additions (hopefully the additional info in the Info tab will be a link?) 🙂
The main issue is the clutter that the forced renaming creates, as shown in the pictures I attached in the other thread.
Why does ClearML hide the dataset task from the main WebUI? Users should have some control over that. If I specified a project for the dataset, I specifically want it there, in that project, not hidden away in some .datasets hidden sub-project. Not...