We're using the example autoscaler, nothing modified
Also, creating from functions allows dynamic pipeline creation without requiring the tasks to pre-exist in ClearML, which is IMO the strongest point to make about it
I'm not sure about the intended use of `connect_configuration` now.
I was under the assumption that in `connect_configuration(configuration, name=None, description=None)`, the `configuration` is only used in local execution.
But when I run `config = task.connect_configuration({}, name='General')` (in remote execution), the configuration is set to the empty dictionary.
There used to be a good example, but it's now missing. I'm not sure what "Use only for automation (externally), otherwise use Task.connect_configuration" means when e.g. looking at `Task.set_configuration_object`, etc.
Could you clarify a bit, CostlyOstrich36 or AgitatedDove14?
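To make the question concrete, here's roughly what I'm running (project/task names are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="config-test")

# Locally: the dict below is stored on the task and returned as-is.
# Remotely (under an agent): I'd expect the configuration stored on the
# task to be returned and the local default to be ignored - instead,
# the empty dict seems to win.
config = task.connect_configuration({}, name="General")
print(config)
```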
First bullet point - yes, exactly
Second bullet point - all of it, really. The SDK documentation and the examples.
For example, the `Task` object is heavily overloaded, and its documentation would benefit from being separated into logical units of work. That would also make it easier for the ClearML team to spot any formatting issues.
Any example linked to GitHub is welcome, but some visualization/inline code with an explanation is also very much welcome.
Basically when running remotely, the first argument to any configuration (whether object or string, or whatever) is ignored, right?
That's exactly what I meant AgitatedDove14 🙂 It's just that to access that comparison page, you have to make a comparison first. It would be handy to have a link (in the sidebar?) to an empty comparison.
Ah, you meant "free python code" in that sense. Sure, I see that. The repo arguments also exist for functions though.
Sorry for hijacking your thread @<1523704157695905792:profile|VivaciousBadger56>
Generally, really. I've struggled recently (and in the past), because the documentation seems:
- Very complete wrt the available SDK (though the formatting is sometimes off)
- Very lacking wrt how things interact with one another

A lot of what I need I actually find by plugging into the source code.
I think ClearML would benefit a lot if it adopted a documentation structure similar to the numpy ecosystem's (numpy, pandas, scipy, scikit-image, scikit-bio, scikit-learn, etc.)
I see, okay that already clarifies some stuff, I'll dig a bit more into this then! Thanks!
Sorry, found it on my end!
@<1523701205467926528:profile|AgitatedDove14> this
So basically what I'm looking for and what I have now is something like the following:
(Local) I have a well-defined `aws_autoscaler.yaml` that is used to run the AWS autoscaler. That same autoscaler is also run with `CLEARML_CONFIG_FILE=...`.
(Remotely) The autoscaler launches, listens to the predefined queue, and is able to launch instances as needed. I would run a remote execution task object that's appended to the autoscaler queue. The autoscaler picks it up, launches a new instance...
I think you're looking for the `execute_remotely` function?
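Something like this, roughly (the queue name is just an example):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="remote-job")

# Everything up to this call runs locally. execute_remotely() stops the
# local process and enqueues the task, so the autoscaler listening on
# that queue can pick it up and spin up an instance for it.
task.execute_remotely(queue_name="aws", exit_process=True)

# From here on, the code only runs on the remote machine.
print("running remotely")
```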
Hey SuccessfulKoala55! Is the configuration file needed for `Task.running_locally()`? This is tightly related to issue #395, where we need additional files for remote execution but have no way to attach them to the task other than using the `StorageManager` as a temporary cache.
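This is the workaround I mean, roughly (the bucket path is made up):

```python
from clearml import Task, StorageManager

task = Task.init(project_name="examples", task_name="needs-extra-files")

if not Task.running_locally():
    # There's no way to attach the extra file to the task itself, so we
    # upload it to storage beforehand and pull a cached local copy here.
    env_file = StorageManager.get_local_copy(
        remote_url="s3://some-bucket/configs/extra.conf"  # made-up path
    )
    print(f"fetched {env_file}")
```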
We're wondering how many on-premise machines we could decommission. For that, we want to see how often our "on premise" queue is used (how often a task is submitted and run), for how long, how many resources it consumes (on average), etc.
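To give an idea, this is the kind of accounting I'm after (the project name is made up, and the `task_filter` usage and `data.started`/`data.completed` fields are my assumptions about the query API):

```python
from clearml import Task

# Sum up runtimes of completed tasks from the project feeding the
# on-premise queue (field names are assumptions, see above).
tasks = Task.get_tasks(
    project_name="on-premise-experiments",
    task_filter={"status": ["completed"]},
)
total = sum(
    (t.data.completed - t.data.started).total_seconds()
    for t in tasks
    if t.data.started and t.data.completed
)
print(f"{len(tasks)} runs, {total / 3600:.1f} compute-hours")
```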
I don't think there's an issue/PR for that yet, at least I haven't created one.
I could have a look at this and maybe make a PR.
Not sure what the recommended flow would be like, though 🤔
I should maybe mention that the security requirements here are low, since this is all behind a private VPN anyway; I'm mostly just interested in having the credentials for backtracking purposes.
Thanks CostlyOstrich36 !
And can I make sure the same budget applies to two different queues?
So that, for example, an autoscaler would have a resource budget of 6 instances, and it would listen to both `aws` and `default` as needed?
I mean, I know I could `connect_configuration({k: os.environ.get(k) for k in [...]})`, but then those environment variables would be exposed in the ClearML UI, which is not ideal (the environment variables in question hold usernames and passwords required for DB access).
Thanks for the reply CostlyOstrich36 !
Does the task read/use the `cache_dir` directly? It's fine for it to be a cache and then removed from the fileserver; if users want the data to stay, they will use the ClearML Dataset 🙂
The S3 solution is bad for us since we have to create a folder for each task (before the task is created), and hope it doesn't get overwritten by the time it executes.
Argument augmentation - say I run my code with `python train.py my_config.yaml -e admin.env` ...
The S3 bucket credentials are defined on the agent, as the bucket is also running locally on the same machine - but I would love for the code to download and apply the file automatically!
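For now I'd have to do it by hand, something like the sketch below (using python-dotenv is just my choice here, and the bucket path is made up):

```python
from clearml import StorageManager
from dotenv import load_dotenv  # pip install python-dotenv

# The agent already holds the S3 credentials, so the download resolves
# through its storage configuration; we just pull a cached local copy
# of the env file and load it into the process environment.
local_env = StorageManager.get_local_copy(
    remote_url="s3://local-bucket/secrets/admin.env"  # made-up path
)
load_dotenv(local_env)
```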
QuaintPelican38 did you have a workaround for this then? Some cleanup service or similar?
Looks great, looking forward to all the new treats 🙂
Happy new year! 🙂
Would be good if that's mentioned explicitly in the docs 🙂 Thanks!
Parquet file in this instance (used to be CSV, but that was even larger as everything is stored as a string...)
Maybe this is part of the paid version, but it would be cool if each user (in the web UI) could define their own secrets, and a task could then be assigned to some user and use those secrets during boot?
One must then ask, of course, what to do if e.g. a text refers to a dictionary configuration object? 🤔