Honestly I wouldn't mind building the image myself, but the glue-k8s setup is missing some documentation so I'm not sure how to proceed
I know ClearML Enterprise offers a vault.
If these are static-ish, you can set them directly in the agent's config file.
If not, what we did was that before executing remotely, we uploaded environment variables of interest as parameters, and then loaded them in the remote task.
These can then be overwritten with *** after loading them.
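Roughly like this (just a sketch of what we do, not official API guidance — the variable names are made up, and the Environment/ parameter prefix assumes connect() names the section after its name argument, so double-check against your SDK version):

import os
from clearml import Task

task = Task.init(project_name="example", task_name="env-passthrough")

# Illustrative: whichever variables the remote task needs
env_keys = ["MY_API_URL", "MY_API_TOKEN"]
params = {k: os.environ.get(k, "") for k in env_keys}

# When executed remotely, connect() returns the values stored on the task
params = task.connect(params, name="Environment")
for k, v in params.items():
    os.environ[k] = v

# Optionally mask the stored values afterwards, as mentioned above
task.set_parameters({f"Environment/{k}": "***" for k in env_keys})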
For the former (static-ish environment variables), just add:
environment {
    VAR1: value1
    VAR2: value2
}
to the agent's clearml.conf
No it doesn't, the agent has its own clearml.conf file.
I'm not too familiar with clearml on docker, but I do remember there are config options to pass some environment variables to docker.
You can then set your environment variables in any way you'd like before the container starts
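For example (hedged — check the reference clearml.conf for your agent version), extra_docker_arguments should let you pass -e flags straight to docker run:

agent {
    # passed verbatim to docker run; values shown are placeholders
    extra_docker_arguments: ["-e", "VAR1=value1", "-e", "VAR2=value2"]
}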
Seems like you're missing an image definition (AMI or otherwise)
@<1539780258050347008:profile|CheerfulKoala77> you may also need to define subnet or security groups.
Personally I do not see the point in Docker over EC2 instances for CPU instances (virtualization on top of virtualization).
Finally, just to make sure, you only ever need one autoscaler. You can monitor multiple queues with multiple instance types with one autoscaler.
Right so this is checksum based? Are there plans to only store delta changes for files (i.e. store the changed byte instead of the entire file)?
AFAICS it's quite a trivial implementation at the moment, and delta storage would otherwise require parsing the text file to find some references, right?
https://github.com/allegroai/clearml/blob/18c7dc70cefdd4ad739be3799bb3d284883f28b2/clearml/task.py#L1592
At least as far as I can tell, nothing else has changed on our systems. Previous pip versions would warn about this, but not crash.
We load the endpoint (and S3 credentials) from a .env file, so they're not immediately available at the time of from clearml import Task.
It's a convenience thing, rather than exporting many environment variables that are tied together.
That's what I thought too, it should only look for the CLEARML_TASK_ID environment variable?
Setting the endpoint will not be the only thing missing though, so unfortunately that's insufficient
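For context, our entrypoint looks roughly like this (using python-dotenv; the exact variables depend on your setup):

from dotenv import load_dotenv

# Populates e.g. CLEARML_API_HOST and the S3 credentials before ClearML reads them
load_dotenv()

from clearml import Task

task = Task.init(project_name="example", task_name="needs-env")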
I'm not sure about the intended use of connect_configuration now.
I was under the assumption that in connect_configuration(configuration, name=None, description=None), the configuration is only used in local execution.
But when I run config = task.connect_configuration({}, name='General') (in remote execution), the configuration is set to the empty dictionary
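A minimal repro of what I mean:

from clearml import Task

task = Task.init(project_name="example", task_name="config-repro")

# Locally this stores {} on the task; remotely I expected the configuration
# stored on the task to come back, but config is just the empty dict
config = task.connect_configuration({}, name='General')
print(config)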
Yeah, and just thinking out loud about what I like in the numpy/pandas documentation
Also full disclosure - I'm not part of the ClearML team and have only recently started using pipelines myself, so all of the above is just learnings from my own trials
We just do task.close() and then start a new Task.init() manually, so our "pipelines" are self-controlled
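Roughly this pattern (a sketch — project and step names are made up):

from clearml import Task

for step in ["preprocess", "train", "evaluate"]:
    task = Task.init(project_name="my-project", task_name=step)
    # ... run the step's work ...
    task.close()  # close before the next Task.init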
Why not give ClearML read-only access credentials to the repository?
Ah, you meant "free python code" in that sense. Sure, I see that. The repo arguments also exist for functions though.
Sorry for hijacking your thread @<1523704157695905792:profile|VivaciousBadger56>
Internally yes, but in Task.init the default argument is a boolean, not an int.
We don't want to close the task, but we have a remote task that spawns more tasks. With this change, subsequent calls to Task.init fail because execution goes into the deferred-init clause and fails on validate_defaults.
Kinda, yes, and this has changed with 1.8.1.
The thing is that afaik currently ClearML does not officially support a remotely executed task to spawn more tasks, so we also have a small hack that marks the remote "master process" as a local task prior to anything else.
SmugDolphin23 I think you can simply change not (type(deferred_init) == int and deferred_init == 0) to deferred_init is True?
Oh, well, no, but for us that would be one possible solution (we didn't need to close the task before that update)
So now we need to pass Task.init(deferred_init=0) because the default Task.init(deferred_init=False) is wrong
But it is strictly that if-condition in Task.init; see the issue I opened about it
1.8.3; what about when calling task.close()? We suddenly have a need to set up our logging after every task.close() call
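i.e. something like this after each close (the handler setup is just an example of re-installing our own config):

import logging
from clearml import Task

task = Task.init(project_name="example", task_name="step")
# ... work ...
task.close()

# Re-apply our logging setup, since it no longer survives task.close()
logging.basicConfig(level=logging.INFO, force=True)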
True, and we plan to migrate to pipelines once we have some time for it :) but anyway that condition is flawed I believe
Basically when running remotely, the first argument to any configuration (whether object or string, or whatever) is ignored, right?
First bullet point - yes, exactly
Second bullet point - all of it, really. The SDK documentation and the examples.
For example, the Task object is heavily overloaded and its documentation would benefit from being separated into logical units of work. It would also make it easier for the ClearML team to spot any formatting issues.
Any example linked to GitHub is welcome, but some visualization/inline code with an explanation is also very much appreciated.