I guess it depends on what you'd like to configure.
Since we let the user choose parents, component name, etc., we cannot use the decorators. We also infer the required packages at runtime (the autodetection based on import statements fails with a non-trivial namespace) and need to set them for all components, so the decorators do not work for us.
Ah, you meant “free Python code” in that sense. Sure, I see that. The repo arguments also exist for functions though.
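For illustration, a minimal sketch of that programmatic route, where parents, component names, and packages are plain runtime arguments rather than decorator attributes (the project/step names and the package list here are hypothetical):
```python
from clearml import PipelineController

def preprocess(n: int):
    # toy step body
    return n * 2

pipe = PipelineController(name="my-pipeline", project="examples", version="0.0.1")

# name, parents and packages are ordinary arguments here, so they can be
# computed at runtime instead of being fixed by a decorator
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs=dict(n=13),
    function_return=["result"],
    packages=["numpy", "pandas"],  # e.g. a package list inferred at runtime
)

pipe.start_locally(run_pipeline_steps_locally=True)
```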
Sorry for hijacking your thread @<1523704157695905792:profile|VivaciousBadger56>
That's fine as well - the code simply shows the name of the environment variable, not its value, since that's taken directly from the agent listening to the services queue (and which then runs the scaler)
I'd like to set up both with and without GPUs. I can use any region, preferably some EU one.
I guess I'll have to rerun the experiment without tags for this?
FWIW, we prefer to set it in the agent’s configuration file, then it’s all automatic
A different AMI image / installing older Python versions that don't enforce this...
For future reference though, the environment variable should be PIP_USE_PEP517=false
Sure! It looks like this
Happens pretty much consistently across all our projects -
1. Have a project with over 15 tasks (i.e. one that needs the Load More button)
2. Click Load More, then select a task that's not in the first 15
3. Let the page "rest" for a while (a couple of hours)
4. Flip back to the page - the task is still active, but you cannot see it in the task list, and there is no more Load More button
It's self-hosted, TimelyPenguin76
SuccessfulKoala55 WebApp: 1.4.0-175 • Server: 1.4.0-175 • API: 2.18
DeterminedCrab71 not in this scenario, but I do have it occasionally, see my earlier thread asking how to increase session timeout time
If relevant, I'm using Chrome Version 101.0.4951.41 (Official Build) (64-bit)
JitteryCoyote63 please do not get used to it :D There's an open ticket/feature request to either revert this or let the user/server choose the most comfortable way
JitteryCoyote63 yes exactly, sorry, I forgot to add the Task.get_task in my response. That's exactly what we do 😅
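For reference, a short sketch of the Task.get_task call in question (the ID and the project/task names are hypothetical):
```python
from clearml import Task

# Fetch an existing task either by ID...
task = Task.get_task(task_id="<your-task-id>")
# ...or by project/name lookup
task = Task.get_task(project_name="examples", task_name="my experiment")
```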
AgitatedDove14 yeah I see this now; this was an issue because I later had to "disconnect" the remote task, so it can, itself, create new tasks (using clearml.config.remote.override_current_task_id(None)). I guess you might remember that discussion? 😁
EDIT: It's the discussion we had here, for reference. https://clearml.slack.com/archives/CTK20V944/p1640955599257500?thread_ts=1640867211.238900&cid=CTK20V944
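For context, a small sketch of that "disconnect" trick based on the call above (the project/task names are hypothetical):
```python
from clearml import Task
from clearml.config.remote import override_current_task_id

# Detach from the task we are currently running under, so that the next
# Task.init creates a brand-new task instead of attaching to the current one
override_current_task_id(None)
new_task = Task.init(project_name="examples", task_name="child task")
```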
So probably not needed in JitteryCoyote63's case, but we still have some...
AgitatedDove14 I will try! I remember there were some issues with it, where I had to resort to this method first, but maybe things have changed since then :)
The Task.init is called at a later stage of the process, so I think this relates again to the whole setup process we've been discussing both here and in #340... I promise to try ;)
StaleButterfly40 what use case are you looking for? I've used environment variables in the config file, and then I can overwrite them in os.environ before ClearML loads the config
I thought this follows from our previous discussion, SuccessfulKoala55, where this is a built-in feature of pyhocon?
I think the environment variables path might work for you then?
You'd set your config with use_credentials_chain: ${CREDENTIALS_CHAIN}
Then in Python you could set os.environ['CREDENTIALS_CHAIN'] = 'true' or 'false' (environment values are strings, not booleans)
before you make any calls to ClearML?
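A minimal end-to-end sketch of that pattern (the use_credentials_chain key comes from the example above; the project/task names are hypothetical):
```python
import os

# Environment values are strings, so use "true"/"false" rather than bools.
# Set this before the first ClearML import/call, so that pyhocon substitutes
# ${CREDENTIALS_CHAIN} when the config file is parsed.
os.environ["CREDENTIALS_CHAIN"] = "true"

from clearml import Task

task = Task.init(project_name="examples", task_name="env-driven config")
```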
Ah. Apparently getting a task ID while it’s running can cause this behaviour 🤔
Thanks! I'll wait for the release note/docs update 😁
This is related to my other thread, so I’ll provide an example there -->
Uhhh, not really unfortunately ☹️. I have ~20 tasks happening in a single file, and it's quite random if/when this happens. I just noticed it tends to happen with the shorter tasks
I'm trying to decide if ClearML is a good use case for my team 🙂
Right now we're not looking for a complete overhaul into new tools, just some enhancements (specifically, model repository, data versioning).
We've been burnt by DVC and the like before, so I'm trying to minimize the pain for my team before we set out to explore ClearML.
i.e.
ERROR Fetching experiments failed. Reason: Backend timeout (600s)
ERROR Fetching experiments failed. Reason: Invalid project ID