We're wondering how many on-premise machines we'd like to deprecate. For that, we want to see how often our "on premise" queue is used (how often a task is submitted and run), for how long, how many resources it consumes (on average), etc.
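To make the question concrete, here is a minimal sketch of the kind of summary we have in mind. It assumes the run history has already been exported into plain records; `TaskRecord`, `summarize_queue`, and the sample data are hypothetical and are not part of the ClearML SDK.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class TaskRecord:
    """Hypothetical record of one task run; not a ClearML type."""
    queue: str
    started: datetime
    completed: datetime


def summarize_queue(records, queue):
    """Return the run count, total runtime and mean runtime for one queue."""
    durations = [
        (r.completed - r.started).total_seconds()
        for r in records
        if r.queue == queue
    ]
    if not durations:
        return {"runs": 0, "total_hours": 0.0, "mean_minutes": 0.0}
    return {
        "runs": len(durations),
        "total_hours": sum(durations) / 3600,
        "mean_minutes": mean(durations) / 60,
    }


# Made-up sample data, purely for illustration.
records = [
    TaskRecord("on_prem", datetime(2023, 1, 1, 9, 0), datetime(2023, 1, 1, 9, 30)),
    TaskRecord("on_prem", datetime(2023, 1, 2, 9, 0), datetime(2023, 1, 2, 10, 30)),
    TaskRecord("default", datetime(2023, 1, 3, 9, 0), datetime(2023, 1, 3, 9, 10)),
]
summary = summarize_queue(records, "on_prem")
```

Resource consumption would need an extra field per record; it is left out here because the post does not say which metric is meant.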
Consider, e.g.:
# steps.py
class DataFetchingStep:
    def __init__(self, source, query, locations, timestamps):
        ...  # body elided in the original post

    def run(self, queue=None, **kwargs):
        ...  # body elided in the original post


class DataTransformationStep:
    def __init__(self, inputs, transformations):
        # inputs can include instances of DataFetchingStep, or local files, for example
        ...  # body elided in the original post

    def run(self, queue=None, **kwargs):
        ...  # body elided in the original post
And then the following SDK usage in a notebook:
from steps imp...
An internal project I've accidentally made with a hidden tag while playing around with the ClearML internal code.
There used to be a good example, but it's now missing. I'm not sure what "Use only for automation (externally), otherwise use Task.connect_configuration" means when looking at, e.g., Task.set_configuration_object.
Could you clarify a bit, CostlyOstrich36 or AgitatedDove14?
I think ClearML boots up only afterwards, so those environment variables may not be available yet.
You should set them manually in the bootstrap code, unfortunately.
Managed now 🙂 Thank you for your patience!
I edited the previous post with some suggestions/thoughts
Hm, just a small update - I just verified and it does indeed work on Linux:
` import clearml
import dotenv

if __name__ == "__main__":
    dotenv.load_dotenv()
    config = clearml.backend_api.Config.load()  # Success, parsed with environment variables `
The network is configured correctly 🙂 But the newly spun up instances need to be set to the same VPC/Subnet somehow
There's not much (or anything) in the log to provide...
` (.venv) 15:42 [0:user@server$~] CLEARML_CONFIG_FILE=~/agent_clearml.conf clearml-agent daemon --queue default on_prem --detached --order-fairness
Environment variables set from configuration: ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'AWS_DEFAULT_REGION']
...
Then that did not work, but I'll look into it again soon!
That's what I thought too, it should only look for the CLEARML_TASK_ID environment variable?
Anything else you’d recommend paying attention to when setting up the clearml-agent Helm chart?
I'm working on the config object references 😉
Hi AgitatedDove14 !
Ah, thanks! I'll use the artifacts for linking.
We've forgone the "use current task" already because it indeed made things even more difficult (the task that was used is then automatically hidden by this automatic renaming of dataset tasks).
The current implementation (since 1.6.3 I think) creates the issues in the linked comment (with images to visualize).
Any updates, SuccessfulKoala55? 🫣
Without knowing anything, I'm assuming maybe ClearML patches plt.title and not Axes.set_title ?
Either, honestly, would be great. I meant even just a link to a blank comparison and one can then add the experiments from that view
Sure! It's a bit intricate as it accommodates many of our different plotting functionalities, but this consists of the important bits (I realize we have some bad naming here, but fig[0] is actually a Figure object, and fig[1] is an Axes object):
` plt.switch_backend('agg')
sns.set_theme(...)
fig = plt.subplots(...)
sns.histplot(data, ax=fig[1], ...)
fig[1].set_xlim(...)
fig[1].set_ylim(...)
fig[1].legend(loc='best')
fig[1].set_xlabel(xlabel)
fig[1].set_ylabel(ylabel)
fig[1].set_...
Say I upload each of these yamls as a configuration object (as with the above). Once I try to load bar.yaml remotely it will crash, since foo.yaml is missing (and is instead a clearml configuration object).
Does that make sense?
I'll try it out, but I would not like to rewrite that code myself and maintain it, that's my point 😅
Or are you suggesting I use Task.import_offline_session?
FWIW running clearml==1.9.1 with WebApp: 1.9.2-317 • Server: 1.9.2-317 • API: 2.23
Ah right, I missed that in the codebase. It just adds the .dataset convention to the dataset task.
AgitatedDove14 another option I thought would be nice is to actually self-sign the internal MinIO bucket, but then I get [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1076). Are you aware of any other way then (other than the secure: false flag)?
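For context, the error comes from Python's default certificate verification, which only trusts certificates chaining to a known CA. In plain Python, the alternative to switching verification off is to trust the self-signed certificate explicitly. A minimal standard-library sketch, assuming the certificate has been exported to a PEM file (`make_tls_context` and its `ca_file` argument are hypothetical names; whether ClearML or its storage driver can be pointed at such a file is exactly what I'm asking):

```python
import ssl


def make_tls_context(ca_file=None):
    """Build a verifying TLS context, optionally trusting an extra CA file."""
    # create_default_context() enables certificate and hostname checks.
    context = ssl.create_default_context()
    if ca_file is not None:
        # Trust the given PEM file in addition to the system CA store.
        context.load_verify_locations(cafile=ca_file)
    return context


# Without an extra CA file the context still verifies peers as usual.
context = make_tls_context()
```

Verification stays on in both cases, unlike with the secure: false flag.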
I'm not sure why internally ClearML tries to initialize a task when get_task is called...
I wouldn't mind going the requests route if I could find the API endpoint from the SDK?