From the traceback (backend_interface/task/task.py, line 178, in __init__), notice it's not Task.init
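For context, a minimal sketch of the usual entry point (standard ClearML usage; the project/task names are placeholders). The frame in the traceback is the Task class constructor inside backend_interface, not a user-level call like this:
from clearml import Task

# Task.init is the public API that creates/attaches the task object
task = Task.init(project_name="examples", task_name="debug-run")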
I'm trying to build an easy SDK that would fit DS work and fit the concept of clearml pipelines.
In doing so, I'm planning to define various Step
classes, that the user can then experiment with, providing Steps as input to other steps, etc.
Then I'd like for the user to be able to run any such step, either locally or remotely. Locally is trivial. Remotely is the issue. I understand I'll need to upload additional data to the remote instance, and pull a specific artifact back to the notebo...
I guess in theory I could write a run_step.py, similarly to how the pipeline in ClearML works… 🤔 And then use Task.create() etc?
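Something like this rough sketch, assuming a hypothetical run_step.py entry script (project, queue, and package names below are placeholders):
from clearml import Task

# create a draft task that points at the entry script
task = Task.create(
    project_name="my-project",
    task_name="run-step",
    script="run_step.py",
    packages=["clearml"],
)
# hand it to an agent queue for remote execution
Task.enqueue(task, queue_name="default")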
TimelyPenguin76 here's the full log (took a moment to anonymize completely):
Using environment access key CLEARML_API_ACCESS_KEY=xxx
Using environment secret key CLEARML_API_SECRET_KEY=********
Current configuration (clearml_agent v1.3.0, location: /tmp/.clearml_agent.zs4e7egs.cfg):
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.m...
PricklyRaven28 That would be my fallback, but it would make development much slower (having to build containers with every small change)
Since this is a single process, most of these are only needed once when our "initializer" task starts and loads.
Yeah that works too. So one can override the queue ID but not the worker 🤔
Not that I recall
Odd; switching to a virtual environment results in fatal: could not read Username for '': terminal prompts disabled, even though it does show earlier that agent.git_user = xxx
That's enabled; I was asking whether there are flags to add to the pip install CLI, such as --no-use-pep517
Thanks! That's what I thought, but then I get:
2021-12-21 22:08:35,376 - clearml.storage - ERROR - Failed uploading: Parameter validation failed: Invalid bucket name "": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
In the Profile section, yes, they are well defined (bucket, secret, key, and endpoint)
Interesting, why won’t it be possible? Quite easy to get the source code using e.g. dill.
In any case @<1537605940121964544:profile|EnthusiasticShrimp49> this seems like a good approach, but it’s not quite there yet. For example, even if I’d provide a simple def run_step(…) function, I’d still need to pass the instance to the function. Passing it along in the kwargs for create_function_task does not seem to work, so now I need to also upload the inputs, etc. I’m bringing this up because the pipelines already do this for you.
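For reference, a minimal sketch of the create_function_task pattern being discussed (project/task names and the step_config argument are placeholders; assumes Task.init was already called in the notebook):
from clearml import Task

def run_step(step_config):
    # placeholder body; a real version would rebuild the Step instance and run it
    print(step_config)

parent = Task.init(project_name="my-project", task_name="notebook-driver")
# kwargs become task parameters, so they must be simple serializable values;
# passing a live Step instance this way is exactly what did not work above
child = parent.create_function_task(run_step, task_name="run-step", step_config="fetch")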
I also tried adding agent.package_manager.system_site_packages = true to ensure these virtual environments have access btw, still to no avail
Consider e.g.:
# steps.py
class DataFetchingStep:
    def __init__(self, source, query, locations, timestamps):
        ...  # store the fetch configuration on the instance

    def run(self, queue=None, **kwargs):
        ...  # run locally, or submit to the given queue for remote execution


class DataTransformationStep:
    def __init__(self, inputs, transformations):
        # inputs can include instances of DataFetchingStep, or local files, for example
        ...

    def run(self, queue=None, **kwargs):
        ...  # run locally, or submit to the given queue for remote execution
And then the following SDK usage in a notebook:
from steps imp...
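Since the snippet above is cut off, purely as an illustration of the intended pattern (all names and arguments below are hypothetical):
from steps import DataFetchingStep, DataTransformationStep

# compose steps, feeding one step into another
fetch = DataFetchingStep(source="s3://example/raw", query="...", locations=["eu"], timestamps=None)
transform = DataTransformationStep(inputs=[fetch], transformations=["normalize"])

# run locally (queue=None) or submit to an agent queue
transform.run(queue="default")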
@<1537605940121964544:profile|EnthusiasticShrimp49> It’ll still take me some time to find the MVC that generated this, but I do have the ClearML experiment page for it. I was running the thing from ipython, and was trying to create a task from a function:
Maybe this is part of the paid version, but would be cool if each user (in the web UI) could define their own secrets, and a task could then be assigned to some user and use those secrets during boot?
Thanks, that's what I thought - so I'm missing something else in the installation. I'll dig further 🙂
Or do you mean the contents of the configuration, probably :face_palm: ... one moment
I'm guessing that's not on pypi yet?
That's fine for the current use-case I believe.
Once the team is happy with the logging functionality, we'll move on to remote execution and things will update.
That's what I found as well, but it did not like it after all (boto is fine with it, but the underlying urllib and requests were not?)
It's fine -- I see the added benefit in making sure the users set up their clearml.conf, and I've made a script to edit it to our needs as part of the installation process 🙂 Thanks Martin!
Always great to find a bug! I'll make relevant SDK updates then.
That would be nice :)
Sorry to keep this up - what about support for minio using the environment variable? Do I set CLEARML_FILES_HOST to the endpoint instead of an s3 bucket?
If I add the bucket to that (so CLEARML_FILES_HOST=s3://minio_ip:9000/minio/bucket), I then get the following error instead:
2021-12-21 22:14:55,518 - clearml.storage - ERROR - Failed uploading: SSL validation failed for ... [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)
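For reference, a minimal sketch of the usual minio setup (endpoint and bucket here are placeholders; the SSL error above usually means an https client is hitting a plain-http minio port, so the matching sdk.aws.s3.credentials entry in clearml.conf also needs the endpoint configured with secure set accordingly):
from clearml import Task

task = Task.init(
    project_name="my-project",
    task_name="minio-output",
    # host:port in the URI points ClearML's S3 driver at the minio endpoint
    output_uri="s3://minio_ip:9000/bucket",
)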
Ah! Makes sense. Thanks!