Using an on-prem ClearML server, latest published version
I mean, if I search for "model", will it automatically search for tasks containing "model" in their name?
I'll have a look; at least it seems to only use `from clearml import Task`, so unless MLflow changed their SDK, it might still work!
Sounds like a nice idea 😁
Follow-up: any ideas how to avoid PEP 517 builds with the autoscaler? 🤔 Building the wheels takes a long time.
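One thing worth trying (a sketch, not a verified fix): pip reads most of its flags from environment variables, so passing the no-PEP-517 switch into the agent's containers via `agent.extra_docker_arguments` might skip the wheel builds. The `PIP_NO_USE_PEP517` env-var mapping, and whether your dependencies tolerate it, are assumptions here:
```
# clearml.conf on the autoscaler workers -- sketch only
agent {
    # PIP_NO_USE_PEP517 is pip's env-var form of --no-use-pep517 (assumed mapping);
    # PIP_PREFER_BINARY favors pre-built wheels over building from sdists
    extra_docker_arguments: ["-e", "PIP_NO_USE_PEP517=1", "-e", "PIP_PREFER_BINARY=1"]
}
```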
Hah. Now it worked.
It's self-hosted TimelyPenguin76
SuccessfulKoala55 WebApp: 1.4.0-175 • Server: 1.4.0-175 • API: 2.18
DeterminedCrab71 not in this scenario, but I do have it occasionally, see my earlier thread asking how to increase session timeout time
If relevant, I'm using Chrome Version 101.0.4951.41 (Official Build) (64-bit)
Happens pretty much consistently across all our projects:
1. Have a project with over 15 tasks (i.e. one that needs the Load More button)
2. Click Load More, select a task that's not in the first 15
3. Let the page "rest" for a while (a couple of hours)
4. Flip back to the page - the task is still active, but you cannot see it in the task list and there is no more Load More button
Each user creates a .env file for their needs or exports them in the shell running the Python code. Currently I copy the environment variables to an S3 bucket and download them from there.
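For context, a minimal sketch of that flow (the bucket path and file name are illustrative, and it assumes python-dotenv is installed):
```python
from clearml import StorageManager
from dotenv import load_dotenv

# download the shared .env from S3 (StorageManager caches it locally)
local_env = StorageManager.get_local_copy(remote_url="s3://my-bucket/config/.env")
# export the variables into os.environ for this process
load_dotenv(local_env)
```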
Is there some default Docker image you ship with ClearML that you'd recommend, or can/should we use our own? 🙂
That is, we have something like:
```python
task = Task.init(...)
ds = Dataset.create(dataset_name=task.name, dataset_project=task.get_project_name(), use_current_task=True)
# upload files
ds.upload(show_progress=True)
ds.finalize()
# do stuff with task and dataset
task.close()
```
But because the dataset is linked to the task, the task is then moved and effectively becomes invisible 😕
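A workaround we might try (a sketch; assumes the SDK's `task.move_to_project()` applies here and that moving the task back doesn't detach the dataset):
```python
# capture the project before Dataset.create(...) relocates the task
original_project = task.get_project_name()
# ... create / upload / finalize the dataset as above ...
# then move the task back so it stays visible in the tasks view
task.move_to_project(new_project_name=original_project)
```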
Any thoughts AgitatedDove14 SuccessfulKoala55 ?
I also tried setting agent.python_binary: "/usr/bin/python3.8" but it still uses Python 2.7?
I'll see if we can do that still (as the queue name suggests, this was a POC, so I'm trying to fix things before they give up 😛).
Any other thoughts? The original thread https://clearml.slack.com/archives/CTK20V944/p1641490355015400 suggests this PR solved the issue
Btw TimelyPenguin76 this should also be a good starting point:
First create the target directory and add some files:
```
sudo mkdir /data/clearml
sudo chmod -R 777 /data/clearml
touch /data/clearml/foo
touch /data/clearml/bar
touch /data/clearml/baz
```
Then list the files using the StorageManager. It shouldn't take more than a few milliseconds.
```
from clearml import StorageManager

%%timeit
StorageManager.list("/data/clearml")
-> 21.2 s ± 328 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
At least as far as I can tell, nothing else has changed on our systems. Previous pip versions would warn about this, but not crash.
SuccessfulKoala55 The changelog wrongly cites https://github.com/allegroai/clearml/issues/400 btw. It is not implemented and is not related to being able to save CSVs 😅
Hey @<1523701070390366208:profile|CostlyOstrich36> , thanks for the reply!
I’m familiar with the above repo, we have the ClearML Server and such deployed on K8s.
What’s lacking is documentation regarding the clearml-agent helm chart. What exactly does it offer, etc.
We’re interested in e.g. using karpenter to scale our deployments per demand, effectively replacing the AWS autoscaler.
We have the following, works fine (we also use internal zip packaging for our models):
```python
model = OutputModel(
    task=self.task,
    name=self.job_name,
    tags=kwargs.get('tags', self.task.get_tags()),
    framework=framework,
)
model.connect(task=self.task, name=self.job_name)
model.update_weights(weights_filename=cc_model.save())
```
It also happens when use_current_task=False though. So the current best approach would be to not combine the task and the dataset?
I think this might be about the credential.helper being used.
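A quick way to check (standard git commands; the helper shown is just an example):
```
# show which credential helper git is configured to use
git config --get credential.helper
# e.g. switch to the cache helper if the current one is the problem
git config --global credential.helper cache
```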
Since this is a single process, most of these are only needed once when our "initializer" task starts and loads.
Yeah I figured (2) would be the way to go actually 😄
Say I have Task A that works with some dataset (which is not hard-coded, but perhaps e.g. self-defined by the task itself).
I'd now like to clone Task A and modify some stuff, but still use the same dataset (no need to recreate it, but since it's not hard-coded, I have to maintain a reference somewhere to the dataset ID).
Since the Dataset SDK offers `use_current_task`, I would have also expected there to be something like `dataset.link(task)` or `task.register_dataset(ds)` ...
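In the meantime, a workaround sketch (the `General/dataset_id` parameter name is our own convention, not a ClearML one): store the dataset ID as a task parameter, so clones carry the reference and can override it before being enqueued.
```python
from clearml import Task, Dataset

# Task A: record which dataset was used, as an editable task parameter
task = Task.init(project_name="example", task_name="task-a")  # names illustrative
ds = Dataset.get(dataset_project="example", dataset_name="my-dataset")
task.set_parameter("General/dataset_id", ds.id)

# In a clone of Task A: resolve the dataset from the (possibly edited) parameter
ds = Dataset.get(dataset_id=task.get_parameter("General/dataset_id"))
local_path = ds.get_local_copy()
```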
The odd thing is that it was already defined, and then when I clicked an S3 link, it asked me to fill it in again, adding a duplicate credentials row
After setting the sdk.development.default_output_uri in the configs, my code kinda looks like:
```python
task = Task.init(project_name=..., task_name=..., tags=...)
logger = task.get_logger()
# report with logger freely
```
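For reference, the config side of this looks something like (destination URI is illustrative):
```
# clearml.conf
sdk {
    development {
        # default upload destination for task artifacts and models
        default_output_uri: "s3://my-bucket/clearml"
    }
}
```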
Basically you have the details from the Dataset page, why should it be mixed with the others?
Because maybe it contains code and logs on how to prepare the dataset. Or maybe the user just wants increased visibility for the dataset itself in the tasks view.
why would you need the Dataset Task itself is the main question?
For the same reason as above. Visibility and ease of access. Coupling relevant tasks and dataset in the same project makes it easier to understand that they're...
AgitatedDove14
I'll make a PR for it now; the long story is that you have the full log, but the virtualenv version is not logged anywhere (the usual output from virtualenv just says which Python version is used, etc.).
We can change the project names, of course, if there’s a suggestion/guide that will make them see past the namespace…