Parquet file in this instance (used to be CSV, but that was even larger as everything is stored as a string...)
Yes, that one shows up. I forgot to mention we also set the version explicitly, but that just creates a duplicate dataset under Datasets and anyway our main Task is now hidden from the original project.
So the project exists, but it is empty.
UPDATE: Apparently the quotation type matters for furl? I switched the ' to \" and it seems to work now
When is the next release expected?
The instance that took a while to terminate (or has taken a while to disappear from the idle workers)
I'll have a look at 1.1.6 then!
And that sounds great - environment variables should be supported everywhere in the config, or else the docs should probably mention where they are and are not supported
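For illustration, this is the kind of substitution I mean (a sketch of a clearml.conf fragment; the key names here are just examples, and whether the ${VAR} resolution actually happens depends on the key, which is exactly the issue):

```
# clearml.conf fragment (HOCON) -- sketch only, example keys
api {
    # HOCON-style environment variable substitution; support varies by key
    web_server: ${CLEARML_WEB_HOST}
    files_server: ${CLEARML_FILES_HOST}
}
```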
I'll be happy to test it out if there's any commit available?
The thing I don't understand is how come this DOES work on our linux setups 🤔
1.8.3; what about when calling task.close()? We suddenly have a need to set up our logging after every task.close() call
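For context, "set up our logging" means roughly the following (a minimal sketch using the standard logging module; the handler choice and format string are placeholders, not our real configuration):

```python
import logging
import sys

def reset_logging(level=logging.INFO):
    """Re-attach a stream handler to the root logger, e.g. after task.close()."""
    root = logging.getLogger()
    # Drop whatever handlers are left over from before the close
    for handler in list(root.handlers):
        root.removeHandler(handler)
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    root.addHandler(handler)
    root.setLevel(level)
```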
This is related to my other thread, so I'll provide an example there -->
I just used this to create the `dual_gpu` queue:
```
clearml-agent daemon --queue dual_gpu --create-queue --gpus 0,1 --detached
```
I will! (once our infra guy comes back from holiday and updates the install; for some reason they set up server 1.1.1???)
Meanwhile wondering where I got a random worker from
AFAIU, something like this happens (oversimplified):
```python
from clearml import Task  # <--- Crash already happens here

import argparse
import dotenv

if __name__ == "__main__":
    # set up argparse with an optional flag for a dotenv file
    parser = argparse.ArgumentParser()
    parser.add_argument("--env-file", default=".env")
    args = parser.parse_args()
    dotenv.load_dotenv(args.env_file)
    # more stuff
```
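For completeness, the workaround we'd want is to load the env file before the clearml import ever runs. A sketch (using a minimal stdlib stand-in for dotenv.load_dotenv, since only the import ordering matters here; the .env path is hypothetical):

```python
import os
from pathlib import Path

def load_env_file(path):
    # Minimal stand-in for dotenv.load_dotenv: KEY=VALUE lines, '#' comments
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

if __name__ == "__main__":
    if Path(".env").exists():  # hypothetical default path
        load_env_file(".env")
    # from clearml import Task  # deferred: CLEARML_* vars are now set
```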
I cannot, the instance is long gone... But it's not different to any other scaled instances, it seems it just took a while to register in ClearML
Follow-up question/feature request (out of interest) - could the WebUI show the matching commit message?
Honestly, this is all related to issue #340. The only reason we have this to begin with is because we need one separate "initializer" task that downloads the remote cache and prepares the agent environment for execution (downloading the configuration files, etc).
Otherwise it fits perfectly with pipelines, but we're not there yet.
In the local execution we don't have this initializer task, so we use Task.init() before starting to work on a model, and task.close() when we're done....
FWIW, we prefer to set it in the agent's configuration file, then it's all automatic
CostlyOstrich36 I'm not sure what is holding it from spinning down. Unfortunately I was not around when this happened. Maybe it was AWS taking a while to terminate, or maybe it was just taking a while to register in the autoscaler.
The logs looked like this:
1. Recognizing an idle worker and spinning down:
```
2022-09-19 12:27:33,197 - clearml.auto_scaler - INFO - Spin down instance cloud id 'i-058730639c72f91e1'
```
2. Recognizing a new task is available, but the worker is still idle:
```
2022-09...
```
Is it currently broken? 🤔
AgitatedDove14 the issue was that we'd like the remote task to be able to spawn new tasks, which it cannot do if I use Task.init before override_current_task_id(None).
When would this callback be called? I'm not sure I understand the usecase.
I've also followed https://clearml.slack.com/archives/CTK20V944/p1628333126247800 but it did not help
... and any way to define the VPC is missing too 🤔
Ah, you meant “free python code” in that sense. Sure, I see that. The repo arguments also exist for functions though.
Sorry for hijacking your thread @<1523704157695905792:profile|VivaciousBadger56>
Setting the endpoint will not be the only thing missing though, so unfortunately that's insufficient
There's code that strips the type hints from the component function, just think it should be applied to the helper functions too :)
Feels like we've been over this. Have there been new developments perhaps?
It's essentially that this - https://clear.ml/docs/latest/docs/guides/advanced/multiple_tasks_single_process cannot work in a remote execution.
No, that does not seem to work; I get:
```
task.execute_remotely(queue_name="default")
2024-01-24 11:28:23,894 - clearml - WARNING - Calling task.execute_remotely is only supported on main Task (created with Task.init)
Defaulting to self.enqueue(queue_name=default)
```
Any follow-up thoughts, @<1523701070390366208:profile|CostlyOstrich36> , or maybe @<1523701087100473344:profile|SuccessfulKoala55> ? 🤔
Of course, I'm using report_table in the above; it seems the support for Pandas DataFrames does not include MultiIndex other than by concatenating the indices together.
That's fine (as in, it works), but it looks a bit weird and defeats the purpose of a MultiIndex 🤔 Was wondering if there are plans to add better support for it
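To illustrate the concatenation workaround I mean (a sketch; the DataFrame contents and level names are made up for the example):

```python
import pandas as pd

# Toy DataFrame with a two-level MultiIndex
df = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=["group", "run"]),
)

# Workaround: flatten the MultiIndex by joining the level values into
# strings, since the reported table only renders a flat index
flat = df.copy()
flat.index = ["_".join(map(str, tup)) for tup in df.index]
print(flat.index.tolist())  # ['a_1', 'a_2', 'b_1', 'b_2']
```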
I'd like to set up both with and without GPUs. I can use any region, preferably some EU one.
So the pipeline runs successfully, I can find all the different tasks, but I cannot see them in the Pipelines tab…