Yes, I’ve found that too (as mentioned, I’m familiar with the repository). My issue is still that there is no documentation as to what this actually offers.
Is this simply a helm chart to run an agent on a single pod? Does it scale in any way? Basically - is it a simple agent (similar to on-premise agents, running in the background, but here on K8s), or is it a more advanced one that offers scaling features? What is it intended for, and how does it work?
The official documentation is very sparse...
Perfect, thanks for the answers Valeriano. These small details are missing from the documentation, but I now feel much more confident in setting this up.
Bump SuccessfulKoala55 ?
The documentation is messy, I’ve complained about it in the past too 🙈
IIRC, get_local_copy() downloads a local copy and returns the path to the downloaded file. So you might be interested in e.g. local_csv = pd.read_csv(a_task.artifacts['train_data'].get_local_copy())
With the models, you're looking for get_weights(). It acts the same as get_local_copy(), so it returns a path.
EDIT: I think get_local_copy() should also work for a model 👍
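A minimal sketch of that flow, assuming a task with a 'train_data' CSV artifact and a registered output model (the task ID here is a placeholder):
import pandas as pd
from clearml import Task

# fetch an existing task (placeholder ID)
a_task = Task.get_task(task_id="<task_id>")

# artifacts: get_local_copy() downloads the file and returns its local path
local_csv = pd.read_csv(a_task.artifacts["train_data"].get_local_copy())

# models: get_weights() likewise downloads the weights file and returns a path
a_model = a_task.models["output"][0]
weights_path = a_model.get_weights()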
Follow-up question/feature request (out of interest) - could the WebUI show the matching commit message?
In which repo? :)
Nope, no .netrc defined anywhere, really (plus I've abandoned the use of Docker for the autoscaler as it complicates things, at least for now)
Will try later today TimelyPenguin76 and report back, thanks! Does this revert the behavior to the 1.3.x one?
That was a good idea; unfortunately it did not help too much, but I think I may have found a workaround, thanks!
Aw you deleted your response fast CostlyOstrich36 xD
Indeed it does not appear in ps aux, so I cannot simply kill it (or at least, find it). I was wondering if it's maybe just a zombie in the server API or similar.
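One way to check what the server still thinks is registered (a sketch using the APIClient; the worker attributes are my assumption):
from clearml.backend_api.session.client import APIClient

client = APIClient()
# a zombie agent would still be listed here even with no local process
for worker in client.workers.get_all():
    print(worker.id, getattr(worker, "queues", None))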
We're using the example autoscaler, nothing modified
We're using a self-hosted account
Nothing I can spot --
ClearML results page:
ClearML pipeline page:
Launching the next 2 steps
Launching step [...]
Launching step [...]
Launching step: ...
Parameters:
{...}
Configurations:
{}
Overrides:
{}
Launching step: ...
Parameters:
{...}
Configurations:
{}
Overrides:
{}
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2023-02-21 13:53:48
ClearML Monitor: Could not detect iteration reporting, falling back to itera...
@CostlyOstrich36 I added None btw
I believe that a Pipeline should have the system tags (pipeline, maybe hidden), even if it is created in a running Task.
Happens with the latest version indeed.
I can’t share our code, but the gist of it is:
pipe = PipelineController(name=..., project=..., version=...)
pipe.add_function_step(...) # Many calls
pipe.set_default_execution_queue(...)
pipe.start(queue=..., wait=True)
So the pipeline runs successfully, I can find all the different tasks, but I cannot see them in the Pipelines tab…
FWIW running clearml==1.9.1 with WebApp: 1.9.2-317 • Server: 1.9.2-317 • API: 2.23
When I use the APIClient to fetch the tags for the project, I get an empty collection of system tags:
<projects.GetProjectTagsResponse: {
"tags": [],
"system_tags": []
}>
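Roughly the query in question (a sketch; the call name follows the GetProjectTagsResponse above, and the projects filter argument is my assumption):
from clearml.backend_api.session.client import APIClient

client = APIClient()
# assuming the endpoint behind GetProjectTagsResponse accepts a project filter
res = client.projects.get_project_tags(projects=["<project_id>"])
print(res.tags, res.system_tags)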
Ah I see, if the pipeline controller begins in a Task, it does not add the tags to it…
Yes, exactly. I have not yet had a chance to try this out -- should it work?
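Something along these lines is what I'd try (a sketch; assuming set_system_tags/get_system_tags work on the controller task — whether the UI then picks it up is the open question):
from clearml import Task

# manually mark the controller task with the pipeline system tag
controller_task = Task.current_task()
controller_task.set_system_tags(
    (controller_task.get_system_tags() or []) + ["pipeline"]
)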
I... did not, I'm ashamed to admit. The documentation says only boolean values.
No, I have no running agents listening to that queue. It's as if it's retained in some memory somewhere and the server keeps creating it.
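If it keeps reappearing, deleting the queue explicitly via the API might be worth a try (a sketch; force=True for a non-empty queue is my assumption):
from clearml.backend_api.session.client import APIClient

client = APIClient()
# find the stale queue by name
for q in client.queues.get_all():
    print(q.id, q.name)

# delete it explicitly; force should also drop any pending entries
client.queues.delete(queue="<queue_id>", force=True)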
Could also be related to K8s, so pinging JuicyFox94 just in case 😉
I can only say I’ve found ClearML to be very helpful, even given the documentation issue.
I think they’ve been working on upgrading it for a while, hopefully something new comes out soon.
Maybe @AgitatedDove14 has further info 🙂