I've updated my feature request to describe that as well. A textual description is not necessarily a preview 😅 For now I'll use the debug samples.
These kinds of things definitely show how ClearML was originally designed only for neural networks tbh, where images are almost always only part of the dataset. Same goes for the consistent use of `iteration` everywhere 😞
Actually TimelyPenguin76 I get only the following as a "preview" -- I thought the preview for an image would be... the image itself..?
It's not exactly "debugging", but rather a description of the generated model/framework (generated with pygraphviz).
Hey @<1523701070390366208:profile|CostlyOstrich36> , thanks for the reply!
I’m familiar with the above repo, we have the ClearML Server and such deployed on K8s.
What’s lacking is documentation regarding the clearml-agent helm chart. What exactly does it offer, etc.
We’re interested in e.g. using karpenter to scale our deployments per demand, effectively replacing the AWS autoscaler.
Just because it's handy to compare differences and see how the data changed between iterations, but I guess we'll work with that 🙂
We'll probably do something like:
When creating a new dataset with a parent (or parents), look at the immediate parents for identically-named files. If those exist, load them with the matching framework (pyarrow, pandas, etc.) and log the differences to the new dataset 🙂
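A rough sketch of that comparison step (pure-Python stand-in; in the real flow the file contents would come from the parent datasets' local copies and be loaded with pyarrow/pandas, and the helper name is mine):

```python
def diff_datasets(parent_files, child_files):
    """Compare two {filename: content_bytes} mappings and report differences.

    Stand-in for comparing a new dataset against its immediate parent:
    files only in the child are 'added', only in the parent are 'removed',
    and present in both with different contents are 'changed'.
    """
    parent_names = set(parent_files)
    child_names = set(child_files)
    return {
        "added": sorted(child_names - parent_names),
        "removed": sorted(parent_names - child_names),
        "changed": sorted(
            name for name in parent_names & child_names
            if parent_files[name] != child_files[name]
        ),
    }
```

The resulting dict could then be logged to the new dataset as a summary of what changed between iterations.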
Let me know if you do; would be nice to have control over that 😁
Opened this - https://github.com/allegroai/clearml/issues/530 let me know if it's not clear enough FrothyDog40 !
I'm using some old agent I fear, since our infra person decided to use chart 3.3.0 😕
I'll try with the env var too. Do you personally recommend docker over the simple AMI + virtual environment?
More complete log does not add much information -
```
Cloning into '/root/.clearml/venvs-builds/3.10/task_repository/xxx/xxx'...
fatal: could not read Username for '': terminal prompts disabled
fatal: clone of '' into submodule path '/root/.clearml/venvs-builds/3.10/task_repository/...
```
There's no `.netrc` defined anywhere, really (+ I've abandoned the use of docker for the autoscaler as it complicates things, at least for now)
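(For reference, a `.netrc` entry that lets non-interactive git clones authenticate looks roughly like this; host and credentials below are placeholders, not from this setup:)

```
machine github.com
login some-user
password some-personal-access-token
```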
TimelyPenguin76 that would have been nice but I'd like to upload files as artifacts (rather than parameters).
AgitatedDove14 I mean like a grouping in the artifact. If I add e.g. `foo/bar` to my artifact name, it will be uploaded as
Here's how it failed for us 😅
`poetry` stores git-related data in `poetry.lock`, so when you `pip list`, you get an internal package we have with its version, but no git reference, i.e. `internal_module==1.2.3` instead of `internal_module @ git+https://....@commit`.
`pip` actually fails (our internal module is not on PyPI), but
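A toy illustration of the failure mode above, i.e. telling a plain frozen pin apart from a PEP 508 VCS direct reference (the helper name is mine, not part of pip or poetry):

```python
import re

def has_vcs_reference(requirement: str) -> bool:
    """True if a pip requirement line pins a VCS source, e.g. 'pkg @ git+...'.

    A plain 'pkg==1.2.3' line carries no repository/commit information,
    which is exactly what gets lost in the poetry.lock -> pip list round trip.
    """
    return bool(re.search(r"@\s*(git|hg|svn|bzr)\+", requirement))
```

A line without the VCS reference is unresolvable for a package that only lives in git, which is why the subsequent `pip install` fails.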
The SDK is fine as it is - I'm more looking at the WebUI at this point
I wouldn't put it past ClearML automation (a lot of stuff depends on certain suffixes), but I don't think that's the case here, hmm
A different AMI image, or installing older Python versions that don't enforce this...
For future reference though, the environment variable should be
Sure! It's a bit intricate as it accommodates many of our different plotting functionalities, but this consists of the important bits (I realize we have some bad naming here: despite its name, `fig` is actually an Axes object, not a Figure):
```python
_, fig = plt.subplots(...)  # bad naming: the Axes ends up in `fig`
sns.histplot(data, ax=fig, ...)
```
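For context, a minimal runnable sketch of the Figure/Axes split with clearer names (plain `hist` stands in for `sns.histplot`, and the Agg backend lets it run headless):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# plt.subplots() returns a (Figure, Axes) tuple; unpacking both avoids
# stashing the Axes in a variable called `fig`
fig, ax = plt.subplots()
ax.hist([1, 1, 2, 3, 3, 3], bins=3)  # plain-matplotlib stand-in for sns.histplot(..., ax=ax)
```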
Because setting env vars and ensuring they exist on the remote machine during execution etc is more complicated 😁
There are always ways around, I was just wondering what is the expected flow 🙂
AgitatedDove14 Basically the fact that this happens without user control is very frustrating - https://github.com/allegroai/clearml/blob/447714eaa4ac09b4d44a41bfa31da3b1a23c52fe/clearml/datasets/dataset.py#L191
We just call `task.close()` and then start a new `Task.init()` manually, so our "pipelines" are self-controlled
Hey FrothyDog40 ! Thanks for clarifying - guess we'll have to wait for that as a feature 😁
Should I create a new issue or just add to this one? https://github.com/allegroai/clearml/issues/529
`get_local_copy()` downloads a local copy and returns the path to the downloaded file. So you might be interested in e.g.
```python
local_csv = pd.read_csv(a_task.artifacts['train_data'].get_local_copy())
```
With the models, you're looking for `get_weights()`. It acts the same as `get_local_copy()`, so it returns a path.
EDIT: I think `get_local_copy()` for a model should also work 👍
Hm, that seems less than ideal. I was hoping I could pass some CSV locations. I'll try and find a workaround for that. Thanks!
..data referenced in the example above are part of the git repository?
What about setting the `working_directory` to the user working directory using
Right, so where can one find documentation about it?
The repo just has the variables without much explanation.
But... which queue does it listen to, which type of instances will it use, etc.?
We're using karpenter (more magic keywords for me), so my understanding is that it will manage the scaling part.
Does it make sense to you to run several such glue instances, to manage multiple resource requirements?
Anything else you'd recommend paying attention to when setting up the clearml-agent helm chart?
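(A hedged sketch of the "several glue instances" idea: one glue deployment per queue/resource profile could look roughly like the values fragment below. The key names are assumptions, not verified against any chart version; check the chart's own `values.yaml`.)

```yaml
# hypothetical values fragment for the clearml-agent helm chart;
# key names may differ between chart versions
agentk8sglue:
  queue: gpu-queue        # the ClearML queue this glue instance serves
# run a second helm release with e.g. queue: cpu-queue for a different
# resource profile; karpenter then scales nodes per pending pod demand
```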
Either one would be nice to have. I kinda like the instant search option, but could live with an ENTER to search.
I opened this meanwhile - https://github.com/allegroai/clearml-server/issues/138
Generally, it would also be good if the pop-up presented some hints about what went wrong with fetching the experiments. Here, I know the pattern is incomplete and invalid. A less advanced user might not understand what's up.