It's part of the design I think. It makes sense that if we want to keep track of changes, we always build on top of what we already have 🙂 I think of it like a commit: I'm adding files in a NEW commit, not in the old one.
That makes sense! Maybe something like dataset querying as is used in ClearML Hyperdatasets might be useful here? Basically you'd query your dataset to only include the samples you want and have the query itself be a hyperparameter in your experiment?
I'm not quite sure what you mean here? From the docs it seems like you should be able to simply send an HTTP request to the localhost url to get the metrics. Is this not working for you? Otherwise, all the metrics end up in Prometheus, so you can also query that instead or use something like Grafana to visualize it
AstonishingRabbit13 If I'm not mistaken, you can add images to the preview tab by reporting them as debug samples.
So you'd run: dataset.get_logger().report_image()
or report_media()
This is not scalable though, so don't expect the server to handle millions of images well, for that you'd need Hyperdatasets 🙂
But it works well (as the name suggests) for some previews of the images!
Relevant docs:
https://clear.ml/docs/latest/docs/references/sdk/dataset/#get_logger
https://...
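Roughly what I mean, as a sketch (the project/dataset names and image paths here are just placeholders):

from clearml import Dataset

dataset = Dataset.create(dataset_project="examples", dataset_name="images-with-previews")
dataset.add_files(path="data/images")

# Report a couple of images so they show up in the dataset's preview tab
logger = dataset.get_logger()
logger.report_image(title="preview", series="sample_0", local_path="data/images/sample_0.jpg")
logger.report_media(title="preview", series="sample_1", local_path="data/images/sample_1.gif")

dataset.upload()
dataset.finalize()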
Cool! 😄 Yeah, that makes sense.
So (just brainstorming here) imagine you have your dataset with all samples inside. Every time N new samples arrive they're just added to the larger dataset in an incremental way (with the 3 lines I sent earlier).
So imagine if we could query/filter that large dataset to only include a certain datetime range. That range filter is then stored as a hyperparameter too, so in that case, you could easily rerun the same training task multiple times, on differe...
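Just to make the brainstorm concrete, a sketch (the date fields, names and filtering step are hypothetical and depend on how your samples are organized):

from clearml import Task, Dataset

task = Task.init(project_name="examples", task_name="train-on-date-range")

# The range filter is stored as a hyperparameter, so cloning the task with a
# different range reruns the exact same code on a different slice of the data
params = {"range_start": "2023-01-01", "range_end": "2023-02-01"}
params = task.connect(params)

dataset = Dataset.get(dataset_project="examples", dataset_name="incremental-data")
local_path = dataset.get_local_copy()
# ... select only the samples in local_path that fall inside the requested range ...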
Ok, so I think I recreated your issue. The problem is that HPO was designed to handle more possible combinations of items than is reasonable to test. In this case though, there are only 11 possible parameter "combinations". But by default, ClearML sets the maximum number of jobs much higher than that (check the advanced settings in the wizard).
It seems like HPO doesn't check for duplicate experiments though, so that means it will keep spawning experiments (even though it might have executed the exact s...
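If it helps, a rough sketch of what I mean by capping the job count (the parameter name and values are made up; total_max_jobs is the relevant bit):

from clearml.automation import DiscreteParameterRange, GridSearch, HyperParameterOptimizer

optimizer = HyperParameterOptimizer(
    base_task_id="<your_base_task_id>",
    hyper_parameters=[DiscreteParameterRange("General/num_layers", values=list(range(1, 12)))],
    objective_metric_title="validation",
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    optimizer_class=GridSearch,
    execution_queue="default",
    total_max_jobs=11,  # only 11 real combinations exist, so don't queue more than that
)
optimizer.start()
optimizer.wait()
optimizer.stop()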
VivaciousBadger56 hope you had a great time while away :)
That looks correct indeed. Do you mind checking for me if the dataset actually contains the correct metadata?
Go to the datasets section, select the one you need and on the right click on more information. It should send you to the experiment manager view. Then, under artifacts, do you see a key in the list named metadata? Can you post a screenshot?
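If clicking around is a hassle, you can also check the same thing from code; a small sketch (assuming the metadata was stored under the default name):

from clearml import Dataset

dataset = Dataset.get(dataset_project="<your_project>", dataset_name="<your_dataset>")
print(dataset.get_metadata())  # should print the metadata you attached, or nothing if it was never set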
VivaciousBadger56 Thank you for the screenshots! I appreciate the effort. You indeed clicked on the right link, I was on mobile so had to instruct from memory 🙂
First of all: every 'object' in the ClearML ecosystem is a task. Experiments are tasks, so are dataset versions and even pipelines! Each task can be viewed using the experiment manager UI, that's just how the backend is structured. Of course we keep experiments and data separate by giving them a separate tab and different UI, but...
VivaciousBadger56 Thanks for your patience, I was away for a week 🙂 Can you check that you properly changed the project name in the line above the one you posted?
In the example, by default, the project name is "ClearML Examples/Urbansounds". But it should give you an error when first running the get_data.py
script that you can't actually modify that project (by design). You need to change it to one of your own choice. You might have done that in get_data.py
but forgot to do s...
Hi @<1534344450795376640:profile|VividSwallow28> ! I've seen your github issue and will answer you there 🙂 I'll leave a link here for others facing the same issue.
Wait is it possible to do what i'm doing but with just one big Dataset object or something?
Don't know if that's possible yet, but maybe something like the proposed querying could help here?
When I run the example this way, everything seems to be working.
Hi Fawad!
You should be able to get a local mutable copy using Dataset.get_mutable_local_copy
and then create a new dataset.
But personally I prefer this workflow:
dataset = Dataset.get(dataset_project=CLEARML_PROJECT, dataset_name=CLEARML_DATASET_NAME, auto_create=True, writable_copy=True)
dataset.add_files(path=save_path, dataset_path=save_path)
dataset.finalize(auto_upload=True)
The writable_copy
argument gets a dataset and creates a child of it (a new dataset with your ...
If they don't want to use ClearML, why not just run docker run ...
from the command line? Does he want to use the queueing system without using a clearml task?
So you train the model only on those N preprocessed data points then? Never combined with the previous datapoints before N?
Also, this might be a little stupid sorry, but your torch save command saves the model in the current folder, whereas you give clearml the 'model_folder/model' path instead. Could it be that the path is just incorrect?
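To illustrate what I mean (a hypothetical before/after, assuming 'model_folder/model' is the path you registered with ClearML):

import os
import torch

model = torch.nn.Linear(4, 2)  # stand-in for your trained model

save_path = os.path.join("model_folder", "model")
os.makedirs(os.path.dirname(save_path), exist_ok=True)

# Save to the same path you later hand to ClearML, not to the current folder
torch.save(model.state_dict(), save_path)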
Thank you so much, sorry for the inconvenience and thank you for your patience! I've pushed it internally and we're looking for a patch 🙂
@<1523701949617147904:profile|PricklyRaven28> Please use this patch instead of the one previously shared. It excludes the dict hack :)
Hi @<1523701949617147904:profile|PricklyRaven28> sorry that this is happening. I tried to run your minimal example, but get an IndexError: Invalid key: 5872 is out of bounds for size 0
error. That said, I get the same error without the code running in a pipeline. There seems to be no difference between simply running the code and the pipeline (for me). Do you have an updated example, maybe also including getting a local copy of an artifact, so I can check?
I think that would defeat the purpose of lineage no? The point is to keep track of where data came from in the real world. Rewriting that record is just kind of... metadata?
As for the (*) line, could it be that "0385db..." does not have any parents itself? In that case "0385db..." would be the base dataset, without parents, and it has 1 child, which has "0385db..." as its parent.
It's been accepted in master, but was not released yet indeed!
As for the other issue, it seems like we won't be adding support for non-string dict keys anytime soon. I'm thinking of adding a specific example/tutorial on how to work with Huggingface + ClearML so people can do it themselves.
For now (using the patch) the only thing you need to be careful about is to not connect a dict or object with ints as keys. If you do need to (e.g. usually huggingface models need the id2label dict some...
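Not an official workaround, just the idea in a sketch: cast the int keys to strings before connecting and cast them back afterwards, so ClearML only ever sees string keys (names here are made up):

from clearml import Task

task = Task.init(project_name="examples", task_name="hf-config")

id2label = {0: "negative", 1: "positive"}
connected = task.connect({str(k): v for k, v in id2label.items()}, name="id2label")

# Recover the original form for the parts of your code that expect int keys
id2label = {int(k): v for k, v in connected.items()}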
Nice! Well found and thanks for posting the solution!
May I ask out of curiosity, why mount X11? Are you planning to use a GUI app on the k8s cluster?
Hi PanickyMoth78 ,
I've just recreated your example and it works for me on clearml==1.6.2
but indeed not on clearml==1.6.3rc1
which means we have some work to do before the full release 🙂 Can you try on clearml==1.6.2
to check that it does work there?
Hi @<1547028116780617728:profile|TimelyRabbit96> Awesome that you managed to get it working!
Thanks again for the extra info Jax, we'll take it back to our side and see what we can do 🙂
Like Nathan said, custom engines are a TODO, but for your second question, you can add that API request in the model preprocessing, which is a function you can define yourself! It will be run every time a request comes in and you can do whatever you want in it and change the incoming data however you wish 🙂
example: https://github.com/allegroai/clearml-serving/blob/main/examples/keras/preprocess.py
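Roughly what that can look like, loosely based on the linked example (the enrichment URL and field names are made up; only the Preprocess class and method shape follow the example):

from typing import Any

import numpy as np
import requests


class Preprocess(object):
    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # Call any external API to enrich the incoming request before inference
        extra = requests.get("http://my-feature-service/lookup", params={"id": body["id"]}).json()
        features = list(body["features"]) + list(extra["features"])
        return np.asarray([features], dtype=np.float32)

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # Shape the response however you want it returned to the caller
        return {"prediction": data.tolist() if hasattr(data, "tolist") else data}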
Hi Oriel!
If you want to only serve an if-else model, why do you want to use clearml-serving for that? What do you mean by "online featurer"?
Hi CourageousKoala93 ! Have you tried https://clear.ml/docs/latest/docs/references/sdk/task#set_comment by any chance? There's a description field under the info tab 🙂
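For reference, a tiny sketch of what that looks like (project/task names and the comment text are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="with-description")
task.set_comment("Trained on the cleaned dataset, learning rate 1e-4.")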
@<1547028116780617728:profile|TimelyRabbit96>
Pipelines has little to do with serving, so let's not focus on that for now.
Instead, if you need an ensemble_scheduling
block, you can use the CLI's --aux-config
command to add any extra stuff that needs to be in the config.pbtxt.
For example here, under the Setup section step 2, we use the --aux-config
flag to add a dynamic batching block: None