
Great to hear! Then it comes down to waiting for the next Hugging Face release!
I think that would defeat the purpose of lineage, no? The point is to keep track of where the data came from in the real world. Rewriting that record just turns it into... metadata?
As for the (*) line, could it be that "0385db..." does not have parents itself? Then "0385db..." is the base dataset, without parents, and it has 1 child, which has "0385db..." as its parent.
Hi ExuberantParrot61 ! Can you try using a wildcard? E.g. ds.remove_files(dataset_path='folder_to_delete/*')
Hi VictoriousPenguin97 ! I think you should be able to change it in the docker-compose file here: https://github.com/allegroai/clearml-server/blob/master/docker/docker-compose.yml
You can map the internal 8008 port to another port on your local machine. But beware: you'll need to provide the different port number to any client that tries to connect (using clearml-init).
Usually those models are PyTorch, right? So yeah, you should be able to; feel free to follow the PyTorch example if you want to know how 🙂
Not exactly sure what is going wrong without an exact error or reproducible example.
However, passing around the dataset object is not ideal, because passing info from one step to another in a pipeline requires ClearML to pickle said object, and I'm not exactly sure a Dataset object is picklable.
Besides that, running get_local_copy() in the first step does not guarantee that you can access that data from the other step. Both might be executed in different docker containers or even on different...
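Just to illustrate the idea (a rough sketch with made-up step and project names), you could pass the dataset ID string between steps instead of the object:

```python
from clearml import Dataset
from clearml.automation.controller import PipelineDecorator

# Sketch: pass the dataset ID (a plain string) between pipeline steps
# instead of the Dataset object itself.
@PipelineDecorator.component(return_values=["dataset_id"])
def prepare_data(dataset_project, dataset_name):
    ds = Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)
    return ds.id  # a string serializes cleanly between steps

@PipelineDecorator.component()
def train(dataset_id):
    # fetch a local copy inside the step that actually needs the files
    local_path = Dataset.get(dataset_id=dataset_id).get_local_copy()
    print("data is at", local_path)
```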
You will have to provide more information. What other docker containers are running and how did you start the server?
Cool! 😄 Yeah, that makes sense.
So (just brainstorming here) imagine you have your dataset with all samples inside. Every time N new samples arrive, they're just added to the larger dataset in an incremental way (with the 3 lines I sent earlier).
So imagine if we could query/filter that large dataset to only include a certain datetime range. That range filter is then stored as a hyperparameter too, so in that case you could easily rerun the same training task multiple times, on differe...
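Roughly, the incremental part could look like this (project, dataset and folder names are just placeholders):

```python
from clearml import Dataset

# Every new batch of N samples becomes a child version of the previous dataset.
parent = Dataset.get(dataset_project="my_project", dataset_name="samples")
child = Dataset.create(
    dataset_project="my_project",
    dataset_name="samples",
    parent_datasets=[parent.id],
)
child.add_files("incoming_batch/")  # only the newly arrived samples
child.upload()
child.finalize()
```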
If that's true, the error should be on the combine function, no? Do you have a more detailed error log or minimal reproducible example?
Ah I see 😄 I have submitted a ClearML patch to Hugging Face transformers.
It is merged, but not in a release yet. Would you mind checking if it works if you install transformers from github? (aka the latest master version)
AstonishingRabbit13 If I'm not mistaken, you can add images to the preview tab by reporting them as debug samples.
So you'd run dataset.get_logger().report_image() or report_media()
This is not scalable though, so don't expect the server to handle millions of images well, for that you'd need Hyperdatasets 🙂
But it works well (as the name suggests) for some previews of the images!
Relevant docs:
https://clear.ml/docs/latest/docs/references/sdk/dataset/#get_logger
https://...
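A minimal sketch of what that could look like (paths and names are placeholders):

```python
from clearml import Dataset

ds = Dataset.create(dataset_project="my_project", dataset_name="with_previews")
ds.add_files("data/")
# report a couple of debug samples so they show up as previews
ds.get_logger().report_image(title="preview", series="sample_0", local_path="data/sample_0.png")
ds.get_logger().report_media(title="preview", series="clip_0", local_path="data/clip_0.mp4")
ds.upload()
ds.finalize()
```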
Can you elaborate a bit more, I don't quite understand yet. So it works when you update an existing task by adding a tag to it, but it doesn't work when adding a tag for the first time?
It's part of the design, I think. It makes sense that if we want to keep track of changes, we always build on top of what we already have 🙂 I think of it like a commit: I'm adding files in a NEW commit, not in the old one.
It should, but please check first. This is some code I quickly made for myself. I did make tests for it, but it would be nice to hear from someone else that it worked (as evidenced by the error above 😅 )
Hi PanickyMoth78,
I've just recreated your example and it works for me on clearml==1.6.2 but indeed not on clearml==1.6.3rc1, which means we have some work to do before the full release 🙂 Can you try on clearml==1.6.2 to check that it does work there?
How large are the datasets? To learn more you can always try to run something like line_profiler/kernprof, to see exactly how long a specific Python line takes. How fast/stable is your internet connection?
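For example, a quick sketch of profiling the dataset call with kernprof (project and dataset names are placeholders):

```python
# Run with:  kernprof -l -v profile_get.py
from clearml import Dataset

@profile  # this decorator is injected by kernprof at runtime, no import needed
def fetch():
    ds = Dataset.get(dataset_project="my_project", dataset_name="samples")
    return ds.get_local_copy()

if __name__ == "__main__":
    fetch()
```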
Thank you so much! In the meantime, I checked once more and the closest I could get was using report_single_value(). It forces you to report each and every row though, but the comparison looks a little better this way. No color coding yet, but maybe it can already help you a little 🙂
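For reference, this is roughly what I mean (metric names and values are made up):

```python
from clearml import Task

task = Task.init(project_name="my_project", task_name="metrics_example")
results = {"precision": 0.91, "recall": 0.87, "f1": 0.89}
# report each row as a single value so it shows up in the experiment comparison
for name, value in results.items():
    task.get_logger().report_single_value(name=name, value=value)
```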
Hi! Have you tried adding custom metrics to the experiment table itself? You can add any scalar as a column in the experiment list, it does not have color formatting, but it might be more like what you want in contrast to the compare functionality 🙂
Hey @<1541592213111181312:profile|PleasantCoral12> thanks for doing the profiling! This looks pretty normal to me. Although 37 seconds for a dataset.get is definitely too much. I just checked and for me it takes 3.7 seconds. Mind you, the .get() method doesn't actually download the data, so the dataset size is irrelevant here.
But the slowdowns do seem to only occur when doing api requests. Possible next steps could be:
- Send me your username and email address (maybe dm if you don't wa...
To be honest, I'm not completely sure, as I've never tried hundreds of endpoints myself. In theory, yes, it should be possible: Triton, FastAPI and Intel OneAPI (ClearML building blocks) all claim they can handle that kind of load, but again, I've not tested it myself.
To answer the second question, yes! You can basically use the "type" of model to decide where it should be run. You always have the custom model option if you want to run it yourself too 🙂
Yeah, I do the same thing all the time. You can limit the number of tasks that are kept in HPO with the save_top_k_tasks_only parameter, and you can create subprojects by simply using a slash in the name 🙂 https://clear.ml/docs/latest/docs/fundamentals/projects#creating-subprojects
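As a rough sketch (the base task ID, metric and parameter names are placeholders):

```python
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformParameterRange

# a slash in the project name creates a subproject
task = Task.init(project_name="my_project/hpo_runs", task_name="optimizer")
optimizer = HyperParameterOptimizer(
    base_task_id="<base task id>",
    hyper_parameters=[UniformParameterRange("General/lr", min_value=1e-4, max_value=1e-1)],
    objective_metric_title="validation",
    objective_metric_series="loss",
    objective_metric_sign="min",
    save_top_k_tasks_only=5,  # keep only the 5 best child tasks
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```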
Hi Oriel!
If you want to only serve an if-else model, why do you want to use clearml-serving for that? What do you mean by "online featurer"?
Hey! Thanks for all the work you're putting in and the awesome feedback 😄
So, it's weird you get the shm error; this is most likely our fault for not configuring the containers correctly 😞 The containers are brought up using the docker-compose file, so you'll have to add it in there. The service you want is called clearml-serving-triton, you can find it [here](https://github.com/allegroai/clearml-serving/blob/2d3ac1fe63637db1978df2b3f5ea4903ef59788a/docker/docker-...
Unfortunately, ClearML HPO does not "know" what is inside the task it is optimizing. It is like that by design, so that you can run HPO with no code changes inside the experiment. That said, this also limits us in not being able to "smartly" optimize.
However, is there a way you could use caching within your code itself? Such as using functools' LRU cache? This is built into Python and will cache a function's return value if it's ever called again with the same input arguments.
There also see...
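A quick illustration of the LRU cache idea (the function name and dataset ID handling are made up):

```python
from functools import lru_cache

from clearml import Dataset

@lru_cache(maxsize=8)
def load_dataset(dataset_id: str):
    # repeated calls with the same dataset_id return the cached path instantly
    return Dataset.get(dataset_id=dataset_id).get_local_copy()
```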
Hi @<1541592213111181312:profile|PleasantCoral12> thanks for sending me the details. Out of curiosity, could it be that your codebase / environment (apart from the clearml code, e.g. the whole git repo) is quite large? ClearML does a scan of your repo and packages every time a task is initialized, maybe that could be it. In the meantime I'm asking our devs if they can see any weird lag with your account on our end 🙂
Ok, so I think I recreated your issue. The problem is, HPO was designed to handle more possible combinations of items than is reasonable to test. In this case though, there are only 11 possible parameter "combinations". But by default, ClearML sets the maximum number of jobs much higher than that (check the advanced settings in the wizard).
It seems like HPO doesn't check for duplicate experiments though, so that means it will keep spawning experiments (even though it might have executed the exact s...
In the meantime, it might help to limit the number of jobs using the advanced settings. If you know the exact amount and want to run every one for sure, just set it that way 🙂
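Roughly, in code (the parameter name is a placeholder, just to illustrate capping the job count):

```python
from clearml.automation import HyperParameterOptimizer, DiscreteParameterRange

optimizer = HyperParameterOptimizer(
    base_task_id="<base task id>",
    hyper_parameters=[DiscreteParameterRange("General/threshold", values=list(range(11)))],
    objective_metric_title="validation",
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    total_max_jobs=11,  # no more jobs than there are combinations
)
```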
Hi @<1523701062857396224:profile|AttractiveShrimp45> , I'm checking your issue myself. Do you see any duplicate experiments in the summary table?