If I understood this correctly: in a case where we have defined steps in parent-child order, if the parent's pre-execute callback returns False, will all subsequent child nodes/steps be skipped, or will they ignore it and still execute?
Retrying (Retry(total=239, connect=239, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2191dcaf0>: Failed to establish a new connection: [Errno 111] Connection refused')': /auth.login
Retrying (Retry(total=238, connect=238, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2191e10a0>: Failed to establish a new connection: ...
And multiple agents can listen to the same queue right?
I'm not in the best position to answer these questions right now.
Given a situation where I want to delete an uploaded artifact from both the UI and the storage, how would I go about doing that?
Or is there any specific link you can recommend for trying to create my own server?
I think it might do this because of a cache or something. Maybe it keeps a record of an older login, and when you restart the server it keeps trying to use the older details.
I would normally like it to install any needed requirements on its own.
I'm using clearml installed via pip in a conda env. Do I find this file inside the environment directory?
Yeah I think I did. I followed the tutorial on the repo.
It works this way. Thank you.
And casting it to bool converts it to True
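For reference, this is standard Python behaviour rather than anything ClearML-specific: any non-empty string is truthy, so the string "False" casts to True. A minimal sketch of parsing such a value explicitly, assuming the parameter arrives as a string; the helper name `parse_bool` is hypothetical:

```python
def parse_bool(value):
    """Convert common string spellings of a boolean into a real bool.

    Hypothetical helper for illustration; not part of any library.
    """
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes", "y"):
        return True
    if text in ("false", "0", "no", "n", ""):
        return False
    raise ValueError(f"Cannot interpret {value!r} as a boolean")


# A non-empty string is always truthy, whatever it spells.
print(bool("False"))
# Parsing the text gives the intended value instead.
print(parse_bool("False"))
```

Raising on unknown spellings is a deliberate choice here, so a typo in a parameter fails loudly instead of silently becoming True or False.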
Alright, I solved it. Given that my parameters are stored inside args, the way to do it would be,
os.listdir(location) shows nothing
There are other parameters for add_task as well. I'm just curious how to pass the folder and batch size in the schedule_fn=watch_folder part.
Adding tags this way to a Dataset object works fine. This issue only occurred when doing this to a model.
Normally when you save a model in TensorFlow, you get a whole saved_model, not just the weights. Is there no way to get the whole model including the architecture?
Is Elastic what ClearML uses to handle data?
The storage is basically the machine the ClearML server is on; I'm not using S3 or anything.
As I go through the model.py file, I get what you're saying. The only problem is that in the case of auto-logging, I don't have the model ID of the model being saved.
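One general Python way to do this is to bind the arguments ahead of time with `functools.partial`, so that what gets scheduled is a callable needing no arguments. This is only a sketch: it assumes the scheduler calls the function with no arguments, which I have not checked against the ClearML docs, and the `watch_folder` body, folder path, and batch size below are placeholders.

```python
from functools import partial


def watch_folder(folder, batch_size):
    """Hypothetical stand-in for the function being scheduled."""
    return f"watching {folder} with batch size {batch_size}"


# Bind the arguments up front; the result can be called with no arguments.
bound_watch = partial(watch_folder, folder="/data/incoming", batch_size=8)

# Whatever invokes the callable later does not need to know the arguments.
print(bound_watch())
```

The bound callable would then be passed wherever `watch_folder` was passed before. A `lambda: watch_folder(folder, batch_size)` would work the same way.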
It was working fine for a while but then it just failed.
So right now, I'm creating an OutputModel and passing the current task in the constructor. Then I just save the TensorFlow Keras model. When I look at the details of the model artifact in the ClearML UI, it's been saved the usual way, and none of the tags I added in the OutputModel constructor are there. From this it seems to me that ClearML is auto-logging the model, and the model isn't connected to the OutputModel object that I created.
You're saying that the model should get connected if I call up...
I shared the error above. I'm simply trying to make YOLOv5 by Ultralytics part of my pipeline.
My draft is View Only but the cloned toy task one is in normal Draft mode.
Collecting idna==3.3
Using cached idna-3.3-py3-none-any.whl (61 kB)
Collecting importlib-metadata==4.8.2
Using cached importlib_metadata-4.8.2-py3-none-any.whl (17 kB)
Collecting importlib-resources==5.4.0
Using cached importlib_resources-5.4.0-py3-none-any.whl (28 kB)
ERROR: Could not find a version that satisfies the requirement jsonschema==4.2.1 (from -r /tmp/cached-reqsm1gu3664.txt (line 19)) (from versions: 0.1a0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8.0, 1.0.0, 1.1.0, 1.2.0, 1.3.0, 2.0...
Well, I'm still researching how it'll work. I expect it won't be very good and will make the model's learning very stochastic in nature.
Instead of just getting this model, I plan to use Dataset.squash at the training stage to merge the previous M datasets together.
This should introduce stability in the dataset.
Also, this way our model is trained on each batch of data multiple times, but only a few times before that batch is discarded. We keep the training data fresh for co...
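The sliding-window idea described above can be sketched in plain Python. This only shows which dataset ids would fall into each training round; the actual merge (for example via Dataset.squash) is not called here, and the function name `training_windows` and the ids are made up for illustration.

```python
from collections import deque


def training_windows(dataset_ids, m):
    """Return, for each new dataset, the list of at most m most recent ids.

    Each inner list is what one training round would merge together.
    """
    window = deque(maxlen=m)  # the oldest id is dropped automatically
    rounds = []
    for dataset_id in dataset_ids:
        window.append(dataset_id)
        rounds.append(list(window))
    return rounds


# With m = 2, every dataset is used in at most two rounds before it is discarded.
for ids in training_windows(["ds_a", "ds_b", "ds_c", "ds_d"], m=2):
    print(ids)
```

With a window of M, each batch of data takes part in at most M training rounds, which matches the "trained a few times, then discarded" behaviour.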