
Yeah, I understand that it's a bit confusing what I'm asking. Here's some sample code:
` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(cache=True)
def step_one():
    import numpy as np
    image = np.ones((100, 100, 3))
    crop = image[0:50, 0:50]
    print("here's my crop of shape:", crop.shape)

@PipelineDecorator.pipeline(name='custom pipeline logic', project='examples', version='0.0.5')
def executing_pipeline():
    step_one()

if __name__ ==...
This however works fine:
` Python 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.ones((100, 100, 3))
>>> a[0:50, 0:50].shape
(50, 50, 3)
`
I wasn't able to reproduce it with a simple piece of code; I'll try again later if I can. But what I've seen is that I was logging too many images and it was somehow missing my last reports. With fewer image logs it seems to work normally.
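For reference, this is roughly how I report the images and force a flush before the process exits (a minimal sketch; the project/task names, the loop and the image contents are made up, not my real code):
` from clearml import Task, Logger
import numpy as np

task = Task.init(project_name='examples', task_name='image logging sketch')  # made-up names
logger = Logger.current_logger()

# report a batch of images (placeholder data)
for i in range(200):
    img = (np.random.rand(100, 100, 3) * 255).astype('uint8')
    logger.report_image(title='debug', series='sample', iteration=i, image=img)

# make sure everything is sent before the task ends
task.flush(wait_for_uploads=True) `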
It's a bit strange, my pipeline is called "custom pipeline logic" (which I renamed to "blop" later). This:
api_client.projects.get_all(name=exact_match_regex("pipeline_project/.pipelines/blop"), search_hidden=True)
returns nothing, and this:
api_client.projects.get_all(name=exact_match_regex("pipeline_project/.pipelines/custom pipeline logic"), search_hidden=True)
returns nothing either.
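In case it matters, this is roughly how I set the client up for those calls (a sketch; I'm assuming APIClient and exact_match_regex live where I think they do):
` from clearml.backend_api.session.client import APIClient
from clearml.backend_interface.util import exact_match_regex

api_client = APIClient()

# look for the hidden pipeline project (path taken from my setup)
projects = api_client.projects.get_all(
    name=exact_match_regex("pipeline_project/.pipelines/blop"),
    search_hidden=True,
)
print(projects)  # -> empty list in my case `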
This crashes with:
` File "/tmp/tmpa5l_cvuv.py", line 8
    crop = image[(0:50, 0:50)]
SyntaxError: invalid syntax `
I'm going to try deleting it using the APIClient
This is from the console by the way
Weeell it seems to work with version 1.7.0 and not with 1.7.1
Not really, it's an Ubuntu desktop machine that I just update from time to time. I've also got a few pipelines running during my trainings. Do you know any tools that I could use to analyze network errors?
This is what I've found, and there's no error that seems to come up
CostlyOstrich36 This looks like a bug? Here's a simpler version of it and what I'm getting:
` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(cache=False)
def step_one(my_arg):
    print('step_one/my_arg:', my_arg)  # step_one/my_arg: None
    # I should not get None here! At least that's what I'm expecting

@PipelineDecorator.pipeline(name='custom pipeline logic', project='examples', version='0.0.5')
def executing_pipeline(my_arg):
    print('my_ar...
WebApp: 1.7.0-232 • Server: 1.7.0-232 • API: 2.21
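For reference, a minimal invocation would look something like this (a sketch, not my exact script; I'm assuming PipelineDecorator.run_locally() so everything runs in-process, and the 'hello' value is made up):
` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(cache=False)
def step_one(my_arg):
    print('step_one/my_arg:', my_arg)  # expected: the value passed in, not None

@PipelineDecorator.pipeline(name='custom pipeline logic', project='examples', version='0.0.5')
def executing_pipeline(my_arg):
    print('pipeline/my_arg:', my_arg)
    step_one(my_arg)

if __name__ == '__main__':
    PipelineDecorator.run_locally()
    executing_pipeline(my_arg='hello')  # 'hello' is a placeholder value `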
Nothing strange in dmesg
at least 😕
Yep I'm dumb, it worked. However I've launched a couple of tasks with name='custom pipeline logic', project='examples' and I have to delete them manually. When I try through the UI it just waits forever.
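If the UI keeps hanging, I'll probably fall back to something like this from the SDK (a sketch, assuming Task.get_tasks and Task.delete behave the way I think they do):
` from clearml import Task

# find the stray pipeline tasks by project/name and delete them
tasks = Task.get_tasks(project_name='examples', task_name='custom pipeline logic')
for t in tasks:
    t.delete(delete_artifacts_and_models=True, skip_models_used_by_other_tasks=True) `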
So this seems like it could work as a work-around:
` Python 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.ones((100, 100, 3))
>>> a.take(range(40), 0).take(range(40), 1).shape
(40, 40, 3) `
This replaces a[0:40, 0:40].
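Applied to the component from before, the work-around would look roughly like this (just a sketch, replacing the slice with take calls so the component code doesn't contain slice syntax):
` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(cache=True)
def step_one():
    import numpy as np
    image = np.ones((100, 100, 3))
    # work-around: avoid slice syntax, which gets mangled into image[(0:50, 0:50)]
    crop = image.take(range(50), 0).take(range(50), 1)
    print("here's my crop of shape:", crop.shape)  # (50, 50, 3) `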
No sorry, I found where the logs are. And there don't seem to be any errors in the logs:
` [2022-10-14 17:22:50,771] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all in 3ms
[2022-10-14 17:22:50,784] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.get_by_id in 7ms
[2022-10-14 17:22:50,853] [9] [INFO] [clearml.service_repo] Returned 200 for events.add_batch in 182ms
[2022-10-14 17:22:50,874] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.edit in 28ms
[202...
Hmm okay, I'm doing a hyperparameter search by launching multiple processes of my train function. I've got a main task running the search to log the final results, and a bunch of training tasks running in parallel. It would've been nice to be able to come back to each individual training task, but I guess I'll do without.
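To give an idea of the setup, it's roughly this shape (a heavily simplified sketch; train.py, the project/task names and the reported values are placeholders, not my real search):
` import subprocess
from clearml import Task

if __name__ == '__main__':
    # main task that aggregates the search results
    main_task = Task.init(project_name='my_project', task_name='hparam_search')

    # launch each trial as its own process running a training script;
    # train.py calls Task.init itself, so each trial shows up as its own task
    procs = [
        subprocess.Popen(['python', 'train.py', '--lr', str(lr)])
        for lr in (1e-3, 1e-4, 1e-5)
    ]
    for p in procs:
        p.wait()

    # log the final/aggregated result on the main task (placeholder value)
    main_task.get_logger().report_scalar('search', 'best_val_loss', value=0.123, iteration=0) `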
But I've got /opt/clearml/data/fileserver/examples/.pipelines/custom pipeline logic, which has a bunch of folders of old tasks
We've updated everything now, launched a new experiment and we're tracking the logs. I'll tell you if I find anything
If you have any ideas as to what could go wrong, I'd be happy to look at it. But since my venv is rebuilt at each new agent run, I'm really struggling to debug it
Yes sure, I will do that
Thanks for the response, I don't have any specific reason. I just wanted something cleaner. We don't have many projects yet, so these examples just get in the way. But it's not a big deal, I was just wondering. I'll remember to check the environment variables for our next ClearML install. Thanks anyway, I won't go to the trouble of removing them then.
Just dropping this here but I've had some funky compressions with very small datasets! It's not a big issue though, since it's still small and doesn't really affect anything
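For what it's worth, this is the kind of knob I'd look at if it ever became a real problem (a sketch; I'm assuming Dataset.upload's compression argument accepts zipfile constants as documented, and the names/paths are placeholders):
` import zipfile
from clearml import Dataset

ds = Dataset.create(dataset_name='tiny_dataset', dataset_project='my_project')  # placeholder names
ds.add_files(path='./data/tiny')  # placeholder path

# for very small datasets the zipping overhead is noticeable;
# ZIP_STORED would skip compression entirely (assumption: zipfile constants are accepted)
ds.upload(compression=zipfile.ZIP_STORED)
ds.finalize() `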
Here are the versions: WebApp: 1.7.0-232 • Server: 1.7.0-232 • API: 2.21
Thanks for trying to help me out! Here's some code that should reproduce the error (at least, it did for me): https://github.com/allegroai/clearml-agent/issues/111
From what I could see, generating the SHA2 hashes takes:
i7-10700K: ~10-15 minutes
Xeon E3-1240: ~4-5 hours!
Then in both cases I still need about 1h30 to upload the images to the fileserver, which I also find quite slow, but the ClearML fileserver is on my old Xeon. I plan to upgrade my server and test it again.
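For the timing, I'm measuring something equivalent to this outside of ClearML to compare the two machines (a sketch; the dataset folder path is a placeholder):
` import hashlib
import time
from pathlib import Path

def sha256_of_file(path, chunk_size=1024 * 1024):
    # stream the file in chunks so large images don't blow up memory
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

start = time.time()
for p in Path('./data/images').rglob('*'):  # placeholder dataset folder
    if p.is_file():
        sha256_of_file(p)
print(f'hashed everything in {time.time() - start:.1f}s') `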