Hello everyone, can you help me with an issue I ran into recently? I got this message in the console while training my neural net with PyTorch Lightning (+ ClearML):
"Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/recsys-kf/lib/python3.9/multiprocessing/queues.py", line 251, in _feed
File "/home/ubuntu/anaconda3/envs/recsys-kf/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/ubuntu/anaconda3/envs/recsys-kf/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/recsys-kf/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe".

I should say that the whole training process completes successfully and I end up with a trained model, but I would still prefer not to see this kind of message 🙂 I googled it but nothing really helped. Setting num_workers=0 makes the message disappear, but nothing else changes - I still don't know how to fix this "broken pipe".

Do you have any idea what is happening here? Thanks in advance 😉
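
For reference, the last frame of the traceback above (`n = write(self._handle, buf)`) is a plain write to a pipe. Below is a minimal sketch, assuming Linux/POSIX, of how `[Errno 32] Broken pipe` arises when the reading end of a pipe is already closed at the moment of the write. It does not reproduce the ClearML / DataLoader case and says nothing about why the reader goes away there; the function name `write_to_closed_pipe` is made up for illustration.

```python
import errno
import os


def write_to_closed_pipe() -> int:
    """Write to a pipe whose read end is closed; return the resulting errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the "reader" goes away before the writer is done
    try:
        os.write(write_fd, b"some payload")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failed write surfaces
        # as BrokenPipeError instead of killing the process.
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


if __name__ == "__main__":
    print(write_to_closed_pipe() == errno.EPIPE)
```

If the same thing happens in a background feeder thread while the process on the other side of the pipe has already exited, the traceback gets printed even though the main training loop finishes normally.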

Posted 8 months ago
@<1554638160548335616:profile|AverageSealion33> , what if you just run a very simple piece of code that includes Task.init()? For example, if you take one of the examples in the repository, does this issue reproduce?

Posted 8 months ago



hey @<1523701070390366208:profile|CostlyOstrich36> , I ran this example ( None ) with num_workers changed to >1 and pin_memory=True in the DataLoader, and again I got this BrokenPipeError issue :<

Posted 8 months ago

@<1523701070390366208:profile|CostlyOstrich36> Do you have any thoughts on this?

Posted 8 months ago

  • it all worked perfectly fine before applying clearml
Posted 8 months ago

Hi @<1554638160548335616:profile|AverageSealion33> , so if you remove the Task.init() everything goes back to working fine?

Posted 8 months ago
6 Answers