Alright! I'll take a look at it. It's also nice to know that pipelines will take care of it. Thanks!
Not really, it's an Ubuntu desktop machine that I just update from time to time. I've also got a few pipelines running during my training runs. Do you know of any tools that I could use to analyze network errors?
Here are the versions: WebApp: 1.7.0-232 • Server: 1.7.0-232 • API: 2.21
Thanks for trying to help me out! Here's some code that should reproduce the error (at least, it did for me): https://github.com/allegroai/clearml-agent/issues/111
This is what I've found, and no errors seem to come up
This is from the console by the way
We've updated everything now, launched a new experiment and we're tracking the logs. I'll tell you if I find anything
Nothing strange in dmesg at least 😕
I'm still in 1.7.0 because of the None arguments thing for now, but I'll test with the latest version if I ever find any other issue
Weeell it seems to work with version 1.7.0 and not with 1.7.1
Yes sure, I will do that
With default settings, it took more than 6 hours to upload 2 datasets of 120 GB and 70 GB! And this is to upload the datasets to the server itself: the upload pipeline runs on the same computer as ClearML
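For a rough sense of scale, here is the average rate those numbers imply. This is only back-of-the-envelope arithmetic, assuming decimal units (1 GB = 1000 MB) and exactly 6 hours; since the upload took more than 6 hours, the real average is at most this value. The function name is made up for illustration.

```python
def average_throughput_mb_s(total_gb, hours):
    """Average transfer rate in MB/s, assuming 1 GB = 1000 MB."""
    total_mb = total_gb * 1000
    return total_mb / (hours * 3600)


# 120 GB + 70 GB uploaded in (at least) 6 hours
rate = average_throughput_mb_s(120 + 70, 6)
print(f"{rate:.1f} MB/s")  # prints "8.8 MB/s"
```

That is far below what a local disk or loopback connection can normally sustain, which suggests the time is going somewhere other than the raw transfer.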
I'm going to try deleting it using the APIClient
No sorry, I found where the logs are. And there don't seem to be any errors in them:
` [2022-10-14 17:22:50,771] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all in 3ms
[2022-10-14 17:22:50,784] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.get_by_id in 7ms
[2022-10-14 17:22:50,853] [9] [INFO] [clearml.service_repo] Returned 200 for events.add_batch in 182ms
[2022-10-14 17:22:50,874] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.edit in 28ms
[202...
If I refresh, the project is still there 😕
Thanks for the response, I don't have any specific reason. I just wanted to have something cleaner. We don't have many projects yet, so these examples just get in the way. But it's not bad, I was just wondering. I'll remember to check the environment variables for our next ClearML install. Thanks anyway, I won't take the trouble of removing them then
Yes sure CostlyOstrich36 , I'm just trying to pass some arguments from my __main__ to my pipeline_entry() to my component get_best_model() . But for some reason, I'm getting None into get_best_model instead of what I've given it in pipeline_entry
So this seems like it could work as a work-around:
` Python 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.ones((100, 100, 3))
>>> a.take(range(40), 0).take(range(40), 1).shape
(40, 40, 3) `
This replaces a[0:40, 0:40]
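To check that the chained `take` calls really give the same values as the slice, and not just the same shape, here is a small sketch. It uses distinct values instead of `np.ones` so the comparison is meaningful; the helper name `crop_with_take` is made up for illustration.

```python
import numpy as np


def crop_with_take(arr, rows, cols):
    """Select the first `rows` rows and `cols` columns using ndarray.take."""
    return arr.take(range(rows), 0).take(range(cols), 1)


# Distinct values, so equal shapes alone cannot make the check pass
a = np.arange(100 * 100 * 3).reshape(100, 100, 3)

cropped = crop_with_take(a, 40, 40)
sliced = a[0:40, 0:40]

print(cropped.shape)                   # (40, 40, 3)
print(np.array_equal(cropped, sliced)) # True
```

One difference to keep in mind: `take` returns a copy, while basic slicing returns a view of the original array, so the work-around uses extra memory on large arrays.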
Do you know where I can find the logs for that?
But I've got /opt/clearml/data/fileserver/examples/.pipelines/custom pipeline logic which has a bunch of folders of old tasks
From what I could see, generating SHA2:
i7-10700K: ~10 - 15 minutes
Xeon E3-1240: 4 - 5 hours!
Then in both cases I still have about 1h30 to upload the images to the fileserver, which I also find quite slow, but the ClearML fileserver is on my old Xeon. I plan to upgrade my server and test it again
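To compare the two machines independently of ClearML, a quick way is to time plain file hashing with the standard library. This is a minimal sketch, assuming "SHA2" means SHA-256; it is not how ClearML computes its hashes internally, and both function names are made up for illustration.

```python
import hashlib
import time
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashing_throughput_mb_s(path):
    """Return the measured hashing rate for one file in MB/s (1 MB = 10**6 bytes)."""
    size = Path(path).stat().st_size
    start = time.perf_counter()
    sha256_of_file(path)
    elapsed = time.perf_counter() - start
    return size / 1e6 / elapsed if elapsed > 0 else float("inf")
```

Running `hashing_throughput_mb_s` on the same large file on both machines would show whether the gap comes from the CPU or from reading the files off disk.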
Hmm okay, I'm doing a hyperparameter search by launching multiple processes of my train function. I've got a main task running the search to log the final results, and a bunch of training tasks running in parallel. It would've been nice to be able to come back to each individual training task, but I guess I'll do without
Yeah I had the same issue: https://clearml.slack.com/archives/CTK20V944/p1664887550256279
Yep, everything works now, thanks!
Alright, thanks! I can confirm that it works in 1.7.1