AgitatedDove14 Yes, the difference in installed packages is large. The training stage, which runs OK, has all of the following:
Okay, I see. I didn't clearly understand the structure and logic behind ClearML. I thought that an external git repository had to be set up to keep logs, stats, etc. So all of these are kept on the ClearML host, correct? However, if I want to keep logs in an outside repo, is it possible to configure ClearML to keep all these files there?
https://clearml.slack.com/archives/CTK20V944/p1610481348165400?thread_ts=1610476184.162600&cid=CTK20V944
Indeed, that was a cookie issue. After deleting cookies, everything works fine. Thanks. Interestingly enough, I had this issue on both Chrome and FF.
Thanks. Not yet, but will watch, by all means.
AgitatedDove14 It works!!! Thanks a lot!
AgitatedDove14 According to the logs (up to traceback message), the only difference between those two tasks is task id name
Well, I'm pretty sure that nntraining is executed in the same queue for these two cases:
OK, I ran it (I just used a point instead of a comma in the print statement; a note for anyone reading this who runs the code). The output is attached to this message.
Here's also the log of the failed pipeline; it may give a clue.
AgitatedDove14 Looks like that. First, I created a toy task running in the "services" queue (you didn't say so, but I guess you assumed it). I haven't found how to specify the queue in code ( Task.equeue(task, queue_name='services') returned an error), so I first ran toy.py in the "default" queue, aborted it, and started nntraining in the "default" queue. Then I reset toy.py and enqueued it to the "services" queue. Toy.py failed shortly after. I've also reset both toy.py and nntraining and enqueue...
These libraries are absent from the failing option. The only libraries in that option (all of which are also present in the working option) are:
absl_py==0.9.0
boto3==1.16.6
clearml==0.17.4
joblib==0.17.0
matplotlib==3.3.1
numpy==1.18.4
scikit_learn==0.23.2
tensorflow_gpu==2.2.0
watchdog==0.10.3
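A comparison like this can be done mechanically instead of by eye. Below is a minimal sketch that diffs two "installed packages" lists in `name==version` form. The helper names `parse_requirements` and `diff_packages` are hypothetical, and the WORKING list is only an abbreviated excerpt built from package lines quoted in this thread, not the full list from either task.

```python
import re

# Package list of the failing option, as quoted above.
FAILING = """
absl_py==0.9.0
boto3==1.16.6
clearml==0.17.4
joblib==0.17.0
matplotlib==3.3.1
numpy==1.18.4
scikit_learn==0.23.2
tensorflow_gpu==2.2.0
watchdog==0.10.3
"""

# Abbreviated excerpt for illustration: the failing list plus a few of the
# extra packages reported for the working task.
WORKING = FAILING + """
astunparse==1.6.3
attrs==20.3.0
botocore==1.19.63
"""


def parse_requirements(text):
    """Parse 'name==version' lines into a dict keyed by normalized name."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Package names are case-insensitive and '-', '_', '.' are equivalent.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        packages[key] = version.strip()
    return packages


def diff_packages(working, failing):
    """Return (only in working, only in failing, version mismatches)."""
    only_in_working = sorted(set(working) - set(failing))
    only_in_failing = sorted(set(failing) - set(working))
    mismatched = {
        name: (working[name], failing[name])
        for name in working.keys() & failing.keys()
        if working[name] != failing[name]
    }
    return only_in_working, only_in_failing, mismatched


working = parse_requirements(WORKING)
failing = parse_requirements(FAILING)
missing, extra, mismatched = diff_packages(working, failing)

print("missing from failing task:", missing)
print("only in failing task:", extra)
print("version mismatches:", mismatched)
```

Pasting the full "installed packages" sections of both tasks into the two strings would list exactly which packages the failing task lacks.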
Exactly! To be more specific: the same base_task_id fails if the pipeline is cloned and started from the UI. I've checked the queues for the failed and completed tasks - they are the same (default, gpu-all).
AgitatedDove14 Yes, that's what I have - for me it's weird, too.
TimelyPenguin76 Yes, that's a new file - I haven't added it to the repository yet. What I see for the original task under "uncommitted changes" is "no changes logged".
astunparse==1.6.3
attrs==20.3.0
botocore==1.19.63
cachetools==4.2.1
certifi==2020.12.5
chardet==4.0.0
cycler==0.10.0
Cython==0.29.21
furl==2.1.0
future==0.18.2
gast==0.3.3
google-auth==1.25.0
google-auth-oauthlib==0.4.2
google-pasta==0.2.0
grpcio==1.35.0
h5py==2.10.0
humanfriendly==9.1
idna==2.10
importlib-metadata==3.4.0
jmespath==0.10.0
jsonschema==3.2.0
Keras-Preprocessing==1.1.2
kiwisolver==1.3.1
Markdown==3.3.3
oauthlib==3.1.0
opt-einsum==3.3.0
orderedmultidict==1.0.1
pathlib2==2.3.5
pat...
No, I have only two agents pulling from different queues:
AgitatedDove14
No, I meant a different thing. It's not easy to explain, sorry. Let me try. Say I have a project in the folder "d:\object_detection". There I have a script that converts annotations from labelme format to COCO format. The script is named convert_test.py and it runs a process registered under the same name in clearml. When run separately from the command prompt, this script creates a new file in the project folder - test.json . I delete this file, synch local and remote repos, both...
Yes, this works, thank you!
AgitatedDove14
No, I do not use the --docker flag for clearml agent. On Windows, setting system_site_packages to true allowed all stages in the pipeline to start, but it doesn't work on Linux. I've deleted the tfrecords from the master branch, committed the removal, and set the tfrecords folder to be ignored in .gitignore. I'm trying to find out which changes are considered uncommitted. By cache files I mean the files in folder C:\Users\Super.clearml\vcs-cache - based on error message, cle...
AgitatedDove14 How can the first process corrupt the second, and why doesn't this occur if I run the pipeline from the command line? Just to be precise - I run all the processes as administrator. However, I've tested running the pipeline from the command line in non-administrator mode, and it works fine.



