
I clicked Fetch/XHR and got the following (after another reboot)
Here is a snapshot of the blank screen:
No other error messages but the dashboard screen is blank.
However, there is a breakthrough: I can run the dashboard from Safari (Mac browser). So the problem is only in Chrome.
ok, so ~/clearml.conf points to ~/.clearml/cache, and such a file does not exist.
AppetizingMouse58, SuccessfulKoala55 and AgitatedDove14, after running the ES migration for the 2nd time the problem is solved. Thank you all for your help!
Yes I've performed the ES migration. The data is in clearml/data/elastic_7.
I don't see a cache related to clearml:
(base) sigalr@rack-bermano-g03:~$ find . -name *cache* -not -name __pycache*
./.pycharm_helpers/python_stubs/cache
./.cache
./.conda/pkgs/cache
The 1st and last are obviously unrelated, and the middle one contains files related to python:
(base) sigalr@rack-bermano-g03:~$ ls .cache/
matplotlib motd.legal-displayed pip
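For anyone repeating this search without a shell, here is a minimal Python sketch of the same lookup. It uses only the standard library, and the function name `find_cache_entries` is hypothetical. It mirrors the `find` command above: names containing "cache", excluding names that start with `__pycache`.

```python
import fnmatch
import os


def find_cache_entries(root):
    """Return paths under root whose name matches *cache* but not __pycache*."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name, "*cache*") and not fnmatch.fnmatch(name, "__pycache*"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Search the home directory, like running `find .` from ~
    for path in find_cache_entries(os.path.expanduser("~")):
        print(path)
```

Passing the patterns as Python strings also sidesteps the shell expanding an unquoted `*cache*` before `find` sees it.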
I did not upgrade anything and did not do docker pull.
I am having a temporary network issue. I will send the output of "docker inspect" as soon as I can reconnect to my server.
I am using a self hosted server.
I suspect that maybe the server gets stuck when I compare a large number of experiments (~10). Can that be possible?
My original trains server version was 0.14, if I remember correctly. Is there anywhere I can check it now that the upgrade has been done?
My new clearml server is 1.5. I get that from http://localhost:8080/version.json but if there is somewhere else I should look, let me know.
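The same check can be scripted. This is a minimal sketch, assuming the web UI serves `version.json` at the address quoted above and that the file is valid JSON. The function name `read_server_version` and its `opener` parameter are hypothetical, and nothing is assumed about which keys the file contains.

```python
import json
import urllib.request


def read_server_version(base_url, opener=urllib.request.urlopen):
    """Fetch <base_url>/version.json and return the parsed JSON document."""
    url = base_url.rstrip("/") + "/version.json"
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumes the web UI is reachable on localhost:8080, as in the message above.
    print(read_server_version("http://localhost:8080"))
```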
As written above, I did the right click clone, then I did right click enqueue.
The experiment reported 'running', and immediately after preparing the environment it reported 'completed', without actually running my code. Please look at the beginning of this thread for output logs and more details.
Bingo (I guess). My code is local, with multiple files. I will try to connect it to a git repo and let you know how it worked.
Does the agent support uncommitted changes in multiple files (on top of a git commit)?
The upgrade is from /home/orpat/trains/data/elastic into /home/orpat/trains/data/elastic_7. Do you see different paths in the log? Where?
Many errors :white_frowning_face: . Any idea what they mean?
AgitatedDove14 , I noticed that if I run the initial experiment by "python -m folder_name.script_name" then the script path contains the whole list of arguments as you observed.
On the other hand, if I run the initial experiment by "python folder_name/script_name.py", then the script path contains only 'script_name.py'.
In both cases I cannot clone the experiment, with the same results as I reported in my initial message.
The sequence is unclear then:
I followed the instructions in https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_es7_migration/ .
Stage 5 ("python elastic_upgrade.py") ended successfully.
Then I skipped "Upgrading to ClearML Server v.1.2. or Newer" and went straight to "Completing the Installation".
Did I do something wrong? What should I do to fix it?
AgitatedDove14 , thank you so much for your help.
I had a long video session today with the Israeli clearml engineers. There were plenty of things I had to do, and the two major ones were to define the environment variable CLEARML_AGENT_SKIP_PIP_VENV_INSTALL so it points to my conda environment python, and to call 'import clearml' from the top of my file (it was called from inside a method).
So now I can clone!
No. I put a break point in my python script, and examined os.environ. The only environment variable with 'CLEARML' in its name is CLEARML_PROC_MASTER_ID, whose value is '16188:' (maybe it means something to you?)
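The same inspection can be done without a debugger. This is a minimal sketch using only the standard library; the helper name `clearml_env_vars` is hypothetical.

```python
import os


def clearml_env_vars(environ=None):
    """Return the environment variables whose name contains 'CLEARML'."""
    if environ is None:
        environ = os.environ
    return {name: value for name, value in environ.items() if "CLEARML" in name}


if __name__ == "__main__":
    # Print whatever CLEARML-related variables the current process can see.
    for name, value in sorted(clearml_env_vars().items()):
        print(f"{name}={value}")
```

Calling it at the top of the script shows what the process actually inherited, which may differ from what was exported in another shell.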
Update: I ran the mongo migration script (clearml-server-1.2.0-migration.py) and now I can see my projects!
Now there is a new problem: I don't see any of the logs: console, artefacts, scalars, plots.
Can you help?
I am not sure it matters for the following output, but anyway please note that the clearml dockers are down right now.
sigalr@momo : ~ $ curl -XGET http://localhost:9200/_cat/indices
yellow open queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2022-06 2F6APbQWSvajTZQ5JxXY1Q 1 1 59 0 26.2kb 26.2kb
yellow open events-plot-d1bd92a3b039400cbafc60a7a5b1e52b bZMKKCaKRXCys6VD_9oDDw 1 1 8556 0 4.1mb 4.1mb
yellow open worker_stats_d1bd92a3b039400cbafc60a7a5b1e52b_2022-06 c85DhB...
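To make output like this easier to scan, here is a minimal Python sketch that splits each line into the default `_cat/indices` columns (health, status, index, uuid, primaries, replicas, document count, deleted documents, store size, primary store size). The function name `parse_cat_indices` is hypothetical. Lines that lack all ten columns, such as the truncated last line above, are skipped rather than guessed at.

```python
def parse_cat_indices(text):
    """Parse default `_cat/indices` output into a list of dicts.

    Lines that do not have all ten default columns (for example, a line
    cut off in a paste) are skipped rather than guessed at.
    """
    columns = ["health", "status", "index", "uuid", "pri", "rep",
               "docs_count", "docs_deleted", "store_size", "pri_store_size"]
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(columns):
            rows.append(dict(zip(columns, fields)))
    return rows


# Sample input copied from the two complete lines of the output above.
sample = """\
yellow open queue_metrics_d1bd92a3b039400cbafc60a7a5b1e52b_2022-06 2F6APbQWSvajTZQ5JxXY1Q 1 1 59 0 26.2kb 26.2kb
yellow open events-plot-d1bd92a3b039400cbafc60a7a5b1e52b bZMKKCaKRXCys6VD_9oDDw 1 1 8556 0 4.1mb 4.1mb
"""

for row in parse_cat_indices(sample):
    print(row["index"], row["health"], row["docs_count"])
```

The document counts are the useful part here: a non-zero count for an `events-*` index suggests the event data exists in Elasticsearch even when the UI shows nothing.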
Woohoo!
The instructions at https://superuser.com/questions/278948/clear-cache-for-specific-domain-name-in-chrome/444881#444881 were not accurate, but they brought me close enough.
Here is the exact sequence of operations:
F12 --> Application tab --> Storage --> Clear site data --> refresh login screen
Thanks everyone for your help!
Where do I see the agent print outs?
I am using an old version. It's a trains server of version 0.16.3.
The clearml dockers are down right now because I started a new ES migration (elastic_upgrade.py). I started it before you contacted me and I don't want to break it now. So I cannot look at the console right now.
It will probably finish 30 hours from now. If the same problems repeat, we will continue this chat then.
AgitatedDove14, SuccessfulKoala55, after I ran elastic_upgrade.py (stage 5 as described above), I saw there was a new folder named data/mongo_4. Doesn't that mean mongodb was already migrated?
TimelyPenguin76, it is possible that I tried to compare more than 10 experiments. The issue on the server is that it got very slow and no longer showed the 'console' and 'scalars' results, even for a single experiment.
CostlyOstrich36, I don't have the ClearML RAM estimate. My machine is running many processes in addition to ClearML.