I am running my own server. Those are not example experiments.
Who/what created the initial experiment?
I created the initial experiment from the command line, with either "python folder/script.py" or "python -m folder.script".
Both end up with the experiment not running. I am attaching an agent daemon log where the initial experiment was called with "python folder/script.py".
Why isn't the entry point just the python script?
The entry point is folder.script and not just the script because I need the 'current' folder while running the script ...
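For anyone comparing the two launch styles mentioned above, here is a minimal sketch of how they differ in what Python puts first on sys.path. The layout (a package named folder containing script.py) and the helper name compare_launch_styles are made up for illustration; this only shows standard interpreter behaviour and says nothing about how the agent itself resolves the entry point.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical layout used only for this demo: <root>/folder/script.py
SCRIPT_BODY = "import os, sys\nprint(os.path.realpath(sys.path[0]))\n"


def compare_launch_styles():
    """Return the first sys.path entry seen by each launch style."""
    # Drop PYTHONSAFEPATH so the default sys.path behaviour is observed.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "folder")
        os.makedirs(pkg)
        with open(os.path.join(pkg, "__init__.py"), "w"):
            pass  # empty file marks "folder" as a package
        with open(os.path.join(pkg, "script.py"), "w") as fh:
            fh.write(SCRIPT_BODY)

        def run(args):
            # Launch a child interpreter from <root>, like typing the command there.
            done = subprocess.run(
                [sys.executable, *args],
                cwd=root, env=env, capture_output=True, text=True, check=True,
            )
            return done.stdout.strip()

        return {
            "root": os.path.realpath(root),
            # python folder/script.py -> the script's own directory comes first
            "as_file": run([os.path.join("folder", "script.py")]),
            # python -m folder.script -> the current working directory comes first
            "as_module": run(["-m", "folder.script"]),
        }
```

Calling compare_launch_styles() returns the directory each style places first, which is why imports relative to the 'current' folder only resolve with the -m form.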
The only thing I need to do is clone my experiment. Can you help me make this happen?
As written above, I did the right click clone, then I did right click enqueue.
The experiment reported 'running', and immediately after preparing the environment it reported 'completed', without actually running my code. Please look at the beginning of this thread for output logs and more details.
Attached are the agent log and the task log
Could it be the file you are trying to run is not in the repository?
It is unclear what file is missing. The only hint is "KeyError: '.'" and I am not sure what that refers to. All my code files are in the repository. Maybe the problem is with some installed package file?
Are you running inside a docker?
No, I am running inside a conda environment.
Any chance you can send the full log?
What I sent is the full agent daemon log. If you are asking for the console...
AgitatedDove14, I noticed that if I run the initial experiment with "python -m folder_name.script_name", then the script path contains the whole list of arguments, as you observed.
On the other hand, if I run the initial experiment with "python folder_name/script_name.py", then the script path contains only 'script_name.py'.
In both cases I cannot clone the experiment, with the same results as I reported in my initial message.
Thanks, I will give it a try
Yes, I am using the trains server. We never took the time to update it to clearml.
The version (according to pip freeze) is 0.16.3.
Woohoo! 🎉
The instructions at https://superuser.com/questions/278948/clear-cache-for-specific-domain-name-in-chrome/444881#444881 were not accurate, but they brought me close enough.
Here is the exact sequence of operations:
F12 --> Application tab --> Storage --> Clear site data --> refresh login screen
Thanks everyone for your help!
SuccessfulKoala55, here is the output of "docker inspect trains-webserver" (attached).
I can enter my user name, but even the button underneath it is blank (see below). Once I click it, the whole screen is blank, as in the 1st image that I sent.
Here is the developer tool Network screen capture after refreshing the page and trying to login.
Many errors ☹. Any idea what they mean?
Where do I see the agent print outs?
I am using an old version. It's a trains server of version 0.16.3.
Here is a snapshot of the blank screen:
I get an empty list for the 'XHR' filter.
I don't get the error any longer and the experiments get deleted as expected. So no complaints on my side...
I did not upgrade anything and did not do docker pull.
I am having a temporary network issue. Will send the output of "docker inspect" as soon as I can reconnect to my server.
I don't see a cache related to clearml:
(base) sigalr@rack-bermano-g03:~$ find . -name *cache* -not -name __pycache*
./.pycharm_helpers/python_stubs/cache
./.cache
./.conda/pkgs/cache
The 1st and last are obviously unrelated, and the middle one contains files related to python:
(base) sigalr@rack-bermano-g03:~$ ls .cache/
matplotlib  motd.legal-displayed  pip
OK, so ~/clearml.conf points to ~/.clearml/cache, and no such path exists.
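In case it helps others repeat the search, here is a rough Python equivalent of the find command used above, plus a check for the configured cache location. The function names are made up for illustration, and the default path ~/.clearml/cache is simply the one reported from ~/clearml.conf in this thread; other setups may point elsewhere.

```python
import fnmatch
import os


def find_cache_entries(root):
    """List files and directories under root whose name matches *cache*,
    skipping names that match __pycache* (same filters as the find call)."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if fnmatch.fnmatchcase(name, "*cache*") and not fnmatch.fnmatchcase(
                name, "__pycache*"
            ):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def cache_path_exists(path="~/.clearml/cache"):
    """Report whether the configured cache path exists (default is an
    assumption taken from the message above)."""
    return os.path.exists(os.path.expanduser(path))
```

Running find_cache_entries(os.path.expanduser("~")) walks the whole home directory, so it can take a while on large accounts.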
No. I put a break point in my python script, and examined os.environ. The only environment variable with 'CLEARML' in its name is CLEARML_PROC_MASTER_ID, whose value is '16188:' (maybe it means something to you?)
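The same check can be done without a debugger. This is a minimal sketch that filters the process environment by substring; the helper name vars_containing is hypothetical, and it only reports what is set in the current process at the moment it is called.

```python
import os


def vars_containing(needle="CLEARML", environ=None):
    """Return environment variables whose name contains `needle`."""
    env = os.environ if environ is None else environ
    return {key: value for key, value in sorted(env.items()) if needle in key}
```

Dropping print(vars_containing()) near the top of the script shows the same information as inspecting os.environ at a breakpoint.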
I clicked Fetch/XHR and got the following (after another reboot)
No other error messages but the dashboard screen is blank.
However, there is a breakthrough: I can run the dashboard from Safari (Mac browser). So the problem is only in Chrome.
My original trains server version was 0.14, if I remember correctly. Is there anywhere I can check it now that the upgrade has been done?
My new clearml server is 1.5. I get that from http://localhost:8080/version.json, but if there is somewhere else I should look, let me know.
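For reference, the version file can also be read from a script rather than the browser. This is a small sketch using only the standard library; the URL is the one quoted above, the helper name fetch_json is made up, and since the exact fields in version.json are not shown in this thread, the code just returns whatever JSON the server sends.

```python
import json
import urllib.request

# URL taken from the message above; adjust host and port for your own server.
VERSION_URL = "http://localhost:8080/version.json"


def fetch_json(url=VERSION_URL, timeout=5.0):
    """Download a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Example (needs a reachable server):
#   print(fetch_json())
```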