At a high level: I have a code base on Git that I can run locally, without the Trains interface. What I want to do is use the Trains interface to run experiments on that repo.
Please see the link to code I am trying to execute.
Hi LonelyMoth90, where exactly are you getting the error? Is it the trains-agent running your experiment?
Okay
Try to reset the experiment and resend it for execution, and let me know if you still get the error. If you do, could you send a screen grab of the Execution tab?
Trains supports either a git repo or standalone code (jupyter), but not a mixture of the two. This means that if you want to run the jupyter/colab, the cloning will have to be part of the notebook itself (as you already have it). That said, due to the way CoLab works, Trains will log your execution history (as opposed to the entire jupyter notebook), which means that if you want your entire colab to be logged, you have to run the entire thing once.
Not to worry, you can also run a single cell, then clone the experiment, edit the "uncommitted changes", and just copy-paste the python code there. From here you should be able to send it for execution using the trains-agent. Make sense?
Hmm, you are missing the entry point in the execution (script path).
Also as I mentioned you can either have a git repo or script in the uncommitted changes, but not both (if you have a git repo then the uncommitted changes are the git diff)
What I am unsure about is the GitHub setting. Is the branch setting OK? There are no other branches on the repo.
I will try again.
Understood.
Understood.
That's what I had done already.
Could you right-click on the failed experiment, select reset, and send it again for execution?
Could that error be a random network issue ?
(Basically this seems like a generic network error not actually related to the trains-agent)
Is the trains-agent running in docker mode or venv mode?
Yeah. Simple git clone on that repo works well.
Okay, this is indeed reported in the UI, but the trains-agent is running the experiment, and it seems to be failing to clone the repository in question.
Seems like an "https" error; git is actually failing to clone the repository: error: RPC failed; curl 56 GnuTLS recv error (-54): Error in the pull function.
Can you manually run the clone command on that machine? I would guess there is some kind of firewall sitting in the middle of the https connection, and that is causing git to fail.
Simple git clone on that repo works well
On the machine running the trains-agent ?
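For checking the clone on the machine that runs the trains-agent, something like the sketch below could be run there. It is only an illustration: the helper names (build_clone_command, try_clone) are made up for this example, the repo URL and branch have to be filled in by hand, and it does not reproduce how the trains-agent itself clones.

```python
import subprocess
import tempfile


def build_clone_command(repo_url, branch, dest):
    """Return the git command for a shallow clone of a single branch."""
    return ["git", "clone", "--depth", "1", "--branch", branch, repo_url, dest]


def try_clone(repo_url, branch, timeout=120):
    """Attempt the clone into a temporary directory.

    Returns (ok, message), where message is git's stderr or the
    exception text if git could not be started or timed out.
    """
    with tempfile.TemporaryDirectory() as tmp:
        cmd = build_clone_command(repo_url, branch, tmp)
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            return False, str(exc)
        return result.returncode == 0, result.stderr.strip()


# Example call (fill in the real repository URL and branch yourself):
# ok, message = try_clone("<your repo url>", "master")
# print(ok, message)
```

If this fails on the agent machine but works on your own machine, that points at the network path (proxy, firewall) rather than at Trains.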
BTW: the cloning error is actually caused by the wrong branch. If you take a look at your initial screenshot, you can see in the line before last branch='default', which I assume should be branch='master'.
(The error itself is still weird, but I assume that this is what git is returning)
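To double check which branches the remote actually has, a rough sketch like this might help. It parses the output of git ls-remote --heads; the function names and the sample output are made up for illustration, and the commit hash in the sample is a placeholder, not taken from your repo.

```python
import subprocess


def parse_heads(ls_remote_output):
    """Extract branch names from `git ls-remote --heads` output.

    Each line has the form "<commit sha><TAB>refs/heads/<branch>".
    """
    prefix = "refs/heads/"
    branches = []
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].startswith(prefix):
            branches.append(parts[1][len(prefix):])
    return branches


def remote_branches(repo_url, timeout=60):
    """Ask the remote for its branch list (needs git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", repo_url],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_heads(result.stdout)


# Hypothetical output for a repo that only has a master branch.
sample = "1111111111111111111111111111111111111111\trefs/heads/master\n"
branches = parse_heads(sample)
print("default" in branches)  # False
print("master" in branches)   # True
```

If 'default' is not in the list returned for your repo, setting the branch of the experiment to one that is listed (e.g. master) should be the thing to try.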
I am getting it on WebUI when running the experiment.
Okay, so you want to take the jupyter notebook (aka colab) and have that experiment show on Trains, then use the Trains UI to launch it remotely on one of the machines running the trains-agent. Is that correct?