The mongo 4.4 image does not launch a container if the data in the mongo directory is from a previous version. We should add a note about that to the documentation.
I had to manually create a dump for the mongo data and import it into 4.4. I was just referring to adding a note to the documentation for other users.
I cannot execute step 4 because I can't get past step 3. Does that make sense?
I'm using docker to run the experiment. Could it be that the config in the docker container doesn't have the git credentials?
OK, so the Git credentials are present in two locations: 1) outside the agent config and 2) inside it. I updated the credentials in both locations and now I'm seeing agent.git_user = <username> in the dump, but I still have the same issue.
# Set GIT user/pass credentials
leave blank for GIT SSH credentials ...
That makes sense. The configuration file is located at ~/trains.conf which I believe is the default location.
No I can't see my username printed out in the dump
Yes, 'training' is my main code. You can think of it as launching a job (training or inference). My main code launches multiple jobs using multiprocessing. Each job is a separate task for clearml that gets logged. Does that make sense?
I get the same error even when I follow the instructions to the letter.
Here is one line from the mongo docker output: "This version of MongoDB is too recent to start up on the existing data files. Try MongoDB 4.2 or earlier."
Hi AgitatedDove14 , yes, I was able to change the color from the UI. But this may be less than ideal for the following use case.
A model is an ensemble of say 10 models. Each member of the ensemble generates two train-validation curves. So for 1 model, I will have 20 plots. There are two problems with the current setup:
Manually changing the colors of all the plots is not feasible.
The default color scheme is not consistent and changes randomly with every run.
It would be nice if I can control t...
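One way to get repeatable colors regardless of UI defaults is to derive each curve's style from its position in the ensemble before the plot is reported. This is a minimal sketch, not a ClearML feature: the helper name `curve_style` is hypothetical, the palette is the matplotlib "tab10" hex list, and how the returned style is handed to the plotting library is left open.

```python
# Fixed palette (matplotlib "tab10" hex values), one entry per ensemble member.
PALETTE = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def curve_style(member_index, split):
    """Return a deterministic style for one curve of one ensemble member.

    The color depends only on the member index, so it is identical on every
    run. Train and validation curves share the member's color and differ
    only in line style.
    """
    color = PALETTE[member_index % len(PALETTE)]
    dash = "solid" if split == "train" else "dash"
    return {"color": color, "dash": dash}


if __name__ == "__main__":
    for member in range(3):
        for split in ("train", "validation"):
            print(member, split, curve_style(member, split))
```

With 10 members and 2 curves each, this yields 20 styles that stay the same from run to run.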
I come across many small questions like these which may have been answered earlier, but they are hard to find in Slack messages. Is it better to post such questions on Stack Overflow so they benefit everybody? I might post the link here.
The docker container in step 3 does not run because of the incompatibility
Hi AgitatedDove14, thanks for checking. I would like to compare several experiments (plots, hyperparams, etc.), so it would have been nice to do it in the UI. I have to search through the long list right now. With Python, I can only do a few of the things that I intend to do. Is this something that might be added in the future?
There was some complication during the upgrade so I had to resort to the manual process.
I have now been able to upgrade by dumping the mongodb data and restoring it independently.
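The manual dump-and-restore route could be scripted roughly as follows. This is a sketch under assumptions: the old and new MongoDB instances are reachable on the hosts and ports passed in (the values in the usage note are hypothetical), and the standard `mongodump` and `mongorestore` tools are on the PATH. It is not the procedure from the ClearML migration guide.

```python
import subprocess


def mongodump_cmd(host, port, out_dir):
    """Build the command that dumps all databases from the old server."""
    return ["mongodump", "--host", host, "--port", str(port), "--out", out_dir]


def mongorestore_cmd(host, port, dump_dir):
    """Build the command that loads a dump directory into the new server."""
    return ["mongorestore", "--host", host, "--port", str(port), dump_dir]


def dump_and_restore(old, new, dump_dir, run=subprocess.run):
    """Dump from `old` (host, port) and restore into `new` (host, port).

    `run` is injectable so the sequence can be checked without a live server.
    """
    run(mongodump_cmd(old[0], old[1], dump_dir), check=True)
    run(mongorestore_cmd(new[0], new[1], dump_dir), check=True)
```

For example, `dump_and_restore(("localhost", 27017), ("localhost", 27018), "/tmp/mongo_dump")` would dump from an old server on port 27017 and restore into a 4.4 server on port 27018; both ports and the dump path are placeholders.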
Hi AppetizingMouse58
Yes, I tried to perform steps 3-10, however step 3 raised an error because data files for mongo were incompatible between 3.6 and >4.0
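The incompatibility is consistent with MongoDB's general rule that data files can only be upgraded in place one release series at a time. A small sketch of that rule, assuming only the series relevant here (3.6 through 4.4); the helper name `upgrade_path` is hypothetical:

```python
# Release series in order; an in-place upgrade may only move one step forward.
RELEASE_SERIES = ["3.6", "4.0", "4.2", "4.4"]


def upgrade_path(current, target):
    """List the series that must be started, in order, to reach `target`."""
    start = RELEASE_SERIES.index(current)
    end = RELEASE_SERIES.index(target)
    if end < start:
        raise ValueError("in-place downgrade is not covered by this sketch")
    return RELEASE_SERIES[start + 1:end + 1]


if __name__ == "__main__":
    # Data written by 3.6 cannot be opened directly by 4.4.
    print(upgrade_path("3.6", "4.4"))
```

Starting a 4.4 container directly on 3.6 data skips two of these steps, which is when the "too recent to start up on the existing data files" message appears. A dump and restore avoids the in-place path altogether.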
Hi AgitatedDove14 , I'll wait for you to reply on Github before I add my comments to these points.
Steps 1 and 2 on this page, https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_mongo44_migration/, say to back up /opt/clearml/data/mongo and uncompress it into /opt/clearml/data/mongo_4. Isn't that just copying the old data files?
You will need to have multiple trains-agents, but they will be sharing the same queue (i.e. pulling jobs from the same queue the HPO process is pushing to).
Make sense?
Hmm. So say I have a parameter NUM_PARALLEL_EXECUTIONS, I can programmatically launch that many trains-agents for every optimization run?!
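Launching the agents programmatically might look something like the sketch below. It assumes the `trains-agent` CLI is installed and that every agent listens on the same queue the HPO process pushes to. The queue name and the helper names are hypothetical, and whether several agents should share one machine depends on the available resources.

```python
import subprocess

NUM_PARALLEL_EXECUTIONS = 4  # hypothetical value


def agent_cmd(queue, use_docker=False):
    """Build the command line for one agent pulling from `queue`."""
    cmd = ["trains-agent", "daemon", "--queue", queue]
    if use_docker:
        cmd.append("--docker")
    return cmd


def launch_agents(count, queue, use_docker=False, popen=subprocess.Popen):
    """Start `count` agents on the same queue and return their handles.

    `popen` is injectable so the command lines can be inspected without
    actually starting any agent.
    """
    return [popen(agent_cmd(queue, use_docker)) for _ in range(count)]
```

Calling `launch_agents(NUM_PARALLEL_EXECUTIONS, "hpo_queue")` would start one agent process per parallel execution, all pulling from the placeholder queue `hpo_queue`.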
SuccessfulKoala55 Yes, I am using the --docker flag.
You are right about the Keyring. Once I make sure credentials are stored in a secure way, it works as expected. Thanks :)
Yes, I am using Pool. Here is what I think is happening: clearml launches a subprocess, which I assume is a daemonic process. That process in turn launches a subprocess for training, which causes the error I mentioned.
The second subprocess is by design. It becomes the primary process when clearml does not use multiprocessing. I hope I'm not confusing you further
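The behaviour described above can be reproduced without clearml. The sketch assumes the error in question is Python's "daemonic processes are not allowed to have children" assertion, which is not quoted in this thread, and it uses the "fork" start method, so it is Unix-only. All function names are hypothetical stand-ins.

```python
import multiprocessing as mp

# Use an explicit fork context so the example behaves the same regardless
# of the platform's default start method (Unix-only).
CTX = mp.get_context("fork")


def _training_job():
    """Stand-in for the real training subprocess."""
    pass


def _try_to_start_child(conn):
    """Run inside the intermediate process and report what happened."""
    try:
        child = CTX.Process(target=_training_job)
        child.start()
        child.join()
        conn.send("ok")
    except AssertionError as exc:
        # Raised by Process.start() when the current process is daemonic.
        conn.send(str(exc))
    finally:
        conn.close()


def start_child_from(daemon):
    """Start an intermediate process (daemonic or not) that starts a child."""
    parent_conn, child_conn = CTX.Pipe()
    proc = CTX.Process(target=_try_to_start_child, args=(child_conn,), daemon=daemon)
    proc.start()
    result = parent_conn.recv()
    proc.join()
    return result


if __name__ == "__main__":
    print("non-daemonic parent:", start_child_from(daemon=False))
    print("daemonic parent:", start_child_from(daemon=True))
```

`multiprocessing.Pool` workers are daemonic, so a job running in a Pool worker hits the same restriction when it tries to start its own subprocess.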