That makes sense. The configuration file is located at ~/trains.conf, which I believe is the default location.
No, I can't see my username printed out in the dump.
You will need to have multiple trains-agents, but they will be sharing the same queue (i.e. pulling jobs from the same queue the HPO process is pushing to).
Make sense?
Hmm. So say I have a parameter NUM_PARALLEL_EXECUTIONS, I can programmatically launch that many trains-agents for every optimization run?!
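A minimal sketch of what launching several agents programmatically could look like, assuming each agent is started with the `trains-agent daemon --queue <name>` command and that all agents pull from the same queue. The queue name `hpo_queue` and the helper names are hypothetical, chosen only for illustration. The function defaults to a dry run so the commands can be inspected before anything is started.

```python
import subprocess

NUM_PARALLEL_EXECUTIONS = 3  # parameter mentioned in the message above


def agent_command(queue):
    """Build the command line for one agent listening on `queue`."""
    return ["trains-agent", "daemon", "--queue", queue]


def launch_agents(num_agents, queue, dry_run=True):
    """Start `num_agents` agents that all pull from the same queue.

    With dry_run=True (the default) nothing is started and the list of
    commands is returned. With dry_run=False a list of subprocess.Popen
    handles is returned, one per agent.
    """
    commands = [agent_command(queue) for _ in range(num_agents)]
    if dry_run:
        return commands
    return [subprocess.Popen(cmd) for cmd in commands]


if __name__ == "__main__":
    # "hpo_queue" is a hypothetical queue name; use the queue the HPO
    # process pushes its jobs to.
    for cmd in launch_agents(NUM_PARALLEL_EXECUTIONS, "hpo_queue"):
        print(" ".join(cmd))
```

Whether one machine can usefully host that many agents depends on the resources each job needs, so the number of agents is left as a parameter.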
Yes, the 'training' is my main code. You can think of it as launching a job (training or inference). My main code launches multiple jobs using multiprocessing. Each job is a separate task for clearml that gets logged. Does that make sense?
fatal: could not read Username for '': terminal prompts disabled
error: Could not fetch origin
Why is trains-agent trying to read from the terminal prompt instead of trains.conf?
SuccessfulKoala55 Yes, I am using the --docker flag.
You are right about the Keyring. Once I make sure credentials are stored in a secure way, it works as expected. Thanks :)
I posted the question on Stack Overflow with the answer: https://stackoverflow.com/questions/64636294/trains-reusing-previous-task-id/64636297#64636297 :)
The second subprocess is by design. It becomes the primary process when clearml does not use multiprocessing. I hope I'm not confusing you further
I come across many small questions like these which may have been answered earlier, but they are hard to find in Slack messages. Is it better to post such questions on Stack Overflow so they benefit everybody? I might post the link here.
Hi AgitatedDove14 , yes, I was able to change the color from the UI. But this may be less than ideal for the following use case.
A model is an ensemble of say 10 models. Each member of the ensemble generates two train-validation curves. So for 1 model, I will have 20 plots. There are two problems with the current setup:
- Manually changing the colors of all the plots is not feasible
- The default color scheme is not consistent and changes randomly with every run
It would be nice if I can control t...
Hi AgitatedDove14 Thanks, I'll check these out.
What is the exact use case you have in mind?
I want to store some additional data that is not relevant to training a model. For example, store inference results, explanations, etc and then use them in a different process. I currently use separate database for this.
Btw, I had been busy with another project and hadn't logged in here for some time. I see that you guys have made a lot of progress in the last two months! I'm excited to di...
Ok, I will look into artifacts. However, I will probably need high performance query functionality. For example, say I have a model and hundreds of thousands of inference records for that model. I want to be able to efficiently query that. My guess is that wouldn't be possible with artifacts. But that should be possible with Task.get_tasks.
Yes, I am using Pool. Here is what I think is happening. clearml launches a subprocess, which I assume is a daemonic process. That process in turn launches a subprocess for training, which causes the error I mentioned.
Steps 1 and 2 basically copy the mongo 3.6 data into a new dir, mongo_4, but the mongo image of version 4.4 does not accept that data. So I had to perform the following steps:
- Launch a docker container with mongo=3.6
- Dump the data using mongodump
- Launch a docker container with mongo=4.4 and an empty mongo_4 data dir
- Restore the dumped data using mongorestore
This made sure the data is now compatible with mongo 4.0 or greater.
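The dump-and-restore procedure described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the container names (`mongo36`, `mongo44`), the `/dump` mount point, and the three host directories are hypothetical, and the actual paths depend on how the server was installed. It uses the standard `mongodump` and `mongorestore` tools shipped in the official mongo images, and defaults to a dry run that only returns the commands.

```python
import subprocess


def migration_commands(old_data_dir, new_data_dir, dump_dir):
    """Return the docker commands for a 3.6 -> 4.4 dump-and-restore."""
    return [
        # 1. Start mongo 3.6 on the existing data, with a shared dump dir.
        ["docker", "run", "-d", "--name", "mongo36",
         "-v", f"{old_data_dir}:/data/db",
         "-v", f"{dump_dir}:/dump", "mongo:3.6"],
        # 2. Dump every database into the shared dump dir.
        ["docker", "exec", "mongo36", "mongodump", "--out", "/dump"],
        # 3. Remove the old container once the dump is done.
        ["docker", "rm", "-f", "mongo36"],
        # 4. Start mongo 4.4 on an empty data dir, same dump dir mounted.
        ["docker", "run", "-d", "--name", "mongo44",
         "-v", f"{new_data_dir}:/data/db",
         "-v", f"{dump_dir}:/dump", "mongo:4.4"],
        # 5. Restore the dump into the new server.
        ["docker", "exec", "mongo44", "mongorestore", "/dump"],
    ]


def run_migration(old_data_dir, new_data_dir, dump_dir, dry_run=True):
    """Run the commands in order, or just return them when dry_run=True."""
    commands = migration_commands(old_data_dir, new_data_dir, dump_dir)
    if not dry_run:
        for cmd in commands:
            # Note: mongod needs a moment to start after `docker run -d`;
            # a real script should wait before the following `docker exec`.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # All three paths are hypothetical placeholders.
    for cmd in run_migration("/path/to/mongo", "/path/to/mongo_4", "/path/to/dump"):
        print(" ".join(cmd))
```

Taking a backup of the original data directory before running anything like this is advisable, since the old data is the only copy until the restore has been verified.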
There was some complication during the upgrade so I had to resort to the manual process.
I have now been able to upgrade by dumping the mongodb data and restoring it independently.
I cannot execute step 4 because I can't get past step 3. Does that make sense?
I'm getting the same error when I followed the instructions to the letter.
Here is one line from the mongo docker output: "This version of MongoDB is too recent to start up on the existing data files. Try MongoDB 4.2 or earlier."
I was getting the error in step number 3
Hi AgitatedDove14 Thanks for checking. I would like to compare several experiments (plots, hyperparams, etc), so it would have been nice to do it in the UI. I have to search through the long list right now. With python, I can only do a few of the things that I intend to do. Is this something that might be added in the future?
Got it. That makes sense. Thanks!
The docker container in step 3 does not run because of the incompatibility