I mean, in the trains GitHub repository there are examples, and when I deploy the server these examples exist on the server with draft status. I want to add my own examples in the same way.
I couldn't explain it well. Assume I have a huge GitHub repository with 100 ML projects and I want to see all of them in the trains server. Should I add "Task.init()" to each one and run them all in order to see them in the server, or is there another way to see all of them in the server without running them all?
fatal: destination path '/home/dogukan/.trains/vcs-cache/pre-post-script-repo.git.35f82b395021c8e6afef186fafa662cc/pre-post-script-repo.git' already exists and is not an empty directory.
Actually, the error occurs because of this line. I first run my experiment in docker, which creates a folder in the vcs-cache that only root can access. Then I run my experiment in a venv, and it cannot access this folder or create a new one because the name is the same.
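A minimal sketch of how the ownership problem described above could be checked from Python, assuming a Unix-like system. The helper name `describe_access` is hypothetical, and a scratch directory stands in for the real vcs-cache entry; nothing here is part of trains or trains-agent.

```python
import os
import tempfile


def describe_access(path):
    """Report who owns `path` and whether the current user can write into it."""
    info = os.stat(path)
    return {
        "owner_uid": info.st_uid,
        "current_uid": os.geteuid(),
        "owned_by_root": info.st_uid == 0,
        # Creating entries inside a directory needs write and execute permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }


# A scratch directory stands in for a vcs-cache folder (assumption for the demo).
with tempfile.TemporaryDirectory() as scratch:
    report = describe_access(scratch)
    print(report)
```

Pointing the helper at the folder named in the git error would show whether it is owned by root and not writable by the user running the venv agent.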
Hi TimelyPenguin76, my version is 0.15.2rc0 and I am running with this command: trains-agent daemon --detached --gpus 0 --queue default --docker nvidia/cuda --foreground
Okay, now it is 2.9. I misunderstood you.
print(task.data.hyperparams)
I tried this one
AttributeError: 'Task' object has no attribute 'hyperparams'
Okay, I got it now. Thanks for the help 🙂
In older versions it returns: "parameters": { "batch_size": "64", "test_batch_size": "1000", "epochs": "1", "lr": "0.01", "momentum": "0.5", "no_cuda": "False", "seed": "1", "log_interval": "10", "save_model": "True" }
But now it just returns an empty dict. I think it is because of the separation of the hyper-parameters section in the UI.
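The parameters in the older output above all come back as strings. A minimal sketch of turning them into typed Python values, using only the dictionary quoted in the message; the helper name `cast_value` is hypothetical and this is not something trains does on its own.

```python
import ast

# The dictionary exactly as quoted in the message above.
parameters = {
    "batch_size": "64",
    "test_batch_size": "1000",
    "epochs": "1",
    "lr": "0.01",
    "momentum": "0.5",
    "no_cuda": "False",
    "seed": "1",
    "log_interval": "10",
    "save_model": "True",
}


def cast_value(text):
    """Parse a string as a Python literal, falling back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


typed = {name: cast_value(value) for name, value in parameters.items()}
print(typed)
```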
Yes, it is fixed if I install trains from the GitHub repo directly without using my local version. But is there another way to avoid it? I might change the code for my personal use, which is why I want to install trains from my local copy.
Actually, I want to retrieve metrics from code, and I thought these metrics might be stored somewhere in local folders so that I can access them via code. You know, the log file is created and stored in the /tmp folder.
Yes, I mean trains-agent. Actually, I am using 0.15.2rc0, but from local files: I cloned the trains and trains-agent repos and installed them. Their versions are 0.15.2rc0.
-e git+
torch == 1.5.1
torchvision == 0.6.1
trains == 0.15.2rc0
Actually, the package version is also written. However, because of the git ref, trains-agent fails.
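A minimal sketch of separating editable or git-based entries from plain version pins in a requirements list, which makes it easier to see which line the agent is choking on. The function name `split_requirements` and the example URL are hypothetical; the truncated `-e git+` line from the message is not reconstructed here.

```python
def split_requirements(lines):
    """Split requirement lines into plain pins and editable/VCS references."""
    pinned, vcs = [], []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("-e ") or "git+" in line:
            vcs.append(line)
        else:
            pinned.append(line)
    return pinned, vcs


example = [
    # Hypothetical URL, used only to illustrate an editable git reference.
    "-e git+https://example.invalid/hypothetical/repo.git#egg=demo",
    "torch == 1.5.1",
    "torchvision == 0.6.1",
    "trains == 0.15.2rc0",
]
pinned, vcs = split_requirements(example)
print("pinned:", pinned)
print("vcs:", vcs)
```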
It worked when I changed python3 -m trains-agent --help
to trains-agent --help
I was using APIClient inside trains-agent, and in trains-agent there is no v2.9. I think that's why I could not get hyperparams.
TimelyPenguin76 It is 2.1, but the 2.9 files also exist. How can I update it?
Thanks a lot for the last information. It worked 😄
I think you can reproduce it by cloning the trains repository, then running pip install -e ~/trains
and then running one of the trains examples with python3 toy_base_task.py
Then you should see this odd bug.
AgitatedDove14 I might have found something to fix the issue, but I am not sure. In the trains-agent worker.py script log, the command is written like this: python3 -u -m trains_agent execute --disable-monitoring --id 9fe6d610a2b946379255b0fc25b5f9fd')
so at the end there is an extra " ' ". When I run this command in my local environment by writing python3 -u -m trains_agent execute --disable-monitoring --id 9fe6d610a2b946379255b0fc25b5f9fd
it works and runs the code. However, if I write ` pytho...
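A minimal sketch of how a shell-style tokenizer treats the logged command with the trailing `')` compared with the cleaned one. It uses only the standard-library `shlex` module and the command quoted above. Whether this is what actually breaks the agent is an assumption; the extra characters may just be debris from how the log line was printed.

```python
import shlex

logged = (
    "python3 -u -m trains_agent execute --disable-monitoring "
    "--id 9fe6d610a2b946379255b0fc25b5f9fd')"
)

# Remove the stray quote and parenthesis from the end of the logged string.
cleaned = logged.rstrip("')")

try:
    shlex.split(logged)
    logged_parses = True
except ValueError as err:
    # An unbalanced quote makes POSIX-style tokenizing fail.
    logged_parses = False
    print("logged command does not tokenize:", err)

tokens = shlex.split(cleaned)
print(tokens)
```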
If I delete this folder in the ~/.trains/vcs-cache directory, it fixes the problem.
AgitatedDove14, sorry for my late response. I will try it; it might work. Thanks.
AgitatedDove14 Is it possible to delete a specific worker? I mean, I have 10 workers and I want to delete one of them.
Oops, you misunderstood me. I just want to remove a specific agent. For example, I had 3 agents on the same queue with different worker names. If I remove them by applying what you said in this thread, all of them will be removed. However, I just want to remove one of them.
Yes, I mean removing an agent from the server.