Answered
Hi, I'm trying to install a new server, this is a fresh Ubuntu 18.04 install. When I try to run the docker-compose up command I get error messages like this one:

Hi, I'm trying to install a new server on a fresh Ubuntu 18.04 install. When I try to run the docker-compose up command, I get error messages like this one:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='elasticsearch', port=9200): Max retries exceeded with url: /_template/events_plot (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9897136c50>: Failed to establish a new connection: [Errno 111] Connection refused',))
Any tips?

  
  
Posted 3 years ago

Answers 27


Which changes do I need to make to get Elasticsearch to work?

  
  
Posted 3 years ago

CourageousLizard33 specifically section (4) is the issue (and it's related to any elastic docker, nothing specific to trains-server)
echo "vm.max_map_count=262144" > /tmp/99-trains.conf
sudo mv /tmp/99-trains.conf /etc/sysctl.d/99-trains.conf
sudo sysctl -w vm.max_map_count=262144
sudo service docker restart
Did you try the above, and are you still getting the same error?
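To confirm the setting took effect, here is a minimal sketch in Python. It assumes a Linux host where the kernel exposes the value at /proc/sys/vm/max_map_count; the helper names `read_max_map_count` and `is_sufficient` are illustrative and not part of trains-server.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # value set by the sysctl commands above
PROC_PATH = Path("/proc/sys/vm/max_map_count")  # standard Linux location


def read_max_map_count(path=PROC_PATH):
    """Return the current vm.max_map_count as an int, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def is_sufficient(value, required=REQUIRED_MAX_MAP_COUNT):
    """True when the value was read and is at least the required minimum."""
    return value is not None and value >= required


if __name__ == "__main__":
    current = read_max_map_count()
    if is_sufficient(current):
        print(f"vm.max_map_count={current}: OK")
    else:
        print(f"vm.max_map_count={current}: below {REQUIRED_MAX_MAP_COUNT}")
```

Run it on the host (or the VM that hosts docker), not inside a container, since that is where the sysctl value was changed.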

  
  
Posted 3 years ago

It worked! It took me a while to get the docker "user" to pick up trains.conf ...

  
  
Posted 3 years ago

Hi, yes, it's a docker on a VM

  
  
Posted 3 years ago

OK, what solved it was increasing the RAM of the VM. Do you specify minimum requirements anywhere?

  
  
Posted 3 years ago

Probably less secure though :)

  
  
Posted 3 years ago

I did, but I will try again

  
  
Posted 3 years ago

There will be a tr, but there will be a separate graph for top1 and loss. On your system they then go into the same graph. Since loss and train accuracy usually have very different value ranges, it makes it impossible to see the loss graph without starting to manipulate it.

  
  
Posted 3 years ago

SteadyFox10 I suspect you are correct 🙂
CourageousLizard33 see also section (4) here:
https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md#launching-the-trains-server-docker-in-linux-or-macos

  
  
Posted 3 years ago

CourageousLizard33 so you have a Linux server running Ubuntu VM with Docker inside?
I would imagine that you could just run the docker on the host machine, no?
BTW, I think 8GB is a good recommendation for a VM; it's reasonable enough to start with. I'll make sure we add it to the docs.

  
  
Posted 3 years ago

CourageousLizard33 VM?! I thought we were talking about a fresh install on Ubuntu 18.04?!
Is the Ubuntu in a VM? If so, I'm pretty sure 8GB will do, maybe less, but I haven't checked.
How much did you end up giving it?

  
  
Posted 3 years ago

:) Yes, on your gateway/firewall set http://demoapi.trains.allegro.ai to 127.0.0.1. That's always good practice ;)

  
  
Posted 3 years ago

I have actually already tried to follow those instructions, after a fresh install of the OS

  
  
Posted 3 years ago

There is a funny issue with trains. One of the great features in our book is the fact that you pick up TensorBoard logs automatically, but you group them in the opposite direction, i.e. if I have:

  
  
Posted 3 years ago

CourageousLizard33 if the two series are on the same graph, just click on a series in the legend to enable/disable it, and the scale will adjust automatically.
Regarding grouping, this is a feature that can be turned off. The idea is that we split the tag into title/series, so if tags share the same prefix the TF scalars are grouped on the same graph; otherwise they end up on graphs with different titles. That said, you can force it to have one series per graph, like in TB. Makes sense?
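As a rough illustration of the title/series idea, here is a sketch in Python. It is not the actual trains implementation: the separator trains uses is not stated in this thread, so `sep` is a parameter, and `split_tag` and `group_by_title` are hypothetical helper names.

```python
from collections import defaultdict


def split_tag(tag, sep="/"):
    """Split a scalar tag into (title, series) at the first separator.

    A tag without the separator becomes its own title and series.
    """
    title, found, series = tag.partition(sep)
    return (title, series) if found else (tag, tag)


def group_by_title(tags, sep="/"):
    """Map each title to the list of series that would share its graph."""
    graphs = defaultdict(list)
    for tag in tags:
        title, series = split_tag(tag, sep)
        graphs[title].append(series)
    return dict(graphs)


if __name__ == "__main__":
    # Tags in the style used in this thread, split on "." for illustration.
    print(group_by_title(["tr.top1", "tr.loss", "val.top1"], sep="."))
```

With these inputs, top1 and loss both land under the title tr, which is the behaviour described above: series sharing a prefix are drawn on one graph.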

  
  
Posted 3 years ago

Thanks! That's great. Also, can I somehow make sure that, no matter what, results are not uploaded to the public demo server?

  
  
Posted 3 years ago

Hmm CourageousLizard33, it seems you stumbled on a weird bug.
This piece of code only tries to get the username of the current UID. Since you are running inside a docker and probably set the environment UID, there is no "actual" user with that UID in /etc/passwd, so it cannot resolve it.
I'm attaching a quick fix, please let me know if it solved the problem.
I'd like to make sure we have it in the next RC as soon as possible.
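The attached fix itself is not shown in this thread. Purely as a sketch of how such a fallback could look, assuming the goal is just to avoid crashing when the UID has no passwd entry (`safe_username` is a hypothetical helper, not a trains function):

```python
import getpass
import os


def safe_username(getuser=getpass.getuser):
    """Return the login name, or a UID-based placeholder if lookup fails."""
    try:
        return getuser()
    except (KeyError, OSError, ImportError):
        # KeyError: raised via pwd.getpwuid() on older Pythons when the UID
        # has no /etc/passwd entry. OSError: raised by newer Pythons.
        # ImportError: the pwd module is unavailable on some platforms.
        uid = os.getuid() if hasattr(os, "getuid") else "unknown"
        return f"uid-{uid}"


if __name__ == "__main__":
    print(safe_username())
```

Note that `getpass.getuser()` checks the LOGNAME, USER, LNAME and USERNAME environment variables before consulting the password database, so setting one of them in the container is another possible workaround.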

  
  
Posted 3 years ago

  File "/opt/conda/lib/python3.6/site-packages/trains/task.py", line 277, in init
    not auto_connect_frameworks.get('detect_repository', True)) else True
  File "/opt/conda/lib/python3.6/site-packages/trains/task.py", line 1163, in _create_dev_task
    log_to_backend=True,
  File "/opt/conda/lib/python3.6/site-packages/trains/task.py", line 111, in __init__
    super(Task, self).__init__(**kwargs)
  File "/opt/conda/lib/python3.6/site-packages/trains/backend_interface/task/task.py", line 108, in __init__
    self.id = self._auto_generate(project_name=project_name, task_name=task_name, task_type=task_type)
  File "/opt/conda/lib/python3.6/site-packages/trains/backend_interface/task/task.py", line 251, in _auto_generate
    created_msg = make_message('Auto-generated at %(time)s by %(user)s@%(host)s')
  File "/opt/conda/lib/python3.6/site-packages/trains/backend_interface/util.py", line 28, in make_message
    user=getpass.getuser(),
  File "/opt/conda/lib/python3.6/getpass.py", line 169, in getuser
    return pwd.getpwuid(os.getuid())[0]
KeyError: 'getpwuid(): uid not found: 10001'

  
  
Posted 3 years ago

CourageousLizard33 Are you using docker-compose to set up the trains-server?

  
  
Posted 3 years ago

No, we have a VMware server, and on it we run a bunch of servers. While I have your attention, I'm running into a new issue: most of our training sessions run from inside a docker. When I try to run such a training session, I get an error about the user:

  
  
Posted 3 years ago

It is a VM running Ubuntu 18.04. Yes, I ended up giving it 8 GB, which seemed to solve the issue. Pretty common to run servers on VMs these days ... :)

  
  
Posted 3 years ago

Where are you seeing this message?

  
  
Posted 3 years ago

logger.log_metric('tr.top1', to_python_float(prec1))

  
  
Posted 3 years ago

thanks

  
  
Posted 3 years ago

I need to run the docker with my UID, which is 10001, but the docker does not know or have that user. Why does it need it? To find the trains.conf? Is there any way to pass it manually?

  
  
Posted 3 years ago

You need to change a setting on your host machine to get Elasticsearch working.

  
  
Posted 3 years ago