Four instance cluster stuck on final deployment

Question asked by Alejandro Lineiro Employee on Jun 1, 2017
Latest reply on Jun 5, 2017 by Jared Cohen

Hello everyone,

 

We are trying to deploy a four-instance cluster, but the deployment of the instances does not progress beyond this point:

 

[Screenshot: image001.png]

 

We have tried uninstalling and reinstalling the instances, and leaving the deployment process overnight just in case it was a matter of waiting, but the problem persists.

 

We also attempted a single-instance installation, and it got stuck at exactly the same point in the process.

 

Installation progressed correctly up until that point. All the instances appeared eligible to form the cluster.

 

The instances are running on Red Hat. The installed Docker version is 17.05.0-ce.

 

firewalld and SELinux are both disabled.

 

[root@opecs2hcip04 bin]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

[root@opecs2hcip04 bin]# getenforce
Disabled

 

The docker configuration was completed without errors:

 

[root@opecs2hcip04 bin]# docker ps
CONTAINER ID        IMAGE                                        COMMAND                  CREATED             STATUS              PORTS               NAMES
b6d4366c63f8        com.hds.ensemble/logstash:23.0.0.1537        "/bin/sh -c /opt/h..."   3 minutes ago       Up 3 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.6d987db1-b692-45cb-9c41-3026f0c4cf5f
f4f47dc59cbf        com.hds.ensemble/solr:23.0.0.1537            "/bin/sh -c /opt/h..."   5 minutes ago       Up 5 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.ef31debd-cb1a-47e6-8941-f522544fd5e3
ca89cd8c99cf        com.hds.ensemble/elasticsearch:23.0.0.1537   "/docker-entrypoin..."   5 minutes ago       Up 5 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.1d8841db-c2b4-4f11-a7f1-4cd0b5a9bcc4
fe4f9c863bd1        com.hds.ensemble/kafka:23.0.0.1537           "/bin/sh -c /opt/h..."   5 minutes ago       Up 5 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.1609fbe2-81e6-4238-9c8b-1802bd0e7e15
f894d0411ec1        com.hds.ensemble/cassandra:23.0.0.1537       "/bin/sh -c /opt/h..."   5 minutes ago       Up 5 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.9835b191-01ea-431f-a9b1-129bd4fd27eb
b0513b8e5144        com.hds.ensemble/tomcat:23.0.0.1537          "/bin/sh -c /opt/h..."   5 minutes ago       Up 5 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.9c011228-85ba-4393-ad4d-f2d92a690e8e
83d0e9a6a110        com.hds.ensemble/chronos:23.0.0.1537         "/bin/sh -c /opt/h..."   6 minutes ago       Up 6 minutes                            mesos-ecec30bf-b095-4bdc-848b-fbcff3de8ebc-S0.62bb15be-5e67-4770-92ba-5469a2302700
b728f19f3b77        com.hds.ensemble/tomcat:23.0.0.1537          "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 14 minutes                           admin-app
e39f00692a91        com.hds.ensemble/tomcat:23.0.0.1537          "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 14 minutes                           sentinel-service
08bef87e0feb        com.hds.ensemble/haproxy:23.0.0.1537         "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 14 minutes                           haproxy-service
ddb73843861f        com.hds.ensemble/marathon:23.0.0.1537        "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 14 minutes                           marathon-service
10c71069e2ec        com.hds.ensemble/mesos:23.0.0.1537           "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 15 minutes                           mesos-slave-service
f64fcf59bf18        com.hds.ensemble/mesos:23.0.0.1537           "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 15 minutes                           mesos-master-service
c9ef6492f561        com.hds.ensemble/zookeeper:23.0.0.1537       "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 15 minutes                           zookeeper-service
59f25c395781        com.hds.ensemble/watchdog:23.0.0.1537        "/opt/hci/1.1.2/sv..."   15 minutes ago      Up 15 minutes                           watchdog-service

 

[root@opecs2hcip04 bin]# docker info
Containers: 16
 Running: 15
 Paused: 0
 Stopped: 1
Images: 21
Server Version: 17.05.0-ce
Storage Driver: devicemapper
 Pool Name: docker-thinpool
 Pool Blocksize: 524.3kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 3.911GB
 Data Space Total: 53.69GB
 Data Space Available: 49.78GB
 Metadata Space Used: 2.335MB
 Metadata Space Total: 1.07GB
 Metadata Space Available: 1.067GB
 Thin Pool Minimum Free Space: 5.369GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
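
One detail in the output above: docker info reports 16 containers with 15 running and 1 stopped, while docker ps lists only the 15 running ones. Whether the stopped container has anything to do with the stall is unknown. The following is only a minimal sketch for identifying it; the helper names not_running_count and list_not_up are hypothetical and are not part of the product or of Docker.

```shell
#!/bin/sh
# Hypothetical helpers (not part of HCI or Docker) for spotting containers
# that are not running. Both read text on stdin, so they can be fed live
# output or output saved from another node.

# Reads `docker info` output and prints total containers minus running ones.
not_running_count() {
  awk '
    /^[ \t]*Containers:/ { total = $NF }
    /^[ \t]*Running:/    { running = $NF }
    END { print total - running }
  '
}

# Reads lines of the form "<name><TAB><status>" and prints the names whose
# status does not start with "Up".
list_not_up() {
  awk -F'\t' '$2 !~ /^Up/ { print $1 }'
}

# Possible usage on a node (assumes the docker CLI is on the PATH):
#   docker info 2>/dev/null | not_running_count
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | list_not_up
#   docker logs --tail 50 <container-name>
```

The -a flag makes docker ps include stopped containers, and docker logs on the name printed by list_not_up would show the last output of that container before it exited.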

 

 

We have attached the logs of the four nodes involved, as well as the logs of the single instance deployment attempt.

 

If there is any other info necessary to diagnose the problem please let us know.

 

Thank you for your help!
