1 Install standalone maddash

This document shows the steps to install a standalone maddash server, for example on a VM.

NOTE: all the commands must be run as the "root" user.

Additional information can be found via http://software.es.net/maddash/

1.1 Install CentOS

Install a vanilla CentOS 6.x minimal. Suggested parameters: 1G RAM, 8G disk.

If possible, configure it with a meaningful hostname and put that in the DNS.

Bring it up to date:

yum update

1.2 Disable SELinux

Currently, maddash does not work properly with SELinux in enforcing mode (there are problems proxying to port 8881, and possibly others). So for now, switch SELinux to permissive mode, which stops it from blocking anything:

vi /etc/selinux/config
...
SELINUX=permissive
...

and either reboot, or run this:

echo "0" >/selinux/enforce
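The same change can also be made without an editor. The following is a minimal sketch, not part of maddash: the function name set_selinux_permissive is made up for illustration, and it assumes GNU sed (as shipped with CentOS) and a config file that already contains a line starting with SELINUX=.

```shell
# Hypothetical helper: rewrite the SELINUX= line of a config file to "permissive".
# Assumes GNU sed (for -i) and an existing line that starts with SELINUX=.
# The SELINUXTYPE= line is left alone because the pattern requires "SELINUX=".
set_selinux_permissive() {
    config_file="$1"
    sed -i 's/^SELINUX=.*/SELINUX=permissive/' "$config_file"
}

# Example (as root):
#   set_selinux_permissive /etc/selinux/config
#   setenforce 0     # switch the running system to permissive mode without a reboot
#   getenforce       # should now report "Permissive"
```

The sed edit only affects the next boot; setenforce changes the running system, which is what the echo command above also does.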

1.3 Enable repositories

We need the Internet2 and EPEL repositories for pulling maddash and its dependencies.

yum install wget
wget http://software.internet2.edu/rpms/el6/x86_64/RPMS.main/Internet2-repo-0.4-2.noarch.rpm
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum localinstall Internet2-repo-0.4-2.noarch.rpm epel-release-6-8.noarch.rpm
yum clean all

1.4 Install maddash

yum install maddash

This will install a large number of dependencies, including Apache (httpd).

1.5 Open firewall

At this point, you should be able to access the webserver at http://x.x.x.x/. If not, you probably need to open the firewall to allow http/https traffic.

If you normally use system-config-firewall-tui then use that, otherwise just edit the config directly and add two lines after the --dport 22 rule:

vi /etc/sysconfig/iptables
...
-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 443 -j ACCEPT
...

Then:

/etc/init.d/iptables restart

(Repeat for ip6tables if you are also running IPv6.)
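If you prefer not to edit the rules file by hand, the insertion can be sketched with awk. This is an illustration only: the function name add_web_rules is hypothetical, and it assumes the rules file contains an SSH rule with the text "--dport 22 " as in the stock CentOS 6 file. It prints the result rather than modifying the file, and it is not idempotent, so run it once against the original file.

```shell
# Hypothetical helper: print an iptables rules file with HTTP/HTTPS rules
# added directly after the existing "--dport 22" (SSH) rule.
# The input file is not modified; the result goes to standard output.
add_web_rules() {
    rules_file="$1"
    awk '
        { print }
        /--dport 22 / {
            print "-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT"
            print "-A INPUT -m state --state NEW -m tcp -p tcp --dport 443 -j ACCEPT"
        }
    ' "$rules_file"
}

# Example (as root): review the output before replacing the live file.
#   add_web_rules /etc/sysconfig/iptables > /root/iptables.new
#   diff /etc/sysconfig/iptables /root/iptables.new
```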

2 Configuration - full mesh

A central configuration file can be used both to build the maddash config and to configure the remote perfsonar collector nodes. If you don't already have one, you need to create one now. This file can be created in whatever directory you like with whatever name you like - e.g. sample.conf

To start with, we will configure a "full mesh", which means that every node tests to every other node. It requires you to have administrative control of all the nodes, so you can configure them to be part of the mesh.

2.1 Create configuration file

Create your configuration file (e.g. sample.conf) with the following contents and then edit it to make sense for your environment. Replace the hostnames e.g. pop1.sample.example with the hostnames of the nodes in your mesh.

description Sample Mesh Config

<administrator>
  name       Sample NOC
  email      noc@sample.example
</administrator>

<organization>
    description     Sample

    <site>
        <host>
              description POP1
              address pop1.sample.example
              <measurement_archive>
                  type        perfsonarbuoy/bwctl
                  read_url    http://pop1.sample.example/esmond/perfsonar/archive
                  write_url   http://pop1.sample.example/esmond/perfsonar/archive
              </measurement_archive>
              <measurement_archive>
                  type        traceroute
                  read_url    http://pop1.sample.example/esmond/perfsonar/archive
                  write_url   http://pop1.sample.example/esmond/perfsonar/archive
              </measurement_archive>
              <measurement_archive>
                  type        perfsonarbuoy/owamp
                  read_url    http://pop1.sample.example/esmond/perfsonar/archive
                  write_url   http://pop1.sample.example/esmond/perfsonar/archive
              </measurement_archive>
        </host>
    </site>

    <site>
        <host>
              description POP2
              address pop2.sample.example
              <measurement_archive>
                  type        perfsonarbuoy/bwctl
                  read_url    http://pop2.sample.example/esmond/perfsonar/archive
                  write_url   http://pop2.sample.example/esmond/perfsonar/archive
              </measurement_archive>
              <measurement_archive>
                  type        traceroute
                  read_url    http://pop2.sample.example/esmond/perfsonar/archive
                  write_url   http://pop2.sample.example/esmond/perfsonar/archive
              </measurement_archive>
              <measurement_archive>
                  type        perfsonarbuoy/owamp
                  read_url    http://pop2.sample.example/esmond/perfsonar/archive
                  write_url   http://pop2.sample.example/esmond/perfsonar/archive
              </measurement_archive>
        </host>
    </site>

    <site>
        <host>
              description POP3
              address pop3.sample.example
              <measurement_archive>
                  type        perfsonarbuoy/bwctl
                  read_url    http://pop3.sample.example/esmond/perfsonar/archive
                  write_url   http://pop3.sample.example/esmond/perfsonar/archive
              </measurement_archive>
              <measurement_archive>
                  type        traceroute
                  read_url    http://pop3.sample.example/esmond/perfsonar/archive
                  write_url   http://pop3.sample.example/esmond/perfsonar/archive
              </measurement_archive>
              <measurement_archive>
                  type        perfsonarbuoy/owamp
                  read_url    http://pop3.sample.example/esmond/perfsonar/archive
                  write_url   http://pop3.sample.example/esmond/perfsonar/archive
              </measurement_archive>
        </host>
    </site>
</organization>

<test_spec bwctl_test>
  type              perfsonarbuoy/bwctl  # Perform a bwctl test (i.e. achievable bandwidth)
  tool              bwctl/iperf3         # Use 'iperf3' to do the bandwidth test
  protocol          tcp                  # Run a TCP bandwidth test
  interval          21600                # Run the test every 6 hours
  duration          20                   # Perform a 20 second test
  force_bidirectional 1                  # do bidirectional test
  random_start_percentage 10             # randomize start time
  omit_interval     5                    # ignore first few seconds of test
</test_spec>

<test_spec owamp_test>
  type              perfsonarbuoy/owamp  # Perform a constant low-bandwidth OWAMP test
  packet_interval   0.1                  # Send 10 packets every second (i.e. pause 0.1 seconds between each packet)
  loss_threshold    10                   # Wait no more than 10 seconds for a response
  session_count     10800                # Refresh the test every 18 minutes (once every 10800 packets)
  sample_count      600                  # Send results back every 60 seconds (once every 600 packets)
  packet_padding    0                    # The size of the packets (not including the IP/UDP headers)
  bucket_width      0.0001               # The granularity of the measurements
  force_bidirectional 1                  # do bidirectional test
</test_spec>

<group core_group>
    type       mesh

    member     pop1.sample.example
    member     pop2.sample.example
    member     pop3.sample.example
</group>

<test>
  description       Core Throughput Testing
  group             core_group
  test_spec         bwctl_test
</test>

<test>
  description       Core OWAMP Testing
  group             core_group
  test_spec         owamp_test
</test>

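What you have done is: defined an administrator contact; declared one organization with three sites, each containing one host and its measurement archives for bwctl, traceroute and owamp results; defined two test specifications (bwctl_test and owamp_test); put the three hosts into a mesh group (core_group); and attached both test specifications to that group.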

For a much bigger example, see the ESnet configuration, which generates the ESnet dashboard.

Note: "force_bidirectional" means that every host runs tests both ways. For example, host A runs and stores tests from A to B and B to A; host B will also run and store tests from B to A and A to B. This means every test is run twice. If you set this to zero it will halve the amount of test traffic. However host A's database will only store the measurements from A to B, and host B's database will only store the measurements from B to A.
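To make the traffic cost concrete, the following sketch counts test runs per test interval in a full mesh, following the description above. The function name mesh_test_runs is made up for this example; it is plain arithmetic, not anything maddash or perfsonar provides.

```shell
# Count test runs per interval in a full mesh of N nodes.
# Each ordered pair (A to B) is one direction; with force_bidirectional 1,
# both endpoints run every direction, so each direction is measured twice.
mesh_test_runs() {
    nodes="$1"
    force_bidirectional="$2"
    directions=$(( nodes * (nodes - 1) ))
    if [ "$force_bidirectional" -eq 1 ]; then
        echo $(( directions * 2 ))
    else
        echo "$directions"
    fi
}

mesh_test_runs 3 1   # three-node mesh, force_bidirectional 1: prints 12
mesh_test_runs 3 0   # three-node mesh, force_bidirectional 0: prints 6
```

The count grows roughly with the square of the number of nodes, which is worth keeping in mind before adding many hosts to a throughput mesh.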

2.2 Build and publish configuration

This can be published at whatever URL you like - using the webserver on the maddash server itself is the easiest approach.

First install the tool to convert the config to JSON format:

yum install perl-perfSONAR_PS-MeshConfig-JSONBuilder

Run this tool, putting the results in a directory which your webserver will serve:

/opt/perfsonar_ps/mesh_config/bin/build_json -o /var/www/html/sample.json sample.conf

Now check that the JSON you have created is visible at your chosen URL, e.g. http://x.x.x.x/sample.json
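The visibility check can also be done from the command line. This is a minimal sketch, assuming curl is installed (install it with yum if not); the function name check_published is hypothetical.

```shell
# Hypothetical helper: succeed only if the URL can be fetched and is non-empty.
# -f makes curl fail on HTTP errors; -sS hides progress but keeps error messages.
check_published() {
    url="$1"
    body=$(curl -fsS "$url") || return 1
    [ -n "$body" ]
}

# Example:
#   check_published http://x.x.x.x/sample.json && echo "mesh JSON is reachable"
```

Running this from a different host than the maddash server is a better test, since that is how the perfsonar nodes will fetch the file.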

More details at http://docs.perfsonar.net/multi_server_config.html

2.3 Configure maddash to consume the configuration

Still on your maddash server:

yum install perl-perfSONAR_PS-MeshConfig-GUIAgent

Edit /opt/perfsonar_ps/mesh_config/etc/gui_agent_configuration.conf to tell it where to find the JSON file you have published:

<mesh>
    ...
    configuration_url             http://x.x.x.x/sample.json
    ...
</mesh>
...
restart_services    0
...
use_toolkit         0

NOTE: since we are running on a host without the pS toolkit, it is essential to change use_toolkit and restart_services as shown.

Some additional work is required for maddash running standalone. Install an extra cron file:

cp /opt/perfsonar_ps/mesh_config/doc/cron-restart_gui_services /etc/cron.d/cron-restart_gui_services

Also edit the existing mesh_config_gui_agent cron job:

vi /etc/cron.d/cron-mesh_config_gui_agent
...
change it so that it runs as "root" instead of "perfsonar"
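As a sketch of that edit, the user field can be rewritten with awk. This assumes the cron file uses the standard /etc/cron.d layout with five time fields followed by the user name; check the file first, because the actual contents are not reproduced here. The function name cron_run_as_root is made up for illustration.

```shell
# Hypothetical helper: print a cron.d file with the user field changed
# from "perfsonar" to "root". Assumes the standard /etc/cron.d layout:
#   minute hour day-of-month month day-of-week user command
# awk re-joins the fields of a changed line with single spaces.
cron_run_as_root() {
    cron_file="$1"
    awk '$6 == "perfsonar" { $6 = "root" } { print }' "$cron_file"
}

# Example (as root): review the output before replacing the live file.
#   cron_run_as_root /etc/cron.d/cron-mesh_config_gui_agent > /root/cron.new
#   cat /root/cron.new
```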

Before we regenerate the configuration, we need to remove the existing configuration (since the old dashboards are merged with the new ones).

cd /etc/maddash/maddash-server/
mv maddash.yaml maddash.yaml.orig

Create a new maddash.yaml with just the following contents:

# Set the directory where the database will be stored
database: /var/lib/maddash/
##
# Set the host where the REST server listens
serverHost: "localhost"
##
# Activate http and set the port where it listens
http: 
    port: 8881

Finally we can generate the configuration:

/opt/perfsonar_ps/mesh_config/bin/generate_gui_configuration
/etc/init.d/maddash-server restart

More details at http://software.es.net/maddash/mesh_config.html

3 Test and troubleshooting

At this point you should be able to go to http://x.x.x.x/maddash-webui/

There won't be any useful data to display until you have perfsonar nodes running, but you should see the grid with grey boxes for unknown data.

3.1 Continuous spinner

If the page contains a "spinner" indefinitely, it may be that Apache has disabled proxying to the Java backend because it was down for a short time. Check for the following error:

tail /var/log/httpd/error_log
...
[Tue Sep 08 16:38:35 2015] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8881 (localhost) fa
[Tue Sep 08 16:38:35 2015] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Tue Sep 08 16:38:35 2015] [error] proxy: HTTP: disabled connection for (localhost)
[Tue Sep 08 16:39:24 2015] [error] proxy: HTTP: disabled connection for (localhost)
[Tue Sep 08 16:39:24 2015] [error] proxy: HTTP: disabled connection for (localhost)

The quick solution is just to restart apache:

/etc/init.d/httpd restart

However, this does not stop the problem from recurring. To prevent it, edit /etc/httpd/conf.d/apache-maddash.conf and add status=+i to the end of the ProxyPass line, like this:

    ProxyPass /maddash http://localhost:8881/maddash status=+i

and then restart Apache.
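The ProxyPass change can be sketched as a small awk filter. The function name add_proxy_status is hypothetical; it only touches ProxyPass lines that do not already carry a status= parameter, leaves ProxyPassReverse lines alone, and prints the result instead of editing the file.

```shell
# Hypothetical helper: print an Apache config with " status=+i" appended to
# each ProxyPass line that does not already set a status= parameter.
# "ProxyPass" must be followed by whitespace, so ProxyPassReverse is not matched.
add_proxy_status() {
    conf_file="$1"
    awk '
        /^[ \t]*ProxyPass[ \t]/ && !/status=/ { $0 = $0 " status=+i" }
        { print }
    ' "$conf_file"
}

# Example (as root): review the output before replacing the live file.
#   add_proxy_status /etc/httpd/conf.d/apache-maddash.conf > /root/apache-maddash.conf.new
#   diff /etc/httpd/conf.d/apache-maddash.conf /root/apache-maddash.conf.new
```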

To debug other errors where the page does not load, it can be helpful to open your browser's JavaScript console and look for error messages there. In Chrome this is View > Developer > JavaScript Console.

3.2 No dashboard selected

If the page is empty apart from the menu bar, click on "Dashboards" and select the first dashboard.

You can make this dashboard selected automatically by editing /opt/maddash/maddash-webui/etc/config.json and setting the default dashboard to match the name of your dashboard:

...
    "defaultDashboard": "Sample Mesh Config",
...

4 Configure perfsonar nodes

The perfsonar nodes can now be configured to read the published mesh configuration.

On each node, install the package:

yum install perl-perfSONAR_PS-MeshConfig-Agent

Edit /opt/perfsonar_ps/mesh_config/etc/agent_configuration.conf and change the configuration URL:

<mesh>
    ...
    configuration_url             http://x.x.x.x/sample.json
    ...
</mesh>

Run this command to pick up the configuration:

sudo -u perfsonar /opt/perfsonar_ps/mesh_config/bin/generate_configuration

It will be refreshed nightly from cron in any case.

Your dashboard should start to be populated with test data. Also if you go to the web interface of your perfsonar nodes and click "Configure Tests" you should see the tests which the mesh configuration has added.

More details at http://docs.perfsonar.net/multi_agent_config.html

5 Configuration - disjoint mesh

We will now extend the dashboard to include a second grid with tests from the hosts we control to some remote hosts which we do not control. These remote hosts or "beacons" respond to bwctl and owamp tests which we request, but they do not consume our mesh configuration and do not store any of the results in their database.

Please remember to ask permission from the operator of any node which you would like to run regular testing to. You can find the contact details by looking for "Administrator Email" in their node's perfsonar web interface.

5.1 Update configuration

Edit your mesh configuration and add the following sections - don't remove or change anything you already have.

<organization>
    description ESnet
    <site>
        <host>
              description London
              address lond-pt1.es.net
              no_agent 1
        </host>
        <host>
              description London
              address lond-owamp.es.net
              no_agent 1
        </host>
    </site>
    <site>
        <host>
              description Amsterdam
              address amst-pt1.es.net
              no_agent 1
        </host>
        <host>
              description Amsterdam
              address amst-owamp.es.net
              no_agent 1
        </host>
    </site>
</organization>

<organization>
    description University of Oregon
    <site>
        <host>
              description University of Oregon
              address perfsonar.uoregon.edu
              no_agent 1
        </host>
    </site>
</organization>

The flag no_agent 1 means that this node does not read our mesh configuration; it is just a passive endpoint.

Now add some more test specifications. Again, this is in addition to the tests you have already defined. We are making new test specifications so that you can tweak the settings separately to those for your internal mesh.

<test_spec bwctl_test_external>
  type              perfsonarbuoy/bwctl  # Perform a bwctl test (i.e. achievable bandwidth)
  tool              bwctl/iperf3         # Use 'iperf3' to do the bandwidth test
  protocol          tcp                  # Run a TCP bandwidth test
  interval          28800                # Run the test every 8 hours
  duration          20                   # Perform a 20 second test
  force_bidirectional 1                  # do bidirectional test
  random_start_percentage 10             # randomize start time
  omit_interval     5                    # ignore first few seconds of test
</test_spec>

<test_spec owamp_test_external>
  type              perfsonarbuoy/owamp  # Perform a constant low-bandwidth OWAMP test
  packet_interval   0.1                  # Send 10 packets every second (i.e. pause 0.1 seconds between each packet)
  loss_threshold    10                   # Wait no more than 10 seconds for a response
  session_count     10800                # Refresh the test every 18 minutes (once every 10800 packets)
  sample_count      600                  # Send results back every 60 seconds (once every 600 packets)
  packet_padding    0                    # The size of the packets (not including the IP/UDP headers)
  bucket_width      0.0001               # The granularity of the measurements
  force_bidirectional 1                  # do bidirectional test
</test_spec>

Note that we must use force_bidirectional 1, since only our hosts will be running these tests, and we want them to be run both ways.
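The OWAMP timings in these test specifications follow from packet_interval: a packet count multiplied by 0.1 seconds per packet gives the elapsed time. The sketch below shows that arithmetic; the function name packets_to_seconds is made up for this example.

```shell
# Convert a packet count to whole seconds for a given packet_interval
# (seconds per packet). awk is used because shell arithmetic is integer-only;
# adding 0.5 before truncating rounds to the nearest second.
packets_to_seconds() {
    packets="$1"
    packet_interval="$2"
    awk -v n="$packets" -v i="$packet_interval" 'BEGIN { printf "%d\n", n * i + 0.5 }'
}

packets_to_seconds 600 0.1     # sample_count 600:    prints 60   (1 minute)
packets_to_seconds 10800 0.1   # session_count 10800: prints 1080 (18 minutes)
packets_to_seconds 18000 0.1   # 18000 packets:       prints 1800 (30 minutes)
```

So with packet_interval 0.1, a session_count of 10800 lasts 18 minutes, and a half-hour session would need 18000 packets.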

Now add groups to define the tests. A "disjoint" group runs tests from the a_members to the b_members and vice versa, but not from a_member to a_member or b_member to b_member.

<group core_to_external_group_bwctl>
    type       disjoint

    a_member   lond-pt1.es.net
    a_member   amst-pt1.es.net
    a_member   perfsonar.uoregon.edu

    b_member   pop1.sample.example
    b_member   pop2.sample.example
    b_member   pop3.sample.example
</group>

<group core_to_external_group_owamp>
    type       disjoint

    a_member   lond-owamp.es.net
    a_member   amst-owamp.es.net
    a_member   perfsonar.uoregon.edu

    b_member   pop1.sample.example
    b_member   pop2.sample.example
    b_member   pop3.sample.example
</group>

In the dashboard, the a_members will be displayed down the left side of the grid, and the b_members will be displayed along the top of the grid.

Note that we have defined two groups, one for bwctl and one for owamp tests; this is because ESnet provides different test endpoints for throughput and latency tests.
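To see which tests a disjoint group implies, the following sketch enumerates the directions for the bwctl group above. It is an illustration of the a_member/b_member rule only; the function name disjoint_pairs is hypothetical.

```shell
# Print every test direction implied by a disjoint group:
# each a_member to each b_member, and each b_member back to each a_member.
# No a_member-to-a_member or b_member-to-b_member pairs are produced.
disjoint_pairs() {
    a_members="$1"   # space-separated list
    b_members="$2"   # space-separated list
    for a in $a_members; do
        for b in $b_members; do
            echo "$a -> $b"
            echo "$b -> $a"
        done
    done
}

disjoint_pairs "lond-pt1.es.net amst-pt1.es.net perfsonar.uoregon.edu" \
               "pop1.sample.example pop2.sample.example pop3.sample.example"
```

With three a_members and three b_members this prints 18 directions, compared with 30 for a full mesh of the same six hosts.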

Finally, link the groups to the test specifications:

<test>
  description       Core to External Throughput Testing
  group             core_to_external_group_bwctl
  test_spec         bwctl_test_external
</test>

<test>
  description       Core to External OWAMP Testing
  group             core_to_external_group_owamp
  test_spec         owamp_test_external
</test>

5.2 Redeploy configuration

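Now you will need to repeat some of the steps you did before: rebuild and republish the JSON file (section 2.2), regenerate the maddash configuration and restart maddash-server (section 2.3), and re-run generate_configuration on each perfsonar node (section 4), or wait for the nightly cron refresh.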

Go to your maddash server webpage and check that the new grid is visible. Go to your perfsonar node web interfaces, click "Configure Tests" and check that they are configured with the new tests.
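The maddash-server side of the redeploy can be sketched as one function that reuses the commands from sections 2.2 and 2.3. The function name redeploy_mesh and the MESH_BIN and MADDASH_INIT variables are made up for this example; they default to the paths used earlier in this document.

```shell
# Sketch of the redeploy sequence on the maddash server (run as root).
# MESH_BIN and MADDASH_INIT are hypothetical variables that default to the
# paths used earlier in this document.
MESH_BIN="${MESH_BIN:-/opt/perfsonar_ps/mesh_config/bin}"
MADDASH_INIT="${MADDASH_INIT:-/etc/init.d/maddash-server}"

# Each step runs only if the previous one succeeded.
redeploy_mesh() {
    conf_file="$1"     # e.g. sample.conf
    json_file="$2"     # e.g. /var/www/html/sample.json
    "$MESH_BIN/build_json" -o "$json_file" "$conf_file" &&
    "$MESH_BIN/generate_gui_configuration" &&
    "$MADDASH_INIT" restart
}

# Example:
#   redeploy_mesh sample.conf /var/www/html/sample.json
# Then, on each perfsonar node:
#   sudo -u perfsonar /opt/perfsonar_ps/mesh_config/bin/generate_configuration
```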

6 Optional: additional tweaking of maddash server

6.1 Auto-start

Check that the maddash-server and httpd services are configured to come up on server start:

chkconfig --list

If not, then configure them so they do, e.g.

chkconfig httpd on
chkconfig maddash-server on

6.2 Redirect

If this is a dedicated server, then it's friendly to have a redirect from the front page. Create /var/www/html/index.html containing:

<html>
<head>
<meta http-equiv="refresh" content="0; url=http://x.x.x.x/maddash-webui/">
</head>
<body>
Redirecting <a href="http://x.x.x.x/maddash-webui/">here</a>...
</body>
</html>

6.3 Change thresholds

You can change the thresholds at which warnings (yellow) or errors (red) are displayed.

Edit /opt/perfsonar_ps/mesh_config/etc/gui_agent_configuration.conf and look for these settings:

        acceptable_loss_rate     0
        critical_loss_rate       0.01
...
        acceptable_throughput    900
        critical_throughput      500

Edit as required. Then rebuild the maddash.yaml configuration:

/opt/perfsonar_ps/mesh_config/bin/generate_gui_configuration
/etc/init.d/maddash-server restart
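As an illustration of how the two throughput thresholds split results into dashboard colours, here is a small sketch. It is not the actual check script: the function name classify_throughput is hypothetical, and it assumes that a value below critical_throughput is an error (red), a value below acceptable_throughput is a warning (yellow), and that the measured value is in the same units as the thresholds.

```shell
# Hypothetical illustration of the throughput thresholds.
# Assumes: below critical = error (red), below acceptable = warning (yellow),
# otherwise ok; the measured value uses the same units as the thresholds.
classify_throughput() {
    measured="$1"
    acceptable="${2:-900}"   # acceptable_throughput from the config above
    critical="${3:-500}"     # critical_throughput from the config above
    awk -v m="$measured" -v a="$acceptable" -v c="$critical" 'BEGIN {
        if (m + 0 < c + 0)      print "critical"
        else if (m + 0 < a + 0) print "warning"
        else                    print "ok"
    }'
}

classify_throughput 950   # prints ok
classify_throughput 700   # prints warning
classify_throughput 300   # prints critical
```

For a mesh of slower links, both thresholds would need lowering, otherwise every box stays red.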

Notice also these settings for the bwctl (throughput) tests:

        check_interval           28800
        check_time_range         86400

This means that check_throughput.pl is run every 8 hours and averages the throughput results over the last 24 hours, so the dashboard shows an average throughput, not the result of the most recent test.

The corresponding parameters for owamp are 30 minutes and 15 minutes, so it responds much more quickly to changes in packet loss.

6.4 Admin password for maddash

Maddash also has a settings/admin page which can be used to mark hosts as down for maintenance, and force immediate re-tests.

To enable this, you need to create a username and password - none is set by default, and it does not have to be the same as your perfsonar UI username and password.

The following command will create a user called "admin" and prompt you for the password:

htpasswd /etc/maddash/maddash-webui/admin-users admin

Once this is done, go to the dashboard, click on Settings > Server Settings... and you will be prompted for this username and password.

Should you wish to remove a user from the password file, you can either edit the file directly, or use this command:

htpasswd -D /etc/maddash/maddash-webui/admin-users admin