Author Archives: Iurie Costasco

Monitor Temperature and Humidity using Raspberry Pi

Today I would like to show you how to monitor temperature and humidity via a web page using a Raspberry Pi and a temperature and humidity sensor. You can find a lot of similar articles on the Internet. My solution is just a combination of several of them with small adjustments.

The level of moisture in our house is too high, and we have to do something to reduce it. The recommended level is between 35% and 55%. A high level of moisture increases the chance of damp and mold on our things, clothes and walls.

The moisture level depends on the air flow and temperature in your house. If the temperature in winter is below 20 degrees Celsius, the humidity will be higher than the recommended value. The minimum recommended temperature is 21 degrees Celsius. In other words, you need a smart heating system and good ventilation to bring humidity down quickly while you are cooking or after you have used the bathroom. So let's see how to get the data we need to make decisions.

We need equipment:

  • DHT22 AM2302 Digital Temperature and Humidity Sensor module Replace SHT11 SHT15, Price – 5.97 USD
  • Raspberry Pi-2 + Acrylic Enclosure Case + Heat sink, Price – 42.27 USD 

Let's connect the sensor to the Raspberry. I did everything as described at http://www.home-automation-community.com/temperature-and-humidity-from-am2302-dht22-sensor-displayed-as-chart/ except that I didn't use a resistor or a breadboard. My sensor has a built-in resistor. Here is my equipment and connections:

Before collecting data, let's update the package lists on our Raspberry:

sudo apt-get update

Note: Many steps in this tutorial require internet access on the Raspberry Pi. (If you don't know how to set it up, see my previous article on how to connect the Raspberry to the internet via Wi-Fi.)

The Raspberry collects data from the sensor on GPIO pin 4. To read the data we will use a Python library called Adafruit_Python_DHT.

To install the library, first get the dependencies with:

sudo apt-get install -y build-essential python-dev git

and then download and install the library with:

mkdir -p /home/pi/sources 
cd /home/pi/sources 

git clone https://github.com/adafruit/Adafruit_Python_DHT.git 

cd Adafruit_Python_DHT 

sudo python setup.py install 

Here we can do a quick check to see that the Pi is ready to collect data from the sensor:

sudo /home/pi/sources/Adafruit_Python_DHT/examples/AdafruitDHT.py 2302 4

The first argument is the sensor type, it can be 11 or 22 or 2302. The second argument is the RPi GPIO pin which is connected to the sensor data pin.

Output will be like:

sudo /home/pi/sources/Adafruit_Python_DHT/examples/AdafruitDHT.py 2302 4

Temp = 24.2*C Humidity=26.3%

Let's create an SQLite database to collect our data

Install sqlite

sudo apt-get install sqlite3

Create the DB schema. Open a new database file with sqlite3 templog.db and enter:

BEGIN;

CREATE TABLE temp_hum (timestamp DATETIME, temp NUMERIC, hum NUMERIC);

COMMIT;
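
The same schema can also be created from Python with the standard sqlite3 module. This is only a minimal sketch: the helper name create_db and the IF NOT EXISTS clause are my additions for the example, and the path you pass in (for instance templog.db) is up to you.

```python
import sqlite3

# Same table as in the schema above; IF NOT EXISTS is an addition
# so that the helper can be run more than once without an error.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS temp_hum "
    "(timestamp DATETIME, temp NUMERIC, hum NUMERIC)"
)

def create_db(path):
    """Create the temp_hum table in the SQLite file at `path` (hypothetical helper).

    Returns the list of column names so the caller can check the result.
    """
    conn = sqlite3.connect(path)
    try:
        conn.execute(SCHEMA)
        conn.commit()
        # PRAGMA table_info returns one row per column; index 1 is the name
        columns = [row[1] for row in conn.execute("PRAGMA table_info(temp_hum)")]
    finally:
        conn.close()
    return columns
```

Calling create_db("templog.db") in the current directory should leave a file that can be copied to /var/www/ later in this article.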

Great, now we have a DB where we can write data. We created a table called temp_hum. This table has 3 columns: timestamp, temp and hum. The first column will contain the current date and time, so let's make sure that the time and date on the Pi are right. Use the date command to check it.

If something is wrong, let's install the ntpdate package and adjust the time:

sudo apt-get install ntpdate

sudo service ntp stop

sudo ntpdate time.nist.gov

sudo service ntp start

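The last step is to set the timezone. My timezone is GMT+3. Look yours up on the internet if you don't know it. Note that the zone files under Etc/ use an inverted sign, so the file for GMT+3 is Etc/GMT-3, and it has to be copied to /etc/localtime:

sudo cp /usr/share/zoneinfo/Etc/GMT-3 /etc/localtime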

Let’s install and configure web server

sudo apt-get install apache2

We will use cgi-bin scripts to display the collected data on a web page, so let's tell the Apache web server to use them. This step may be optional for you:

sudo a2enmod cgi

sudo a2enmod cgid

sudo service apache2 restart

Copy the created DB to the /var/www/ folder and change the owner of the DB:

sudo cp templog.db /var/www/
sudo chown www-data:www-data /var/www/templog.db 

Now we have to do a little bit of programming, or just copy a ready solution from the internet (as I did, thanks to the author for the help). I adjusted the code I found a little in order to get the expected results.

sudo nano /root/monitor.py

and insert the following text:

#!/usr/bin/env python

import sqlite3
import Adafruit_DHT

# global variables
dbname='/var/www/templog.db'

# store the humidity and temperature in the database
# note: the values are inserted in the order (humidity, temperature),
# so the column named "temp" receives the humidity and the column
# named "hum" receives the temperature; webgui.py reads them this way
def log_temperature(humm, temp):

    conn=sqlite3.connect(dbname)
    curs=conn.cursor()

    curs.execute("INSERT INTO temp_hum values(datetime('now'), (?), (?))", (humm, temp))
    # commit the changes
    conn.commit()

    conn.close()


# display the contents of the database
def display_data():

    conn=sqlite3.connect(dbname)
    curs=conn.cursor()

    for row in curs.execute("SELECT * FROM temp_hum"):
        print(str(row[0])+"     "+str(row[1])+"        "+str(row[2]))

    conn.close()

# get temperature and humidity
# returns (None, None) on error, or (humidity, temperature) as floats
def get_temp_hum():
    try:
        humm, temp = Adafruit_DHT.read_retry(Adafruit_DHT.DHT22, 4)
    except Exception:
        return None, None
    # read_retry returns (None, None) when every attempt failed
    if humm is None or temp is None:
        return None, None
    return round(humm, 2), round(temp, 2)

# main function
# This is where the program starts
def main():
    # read humidity and temperature from the sensor
    humm, temp = get_temp_hum()
    # store data to DB in case we have values
    if humm is not None and temp is not None:
        log_temperature(humm, temp)
#    display_data()

if __name__=="__main__":
    main()

Let me add several comments on the code. First of all we define the database location, dbname='/var/www/templog.db'. After that we define the function log_temperature to insert data into our DB. The function get_temp_hum reads data from the sensor using the Python library installed at the beginning of this article. The most important part is the main function, where we call get_temp_hum, check that we have valid data and insert it into the DB using log_temperature. One more function, display_data(), is commented out; we need it only for debugging. Uncomment this line and run the script to make sure that the data has been inserted into the DB.

Don't forget to make the monitor.py file executable:

sudo chmod 755 /root/monitor.py

Let's collect data every 15 minutes. To do this, open root's crontab:

sudo crontab -e

Insert this line:

*/15 * * * * /root/monitor.py

Note: If the script fails for some reason, try to run it manually with sudo /root/monitor.py. You can also monitor the script's execution from crontab using the postfix mail service. To install it, use sudo apt-get install postfix.
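
To confirm that cron is really filling the database, a few of the newest rows can be read back. Below is a minimal sketch that assumes the temp_hum table from above; the helper names latest_rows and demo, and the sample values, are made up for this example.

```python
import sqlite3

def latest_rows(conn, limit=5):
    """Return the newest `limit` rows of temp_hum, newest first (hypothetical helper)."""
    cur = conn.execute(
        "SELECT timestamp, temp, hum FROM temp_hum "
        "ORDER BY timestamp DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()

def demo():
    """Self-contained check against an in-memory database with made-up values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE temp_hum (timestamp DATETIME, temp NUMERIC, hum NUMERIC)"
    )
    conn.executemany(
        "INSERT INTO temp_hum VALUES (?, ?, ?)",
        [
            ("2016-01-01 10:00:00", 40.1, 21.5),
            ("2016-01-01 10:15:00", 40.3, 21.7),
        ],
    )
    rows = latest_rows(conn, limit=1)
    conn.close()
    return rows
```

On the Pi you would pass sqlite3.connect('/var/www/templog.db') to latest_rows instead of the in-memory connection. After one or two cron runs there should be rows roughly 15 minutes apart.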

The last step is to create the script which will display the data on the web page.

sudo nano /usr/lib/cgi-bin/webgui.py

Insert the following text there:

#!/usr/bin/env python
import sqlite3
import os
import sys
import cgi
import cgitb

# global variables
dbname='/var/www/templog.db'

# print the HTTP header
def printHTTPheader():
    print "Content-type: text/html\n\n"

# print the HTML head section
# arguments are the page title and the table for the chart
def printHTMLHead(title, table):
    print "<head>"
    print "    <title>"
    print title
    print "    </title>"

    print_graph_script(table)

    print "</head>"


# get data from the database
# if an interval is passed,
# return a list of records from the database
def get_data(interval):

    conn=sqlite3.connect(dbname)
    curs=conn.cursor()

    if interval == None:
        curs.execute("SELECT * FROM temp_hum")
    else:
        curs.execute("SELECT * FROM temp_hum WHERE timestamp>datetime('now','-%s hours') AND timestamp<=datetime('now')" % interval)

    rows=curs.fetchall()

    conn.close()
    return rows


# convert rows from database into a javascript table
def create_table(rows):
    chart_table=""

    for row in rows[:-1]:
        rowstr="['{0}', {1}, {2}],\n".format(str(row[0]),str(row[1]),str(row[2]))
        chart_table+=rowstr

    row=rows[-1]
    rowstr="['{0}', {1}, {2}]\n".format(str(row[0]),str(row[1]),str(row[2]))
    chart_table+=rowstr

    return chart_table


# print the javascript to generate the chart
# pass the table generated from the database info
def print_graph_script(table):

    # google chart snippet
    chart_code="""
    <script type="text/javascript" src="https://www.google.com/jsapi"></script>
    <script type="text/javascript">
      google.load("visualization", "1", {packages:["corechart"]});
      google.setOnLoadCallback(drawChart);
      function drawChart() {
        var data = google.visualization.arrayToDataTable([
          ['Time', 'Humidity', 'Temperature'],
%s
        ]);
        var options = {
          title: 'Temperature and Humidity'
        };
        var chart = new google.visualization.LineChart(document.getElementById('chart_div'));
        chart.draw(data, options);
      }
    </script>"""

    print chart_code % (table)

 


# print the div that contains the graph
def show_graph():
    print "<h2>Temperature Chart</h2>"
    print '<div id="chart_div" style="width: 900px; height: 500px;"></div>'

# connect to the db and show some stats
# argument option is the number of hours
def show_stats(option):

    conn=sqlite3.connect(dbname)
    curs=conn.cursor()

    if option is None:
        option = str(24)

    curs.execute("SELECT timestamp,max(temp) FROM temp_hum WHERE timestamp>datetime('now','-%s hour') AND timestamp<=datetime('now')" % option)
    tempmax=curs.fetchone()
    tempmax="{0}&nbsp&nbsp&nbsp{1} %".format(str(tempmax[0]),str(tempmax[1]))

    curs.execute("SELECT timestamp,min(temp) FROM temp_hum WHERE timestamp>datetime('now','-%s hour') AND timestamp<=datetime('now')" % option)
    tempmin=curs.fetchone()
    tempmin="{0}&nbsp&nbsp&nbsp{1} %".format(str(tempmin[0]),str(tempmin[1]))

    curs.execute("SELECT avg(temp) FROM temp_hum WHERE timestamp>datetime('now','-%s hour') AND timestamp<=datetime('now')" % option)
    tempavg=curs.fetchone()

    curs.execute("SELECT timestamp,max(hum) FROM temp_hum WHERE timestamp>datetime('now','-%s hour') AND timestamp<=datetime('now')" % option)
    hummmax=curs.fetchone()
    hummmax="{0}&nbsp&nbsp&nbsp{1} C".format(str(hummmax[0]),str(hummmax[1]))

    curs.execute("SELECT timestamp,min(hum) FROM temp_hum WHERE timestamp>datetime('now','-%s hour') AND timestamp<=datetime('now')" % option)
    hummmin=curs.fetchone()
    hummmin="{0}&nbsp&nbsp&nbsp{1} C".format(str(hummmin[0]),str(hummmin[1]))

    curs.execute("SELECT avg(hum) FROM temp_hum WHERE timestamp>datetime('now','-%s hour') AND timestamp<=datetime('now')" % option)
    hummavg=curs.fetchone()


    print "<hr>"


    print "<h2>Minimum Value</h2>"
    print tempmin
    print hummmin
    print "<h2>Maximum Value</h2>"
    print tempmax
    print hummmax
    print "<h2>Average Value</h2>"
    print "%.1f" % tempavg+"%"
    print "%.1f" % hummavg+"C"
    print "<hr>"

    print "<h2>In the last hour:</h2>"
    print "<table>"
    print "<tr><td><strong>Date/Time</strong></td><td><strong>Humidity</strong></td><td><strong>Temperature</strong></td></tr>"

    rows=curs.execute("SELECT * FROM temp_hum WHERE timestamp>datetime('now','-1 hour') AND timestamp<=datetime('now')")
    for row in rows:
        tempstr="<tr><td>{0}&emsp;&emsp;</td><td>{1} %</td><td>{2} C</td></tr>".format(str(row[0]),str(row[1]),str(row[2]))
        print tempstr
    print "</table>"

    print "<hr>"

    conn.close()

def print_time_selector(option):

    print """<form action="/cgi-bin/webgui.py" method="POST">
        Show the temperature logs for
        <select name="timeinterval">"""


    if option is not None:

        if option == "6":
            print "<option value=\"6\" selected=\"selected\">the last 6 hours</option>"
        else:
            print "<option value=\"6\">the last 6 hours</option>"

        if option == "12":
            print "<option value=\"12\" selected=\"selected\">the last 12 hours</option>"
        else:
            print "<option value=\"12\">the last 12 hours</option>"

        if option == "24":
            print "<option value=\"24\" selected=\"selected\">the last 24 hours</option>"
        else:
            print "<option value=\"24\">the last 24 hours</option>"

        if option == "168":
            print "<option value=\"168\" selected=\"selected\">the last 168 hours</option>"
        else:
            print "<option value=\"168\">the last 168 hours</option>"

    else:
        print """<option value="6">the last 6 hours</option>
            <option value="12">the last 12 hours</option>
            <option value="24">the last 24 hours</option>
            <option value="168">the last 168 hours</option>"""

    print """        </select>
        <input type="submit" value="Display">
    </form>"""


# check that the option is valid
# and not an SQL injection
def validate_input(option_str):
    # check that the option string contains only digits
    if option_str.isdigit():
        # check that the option is within a specific range
        if int(option_str) > 0 and int(option_str) <= 24:
            return option_str
        else:
            return None
    else:
        return None


#return the option passed to the script
def get_option():
    form=cgi.FieldStorage()
    if "timeinterval" in form:
        option = form["timeinterval"].value
        return validate_input (option)
    else:
        return None

# main function
# This is where the program starts
def main():

    cgitb.enable()

    # get options that may have been passed to this script
    option=get_option()

    if option is None:
        option = str(24)

    # get data from the database
    records=get_data(option)

    # print the HTTP header
    printHTTPheader()

    if len(records) != 0:
        # convert the data into a table
        table=create_table(records)
    else:
        print "No data found"
        return

    # start printing the page
    print "<html>"
    # print the head section including the table
    # used by the javascript for the chart
    printHTMLHead("Raspberry Pi Temperature and Humidity Logger", table)

    # print the page body
    print "<body>"
    print "<h1>Raspberry Pi Temperature and Humidity Logger</h1>"
    print "<hr>"
    print_time_selector(option)
    show_graph()
    show_stats(option)
    print "</body>"
    print "</html>"

    sys.stdout.flush()

if __name__=="__main__":
    main()

Let me comment on this script as well. First of all, the get_option function tells us how many hours to display in the graph. By default 24 hours are displayed. The get_data function collects the data from the DB. If the DB returns at least one row, the create_table function converts the data so that everything can be displayed on the web page. The one-week option does not work yet: validate_input accepts only values up to 24, so a request for 168 hours falls back to the default 24 hours. This is something I still have to fix.
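
One possible way to make the one-week option work is to accept only the intervals offered in the drop-down list and to hand the value to SQLite as a bound parameter. This is a sketch, not the code of webgui.py: the names ALLOWED_HOURS, validate_interval and rows_for_interval are invented for the example.

```python
import sqlite3

# intervals offered by the form in webgui.py, in hours
ALLOWED_HOURS = ("6", "12", "24", "168")

def validate_interval(option_str):
    """Return the option unchanged if it is one of the allowed values, else None."""
    if option_str in ALLOWED_HOURS:
        return option_str
    return None

def rows_for_interval(conn, hours):
    """Fetch the rows newer than `hours` hours.

    The datetime modifier is passed as a bound parameter instead of
    being formatted into the SQL text.
    """
    modifier = "-%d hours" % int(hours)
    cur = conn.execute(
        "SELECT * FROM temp_hum "
        "WHERE timestamp > datetime('now', ?) "
        "AND timestamp <= datetime('now')",
        (modifier,),
    )
    return cur.fetchall()
```

With a whitelist, anything that is not exactly "6", "12", "24" or "168" is rejected, so there is no need to parse the string before deciding whether it is safe.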

Don't forget to make file webgui.py executable:

chmod 755 /usr/lib/cgi-bin/webgui.py

To see the data on the web page, open a web browser on a PC in the same network as the Raspberry and enter this in the address field:

192.168.1.10/cgi-bin/webgui.py

192.168.1.10 is the IP address of the Pi.

The final result looks like this:

After 10 AM I moved the sensor closer to the heating system. This is why the graph shows such big changes.

Links

  1. http://www.home-automation-community.com/temperature-and-humidity-from-am2302-dht22-sensor-displayed-as-chart/
  2. https://github.com/adafruit/Adafruit_Python_DHT
  3. http://raspberrywebserver.com/cgiscripting/rpi-temperature-logger/building-a-web-user-interface-for-the-temperature-monitor.html

Connect Raspberry Pi to Wi-Fi

Some time ago I decided to become more familiar with the Internet of Things. I thought about several projects but couldn't decide what to do, so I bought several things and forgot about them for a while.

A few days ago I decided to take a look inside the box. Newbies (like me) think that the parts for a project will be expensive, but actually they are not. I decided to publish what I bought and how much it cost.

  • Electronic Parts Pack KIT for ARDUINO component Resistors Switch Button WT, Price – 3.70 USD
  • Soil Hygrometer Humidity Detection Module Soil Moisture Water Sensor, Price –  1.16 USD
  • MB102 Power Supply Module 3.3V 5V+Breadboard Board 830 Point+65PCS Jumper cable, Price – 5.08 USD
  • DHT22 AM2302 Digital Temperature and Humidity Sensor module Replace SHT11 SHT15, Price – 5.97 USD
  • UNO R3 ATmega328P CH340 Mini USB Board for Compatible-Arduino NEW, Price – 3.95 USD
  • Raspberry Pi-2 + Acrylic Enclosure Case + Heat sink, Price – 42.27 USD 
  • Micro USB Charger 5V,  2A, Price – 7.64 USD
  • Temperature sensor DS18b20 in iron case, Price – 1.25 USD
  • Micro SD memory card Class 10 Size 8GB , Price – 5.17 USD
  • Wi-Fi USB Adapter EDUP EP-N8508GS, Price – 5.66 USD

Let's start with something. I went to https://www.raspberrypi.org/ and found that I have to copy the operating system to the SD card and insert the card into the Raspberry. OK, I decided to load NOOBS and then select Raspbian. I had bought a Class 10 SD card (a fast SD card 🙂 ). After I inserted the card into the Raspberry and connected it to a screen via HDMI, nothing was displayed. The first attempt failed; it would have been a good idea to consult the list of compatible SD cards (link 2 below) before buying anything. I found an old 8 GB SD card at home and decided to try it just as a proof of concept. The second attempt was successful.

To configure the Raspberry you need a USB mouse and keyboard. I had neither in the house. After several hours I found only a mouse. OK, let's start with that, especially since at the first stage I only have to select the OS and boot the Raspberry. After boot I got the GUI, and I still didn't have a keyboard. My idea was that I could probably connect to the OS via ssh. I could do it in two ways: via LAN or via WLAN. My home router has a password, so even though the Wi-Fi card was up, the Raspberry didn't know my password. I connected a laptop to the device via LAN, brought up a DHCP server on the laptop (tftpd32 can do it very quickly), found which IP address was leased to the Raspberry and connected to it with PuTTY (default user: pi, password: raspberry). Everything is easy here, I thought. Let's connect the Raspberry to Wi-Fi. I had never connected Linux to Wi-Fi before. It shouldn't be too difficult, especially since I connect Linux machines via LAN every day. After spending some time on Google I found a lot of step-by-step instructions and different options, but none of them worked for me.

My Wi-Fi Card is:

pi@raspberrypi:~ $ lsusb
Bus 001 Device 004: ID 0bda:8176 Realtek Semiconductor Corp. RTL8188CUS 802.11n WLAN Adapter

My OS is:

pi@raspberrypi:~ $ cat /etc/*release
PRETTY_NAME="Raspbian GNU/Linux 8 (jessie)"
NAME="Raspbian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

After several hours I started to hate Linux, and I saw that I am not alone in this feeling. How can such an easy task take so much time? I started my attempts with the wireless-tools package installed on Raspbian. The command iwlist wlan0 scan showed information about my router. This means that the driver for the wlan0 interface was installed and the interface was up. Let's try to connect to the router manually using: iwconfig wlan0 essid <YOUR_SSID> key s:<Your password>. Several attempts failed. OK, let's try another option. I found that wpa_supplicant was created especially for people who want to use Wi-Fi in a secure way with WPA2 encryption. The command wpa_passphrase <Your_SSID> <Your_Password> >> /etc/wpa_supplicant/wpa_supplicant.conf will create a config entry for your network and store the password as a hashed key. If you select the wpa_supplicant method, you have to fill /etc/network/interfaces with something like this:

allow-hotplug wlan0
iface wlan0 inet manual
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

inet manual is required by wpa_supplicant even if you use DHCP. If you want DHCP, do this:

allow-hotplug wlan0
iface wlan0 inet manual
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf

iface default inet dhcp

I created a wpa_supplicant config for myself and checked /etc/network/interfaces, but I still couldn't associate with the Wi-Fi network. After several attempts I tried to run the wpa_supplicant daemon manually. I want to share this part with you in more detail. After a network service restart (/etc/init.d/networking restart) the wpa_supplicant daemon is started in the background. The command ps aux | grep wpa* gave me the following:

/sbin/wpa_supplicant -s -B -P /run/wpa_supplicant.wlan0.pid -i wlan0 -D nl80211,wext -c /etc/wpa_supplicant/wpa_supplicant.conf

This still didn't help me. There should be log files somewhere. Yes, they are in /var/log/syslog or /var/log/messages or /var/log/wpa_supplicant.log. For me it was /var/log/syslog. Inside I found this:

nl80211: Driver does not support authentication/association or connect commands

Hmm, let's kill the process and play manually with different wpa_supplicant options and drivers.

Sometimes I got another error:

wlan0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="Home" auth_failures=11 duration=120 reason=CONN_FAILED

I edited the wpa_supplicant.conf file several times with different options, and in the end I got something like this:

root@raspberrypi:/home/pi# cat /etc/wpa_supplicant/wpa_supplicant.conf
ap_scan=0
country=GB
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1

network={
        ssid="<Your_SSID>"
        proto=RSN
        scan_ssid=1
        key_mgmt=WPA-PSK
        pairwise=CCMP TKIP
        group=CCMP TKIP
        psk="<Your_Password>"
        #psk=8d0526f31a39e78dfgk;srs61ae5a1e9d0c30e8e48cc838f8541d4a8e7af259 #password generated by wpa_passphrase
}
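
The commented-out psk line in the config above is the kind of value that wpa_passphrase generates. For WPA/WPA2 personal networks this key is derived from the passphrase and the SSID with PBKDF2 (HMAC-SHA1, 4096 iterations, 32-byte result). The sketch below reproduces that derivation with Python's hashlib; the function name wpa_psk is only for illustration.

```python
import binascii
import hashlib

def wpa_psk(ssid, passphrase):
    """Derive the 256-bit WPA/WPA2 pre-shared key as 64 hex characters.

    The SSID acts as the salt, so the same passphrase gives a
    different key on a network with a different name.
    """
    key = hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),
        4096,  # iteration count
        32,    # key length in bytes
    )
    return binascii.hexlify(key).decode("ascii")
```

Because the SSID is part of the derivation, a hashed psk copied from one network's config will not work after the SSID is renamed; it has to be generated again.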

At the beginning I had only the ssid and the psk key generated by wpa_passphrase. You can get the information for key_mgmt and pairwise from the iwlist wlan0 scan command. Even though everything seemed logical, I still couldn't connect to the router and get an IP address from it. This is why I tried to launch the supplicant manually with debug options:

/sbin/wpa_supplicant -s -B -P /run/wpa_supplicant.wlan0.pid -i wlan0 -D nl80211,wext -c /etc/wpa_supplicant/wpa_supplicant.conf -dd

The debug options and /var/log/syslog will tell you what failed in your case. Each option I enabled or disabled in the wpa_supplicant.conf file, and each driver I used in the command, added a new error to my investigation. It was interesting, but it didn't help. It is still worth checking what they show in your case.

Some instructions proposed changing /etc/network/interfaces in a different way to connect the Raspberry to Wi-Fi. So I had a new direction to investigate. I saw that some people say to use wpa-ssid, others wpa-essid, and various other things. Why are the options different? Let's try all of them :). It was a good idea, but it still didn't solve the issue. Windows seems to be better 🙂 was my thought. While looking for a solution I found a very interesting article written by mrEngman. In it he shares his script for configuring Wi-Fi cards based on the RTL8188CUS. I decided to read the script instead of running it (thanks to the author for the very good comments inside). There I found that wpa-ssid is for routers without authentication. For those with authentication it should be wpa-essid. wpa-psk should be used when you have WPA2 (the most common case); if you have WEP you should use wpa-key. Thanks to mrEngman for explaining what the different options mean. After I edited /etc/network/interfaces as shown below, everything started to work for me.

auto wlan0
iface wlan0 inet dhcp
wpa-essid <Your_SSID>
wpa-psk <Your_Password>

Conclusion: "Never give up!". Linux can be awful at first sight, but when you start to understand and think about what a command does, where to find the logs, what your options mean (use the man pages), and how to restart a process manually and apply changes, it becomes logical.

Important: Don't use iwconfig and wpa_supplicant at the same time. Select one method or script and debug it. Remember that processes are running in the background and the commands you apply can affect the results.

Links:

  1. https://www.raspberrypi.org/ 
  2. http://elinux.org/RPi_SD_cards
  3. http://tftpd32.jounin.net/
  4. http://www.raspberrypi.org/forums/viewtopic.php?t=6256

 

VRRP – start building a redundant network

Redundancy in the network is critical for service availability. Everyone wants to have five "9"s, that is 99.999% service availability. To get this we buy expensive equipment and learn how STP, RSTP, MSTP and other redundancy protocols work. Often we forget to make the obvious things redundant. How many default gateways can you configure on a PC? Always one. The gateway is our exit point from the local network to the big Internet, so if this point fails we lose our way out.

VRRP is defined in RFC 3768 and RFC 5798. RFC 3768 describes version 2 of VRRP and covers only IPv4. RFC 5798 describes version 3, which adds IPv6 support. As you can see, they are fairly recent (2004 and 2010). To understand VRRP really well you could read the RFCs, but I don't know too many people who like text without images :). This is why I decided to create some images and an example which could help you understand how RFC 3768 recommends implementing this standard.

The goal of VRRP is to provide redundancy for the gateway IP address. We will not change the way we configure the default gateway on the PC, so we have to do something on the gateways. VRRP makes the routers talk to each other in order to decide who owns the Virtual IP address. The Virtual IP address is assigned as the default gateway on the PCs in our network. The routers talk via the multicast IP address 224.0.0.18.

VRRP has 3 states: Init, Backup (which I will also call Slave) and Master. The Init state means that VRRP is enabled on the router but for some reason it cannot start communicating yet. For example, we have to send VRRP packets via the Fa0/0 interface, but this interface is down. VRRP Slave means that the router is currently the backup. VRRP Master means that the router owns the Virtual IP. The routers have to select the VRRP master. They do this based on the VRRP priority. A higher priority is better. We can configure a VRRP priority from 1 to 254.

Do you remember that in Ethernet LANs we have to encapsulate packets and add the source IP, destination IP, source MAC and destination MAC? Here VRRP has another interesting feature. VRRP introduces not only a Virtual IP but also a Virtual MAC, which has the following format: 00-00-5E-00-01-<GROUP_ID>. The group ID value is limited to the range from 1 to 255 (from 01 to FF in hex). The Virtual IP and Virtual MAC are used only for traffic that goes through the routers; all other traffic uses the physical MAC and physical IP. Let's see an example of how the ARP protocol works with VRRP:
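
The mapping from group ID to Virtual MAC is simple enough to write down. The sketch below uses the prefix that RFC 3768 assigns to IPv4 VRRP, 00-00-5E-00-01, followed by the group ID as two hex digits. The function name vrrp_virtual_mac is made up for this example.

```python
def vrrp_virtual_mac(group_id):
    """Return the IPv4 VRRP virtual MAC for a group ID between 1 and 255."""
    if not 1 <= group_id <= 255:
        raise ValueError("group ID must be between 1 and 255")
    # the last octet is the group ID written as two hex digits
    return "00-00-5E-00-01-%02X" % group_id
```

For example, group 10 maps to a MAC ending in 0A. This is the address that the master puts in its ARP replies for the Virtual IP.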

We must have same values for 3 parameters on both routers:

  • Group ID
  • Group IP address
  • Advertisement interval

Why do we need the advertisement interval, and how fast will the backup router become VRRP master in case of failure? Here is the formula: 3 * advertisement interval + skew time, where skew time = (256 - priority) / 256. The goal of the skew time is to add the time required for message propagation in the LAN. So with an advertisement timer of 1 second and a priority of 100, the switchover takes at most 3*1 + (256-100)/256 = 3.6 s. A pretty good result. We could decrease this value to under 1 second, but I would not recommend that to beginners. VRRP works together with other protocols in the network, and adjustments in this area can create issues for those protocols.
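
The switchover formula is easy to try with different values. This is a small sketch of the calculation as stated above; the function names skew_time and master_down_interval are chosen for the example.

```python
def skew_time(priority):
    """Skew time in seconds: (256 - priority) / 256."""
    return (256 - priority) / 256.0

def master_down_interval(advert_interval, priority):
    """Maximum time in seconds before a backup router takes over.

    advert_interval is the advertisement interval in seconds,
    priority is the VRRP priority of the backup router.
    """
    return 3 * advert_interval + skew_time(priority)
```

A backup with a higher priority has a smaller skew time, so when the master disappears the best backup is the first one whose timer expires.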

Who is the VRRP Master?

The VRRP master is the router with the highest priority, provided that VRRP preemption is enabled on all routers. Let's see an example:

When both routers are up, the VRRP master will always be R2. When R2 is down, or the link between R2 and the switch is down, the VRRP master will be R1. Simple so far.

Let's suppose that we have a network with big fluctuations. Routers go down often. VRRP switchover happens too often and sometimes without any reason. Let's suppose that R2 has preemption on and reboots every 5 minutes. Our default gateway will go up and down every 5 minutes. R1 at the same time is stable and OK. For this case we can turn VRRP preemption off on R2. When R2 comes up after a reboot, it will stay in the VRRP Slave state. It has lost the Master state, and the only way for it to get it back is for R1 to go down.
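
The R1/R2 behaviour with and without preemption can be sketched as a small function. This is a simplified model, not the full state machine from the RFC: it ignores timers and the tie-break between equal priorities, and the name next_master and the dictionary layout are assumptions made for the example.

```python
def next_master(current, routers):
    """Pick the VRRP master among `routers`.

    `routers` maps a router name to a dict with the keys
    'priority', 'up' and 'preempt'. `current` is the name of the
    present master, or None if there is none yet.
    """
    alive = {name: r for name, r in routers.items() if r["up"]}
    if not alive:
        return None
    best = max(alive, key=lambda name: alive[name]["priority"])
    if current in alive:
        # a running master is replaced only by a higher-priority
        # router that has preemption enabled
        higher = alive[best]["priority"] > alive[current]["priority"]
        if higher and alive[best]["preempt"]:
            return best
        return current
    # the old master is gone: the best remaining router takes over
    return best
```

With made-up priorities of 100 for R1 and 120 for R2, R2 takes the master role back after a reboot only when its preempt flag is set. With the flag cleared, R1 keeps the role until it goes down itself.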

VRRP Priority 0 and 255

RFC 3768 reserves VRRP priorities 0 and 255 for two special cases. Suppose R2 is the VRRP master for the network. Based on some event R2 decides to give up the VRRP Master role. It then sends an advertisement with VRRP priority 0, which means that the backup router has to become VRRP Master immediately.

VRRP allows us to use the same IP address as both the Virtual IP and the physical interface address, but on only one router in the network. In this case the router where the physical address is the same as the VIP sends VRRP packets with priority 255.

Decrease VRRP priority in some conditions

The last thing I want to say about VRRP is related to dynamic priority. Network vendors decided to let us adjust the VRRP priority based on some important metrics. For example, router R2 can monitor its link to the ISP and decrease its VRRP priority when this link is down. More than that, you could use Cisco IP SLA (if you have Cisco equipment) to monitor the quality of the link (delay, drops…). In that case the link quality becomes the trigger for the value of the VRRP priority.

In a network with a large number of VLANs it is recommended to balance the VRRP master roles between R1 and R2. For example, R1 is VRRP master for VLANs 10, 20 and 30, while R2 is VRRP master for VLANs 15 and 25. In this way you use all the links in your network. VRRP has to work very closely with protocols like ARP, Proxy-ARP and STP. When you set up VRRP on L3 switches and have redundant links, you have to think about how traffic will flow through the network at L2 and L3. Otherwise suboptimal routing will cancel out all the benefits of these protocols. STP and VRRP timers also have to be in sync, and this is why I recommended not changing the default timers.

During the switchover from one VRRP router to another, the new master sends a gratuitous ARP to announce itself as the new owner of the IP. This makes the network converge faster.

Links:

  1. https://www.ietf.org/rfc/rfc3768.txt
  2. https://tools.ietf.org/html/rfc5798

TCP Performance. Why I couldn’t reach expected speed?

After you read this article you will find answers or suggestions for the following questions:

  • I have 1Gbps NICs on the PC and the server. They are on the same network and all links between them are 1Gbps, but the throughput is low. Why?
  • The TCP header should have 20 bytes, but I saw headers with 32 or 40 bytes. What do those values mean?
  • I heard about different congestion control mechanisms. What is the difference between them?

A TCP connection is established between two end hosts. It doesn't matter how many routers, networks or switches there are in between. TCP asks the application on the remote host to receive or transmit data. This means that TCP interacts directly with the application on the local and the remote host. For example, we open a browser on our PC and enter www.google.com in the address field. This means that we are requesting a web page located on Google's server and served by some web server application, like apache2 or something else.

Let's take a look on the TCP header:

Source: https://ciscoskills.net/2011/02/25/understanding-tcp/tcpheader/

The TCP header itself explains why we sometimes see TCP headers longer than 20 bytes. The Options field allows the header to grow up to 60 bytes (20 bytes of fixed header plus up to 40 bytes of options). Usually we see the biggest headers in the TCP three-way handshake; after that the communication uses smaller headers (20 bytes, or 32 bytes when the Timestamp option is in use).
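The header length is announced in the header itself, in the 4-bit Data Offset field, which counts 32-bit words. A minimal Python sketch shows how the 20, 32 and 40 byte values come out of that field. The helper fake_header and all field values in it are made up for the illustration.

```python
import struct


def tcp_header_length(tcp_header: bytes) -> int:
    """Return the TCP header length in bytes, read from the Data Offset field."""
    # Byte 12 holds Data Offset in its upper 4 bits, counted in 32-bit words.
    data_offset_words = tcp_header[12] >> 4
    return data_offset_words * 4


def fake_header(data_offset_words: int) -> bytes:
    """Build the fixed 20-byte part of a TCP header with made-up field values."""
    return struct.pack(
        "!HHIIBBHHH",
        50000,                    # source port (hypothetical)
        80,                       # destination port
        401,                      # sequence number
        0,                        # acknowledgement number
        data_offset_words << 4,   # data offset + reserved bits
        0x10,                     # flags: ACK
        14600,                    # window
        0,                        # checksum (not computed here)
        0,                        # urgent pointer
    )


for words in (5, 8, 10):
    print(words, "words ->", tcp_header_length(fake_header(words)), "bytes")
```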

TCP is called a reliable, connection-oriented protocol. Let's review what reliable means. The TCP header has fields called Sequence number (seq) and Acknowledgement number (ack). The initial sequence numbers are generated randomly at the beginning of the TCP communication. The acknowledgement number is used to confirm that we received the data sent by the sender. If there is no confirmation, the data will be retransmitted. Let's see how the combination of seq and ack works in the following example:

The first 3 packets in the communication belong to the three-way handshake. Note that the starting sequence numbers on the PC and the server are different (they could be the same). We made them different just to avoid confusion. This example shows the download of the file example.zip from the server. The server has to split the file into segments to send them over the Internet. How many segments will there be? This depends on the MSS (Maximum Segment Size) and the size of the file.

The acknowledgement number has to confirm that we received the data. For example, the server sends us two segments with a total size of 1500 bytes. The server's starting sequence number was 401, so the data occupies bytes 401 to 1900, and the server expects to get a segment with ack 401 + 1500 = 1901, the number of the next byte the PC is waiting for. The server keeps sent but unacknowledged segments in its TX buffer. When the server gets an ack for the received data, it releases those segments from the TX buffer (keep in mind that the TX buffer is limited in size). In the same way we will get an ack number for the last sent segment.

Why does the PC acknowledge the first two segments in one go and the third in a separate ack? The server's task is to send as many segments as it can; the PC's task is to acknowledge received segments as fast as it can. So you could see 1 ACK for 50 segments and then 1 ACK for the next 5 segments.

TCP Performance

TCP can adapt the transfer speed to the link conditions

Let's go back to the TCP header. There we have a very important field called Window. This field has a crucial impact on performance and transfer speed. Let's see why. We have a sample transfer between a server and a client. Let's suppose that the client announces a window of 14600 bytes (Window = receive buffer size). This means that the server can send segments with a total size of up to 14600 bytes before it has to stop and wait for an ack message. Usually the client sends acks fast enough that the window never fills up. If the window does fill up, the server stops sending until the client says that it is ready to receive new data. In other words, a bigger window requires fewer ack packets and increases throughput. The Window field is limited to 16 bits, which means a maximum of 2^16 − 1 = 65535 bytes.

Why does the Window field vary from packet to packet?

When we establish a TCP connection, the Window field has a low value. This value is linked to the default MSS of the OS multiplied by 2, 3 or 10. Within a short time the window value grows very fast until it saturates, increasing with the ACKs sent by the receiver. This ramp-up goes together with the procedure called TCP slow start, in which the sender increases the amount of data it keeps in flight with every ACK it receives.

Here we have a formula which shows us what is maximum speed:

MaxSpeed = Window * 8 / Delay in sec.

65535 * 8 / 0.001 sec = 524 Mbps.

The most interesting thing in this formula is the dependency on delay. When your server is far away and there is a slow link between you, the speed decreases dramatically.

The window also shrinks in case of lost packets, and this happens very fast. If too many packets are lost, the TCP connection is simply closed with an error.
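To get a feeling for how strongly the delay limits the speed, the formula can be evaluated for a few delays. This is a small sketch; the function name and the list of delays are chosen only for the example.

```python
def max_throughput_bps(window_bytes, delay_seconds):
    """MaxSpeed = Window * 8 / Delay, returned in bits per second."""
    return window_bytes * 8 / delay_seconds


# Largest window that fits into the 16-bit Window field.
WINDOW = 65535

for delay_ms in (1, 10, 100, 600):
    mbps = max_throughput_bps(WINDOW, delay_ms / 1000) / 1e6
    print(f"delay {delay_ms:>3} ms -> {mbps:8.2f} Mbps")
```

With the same window, every tenfold increase of the delay cuts the maximum speed by a factor of ten.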

TCP Congestion control mechanism

The Window, sequence number and acknowledgement number fields are responsible for speed control. The server cannot push too much data to the client; this could overflow the RX buffer and all new packets would be dropped (an unacceptable situation for TCP communication). All those rules were collected in the TCP congestion control mechanism called TCP Reno. Nowadays links, NICs and operating systems have much more capacity, which is why changes to the TCP stack inside operating systems are required. See link 1 below for a list of newer proposed mechanisms. For example, CUBIC has been the default congestion control mechanism in Linux since kernel 2.6.19.

Do the new requirements for speed control change the TCP header? No, only a few new options are added and the TCP stack inside the operating system is changed accordingly.

We said that TCP is connection oriented, which means that before exchanging data we have to negotiate and answer the following questions:

  • Are both hosts ready to exchange data between their applications?
  • Which additional options do both hosts support?
  • At the end of the communication we have to close the connection. Operating systems limit the maximum number of connections to avoid overloading the system.

Which options could be interesting for us:

  • Selective Acknowledgement
  • Window Scaling
  • Timestamp options

Let's look inside a packet captured by Wireshark (God bless the people who work on this application).

Please note that the information shown in brackets in Wireshark is not present in the real packet; it is added by Wireshark after analysis to help us understand the communication. Different Wireshark versions present this information differently.

Selective Acknowledgement

We said that the server has to keep sent segments in its TX buffer until it gets an ACK for the data. The next question is how data is retransmitted when a packet is lost. Let's suppose that the PC announced a window of 14000 bytes. The server sends 14 segments of 1000 bytes each. We lost segments #5 and #8. Classical TCP only learns that everything up to #4 arrived, so it may have to retransmit the rest of the window starting from #5. Selective ACK support allows the sender to resend only #5 and #8. It also allows the server to release the acknowledged segments from its TX buffer. I will give an example of how it works a little bit later.
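The difference between the two kinds of acknowledgement can be sketched in Python. This is a simplified model: real SACK blocks carry byte sequence ranges, while here whole segment numbers are used to stay close to the example with 14 segments. The function name is an assumption for the illustration.

```python
def receiver_report(received, total):
    """Return (cumulative_ack, sack_blocks, missing) in segment numbers.

    cumulative_ack is the last segment received without a gap before it,
    sack_blocks are the (first, last) ranges received after a gap.
    """
    received = set(received)

    cumulative_ack = 0
    while cumulative_ack + 1 in received:
        cumulative_ack += 1

    sack_blocks = []
    start = None
    # Walk one step past the last segment so an open block is always closed.
    for n in range(cumulative_ack + 1, total + 2):
        if n in received:
            if start is None:
                start = n
        elif start is not None:
            sack_blocks.append((start, n - 1))
            start = None

    missing = [n for n in range(1, total + 1) if n not in received]
    return cumulative_ack, sack_blocks, missing


arrived = [n for n in range(1, 15) if n not in (5, 8)]
ack, blocks, missing = receiver_report(arrived, total=14)
print("cumulative ack :", ack)       # all the sender knows without SACK
print("SACK blocks    :", blocks)    # extra knowledge with SACK
print("to retransmit  :", missing)
```

Without the SACK blocks the sender only sees the cumulative ack and cannot tell which of the later segments arrived.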

Window Scaling

Let's go back to the speed formula MaxSpeed = Window * 8 / Delay in sec. In Long Fat Networks (networks with high bandwidth and big delay, like satellite links) the delay can reach 600 ms. This means that the maximum speed through such a network would be 65535 * 8 / 0.6 ≈ 0.87 Mbps :). It would be strange to pay a huge amount of money to get less than 1 Mbps just because of a TCP limitation. The Window Scaling option allows us to multiply the announced window by a scale factor to get the effective window size. I will not explain how the scale factor is chosen; see RFC 1323 for more details. I just want to add that window scaling allows up to 1 GB of RX window without an ack :).
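The arithmetic behind window scaling is a simple bit shift: the announced 16-bit window is shifted left by the negotiated scale value, which RFC 1323 limits to 14. The following sketch shows the effective window and the smallest shift that would fill a link. The 1 Gbps link speed and the function names are assumptions for the example.

```python
MAX_SHIFT = 14      # upper limit for the scale value in RFC 1323
MAX_FIELD = 65535   # largest value of the 16-bit Window field


def scaled_window(announced_window, shift):
    """Effective window in bytes for a given announced window and scale shift."""
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift must be between 0 and 14")
    return announced_window << shift


def needed_shift(target_window_bytes):
    """Smallest shift whose maximum window covers target_window_bytes."""
    shift = 0
    while scaled_window(MAX_FIELD, shift) < target_window_bytes and shift < MAX_SHIFT:
        shift += 1
    return shift


# Bytes that must be "in flight" to fill a 1 Gbps link with 600 ms delay.
link_bps = 1_000_000_000   # hypothetical link speed
delay_s = 0.6
in_flight_bytes = link_bps * delay_s / 8

print("bytes in flight :", int(in_flight_bytes))
print("needed shift    :", needed_shift(in_flight_bytes))
print("largest window  :", scaled_window(MAX_FIELD, MAX_SHIFT), "bytes")
```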

Timestamp

The sequence and acknowledgement numbers in the TCP header are limited to 32 bits. Together with the source and destination IP addresses and ports they tell us which stream and application a specific segment belongs to. On heavily loaded servers this is not enough. At speeds of 1 Gbps and above the sequence numbers wrap around so quickly that an old segment could be mistaken for new data in the stream. A timestamp added to the packet header brings one more parameter into the game to tell such segments apart.
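One way to see why 32 bits become tight at high speed is to calculate how long it takes to run through the whole sequence number space, since every byte consumes one sequence number. A small sketch, where the function name and the list of speeds are chosen only for illustration:

```python
SEQUENCE_SPACE_BYTES = 2 ** 32   # one sequence number per byte


def seq_wrap_seconds(rate_bps):
    """Seconds needed to send 2^32 bytes at the given rate in bits per second."""
    return SEQUENCE_SPACE_BYTES * 8 / rate_bps


for label, rate in (("10 Mbps", 10e6), ("100 Mbps", 100e6),
                    ("1 Gbps", 1e9), ("10 Gbps", 10e9)):
    print(f"{label:>8}: sequence numbers repeat after {seq_wrap_seconds(rate):8.1f} s")
```

At 1 Gbps the same sequence number can show up again after about half a minute, which is short enough for a delayed duplicate to still be travelling in the network.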

TCP Fast Retransmission

TCP is a reliable protocol, so it has to retransmit lost segments. How long does a host have to wait before resending a lost segment? A regular timer could be 2 sec, which is too much for high-performance applications. How do we deal with that? We have ack segments which tell us how much data the receiver got. Let's suppose that the server sent 14 segments and segment #7 was lost. The receiver informs the server that it got data up to segment #6, even though more data is present in its buffer. The receiver repeats again and again that it received data up to segment #6, while all new segments are received and buffered. After the third duplicate ACK the server thinks: "Hmmm.. Maybe he indeed didn't get the 7th segment. Let me send it one more time.." (this works even without selective ACK; with it the server also knows exactly which segments are missing). With 1 ms delay between the server and the PC, the retransmission could take 1.5 ms (compared with the regular 2 sec timer). Really impressive, isn't it? When the delay between the server and the PC is bigger, we can see hundreds of DUP ACK messages in the capture (this is how it should be!). For example, with 80 ms delay I saw around 80 DUP ACK messages in the capture across different tests.

Why not adjust the regular TCP retransmission timer to a relevant value instead of using fast retransmission?
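The sender side of this logic is just a counter of repeated ACKs. Here is a minimal sketch, assuming the ACK stream is given as a list of "next expected segment" numbers; the function name is hypothetical and a real stack works with byte sequence numbers.

```python
def fast_retransmit_triggers(acks, threshold=3):
    """Return the ACK values that should trigger a fast retransmission.

    An ACK repeating the previous value is a duplicate; the retransmission
    fires once, when the number of duplicates reaches the threshold.
    """
    last_ack = None
    duplicates = 0
    triggers = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                triggers.append(ack)
        else:
            last_ack = ack
            duplicates = 0
    return triggers


# Segment #7 is lost: the receiver keeps asking for 7 while #8, #9, ... arrive.
ack_stream = [5, 6, 7, 7, 7, 7, 7, 15]
print(fast_retransmit_triggers(ack_stream))
```

The first ACK asking for segment 7 is a normal one; only its three repetitions count as duplicates, and the retransmission is sent long before any timer expires.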

The RTO (Retransmission Timeout) is in fact adjusted as well. We have different network types (Wi-Fi, satellite, fiber…) with different delays, so the RTO cannot be the same for all of them. Right, let's measure the RTT of TCP segments and adjust the RTO based on those measurements. How is it done? Take a look at RFC 6298, which describes how the timer is computed, and at RFC 6582, which refines the recovery after a fast retransmission 🙂

Now we are ready to see the full picture of the TCP congestion control mechanism:

Note: I made one mistake in this sample. After the receiver reports a lost packet, it has to decrease the window size.
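For the curious, the way a retransmission timeout can follow the measured RTT is short enough to sketch. The class below uses the smoothing rules published in RFC 6298 (smoothed RTT, RTT variance, gains of 1/8 and 1/4). The class name, the clock granularity and the RTT samples are assumptions for the illustration, and real stacks add more details such as timer backoff.

```python
class RtoEstimator:
    """Retransmission timeout estimator following the rules of RFC 6298."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance
    K = 4

    def __init__(self, granularity=0.001, min_rto=1.0):
        self.srtt = None
        self.rttvar = None
        self.granularity = granularity   # clock granularity in seconds (assumed)
        self.min_rto = min_rto           # RFC 6298 suggests a floor of 1 second
        self.rto = 1.0                   # initial value before any measurement

    def update(self, rtt):
        """Feed one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, self.K * self.rttvar))
        return self.rto


# The floor is switched off here only to make the adaptation visible.
estimator = RtoEstimator(min_rto=0.0)
for sample in (0.080, 0.080, 0.085, 0.300, 0.082):
    print(f"RTT {sample * 1000:5.0f} ms -> RTO {estimator.update(sample) * 1000:6.1f} ms")
```

A single slow sample raises the timeout immediately, while stable samples let it shrink again step by step.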

How can the information described in this article be useful to us?

We still don't have an answer for one of the questions of this article:

  • I have a 1 Gbps NIC on the PC and on the server. They are on the same network and all links between them are 1 Gbps, but our throughput is low.

We have checked everything on the network, but the issue is still present. Let's take a look at the traffic sniffer:

In this graph we can see that our PC announces a very small window size; together with the latency this results in an average speed of ~21 Mbps. I observed that the speed depends on the application used for the transfer. For example, the built-in FTP client in Windows 7 reached 35 Mbps, while under the same conditions the FileZilla FTP client reached 140 Mbps. Try another application or adjust the parameters in the operating system (do it carefully, it could break other things). The following image shows the TCP parameters available in CentOS 6.5.

More often, TCP performance is affected by a CPU utilization spike on the sender or receiver, or by a buffer overload. A traffic sniffer can help us find out when it happened.

The Wireshark graph shows us that the communication stops for several seconds. Let's explain what happened in the capture at that time:

For some reason our PC decreases its window from 5840 bytes to 2920 bytes. The server sends the last two segments (1460 bytes each) and has to stop until it gets an ack from the PC. The PC then sends a segment where the window value is 0. This means: stop sending traffic, I can't take any more. The server waits and from time to time asks the PC whether it can continue sending. Only after 6 seconds does the client open the window again and receive the remaining data. Why did the PC close the window? You have to analyze the logs on the PC to find the reason.

Links:

  1. TCP congestion control – https://en.wikipedia.org/wiki/TCP_congestion_control
  2. Wireshark application –  www.wireshark.org/
  3. RFC 1323 TCP Extensions for High Performance – www.ietf.org/rfc/rfc1323.txt
  4. RFC 6582  The NewReno Modification to TCP's Fast Recovery Algorithm – https://tools.ietf.org/html/rfc6582

GNS3 – ping doesn't work through the cloud on a VM

First of all I would like to mention that everything published here is for educational purposes. A lot of people who use GNS3 for testing or lab purposes run into problems connecting to the Internet through the cloud.

My problem was: I couldn't ping any external host from a router in GNS3. I met this problem several times on different operating systems with different versions of GNS3. I searched the Internet for a solution many times and every time gave up. This is why I decided to write this article. It could be useful for someone in the same situation.

Let's begin with the topology:

  1. I decided to install GNS3 on Linux Ubuntu 14.04. Big thanks to the people who wrote this article http://www.computingforgeeks.com/2014/12/best-way-to-install-gns3-version-12-in.html and special thanks to those who wrote the installation script.
  2. One of the most important things is that I decided to install GNS3 on a VM, using ESXi as the hypervisor. Below you can see the topology. I selected VM1 for this purpose.
  3. I connected the ESXi host to the Cisco switch using two cables. The port connected to the G0/1 interface was set to trunk. The cable on G0/2 wasn't used (we will use it for debugging later).
  4. My VM1 has two interfaces: one for remote management and the second for testing.
  5. From the VLAN 10 interface of VM1 I could ping the User PC.

VMware_GNS3

Let's move to GNS3 and create the topology:

I am not going to explain how to create the topology and connect it to the cloud; you can easily find how to do it on the Internet. I will just publish a screenshot of my topology:

VMware_GNS3_topology

After creating the topology I configured interface Fa0/0 to get an IP address via DHCP:

R1(config)#int fa0/0
R1(config-if)#ip add dhcp
R1(config-if)#no sh
R1(config-if)#
*Mar  1 00:03:35.071: %LINK-3-UPDOWN: Interface FastEthernet0/0, changed state to up
*Mar  1 00:03:36.071: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/0, changed state to up
R1(config-if)#
*Mar  1 00:03:48.367: %DHCP-6-ADDRESS_ASSIGN: Interface FastEthernet0/0 assigned DHCP address 192.168.140.65, mask 255.255.255.0, hostname R1

OK, I have confirmation that packets are going through the cloud. Everything is good. The next test is to ping the IP address of the CR2 interface:

R1(config-if)#do ping 192.168.140.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.140.254, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Hmm.. Maybe ICMP packets were suppressed. Let's try to ping from the VM: the ping test passed. So we have a situation where some packets can reach GNS3 and others can't. Let's try to find where exactly the problem is. Packets could be stuck on the VM or somewhere in the network. We are connected to the Cisco switch, so we have the option to configure SPAN (port mirroring) in order to check our assumptions. Let's do it.

C2960(config)#monitor session 1 source interface gigabitEthernet 0/1 both

C2960(config)#monitor session 1 destination interface GigabitEthernet 0/2

Please keep in mind that the Gi0/2 interface can't carry regular user traffic after this configuration. In our case this is not a problem because I don't use this interface. On VM2 I ran Wireshark and repeated the ping command from R1 in GNS3. In Wireshark I saw that the packets leave the VM and the replies come back to the host. So the problem seems to be somewhere on the VM side. It could be SELinux or iptables.

Let's try another thing. Let's check how the vmnic0 interface for VLAN10 is set up in the Security tab. To do this, go to vSphere Client -> Configuration -> Networking -> Properties -> double-click your VLAN -> Security tab. I had the following configuration:

VMware_GNS3_error

Let's change promiscuous mode from Reject to Accept and repeat the test. The results:

R1(config-if)#do ping 192.168.140.254

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.140.254, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/20/44 ms
R1(config-if)#

Here was my problem. Promiscuous mode allows the vSwitch to forward all frames, including those which are not addressed to the VM. The router in GNS3 acts as a virtual interface inside the VM, with its own MAC address. For security reasons VMware by default blocks frames which are not addressed to the VM itself.

I hope that this article will be useful for you.
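The behaviour can be pictured with a very simplified model of the decision the vSwitch makes for each frame. This is not VMware's actual implementation, only a sketch of the idea; the function name and the MAC addresses are made up, and multicast is ignored.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def vswitch_delivers(dst_mac, vm_nic_mac, promiscuous):
    """Simplified model: does the vSwitch hand this frame to the VM's port?"""
    if promiscuous:
        return True   # Accept: the port sees every frame
    return dst_mac.lower() in (vm_nic_mac.lower(), BROADCAST)


VM_NIC = "00:50:56:aa:bb:cc"     # hypothetical MAC of the VM's own NIC
GNS3_R1 = "c2:01:11:22:00:00"    # hypothetical MAC of Fa0/0 on R1 in GNS3

for mode in (False, True):
    print("promiscuous =", mode)
    print("  broadcast frame  :", vswitch_delivers(BROADCAST, VM_NIC, mode))
    print("  unicast to VM    :", vswitch_delivers(VM_NIC, VM_NIC, mode))
    print("  unicast to R1    :", vswitch_delivers(GNS3_R1, VM_NIC, mode))
```

In this model broadcast frames reach the VM in both modes, which may be why the DHCP exchange succeeded, while the echo replies sent to the MAC address of R1 were delivered only after promiscuous mode was set to Accept.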

 

How to find out if Cisco supports a specific command or feature

Hello all,

This is my first post on the blog and my first article published in English. It is going to be an interesting experience :)

I decided to write this article to share an experience which could be interesting for some of you. One very common issue with Cisco IOS is finding a command which works for other people but doesn't work for me. Why is that and how do we solve it? I will try to give you some advice below.

Some time ago my task was to configure QinQ VLAN tagging on a Cisco switch. Let's start with what QinQ means. Here you can find an article from Cisco http://www.cisco.com/en/US/docs/ios/lanswitch/configuration/guide/lsw_ieee_802.1q.html which shows that QinQ means tagging one frame twice. What is the reason to use QinQ? Let's suppose that you have one specific VLAN in the network, for example 30. The same VLAN 30 with the same pool of IP addresses exists in another part of your network. This situation is typical for an ISP and its clients. Your task is to keep the traffic from those VLANs separate while transporting it through your network. The solution for that is IEEE 802.1Q tunneling, or in other words a QinQ tunnel.

If you have some experience with Cisco, you will search Google for how to configure IEEE 802.1Q tunneling and find an article like this http://networklessons.com/switching/802-1q-tunneling-q-q-configuration-example/. By the way, it explains very well how to configure this specific feature. Not everyone likes to read the official Cisco docs (this was also my mistake). Let's say that Cisco doesn't present information in as engaging a way as Rene did. I read Rene's article and decided that QinQ is very easy to configure and that I would need one switch and a few minutes for this task. I took a C2960G switch, installed it and started to configure. I went to the interface:

Switch(config-if)#switchport mode ?
  access   Set trunking mode to ACCESS unconditionally
  dynamic  Set trunking mode to dynamically negotiate access or trunk mode
  trunk    Set trunking mode to TRUNK unconditionally

It seems that I will need more than a couple of minutes to configure QinQ. The command switchport mode dot1q-tunnel is missing from my IOS. So I need a new version of IOS. But which one exactly? Another problem could be that this switch model doesn't support the feature at all. So let's see what Google says about QinQ on the C2960. The result is confusing. Some links say that the C2960 doesn't support this feature, while others show the feature working (a common situation). We need to know the answer. Let's go to the Cisco Feature Navigator http://www.cisco.com/web/go/fn. Feature Navigator is a very powerful tool which shows whether a specific feature is supported by a particular IOS. The most difficult thing here is to find the right name of the feature. Try to enter, for example, IPv6 in the feature field and you will see a lot of options. Which option is mine, and does it cover the command I have to enter in the configuration? Search with some assumptions and use the feature descriptions to do it faster. I assumed that my feature must contain 802.1Q in its name, and guessed that the name is IEEE 802.1Q tunneling. The description showed that this could be it.

qinq

I searched Cisco's site and found that this is exactly the thing I was looking for. Now I have the feature name and I see that this feature is supported by a list of IOS versions. The great news is that I have one of these versions on another switch. I repeated my configuration:

Switch(config-if)#switchport mode ?
  access   Set trunking mode to ACCESS unconditionally
  dynamic  Set trunking mode to dynamically negotiate access or trunk mode
  trunk    Set trunking mode to TRUNK unconditionally

Hey.. What is wrong? Feature name? License? It could be.. Let's go back to the internet..

The license on this switch doesn't impose any limitation and the feature name seems to be right. Where to look? Let's try the Cisco configuration guide http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst2960/software/release/15-0_2_se/configuration/guide/scg2960.html. The configuration guide contains the list of commands which are supported by the switch in a specific version of IOS. I didn't find anything about QinQ in the configuration guide..

Conclusion: Cisco Feature Navigator is a great tool, but the search results must be verified against the Cisco configuration guide to make sure that your model of equipment is supported. It is very useful to check this before you upgrade a switch to a new IOS only to find that the command or feature you are looking for isn't working. My switch doesn't support IEEE 802.1Q tunneling. This feature is supported by the C2960X series of switches :(