Instructions for installing Netflix RSS Reader on Amazon EC2
In the last week of December 2013, I built and installed the Netflix example RSS Reader application by following the instructions on the Netflix recipes-rss wiki. See also the Netflix Tech Blog for an overview. There were a lot of dead ends and rabbit warrens along the way, because there are many components to get up and running: infrastructure, networking, JDK, Gradle, source code, Tomcat and Jetty.
Hopefully these step-by-step instructions make it easier for someone.
It's worth noting two default assumptions that aren't immediately clear but do make your life easier:
- All 3 services (Eureka, RSS Middletier, RSS Edge) are installed on the same host - you don't need to create 3 separate machines and network them all together
- The RSS Reader example does not require Cassandra but uses in-memory data storage by default (connecting to Cassandra is optional)
The first set of instructions below shows how to get it running on a single instance. The follow-on instructions describe how to scale it out to separate clusters of nodes, which is where Turbine/Hystrix gets interesting.
On to the instructions...
Basic Single-Instance Installation
# Instructions prefixed with "###" are dead ends I went down - you may save time by skipping them.

# Create the machine
# Create an EC2 instance in the AWS console with the following:
#   AMI base image: CentOS 6 x86_64 with updates
#   Instance: m1.small (WARNING: t1.micro's 600M RAM is insufficient)
# Create a security group and save the key.
# Log in to your new instance as root with the downloaded key.

# Set up the networking configuration to allow the services to talk to each other and allow you to browse to them:
# Configure the AWS security group:
#   Open inbound TCP ports: 22, 80, 9090, 9092, 9191, 9192
#   Ideally (but optionally) expose them only to your IP address instead of the whole world (0.0.0.0/0)

# Configure iptables:
# Flush all existing rules
iptables -F
# Block null recon packets
iptables -A INPUT -p tcp --tcp-flags ALL NONE -j DROP
# Drop new TCP connections that do not start with a SYN packet
iptables -A INPUT -p tcp ! --syn -m state --state NEW -j DROP
# Allow loopback for internal services
iptables -A INPUT -i lo -j ACCEPT
# Open ports 22 (ssh) & 80 (http)
iptables -A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
# Open ports 9090 & 9092 for RSS Reader Edge webserver
iptables -A INPUT -p tcp -m tcp --dport 9090 -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9092 -j ACCEPT
# Open ports 9191 & 9192 for RSS Reader Middletier webserver
iptables -A INPUT -p tcp -m tcp --dport 9191 -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 9192 -j ACCEPT
# Allow return traffic for established connections
iptables -I INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Allow all outgoing connections
iptables -P OUTPUT ACCEPT
# Drop everything else
iptables -P INPUT DROP
iptables -L -n
service iptables save
service iptables restart

# For better security, consider leaving port 80 closed and forwarding requests on port 80 to port 8080.
# Then, in the Tomcat instructions below, we could leave Tomcat on the default 8080 port without requiring the "root" user
# iptables -A PREROUTING -t nat -i eth1 -p tcp --dport 80 -j REDIRECT --to-port 8080
# iptables -A INPUT -p tcp -m tcp --dport 8080 -j ACCEPT

# Install JDK
### I shouldn't have done this first step... it only installs the JRE... Gradle needs the JDK
### yum install -y java-1.7.0-openjdk.x86_64

# Install Oracle JDK as per: http://parijatmishra.wordpress.com/2013/03/09/oraclesun-jdk-on-ec2-amazon-linux/
### Remove OpenJDK. Hopefully not required if you didn't do the above "yum install java-1.7.0-openjdk.x86_64"
### rpm --erase --nodeps java-1.7.0-openjdk java-1.7.0-openjdk-devel
yum install -y wget
# There's probably a better way to get the most recent JDK - but these instructions made it easy
wget --no-check-certificate --no-cookies --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2Ftechnetwork%2Fjava%2Fjavase%2Fdownloads%2Fjdk-7u3-download-1501626.html;" http://download.oracle.com/otn-pub/java/jdk/7u25-b15/jdk-7u25-linux-x64.rpm
mv jdk-7u25-linux-x64.rpm\?AuthParam\=1388300280_9fd087722658cfbb8e571f2d0449beea jdk-7u25-linux-x64.rpm
yum install -y jdk-7u25-linux-x64.rpm
for i in /usr/java/jdk1.7.0_25/bin/* ; do
  f=$(basename "$i"); echo "$f"
  sudo alternatives --install "/usr/bin/$f" "$f" "$i" 20000
  sudo update-alternatives --config "$f"
done
cd /etc/alternatives
ln -sfn /usr/java/jdk1.7.0_25 java_sdk
cd /usr/lib/jvm
ln -sfn /usr/java/jdk1.7.0_25/jre jre
# JAVA_HOME must be set for Gradle to work
echo "export JAVA_HOME=/usr/java/jdk1.7.0_25" >> ~/.bashrc
. ~/.bashrc

### Install Gradle - may not be required as the Netflix build steps below use the self-contained "gradlew" script (which downloads Gradle)
### curl -O "http://downloads.gradle.org/distributions/gradle-1.10-all.zip"
### yum install -y unzip
### cd /opt
### unzip ~/gradle-1.10-all.zip
### cd
### echo "export PATH=$PATH:/opt/gradle-1.10/bin" >> .bashrc
### . ~/.bashrc

# Build RSS Reader Middletier and Edge webapps
# (return to root's home directory first so the clone ends up in /root/recipes-rss)
cd
yum install -y git
git clone https://github.com/Netflix/recipes-rss.git
cd recipes-rss
./gradlew clean build
### This was required to fix error "Error compiling file: /tmp//org/apache/jsp/jsp/rss_jsp.java org.apache.jasper.JasperException: PWC6033: Unable to compile class for JSP"
### javac /tmp/org/apache/jsp/jsp/rss_jsp.java -cp /root/recipes-rss/rss-edge/build/libs/rss-edge-0.1.0-SNAPSHOT.jar:/tmp -source 1.5 -target 1.5

# Install Tomcat 6 (didn't bother with Tomcat 7 as it wasn't available in the default CentOS 6 yum repo, as per "yum search tomcat")
yum install -y tomcat6
sed -i 's/port=\"8080\"/port=\"80\"/g' /etc/tomcat6/server.xml
# Set TOMCAT_USER to "root"
vim /etc/tomcat6/tomcat6.conf
# Replace TOMCAT_USER setting with:
TOMCAT_USER="root"
# End of edit
cd
echo "export TOMCAT_HOME=/usr/share/tomcat6" >> .bashrc
. ~/.bashrc

# Build and deploy Eureka
git clone https://github.com/Netflix/eureka.git
cd eureka/
./gradlew clean build
cp ./eureka-server/build/libs/eureka-server-XXX-SNAPSHOT.war $TOMCAT_HOME/webapps/eureka.war
service tomcat6 start
# Make sure there are no errors
grep "ERROR" /usr/share/tomcat6/logs/catalina.out | less -S
# (takes around 2 minutes to start up; expect some startup errors due to Eureka not running in an established cluster)
# Browse to http://[IP_ADDRESS]/eureka/ <-- trailing slash required

# In another terminal session
# Start RSS Middletier Webserver
export APP_ENV=dev
cd recipes-rss
java -Xmx128m -XX:MaxPermSize=32m -jar rss-middletier/build/libs/rss-middletier-*SNAPSHOT.jar
# Test via Admin port: Browse to: http://[IP_ADDRESS]:9192

# In another terminal session
# Start RSS Edge Webserver
export APP_ENV=dev
cd recipes-rss
java -Xmx128m -XX:MaxPermSize=32m -jar rss-edge/build/libs/rss-edge-*SNAPSHOT.jar
# Test via Admin port: Browse to: http://[IP_ADDRESS]:9092
# Browse to http://[IP_ADDRESS]:9090/jsp/rss.jsp
# Add the following RSS feeds:
#   http://rss.cnn.com/rss/edition.rss
#   http://feeds.washingtonpost.com/rss/politics
#   http://news.yahoo.com/rss/us
#   http://rss.cnn.com/rss/money_autos.rss

# Optional Extras...

# Install Hystrix: https://github.com/Netflix/recipes-rss/wiki/Hystrix-Metrics-%28Optional%29
# Open port 7979 in AWS Security Group
iptables -A INPUT -p tcp -m tcp --dport 7979 -j ACCEPT
iptables -L -n
service iptables save
service iptables restart
git clone https://github.com/Netflix/Hystrix.git
cd Hystrix/hystrix-dashboard
../gradlew jettyRun
# Browse to: http://[IP_ADDRESS]:7979/hystrix-dashboard
# Enter http://[IP_ADDRESS]:9090/hystrix.stream to see the Hystrix metrics show up in the dashboard.
# You will have to send a few transactions from the Edge service to have the Hystrix metrics loaded.
# Stress the Edge webserver and watch the circuit trip into "Open" state:
for i in {1..100}; do curl -s -o /dev/null -w "%{http_code} %{url_effective}\\n" "http://[IP_ADDRESS]:9090/jsp/rss.jsp" & done

# Hystrix Example application: https://github.com/Netflix/Hystrix/tree/master/hystrix-examples-webapp
# Open port 8989 in AWS Security Group
iptables -A INPUT -p tcp -m tcp --dport 8989 -j ACCEPT
iptables -L -n
service iptables save
service iptables restart
cd Hystrix/hystrix-examples-webapp
../gradlew jettyRun
# Browse to: http://[IP_ADDRESS]:8989/hystrix-examples-webapp
# View it on Hystrix dashboard
# Browse to: http://[IP_ADDRESS]:7979/hystrix-dashboard/
# Enter: http://[IP_ADDRESS]:8989/hystrix-examples-webapp/hystrix.stream
# Click "Monitor Stream"
# See the metrics change with this in one window:
curl [IP_ADDRESS]:8989/hystrix-examples-webapp/hystrix.stream
# And this in another window:
while true ; do curl "[IP_ADDRESS]:8989/hystrix-examples-webapp/"; done # <-- The trailing "/" is required

# Install Turbine: https://github.com/Netflix/Hystrix/wiki/Dashboard
curl -L -O https://github.com/downloads/Netflix/Turbine/turbine-web-1.0.0.war
cp turbine-web-1.0.0.war $TOMCAT_HOME/webapps/turbine.war
# Configure Turbine (using Archaius)
vi /root/rss-edge-turbine.properties
# From https://github.com/Netflix/Hystrix/wiki/Dashboard
# Hystrix stream for RSS Edge webapp
turbine.ConfigPropertyBasedDiscovery.default.instances=localhost
turbine.instanceUrlSuffix=:9090/hystrix.stream
# End edit
# Add Archaius config to Tomcat "archaius.configurationSource.additionalUrls"
vi /etc/tomcat6/tomcat6.conf
# Archaius properties for Turbine and Netflix
JAVA_OPTS="${JAVA_OPTS} -Darchaius.configurationSource.additionalUrls=file:///root/rss-edge-turbine.properties"
# End edit
service tomcat6 restart
# Should see this in the logs: "URLs to be used as dynamic configuration source: [file:/root/rss-edge-turbine.properties]"
grep rss-edge /usr/share/tomcat6/logs/catalina.out
# Browse to: http://[IP_ADDRESS]:7979/hystrix-dashboard/
# Enter: http://[IP_ADDRESS]/turbine/turbine.stream
# Click "Monitor Stream"

# Install hystrix-dashboard in Tomcat
curl -o hystrix-dashboard-1.3.8.war "http://search.maven.org/remotecontent?filepath=com/netflix/hystrix/hystrix-dashboard/1.3.8/hystrix-dashboard-1.3.8.war"

# Install hystrix-examples-webapp in Tomcat
cd Hystrix/hystrix-examples-webapp
../gradlew build
cp build/libs/hystrix-examples-webapp-1.3.9-SNAPSHOT.war /usr/share/tomcat6/webapps/hystrix-examples-webapp.war
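Once everything is started, the browser checks from the single-instance steps can also be run from a terminal. The following is only a minimal sketch, assuming curl is installed and that the services listen on the ports used in this post (80 for Eureka on Tomcat, 9192 and 9092 for the admin ports, 9090 for the Edge JSP). The function names endpoint_url, list_endpoints and check_endpoints are my own inventions for illustration and are not part of any Netflix tooling.

```shell
#!/bin/bash
# Smoke-test sketch for the single-instance install.
# Assumption: ports and paths are the ones used in this post.

# endpoint_url HOST PORT PATH -> prints http://HOST:PORT/PATH
endpoint_url() {
  printf 'http://%s:%s%s\n' "$1" "$2" "$3"
}

# list_endpoints HOST -> prints one URL per line for each service to check
list_endpoints() {
  local host=$1
  endpoint_url "$host" 80 /eureka/        # Eureka on Tomcat (trailing slash required)
  endpoint_url "$host" 9192 ""            # RSS Middletier admin port
  endpoint_url "$host" 9092 ""            # RSS Edge admin port
  endpoint_url "$host" 9090 /jsp/rss.jsp  # RSS Edge reader page
}

# check_endpoints HOST -> prints "<http status> <url>" for each endpoint.
# curl reports status 000 when it cannot connect at all.
check_endpoints() {
  local host=$1 url code
  list_endpoints "$host" | while read -r url; do
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url")
    echo "$code $url"
  done
}

# Only hit the network when a host is given on the command line.
if [ $# -ge 1 ]; then
  check_endpoints "$1"
fi
```

Saved as, say, smoke_test.sh (a hypothetical name), it would be run as "bash smoke_test.sh [IP_ADDRESS]". Tomcat takes a couple of minutes to start, so early runs may fail for the Eureka URL until it is up.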
Cluster Install
Turbine really only starts to shine once there are clusters involved. Let's set up a cluster of dedicated Edge and Middletier instances, a dedicated Eureka instance (on Tomcat together with Hystrix Dashboard + Turbine), and an ELB in front of the Edge cluster. Something like this:
        Internet                           Internet
            |                                  |
------------|----------------------------------|---------------------------------------------
            |                                  v
            |                               AWS ELB
            |                                  |
 /----------|------------------------\        /|\
 |          v                        |       v v v
 |  Hystrix Dashboard <-- Turbine <-----   RSS Edge (x3)
 |          |                ^               ^ ^ ^
 |          |               /                 \|/
 |       Eureka    <--------                   |
 |                          ^        |         |
 \--------------------------|--------/        /|\
                            |                v v v
                            ---------- RSS Middletier (x3)
# To make things easier, before we scale out we'll set up some convenience
# hostnames and scripts - in reality we'd come back and automate this
# properly later on with Puppet/Chef/Ansible/Salt/Baked-into-image

# Add an entry with the Private IP address of the server in /etc/hosts e.g.:
127.0.0.1 eureka

vim "recipes-rss/rss-edge/src/main/resources/edge.properties"
eureka.serviceUrl.default=http://eureka/eureka/v2/
# End edit
vim "recipes-rss/rss-middletier/src/main/resources/middletier.properties"
eureka.serviceUrl.default=http://eureka/eureka/v2/
# End edit

# Rebuild both jars
cd /root/recipes-rss && ./gradlew build

# Create the following scripts in root's homedir:

# change_eureka_host.sh
#!/bin/bash
if [ $# -lt 1 ]; then
  echo "Usage: $0 IP_ADDRESS"
  exit 1
fi
sed -i "s/.*eureka/$1 eureka/g" /etc/hosts

# start_rss-edge.sh
#!/bin/bash
cd /root/recipes-rss
nohup java -Xmx128m -XX:MaxPermSize=32m -jar rss-edge/build/libs/rss-edge-*SNAPSHOT.jar &
echo "Output is logged to /root/recipes-rss/logs/rss-edge.log"

# start_rss-middletier.sh
#!/bin/bash
cd /root/recipes-rss
nohup java -Xmx128m -XX:MaxPermSize=32m -jar rss-middletier/build/libs/rss-middletier-*SNAPSHOT.jar &
echo "Output is logged to /root/recipes-rss/logs/rss-middletier.log"

# tail_rss-edge.sh
#!/bin/bash
tail -100f /root/recipes-rss/logs/rss-edge.log

# tail_rss-middletier.sh
#!/bin/bash
tail -100f /root/recipes-rss/logs/rss-middletier.log

# Save the instance as an AMI
# Now create 6 more instances based off this AMI - they can all be t1.micro's
# (in fact we can reprovision our main node as a t1.micro and only run Tomcat on it for Eureka, Hystrix Dashboard, & Turbine)
# Optional: Name them in the EC2 console as rss-edge-N, rss-middletier-N, rss-eureka
# Create 2 more security groups: rss-edge & rss-middletier
#   Ensure 'rss-edge' security group has ports 9090 & 9092 open
#   Ensure 'rss-middletier' security group has ports 9191 & 9192 open
# Create a Load Balancer named 'RssEdgeLoadBalancer' containing all 3
# rss-edge-* nodes. Attach ports 9090 & 9092 to the same ports.
# Optional: add a health check on:
#   Protocol: HTTP
#   Port: 9092
#   Path: /adminres/webadmin/index.html
#   Timeout: 5s
#   Interval: 0.5min
#   Unhealthy Threshold: 2
# Optional: Add a CloudWatch alarm in the Monitoring tab
#   UnHealthyHostCount >= 1 for 1 minute
#   Send message to topic "NotifyMe"

# On the Eureka node add the Private IP addresses of the nodes to /etc/hosts
# e.g.:
127.0.0.1 eureka
172.31.111.333 rss-edge-1
172.31.111.444 rss-edge-2
172.31.111.555 rss-edge-3
172.31.111.666 rss-middletier-1
172.31.111.777 rss-middletier-2
172.31.111.888 rss-middletier-3

# On the Eureka node - modify the all-turbine.properties
vi /root/all-turbine.properties
turbine.ConfigPropertyBasedDiscovery.rss-edge.instances=rss-edge-1,rss-edge-2,rss-edge-3
# End edit

# Create a script to load test the Edge nodes
# hammer_rss_vip.sh
#!/bin/bash
# Usage: hammer_rss_vip.sh [pause_seconds]
# e.g. hammer_rss_vip.sh 0.1 - will pause for 100ms between requests
# e.g. hammer_rss_vip.sh - no pause: fire requests as fast as we can fork processes in the background
RSS_EDGE_VIP=rssedgeloadbalancer-NNNNNNNNNN.xx-region-N.elb.amazonaws.com
DELAY=${1-0}
while true ; do
  curl -s -o /dev/null http://$RSS_EDGE_VIP:9090/jsp/rss.jsp &
  sleep $DELAY
done

# Now watch Turbine via the Hystrix Dashboard while you load test the Edge via the ELB and see the difference for different timings
./hammer_rss_vip.sh 1
./hammer_rss_vip.sh 0.1
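The /etc/hosts entries and the Turbine instance list follow the same rss-edge-N / rss-middletier-N naming, so they can be generated rather than typed by hand. This is a hedged sketch only: the helper names node_names, hosts_lines and turbine_instances_line are hypothetical, and the IP addresses in the usage note are placeholders, not real nodes.

```shell
#!/bin/bash
# Sketch: generate cluster config lines from the node naming used in this post.

# node_names PREFIX COUNT -> prints PREFIX-1 .. PREFIX-COUNT, one per line
node_names() {
  local prefix=$1 count=$2 i
  for ((i = 1; i <= count; i++)); do
    echo "${prefix}-${i}"
  done
}

# hosts_lines PREFIX IP... -> prints "IP PREFIX-N" lines suitable for /etc/hosts,
# numbering the nodes in the order the IP addresses are given
hosts_lines() {
  local prefix=$1 i=1 ip
  shift
  for ip in "$@"; do
    echo "${ip} ${prefix}-${i}"
    i=$((i + 1))
  done
}

# turbine_instances_line CLUSTER COUNT -> prints the
# turbine.ConfigPropertyBasedDiscovery.<CLUSTER>.instances property line
turbine_instances_line() {
  local cluster=$1 count=$2 joined
  joined=$(node_names "$cluster" "$count" | paste -sd, -)
  echo "turbine.ConfigPropertyBasedDiscovery.${cluster}.instances=${joined}"
}
```

For example, "turbine_instances_line rss-edge 3" prints the same property line used in the properties file above, and "hosts_lines rss-edge 10.0.0.1 10.0.0.2" prints two lines, "10.0.0.1 rss-edge-1" and "10.0.0.2 rss-edge-2". The output still needs to be reviewed and appended to the relevant file by hand.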