Clustering and Load Balancing With tc Server and ERS httpd #s2gx

Mark Thomas – SpringSource

  • Tomcat committer
  • tc Server developer
  • responsible for keeping tc Server and Tomcat in sync
    • memory leak detection in tomcat manager app
    • recent logging improvements
    • simplifying jmx access
    • all of the above started in tc Server, but have been contributed back and implemented in Tomcat
    • don't want to get into having a significant fork of tomcat

Typical Architectures

  • load balancer (round robin) -> httpd (sticky sessions) -> tc Server (clustered)
    • don't go anywhere near tc Server clustering unless you absolutely have to–adds complexity and overhead
    • only thing tc Server clustering gives you is the ability for users not to lose sessions if an instance of tomcat goes down
    • ask yourself how big of a deal it is if your users lose their sessions when an outage occurs–if it's a big deal then you may need clustering

Starting Point

  • ubuntu 8.04.4 64-bit VM
  • vmware tools installed
  • 64-bit sun jdk 1.6.0_21
  • will be installing tc Server, Hyperic, etc. on this clean image

tc Server Installation

  • don't run tc Server as root
  • create a tcserver user
    • owns the tc Server files
    • runs the tc Server processes
  • install to /usr/local/tcserver

Instance Naming and Port Numbering

  • think about this in advance–may wind up with 100s of instances
  • tc01, tc02, etc. as the instance name, then follow this for ports
  • example scheme for ports
    • 1NN80 – http
    • 1NN43 – https
    • 1NN09 – ajp
    • 1NN05 – shutdown (if used)
    • 1NN69 – jmx
    • 1NN20 – cluster communication
  • server and jvmRoute naming–consider linking the server name to the IP address, e.g. srvXXX-tcYY where XXX is the end of the IP address, YY is the tomcat instance number
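Under that scheme, an instance named tc01 (NN = 01) could carry a catalina.properties along these lines–the property names here are illustrative, not tc Server defaults:

```properties
# Port scheme for instance tc01 (NN = 01); property names are examples only
http.port=10180
https.port=10143
ajp.port=10109
shutdown.port=10105
jmx.port=10169
cluster.port=10120
```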

DEMO: Installing tc Server

  • tc Server version names are e.g. apache-tomcat-6.0.29.A.RELEASE where the first part is the version of Tomcat, the "A" means it's the first release of tc Server based on that tomcat release
  • if shutdown port is disabled, doing a kill -15 does a graceful shutdown. kill -9 works too and tomcat won't care, though your application might, so only do -9 if you have to
  • created two instances of tc Server using the tc Server create instance script
  • tc Server comes with templates for startup scripts–copy these over to /etc/init.d and edit as needed
  • parameterize cluster addresses and ports in a catalina properties file
  • can use ${…} notation in server.xml to hit the properties in catalina.properties
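With the ports defined in catalina.properties (names here are illustrative), the same server.xml can be shared across instances, e.g.:

```xml
<!-- server.xml fragment; ${...} values resolve from catalina.properties -->
<Connector port="${http.port}" protocol="HTTP/1.1"
           redirectPort="${https.port}" />
<Connector port="${ajp.port}" protocol="AJP/1.3" />
```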

Creating a Cluster

  • switching to static node membership
    • cumbersome for large clusters
    • remove the <Membership …/> element
    • need to add a bunch of config stuff after the <Interceptor …/> elements
  • easier to use dynamic node discovery
  • backup strategies — tomcat gives you DeltaManager and BackupManager
    • delta manager is simplest–replicates every session to every node in the cluster
    • if your sessions use a lot of memory, delta manager doesn't give you much scalability
    • if your limitation is CPU, delta manager gives you some scalability
    • amount of network traffic on delta manager increases with the square of the number of nodes–not terribly scalable
  • backup manager
    • replicates session data to one other node in the cluster
    • send options: synchronous vs. asynchronous
      • in synchronous, writes session changes to other nodes, waits for acknowledgement, and then sends response to the user. can mean a lag for the user.
      • asynchronous — changes to sessions are put on a queue and the user gets the response immediately. means there's a chance that the cluster will be in an inconsistent state. use of sticky sessions means the consistency of the cluster doesn't really matter.
      • because Java thread scheduling isn't deterministic, in asynchronous mode the session updates may not be processed in the same order in which they were placed on the queue, so if your application depends on that ordering this is a risk
    • no need for the WAR farm deployer — hyperic does this better
      • WAR farm deployer has been removed from tc Server
    • backup manager DOES know where the primary and backup nodes ARE for every session
      • i.e. it doesn't actually store all the sessions from all nodes, but it knows where to get the session it lost
    • backup manager scales much better than delta manager in both memory and network traffic
      • network traffic scales linearly with number of nodes
  • for availability on a small cluster, use the delta manager
  • if you're worried about scalability, go with the backup manager
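A minimal server.xml sketch of a BackupManager cluster with dynamic (multicast) node discovery–the multicast address/port are placeholders; channelSendOptions selects the send option discussed above (8 = asynchronous, 4 = synchronous with acknowledgement):

```xml
<!-- server.xml fragment; multicast address and port are placeholders -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="8"> <!-- 8 = async, 4 = sync with ack -->
  <Manager className="org.apache.catalina.ha.session.BackupManager"/>
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <!-- dynamic node discovery via multicast; remove <Membership/> and add
         static membership interceptors to switch to static node membership -->
    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4" port="45564"
                frequency="500" dropTime="3000"/>
  </Channel>
</Cluster>
```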

Hyperic HQ Installation

  • create an hqs user
  • hqs user owns the hyperic hq agent files
  • the agent itself runs as the tcserver user
  • os security considerations
    • agent doesn't need root privileges to access OS mechanics, start/stop processes, etc.
    • tc Server needs to be able to read WAR files uploaded via the agent
    • don't want tc Server runtime running as root
  • hyperic security considerations
    • don't want agent connecting as hqadmin super user
    • create a dedicated agent user
    • requires create, modify, and delete privileges for platform and platform services only

ERS httpd

  • ERS = Enterprise Ready Server
  • SpringSource's distribution of Apache httpd
  • install ERS as root
    • httpd processes run as nobody:nobody so this is fine
  • remove the test instance
  • create a new instance
  • module configuration
    • enable mod_proxy_balancer
    • enable mod_proxy_ajp
    • mod_proxy_ajp isn't quite as stable as mod_jk and mod_proxy_http
    • mentioned something about mod_http now having remote IP addresses available–need to ask about this
  • configure balancer in ers


<Proxy balancer://tc>
  BalancerMember http://ip.address.here:port route=tc01-uniqueID
  BalancerMember http://ip.address.here:port route=tc02-uniqueID
</Proxy>

ProxyPass /cluster-test balancer://tc/cluster-test stickysession=JSESSIONID|jsessionid
ProxyPassReverse /cluster-test balancer://tc/cluster-test

Debugging Clusters

  • need something in your apps that tells you which cluster node you're on
  • also need something to spit out the session ID so you can test that the sticky sessions are working
  • if the path the proxy exposes differs from the context path in tc Server, session cookies may not work since the cookie paths don't match
    • can use the ProxyPassReverseCookiePath directive
    • easier: just have the proxied path match your context path
  • anything you want replicated in sessions has to be serializable
    • if your application can't support having everything in the session be serializable, terracotta will support non-serializable data in session replication
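A throwaway debug page along these lines makes it easy to verify stickiness–with sticky sessions the session ID ends with the jvmRoute suffix, so it identifies the node; reading the node name from a per-instance system property is an assumption, not a tc Server default:

```jsp
<%-- cluster-debug.jsp: shows which node served the request and the session ID --%>
<html><body>
  <%-- assumes each instance is started with e.g. -Dnode.name=tc01 --%>
  <p>Node: <%= System.getProperty("node.name", "unknown") %></p>
  <p>Session ID: <%= session.getId() %></p>
</body></html>
```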

Gaining Visibility Into Enterprise Spring Applications with tc Server Spring Edition #s2gx

Steve Mayzak – SpringSource

  • tc Server — enterprise version of Tomcat developed by SpringSource
  • Built on Tomcat
  • Survey in 2008: 68% of companies surveyed using Tomcat; most popular lightweight container

tc Server editions

  • developer edition — can get it when you download STS
  • standard edition — application provisioning, server administration, advanced diagnostics
  • spring edition — spring + tomcat stack, spring application visibility, spring performance management

tc Server: Key Highlights

  • Developer efficiency
  • operational control
  • deployment flexibility

Spring Insight

  • Spring Insight — knows about Spring and Grails applications, so can provide specific information about your apps as they’re running
  • When you deploy a WAR to tc Server, Spring Insight gets involved and can get specific information from your apps
  • Spring Insight is currently intended for development use only
  • DEMO: huge amount of details come out of Spring Insight, from the HTTP request details down to the JDBC statements that were run and timings on the various actions during the request
  • can use Insight with other frameworks, etc. as well by adding annotations to your code

Operational Control

  • performance & sla management of spring apps
  • application provisioning and server administration
  • rich alert definition, workflows, and control actions
  • group availability & event dashboards
  • secure unidirectional agent communications
  • tc Server is a combination of Hyperic and Tomcat
  • monitoring is done via valves in Tomcat–isn’t a fork or modified version of Tomcat
  • Hyperic monitors web servers, app servers, databases, caching, messaging, directories, virtualization, etc.
  • Hyperic is also a management tool–admin, provisioning, groups, metrics, alerts, events, access control, upgrades, etc.
    • hyperic is jmx based, runs as an agent on each server

Enterprise Capabilities in tc Server

  • Run multiple instances per install–simplifies tc Server install updates
    • tc Server instances can point to a central set of binaries so upgrades are simpler
  • advanced scalability options
    • non-blocking (NIO) connectors
    • high-concurrency connection pool
  • Can create tc Server templates and create multiple instances from a base template
  • Advanced diagnostics–detects deadlocks and slow-running requests

Hyperic

  • monitoring and deployment
  • can see what apps are running on tc Server, number of sessions on the apps, up/down times, etc.
  • can deploy war files and change context path as you deploy
  • can schedule deployments and have the server auto-restart
  • can deploy from a remote machine (e.g. build server)
  • can remotely stop/start/restart instances
  • can set up scheduled restarts–e.g. make configuration changes during the day, have server auto-restart after hours
  • can schedule recurring restarts (e.g. restart daily at 1 am)
  • hyperic consumes jmx metrics and can show them in context, meaning it shows application metrics in the context of the overall server health (cpu, memory, etc.)
  • with the metrics coming in you can look at the specifics of your application and tune according to what your specific application is doing
    • has everything from cpu, memory, disk I/O metrics, to request metrics, down to spring-specific metrics
  • can add JMX instrumentation to your own classes without having to write your own MBeans–just annotate with @ManagedResource, @ManagedMetric, @ManagedOperation, @ManagedAttribute
    • this lets you get down to questions like “how many bookings per second can I handle in my travel booking application?” which lets you plan infrastructure and scalability in a very granular way
  • hyperic does metric baselining for your specific app so you’ll get alerts based on the baseline metrics for your application
  • can enable/disable metrics and alerts across multiple servers from a single interface
  • can see metrics globally (across a cluster) or individually on each server
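A sketch of the annotation approach–the class, object name, and metric are invented for illustration; it assumes spring-context on the classpath and an MBeanExporter configured for annotation scanning:

```java
import java.util.concurrent.atomic.AtomicLong;

import org.springframework.jmx.export.annotation.ManagedMetric;
import org.springframework.jmx.export.annotation.ManagedOperation;
import org.springframework.jmx.export.annotation.ManagedResource;

// Hypothetical booking service exposed over JMX via Spring's annotations;
// no hand-written MBean interface is needed.
@ManagedResource(objectName = "myapp:type=BookingStats")
public class BookingStats {

    private final AtomicLong bookings = new AtomicLong();

    public void recordBooking() {
        bookings.incrementAndGet();
    }

    @ManagedMetric(description = "Total bookings processed")
    public long getBookingCount() {
        return bookings.get();
    }

    @ManagedOperation(description = "Reset the booking counter")
    public void reset() {
        bookings.set(0);
    }
}
```

Once exported, the metric shows up in any JMX client (and therefore in the Hyperic agent) without further instrumentation code.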

Deployment Flexibility

  • Lean server (10MB) for virtual environments
  • template-driven server instance creation
  • integrated experience with vmware environments
  • open, secure API for all operations
  • server-specific settings like ports, etc. have been taken out and put into a catalina properties file, so server.xml can be applicable to all servers
  • streamlines process of spinning up new server instances
  • shared binaries for upgrades
  • multiple server versions can be installed per machine
  • complete flexibility for various “sizes” of VMs