|Cloud, Digital, SaaS, Enterprise 2.0, Enterprise Software, CIO, Social Media, Mobility, Trends, Markets, Thoughts, Technologies, Outsourcing|
Thursday, September 16, 2004

Following the disastrous outages of 1999, online auction company eBay Inc. launched an overhaul of its IT infrastructure, rebuilding its data centers around a grid-type architecture and rewriting its applications in Java 2 Platform, Enterprise Edition (J2EE). That process was completed in July. The site has been deconstructed and reconstructed while it was live, while transactions were happening. In the course of about two years, both the infrastructure and the software for the site have been rearchitected.

At eBay, before the outage, all transactions hit one massive database, which was the point of contention, along with some storage problems. Most of the meltdown was due to a monolithic application addressing everything in a single, monolithic database. One application held all the functions of eBay, with the exception of search, says Marty Abbott, senior vice president of technology.

Some of the fundamental changes made at eBay: "Job one was to eliminate the single point of failure in the huge, monolithic system. Then came reconstruction of the application to ensure that we had fault isolation and that processes and tasks of like size and cost weren't congesting and competing with each other. We disaggregated the monolithic system and ensured scale and fault tolerance. A good way to think about it is that it's one of the first examples of grid computing. It's an array of systems, each of which has a service component that answers to another system: fault tolerance meant to allow for scale. As a matter of fact, we would have potential vendors and partners come in and try to sell us on the idea of grid computing, and we'd say, 'It sounds an awful lot like what we were doing. We didn't know there was a name for it.'"

He adds, "We went from one huge back-end system and four or five very large search databases. Search used to update in 6 to 12 hours from the time someone placed a bid or listed an item for sale.
Today, updates are usually less than 90 seconds. The front end in October '99 was a two-tiered system with [Microsoft Corp.] IIS [Internet Information Services] and ISAPI [Internet Server API]. The front ends were about 60 [Windows] NT servers. Fast-forward to today: we have 200 back-end databases, all of them in the 6- to 12-processor range, as opposed to having tens of processors before. Not all of those are necessary to run the site; we have that many for disaster recovery purposes and for data replication. We have two data centers in Santa Clara County [Calif.], one data center in Sacramento [Calif.] and one in Denver. When you address eBay or make a request of eBay, you have an equal chance of hitting any of those four."

We've taken a unique approach with respect to our infrastructure. In a typical disaster recovery scenario, you have to have 200 percent of your capacity (100 percent in one location, 100 percent in another), which is cost-ineffective. We have three centers, each with 50 percent of the traffic, actually 55 percent when you add in some bursts.

We use Sun [Microsystems Inc.] systems, as we did before. We use Hitachi Data Systems [Corp.] storage on Brocade [Communications Systems Inc.] SANs [storage area networks] running Oracle [Corp.] databases, and we partner with Microsoft for the [Web server] operating system. IBM provides the front and middle tiers, and we use WebSphere as the application server running our J2EE code, the stuff that is eBay. The code has also been migrated from C++ to Java, for the most part; 80 percent of the site runs in Java within WebSphere.
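The disaggregation described above, going from one monolithic database to some 200 partitioned back ends, can be illustrated with a minimal sketch. This is hypothetical code, not eBay's actual implementation: the class name, the hash-based routing, and the partition count are all assumptions for illustration; the only figure taken from the interview is the roughly 200 back-end databases.

```java
// Hypothetical sketch of routing requests to one of many database
// partitions instead of a single monolithic database. Each key maps
// stably to one partition, so load and failures are isolated per pool.
public class PartitionRouter {
    private final int partitionCount;

    public PartitionRouter(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    // Stable, non-negative mapping from an item id to a back-end database.
    public int partitionFor(String itemId) {
        return Math.floorMod(itemId.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        PartitionRouter router = new PartitionRouter(200); // ~200 back ends cited
        System.out.println("item-12345 -> db " + router.partitionFor("item-12345"));
    }
}
```

The same arithmetic explains the quoted disaster-recovery trade-off: three active centers at roughly 55 percent capacity each totals 165 percent of required capacity, so losing any one center still leaves about 110 percent available, versus 200 percent provisioned in a traditional active/standby pair.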
He explains, "We believe the infrastructure we have today will allow us to scale nearly indefinitely. There are always little growth bumps, new things that we experience, and not a whole lot of folks from whom we can learn. But using the principles of scaling out rather than scaling up; disaggregating wherever possible; attempting to avoid state, because state is very costly and increases your failure rate; partnering with folks like Microsoft, IBM, Sun and Hitachi Data Systems, where they feel they have skin in the game and are actually helping us to build something; and then investing in our people, along with commodity hardware and software: applying those principles, we think we can go indefinitely. We're in a continuous state of re-evaluation, and we're not afraid to swap out where necessary.

"We deliver the content for most countries from the U.S. The exceptions are Korea and China, which have their own platforms. In the other 28 countries, when you list an item for sale or when you attempt to bid on or buy an item, that request comes back to the U.S. We distribute the content around the world through a content delivery network: we put most of the content that's downloaded, except for the dynamic pieces, in a location near where you live. That's about 95 percent of the activity, making the actions or requests that come back to eBay in the U.S. very lightweight. A page downloads in the U.K. in about the same time that it downloads in the U.S., thanks to our partner Akamai [Technologies Inc.], whose content delivery network resides in just about every country, including China.

"The vision for business intelligence is that we're in a continual improvement process, where our community tells us in real time what's working and what isn't. We've got 114 million people working on our behalf, telling us what to do. We're the heart, the brain, the soul. The greatest thing about this place is that we get real-time feedback from everyone."
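The "avoid state" principle quoted above can be sketched in a few lines. This is an assumed illustration, not eBay's code: if a front-end server keeps no per-user session in memory and all context arrives with the request, then any of the identical servers in a pool can answer any request, and capacity scales out simply by adding boxes.

```java
import java.util.List;

// Hypothetical illustration of a stateless front end: the handler
// remembers nothing between calls, so every server gives the same
// answer for the same request and the pool can grow horizontally.
public class StatelessFrontEnd {
    // All context (user, action) travels with the request itself.
    public static String handle(String userId, String action) {
        return "user=" + userId + " action=" + action + " ok";
    }

    public static void main(String[] args) {
        // Any server in the pool yields an identical response.
        List<String> pool = List.of("fe-1", "fe-2", "fe-3");
        for (String server : pool) {
            System.out.println(server + ": " + handle("u42", "bid"));
        }
    }
}
```

The design point is that statelessness removes session affinity: a load balancer can send each request anywhere, and a server failure loses no user data, which is what makes the "equal chance of hitting any of those four" routing described earlier workable.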
An inspiring and insightful account of how IT is deployed inside one of the most famous online institutions in the world.
|Sadagopan's Weblog on Emerging Technologies, Trends, Thoughts, Ideas & Cyberworld