


Sunday, December 19, 2004

Too Much Information & How To Process

(Via Senthil) Forbes has come out with an excellent article about the glut of data in corporations and the rising expectations for processing it. Excerpts with edits and my comments:

Cheap disk drives make it possible to store every piece of data your company creates. Billions, trillions, even quadrillions of bytes are piling up in computer centers. What to do with it all? Forty-five thousand slot machines generate $4 billion in annual revenue for Harrah's Entertainment, the world's largest casino operator. But that's only part of their job. With each push of the button, each swipe of the casino's loyalty card, these noisy bandits also gather information, zipping records of their 100 million daily spins across the Internet from 28 casinos to a computer in. By morning Harrah's knows which customers should be rewarded with free show tickets, dinner vouchers or room upgrades, enticing them to spend more of their gambling dollars in Harrah's rather than in rival casinos. The company is working to shrink this information loop to a matter of minutes. "We can see how much money is going through a machine, and how frequently it pays out, and how much it pays out, and what type of player is on it, male or female, and what age they are," says Tim S. Stanley, chief information officer at Harrah's in Las Vegas. "It's no different from what a good retailer or grocery store does. We're trying to figure out which products sell, and we're trying to increase our customer loyalty." Companies are always hoping to figure out what makes their customers tick, but never before have they been able to do it on the scale possible today. Casinos, retailers, airlines and banks are piling up volumes so vast it would have been unthinkable only a few years ago. Harrah's data storehouse holds 30 terabytes, or 30 trillion bytes, of data, roughly three times the number of printed characters in the Library of Congress.

"A decade ago the biggest data centers in the U.S. had 10 terabytes of storage, and there were only five or ten of them. Today there are enterprises with 2 or 3 petabytes," says Gil Press, senior director, EMC. Visa, the credit card company, manages more than a petabyte, or 1,000 terabytes. EMC says one of its biggest customers, a global retailer, expects to buy 3 petabytes of capacity this year (not all from EMC). Two years ago the same company bought 300 terabytes. It's the curse of cheap storage. All that customer data is out there, and it seems a shame to throw it away. But doing something with it is almost beyond the reach of the available microprocessors and database software. How do you scroll through a spreadsheet 1 billion rows deep? "The situation we're in is like having a dam that's filling up with water, getting bigger and bigger, and we're trying to get water out of it with a straw," says James Gray, manager of Microsoft's Bay Area Research Center, where scientists are studying ways to speed things up. Storage shipments this year will top 22 exabytes (22 million trillion bytes) of hard disk space, says market researcher IDC. That is four times the space needed to store every word ever spoken by every human being who has ever lived, and it's more than double the amount sold in 2002. By 2006 storage shipments will nearly double again, IDC estimates.
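To keep the units in the excerpt straight, here is a small sketch of the arithmetic behind those figures. It assumes decimal (SI) units, which is how the article uses them (1 petabyte = 1,000 terabytes), and the helper name `annual_growth_factor` is my own, not anything from Forbes or EMC.

```python
# Decimal (SI) storage units, matching the article's usage.
TB = 10**12  # terabyte: one trillion bytes
PB = 10**15  # petabyte: 1,000 terabytes
EB = 10**18  # exabyte: one million terabytes


def annual_growth_factor(start_bytes, end_bytes, years):
    """Compound yearly growth factor between two capacities (hypothetical helper)."""
    return (end_bytes / start_bytes) ** (1 / years)


# Harrah's warehouse: 30 terabytes, i.e. 30 trillion bytes.
harrahs_bytes = 30 * TB

# IDC's shipment figure: 22 exabytes, i.e. 22 million trillion bytes.
shipments_in_million_trillions = (22 * EB) // (10**6 * 10**12)

# EMC's retailer: 300 TB bought two years ago, 3 PB expected this year.
retailer_factor = annual_growth_factor(300 * TB, 3 * PB, 2)

print(harrahs_bytes)                   # 30000000000000
print(shipments_in_million_trillions)  # 22
print(round(retailer_factor, 2))       # 3.16
```

In other words, the retailer's purchases grew tenfold in two years, which works out to roughly tripling every year.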
Demand for storage is so strong that even with prices plunging 35% a year, storage hardware makers EMC and Network Appliance will grow 30% this year, six times the rate of the overall data processing market. "Companies want to look at all the data. They want atomic-level data," says Sanju K. Bansal, chief operating officer at MicroStrategy, whose software analyzes vast databases, looking for customer trends. "Instead of analyzing data at a weekly level and a category level and a regional level (say, sales of men's socks in one week in all the stores in Virginia), now retailers are looking at gold-toe socks by Calvin Klein, in black, size 8, in every store, and by the hour. A decade ago our biggest customers were analyzing 10 gigabytes at a time. Today more than a third of our customers are using data sets that are larger than a terabyte," Bansal says.
As data repositories grow larger, even simple chores like backup and recovery become a giant pain in the neck. "Managing all this stuff in an effective way is just enormously difficult to do," says Scott Thompson, chief information officer at Visa, the credit card company. "We're processing a billion transactions a day. And every one of those gets stored. That's why our storage keeps on growing. You're talking about very large data sets. Just backing them up becomes very difficult." This year Visa's network will process transactions worth $1.7 trillion, up 55% from $1.1 trillion in 2002. Much of the new volume comes from commercial purchases that are more data rich, carrying more than just dollars spent. Last year Visa built a data warehouse that produces 15,000 reports a day based on billions of rows of data, examining statistics on things such as fraud and chargebacks.
At Premier, an alliance of 200 hospitals headquartered in San Diego, some queries were so complex that the company's IBM system couldn't return an answer in any amount of time. "Users would be frustrated. We had all this data, but it was useless," says Gary Feierstein, vice president of information technology at Premier. In 2003 Premier took a flyer on a Boston-area startup called Netezza whose specialized box zips through data 15 times as fast as Premier's IBM system, Feierstein says, enabling managers to query away to their heart's content. No wonder Netezza is on a growth spurt.
Teradata, a division of NCR in Dayton, Ohio, makes the multimillion-dollar machines running the world's biggest data warehouses, for customers like Wal-Mart, Bank of America and Verizon Wireless. Teradata boasts that it is stealing customers from IBM and Oracle, whose products, Teradata says, were designed to handle online transaction processing. "Customers try to use them for data warehousing, and they hit a brick wall," says Stephen Brobst, chief technology officer at Teradata. Its sales grew 11% in the first nine months of this year to $949 million, double the rate of the overall infotech industry.
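Bansal's point about "atomic-level data" is easier to see with a toy example. The sketch below is only an illustration: the rows, field layout and the `roll_up` helper are made up by me and have nothing to do with how MicroStrategy or Teradata actually work. It sums the same sales rows once at the coarse level (region and category) and once at the fine level (store, item and hour).

```python
from collections import defaultdict

# Hypothetical sales rows: (region, store, category, item, hour_of_day, units)
SALES = [
    ("VA", "store-1", "mens-socks", "gold-toe-black-8", 9, 3),
    ("VA", "store-1", "mens-socks", "gold-toe-black-8", 10, 1),
    ("VA", "store-2", "mens-socks", "gold-toe-black-8", 9, 2),
    ("VA", "store-2", "mens-socks", "crew-white-10", 9, 4),
]


def roll_up(rows, key_fn):
    """Sum units sold per group, where key_fn picks the grouping columns."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += row[-1]  # last column holds units sold
    return dict(totals)


# Coarse view: one number for all men's socks in the region.
coarse = roll_up(SALES, lambda r: (r[0], r[2]))

# Atomic view: one number per store, per item, per hour.
atomic = roll_up(SALES, lambda r: (r[1], r[3], r[4]))

print(coarse)       # {('VA', 'mens-socks'): 10}
print(len(atomic))  # 4
```

Four rows collapse into one number in the coarse view but stay as four separate numbers in the atomic view. Scale that up to every item, every store and every hour, and it is easy to see why the data sets jump from gigabytes to terabytes.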

The Teradata setup for Harrah's chugs through hundreds of jobs each day, analyzing 18 terabytes of data streamed in from 28 casinos around the country. The Teradata system helps Harrah's decide where on the floor to place its slot machines, which machines it should buy more of and even how much it should pay for them. A piece of visualization software generates a dynamic "heat map" of a casino floor. Machines glow red as they get busy, then turn blue, or white, as the action moves elsewhere. Because 75% of its 250,000 daily customers use a rewards program card, the casino can track them as they gamble, learning how long they spend at a certain machine, how often they visit a casino and for how many days. Harrah's keeps records on 30 million people and can combine those records with slot records to find out which games are most popular with women, or with tourists or with locals, or how much a certain person won or lost on recent visits and on which machines, and how long he usually plays before calling it quits. Harrah's claims that its loyalty program has boosted its share of customers' gambling budgets from 36% a few years ago to nearly 50%.
Harrah's next ambition is to make these calculations in real time, from the moment a customer slips his rewards card into a slot machine or shows it to the front desk clerk at the hotel check-in. "The hotel clerk can see your history and determine whether you should get a room upgrade, based on booking levels in the hotel at that time and on your past level of play," Stanley says. "A person might walk up to you while you're playing and offer you $5 to play more slots, or a free meal, or maybe just wish you a happy birthday." If Harrah's planned $9 billion acquisition of Caesars Entertainment goes through, Stanley says he might need to double his Teradata system to handle Caesars' volume. He may double his Teradata system next year anyway, deal or no deal. "You just have this endless desire to put more information into the warehouse and to do more with what's already in there."
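The "heat map" idea can be sketched in a few lines. This is purely my own illustration of the concept: the article does not say how Harrah's software decides on a color, so the spins-per-minute measure, the thresholds and the machine names below are all assumptions.

```python
def heat_color(spins_per_minute, busy=8.0, quiet=2.0):
    """Map a machine's activity to a display color (thresholds are assumed)."""
    if spins_per_minute >= busy:
        return "red"    # machine is busy
    if spins_per_minute > quiet:
        return "blue"   # action is moving elsewhere
    return "white"      # machine is idle or nearly idle


# Hypothetical snapshot of recent activity on three machines.
floor_activity = {"slot-A": 12.0, "slot-B": 4.5, "slot-C": 0.0}

floor_colors = {
    machine: heat_color(rate) for machine, rate in floor_activity.items()
}

print(floor_colors)  # {'slot-A': 'red', 'slot-B': 'blue', 'slot-C': 'white'}
```

Recomputing a snapshot like this every few minutes would give the glowing floor plan the article describes, though the real system presumably works from far richer data.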
Sadagopan's Weblog on Emerging Technologies, Trends, Thoughts, Ideas & Cyberworld
"All views expressed are my personal views and are not related in any way to my employer"