Reprints from my posting to SAN-Tech Mailing List and ...


[san-tech][01874] Report on DRAM reliability

Date: Fri, 09 Oct 2009 22:20:48 +0100
[san-tech][01877] Re: Report on DRAM reliability
Bianca, who previously researched disk reliability in Professor Gibson's group at CMU

"DRAM errors in the wild: A Large-Scale Field Study."
 B. Schroeder, E. Pinheiro, W.-D. Weber. Sigmetrics/Performance 2009
"The goal of this paper is to answer questions such as the following:
 How common are memory errors in practice? What are their
 statistical properties? How are they affected by external factors,
 such as temperature and utilization, and by chip-specific factors,
 such as chip density, memory technology and DIMM age?"

"This paper provides the first large-scale study of DRAM memory
 errors in the field. It is based on data collected from Google's
 server fleet over a period of more than two years making up many
 millions of DIMM days."

Why does Google come up here? The two co-authors are from Google:
Eduardo Pinheiro
Wolf-Dietrich Weber
They are co-authors of
"Failure Trends in a Large Disk Drive Population",  Eduardo Pinheiro,
 Wolf-Dietrich Weber, Luiz Andre Barroso, 5th USENIX Conference on
 File and Storage Technologies (FAST 2007).
Schroeder's presentation at the same FAST 2007:
"Disk failures in the real world: What does an MTTF of 1,000,000 hours
 mean to you?", Bianca Schroeder, Garth Gibson.

FAST 2007 Technical Session

Bianca Schroeder, Assistant professor
Computer Science Department
University of Toronto
PDSI at CMU: Analyzing Failure Data

SIGMETRICS/Performance 2009, June 15 - 19, 2009
Reportedly it received the Sigmetrics Best Presentation Award, but

I learned of the above from James Hamilton's blog:
"You really DO need ECC Memory", October 7, 2009
"Ten Ways to Waste a Parallel Computer",
 Katherine Yelick (Professor, U.C. Berkeley and Director of NERSC),
 Keynote, ISCA 2009, June 22, 2009
Katherine Yelick

ISCA 2009 presentation materials:
[san-tech][03151] "Survey of Error and Fault Detection Mechanisms", Technical report, April 2011
