Reprints from my posting to SAN-Tech Mailing List and ...

2011/07/22

[san-tech][03296] Using Violin Memory for metadata: "GPFS Scans 10 Billion Files in 43 Minutes", July 22, 2011

Date: Fri, 22 Jul 2011 14:47:47 +0900
--------------------------------------------------
According to this report, using Violin Memory arrays for IBM GPFS
(General Parallel File System) metadata made file scanning 37 times
faster than the result reported in 2007:

"Violin Memory Breaks Existing General Parallel File System World
 Record By 37 Times Using IBM Research Storage Technology"
 Violin's Flash Memory Arrays Enable IBM's GPFS to Scan 10 Billion Files
 in 43 Minutes, Setting a New Standard for Big Data Applications
 July 22, 2011

  "... today announced that IBM Research used Violin Memory's 3200
   Flash Memory Arrays to break IBM’s previous General Parallel File
   System (GPFS) world record. Leveraging Violin's technology, IBM's
   GPFS scanned 10 billion files in 43 minutes, 37 times faster than
   the previous record of one billion files in three hours."

Note that the 37x figure refers to "scanned" files. This test did not access any actual file data.

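As a quick sanity check on the quoted numbers, the following minimal
sketch computes the average scan rates from the rounded figures in the
press release. The constant and function names are my own, not from the
release. The rounded figures give a ratio of roughly 42, not 37;
presumably the 2007 run took somewhat less than the "three hours" quoted,
but the release does not give the exact time.

```python
# Figures as quoted in the press release (rounded).
NEW_FILES = 10_000_000_000      # 10 billion files (2011)
NEW_SECONDS = 43 * 60           # 43 minutes
OLD_FILES = 1_000_000_000       # 1 billion files (2007)
OLD_SECONDS = 3 * 60 * 60       # "three hours", rounded in the quote


def scan_rate(files: int, seconds: int) -> float:
    """Return the average scan rate in files per second."""
    return files / seconds


new_rate = scan_rate(NEW_FILES, NEW_SECONDS)
old_rate = scan_rate(OLD_FILES, OLD_SECONDS)
speedup = new_rate / old_rate

print(f"2011: {new_rate:,.0f} files/s")
print(f"2007: {old_rate:,.0f} files/s")
print(f"ratio from rounded figures: {speedup:.1f}x")
```

The 2011 rate works out to about 3.9 million files per second.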

"GPFS Scans 10 Billion Files in 43 Minutes"
 Richard Freitas, Joseph Slember, Wayne Sawdon and Lawrence Chiu
 IBM Advanced Storage Laboratory, IBM Almaden Research Center
 2011 (24 pages)

Abstract
  "By using a small cluster of ten IBM xSeries servers, IBM's cluster
   file system (GPFS), and by placing file system metadata on a new
   solid-state storage appliance from Violin Memory, IBM Research
   demonstrated, for the first time, the ability to do policy-guided
   storage management (daily tasks such as file selection for backup,
   migration, etc.) for a 10-billion-file environment in 43 minutes.
   This new record shatters previous record by factor of 37. GPFS also
   set the previous record in 2007."
.....
  "This document describes a demonstration that shows GPFS taking
   43 minutes to process the 6.5 TBs of metadata needed for a file
   system containing 10 Billion files. This accomplishment combines
   the use of enhanced algorithms in GPFS with the use of solid-state
   storage as the GPFS metadata store."
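
To illustrate what a metadata-only "policy scan" means, here is a toy,
single-threaded sketch that selects files older than a given age using
only directory entries and stat information. It is NOT how GPFS
implements its policy engine (the paper describes enhanced parallel
algorithms that are not reproduced here); the function name
select_old_files and the age-based rule are assumptions made for
illustration.

```python
import os
import time
from typing import Iterator, Optional


def select_old_files(root: str, min_age_seconds: float,
                     now: Optional[float] = None) -> Iterator[str]:
    """Yield paths under root whose mtime is older than min_age_seconds.

    Only directory entries and stat metadata are read; file contents
    are never opened.
    """
    if now is None:
        now = time.time()
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Descend into subdirectories later.
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    age = now - entry.stat(follow_symlinks=False).st_mtime
                    if age > min_age_seconds:
                        yield entry.path
```

For example, list(select_old_files("/some/dir", 30 * 24 * 3600)) would
return candidates for migration that have not been modified in 30 days.
Because no file is opened, the cost depends on metadata access speed,
which is why the placement of metadata on flash matters in this test.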


Test configuration (Figure 7: Test stand block diagram, Page 11)
  (IBM 3650M2 (Client & NSD) x 2 <=  PCIe => Violin 3205) x 4
   IBM 3650M2 (Client) x 2
  InfiniBand (Total 10 IBM 3650M2s via InfiniBand)

  Metadata: the 6.5 TB needed to describe 10 billion files, all on Violin SSD
  This test did not access any actual file data
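
The metadata figures can be turned into per-file and per-second numbers
with simple arithmetic. This sketch assumes decimal terabytes and that
each byte of metadata is read exactly once during the scan; neither
assumption is stated in the paper excerpts above.

```python
METADATA_BYTES = 6.5e12        # 6.5 TB of metadata (decimal TB assumed)
FILES = 10_000_000_000         # 10 billion files
SCAN_SECONDS = 43 * 60         # 43 minutes

# Average metadata footprint per file.
bytes_per_file = METADATA_BYTES / FILES

# Average read throughput if every metadata byte is read once (assumption).
throughput_gb_s = METADATA_BYTES / SCAN_SECONDS / 1e9

print(f"metadata per file: {bytes_per_file:.0f} bytes")
print(f"average metadata throughput: {throughput_gb_s:.2f} GB/s")
```

Under these assumptions the scan averages about 650 bytes of metadata per
file and roughly 2.5 GB/s, about half of the 5 GB/s aggregate bandwidth
listed for the Violin arrays.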

3.1 Test stand (Page 12)
Violin Memory 3205 Solid-state Storage Systems
  *)Aggregate bandwidth 5 GB/s
  *)1.8 TBs formatted per 3205, aggregate usable capacity 7.2 TBs
  *)Two 14x 128GB partitions
  *)Two 10x 180 GB partitions
  *)Aggregate 4 KB read operation rate > 1 MIOPS
  *)Typical write latency at 4KB: 20us
  *)Typical read latency at 4 KB: 90us
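
The capacity bullets can be cross-checked against each other. One
reading of the two partition bullets (my assumption, not stated in the
excerpt) is that two of the four arrays were split into 14 partitions of
128 GB and the other two into 10 partitions of 180 GB. Under that
reading the partition count matches the 48 LUNs mentioned in section 3.3
and the total is close to the 7.2 TB aggregate.

```python
# Per-array partition layouts as (count, size in GB), under the
# assumption described above.
LAYOUTS_GB = [
    (14, 128),  # array 1: 14 partitions of 128 GB
    (14, 128),  # array 2
    (10, 180),  # array 3: 10 partitions of 180 GB
    (10, 180),  # array 4
]

lun_count = sum(n for n, _ in LAYOUTS_GB)
capacity_tb = sum(n * size for n, size in LAYOUTS_GB) / 1000

# Fraction of the flash capacity taken by the 6.5 TB of metadata.
metadata_tb = 6.5
fill_fraction = metadata_tb / capacity_tb

# Bandwidth implied by 1 million 4 KB reads per second (the quoted
# aggregate rate is "> 1 MIOPS", so this is a lower bound).
iops_bandwidth_gb_s = 1_000_000 * 4096 / 1e9

print(f"LUNs: {lun_count}")
print(f"capacity: {capacity_tb:.3f} TB")
print(f"metadata fill: {fill_fraction:.1%}")
print(f"4 KB x 1 MIOPS: {iops_bandwidth_gb_s:.3f} GB/s")
```

With these numbers the metadata occupies about 90% of the flash, and the
4 KB read rate implies a little over 4 GB/s, which is consistent with the
5 GB/s aggregate bandwidth figure.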

3.3 Procedure (Page 16)
  "We used the product GPFS, version 3.4, with enhancements that have
   subsequently been released into the product service stream. 
   The Violin storage is configured into 48 LUNs and used to create
   a standard file system using a 1 MB block size with all data and
   metadata residing solely in solidstate storage. We then populated
   the file system with 10 Billion zero length files. Since the file
   system policy scan being measured accesses only the files' metadata,
   we omitted all file data." ..... "In all cases, the location of
   the data has no impact on policy scan's performance. The files were
   spread evenly across a little more than 10 Million directories."
......
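
Ten billion files over a little more than ten million directories is a
little under 1,000 files per directory. A miniature version of such a
population step might look like the sketch below. The function name,
directory and file naming, and the round-robin placement are all
hypothetical; the paper excerpt does not say how IBM populated the file
system.

```python
import os
import tempfile


def populate(root: str, n_files: int, n_dirs: int) -> None:
    """Create n_files zero-length files spread round-robin over n_dirs directories."""
    dirs = []
    for d in range(n_dirs):
        path = os.path.join(root, f"dir{d:04d}")
        os.makedirs(path, exist_ok=True)
        dirs.append(path)
    for i in range(n_files):
        # Mode "x" creates a new file; closing it immediately leaves it empty.
        with open(os.path.join(dirs[i % n_dirs], f"file{i:08d}"), "x"):
            pass


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        populate(root, n_files=1000, n_dirs=10)
        counts = [len(os.listdir(os.path.join(root, d)))
                  for d in sorted(os.listdir(root))]
        print(counts)
```

Since the files are zero length, only metadata (inodes and directory
entries) is created, which matches the point in the quote that the
policy scan touches metadata only.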

General Parallel File System, IBM

Storage Systems, IBM Research - Almaden

Violin 3200, Violin Memory


Commentary article:
"IBM Demos Record-Breaking Parallel File System Performance"
 July 22, 2011, HPCwire, Blog: From the Editor


For reference, the 2007 report (press release):
"IBM Supercharges Management of Massive Amounts of Data 
 -- A Billion Files at Lightning Speed"
 With New Policy-Based Automation, Latest Version of Parallel File System
 Coordinates Tiering of Information for Business or Scientific Use
 02 Oct 2007
