A report that placing IBM GPFS (General Parallel File System) metadata
on Violin Memory flash arrays made file scans 37 times faster than the
result reported in 2007:
"Violin Memory Breaks Existing General Parallel File System World
Record By 37 Times Using IBM Research Storage Technology"
Violin's Flash Memory Arrays Enable IBM's GPFS to Scan 10 Billion Files
in 43 Minutes, Setting a New Standard for Big Data Applications
July 22, 2011
"... today announced that IBM Research used Violin Memory's 3200
Flash Memory Arrays to break IBM’s previous General Parallel File
System (GPFS) world record. Leveraging Violin's technology, IBM's
GPFS scanned 10 billion files in 43 minutes, 37 times faster than
the previous record of one billion files in three hours."
Note that it is the "scanned" figure that is 37 times faster. This test
did not access any actual file data.
"GPFS Scans 10 Billion Files in 43 Minutes"
Richard Freitas, Joseph Slember, Wayne Sawdon and Lawrence Chiu
IBM Advanced Storage Laboratory, IBM Almaden Research Center
2011 (24 pages)
"By using a small cluster of ten IBM xSeries servers, IBM's cluster
file system (GPFS), and by placing file system metadata on a new
solid-state storage appliance from Violin Memory, IBM Research
demonstrated, for the first time, the ability to do policy-guided
storage management (daily tasks such as file selection for backup,
migration, etc.) for a 10-billion-file environment in 43 minutes.
This new record shatters previous record by factor of 37. GPFS also
set the previous record in 2007."
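As a rough check on the headline number, the scan rates implied by the rounded figures in the press release (10 billion files in 43 minutes, one billion files in three hours) can be compared directly. This is only a sketch: the function name is made up for illustration, and the inputs are the rounded values as quoted. With those inputs the ratio comes out near 42 rather than 37, so the reported factor of 37 presumably rests on more precise timings than the rounded ones. That explanation is an assumption, not something the release states.

```python
def files_per_second(files: int, minutes: float) -> float:
    """Average scan rate for a run of the given length."""
    return files / (minutes * 60)

# Rounded figures as quoted in the press release.
rate_2011 = files_per_second(10_000_000_000, 43)   # 10 billion files, 43 minutes
rate_2007 = files_per_second(1_000_000_000, 180)   # 1 billion files, 3 hours

speedup = rate_2011 / rate_2007

print(f"2011: {rate_2011:,.0f} files/s")
print(f"2007: {rate_2007:,.0f} files/s")
print(f"speedup from rounded figures: {speedup:.1f}x")
```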
"This document describes a demonstration that shows GPFS taking
43 minutes to process the 6.5 TBs of metadata needed for a file
system containing 10 Billion files. This accomplishment combines
the use of enhanced algorithms in GPFS with the use of solid-state
storage as the GPFS metadata store."
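The quoted figures also imply an average metadata footprint per file and an average read rate during the scan. The arithmetic below is a sketch under two assumptions that the quote does not confirm: that "TB" means decimal terabytes, and that the scan reads each byte of metadata exactly once.

```python
METADATA_BYTES = 6.5e12       # 6.5 TB, assuming decimal terabytes
FILE_COUNT = 10_000_000_000   # 10 billion files
SCAN_SECONDS = 43 * 60        # 43 minutes

# Average metadata per file (inodes, directory entries, etc. combined).
bytes_per_file = METADATA_BYTES / FILE_COUNT

# Average read rate if all metadata is read once during the scan.
avg_read_rate_gb_s = METADATA_BYTES / SCAN_SECONDS / 1e9

print(f"metadata per file: {bytes_per_file:.0f} bytes")
print(f"average read rate: {avg_read_rate_gb_s:.2f} GB/s")
```

Under these assumptions the scan averages roughly 650 bytes of metadata per file and about 2.5 GB/s of sustained reads.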
Test configuration (Figure 7: Test stand block diagram, Page 11)
(IBM 3650M2 (Client & NSD) x 2 <= PCIe => Violin 3205) x 4
IBM 3650M2 (Client) x 2
InfiniBand (Total 10 IBM 3650M2s via InfiniBand)
Metadata: the 6.5 TB needed to describe 10 billion files, all on Violin SSD
3.1 Test stand (Page 12)
Violin Memory 3205 Solid-state Storage Systems
*)Aggregate bandwidth 5 GB/s
*)1.8 TBs formatted per 3205, aggregate usable capacity 7.2 TBs
*)Two 14x 128GB partitions
*)Two 10x 180 GB partitions
*)Aggregate 4 KB read operation rate > 1 MIOPS
*)Typical write latency at 4KB: 20us
*)Typical read latency at 4 KB: 90us
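The capacity figures in this list can be cross-checked against each other. The sketch below assumes one particular reading of the partition lines: two of the four 3205 arrays are split into 14 partitions of 128 GB, and the other two into 10 partitions of 180 GB. That reading is an interpretation, not something the list states outright, but it is consistent with the 48 LUNs mentioned in the procedure section.

```python
# Assumed layout: (partition count, partition size in GB) per array.
layouts_gb = [(14, 128), (14, 128), (10, 180), (10, 180)]

per_array_gb = [count * size for count, size in layouts_gb]
total_gb = sum(per_array_gb)
lun_count = sum(count for count, _ in layouts_gb)

metadata_gb = 6500  # 6.5 TB of metadata, in decimal GB
fill_ratio = metadata_gb / total_gb

print(f"per array: {per_array_gb} GB")        # each close to 1.8 TB
print(f"total: {total_gb / 1000:.1f} TB")     # close to the quoted 7.2 TB
print(f"LUNs: {lun_count}")
print(f"metadata fills {fill_ratio:.0%} of usable capacity")
```

On this reading the 6.5 TB of metadata occupies about 90% of the usable flash, which leaves little room for anything else on the arrays.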
3.3 Procedure (Page 16)
"We used the product GPFS, version 3.4, with enhancements that have
subsequently been released into the product service stream.
The Violin storage is configured into 48 LUNs and used to create
a standard file system using a 1 MB block size with all data and
metadata residing solely in solid-state storage. We then populated
the file system with 10 Billion zero length files. Since the file
system policy scan being measured accesses only the files' metadata,
we omitted all file data." ..... "In all cases, the location of
the data has no impact on policy scan's performance. The files were
spread evenly across a little more than 10 Million directories."
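At the quoted scale, 10 billion files over a little more than 10 million directories works out to just under 1,000 files per directory. The paper excerpt does not describe the tool used to populate the file system, so the following is only a toy-scale sketch of the same idea: create zero-length files and spread them evenly across directories. The function name, the directory and file naming scheme, and the counts are all made up for illustration.

```python
import tempfile
from pathlib import Path

def populate(root: Path, n_files: int, n_dirs: int) -> None:
    """Create n_files empty files, spread round-robin over n_dirs directories."""
    dirs = [root / f"d{i:04d}" for i in range(n_dirs)]
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
    for i in range(n_files):
        # touch() creates the file without writing data, so only metadata exists.
        (dirs[i % n_dirs] / f"f{i:08d}").touch()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    populate(root, n_files=1000, n_dirs=10)

    # Files per directory, and the set of file sizes seen.
    counts = [sum(1 for _ in d.iterdir()) for d in sorted(root.iterdir())]
    sizes = {p.stat().st_size for p in root.rglob("f*")}

print(counts)
print(sizes)
```

Because every file is empty, a scan over such a tree touches only directory entries and inode-level attributes, which is the property the quoted procedure relies on.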
General Parallel File System, IBM
Storage Systems, IBM Research - Almaden
Violin 3200, Violin Memory
"IBM Demos Record-Breaking Parallel File System Performance"
July 22, 2011, HPCwire, Blog: From the Editor
For reference, the 2007 report (press release):
"IBM Supercharges Management of Massive Amounts of Data
-- A Billion Files at Lightning Speed"
With New Policy-Based Automation, Latest Version of Parallel File System
Coordinates Tiering of Information for Business or Scientific Use
02 Oct 2007