High Availability and Performance Computing Workshop (HAPCW 2008)

http://xcr.cenit.latech.edu/hapcw2008/

April 3-4, 2008, Denver, Colorado

HAPCW Co-Chairs

Stephen L. Scott, Oak Ridge National Laboratory and Chokchai "Box" Leangsuksun, Louisiana Tech University

Program Co-Chairs

Xubin (Ben) He, Tennessee Tech University and Christian Engelmann, Oak Ridge National Laboratory



 

Title

Authors


Thursday, April 3, 2008

1:30-2:00pm

Welcome

Stephen Scott and Box Leangsuksun

2:00-2:30pm

A Neural Networks Approach for Intelligent Fault Prediction in HPC Environments

Kulathep Charoenpornwattana, Chokchai Leangsuksun, Anand Tikotekar, Geoffroy Vallée, Stephen L. Scott

 

2:30-3:00pm

An Online Controller Towards Self-Adaptive File System Availability and Performance

 

Xin Chen, Ben Eckart, Xubin He, Christian Engelmann, Stephen Scott

 

3:00-3:30pm

Coffee break

 

3:30-4:00pm

Using Log Information to Perform Statistical Analysis on Failures Encountered by Large-Scale HPC Deployments

Narate Taerat, Nichamon Naksinehaboon, Clayton Chandler, James Elliott, Chokchai (Box) Leangsuksun, George Ostrouchov, Stephen L. Scott

 

4:00-4:30pm

Towards a Fault-aware Computing Environment

Xian-He Sun, Zhiling Lan, Yawei Li, Hui Jin, and Ziming Zheng

 

Friday, April 4, 2008

 

8:00-8:30am

Reliability Aware Optimal K Node of Parallel applications in Large Scale HPC Systems

Narasimha R. Gottumukkala, Chokchai Box Leangsuksun, Raja Nassar, Mihaela Paun, Dileep Sule, Stephen L. Scott

 

8:30-9:00am

An Efficient Virtual Machine Checkpointing Mechanism for Hypervisor-based HPC

K. Chanchio, C. Leangsuksun, H. Ong, V. Ratanasamoot, A. Shafi

9:00-9:30am

Impact of Fault-Tolerance Policies: Feasibility Study

Geoffroy Vallée, Anand Tikotekar, Chokchai Leangsuksun, Stephen L. Scott

9:30-10:00am

Coffee break

 

10:00-11:30am

Panel Discussion

 

11:30am

Concluding Remarks

Stephen Scott and Box Leangsuksun