Overview:
High Availability (HA) computing and resiliency have always played a critical role in commercial mission-critical applications. Likewise, High Performance Computing (HPC) has been an equally significant enabler for the R&D community and its scientific discoveries. Serviceability concerns the effective means by which corrective and preventive maintenance can be performed on a system. Higher serviceability improves availability, and resiliency helps retain quality, performance, and continuity of services at expected levels. Together, the combination of HA, resiliency, and HPC will bring further benefits to critical, shared, major HPC resource environments.

The 5th High Availability and Performance Computing Workshop (HAPCW) 2008 will be held in conjunction with the High-Performance Computer Science Week (HPCSW) 2008 event on April 3-4 at the Grand Hyatt Hotel in Denver, CO, USA.

This workshop aims to provide a forum for researchers to discuss state-of-the-art and ongoing research and development, and to share their findings and ideas in high availability and performance computing (HAPC). Since 2003, we have held four consecutive successful workshops in conjunction with the Los Alamos Computer Science Institute (LACSI) Symposium at the Eldorado Hotel in Santa Fe, NM, USA. This workshop continues that series as part of the High-Performance Computer Science Week. In addition to the presentation of reviewed papers, the workshop will include a panel discussion of relevant topics.

Submission Guidelines:
Original, unpublished work is required. Extended abstracts must not exceed 2 pages (two columns, single-spaced, 10-point font), including tables and illustrations. Accepted contributions will be published on the proceedings website and on a CD, which will be available at the workshop. The final manuscript shall be a maximum of 6 IEEE-style pages in camera-ready format. Please send all extended abstracts by email, in PostScript or PDF format, to Dr. Ben He, hexb@tntech.edu

Topics of interest are those relevant to HAPC including the following:
• Hardware for fault detection and resiliency
• System-level resiliency for HPC
• Statistical methods to improve system resiliency
• Fault tolerance mechanisms and experiments
• Resource management for system resiliency and availability
• Resilient systems based on hardware probes
• Reliability and robustness in HPC applications and systems
• Failure recovery strategies in Grid computing and HPC
• Reliable communication in HPC environments
• Architecture and tools supporting HAPC
• High availability computing
• Experience in creating HAPC environments
• Self-healing, self-configuration, self-optimization, fault prevention, detection and recovery, fault tolerance and autonomic computing
• Configuration, resource and fault management
• Mission critical HPC applications


Web sites:
• HAPCW : http://xcr.cenit.latech.edu/hapcw2008
• HPCSW : http://www.hpcsw.org/

Important Dates:
• Feb 28, 2008 Extended Abstract Due
• March 10, 2008 Acceptance Notification
• March 20, 2008 Final paper due (electronic copy)
• April 3-4, 2008 Workshop (HAPCW) at HPCSW08

 

The HAPCW2008 is supported by  the fastOS program,