presentations:high_performance_computing_and_engineering

Differences

This shows you the differences between two versions of the page.

presentations:high_performance_computing_and_engineering [2018/03/14 20:34]
shriyaa
presentations:high_performance_computing_and_engineering [2018/03/15 15:46] (current)
shriyaa
Line 15:
 ==== Chaos Minios ===
  
-In her own words, Prachi makes a living at Intel Corp by breaking supercomputers and she is proud of it! In her work as a High Performance Computing (HPC) Systems Engineer, she tries to harass the HPC system and then determines diagnostics on the observed failures of the system. In a complex system such as HPC, there are many components in the underlying architecture. There are compute nodes which can be thousands in number, both CPUs and GPUs, login nodes, storage disks, a management node, a high performance interconnect which varied topologies, among others. With this level of complexity, the system will inevitably fail, it is with such surety that the quality of the system is measured via a metric called MTTF - Mean Time To Fail (Umm, not the most encouraging part of #GHC17 for me)! As a results, there was a need for a tool to harass the HPC system in order to determine the weak links and the loads the different components are able to handle. This is the space that Chaos Minions was developed to fill. Chaos Minions has the capability to run harassor scripts for different components of the system or even combine multiple components to harass. Not only does it run tests, it tracks the different test cases in order to ensure that they can be reproduced at a later date to test improvements or determine better statistics. Although, the program has not been tested on large scale yet, it is soon in the pipeline to //break a large supercomputer// via harassing it.
+In her own words, Prachi Janardan makes a living at Intel Corp by breaking supercomputers and she is proud of it! As a High Performance Computing (HPC) Systems Engineer, she tries to harass the HPC system and then determines diagnostics on the observed failures of the system. In a complex system such as HPC, there are many components in the underlying architecture.
+There are thousands of compute nodes - comprising of both CPUs and GPUs, login nodes, storage disks, a management node, a high performance interconnect with varied topologies, among others. With this level of complexity, the system will inevitably fail. It is with such surety that the quality of the system is measured via a metric called MTTF - Mean Time To Fail (hmm ... not the most encouraging part of #GHC17 for me)! As a result, there was a need for a tool to burden the HPC system in order to characterize the weak links and the load limits that the different components are able to handle. This is the space that Chaos Minions was developed to fill. Chaos Minions has the capability to run harassor scripts for different components of the system or even combine multiple components to harass. Not only does it run tests, it tracks the test cases in order to ensure that they can be reproduced at a later date to test improvements or for better statistics. Although, the program has not been tested on large scale yet, it is soon set to break a large supercomputer via harassing it.
  
 ==== Building-up from the Bottom-up for Faster "1"s and "0"s ====
  
-Jeewika, a quick witted lady (although not the only humorous woman I had the opportunity to hear at #GHC17, @smithmegan @NoraDenzel), described the new microprocessor from Oracle. With the Moore's law reaching its limit due to physical constraints, the need for the new range of microprocessors is to build-up vertically. It is not a new conceptit has been used to solve the space crunch in metropolitan cities bu building skyscrapers and vertical gardens, and so the question Oracle asked was, why not transistors? The result was the newest processor called FinFET which has high capabilities, although initially the team did face large number of obstacles due to their novel design high parasitic capacitance, high parasitic resistance and the need for double/triple patterning. However, which the release just two weeks ago, Oracle sees the potential of such a nanometer sized microprocessor in high-end networks, wearables and automotive. It is no wonder that 18 new generations of the same are planned in the future. Whether the Moore's law is or is not valid anymore, the ingenuity of the idea is great. In another talk at another session about neuro-morphic computer hardware design, another architecture designed noted that, at least the number of people who believe Moore's law is dead doubles every two years!
+Jeewika Ranaweera, a quick witted lady (although not the only humorous woman I had the opportunity to hear at #GHC17, @smithmegan @NoraDenzel), described the new microprocessor from Oracle. With the Moore's law reaching its limit due to physical constraints, the need for the new range of microprocessors is to build-up vertically. It is not a new concept it has been used to solve the space crunch in metropolitan cities by building skyscrapers and vertical gardens.
+Hence, the question Oracle asked was: Why not transistors? The result was the newest processor called FinFET which has high capabilities, although initially the team did face huge obstacles due to their novel design such as high parasitic capacitance, high parasitic resistance and the need for double/triple patterning. However, which the release just two weeks ago, Oracle sees the potential of such a nanometer sized microprocessor in high-end networks, wearables and automotive. It is no wonder that 18 new generations of the same are planned in the future. Whether the Moore's law is or is not valid anymore, the ingenuity of the idea is great. In another talk at another session about neuro-morphic computer hardware design, another architecture designer noted that, at least the number of people who believe Moore's law is dead doubles every two years!
presentations/high_performance_computing_and_engineering.txt · Last modified: 2018/03/15 15:46 by shriyaa