
Free Research Papers On Distributed Computing Cluster

  •    

    A Computational Model for TensorFlow (An Introduction)

    Martin Abadi, Michael Isard, Derek G. Murray

    1st ACM SIGPLAN Workshop on Machine Learning and Programming Languages (MAPL 2017) (2017)

  •    

    Affinity Clustering: Hierarchical Clustering at Scale

    MohammadHossein Bateni, Soheil Behnezhad, Mahsa Derakhshan, MohammadTaghi Hajiaghayi, Raimondas Kiveris, Silvio Lattanzi, Vahab Mirrokni

    NIPS 2017, pp. 6867-6877

  •    

    Geo-Distribution of Actor-Based Services

    Philip A. Bernstein, Sebastian Burckhardt, Sergey Bykov, Natacha Crooks, Jose Faleiro, Gabriel Kliot, Alok Kumbhare, Muntasir Raihan Rahman, Vivek Shah, Adriana Szekeres, Jorgen Thelin

    Proc. of ACM Programming Languages, OOPSLA (2017)

  •    

    LB3D: A parallel implementation of the Lattice-Boltzmann method for simulation of interacting amphiphilic fluids

    Sebastian Schmieschek, Lev Shamardin, Stefan Frijters, Timm Krüger, Ulf Schiller, Jens Harting, Peter Coveney

    Computer Physics Communications, vol. 217 (2017), pp. 149-161

  •    

    Prochlo: Strong Privacy for Analytics in the Crowd

    Andrea Bittau, Úlfar Erlingsson, Petros Maniatis, Ilya Mironov, Ananth Raghunathan, David Lie, Mitch Rudominer, Ushasree Kode, Julien Tinnes, Bernhard Seefeld

    Proceedings of the Symposium on Operating Systems Principles (SOSP) (2017) (to appear)

  •    

    RFC 8145 - Signaling Trust Anchor Knowledge in DNS Security Extensions (DNSSEC)

    Duane Wessels, Verisign, Warren Kumari, Google, Paul Hoffman, ICANN

    Internet Engineering Task Force (IETF) (2017)

  •    

    RFC 8198 - Aggressive Use of DNSSEC-Validated Cache

    Kazunori Fujiwara, Akira Kato, Warren Kumari

    Internet Engineering Task Force (IETF) (2017)

  •    

    Reliability When Everything Is a Platform: Why You Need to SRE Your Customers

    Dave Rensin

    (2017)

  •  

    SRE your gRPC

    Gráinne Sheerin, Gabe Krabbe

    (2017)

  •    

    Spanner, TrueTime and the CAP Theorem

    Eric Brewer

    Google (2017)

  •    

    Spanner: Becoming a SQL System

    David F. Bacon, Nathan Bales, Nico Bruno, Brian F. Cooper, Adam Dickinson, Andrew Fikes, Campbell Fraser, Andrey Gubarev, Milind Joshi, Eugene Kogan, Alex Lloyd, Sergey Melnik, Rajesh Rao, Dave Shue, Chris Taylor, Marcel van der Holst, Dale Woodford

    Proc. SIGMOD 2017, pp. 331-343 (to appear)

  •   

    Streambox: Modern Stream Processing on a Multicore Machine

    Felix Xiaozhu Lin, Gennady Pekhimenko, Heejin Park, Hongyi Xin, Kathryn Stuart McKinley, Myeongjae Jeon

    The USENIX Annual Technical Conference, San Jose, CA. (2017)

  •    

    Structural Analysis and Optimal Design of Distributed System Throttlers

    Milad Siami, Joëlle Skaf

    IEEE Transactions on Automatic Control, vol. PP, issue 99 (2017)

  •    

    Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering

    KK Yap, Murtaza Motiwala, Jeremy Rahe, Steve Padgett, Matthew Holliman, Gary Baldus, Marcus Hines, TaeEun Kim, Ashok Narayanan, Ankur Jain, Victor Lin, Colin Rice, Brian Rogan, Arjun Singh, Bert Tanaka, Manish Verma, Puneet Sood, Mukarram Tariq, Matt Tierney, Dzevad Trumic, Vytautas Valancius, Calvin Ying, Mahesh Kallahalla, Bikash Koley, Amin Vahdat

    Sigcomm (2017)

  •    

    TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow

    Danijar Hafner, James Davidson, Vincent Vanhoucke

    arXiv preprint arXiv:1709.02878 (2017)

  •    

    Uncanny Valleys in Declarative Language Design

    Mark S. Miller, Daniel von Dincklage, Vuk Ercegovac, Brian Chin

    SNAPL 2017, Summit on Advances in Programming Languages (to appear)

  •    

    Borg, Omega, and Kubernetes

    Brendan Burns, Brian Grant, David Oppenheimer, Eric Brewer, John Wilkes

    ACM Queue, vol. 14 (2016), pp. 70-93

  •   

    Building Blocks for Site Reliability

    Sebastian Kirsch

    International Industry-Academia Workshop on Cloud Reliability and Resilience, EIT Digital, Berlin, Germany (2016)

  •    

    Design patterns for container-based distributed systems

    Brendan Burns, David Oppenheimer

    The 8th Usenix Workshop on Hot Topics in Cloud Computing (HotCloud '16) (2016)

  •    

    DieHard: reliable scheduling to survive correlated failures in cloud data centers

    Mina Sedaghat, Eddie Wadbro, John Wilkes, Sara De Luna, Oleg Seleznjev, Erik Elmroth

    International Symposium on Cluster, Cloud and Grid Computing (CCGrid), IEEE/ACM, Cartagena, Colombia (2016), pp. 52-59

  •    

    Disks for Data Centers

    Eric Brewer, Lawrence Ying, Lawrence Greenfield, Robert Cypher, Theodore Ts'o

    Google (2016), pp. 1-16

  •   

    Distributed Authorization in Vanadium

    Ankur Taly, Asim Shankar

    Foundations of Security Analysis and Design VIII, Springer-Verlag (2016)

  •   

    Distributed Balanced Partitioning via Linear Embedding

    Kevin Aydin, Mohammadhossein Bateni, Vahab Mirrokni

    WSDM 2016: Ninth ACM International Conference on Web Search and Data Mining, ACM (to appear)

  •   

    EYEORG: A Platform For Crowdsourcing Web Quality Of Experience Measurements

    Dina Papagiannaki

    ACM CoNEXT 2016 (2016) (to appear)

  •    

    Engineering Reliability into Sites

    Alexander Perry

    RAM Huntsville (2016)

  •   

    Evaluation Metrics of Service-Level Reliability Monitoring Rules of a Big Data Service

    Keun Soo Yim

    In Proceedings of the IEEE International Symposium on Software Reliability Engineering (ISSRE) (2016), pp. 376-387

  •    

    Evolve or Die: High-Availability Design Principles Drawn from Google's Network Infrastructure

    Ramesh Govindan, Ina Minei, Mahesh Kallahalla, Bikash Koley, Amin Vahdat

    ACM SIGCOMM (2016)

  •    

    Federated Optimization: Distributed Machine Learning for On-Device Intelligence

    Jakub Konečný, H. Brendan McMahan, Daniel Ramage, Peter Richtarik

    Google, Inc. (2016)

  •    

    Firmament: Fast, Centralized Cluster Scheduling at Scale

    Ionel Gog, Malte Schwarzkopf, Adam Gleave, Robert N. M. Watson, Steven Hand

    12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX Association (2016), pp. 99-115 (to appear)

  •    

    Improving Resource Efficiency at Scale with Heracles

    David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, Christos Kozyrakis

    ACM Transactions on Computer Systems (TOCS), vol. 34 (2016), 6:1-6:33

  •    

    Incremental, iterative data processing with timely dataflow

    Derek G. Murray, Frank McSherry, Michael Isard, Rebecca Isaacs, Paul Barham, Martin Abadi

    Communications of the ACM, vol. 59 (2016), pp. 75-83

  •    

    Invent More, Toil Less

    Betsy (Adrienne Elizabeth) Beyer, Brendan Gleason, Dave O'Connor, Vivek Rau

    ;login:, vol. 41, issue 3 (2016), pp. 44-48

  •    

    Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network

    Arjun Singh, Joon Ong, Amit Agarwal, Glen Anderson, Ashby Armistead, Roy Bannon, Seb Boving, Gaurav Desai, Bob Felderman, Paulie Germano, Anand Kanagala, Hong Liu, Jeff Provost, Jason Simmons, Eiichi Tanda, Jim Wanderer, Urs Hölzle, Stephen Stuart, Amin Vahdat

    Communications of the ACM, vol. 59, no. 9 (2016), pp. 88-97

  •    

    Maglev: A Fast and Reliable Software Network Load Balancer

    Daniel E. Eisenbud, Cheng Yi, Carlo Contavalli, Cody Smith, Roman Kononov, Eric Mann-Hielscher, Ardas Cilingiroglu, Bin Cheyney, Wentao Shang, Jinnah Dylan Hosein

    13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), USENIX Association, Santa Clara, CA (2016), pp. 523-535

  •    

    Modular Composition of Coordination Services

    Kfir Lev-Ari, Edward Bortnikov, Idit Keidar, Alexander Shraer

    USENIX Annual Technical Conference (ATC) (2016)

  •    

    Optimizing Distributed Actor Systems for Dynamic Interactive Services

    Andrew Newell, Gabriel Kliot, Ishai Menache, Aditya Gopalan, Soramichi Akiyama, Mark Silberstein

    EuroSys 2016, ACM – Association for Computing Machinery (to appear)

  •    

    RFC 7871 - Client Subnet in DNS Queries

    Carlo Contavalli, Wilmer van der Gaast, David C Lawrence, Warren Kumari

    IETF (2016)

  •    

    RSSAC002 version 2 - RSSAC Advisory on Measurements of the Root Server System

    Warren Kumari, Duane Wessels, Shumon Huque, John Bond, Ray Bellis

    ICANN (2016)

  •    

    Revisiting Distributed Synchronous SGD

    Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Jozefowicz

    International Conference on Learning Representations Workshop Track (2016)

  •    

    Robust Large-Scale Machine Learning in the Cloud

    Steffen Rendle, Dennis Fetterly, Eugene J. Shekita, Bor-yiing Su

    Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, San Francisco, CA, USA (2016)

  •    

    Robust and Probabilistic Failure-Aware Placements

    Madhukar Korupolu, Rajmohan Rajaraman

    ACM Symposium on Parallel Algorithms and Architectures (SPAA), California, USA (2016)

  •    

    SWIFT: Using task-based parallelism, fully asynchronous communication, and graph partition-based domain decomposition for strong scaling on more than 100000 cores.

    Pedro Gonnet

    PASC16, EPFL, Lausanne, Switzerland (2016)

  •    

    TensorFlow: A system for large-scale machine learning

    Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng

    12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX Association (2016), pp. 265-283

  •    

    The Zero Touch Network

    Bikash Koley

    International Conference on Network and Service Management (2016) (to appear)

  •   

    Trumpet: Timely and Precise Triggers in Data Centers

    Masoud Moshref, Minlan Yu, Ramesh Govindan, Amin Vahdat

    ACM SIGCOMM (2016)

  •    

    Ubiq: A Scalable and Fault-tolerant Log Processing Infrastructure

    Alexander Smolyanov, Ashish Gupta, Divy Agrawal, Haifeng Jiang, Manish Bhatia, Manpreet Singh, Monica Chawathe Lenart, Namit Sikka, Navin Melville, Scott Holzer, Shan He, Shivakumar Venkataraman, Tianhao Qiu, Venkatesh Basker, Vinny Ganeshan, Yuri Vasilevski

    Workshop on Business Intelligence for the Real Time Enterprise (BIRTE), Springer (2016)

  •   

    Can Traditional Programming Bridge the Ninja Performance Gap for Parallel Computing Applications?

    Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, Pradeep Dubey

    Communications of the ACM, vol. 58 (2015), pp. 77-86

  •    

    Computing weak consistency in polynomial time

    Wojciech Golab, Xiaozhou (Steve) Li, Alejandro López-Ortiz, Naomi Nishimura

    Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, ACM, New York, NY, USA, pp. 395-404

  •    

    Continuous Pipelines at Google

    Dan Dennison

    SRECon Europe 2015, USENIX, Dublin, Ireland, pp. 12

  •    

    Dynamic iSCSI at Scale: Remote Paging at Google

    Nick Black

    Linux Plumbers Conference 2015

  •    

    Efficient and Scalable Algorithms for Smoothed Particle Hydrodynamics on Hybrid Shared/Distributed-Memory Architectures

    Pedro Gonnet

    SIAM Journal on Scientific Computing, vol. 37(1) (2015)

  •    

    Federated Optimization: Distributed Optimization Beyond the Datacenter

    Jakub Konečný, H. Brendan McMahan, Daniel Ramage

    NIPS Optimization for Machine Learning Workshop (2015), pp. 5

  •    

    Heracles: Improving Resource Efficiency at Scale

    David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, Christos Kozyrakis

    Proceedings of the 42nd Annual International Symposium on Computer Architecture (2015)

  •    

    High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads

    Ashish Gupta, Jeff Shute

    Workshop on Business Intelligence for the Real Time Enterprise (BIRTE), Springer (2015) (to appear)

  •    

    Kubernetes - Scheduling the Future at Cloud Scale

    David K. Rensin

    O'Reilly and Associates, 1005 Gravenstein Highway North Sebastopol, CA 95472, All

  •    

    Large-scale cluster management at Google with Borg

    Abhishek Verma, Luis Pedrosa, Madhukar R. Korupolu, David Oppenheimer, Eric Tune, John Wilkes

    Proceedings of the European Conference on Computer Systems (EuroSys), ACM, Bordeaux, France (2015)

  •    

    Poster Paper: Automatic Reconfiguration of Distributed Storage

    Artyom Sharov, Alexander Shraer, Arif Merchant, Murray Stokely

    The 12th International Conference on Autonomic Computing, IEEE (2015), pp. 133-134

  •    

    RFC7535 - AS112 Redirection Using DNAME

    Warren Kumari, Joe Abley, Brian Dickson, George Michaelson

    IETF RFCs, Internet Engineering Task Force (2015), pp. 16

  •    

    RFC7706 - Decreasing Access Time to Root Servers by Running One on Loopback

    Warren Kumari, Paul Hoffman

    IETF RFCs, Internet Engineering Task Force (2015), pp. 12

  •    

    RSSAC003 - RSSAC Report on Root Zone TTLs

    Warren Kumari

    ICANN Root Server System Advisory Committee ( RSSAC ) Reports and Advisories, Internet Corporation for Assigned Names and Numbers (ICANN) (2015), pp. 35

  •    

    Randomized Composable Core-sets for Distributed Submodular Maximization

    Vahab S. Mirrokni, Morteza Zadimoghaddam

    STOC (2015), pp. 153-162

  •    

    Take me to your leader! Online Optimization of Distributed Storage Configurations

    Artyom Sharov, Alexander Shraer, Arif Merchant, Murray Stokely

    Proceedings of the 41st International Conference on Very Large Data Bases, VLDB Endowment (2015), pp. 1490-1501

  •    

    TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

    Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng

    tensorflow.org (2015)

  •    

    The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

    Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael J. Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills, Frances Perry, Eric Schmidt, Sam Whittle

    Proceedings of the VLDB Endowment, vol. 8 (2015), pp. 1792-1803

  •   

    The rise of cloud computing systems

    Jeffrey Dean

    SOSP History Day (2015), 12:1-12:40

  •  

    Timely Dataflow: A Model

    Martín Abadi, Michael Isard

    FORTE (2015), pp. 131-145

  •    

    Tunable Performance and Consistency Tradeoffs for Geographically Replicated Cloud Services (COLOR)

    Wenbo Zhu, C. Murray Woodside

    Cyber Security and Cloud Computing (CSCloud), 2015 IEEE 2nd International Conference on, IEEE, pp. 457-463

  •   

    Author Retrospective for A NUCA Substrate for Flexible CMP Cache Sharing

    Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhang, Doug Burger, Stephen W. Keckler

    ICS 25th Anniversary Volume, ACM SIGARCH (2014)

  •   

    Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Graphics Processing Units

    Keun Soo Yim

    IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2014), pp. 458-467

  •    

    Connected Components in MapReduce and Beyond

    Raimondas Kiveris, Silvio Lattanzi, Vahab Mirrokni, Vibhor Rastogi, Sergei Vassilvitskii

    SOCC 2014

  •    

    Coupled and k-Sided Placements: Generalizing Generalized Assignment

    Madhukar Korupolu, Adam Meyerson, Rajmohan Rajaraman, Brian Tagiku

    Integer Programming and Combinatorial Optimization (IPCO) (2014)

  •  

    Diff-Index: Differentiated Index in Distributed Log-Structured Data Stores

    Wei Tan, Sandeep Tata, Yuzhe Tang, Liana Fong

    EDBT (2014) (to appear)

  •    

    Distributed Balanced Clustering via Mapping Coresets

    Mohammadhossein Bateni, Aditya Bhaskara, Silvio Lattanzi, Vahab Mirrokni

    NIPS, Neural Information Processing Systems Foundation (2014)

  •    

    Evaluating job packing in warehouse-scale computing

    Abhishek Verma, Madhukar Korupolu, John Wilkes

    IEEE Cluster, Madrid, Spain (2014)

  •   

    Eventually consistent: Not what you were expecting?

    Wojciech Golab, Muntasir R. Rahman, Alvin AuYoung, Kimberly Keeton, Xiaozhou (Steve) Li

    Communications of the ACM, vol. 57, no. 3 (2014), pp. 38-44

  •    

    From Research to Practice: Experiences Engineering a Production Metadata Database for a Scale Out File System

    Charles Johnson, Kimberly Keeton, Charles B. Morrey III, Craig A. N. Soules, Alistair Veitch, Stephen Bacon, Oskar Batuner, Marcelo Condotta, Hamilton Coutinho, Patrick J. Doyle, Rafael Eichelberger, Hugo Kiehl, Guilherme Magalhaes, James McEvoy, Padmanabhan Nagarajan, Patrick Osborne, Joaquim Souza, Andy Sparkes, Mike Spitzer, Sebastien Tandel, Lincoln Thomas, Sebastian Zangaro

    Proceedings of the 12th USENIX Conference on File and Storage Technologies (FAST 2014), USENIX

  •    

    Long-term SLOs for reclaimed cloud computing resources

    Marcus Carvalho, Walfredo Cirne, Francisco Brasileiro, John Wilkes

    ACM Symposium on Cloud Computing (SoCC), ACM, Seattle, WA, USA (2014), 20:1-20:13

  •    

    Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement

    Gwangsun Kim, M. M.-J. Lee, John Kim, Dennis Abts, Michael R. Marty

    IEEE Transactions on Computers, vol. 63, issue 6 (2014), pp. 1487-1500

  •  

    MPIDepQBF: Towards Parallel QBF Solving without Knowledge Sharing

    Charles Jordan, Lukasz Kaiser, Florian Lonsing, Martina Seidl

    SAT (2014), pp. 430-437

  •    

    Macaroons: Cookies with Contextual Caveats for Decentralized Authorization in the Cloud

    Arnar Birgisson, Joe Gibbs Politz, Úlfar Erlingsson, Ankur Taly, Michael Vrable, Mark Lentczner

    Network and Distributed System Security Symposium, Internet Society (2014)

  •    

    Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing

    Ashish Gupta, Fan Yang, Jason Govig, Adam Kirsch, Kelvin Chan, Kevin Lai, Shuo Wu, Sandeep Dhoot, Abhilash Kumar, Ankur Agiwal, Sanjay Bhansali, Mingsheng Hong, Jamie Cameron, Masood Siddiqi, David Jones, Jeff Shute, Andrey Gubarev, Shivakumar Venkataraman, Divyakant Agrawal

    VLDB (2014)

  •    

    Near-Data Processing: Insights from a MICRO-46 Workshop

    Rajeev Balasubramonian, Jichuan Chang, Troy Manning, Jaime H. Moreno, Richard Murphy, Ravi Nair, Steven Swanson

    IEEE Micro (Special Issue on Big Data), vol. 34 (2014), pp. 36-43

Obtaining CPU cycles on an HPC cluster is nowadays relatively simple, and sometimes even cheap, for academic institutions. In most cases, however, providers of HPC services do not allow changes to the configuration, the implementation of special features, or lower-level control of the computing infrastructure and networks, for example to test new computing patterns or to conduct research on HPC itself. The use cases proposed by several departments of the University of Torino, including solid-state chemistry, high-energy physics, computer science, big data analytics, computational biology, genomics and many others, called for different and sometimes conflicting configurations. Furthermore, several R&D activities in scientific computing, with topics ranging from GPU acceleration to Cloud Computing technologies, needed a platform on which to be carried out.

The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a multi-purpose flexible HPC cluster designed and operated by a collaboration between the University of Torino and the Torino branch of the Istituto Nazionale di Fisica Nucleare. It aims to provide a flexible, reconfigurable and extendable infrastructure that caters to a wide range of scientific computing needs, as well as a platform for R&D activities on computational technologies themselves. Extending it with novel CPU, accelerator or hybrid microarchitectures (such as the forthcoming Intel Xeon Phi Knights Landing) should be as simple as plugging a node into a rack.

The initial system has slightly more than 1100 CPU cores and includes different types of computing nodes (standard dual-socket nodes, large quad-socket nodes with 768 GB RAM, and multi-GPU nodes) and two separate disk storage subsystems: a smaller high-performance scratch area, based on the Lustre file system, intended for direct computational I/O, and a larger one, of the order of 1 PB, for archiving near-line data. All the components of the system are interconnected through a 10 Gb/s Ethernet layer with one-level topology and an InfiniBand FDR 56 Gb/s layer in fat-tree topology.

A system of this kind, heterogeneous and reconfigurable by design, poses a number of challenges related to how frequently heterogeneous hardware resources change their availability and shareability status, which in turn affects the methods and means used to allocate, manage, optimize, bill and monitor VMs, virtual farms, jobs, interactive bare-metal sessions, etc. This poster describes some of the use cases that prompted the design and construction of the HPC cluster, its architecture, and a first characterization of its performance using some synthetic benchmark tools and a few realistic use-case tests.