FPGA routing is one of the most time-consuming steps of FPGA compilation, often preventing fast edit-compile-test cycles in prototyping and development. There have been attempts to accelerate FPGA routing using algorithmic improvements, multi-core or multi ...
Despite the high costs of acquisition and maintenance of modern data centers, machine resource utilization is often low. Servers running online interactive services are over-provisioned to support peak load (which only occurs for a fraction of the time), d ...
Simulations of electrical activity of networks of morphologically detailed neuron models allow for a better understanding of the brain. State-of-the-art simulations describe the dynamics of ionic currents and biochemical processes within branching topologi ...
As hardware evolves, so do the needs of applications. To increase the performance of an application,
there exist two well-known approaches. These are scaling up an application, using a larger multi-core
platform, or scaling out, by distributing work to mul ...
Nowadays, the ubiquity of smart appliances in our everyday lives is increasingly strengthening the links between humans and machines. Beyond making our lives easier and more convenient, smart devices are now playing an important role in personalized health ...
This paper presents shared-variable synchronization approaches for dataflow programming. The mechanisms do not require any substantial model of computation (MoC) modification, and is portable across both for hardware (HW) and software (SW) low-level code s ...
Our work addresses the problem of placement of threads, or virtual cores, onto physical cores in a multicore NUMA system. Different placements result in varying degrees of contention for shared resources, so choosing the right placement can have a large ef ...
Modern computing systems are based on multi-processor systems, i.e. multiple cores on the same chip. Hard real-time systems are required to perform particular tasks within certain amount of time; failure to do so characterises an unaccepted behavior. Hard ...
The limitations of clock frequency and power dissipation of deep sub-micron CMOS technology have led to the development of massively parallel computing platforms. They consist of dozens or hundreds of processing units and offer a high degree of parallelism ...
This paper investigates the problem of deriving a white box performance model of Hardware Transactional Memory (HTM) systems. The proposed model targets TSX, a popular implementation of HTM integrated in Intel processors starting with the Haswell family in ...