Beyond the Trundling: A Tightly Coupled Multicore Processor for Extreme Performance in Space Missions

    Paper number

    IAC-18,D1,IP,18,x43480

    Author

    Dr. Hui Cao, China, Xi'an Microelectronics Technology Institute, China Academy of Space Electronics Technology (CASET), China Aerospace Science and Technology Corporation (CASC)

    Coauthor

    Mr. Weiqiang He, China, Xi'an Microelectronics Technology Institute, China Aerospace Science and Technology Corporation (CASC)

    Coauthor

    Mr. Fei Yu, China

    Coauthor

    Mr. Yulin Jin, China, Xi'an Microelectronics Technology Institute, China Academy of Space Electronics Technology (CASET), China Aerospace Science and Technology Corporation (CASC)

    Year

    2018

    Abstract
    This article describes the architectural exploration of a general-purpose computing platform for next-generation, extreme-performance spaceborne electronics, from both the system and the application perspectives. In terms of processor architecture, bandwidth and processing units are the two essential factors affecting computational performance. Because missions demand extreme computing capability, we focus on analyzing the relationship between data transfer and computation for a set of signal-processing kernels used in applications such as remote sensing, landing, interactive docking, and reentry.
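    The balance between data transfer and computation can be illustrated with a rough arithmetic-intensity estimate (operations per byte moved). The sketch below is not taken from the paper: it assumes the commonly used operation counts of 2n^3 for an n x n matrix multiplication and 5 N log2 N for a complex N-point FFT, and it assumes each operand is moved to or from memory exactly once. The function names and element sizes are hypothetical.

```python
import math

def gemm_intensity(n, bytes_per_element=4):
    """FLOPs per byte for an n x n GEMM (assumed 2*n^3 FLOPs)."""
    flops = 2 * n**3
    # Assumption: A and B are read once and C is written once.
    traffic = 3 * n * n * bytes_per_element
    return flops / traffic

def fft_intensity(n, bytes_per_element=8):
    """FLOPs per byte for a complex n-point FFT (assumed 5*n*log2(n) FLOPs)."""
    flops = 5 * n * math.log2(n)
    # Assumption: the input is read once and the output is written once.
    traffic = 2 * n * bytes_per_element
    return flops / traffic

if __name__ == "__main__":
    print(gemm_intensity(64))   # grows linearly with n
    print(fft_intensity(1024))  # grows only with log2(n)
```

    Under these assumptions the intensity of GEMM grows linearly with the matrix size, while that of the FFT grows only logarithmically, so the FFT depends far more on the available bandwidth.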
    
    Traditionally, a Von Neumann computer completes a computation by accessing data, computing, and then storing the result. This flow is realized by a series of instructions, which suffers from 'bubbles' caused by resource conflicts, such as contention for registers or memory accesses to the same addresses. We call this 'trundling', since the bubbles open a gap between achieved and peak performance. Trundling can significantly degrade the performance of tasks such as signal or image processing and matrix calculation. New processor architectures should therefore be developed to meet these demands, going beyond conventional architectures such as SPARC, LEON, and PowerPC, or customized circuits on a programmable platform.
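    How dependent instructions create bubbles can be sketched with a toy in-order issue model. This is an illustrative assumption, not the analysis used in the paper: one instruction issues per cycle, every result becomes available a fixed `latency` cycles after issue, and an instruction stalls until all of its source registers are ready. The function name and the register names are hypothetical.

```python
def issue_cycles(instrs, latency):
    """Return (issue slots elapsed, bubbles) for a list of (dest, sources)."""
    ready = {}   # register name -> cycle at which its value is available
    cycle = 0    # next free issue slot
    for dest, sources in instrs:
        # Stall until every source operand has been produced.
        start = max([cycle] + [ready.get(s, 0) for s in sources])
        ready[dest] = start + latency
        cycle = start + 1
    # Bubbles are the issue slots in which nothing could be issued.
    return cycle, cycle - len(instrs)

# Load, compute, store: each step waits for the previous one.
dependent = [("r1", []), ("r2", ["r1"]), ("mem", ["r2"])]
# Three unrelated loads: nothing has to wait.
independent = [("r1", []), ("r2", []), ("r3", [])]

if __name__ == "__main__":
    print(issue_cycles(dependent, latency=3))
    print(issue_cycles(independent, latency=3))
```

    In this toy model the dependent chain spends more issue slots waiting than working, while the independent sequence issues without any bubble.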
    
    In this paper, we propose a new type of processor architecture aimed at accelerating the aforementioned applications. The model constructs a tightly coupled processing hierarchy for massive data. Three couplings are identified: task-level coupling, streaming-level coupling, and data-level coupling. They are implemented with hardware (HW) support as a heterogeneous NoC (Network-on-Chip) mesh (NoC-HETERO), on-the-fly streaming transfer (OTF-STREAM), and a streaming engine (STREAM), and they cooperate with flexible software (SW) programming to release the computing capability. A similar idea can be found in NASA's recently developed 49-core Maestro processor, but we have pursued performance efficiency more aggressively.
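    As a loose software analogy for on-the-fly streaming (not a model of the OTF-STREAM hardware, whose details the abstract does not give), the sketch below contrasts a staged computation that materializes an intermediate buffer with a streamed one in which each element flows directly from one stage to the next. All function names are hypothetical.

```python
def source(samples):
    """Produce the input samples one at a time."""
    for x in samples:
        yield x

def scale(stream, gain):
    """Transform each element as soon as it arrives."""
    for x in stream:
        yield gain * x

def accumulate(stream):
    """Consume the stream and reduce it to a single value."""
    total = 0
    for x in stream:
        total += x
    return total

def staged(samples, gain):
    """Store every intermediate result before the next stage starts."""
    buffered = [gain * x for x in samples]  # intermediate buffer
    return sum(buffered)

def streamed(samples, gain):
    """Chain the stages so that no intermediate buffer is needed."""
    return accumulate(scale(source(samples), gain))

if __name__ == "__main__":
    print(staged(range(5), 2), streamed(range(5), 2))
```

    Both versions compute the same result; the streamed version simply avoids storing and reloading the intermediate data between stages.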
    
    With this HW/SW cooperation, benchmarks such as FFT and General Matrix Multiplication (GEMM) can approach peak performance with a power consumption of at most 16. The mechanisms are implemented in synthesizable RTL (Register Transfer Level) code on a digital signal processor whose intellectual property is entirely self-owned. The processor is fabricated in a COTS (Commercial Off-The-Shelf) 65 nm semiconductor process and runs at 400 MHz, with a performance of 51.2 GFLOPS and 102.4 GMACS.
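    The quoted figures can be converted to per-cycle throughput with simple arithmetic. The helper below is only a worked calculation on the numbers stated in the abstract (400 MHz, 51.2 GFLOPS, 102.4 GMACS); it makes no assumption about how the operations are distributed across cores or units, and the function name is hypothetical.

```python
def ops_per_cycle(ops_per_second, clock_hz):
    """Average number of operations completed per clock cycle."""
    return ops_per_second / clock_hz

CLOCK_HZ = 400e6       # 400 MHz, as stated in the abstract
PEAK_FLOPS = 51.2e9    # 51.2 GFLOPS
PEAK_MACS = 102.4e9    # 102.4 GMACS

if __name__ == "__main__":
    print(ops_per_cycle(PEAK_FLOPS, CLOCK_HZ))  # floating-point operations per cycle
    print(ops_per_cycle(PEAK_MACS, CLOCK_HZ))   # multiply-accumulates per cycle
```

    At 400 MHz, the stated figures correspond to 128 floating-point operations and 256 multiply-accumulate operations per clock cycle across the whole chip.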
    Abstract document

    IAC-18,D1,IP,18,x43480.brief.pdf

    Manuscript document

    (absent)