Efficient Read-Port-Count Reduction Schemes for the Centralized Physical Register File in a Superscalar Microprocessor
Abstract
The physical register file supports increasing the execution width and depth of a superscalar microprocessor to exploit more instruction-level parallelism. Its efficient design is critical because its resources, such as the number of read and write ports, have a significant impact on CPU power consumption. Reducing the number of ports to the physical register file is a well-known direction for optimization. For port-count reduction schemes, balancing the scheme's complexity against its performance is crucial. In this work, we introduce a high-level analysis method for estimating the complexity of such schemes during microarchitectural design. Furthermore, we explore the structure of different port-count reduction schemes and introduce a practical approach to constructing low-complexity read-port-count reduction schemes for the centralized integer physical register file. We show that read-port-count reduction schemes designed with this approach can reduce the number of read ports by roughly a factor of two (from 17 to 8 read ports) with a geometric-mean performance degradation of only 0.1% IPC across the SPECrate CPU 2017 Integer workloads.
ISSN: 2307-8162