CN115481074A - Heterogeneous acceleration server architecture and server based on same
- Publication number
- CN115481074A
- Authority
- CN
- China
- Prior art keywords
- pcie
- acceleration
- card
- cpu
- interface
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/17—Interprocessor communication using an input/output type connection, e.g. channel, I/O port
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/177—Initialisation or configuration control
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multi Processors (AREA)
Abstract
The invention provides a heterogeneous acceleration server architecture and a server based on it. The architecture comprises two CPUs, each provided with a plurality of memory banks and communicatively connected to each other, a south bridge chip, and a plurality of PCIE acceleration cards. The PCIE acceleration cards form two groups, each group connected to one of the two CPUs, and each card is configured as either a signal processing acceleration card or a data processing acceleration card. The south bridge chip is connected to the peripheral devices and attaches to one of the CPUs through a DMI bus. Two PCIE acceleration cards in the same group that need to communicate do so through their respective optical communication modules. The server comprises a motherboard and the server architecture arranged on it. The invention thereby addresses two problems of existing heterogeneous acceleration server architectures: the high memory-bandwidth overhead of communication between PCIE acceleration cards, and the inability of the PCIE acceleration cards to serve both signal processing and data processing tasks.
Description
Technical Field
The invention belongs to the field of heterogeneous acceleration servers, and particularly relates to a heterogeneous acceleration server architecture and a server based on the same.
Background
Existing heterogeneous acceleration server architectures usually take the form of one CPU plus multiple PCIE acceleration cards, and the PCIE acceleration cards themselves use an FPGA + DSP architecture. In practice, this form has two main problems:
1. communication between two PCIE acceleration cards must be relayed through the CPU, which consumes a large amount of memory bandwidth;
2. a PCIE acceleration card with an FPGA + DSP architecture has a single fixed configuration and cannot meet the differing performance requirements of signal processing tasks and data processing tasks.
Disclosure of Invention
The invention aims to solve two problems of existing heterogeneous acceleration server architectures: communication between PCIE acceleration cards consumes a large amount of memory bandwidth, and the PCIE acceleration cards used cannot serve both signal processing and data processing tasks.
In order to achieve the above object, the present invention provides a heterogeneous acceleration server architecture and a server based on the same.
According to a first aspect of the present invention, a heterogeneous acceleration server architecture is provided, which includes a first CPU configured with a plurality of memory banks, a second CPU configured with a plurality of memory banks, a south bridge chip, and a plurality of PCIE acceleration cards;
the plurality of PCIE accelerator cards are divided into two groups, wherein the first group of PCIE accelerator cards are all accessed to the first CPU, the second group of PCIE accelerator cards are all accessed to the second CPU, and each PCIE accelerator card in the plurality of PCIE accelerator cards is configured into a signal processing accelerator card or a data processing accelerator card according to a corresponding task to be executed;
the first CPU is connected with the second CPU through a UPI bus;
the south bridge chip is connected with preset peripheral equipment and is accessed to the first CPU through a DMI bus;
the PCIE accelerator cards are configured with optical communication modules, and two PCIE accelerator cards in the same group, which have a communication relation with each other, realize communication based on the optical communication modules of both sides.
Optionally, the heterogeneous acceleration server architecture further includes a plurality of PCIE SWITCH chips;
each group of PCIE accelerator cards is accessed to a corresponding CPU through a corresponding number of PCIE SWITCH chips.
Optionally, the signal processing acceleration card is configured with 2 FPGA chips and 2 DSP chips, and the data processing acceleration card is configured with 1 FPGA chip and 4 DSP chips;
the PCIE acceleration card is a full-height, full-length PCIE card.
Optionally, the peripheral device includes an IO interface module, a hard disk, a BMC chip, a BIOS chip, a TPM interface diagnostic card, and a clock generator chip.
Optionally, the IO interface module includes an RJ45 interface module, a USB interface module, a VGA interface module, and an IPMI interface module;
the hard disks comprise M.2 interface solid state hard disks and SATA interface solid state hard disks, and the SATA interface solid state hard disks form a hard disk array.
Optionally, the RJ45 interface module includes a gigabit RJ45 interface module and a 10-gigabit RJ45 interface module;
the gigabit RJ45 interface module is implemented with an RTL8211E network control chip, and the 10-gigabit RJ45 interface module is implemented with an X557-AT2 network control chip;
the USB interface module comprises a USB2.0 interface module and a USB3.0 interface module.
According to a second aspect of the present invention, a heterogeneous acceleration server is provided, where the heterogeneous acceleration server includes a chassis, a motherboard, a power module, a fan module, and any one of the above heterogeneous acceleration server architectures, and the heterogeneous acceleration server architecture is integrally disposed on the motherboard.
Optionally, the power supply module is a multi-path redundant power supply module.
Optionally, the chassis adopts a heat dissipation mode of front air inlet and rear air outlet.
The invention has the beneficial effects that:
In the heterogeneous acceleration server architecture of the invention, on the one hand, the PCIE acceleration card uses a flexible architecture: cards are divided into signal processing acceleration cards, which perform heterogeneous acceleration of signal data, and data processing acceleration cards, which perform heterogeneous acceleration of data processing. This solves the problem that the PCIE acceleration cards of existing heterogeneous acceleration server architectures cannot serve both signal processing and data processing tasks. On the other hand, each PCIE acceleration card is configured with an optical communication module, and two cards in the same group that need to communicate do so through their optical communication modules. The two cards therefore communicate directly, without relaying through the CPU, and the memory-bandwidth overhead is small. This effectively solves the problem that communication between PCIE acceleration cards in existing architectures consumes a large amount of memory bandwidth.
The heterogeneous acceleration server and the heterogeneous acceleration server architecture of the present invention belong to a general inventive concept, and have at least the same beneficial effects as the heterogeneous acceleration server architecture, and the beneficial effects thereof are not described herein again.
Additional features and advantages of the invention will be set forth in the detailed description which follows.
Drawings
The invention may be better understood by reference to the following description taken in conjunction with the accompanying drawings, in which like or similar reference numerals identify like or similar parts throughout the figures.
Fig. 1 shows a block diagram of a heterogeneous acceleration server architecture according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art can more fully understand the technical solutions of the present invention, exemplary embodiments of the present invention will be described more fully and in detail below with reference to the accompanying drawings. Obviously, the one or more embodiments of the present invention described below are only one or more of specific ways to implement the technical solutions of the present invention, and are not exhaustive. It should be understood that the technical solutions of the present invention can be implemented in other ways belonging to one general inventive concept, and should not be limited by the exemplary described embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any inventive step on the basis of one or more embodiments of the present invention, shall fall within the scope of protection of the present invention.
Embodiment: Fig. 1 shows a structural block diagram of a heterogeneous acceleration server architecture according to an embodiment of the present invention. Referring to Fig. 1, the heterogeneous acceleration server architecture of this embodiment includes a first CPU configured with a plurality of DDR memory banks, a second CPU configured with a plurality of DDR memory banks, a south bridge chip, and a plurality of PCIE accelerator cards;
the plurality of PCIE accelerator cards are divided into two groups, wherein the first group of PCIE accelerator cards are all accessed to the first CPU, the second group of PCIE accelerator cards are all accessed to the second CPU, and each PCIE accelerator card in the plurality of PCIE accelerator cards is configured into a signal processing accelerator card or a data processing accelerator card according to a corresponding task to be executed;
the first CPU is connected with the second CPU through a UPI bus;
the south bridge chip is connected with preset peripheral equipment and is accessed to the first CPU through a DMI bus;
the PCIE accelerator cards are configured with optical communication modules, and two PCIE accelerator cards in the same group, which have a communication relation with each other, realize communication based on the optical communication modules of both sides.
Further, the heterogeneous acceleration server architecture of the embodiment of the present invention further includes a plurality of PCIE SWITCH chips;
each group of PCIE accelerator cards is accessed to a corresponding CPU through a corresponding number of PCIE SWITCH chips.
Furthermore, in the embodiment of the present invention, the signal processing acceleration card is configured with 2 FPGA chips and 2 DSP chips, and the data processing acceleration card is configured with 1 FPGA chip and 4 DSP chips;
the PCIE accelerator card is a full-height, full-length PCIE card.
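The grouping and card configuration described above can be sketched as a small data model. This is only an illustration: the names `CardRole`, `AcceleratorCard` and `route` are hypothetical, the PCIE SWITCH chips are left out of the path for brevity, and the cross-group path through both CPUs over UPI is an assumption, since the description only specifies how cards in the same group communicate.

```python
from dataclasses import dataclass
from enum import Enum


class CardRole(Enum):
    SIGNAL = "signal processing"
    DATA = "data processing"


# Chip counts per role, taken from the description above.
CHIP_COUNTS = {
    CardRole.SIGNAL: {"fpga": 2, "dsp": 2},
    CardRole.DATA: {"fpga": 1, "dsp": 4},
}


@dataclass(frozen=True)
class AcceleratorCard:
    name: str        # hypothetical label, e.g. "card0"
    cpu: int         # 0 = group on the first CPU, 1 = group on the second CPU
    role: CardRole


def route(src: AcceleratorCard, dst: AcceleratorCard) -> list:
    """Return the hops a transfer takes between two cards."""
    if src.cpu == dst.cpu:
        # Same group: direct link between the two optical modules.
        return [src.name, "optical", dst.name]
    # Assumption (not stated in the source): cross-group traffic
    # is relayed through both CPUs over the UPI bus.
    return [src.name, f"CPU{src.cpu}", "UPI", f"CPU{dst.cpu}", dst.name]
```

With this model, two cards attached to the same CPU resolve to a three-hop optical path that never touches a CPU, which is the property the architecture relies on.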
Still further, in the embodiment of the present invention, the peripheral device connected to the first CPU through the south bridge chip includes an IO interface module, a hard disk, a BMC chip, a BIOS chip, a TPM interface diagnostic card, and a clock generator chip.
Still further, in the embodiment of the present invention, the IO interface module includes an RJ45 interface module, a USB interface module, a VGA interface module, and an IPMI interface module;
the hard disks comprise M.2 interface solid state hard disks and SATA interface solid state hard disks, and the SATA interface solid state hard disks form a hard disk array.
Still further, in the embodiment of the present invention, the RJ45 interface module includes a gigabit RJ45 interface module and a 10-gigabit RJ45 interface module;
the gigabit RJ45 interface module is implemented with an RTL8211E network control chip, and the 10-gigabit RJ45 interface module is implemented with an X557-AT2 network control chip;
the USB interface module comprises a USB2.0 interface module and a USB3.0 interface module.
Correspondingly, based on the heterogeneous acceleration server architecture of this embodiment, the embodiment of the invention also provides a heterogeneous acceleration server comprising a chassis, a motherboard, a power supply module, a fan module and the heterogeneous acceleration server architecture, which is integrated on the motherboard.
Further, in the embodiment of the present invention, the power supply module is a multi-path redundant power supply module.
Furthermore, in the embodiment of the invention, the chassis adopts a heat dissipation mode of front air inlet and rear air outlet.
The heterogeneous acceleration server of the embodiment of the present invention is described in more detail below:
The heterogeneous acceleration server uses a standard 4U rack-mount structure and supports 6 to 8 dedicated PCIE acceleration cards, which are divided into signal processing acceleration cards and data processing acceleration cards. The signal processing acceleration card performs heterogeneous acceleration of signal data; it is a full-height, full-length PCIE card carrying 2 FPGAs (XC7V690) and 2 DSPs (TMS320C6678). The data processing acceleration card performs heterogeneous acceleration of data processing; it is a full-height, full-length PCIE card carrying 1 FPGA (XC7V690) and 4 DSPs (TMS320C6678).
The basic parameters of the heterogeneous acceleration server are as follows:
CPU: dual-socket, CPU0 and CPU1.
Memory: 24 DIMM slots; memory capacity can be chosen as required, up to a maximum of 6 TB.
Chipset: C622.
Hard disk: expandable to a maximum of 24 2.5-inch SATA disks.
Size: 447 × 175 × 790 mm.
Operating system: CentOS 7.5.
PCIE: compatible with 6 to 8 signal/data processing acceleration cards (PCIE x8, 6-pin power supply).
The requirements of the working environment are as follows:
The heterogeneous acceleration server is a rack server with an operating ambient temperature of 0 ℃ to 35 ℃.
The heterogeneous acceleration server is implemented as a server platform plus dedicated cards. The server uses two CPU sockets; the CPUs communicate over three UPI links and are scheduled by the operating system. Each CPU has 12 DIMMs attached and drives 4 PCIE acceleration cards. The cards can be interconnected through 10-gigabit optical modules, which support data exchange over various high-speed serial buses and reduce the load on the server's DDR bandwidth. In addition, CPU0 is connected to the PCH through DMI3 and, through it, to various medium-speed devices, including up to 24 high-reliability 2.5-inch SATA disks that can be configured as RAID 0, 1, 10, 5, 50 or 6.
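The DDR bandwidth saving can be made concrete with a deliberately simple model. It rests on an assumption that the source does not spell out: a transfer relayed by the CPU is staged in host memory, so the payload is written to DDR once and read back once, while a direct optical transfer does not touch host memory at all. The function name `host_ddr_traffic` is hypothetical.

```python
def host_ddr_traffic(payload_bytes: int, via_cpu: bool) -> int:
    """Bytes moved through host DDR for one card-to-card transfer.

    Assumption: a CPU-relayed transfer is staged in host memory,
    so the payload is written once (DMA in) and read once (DMA out).
    A direct optical transfer bypasses host memory entirely.
    """
    if payload_bytes < 0:
        raise ValueError("payload_bytes must be non-negative")
    return 2 * payload_bytes if via_cpu else 0
```

Under this model, a 1 GiB exchange between two cards costs 2 GiB of host memory traffic when relayed by the CPU and none over the optical link. Real figures depend on caching and DMA behaviour, which the sketch ignores.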
The server uses two CPU sockets. Each CPU mounts 12 DDR DIMMs and 6 PCIE channels, and the two CPUs are connected through the UPI bus. CPU0 attaches to the PCH chip (Platform Controller Hub), which in turn connects the other medium-speed devices.
The PCH chip handles management and communication between the CPU and the peripheral low-speed and medium-speed devices; the PCH and the CPU exchange data over the DMI3 (Direct Media Interface) high-speed bus. The peripherals attached to the PCH are: network (one gigabit RJ45 port and one 10-gigabit RJ45 port); storage (one M.2 disk and, by default, 8 channels of 2.5-inch SATA SSDs); 4 channels of TYPE 4 USB2.0; a BMC chip; a BIOS chip; and a TPM HEADER Debug Card.
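For the SATA disk array, the description lists RAID 0, 1, 10, 5, 50 and 6 as supported levels. The following sketch shows how usable capacity differs between them for equal-sized disks, using the standard textbook formulas. The function name `usable_disks`, the default of two RAID 5 groups for RAID 50, and the treatment of RAID 1 as a single n-way mirror are assumptions for illustration; minimum disk counts per level are not checked.

```python
def usable_disks(level: str, n: int, groups: int = 2) -> int:
    """Number of disks' worth of usable capacity for n equal disks.

    `groups` is the number of RAID 5 sub-arrays striped together
    in RAID 50 (hypothetical default of 2).
    """
    if level == "0":
        return n                  # striping, no redundancy
    if level == "1":
        return 1                  # assumption: one n-way mirror
    if level == "10":
        if n % 2:
            raise ValueError("RAID 10 needs an even disk count")
        return n // 2             # striped two-way mirrors
    if level == "5":
        return n - 1              # one disk's worth of parity
    if level == "6":
        return n - 2              # two disks' worth of parity
    if level == "50":
        if n % groups:
            raise ValueError("disks must divide evenly into groups")
        return n - groups         # one parity disk per RAID 5 group
    raise ValueError(f"unsupported RAID level: {level}")
```

For the maximum configuration of 24 disks, this gives 23 disks of usable capacity for RAID 5, 22 for RAID 6 and for RAID 50 with two groups, and 12 for RAID 10.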
The BMC chip is an AST2500 and is mainly responsible for controlling and managing power sequencing, fan management, the remote NC-SI network and other services. The TPM HEADER Debug Card is a debug interface through which error or exception codes can be observed and debugged. The BIOS chip holds the basic input/output system and acts as a bridge between the hardware and the operating system. The DB1900 is the motherboard's clock generator chip and supplies the motherboard clocks.
The power supply uses four redundant 2000 W inputs, all of which support PMBus (Power Management Bus). With a single supply fed 230 V, 60 Hz AC, the operating conditions under different loads are as follows:
Table 1. Power supply operating conditions
The heterogeneous acceleration server provides two 8-pin CPU DC power feeds, rated for a power consumption of 205 W.
The heterogeneous acceleration server provides 16 DC sockets and can supply 12 V power to at most 16 PCIE devices. The pins follow the standard 8-pin GPU power definition, 6-pin GPU power is supported through adapter cables, and the output power meets the PCIE standard.
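A rough power budget can be checked against the figures above. This is a sketch under several assumptions: the redundancy scheme (how many of the four supplies are held in reserve) is not given in the source, so it is a parameter; the 205 W figure is treated as applying to each CPU; and the auxiliary connector limits of 75 W (6-pin) and 150 W (8-pin) come from the PCI Express electromechanical specification rather than from this document. Slot power and all other loads are ignored.

```python
PSU_WATTS = 2000   # per supply, from the description
PSU_COUNT = 4      # from the description
CPU_WATTS = 205    # assumption: the 205 W figure applies to each CPU

# Assumption: auxiliary connector limits per the PCIe CEM specification.
AUX_WATTS = {"6pin": 75, "8pin": 150}


def redundant_capacity(spares: int) -> int:
    """Power available when `spares` supplies are held in reserve."""
    if not 0 <= spares < PSU_COUNT:
        raise ValueError("spares must be between 0 and PSU_COUNT - 1")
    return (PSU_COUNT - spares) * PSU_WATTS


def card_aux_load(cards: int, connector: str = "6pin") -> int:
    """Worst-case auxiliary power drawn by `cards` acceleration cards."""
    return cards * AUX_WATTS[connector]


def fits_budget(cards: int, spares: int, connector: str = "6pin") -> bool:
    """True if two CPUs plus the cards' auxiliary load fit the capacity."""
    load = 2 * CPU_WATTS + card_aux_load(cards, connector)
    return load <= redundant_capacity(spares)
```

For example, eight cards on 6-pin power plus two CPUs come to 1010 W in this model, which fits even if two of the four supplies are treated as spares.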
The heterogeneous acceleration server uses an X557-AT2 10G network controller chip to implement the 10-gigabit RJ45 network and an RTL8211E PHY to implement the gigabit RJ45 network. Both are highly compatible with the server platform. The Ethernet interface auto-negotiates among 100BASE-T, 1000BASE-T and 10GBASE-T and meets the data transmission requirements between the chassis and external equipment, giving a design that is highly integrated, easy to maintain and use, and that improves the reliability and applicability of the equipment.
The RTL8211E is a highly integrated Ethernet PHY transceiver chip from Realtek. It is a mature gigabit Ethernet PHY with stable performance, and it conforms to the 10Base-T (IEEE 802.3), 100Base-TX (IEEE 802.3u) and 1000Base-T (IEEE 802.3ab) standards.
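The self-adaptive behaviour of the two Ethernet interfaces can be illustrated with a simplified model of speed negotiation: each end advertises the modes it supports and the fastest common mode is chosen. The mode sets below are transcribed from the description, with "100BASE-T" read as 100BASE-TX (an assumption). The function `negotiate` is hypothetical and ignores duplex and the other details of real IEEE 802.3 auto-negotiation.

```python
from typing import Optional

# Nominal data rates in Mbit/s for the modes named in the description.
RATE_MBPS = {
    "10BASE-T": 10,
    "100BASE-TX": 100,
    "1000BASE-T": 1000,
    "10GBASE-T": 10000,
}

# Modes per chip as listed in the description.
RTL8211E_MODES = {"10BASE-T", "100BASE-TX", "1000BASE-T"}
X557_AT2_MODES = {"100BASE-TX", "1000BASE-T", "10GBASE-T"}


def negotiate(local: set, peer: set) -> Optional[str]:
    """Pick the fastest mode both ends advertise, or None if none match."""
    common = local & peer
    if not common:
        return None
    return max(common, key=RATE_MBPS.__getitem__)
```

In this model, a link partner that only offers 1000BASE-T settles at gigabit speed on either port, while the 10-gigabit port reaches 10GBASE-T only when the peer advertises it as well.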
The heterogeneous acceleration server provides the common IO interfaces: 2 USB3.0 ports, 2 USB2.0 ports, two RJ45 Ethernet interfaces (one gigabit Ethernet and one 10-gigabit Ethernet), one VGA interface and one IPMI interface.
The chassis uses a front-intake, rear-exhaust cooling design, with 4 pairs of 12 V, 3.6 A fans installed in the middle of the chassis. In operation, cold air enters through the openings in the front panel, flows past the hard disks, the CPUs and the memory, and is exhausted at the rear of the chassis. The CPU and the GPU are arranged in upper and lower layers with independent air ducts, so that heat from one is not carried over to the other and cooling is most effective. The power module is located at the lower rear of the chassis, with an independent air duct reserved along the side of the chassis.
The chassis is a standard 19-inch rack-mount chassis with a height of 4U. Its dimensions are width × height × depth = 447 mm × 175 mm × 790 mm.
The signal processing acceleration card and the data processing acceleration card are full-height, full-length PCIE cards and can be inserted into the slots of any server that supports this standard form factor. The server supports a maximum of 6 to 8 full-height, full-length PCIE x8 cards.
The heterogeneous acceleration server provided by the embodiment of the invention is suitable for image, remote sensing and radar systems, can realize real-time processing of continuous and uninterrupted images or time domain sequences, meets the requirement of deploying a large-scale parallel data processing algorithm, and meets the project requirements of high performance, high concurrency, programmability, low delay and high reliability.
Although one or more embodiments of the present invention have been described above, it will be appreciated by those skilled in the art that the present invention can be embodied in any other forms without departing from the spirit or scope thereof. Accordingly, the above-described embodiments are intended to be illustrative, not limiting, and many modifications and alterations may be apparent to those of ordinary skill in the art without departing from the spirit and scope of the invention, as defined by the following claims.
Claims (9)
1. A heterogeneous acceleration server architecture is characterized by comprising a first CPU configured with a plurality of memory banks, a second CPU configured with a plurality of memory banks, a south bridge chip and a plurality of PCIE acceleration cards;
the plurality of PCIE accelerator cards are divided into two groups, wherein the first group of PCIE accelerator cards are all accessed to the first CPU, the second group of PCIE accelerator cards are all accessed to the second CPU, and each PCIE accelerator card in the plurality of PCIE accelerator cards is configured into a signal processing accelerator card or a data processing accelerator card according to a corresponding task to be executed;
the first CPU is connected with the second CPU through a UPI bus;
the south bridge chip is connected with preset peripheral equipment and is accessed to the first CPU through a DMI bus;
the PCIE accelerator cards are configured with optical communication modules, and two PCIE accelerator cards in the same group, which have a communication relation with each other, realize communication based on the optical communication modules of both sides.
2. The heterogeneous acceleration server architecture of claim 1, further comprising a plurality of PCIE SWITCH chips;
each group of PCIE accelerator cards is accessed to a corresponding CPU through a corresponding number of PCIE SWITCH chips.
3. The heterogeneous acceleration server architecture of claim 1, wherein the signal processing acceleration card is configured with 2 FPGA chips and 2 DSP chips, and the data processing acceleration card is configured with 1 FPGA chip and 4 DSP chips;
the PCIE acceleration card is a full-height, full-length PCIE card.
4. The heterogeneous acceleration server architecture of claim 1, wherein the peripheral devices comprise an IO interface module, a hard disk, a BMC chip, a BIOS chip, a TPM interface diagnostic card, and a clock generator chip.
5. The heterogeneous acceleration server architecture of claim 4, wherein the IO interface modules comprise RJ45 interface modules, USB interface modules, VGA interface modules, and IPMI interface modules;
the hard disks comprise M.2 interface solid state hard disks and SATA interface solid state hard disks, and the SATA interface solid state hard disks form a hard disk array.
6. The heterogeneous acceleration server architecture of claim 5, wherein the RJ45 interface modules comprise a gigabit RJ45 interface module and a 10-gigabit RJ45 interface module;
the gigabit RJ45 interface module is implemented with an RTL8211E network control chip, and the 10-gigabit RJ45 interface module is implemented with an X557-AT2 network control chip;
the USB interface module comprises a USB2.0 interface module and a USB3.0 interface module.
7. A heterogeneous acceleration server, comprising a chassis, a motherboard, a power module, and a fan module, further comprising the heterogeneous acceleration server architecture of any one of claims 1 to 6, the heterogeneous acceleration server architecture being integrally disposed on the motherboard.
8. The heterogeneous acceleration server of claim 7, wherein the power modules are multiple redundant power modules.
9. The heterogeneous acceleration server of claim 8, wherein the chassis employs a heat dissipation mode with front intake and rear outlet.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211155162.3A CN115481074A (en) | 2022-09-22 | 2022-09-22 | Heterogeneous acceleration server architecture and server based on same |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211155162.3A CN115481074A (en) | 2022-09-22 | 2022-09-22 | Heterogeneous acceleration server architecture and server based on same |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115481074A (en) | 2022-12-16 |
Family
ID=84424199
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211155162.3A Pending CN115481074A (en) | 2022-09-22 | 2022-09-22 | Heterogeneous acceleration server architecture and server based on same |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115481074A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||