The CPU emerged in the era of large-scale integrated circuits. Iterative updates to processor architecture design and continuous improvement of integrated circuit processes have driven its ongoing development. From devices originally dedicated to mathematical calculation to widespread use in general-purpose computing, from 4-bit to 8-bit, 16-bit, 32-bit and finally 64-bit processors, and from mutually incompatible manufacturers to the emergence of distinct instruction set architecture specifications, the CPU has developed rapidly since its birth.[1]
The CPU has been developing for more than 50 years. Its history is usually divided into six stages.[3]
(1) The first stage (1971-1973). This was the era of low-end 4-bit and 8-bit microprocessors. The representative product is the Intel 4004 processor.[3]
(2) The second stage (1974-1977). This was the era of mid-range and high-end 8-bit microprocessors. The representative product is the Intel 8080. By this point the instruction system was already fairly complete.[3]
(3) The third stage (1978-1984). This was the era of 16-bit microprocessors. The representative product is the Intel 8086, which was relatively mature.[3]
(4) The fourth stage (1985-1992). This was the era of 32-bit microprocessors. The representative product is the Intel 80386, which was already capable of multitasking and multi-user workloads.[3]
(6) The sixth stage (after 2005). Processors have developed toward more cores and higher parallelism. Typical representatives are Intel's Core series processors and AMD's Ryzen series processors.[3]
The von Neumann architecture is the foundation of modern computers. Under this architecture, programs and data are stored in a unified way: instructions and data are accessed from the same storage space and transferred over the same bus, so the two transfers cannot overlap. According to the von Neumann model, the work of the CPU is divided into five stages: instruction fetch, instruction decode, instruction execute, memory access, and result write-back.[1]
Instruction fetch (IF) is the process of reading an instruction from main memory into the instruction register. The value in the program counter (PC) indicates the position of the current instruction in main memory. When an instruction is fetched, the value in the PC is automatically incremented according to the instruction word length.[1]
Instruction decode (ID): after an instruction is fetched, the instruction decoder splits and interprets it according to the predefined instruction format, identifying the instruction category and the method of obtaining its operands. Modern CISC processors split instructions further to improve parallelism and efficiency.[1]
Instruction execute (EX) carries out the function of the instruction. Different parts of the CPU are connected to perform the required operation.
Memory access (MEM): when the instruction requires it, the CPU obtains the address of the operand in main memory and reads the operand from main memory for use in the operation. Instructions that do not need to access main memory can skip this stage.[1]
Result write-back (WB), the last stage, "writes back" the result data of the execute stage to some form of storage. Result data is generally written to the CPU's internal registers so that subsequent instructions can access it quickly. Many instructions also change the flag bits in the program status word register. These flag bits identify different operation results and can be used to affect the behavior of the program.[1]
After the instruction has executed and the result data has been written back, if no exceptional event occurs (such as result overflow), the CPU obtains the address of the next instruction from the program counter and begins a new instruction cycle, fetching the next instruction in sequence.[1] Many complex CPUs can fetch, decode, and execute multiple instructions at the same time.
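The five stages described above can be sketched as a toy interpreter. This is only an illustration under stated assumptions: the three-field instruction format, the opcodes `LOAD`, `ADD` and `HALT`, the register names, and the single `zero` flag are all hypothetical and do not correspond to any real instruction set.

```python
def run(program, memory):
    """Execute a toy program and return (registers, flags).

    Each instruction is a hypothetical (opcode, dest, source) tuple.
    """
    regs = {"r0": 0, "r1": 0}    # internal registers
    flags = {"zero": False}      # stand-in for the program status word
    pc = 0                       # program counter

    while pc < len(program):
        # IF: fetch the instruction the PC points at, then advance the PC.
        instr = program[pc]
        pc += 1

        # ID: split the instruction into opcode and operand fields.
        op, dst, src = instr
        if op == "HALT":
            break

        # EX: compute a result, or an effective address for a load.
        if op == "LOAD":
            address = src
        elif op == "ADD":
            result = regs[dst] + regs[src]
        else:
            raise ValueError(f"unknown opcode {op!r}")

        # MEM: only LOAD reads main memory; other instructions skip this stage.
        if op == "LOAD":
            result = memory[address]

        # WB: write the result to a register and update the flag bit.
        regs[dst] = result
        flags["zero"] = (result == 0)

    return regs, flags


program = [
    ("LOAD", "r0", 0),      # r0 <- memory[0]
    ("LOAD", "r1", 1),      # r1 <- memory[1]
    ("ADD", "r0", "r1"),    # r0 <- r0 + r1
    ("HALT", None, None),
]
regs, flags = run(program, memory=[2, 3])
print(regs, flags)
```

A real CPU overlaps these stages across several instructions at once; the sketch runs them strictly one after another to keep the stage boundaries visible.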
Brief introduction
The central processing unit (CPU) is one of the main devices of an electronic computer and its core component. Its function is mainly to interpret computer instructions and process the data in computer software. The CPU is the core component responsible for reading, decoding, and executing instructions. It mainly consists of two parts, the controller and the arithmetic unit, and also includes cache memory and the data and control buses that connect them. The three core components of an electronic computer are the CPU, internal memory, and input/output devices. The functions of the CPU mainly include processing instructions, executing operations, controlling timing, and processing data.[2]
In computer architecture, the CPU is the core hardware unit that controls and allocates all of the computer's hardware resources (such as memory and I/O units) and performs general-purpose operations. The CPU is the computing and control core of the computer. The operations of all software layers in a computer system are ultimately mapped, through the instruction set, to operations of the CPU.[1]
Performance structure
Performance measures
For a CPU, the main indicators that affect performance are clock frequency (main frequency), CPU bit width, CPU cache, instruction set, number of CPU cores, and IPC (instructions per cycle). The main frequency of the CPU is its clock frequency, which directly affects CPU performance; raising it by overclocking can yield higher performance. The CPU bit width is the number of bits the processor can operate on at one time; in general, the higher the bit width, the faster the CPU performs operations. Personal computers today generally use 64-bit CPUs, because 64-bit processors can handle a larger range of data and natively support a larger memory addressing capacity, improving work efficiency. The CPU's instruction set is built into the CPU and mainly refers to the hardware-level programs that guide and optimize the CPU's operation. The CPU cache is generally divided into L1, L2, and L3 caches, and cache performance directly affects CPU processing performance. Some special-purpose CPUs may also be equipped with a level 4 cache.[4]
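As a rough illustration of how these indicators combine, the sketch below multiplies clock frequency, IPC, and core count into an idealized upper bound on instruction throughput, and computes the byte range reachable with a given number of address bits. The function names and the example figures are assumptions chosen for illustration; real processors rarely sustain this bound, and the number of physical address bits actually implemented is often smaller than the nominal bit width.

```python
def peak_instructions_per_second(clock_hz, ipc, cores=1):
    """Idealized upper bound: every core retires `ipc` instructions each cycle."""
    return clock_hz * ipc * cores


def addressable_bytes(address_bits):
    """Number of distinct byte addresses expressible with `address_bits` bits."""
    return 2 ** address_bits


# Hypothetical 4-core CPU at 3.0 GHz retiring 2 instructions per cycle.
print(peak_instructions_per_second(3.0e9, 2.0, cores=4))

# 32 address bits reach 4 GiB; 64 address bits reach 2**32 times as much.
print(addressable_bytes(32) // 1024**3, "GiB")
print(addressable_bytes(64) // addressable_bytes(32))
```

In this model, raising the clock frequency by 10% raises the bound by the same 10%, which is the effect overclocking aims for; cache behavior and memory latency are deliberately left out.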
CPU structure
Generally speaking, the structure of the CPU can be roughly divided into the arithmetic logic unit, the register unit, and the control unit. The arithmetic logic unit performs the relevant logic operations, such as shifts and logical operations, and can also perform fixed-point or floating-point arithmetic, address calculations, and conversions; it is a multifunctional arithmetic unit. The register unit temporarily stores instructions, data, and addresses. The control unit analyzes instructions and issues the corresponding control signals.
The CPU can be regarded as a large-scale integrated circuit whose main task is to process various kinds of data. Traditional computers had relatively small storage capacity, which made it difficult for them to process large-scale data, and their processing efficiency was relatively low. With the rapid development of information technology in China, computers that use highly configured processors as their control center have emerged, which has played an important role in improving CPU structure and function. The core of the central processor is the controller and the arithmetic unit, which are important to the overall function of the computer. They provide functions such as register control, logic operations, and signal sending and receiving, laying a good foundation for improving computer performance.[2]
Integrated circuits regulate signals in the computer and perform different tasks according to the user's operation instructions. The CPU is a very-large-scale integrated circuit. It consists of an arithmetic unit, a controller, registers, and so on, as shown in the figure below. Its key function is to process various kinds of data.[5]
CPU architecture[5]
Traditional computers had small storage capacity and operated inefficiently on large-scale data sets. The new generation of computers uses highly configured processors as the control center, and the CPU has great room for improvement in structure and function. The central processor takes the arithmetic unit and controller as its main components and has gradually extended to multiple functions such as logic operations, register control, program coding, and signal sending and receiving. These have accelerated the optimization and upgrading of CPU control performance.[5]
CPU bus
The CPU bus is the fastest bus in the computer system and is also the core of the chipset and the motherboard. The local bus directly connected to the CPU is usually called the CPU bus or internal bus, and the local bus connected to the various common expansion slots is called the system bus or external bus. In a CPU with a simple internal structure, often only one set of data-transfer buses is provided; this internal CPU bus connects the registers and the arithmetic logic unit inside the CPU, so it can also be called the ALU bus. The component-level bus connects individual chips together with a set of buses, so it can be called the component's internal bus; it generally includes address lines and data lines. The system bus is the set of lines that connects the components of the system together and is the basis for joining the system into a whole. The external system bus is the basic circuit connecting the computer to other equipment.[4]
(1) Arithmetic logic unit (ALU). The arithmetic logic unit is a combinational logic circuit that can perform multiple sets of arithmetic and logic operations, and it is an important part of the central processor. The ALU mainly performs binary arithmetic operations such as addition, subtraction, and multiplication. During operation, the ALU mainly executes the arithmetic and logic operations of the computer's instruction set. In general, the ALU can read its inputs and write its outputs directly, exchanging them with the processor controller, memory, and input/output devices over the bus. An input instruction contains an instruction word, which includes an operation code, a format code, and so on.[2]
(2) Intermediate register (IR). Its length is 128 bits, and the actual length used is determined by the operand. The IR plays an important role in the "push and fetch" instruction: during execution of this instruction, the content of the ACC is sent to the IR, the operand is then fetched into the ACC, and finally the content of the IR is pushed onto the stack.[2]
(3) Operation accumulator (ACC). Current registers are generally single accumulators with a length of 128 bits. The ACC can be regarded as a variable-length accumulator. When describing instructions, the ACC length is generally expressed by the value of ACS; the ACS length is directly tied to the ACC length, so doubling or halving the ACS length can also be regarded as doubling or halving the ACC length.[2]
(4) Descriptor register (DR). It is mainly used to store and modify descriptors. The DR is 64 bits long. Descriptors play an important role in simplifying the handling of data structures.[2]
(5) B register. It plays an important role in instruction modification. The B register is 32 bits long and holds the address modifier during address modification; a main memory address can be modified only through a descriptor. A descriptor points to the first element of an array, so accessing other elements of the array requires a modifier. Array members consist of elements of the same size stored contiguously, and they are commonly accessed through a vector descriptor. Because the address in a vector descriptor is a byte address, the base address must first be added during conversion. The conversion is performed automatically by hardware. Pay special attention to alignment during this process to avoid exceeding the array bounds.[2]
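The arithmetic logic unit in item (1), together with the flag bits mentioned earlier for the write-back stage, can be illustrated with a small software model. This is a minimal sketch, not a description of any particular hardware: the 8-bit default width, the operation names, and the three flags are assumptions, and the carry flag on `SUB` follows the "carry set means no borrow" convention.

```python
def alu(op, a, b, width=8):
    """Toy ALU: returns (result, flags) for fixed-width unsigned operands."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    if op == "ADD":
        raw = a + b
    elif op == "SUB":
        raw = a + ((~b) & mask) + 1   # two's-complement subtraction
    elif op == "AND":
        raw = a & b
    elif op == "OR":
        raw = a | b
    elif op == "SHL":
        raw = a << b                  # logical shift left by b positions
    else:
        raise ValueError(f"unknown operation {op!r}")

    result = raw & mask               # keep only `width` bits
    flags = {
        "zero": result == 0,
        "carry": raw > mask,          # a bit was carried out of the top
        "negative": bool(result >> (width - 1)),
    }
    return result, flags


print(alu("ADD", 200, 100))   # 300 does not fit in 8 bits, so carry is set
print(alu("SUB", 5, 5))       # equal operands give a zero result
```

Because the function has no internal state, its output depends only on its inputs, which mirrors the combinational-logic character of the ALU described above.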
Controller
In the general sense, a controller is a device that changes the wiring of a main circuit or control circuit, and the resistance values in the circuit, in a predetermined sequence in order to control the starting, speed regulation, braking, and reversing of a motor. In the CPU, the controller consists of the program status register (PSR), the system status register (SSR), the program counter (PC), the instruction register, and so on. As the "decision-making body", its main task is to issue commands, coordinating and directing the operation of the whole computer system. There are two main types of controller, the combinational logic controller and the microprogram controller, each with its own advantages and disadvantages. The combinational logic controller has a relatively complex structure but is fast. The microprogram controller is simple in design, but modifying the function of one machine instruction requires reprogramming the entire microprogram.[2]
The Godson-2 series consists of 64-bit high-performance, low-power processors for desktop and high-end embedded applications. Godson-2 products include Godson 2E, 2F, 2H, and 2K1000. Godson 2E was the first to be licensed for external production and sales. Godson 2F delivers more than 20% higher average performance than Godson 2E and can be used in personal computers, industrial terminals, industrial control, data acquisition, network security, and other fields. Godson 2H was launched as an official product in 2012; it is suitable for computers, cloud terminals, network equipment, consumer electronics, and similar fields, and can also serve as a full-function chipset with an HT or PCI-e interface. In 2018, Godson launched the Godson 2K1000 processor, a dual-core chip aimed mainly at network security and mobile intelligence, with a main frequency of up to 1 GHz, which can meet the requirements of the rapidly developing industrial Internet of Things and of autonomous and controllable industrial security systems.[6]
The Godson-3 series consists of multi-core processors aimed at high-performance computers, servers, and high-end desktop applications, featuring high bandwidth, high performance, and low power consumption. The Godson 3A3000/3B3000 processors use an independently designed microarchitecture, and their main frequency can exceed 1.5 GHz. The market-oriented Godson 3A4000 is the first quad-core chip of Godson's third-generation products. Built on a 28 nm process, the chip adopts the newly developed GS464V 64-bit high-performance processor core architecture and implements 256-bit vector instructions. It also optimizes on-chip interconnect and memory access, integrates a 64-bit DDR3/4 memory controller and an on-chip security mechanism, and again greatly improves main frequency and performance.[6]
The Godson 7A1000 bridge chip is Godson's first dedicated bridge chipset product. It aims to replace the AMD RS780+SB710 bridge chipset and provide north/south bridge functions for Godson processors. It was released in February 2018 and is currently used with the Godson 3A3000 and Unigroup 4G DDR3 memory in a high-performance network platform. Compared with the 3A3000+780e platform, this solution greatly improves overall performance and offers a high localization rate, high performance, and high reliability.[6]
Intel
According to Intel's product line planning, as of 2021 Intel's 11th-generation consumer Core lineup comprises six product categories: i9, i7, i5, i3, Pentium, and Celeron. In addition, there are the Xeon Platinum/Gold/Silver/Bronze processors for servers and the Xeon W series for the HEDT platform.
AMD
According to AMD's product line planning, as of 2021 the AMD Ryzen 5000 series processors comprise four consumer product lines: Ryzen 9, Ryzen 7, Ryzen 5, and Ryzen 3. In addition, there are the third-generation EPYC processors for the server market and the Threadripper series for the HEDT platform.[7]
Shanghai Zhaoxin
Shanghai Zhaoxin Integrated Circuit Co., Ltd. is a state-owned holding company established in 2013. Its processors adopt the x86 architecture, and its products mainly include the ZX-A, ZX-C/ZX-C+, ZX-D, Kaixian KX-5000, and KX-6000, as well as the Kaisheng ZX-C+, ZX-D, KH-20000, and others. Among them, the Kaixian KX-5000 series processors use a 28 nm process and come in quad-core and eight-core versions. Their overall performance is up to 140% higher than that of the previous generation, reaching the performance level of mainstream international processors, and they can fully meet the needs of party and government desktop office applications as well as entertainment applications such as 4K ultra-HD video viewing. The Kaisheng KH-20000 series processors are the CPU products Zhaoxin has launched for servers and similar devices. The Kaixian KX-6000 series processors have a main frequency of up to 3.0 GHz and are compatible with the full range of Windows operating systems as well as domestic autonomous and controllable operating systems such as Zhongke Fangde, NeoKylin, and Puhua; their performance is comparable to that of Intel's seventh-generation Core i5.[6]
Shanghai Shenwei
The Shenwei processor, "SW processor" for short, derives from DEC's Alpha 21164 and adopts the Alpha architecture, with completely independent intellectual property rights. Its products include the single-core SW-1, dual-core SW-2, quad-core SW-410, and sixteen-core SW-1600/SW-1610. The Shenwei BlueLight supercomputer uses 8,704 SW-1600 processors and runs the Shenwei Ruisi operating system, so that all of its software and hardware are domestically produced. The "Shenwei TaihuLight" supercomputer, based on the SW-26010, ranked first on the TOP500 list of the world's supercomputers four consecutive times after its release in June 2016. Two ten-million-core full-machine applications on "Shenwei TaihuLight" won the 2016 and 2017 Gordon Bell Prize, the highest award in the field of high-performance computing applications.[6]
The traditional embedded field is very broad and is the main application area for processors outside the server and PC fields. "Embedded" means that the processor is contained inside a chip, built in so deeply that users are unaware of it.[8]
In recent years, as new technologies and fields have developed further, the embedded field itself has differentiated into several distinct subfields.[8]
First, with the development of smartphones (Mobile Smart Phone) and handheld devices (Mobile Device), the mobile field has gradually developed into an independent field comparable in scale to the PC field. Because processors in the mobile field need to run the Linux operating system and involve a complex software ecosystem, the field depends on the software ecosystem as heavily as the PC field does.[8]
The second is the real-time embedded field. Relatively speaking, this field does not depend as heavily on software, so there is no absolute monopoly. However, owing to the successful commercial promotion of ARM processor IP, the ARM processor architecture still holds the dominant market share, and other processor architectures such as Synopsys ARC also perform well in the market.[8]
The last is the deep embedded field. This field is closest to the traditional embedded field mentioned above. Demand in this field is very large, but the emphasis is usually on low power consumption, low cost, and high energy efficiency. There is no need to run a large application operating system such as Linux; most of the software is custom bare-metal programs or a simple real-time operating system, so dependence on the software ecosystem is relatively low.[8]
Mainframe CPU
A mainframe is also called a large-scale host. Mainframes use dedicated processor instruction sets, operating systems, and application software. The term "mainframe" originally referred to a large computer system housed in a very large framed iron cabinet, to distinguish it from smaller minicomputers and microcomputers.[9]
Reducing mainframe CPU consumption is an important task. Every CPU cycle saved can not only postpone hardware upgrades but also reduce software licensing fees.
Mainframe architecture has two main features. The first is a high degree of virtualization, with all system resources shared: a mainframe can consolidate a large number of workloads and maximize resource utilization. The second is asynchronous I/O: when an I/O operation is performed, the CPU hands the I/O instruction to the I/O subsystem to complete, and the CPU itself is freed to execute other instructions. A mainframe can therefore perform other tasks while carrying out heavy I/O work.[9]
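The asynchronous I/O idea described above, in which the CPU hands an operation off and continues with other work, can be loosely illustrated in software with Python's standard `asyncio` module. This is only an analogy under stated assumptions: `asyncio.sleep` stands in for an I/O subsystem completing a transfer, the task names and the 0.05-second delay are hypothetical, and nothing here models real mainframe channel hardware.

```python
import asyncio


async def io_task(name, seconds, log):
    """Pretend I/O: the wait represents the I/O subsystem doing the transfer."""
    log.append(f"{name} start")
    await asyncio.sleep(seconds)      # the event loop is free during this wait
    log.append(f"{name} done")


async def compute(log):
    """Other work that proceeds while the I/O is still in flight."""
    total = 0
    for i in range(3):
        total += i
        log.append(f"compute step {i}")
        await asyncio.sleep(0)        # yield so other tasks can make progress
    return total


async def main():
    log = []
    pending_io = asyncio.create_task(io_task("io", 0.05, log))  # hand off the I/O
    total = await compute(log)        # keep computing in the meantime
    await pending_io                  # collect the I/O result at the end
    return total, log


if __name__ == "__main__":
    total, log = asyncio.run(main())
    print(total)
    print(log)
```

The log shows the I/O starting before the computation finishes and completing only afterwards, which is the overlap that the asynchronous design is meant to provide.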
Control technology form
The powerful data-processing capability of the central processor effectively improves the computer's efficiency. Data processing is not just simple arithmetic: the CPU operates on the instructions given by the computer's user, and while it executes instruction tasks, the control commands entered by the user map to CPU operations. With the rapid development of information technology in China, computers are widely used in daily life, work, and enterprise office automation. As a master control device, the computer plays an important role in facilitating the development of e-commerce networks, and the upgrading of CPU control performance has advanced greatly. Instruction control, actual control, operation control, and so on are the functional expression of CPU technology.[2]
(1) Selection control. Centralized processing operates on specific program instructions in order to meet the needs of the computer user. During operation, the CPU can make selections according to the actual situation to meet the user's data-flow requirements. Instruction control technology plays an important role here: the operation mode is formulated according to the user's needs so that data instructions act in an orderly manner. During CPU execution, each instruction of the program is carried out smoothly; only if instructions follow a definite order can correct operation of the computer be guaranteed. The CPU mainly carries out automated processing of data sets; it is the core of centralized control, and its essence is instruction control.[2]
(2) Insertion control. The CPU generates operation control signals mainly through the function of instructions, and controls components by sending commands to them. Implementing the function of one instruction is mainly accomplished by the computer's components executing a sequence of operations. A larger number of small control elements is the key to building the centralized processing mode, the aim being to better complete CPU data-processing operations.[2]
(3) Time control. Applying timing to the various operations is called time control. An instruction must be completed within its specified time. A CPU instruction is fetched from cache memory or main memory and then decoded, mainly in the instruction register; throughout this process the program's timing must be strictly controlled.[2]
Comparison with GPU
GPU
GPU stands for graphics processing unit. The workflow and physical structure of the CPU and GPU are roughly similar, but compared with the CPU, the GPU's work is more specialized. In most personal computers the GPU is used only to draw images. If the CPU wants to draw a two-dimensional figure, it only needs to send an instruction to the GPU, and the GPU can quickly compute all of the figure's pixels and draw the figure at the specified position on the monitor. Because the GPU generates a lot of heat, the graphics card usually has its own cooling device.[3]
Design structure
The CPU has powerful arithmetic units that can complete an arithmetic calculation within a few clock cycles. It also has a large cache that can hold a lot of data. In addition, it has a complex logic control unit that uses branch prediction to reduce latency when a program has multiple branches. The GPU is designed for high throughput, with a large number of arithmetic units and very little cache. The GPU also supports a large number of threads running at the same time; if they need to access the same data, the cache merges these accesses, which naturally introduces latency. Despite the latency, the large number of arithmetic units allows very high throughput.[3]
Use Scenarios
Because the CPU has a lot of cache and complex control logic, it is very good at logic control and serial operations. In comparison, the GPU has a large number of arithmetic units and is therefore good at large-scale concurrent computation: work that involves a large amount of calculation but little complexity and is repeated many times. This makes the approach to accelerating a program with a GPU clear: using the CPU for complex logic control and the GPU for simple but voluminous arithmetic can greatly improve the program's running speed.[3]
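The latency-versus-throughput trade-off described in this comparison can be made concrete with a deliberately simple model. The sketch below assumes independent operations processed in waves, one operation per arithmetic unit per wave; the unit counts and latencies for the "CPU-like" and "GPU-like" configurations are hypothetical round numbers, not measurements of any real device.

```python
import math


def batch_cycles(n_ops, units, latency_cycles):
    """Cycles needed to finish n_ops independent operations.

    Operations run in waves of `units` at a time, and each wave
    takes `latency_cycles` to complete.
    """
    waves = math.ceil(n_ops / units)
    return waves * latency_cycles


# Hypothetical configurations: few fast units vs. many slower units.
CPU_LIKE = {"units": 8, "latency_cycles": 1}
GPU_LIKE = {"units": 1024, "latency_cycles": 10}

for n_ops in (8, 1_000_000):
    cpu = batch_cycles(n_ops, **CPU_LIKE)
    gpu = batch_cycles(n_ops, **GPU_LIKE)
    print(f"{n_ops} ops: CPU-like {cpu} cycles, GPU-like {gpu} cycles")
```

Under these assumptions the low-latency configuration wins for a handful of operations, while the many-unit configuration wins once the amount of independent work is large, which is the division of labor the text recommends.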
Security issues
The rapid development of CPUs has been accompanied by security problems. The FDIV bug (Pentium floating-point division error) that appeared on Pentium processors in 1994 caused errors in floating-point division; in 1997, the F00F invalid instruction on Pentium processors could cause the CPU to crash; in 2011, a buffer overflow problem was found in Intel's Trusted Execution Technology (TXT) that attackers could use for privilege escalation; in 2017, a vulnerability in the Intel Management Engine (ME) could lead to remote unauthorized arbitrary code execution; and in 2018, the two CPU vulnerabilities Meltdown and Spectre affected almost every kind of computing device manufactured in the past 20 years, putting private information stored on billions of devices at risk of disclosure. These security problems seriously endanger national network security, the security of key infrastructure, and information security in key industries, and have caused or will cause huge losses.[1]
Future development
The general-purpose central processing unit (CPU) chip is a basic component of the IT industry and a core device of weaponry and equipment. China lacks CPU technology and an industry with independent intellectual property rights, which also makes it difficult to comprehensively guarantee national security. During the "Tenth Five-Year Plan" period, the national "863 Program" began to support independent research and development of CPUs. During the 11th Five-Year Plan period, the major project "Core electronic devices, high-end general-purpose chips and basic software products" ("He Gao Ji") brought the CPU achievements of the "863 Program" into industry. Since the 12th Five-Year Plan, China has carried out applications and pilot projects of independently developed CPUs, forming an independent technology and industrial system within a certain scope that can meet the informatization needs of weaponry, equipment, and other fields. However, foreign CPUs have held a monopoly for a long time, and it will take some time for China's independently developed CPU products and market to mature.[10]