# Computer
# ++++++++

## IOMMU - Input Output Memory Management Unit
(needed for secure device passthrough to a VM)
Intel calls it VT-d (Virtualization Technology for Directed I/O); AMD calls it AMD-Vi.
It maps device-visible (I/O virtual) addresses to physical pages, so a
scatter-gather list of scattered physical pages appears to the device as one
contiguous virtual buffer.
## ACPI - Advanced Configuration and Power Interface
## APIC - Advanced Programmable Interrupt Controller
## AMBA - Advanced Microcontroller Bus Architecture
### APB -  Advanced Peripheral Bus
### AXI - Advanced Extensible Interface
- for high speed soc components like memory
## SATA - Serial Advanced Technology Attachment
## SCSI - Small Computer System Interface
## ISA - Instruction Set Architecture
## HT - HyperThreading; multiple threads per core
## DMA - Direct Memory Access
## DMAR - DMA remapping
## DMAR-IR - DMAR Interrupt Remapping
## IRQ - Interrupt Request
## UART - Universal Asynchronous Receiver and Transmitter
## CTS - Clear To Send
## DCD - Data Carrier Detect (DCE sends it to DTE, its the led that )
## DTE - Data Terminal Equipment (our computer)
## DCE - Data Circuit Terminating Equipment (telephone line end device)
## DSR - Data Set Ready
## RTS - Request To Send
## DTR - Data Terminal Ready
## RI - Ring Indicator (Incoming call ring)
## UEFI - Unified Extensible Firmware Interface
## OVMF - Open Virtual Machine Firmware enables UEFI for virtual machines
## OEM - Original Equipment Manufacturer
## KVM - Kernel-based Virtual Machine is a hypervisor
## PCI - Peripheral Component Interconnect Bus
```
+----------SoC-----------------------------------------------+
|                                                            |
|+-----+   +-----------------RootComplex--------------------+|
||     |<->| Host Bridge  (BUS 0)                           ||
|| CPU |   |      ^                                         ||
||     |   |      |                                         ||
|+-----+   |      +------------------+-----------------+    ||
|          |      |                  |                 |    ||
|+-----+   | +----+--------+ +-------+-----+ +---------+---+||
||     |<->| | VirtualPCI- | | VirtualPCI- | | VirtualPCI- |||
|| RAM |   | | PCIBridge   | | PCIBridge   | | PCIBridge   |||
||     |   | +--+----------+ +-------+-----+ +---------+---+||
|+-----+   |    |                     |                 |   ||
|          +----|---------------------|-----------------|---+|
+---------------|---------------------|-----------------|----+
                |                     |                 |
                | BUS 1               | BUS 3           | BUS 9
                v                     v                 v
       +--------+-------+         +---+---+     +-------+----+
       | PCI Express    |         |       |     | PCIExpress |
       | to PCI/PCI-X   |         | Switch|     | Endpoint   |
       | Bridge         |         |       |     +-------+----+
       +--------+-------+         +-+-+-+-+             |
                |                   ^ ^ ^               Device 0
               [ ] BUS 2            | | |
                v                   | | | BUS 5/6/7/8
       +--------+-------+           | | |
       | PCI/PCI-X      |     <-----+ | +----->  [PCIExpressEndPts]
       | Legacy Devices |             |
       +----------------+             +----->  [PCIExpressEndPts]

BUS 1: rx+|rx-|tx+|tx-|refclk+|refclk-|
perst#(usedToTellWhenTheClockNvoltageSignalsAreStable)|wake#|
prsnt1#|prsnt2#(usedForHotPlugDetection)|
jtag#|+12v#|rx+|rx-|tx+|tx-|gnd

pcie has no dedicated irq pins unlike pci; interrupts are sent in-band as
messages (MSI/MSI-X) over the tx/rx lanes

SoC
+-----------------------------------------------------------------+
|                                                                 |
|   +-------+           +-------------------------------------+   |
|   |       |---------->|            Root Complex             |   |
|   |  CPU  |           +-------------------------------------+   |
|   |       |           |          Configuration Space        |   |
|   +---+---+           |                 4KB                 |   |
|       |               +-------------------------------------+   |
|       v               |             IP Registers            |   |
|   +-------+           +-------------------------------------+   |
|   |       |           | Source Addr | Dest Addr | Size |Type|   |
| Z |Memory |<--------- |-------------+-----------+------+----|   |
|   |       |           |             |           |      |    |   |
|   +-------+           +-------------------------------------+   |
|      1GB              |       Configurable Address Space    |   |
|                       |                                     |   |
|                       +-------------------------------------+   |
|                                                                 |
+-----------------------------------------------------------------+
       PCIe Address Space             PCIe Endpoint
  +--------------------------+  +-----------------------+
0 |                          |  |                       |
  | - - - - - - - - - - - -  |  |  Configuration Space  |
  |           ^              |  |                       |
  |           |              |  |  N:1               =  |
  |           |              |  +-----------------------+
  |           v              |  |                       |
C +--------------------------+  |      Memory Space     |
  |           ^              |  |                       |
  |           |              |  |                       |
X +--------------------------+  +-----------------------+
  |           |              |
Y +--------------------------+
  |           |              |
  |           v              |
D +--------------------------+
  |                          |
E +--------------------------+
  |                          |
Z |       1GB MEM ADDR       |
  |                          |
  +--------------------------+
2^32/64 - 1

```
- cpu can program the below by addressing root complex's registers
  - IP(Intellectual property) registers
  - lane width
  - speed mode: gen1/gen2
  - registers of the address translation unit (cpu address to pci address)
  - configurable address space: 
    
  - configuration, IO, memory, message(non physical) spaces are defined by pci 
    standard 
- pcie endpoint
  - configuration space holds all the info about the device (deviceId,
    vendorId, classCode, capabilities) plus registers to configure it
    (e.g. to put it into a low power state)
  - pcie configuration space is backward compatible with pci but grows from
    256 bytes to 4KB. the first 64 bytes are the standard header, which comes
    in 2 types. type1: root ports/bridges/switches (carries primary,
    secondary, subordinate bus numbers), type0: endpoints
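
The standard header described above can be decoded with a few fixed offsets. The following is a minimal sketch, not a full parser: the helper name `parse_pci_header` and the sample bytes (device ID `0x1234`, bus numbers 0/1/1) are made up for illustration. On Linux the real bytes can usually be read from `/sys/bus/pci/devices/<BDF>/config`.

```python
import struct

def parse_pci_header(cfg: bytes) -> dict:
    """Decode a few fields of the 64-byte standard PCI configuration header."""
    if len(cfg) < 64:
        raise ValueError("need at least the 64-byte standard header")
    # Configuration space is little-endian.
    vendor_id, device_id = struct.unpack_from("<HH", cfg, 0x00)
    revision, prog_if, subclass, base_class = struct.unpack_from("<BBBB", cfg, 0x08)
    header_type = cfg[0x0E]
    layout = header_type & 0x7F  # 0 = endpoint (type0), 1 = bridge (type1)
    info = {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "revision": revision,
        "class_code": (base_class << 16) | (subclass << 8) | prog_if,
        "multi_function": bool(header_type & 0x80),
        "header_type": layout,
    }
    if layout == 1:
        # Only type1 headers carry the bus numbers.
        info["primary_bus"] = cfg[0x18]
        info["secondary_bus"] = cfg[0x19]
        info["subordinate_bus"] = cfg[0x1A]
    return info

# Hypothetical type1 header: a PCI-to-PCI bridge (class 0x0604) on bus 0.
fake = bytearray(64)
struct.pack_into("<HH", fake, 0x00, 0x8086, 0x1234)
fake[0x0A] = 0x04  # subclass: PCI-to-PCI bridge
fake[0x0B] = 0x06  # base class: bridge device
fake[0x0E] = 0x01  # header type 1
fake[0x18:0x1B] = bytes([0, 1, 1])
bridge = parse_pci_header(bytes(fake))
print(bridge)
```
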
## platform bus- for soc units
## i2c, spi bus- for bridges and panels
## spmi bus - System Power Management Interface, serial bus for power management
## MIPI - Mobile Industry Processor Interface
## DSI - MIPI Display Serial Interface
## CSI - MIPI Camera Serial Interface
## SMP - Symmetric Multi Processing
- multiple identical processors interconnected to a single shared memory
  and to all the I/O devices, unlike asymmetric MP(CPU+dGPU).
## I2C - Inter-Integrated Circuit
- half-duplex: Data transmitted in both direction but not at the same time
- only 2 wires are used: SDA(SerialData) and SCL(SerialClk)
- it's overkill for communication between just a few devices as it needs an
  addressing system: [start][slaveAddress][r/w][ACK][Data(8bits)][Ack]...
  r/w bit: 1 = master reads from the slave, 0 = master writes to it
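
The framing above can be sketched as follows, assuming 7-bit addressing. The function names and the device address `0x50` are illustrative only; the frame is a symbolic list, not real bus traffic.

```python
def i2c_address_byte(addr7: int, read: bool) -> int:
    """First byte after START: 7-bit slave address followed by the R/W bit."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("7-bit address expected")
    return (addr7 << 1) | (1 if read else 0)

def i2c_write_frame(addr7: int, payload: bytes) -> list:
    """Symbolic sequence of bus events for a master-to-slave write."""
    frame = ["START", f"ADDR+W=0x{i2c_address_byte(addr7, False):02X}", "ACK"]
    for byte in payload:
        frame += [f"DATA=0x{byte:02X}", "ACK"]  # slave ACKs every data byte
    frame.append("STOP")
    return frame

print(i2c_write_frame(0x50, b"\x01"))
```
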
## SPI - Serial Peripheral Interface
- Full duplex: Data transmitted in both direction at the same time
- needs 3+N wires: SCLK, MOSI(MasterOutSlaveIn), MISO(MasterInSlaveOut) + 
  SS1(SlaveSelect),2,3,..
## PCIe - Peripheral Component Interconnect Express
- PCIe lanes come directly from the CPU, except the PCH PCIe lanes. Each
  generation roughly doubles the per-lane bandwidth of its predecessor
  (PCIe 4.0 is 16 GT/s, 5.0 is 32 GT/s), and generations are backward and
  forward compatible: a link runs at the fastest speed both ends support.
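
As a rough worked example of the per-generation figures, the usable bandwidth follows from the transfer rate and the line encoding (8b/10b for gen1/gen2, 128b/130b from gen3 on). This sketch ignores packet and protocol overhead, so real throughput is lower; the function name is illustrative.

```python
# generation -> (GT/s per lane, payload bits, total bits of the line encoding)
GENERATIONS = {
    1: (2.5, 8, 10),
    2: (5.0, 8, 10),
    3: (8.0, 128, 130),
    4: (16.0, 128, 130),
    5: (32.0, 128, 130),
}

def pcie_bandwidth_gbytes(gen: int, lanes: int = 1) -> float:
    """Approximate one-direction bandwidth in GB/s, encoding overhead only."""
    gt_per_s, payload_bits, total_bits = GENERATIONS[gen]
    return gt_per_s * payload_bits / total_bits / 8 * lanes

for gen in GENERATIONS:
    print(f"gen{gen}: x1 = {pcie_bandwidth_gbytes(gen):.3f} GB/s, "
          f"x16 = {pcie_bandwidth_gbytes(gen, 16):.2f} GB/s")
```

Gen2 to gen3 raises the raw rate only from 5 to 8 GT/s; the switch to 128b/130b encoding is what brings the usable bandwidth close to double.
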
### PCIe vs PCI
- usage model and software interface are the same as PCI, but the hardware is
  not compatible
- pcie uses serial point-to-point links while pci uses a shared parallel bus
- Root Complex: pcie host controller present inside SOC
- Endpoint: a pcie device
- cpu_addr - CPU physical address (see /proc/iomem). pci has its own address
  space, 32bit/64bit depending on the root complex; this address space is
  visible only to pci components(root complex, endpoints, switches and bridges)

## PCH - Platform Controller Hub
- Usually manages motherboard features like USB, WiFi, Ethernet, Sound. Its
  uplink is limited by the DMI link to the CPU (e.g. equivalent to PCIe 3.0 x8
  on some platforms), which all PCH lanes share.
## DMI - Direct Media Interface
- Intel's proprietary link between the northbridge(or CPU) and southbridge(PCH).
  It supports concurrent and isochronous traffic.
## ASPM - Active-State Power Management
- Saves power in PCIe subsystems by setting a lower power state for PCIe links
  when the devices to which they connect are not in use.
## hypervisor - software, firmware or hardware that creates and runs
virtual machines
## device passthrough - CPU must support hardware virtualization

## USB - Universal Serial Bus
- Pins: Gnd, Vcc, D+, D-
- D+, D-: differential signalling: if D+=high then D-=low
  - the receiver looks at D+ minus D-, which cancels common-mode interference
  - J: D+=high(>Vihz) D-=low(<Vil) (full speed; low speed swaps J and K)
  - K: D+=low D-=high
  - SE0: single-ended-zero D+=D-=low: line inactive
  - SE1: invalid D+=D-=high
- Vihz: min input voltage recognized as high: 2V
  - z: high impedance state
- Vil: max input voltage recognized as low: 0.8V
- every packet starts with sync pattern
  - 3KJ pairs followed by 2Ks
- NRZI encoding: Non Return To Zero Inverted
  - 1 = no state change: KK or JJ
  - 0 = state change: KJ or JK
  - state change detection is more reliable than high-low detection
- Bit stuffing: 0 is inserted after every 6 1s
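- the idle line state (set by a pull-up resistor) tells the host the device
  speed
  - Low speed:  D- > Vihz, D+ < Vil
  - Full speed: D- < Vil,  D+ > Vihz
- enumeration sequence:
  - plug in device
  - detect connection
  - set address
  - get device info
  - choose configuration
  - choose drivers for interfaces
  - use it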
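
The NRZI and bit-stuffing rules above can be sketched in a few lines. This is a simplified model: it starts from an idle J line, treats each packet field in isolation, and the function names are illustrative.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out, ones = [], 0
    for b in bits:
        out.append(b)
        ones = ones + 1 if b == 1 else 0
        if ones == 6:
            out.append(0)  # stuffed bit forces a line transition
            ones = 0
    return out

def nrzi_encode(bits, idle="J"):
    """NRZI: a 0 toggles the line state (J<->K), a 1 keeps it."""
    state, line = idle, []
    for b in bits:
        if b == 0:
            state = "K" if state == "J" else "J"
        line.append(state)
    return "".join(line)

SYNC = [0, 0, 0, 0, 0, 0, 0, 1]  # sync field as data bits
print(nrzi_encode(SYNC))   # 3 KJ pairs followed by 2 Ks
print(bit_stuff([1] * 7))  # a 0 appears after the sixth 1
```
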
### keyboard
- computer sends a token packet with the device address
  - SYNC|IN|DevId|EndPoint|CRC(DevId|EndPoint): 24 bit after SYNC
    - IN: 10010110 (as seen on the wire)
    - every byte is sent least significant bit first (reverse the bit order
      to get the conventional value, e.g. 10010110 -> 0x69)
- keyboard sends SYNC|8 byte key code followed by 16bit CRC
- computer sends SYNC|8bit ACK
- endpoint can be matched with lsusb output corresponding to keyboard.
- the computer asks (polls) the keyboard for keys every time; the keyboard
  cannot interrupt the host, unlike the PS/2 interface
- full speed keyboard 12Mbps with polling interval 1ms
- low speed keyboard 1.5Mbps with polling interval 16ms
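
The `IN: 10010110` pattern can be reproduced from the 4-bit packet ID. A small sketch, assuming the usual PID layout (4 PID bits followed by their one's complement as a check field); the helper names are illustrative.

```python
def pid_byte(pid4: int) -> int:
    """PID byte: low nibble is the PID, high nibble is its one's complement."""
    return (pid4 & 0xF) | ((~pid4 & 0xF) << 4)

def wire_bits(byte: int) -> str:
    """Bits in transmission order: least significant bit first."""
    return "".join(str((byte >> i) & 1) for i in range(8))

IN_PID = 0b1001
print(hex(pid_byte(IN_PID)), wire_bits(pid_byte(IN_PID)))
```
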

## ESP32
emerge -av dev-vcs/git wget flex bison gperf python pyserial pyelftools cmake 
ninja ccache xtensa-esp32-elf

## memory pages - memory is divided into pages to handle memory fragmentation
- also lets the OS allocate more memory than is physically available by
  swapping unused pages to hard disk
- page fault occurs if required page isn't in main memory.
- this is called virtual memory
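
A toy model of the translation step, assuming 4 KiB pages and a flat single-level page table held in a dict. Real MMUs use multi-level tables and a TLB; the names here are illustrative.

```python
PAGE_SHIFT = 12              # assumed 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT

class PageFault(Exception):
    """Raised when the page is not present in main memory."""

def translate(vaddr: int, page_table: dict) -> int:
    """Map a virtual address to a physical one via {virtual page: frame}."""
    vpn = vaddr >> PAGE_SHIFT          # virtual page number
    offset = vaddr & (PAGE_SIZE - 1)   # byte offset inside the page
    frame = page_table.get(vpn)
    if frame is None:
        # The OS would now load the page (e.g. from swap) and retry.
        raise PageFault(f"page {vpn} not resident")
    return (frame << PAGE_SHIFT) | offset

table = {1: 7}  # virtual page 1 lives in physical frame 7
print(hex(translate(0x1234, table)))
```
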

## ALU
`T add_sat (T x, T y)` add and saturate: adds x and y and, if the result
overflows, returns the maximum value of T (or the minimum value on negative
overflow) instead of wrapping around.
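
A minimal sketch of the behaviour, assuming `T` is a signed two's-complement integer. The `bits` parameter is an addition for illustration and is not part of the signature above.

```python
def add_sat(x: int, y: int, bits: int = 8) -> int:
    """Saturating add for a signed integer type of the given width."""
    lo = -(1 << (bits - 1))      # e.g. -128 for 8 bits
    hi = (1 << (bits - 1)) - 1   # e.g. +127 for 8 bits
    return max(lo, min(hi, x + y))

print(add_sat(100, 100), add_sat(-100, -100), add_sat(1, 2))
```
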

## CM4 connection

## GUID
- Globally Unique Identifier

## ACPI
- It is the way the firmware (BIOS/UEFI) passes structured tables describing
  the devices and their locations to the OS

## Computer architecture
- Outlines the system's functionality, design and compatibility
### System design
- Design of data processors, DMA, GPU, data paths, memory controllers and
  miscellaneous things such as virtualization and multiprocessing
### ISA
- Defines CPU capabilities and functions like data formats, memory addressing
  modes, processor register types, word size and the instruction set
### Microarchitecture
- also known as computer organization, defines storage elements, data
  processing and data paths and how they implement the ISA
## components of microprocessor
- alu, registers, control units to move data between processors, memory caches
## MESI
- states of a cache block: Modified(pendingWriteToMain), Exclusive(OnlyThisCacheHasBlockAndIsClean), Shared(anotherCacheAlsoHoldingThisBlockUnmodified)
  and Invalid(blockNotValid, e.g. modifiedInAnotherCache). also known as the
  Illinois protocol. used to maintain cache coherency (consistent copies of
  the same memory block across processor cores) in hierarchical memory.
  it is the most common protocol that supports write-back cache.
- Directory based coherency: cache state of various memory blocks is
  maintained in a central or distributed directory as a block->caches map.
- snooping coherency: each cache keeps track of the coherency state of the
  physical memory blocks it is holding.
- snooping and directory can be mixed in case of multichip multiprocessor.
- used by many coherent memories. non-coherent memories need software based
  syncing
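
The MESI states can be illustrated with a toy two-cache model for a single block. This is a simplified sketch (no data, no write-back traffic, generic bus operation names `BusRd`/`BusRdX`), not a description of any particular CPU.

```python
def on_local(state: str, op: str, others_have_copy: bool = False) -> str:
    """State after this cache's own processor reads or writes the block."""
    if op == "read":
        if state == "I":
            return "S" if others_have_copy else "E"
        return state                 # M, E, S satisfy reads locally
    if op == "write":
        return "M"                   # writer always ends up Modified
    raise ValueError(op)

def on_snoop(state: str, bus_op: str) -> str:
    """State after observing another cache's request on the bus."""
    if state == "I":
        return "I"
    if bus_op == "BusRd":            # someone else reads: downgrade to Shared
        return "S"                   # (a Modified block is written back first)
    if bus_op == "BusRdX":           # someone else wants to write: invalidate
        return "I"
    raise ValueError(bus_op)

caches = {"A": "I", "B": "I"}
history = []

def access(who: str, op: str) -> None:
    other = "B" if who == "A" else "A"
    others_have_copy = caches[other] != "I"
    bus_op = None
    if caches[who] == "I":
        bus_op = "BusRd" if op == "read" else "BusRdX"
    elif op == "write" and caches[who] == "S":
        bus_op = "BusRdX"            # must invalidate the other sharers
    if bus_op:
        caches[other] = on_snoop(caches[other], bus_op)
    caches[who] = on_local(caches[who], op, others_have_copy)
    history.append((caches["A"], caches["B"]))

for who, op in [("A", "read"), ("B", "read"), ("B", "write"), ("A", "read")]:
    access(who, op)
print(history)
```
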
## MOESI
- O(owned): The other caches will get the block from the cache that has 'O' bit.
- there are other protocols similar to MESI and MOESI (e.g. MSI, MESIF)
## what is a snooping protocol?
- also called bus snooping protocol, maintains cache coherency in symmetric
  multiprocessing environments. Whenever a processor writes to its cache, it
  broadcasts the address of the modified block to the bus. other processors
  that have a copy of the same block in their caches can either invalidate
  or update it, depending on the protocol variant. But bus can become a
  bottleneck as the number of processors and cache accesses increase. the
  protocol also requires all the caches to monitor the bus constantly, consuming
  power and bandwidth. it is not suitable for distributed network where the bus
  is replaced by a network.
### mutex
- implemented using atomic operations such as LOCK-prefixed instructions on
  x86, e.g. `LOCK XCHG` (XCHG with a memory operand is locked implicitly).
- `LOCK XCHG` swaps a register value with a memory value in one
  uninterruptible step.
- it's achieved through cache coherency on modern systems. on older systems
  bus locking is used.
- a LOCK-prefixed op also acts as a memory barrier, ensuring all previous
  memory operations are completed before it.
## different hazards
- structural hazards: occur from resource conflicts when the hardware can't 
  support all the possible combinations of instructions in synchronized
  overlapped execution
- data hazards: an instruction depends on the result of an earlier instruction
  that is still in the pipeline, so it could read or overwrite a stale value
- control hazards: occur from the pipelining of branches and other
  instructions that change the PC
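
As an illustration of a data hazard, the sketch below flags read-after-write dependencies between instructions that sit close together in program order. The `window` of 2 is an assumed stand-in for "the producer has not written back yet", and the instruction tuples are hand-written rather than parsed.

```python
from typing import List, NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    text: str
    dest: Optional[str]        # register written, if any
    srcs: Tuple[str, ...]      # registers read

def raw_hazards(program: List[Instr], window: int = 2):
    """Return (producer index, consumer index, register) for nearby RAW pairs."""
    found = []
    for i, producer in enumerate(program):
        if producer.dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if producer.dest in program[j].srcs:
                found.append((i, j, producer.dest))
    return found

program = [
    Instr("ADD r1, r2, r3", "r1", ("r2", "r3")),
    Instr("SUB r4, r1, r5", "r4", ("r1", "r5")),  # needs r1 from the ADD
    Instr("AND r6, r7, r8", "r6", ("r7", "r8")),
    Instr("ORR r9, r1, r6", "r9", ("r1", "r6")),  # needs r6 from the AND
]
hazards = raw_hazards(program)
print(hazards)
```
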
## what is pipelining
- keeping all stages of execution engaged all the time to maximize the work
  done is called pipelining.
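
A back-of-the-envelope model of the gain, assuming an ideal pipeline with `k` one-cycle stages and no stalls: the first instruction takes `k` cycles and every later one completes a cycle after its predecessor.

```python
def cycles_unpipelined(n: int, k: int) -> int:
    """Each of n instructions occupies all k stages before the next starts."""
    return n * k

def cycles_pipelined(n: int, k: int) -> int:
    """k cycles to fill the pipeline, then one instruction finishes per cycle."""
    return k + (n - 1)

n, k = 100, 5
speedup = cycles_unpipelined(n, k) / cycles_pipelined(n, k)
print(cycles_unpipelined(n, k), cycles_pipelined(n, k), round(speedup, 2))
```

For large `n` the speedup approaches `k`; hazards and stalls keep real pipelines below that.
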
## type of interrupts
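- internal interrupts(software interrupts) caused by a software instruction
  (e.g. `int` on x86, `SVC` on ARM) or an error condition such as divide by
  zero.
- external interrupts(hardware interrupts) caused by an external hardware
  module, e.g. the keyboard controller when Ctrl+c is pressed (the OS then
  turns that into SIGINT).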
## cache mapping: maps memory blocks to cache locations.
- Direct mapping: easiest way, maps each block of the main memory into only
  one possible cache line. When a new block needs to be loaded, the old block
  is evicted. `i = j % m`(i:cache line no., j: main memory block no.,
  m: number of lines in the cache)
- Associative mapping: fastest and most flexible, any block can go into any
  line of the cache. The word id bits are used to identify which word in the
  block is needed.
- Set-Associative mapping: cache is divided into sets; a memory block maps to
  exactly one cache set and can be loaded into any line in that set.
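
The `i = j % m` rule and the resulting tag/index/offset split can be worked through as below. The cache geometry (64-byte blocks, 128 lines) is an assumed example, not taken from any specific CPU.

```python
def direct_mapped_line(block_no: int, num_lines: int) -> int:
    """i = j % m: the only line a given memory block may occupy."""
    return block_no % num_lines

def split_address(addr: int, block_size: int, num_lines: int):
    """Split a byte address into (tag, line index, offset within block)."""
    offset = addr % block_size
    block_no = addr // block_size
    index = block_no % num_lines
    tag = block_no // num_lines      # distinguishes blocks sharing a line
    return tag, index, offset

def set_for_block(block_no: int, num_sets: int) -> int:
    """Set-associative: the block may go to any line inside this one set."""
    return block_no % num_sets

print(direct_mapped_line(13, 8))
print(split_address(0x12345, block_size=64, num_lines=128))
```
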
## common rules of assembly language
- the label field can either be empty or may define a symbolic address.
- instruction fields can specify machine pseudo instructions
- comment fields may contain a comment or be left empty
- in case of symbolic addresses, up to 4 characters are allowed
- comment field begins with '/', symbolic address field terminate by ","
## RAID: Redundant Array of Independent Disks
## Hardware methods to establish a priority:
- Parallel priority: 

## JTAG (Joint Test Action Group)
- TDI(TestDataIn), TDO(TDOut), TMS(TModeSelect), TCK(TClock), TRST(TReset),
  SRST(SystemRST), RTCK(ReturnTCK) pins
- configure multi-core debugging
- C232HM-EDHSL(5v/450mA) and C232HM-DDHSL(3.3v/250mA(RPi)) JTAGtoUSB adapters
- OpenOCD(OpenOnChipDebugger): software to which gdb can connect to debug
  the instructions that the cpu is currently executing

## Parallel port
- 3 8bit registers
- 1st register (data) drives 8 GPIO outputs; these can be connected to a
  7 segment display for example
- 2nd register (status) is connected to 5 GPIO inputs with internal pull up
  resistors.
- 3rd register (control) drives 4 outputs.

## AMD - Advanced Micro Devices
Zen microarchitecture for the x86-64 based Ryzen series, released in 2017; its
design was led by Jim Keller.

## ARM - Advanced RISC Machines
### Architectures
#### v4T
- Halfword and signed halfword/byte support
- System mode
- Thumb instruction set
#### v5TE
- Improved ARM/Thumb interworking
- CLZ
- Saturated arithmetic
- DSP multiply-accumulate
#### v6
- SIMD instructions
- multi-processing
- v6 Memory architecture
- unaligned data support
- Extension:
  - Thumb-2 (v6T2)
  - TrustZone (v6Z)
  - Multicore (v6K)
  - Thumb only (v6-M)
#### v7
- Thumb2
- NEON
- TrustZone
- Virtualization
- Architecture Profiles
  - v7-A (Applications): NEON
    - MMU, high efficiency, multitasking, TrustZone, 40-bit addressing,
      virtualization extensions
  - v7-R (Real-time): Hardware divide
    - Protected memory (MPU): no virtual memory
    - Low latency and predictability for real-time needs
    - tightly coupled memories for fast, deterministic access
  - v7-M (Microcontroller): Hardware divide,Thumb-2 only
    - low gate count =  low cost
    - deterministic and predictable behavior a key priority
    - deeply embedded use

- architecture specifies instruction set but can have different
  implementations
  - Cortex-A8 core is v7-A with 13-stage pipeline
  - Cortex-A9 core is v7-A with 8-stage pipeline
  - ![CoresArchsFeatures](images/ArmArchsNCores.png)

### Data Size and Instruction Sets
- Although modern ARM cores are no longer pure RISC, they keep the RISC traits:
  most instructions execute in a single cycle, an orthogonal register set, and
  a load-store architecture
- ARM is a 32-bit load-store architecture; most internal registers are 32bit
  - the only memory accesses allowed are loads and stores
- ARM instruction set: 32 bit
- Thumb Instruction set: 32/16 bit
- switching between ARM and Thumb is called interworking, handled by the
  compiler and linker
- Older cores support 16-bit thumb instructions only
  - Thumb-2 technology in current cores adds 32-bit instructions to Thumb
  - ARMv7-M only supports Thumb
### Processor Modes
- Most ARM cores have seven basic operating modes
- each mode has access to its own stack space and a different subset of
  registers; this is called register banking
- Some operations can only be carried out in a privileged mode
- privileged modes
  - unrestricted access to hardware components and can execute privileged
    instructions which can be dangerous
  - supervisor mode: entered on reset and on supervisor call instruction
  - FIQ: entered when a high priority (fast) interrupt is raised
  - IRQ: entered when a normal priority interrupt is raised
  - Abort: Used to handle memory access violations
  - Undef: used to handle undefined instructions
    - all the above are called exception modes. register banking makes
      handling nested exceptions of different kinds much more efficient, but
      nesting exceptions of the same kind is complicated
  - System: privileged mode using the same registers as user mode
- unprivileged mode
  - user mode is unprivileged, can't disable interrupts or reconfigure memory
    access
- this mode structure only applies to Cortex-A and Cortex-R; for Cortex-M it's
  completely different, it has:
  - Thread Mode: for application code (can be privileged or unprivileged)
  - Handler Mode (always privileged): for exception handlers
  - above modes are switched upon exception entry or return
  - out of reset both modes share the main stack and Thread mode is
    privileged; Thread mode can be configured to use a separate process stack
    and to run unprivileged.
- switching occurs by register organization
  - ![modeSwitching](images/ArmModeSwitch.gif)
  - spsr (savedProgramStatusRegister) holds a snapshot of the cpsr, i.e. the
    system state at the moment of exception
- when operating in thumb state the fields in an instruction are not large
  enough to address all the registers, so most instructions can directly
  address only the low registers(r0-r7); only a few thumb instructions can
  access the high registers(r8-r15)

### Cortex-M register set
- 13 general purpose registers r0-r7(low) r8-r12(high)
- 3 registers with dedicated roles
  - StackPointer(SP): r13
  - LinkRegister(LR): r14
  - ProgramCounter(PC): r15
- 1 special register xPSR (ProgramStatusRegister)
- sp is switched between HandlerMode and ThreadMode

### CPSR/SPSR
  - 31|30|29|28|27      24|23     19    |16|15      10|09|08|07|06|05|04    00|
    N  Z  C  V  Q       J           GE[3:0]            E  A  I  F  T   mode
  - Condition code flags
    - N: negative result from ALU
    - Z: zero result from ALU
    - C: ALU operation Carried out
    - V: ALU op results in signed oVerflow
  - Q: Sticky overflow flag used by saturating instructions
  - GE[3:0] used to record multiple results from SIMD instructions
  - 31 to 27 are only bits that are modified by user mode instructions
  - mode bits specify the current processor mode. in privileged mode these can
    be changed manually to change processor mode.
  - state bits
    - T: tells if executing ARM or Thumb instruction
    - J: Jazelle state: tells if the core is executing Java bytecode.
  - I: IRQ disable, F: FIQ disable, A: can disable asynchronous data
    aborts, E: can change endianness of data interface dynamically
  - remaining bits indicate internal system state and should never be modified.
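
The status register layout above can be decoded field by field. A small sketch using the bit positions listed in this section; the sample value `0x600000D3` is made up for illustration.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(value: int) -> dict:
    """Split a 32-bit CPSR/SPSR value into its named fields."""
    def bit(n: int) -> int:
        return (value >> n) & 1
    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "Q": bit(27),
        "J": bit(24),
        "GE": (value >> 16) & 0xF,
        "E": bit(9), "A": bit(8),
        "I": bit(7), "F": bit(6),   # 1 = that interrupt type is disabled
        "T": bit(5),                # 1 = Thumb state
        "mode": MODES.get(value & 0x1F, "unknown"),
    }

fields = decode_cpsr(0x600000D3)
print(fields)
```
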
### xPSR
  - 31|30|29|28|27      24|23     19    |16|15      10|09|08|07|06|05|04    00|
    N  Z  C  V          T                                          ExceptionNo.
  - T is always set to 1 because Cortex-M executes only Thumb instructions
### Exceptions
  - internal or external
  - synchronous or asynchronous
  - when an exception occurs, a snapshot of the current state is saved by
    copying CPSR to SPSR and PC to LR; the core then switches to the
    appropriate exception mode, disables interrupts, and uses the vector
    table to find the exception handler
  - Vector Table: one entry per exception type, leading to its handler
    - 0x1C FIQ
    - 0x18 IRQ
    - 0x14 Reserved
    - 0x10 Data Abort
    - 0x0C Prefetch Abort
    - 0x08 Software Interrupt
    - 0x04 Undefined Instruction
    - 0x00 Reset
  - once the handler is done the mode is switched back by copying SPSR to
    CPSR and LR to PC
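
The entry and return sequence can be modelled with a dictionary standing in for the core. This is a deliberately simplified sketch: it ignores the exception-specific LR offsets that real hardware applies, and the `cpu` structure and function names are made up for illustration.

```python
VECTORS = {"undef": 0x04, "swi": 0x08, "prefetch_abort": 0x0C,
           "data_abort": 0x10, "irq": 0x18, "fiq": 0x1C}
EXC_MODE = {"undef": 0b11011, "swi": 0b10011, "prefetch_abort": 0b10111,
            "data_abort": 0b10111, "irq": 0b10010, "fiq": 0b10001}
I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5

def enter_exception(cpu: dict, kind: str, return_addr: int) -> None:
    mode = EXC_MODE[kind]
    cpu["spsr"][mode] = cpu["cpsr"]      # snapshot CPSR into the banked SPSR
    cpu["lr"][mode] = return_addr        # banked LR remembers where to resume
    cpsr = (cpu["cpsr"] & ~0x1F & ~T_BIT) | mode | I_BIT  # ARM state, IRQ off
    if kind == "fiq":
        cpsr |= F_BIT                    # FIQ entry also masks FIQs
    cpu["cpsr"] = cpsr
    cpu["pc"] = VECTORS[kind]            # jump to the vector table entry

def return_from_exception(cpu: dict) -> None:
    mode = cpu["cpsr"] & 0x1F
    cpu["pc"] = cpu["lr"][mode]          # LR -> PC
    cpu["cpsr"] = cpu["spsr"][mode]      # SPSR -> CPSR restores the old mode

cpu = {"cpsr": 0b10000, "pc": 0x8000, "lr": {}, "spsr": {}}  # User mode
enter_exception(cpu, "irq", return_addr=0x8004)
in_handler = (cpu["pc"], cpu["cpsr"])
return_from_exception(cpu)
print(in_handler, (cpu["pc"], cpu["cpsr"]))
```
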
### Exceptions in Cortex M are completely different
### Security Extensions (TrustZone)
- implements 2 virtual machines on same hardware
  Normal                          Secure
  Applications|Applications       Trusted services
  GuestOS     |GuestOS            Trusted OS
          Hypervisor
  Transition is handled by Secure Monitor Program
  
### ARM Instruction Set are 32bit
- Most instructions can be conditionally executed: each instruction has a
  condition field and is not executed if it doesn't match the current ALU
  status flags in CPSR.
- load/store instruction set - no direct manipulation of memory
- syntax of instruction
  SUB r0,r1,#5 => r0=r1-5
  ADD r2,r3,r3,LSL #2 => r2=r3+(r3*4) i.e r3 shifted left 2 places
  ANDS r4,r4,#0x20 //notice suffix S; means ALU condition codes in CPSR will be
  //updated, by default they are not changed if no suffix S
  ADDEQ r5,r5,r6 i.e. if (EQ) r5 = r5 + r6, //EQ: Z flag is set (last result
  //was zero, or the compared operands were equal)
  B <Label> // branch instructions are PC relative, the offset is computed
  during compilation
  LDR r0,[r1] => r0=*r1 it simply loads memory pointed by r1 to r0
  STRNEB r2,[r3,r4] => if (NE) *(r3+r4) = r2 //stores least significant byte of
  r2 to address pointed by r3+r4 if NE condition is true
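
The condition field check and the shifted-operand ADD can be mimicked in a few lines. A sketch assuming the standard ARM condition mnemonics and flag meanings; the function names are illustrative.

```python
def condition_passed(cond: str, n: int, z: int, c: int, v: int) -> bool:
    """Evaluate an ARM condition mnemonic against the N, Z, C, V flags."""
    table = {
        "EQ": z == 1,              "NE": z == 0,
        "CS": c == 1,              "CC": c == 0,
        "MI": n == 1,              "PL": n == 0,
        "VS": v == 1,              "VC": v == 0,
        "HI": c == 1 and z == 0,   "LS": c == 0 or z == 1,
        "GE": n == v,              "LT": n != v,
        "GT": z == 0 and n == v,   "LE": z == 1 or n != v,
        "AL": True,                # always
    }
    return table[cond]

def add_shifted(rn: int, rm: int, lsl: int) -> int:
    """ADD rd, rn, rm, LSL #lsl on 32-bit registers."""
    return (rn + (rm << lsl)) & 0xFFFFFFFF

# ADDEQ only executes when the Z flag is set.
print(condition_passed("EQ", n=0, z=1, c=0, v=0))
# ADD r2,r3,r3,LSL #2 with r3 = 3 gives r3 * 5.
print(add_shifted(3, 3, 2))
```
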

### Thumb instruction set are 16bit, giving roughly 35% better code density
- Thumb2 extension enables a mixed 16/32-bit instruction set, which gives ARM
  performance with Thumb code density. The majority of C/C++ code is compiled
  to Thumb for Thumb2 capable cores. compilers choose thumb; if you are
  writing assembly by hand, prefer ARM for ease.
- some ARM devices support other instruction sets too.
  - VFP vector floating point instructions supported by many processors with
    floating point coprocessor and/or software support libraries
  - NEON is a wide SIMD data processing architecture, intended for media apps
  - ways to use NEON: code in assembly, use compiler intrinsics, let the ARM C
    compiler automatically vectorize C code, or use the OpenMAX DL libraries
  
## Asimov's 3 laws:
- A robot may not injure a human being or, through inaction, allow a human
  being to come to harm
- A robot must obey the orders given it by human beings except where such
  orders would conflict with the first law
- A robot must protect its own existence as long as such protection does not
  conflict with the first or second law