# Artificial Neural Networks

- A computer program is said to learn from experience E with respect to some
  class of tasks T and performance measure P, if its performance at tasks in T,
  as measured by P, improves with experience E.
  - Tom Mitchell
- T: learning rules/functions from examples.
- An example is a vector of real numbers or boolean variables.
- Experience E: more examples should give better performance P at task T.
- P: a metric that quantifies the quality of the learnt rules or functions.
- Supervised learning: uses labeled examples, e.g. for classification.
- Unsupervised learning: uses unlabeled examples; structure is found by
  computing distance metrics between them.
- Reinforcement learning: learns from rewards obtained in agent-environment
  interaction.
- Loss function: `J(w) = 1/2 * Σ[i=1, n](yi - oi)^2`
- SRGAN: Super Resolution Generative Adversarial Network
- DCSRN: Densely Connected Super Resolution Network
- SSIM: Structural Similarity Index Measure
- CNN: Convolutional Neural Network
- LSTM: Long Short-Term Memory, a kind of RNN
- RNN: Recurrent Neural Network
- NCHW: tensor layout with batch N, channels C, height H, width W (NCDHW adds
  depth D)
- mAP: mean Average Precision
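
A minimal sketch of the loss function from the list above, in plain Python. The function name `half_sse` and the sample numbers are made up for illustration; `y` holds the labels and `o` the network outputs.

```python
def half_sse(y, o):
    """J = 1/2 * sum_i (y_i - o_i)^2 for labels y and network outputs o."""
    if len(y) != len(o):
        raise ValueError("y and o must have the same length")
    return 0.5 * sum((yi - oi) ** 2 for yi, oi in zip(y, o))


# Hypothetical labels and outputs for three examples.
targets = [1.0, 0.0, 1.0]
outputs = [0.5, 0.5, 1.0]
print(half_sse(targets, outputs))  # 0.25
```

A perfect prediction gives a loss of 0; the factor 1/2 only exists so that the derivative has no stray factor of 2.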

## Training using backpropagation
- loss function: `J(w) = 1/2 * Σ[i=1, n](yi - oi)^2`
- yi is the actual classification
- oi is the output classification inferred by the neural network
- To find the weights that minimize the loss, solving `d(J(w))/dw = 0` directly is impractical because w has very many dimensions. Instead, the weights are updated step by step against the gradient of the loss, using stochastic gradient descent.
- `C_0 = Σ[j=0, nL-1](a_j_L - y_j)^2`
- let `Z_j_L = w_j0_L*a_0_(L-1) + w_j1_L*a_1_(L-1) + w_j2_L*a_2_(L-1) + b_j_L`
- `a_j_L = σ(Z_j_L)`
- Stochastic gradient descent (SGD):
  - It is one of several optimizers; more recently Adam is often chosen instead.
  ```
  ∂C_0/∂w_jk_L    = (∂Z_j_L/∂w_jk_L)*(∂a_j_L/∂Z_j_L)*(∂C_0/∂a_j_L)
                  = a_k_(L-1)*σ'(Z_j_L)*(∂C_0/∂a_j_L)
  ∂C_0/∂b_j_L     = (∂Z_j_L/∂b_j_L)*(∂a_j_L/∂Z_j_L)*(∂C_0/∂a_j_L)
                  = 1*σ'(Z_j_L)*(∂C_0/∂a_j_L)
  ∂C_0/∂a_k_(L-1) = Σ[j=0, nL-1](∂Z_j_L/∂a_k_(L-1))*(∂a_j_L/∂Z_j_L)*(∂C_0/∂a_j_L)
                  = Σ[j=0, nL-1](w_jk_L*σ'(Z_j_L)*(∂C_0/∂a_j_L))
  for any hidden layer l:
  ∂C/∂a_k_l       = Σ[j=0, n_(l+1)-1](w_jk_(l+1)*σ'(Z_j_(l+1))*(∂C/∂a_j_(l+1)))
  ```
- Back propagation
  ```
  ∂w_L = A^T*𝛿^1
  ∂C/∂W_L1 = X^T*𝛿^2
  ∂C/∂W_Ll = -Σ(Y-O)f'(G)∂G/∂W_Ll
  =(𝛿_1)^1*a_11
  𝛿^2 = 𝛿^1*W'^T*∂C/∂a_j_l
  ∂C_0/∂w_jk_L = X^T*
  ∂J/∂w_L
  
  or
  
  gradientOfLayer = derivativeOfLayer*gradientOfPreviousLayer
  ```
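
The chain-rule expressions above can be tried out on a single sigmoid layer. This is only a sketch under stated assumptions: the cost is `C_0 = Σ(a_j - y_j)^2`, so `∂C_0/∂a_j = 2*(a_j - y_j)`, and the weights, inputs, learning rate and all function names are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, a_prev):
    """Return (z, a) with z_j = sum_k w[j][k]*a_prev[k] + b[j], a_j = sigmoid(z_j)."""
    z = [sum(wjk * ak for wjk, ak in zip(row, a_prev)) + bj
         for row, bj in zip(w, b)]
    return z, [sigmoid(zj) for zj in z]


def cost(a, y):
    """C_0 = sum_j (a_j - y_j)^2."""
    return sum((aj - yj) ** 2 for aj, yj in zip(a, y))


def gradients(w, b, a_prev, y):
    """Gradients of C_0 w.r.t. weights, biases and previous-layer activations."""
    _, a = forward(w, b, a_prev)
    # delta_j = sigma'(z_j) * dC_0/da_j, using sigma'(z) = a*(1 - a)
    delta = [aj * (1.0 - aj) * 2.0 * (aj - yj) for aj, yj in zip(a, y)]
    grad_w = [[dj * ak for ak in a_prev] for dj in delta]  # a_k * delta_j
    grad_b = list(delta)                                   # 1 * delta_j
    grad_a_prev = [sum(w[j][k] * delta[j] for j in range(len(w)))
                   for k in range(len(a_prev))]            # sum_j w_jk * delta_j
    return grad_w, grad_b, grad_a_prev


def sgd_step(w, b, a_prev, y, lr=0.5):
    """One gradient-descent update on a single example."""
    grad_w, grad_b, _ = gradients(w, b, a_prev, y)
    new_w = [[wjk - lr * g for wjk, g in zip(row, g_row)]
             for row, g_row in zip(w, grad_w)]
    new_b = [bj - lr * g for bj, g in zip(b, grad_b)]
    return new_w, new_b


# Hypothetical layer: 3 inputs, 2 outputs.
w = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.5]]
b = [0.05, -0.05]
a_prev = [1.0, 0.5, -1.0]
y = [1.0, 0.0]

before = cost(forward(w, b, a_prev)[1], y)
w2, b2 = sgd_step(w, b, a_prev, y)
after = cost(forward(w2, b2, a_prev)[1], y)
print(before, after)  # the cost should be lower after the step
```

In a deeper network, `grad_a_prev` is what gets handed to the layer before, which is the "gradient of the layer = derivative of the layer * gradient coming from the layer processed just before it" rule.
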
## regression
- Regularization: prevents models from overfitting on the training data.
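
One common form of regularization is an L2 penalty (weight decay) added to the loss. The notes do not say which kind is meant, so L2 is an assumption here, and the names `l2_penalty`, `regularized_loss` and the strength `lam` are invented for this sketch.

```python
def l2_penalty(weights, lam):
    """lam/2 * sum of squared weights; biases are usually left out."""
    return 0.5 * lam * sum(w * w for w in weights)


def regularized_loss(y, o, weights, lam):
    """Half sum-of-squares data loss plus the L2 penalty."""
    data_loss = 0.5 * sum((yi - oi) ** 2 for yi, oi in zip(y, o))
    return data_loss + l2_penalty(weights, lam)


# Hypothetical labels, outputs and weights.
print(regularized_loss([1.0, 0.0], [0.5, 0.5], [2.0, -1.0], lam=0.1))
```

Large weights now cost something, so the optimizer prefers smaller weights unless the data loss clearly benefits from bigger ones.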

## multiclass classification

## CNN
- layers:
  1. Input layer
  2. Convolution layer
  3. Pooling layer: downsamples the image spatially. It is a fixed
     transformation with no learned weights, typically max pooling or average
     pooling.
     - its hyperparameters are
       - F: Filter size
       - S: Stride
     - It produces 3D volume of dimensions C'xH'xW' where
       - C' = C
       - H' = 1 + (H-F)/S
       - W' = 1 + (W-F)/S
  4. ReLU: Rectified Linear Unit
  5. Fully Connected layer
     - Layers 2, 3 and 4 repeat, and finally there is a fully connected layer.

- Convolution
  - input image dimensions C(channels)xH(height)xW(width)
  - hyper parameters:
    - M: number of filters
    - F: Filter size
    - S: Stride
    - P: Zero padding
  - the output 3D volume has dimensions C'xH'xW' where
    - C' = M
    - H' = 1 + (H-F+2*P)/S
    - W' = 1 + (W-F+2*P)/S
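
The output-size formulas for convolution and pooling can be checked with a few lines of Python. This is a sketch, not how any framework does it: the function names are made up, and raising an error when the filter does not tile the input evenly is an assumption (libraries typically round down instead).

```python
def conv_output_shape(c, h, w, m, f, s, p):
    """(C', H', W') for M filters of size FxF, stride S, zero padding P."""
    if (h - f + 2 * p) % s or (w - f + 2 * p) % s:
        raise ValueError("filter does not tile the padded input evenly")
    return m, 1 + (h - f + 2 * p) // s, 1 + (w - f + 2 * p) // s


def pool_output_shape(c, h, w, f, s):
    """(C', H', W') for pooling: channel count unchanged, no padding."""
    if (h - f) % s or (w - f) % s:
        raise ValueError("filter does not tile the input evenly")
    return c, 1 + (h - f) // s, 1 + (w - f) // s


def max_pool2d(channel, f, s):
    """Max-pool one HxW channel given as a list of rows."""
    h, w = len(channel), len(channel[0])
    out_h, out_w = 1 + (h - f) // s, 1 + (w - f) // s
    return [[max(channel[i * s + di][j * s + dj]
                 for di in range(f) for dj in range(f))
             for j in range(out_w)]
            for i in range(out_h)]


print(conv_output_shape(3, 32, 32, m=16, f=5, s=1, p=2))  # (16, 32, 32)
print(pool_output_shape(16, 32, 32, f=2, s=2))            # (16, 16, 16)
print(max_pool2d([[1, 2, 5, 6],
                  [3, 4, 7, 8],
                  [9, 1, 2, 3],
                  [4, 5, 6, 7]], f=2, s=2))               # [[4, 8], [9, 7]]
```

With F=5, S=1, P=2 the convolution keeps the 32x32 spatial size, and the 2x2 pooling with stride 2 halves it.
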
## Transformers
- Attention is all that is needed.

## convert llama original to gguf
```
pip install "transformers[torch]" tiktoken blobfile sentencepiece
hf auth login
# paste your Hugging Face access token at the prompt; never store it in notes
python .venv/lib/python3.11/site-packages/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir Meta-Llama-3.1-8B-Instruct/ --model_size 8B --output_dir hf --llama_version 3.1 --instruct True
python3 convert_hf_to_gguf.py --outtype f32 --outfile ../meta-llama-3.1-8B-instruction_f32.gguf ../hf/
llama.cpp-testing/llama-quantize meta-llama-3.1-8B-instruction_f32.gguf meta-llama-3.1-8B-instruction_Q8_0.gguf Q8_0

../../convert_hf_to_gguf.py --outfile llama.gguf --outtype bf16 --model-name unsloth/Llama-3.2-1B-Instruct --verbose /home/Necktwi/.cache/huggingface/hub/models--unsloth--Llama-3.2-1B-Instruct/snapshots/5a8abab4a5d6f164389b1079fb721cfab8d7126c/
```

## build llama.cpp
```
cd ~/workspace/llama.cpp/build
sudo apt install libstdc++-14-dev
export GGML_CUDA_ENABLE_UNIFIED_MEMORY=1
# -DGGML_VULKAN=ON is an alternative backend flag, but very slow for now
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" cmake -DGGML_HIP=ON -DGGML_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1101 -DGPU_TARGETS=gfx1101 -DCMAKE_BUILD_TYPE=Debug -DGGML_HIP_ROCWMMA_FATTN=ON -DCMAKE_C_COMPILER=amdclang -DCMAKE_CXX_COMPILER=amdclang++ -DLLAMA_BUILD_COMMON=ON -DLLAMA_BUILD_TOOLS=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_EXAMPLES=ON -DCMAKE_CXX_FLAGS="-g -O0" ..
cmake --build . --config Debug -- -j "$(nproc)"
```
## run llama.cpp
```
~/workspace/llama.cpp/build/bin/llama-cli -m ~/workspace/DeepSeek-R1-Distill-Llama-8B/hf/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf -ngl 999 -i --prompt-cache ~/DSCache -c 5110

./llama-cli -hf unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q3_K_M -ngl 999 -i --prompt-cache /home/gowtham/llama.cpp.cacheDSR114B -c 29999 -mli -sysf sysp.md -f prompt.txt -fa

# Load and run the model:
llama-server -hf unsloth/gpt-oss-20b-GGUF:Q4_K_M
```


## Transformers: attention is all you need
### Scaled Dot-Product Attention
- `Attention(Q,K,V) = softmax(Q*K^T/sqrt(d_k))*V`
  - Q: query: what the token is looking for
  - K: key: what the token has
  - V: value: what the token gives
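
A small pure-Python sketch of the attention formula, to make the shapes concrete. `q`, `k` and `v` are lists of row vectors (one row per token); the function names and the sample numbers are illustrative assumptions, and real implementations use batched tensor operations instead of loops.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(k[0])
    scale = 1.0 / math.sqrt(d_k)
    out = []
    for qi in q:
        # How well this query matches every key, scaled by sqrt(d_k).
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights = softmax(scores)
        # Weighted average of the value rows.
        out.append([sum(wt * vj[c] for wt, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


# One query, two identical keys: both values get weight 0.5.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [1.0, 0.0]]
v = [[2.0, 0.0], [4.0, 2.0]]
print(attention(q, k, v))  # [[3.0, 1.0]]
```

When one key matches the query much better than the others, its softmax weight approaches 1 and the output approaches that token's value row.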

## train gpt-oss

### build triton
```
git clone https://github.com/triton-lang/triton
cd triton/
export TRITON_ROCM_ARCH=gfx1101
export PYTORCH_ROCM_ARCH=gfx1101
export HSA_OVERRIDE_GFX_VERSION=11.0.1
export MAX_JOBS=1
export CMAKE_BUILD_PARALLEL_LEVEL=1
export NINJA_NUM_JOBS=1
export MAKEFLAGS="-j1"
export TRITON_CODEGEN=hip
export HCC_AMDGPU_TARGET=gfx1101 
pip install -r python/requirements.txt
pip install -e . --verbose --no-build-isolation
pip install -e python/triton_kernels
```

### install gpt-oss
```
git clone https://github.com/openai/gpt-oss.git
cd gpt-oss
pip install -e ".[triton]" --no-deps
```


## opencode
```

```