So for this whole procedure to work, the matrix has to be full rank, and if it isn't, it has to be modified to include only independent column vectors?

Yes, exactly! This highlights a crucial point about least squares solutions:

For AᵀAx = Aᵀb to have a unique solution, AᵀA must be invertible. This requires that:

  1. A has full column rank (all columns are linearly independent)
  2. The number of rows (observations) is ≥ the number of columns (parameters), which full column rank already implies
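
As a minimal sketch of the full-rank case, assuming NumPy and a small made-up data set (the matrix `A`, the vector `b`, and the variable names are hypothetical, chosen only for illustration), one can check the column rank and then solve the normal equations directly:

```python
import numpy as np

# Hypothetical data: 4 observations, 2 parameters (intercept and slope).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 3.0, 5.0, 7.0])

m, n = A.shape
rank = np.linalg.matrix_rank(A)

# Full column rank means rank == number of columns (which forces m >= n).
full_column_rank = (rank == n)

# With full column rank, AᵀA is invertible, so AᵀA x = Aᵀ b has exactly one solution.
x = np.linalg.solve(A.T @ A, A.T @ b)
```

Here `b` was built as 1 + 2t for t = 0, 1, 2, 3, so the fit recovers an intercept of 1 and a slope of 2.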

If A is rank-deficient (has linearly dependent columns):

  1. AᵀA will be singular (non-invertible)
  2. There will be infinitely many solutions
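  3. To fix this, we need to either remove the dependent columns (keeping only a linearly independent set) or select one particular solution, for example the minimum-norm solution given by the pseudoinverse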
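
To make the rank-deficient case concrete, here is a small sketch, assuming NumPy. The matrix is made up so that its third column is the sum of the first two, and the two remedies shown (dropping a dependent column, and taking the minimum-norm solution via the pseudoinverse) are illustrative choices rather than the only options. The tolerance `1e-10` is likewise an assumption for this example.

```python
import numpy as np

# Hypothetical rank-deficient matrix: column 3 = column 1 + column 2.
A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0],
              [1.0, 3.0, 4.0]])
b = np.array([1.0, 3.0, 5.0, 7.0])

# Only 2 of the 3 columns are independent, so AᵀA (3x3) is singular.
rank = np.linalg.matrix_rank(A, tol=1e-10)

# Remedy 1: keep an independent subset of columns and solve the normal equations.
A_reduced = A[:, :2]
x_reduced = np.linalg.solve(A_reduced.T @ A_reduced, A_reduced.T @ b)

# Remedy 2: keep all columns and pick the minimum-norm least squares solution.
x_min_norm = np.linalg.pinv(A, rcond=1e-10) @ b

# The coefficient vectors differ, but the fitted values A x agree.
fitted_reduced = A_reduced @ x_reduced
fitted_min_norm = A @ x_min_norm
```

Both remedies describe the same fit to the data; they differ only in which of the infinitely many coefficient vectors they report.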

In practical terms, when columns are nearly linearly dependent (multicollinearity), AᵀA is technically invertible but ill-conditioned, so the solution can be numerically unstable. Forming AᵀA squares the condition number of A, which is why techniques like QR decomposition are often preferred over directly computing (AᵀA)⁻¹.
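
The sketch below illustrates the numerical point, assuming NumPy. The nearly collinear matrix and the offset `delta` are made up for illustration. It compares the condition number of A with that of AᵀA, then solves the least squares problem through a QR factorization instead of forming AᵀA.

```python
import numpy as np

# Hypothetical nearly collinear columns: the second differs from the first by ~1e-6.
delta = 1e-6
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + delta],
              [1.0, 1.0 - delta]])
x_true = np.array([1.0, 2.0])
b = A @ x_true

# The condition number of AᵀA is roughly the square of the condition number of A.
cond_A = np.linalg.cond(A)
cond_AtA = np.linalg.cond(A.T @ A)

# QR route: A = QR with orthonormal Q, so the least squares solution solves R x = Qᵀ b.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)
```

Because the QR route works with A itself, its error depends on the condition number of A rather than on its square.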