A Performance Model For GPU Architectures: Analysis And Design Of Fundamental Algorithms