Code is not only an implementation, but also a presentation of a way of thinking 📅 2025-12-26 ✍️ 965 words ⏱️ 3 min read Soft Skills
My first Multi-GPU kernel: Writing All-to-all for AMD MI300X 📅 2025-11-02 ✍️ 10412 words ⏱️ 24 min read CUDA
Writing Speed-of-Light Flash Attention for 5090 in CUDA C++ 📅 2025-08-23 ✍️ 8753 words ⏱️ 20 min read CUDA FlashAttention