Pull requests: NVIDIA/TransformerEngine
Pull requests list
[PyTorch] Expose function to bulk-allocate tensors backed by the same buffer
#2900
opened Apr 18, 2026 by
timmoon10
Collaborator
9 of 13 tasks
add support for enabling cuda graph under thd format in megatron.
#2898
opened Apr 17, 2026 by
HaochenYuan
13 tasks
Improve the dimension checks for the FP8 recipes
#2894
opened Apr 16, 2026 by
ptrendx
Member
13 tasks
Scaled Bias Add support after CUBLAS GGEMM
#2885
opened Apr 15, 2026 by
vthumbe1503
Collaborator
13 tasks
[Debug] Add AutoswitchGEmm for Debug Precision Tool
#2883
opened Apr 15, 2026 by
shangxiaokang
Draft
3 of 13 tasks
fix(readme): update broken links and modernize project description
#2879
opened Apr 14, 2026 by
sbhavani
Collaborator
3 of 13 tasks
[PyTorch] Split TE ops op_forward into op_forward and setup_context
#2877
opened Apr 14, 2026 by
pggPL
Collaborator
5 of 7 tasks
[DONOT MERGE] Wgrad cute dsl v2
#2872
opened Apr 13, 2026 by
vthumbe1503
Collaborator
Draft
13 tasks
[JAX] Add debug validation mode for runtime group size alignment
#2867
opened Apr 11, 2026 by
jberchtold-nvidia
Collaborator
Draft
13 tasks
Optimizations for MXFP8/NVFP4 dequantize kernels
#2865
opened Apr 10, 2026 by
YigongQin
8 of 13 tasks
Adds GEMM Profiling Guide to TE
#2863
opened Apr 9, 2026 by
jomitchellnv
Contributor
7 tasks
Add cpplint and ruff linter to pre-commit and fix lint violations
#2853
opened Apr 8, 2026 by
pstjohn
Contributor
Bump transformers from 4.55.0 to 5.0.0rc3 in /docs/examples/te_gemma
dependencies
Pull requests that update a dependency file
python
Pull requests that update python code
#2851
opened Apr 8, 2026 by
dependabot
bot
Bump transformers from 4.57.0 to 5.0.0rc3 in /docs/examples/te_llama
dependencies
Pull requests that update a dependency file
python
Pull requests that update python code
#2850
opened Apr 8, 2026 by
dependabot
bot
Skip activation kernels when tensor size is zero
bug
Something isn't working
#2848
opened Apr 8, 2026 by
timmoon10
Collaborator
8 of 13 tasks
[Core] Report CUDA versions when NVRTC compilation fails
enhancement
New feature or request
#2842
opened Apr 7, 2026 by
timmoon10
Collaborator
8 of 13 tasks
[CUDNN] Update frontend to version 1.22 and add cuDNN 9.20 path for SM arch >100
#2838
opened Apr 5, 2026 by
zmelumian972
2 of 3 tasks
[PyTorch] Fix FlashAttention 2 head_dim > 192 on sm103 and other architectures
#2836
opened Apr 4, 2026 by
pedramr
1 task done
[Pyt][Common] Enabling/Guarding sm120 support (non - attention)
2.15.0
#2833
opened Apr 3, 2026 by
KshitijLakhani
Collaborator
Draft
4 of 13 tasks
Add capture_time_hooks to make_graphed_callables for non-capturable per-callable hooks
#2831
opened Apr 3, 2026 by
buptzyb
Contributor
1 of 13 tasks