[transformers] warmup_ratio is deprecated and will be removed in v5.2. Use `warmup_steps` instead.
trainable params: 5,070,848 || all params: 757,463,872 || trainable%: 0.6695
  0%|          | 0/12 [00:00<?, ?it/s]
/opt/conda/lib/python3.10/site-packages/torch/autograd/graph.py:829: UserWarning: Attempting to run cuBLAS, but there was no current CUDA context! Attempting to set the primary context... (Triggered internally at /workspace/framework/mcPytorch/aten/src/ATen/cuda/CublasHandlePool.cpp:183.)
  return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
2026-05-15 17:03:20 | ERROR    | peft-platform | Training failed for job bc4d7b3d-6f50-4877-aae7-1a1d0fc16da2: out of resource: shared memory, Required: 106496, Hardware limit: 65536. Reducing block sizes or `num_stages` may help.
2026-05-15 17:03:20 | ERROR    | peft-platform | Job bc4d7b3d-6f50-4877-aae7-1a1d0fc16da2 failed: out of resource: shared memory, Required: 106496, Hardware limit: 65536. Reducing block sizes or `num_stages` may help.