Group Relative Policy Optimization