A policy gradient agent in reinforcement learning aims to maximize the expected reward obtained by following its policy, adjusting the policy's parameters directly along the gradient of that expectation.

| Field | Value |
| --- | --- |
| Framework | MITRE D3FEND |
| Ontology URI | d3f:PolicyGradient |
| Local Identifier | PolicyGradient |
| Publication Status | Exists in ontology only |
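
To make the definition concrete, here is a minimal sketch of a policy gradient update (REINFORCE-style, with a running-mean baseline) on a toy multi-armed bandit. None of this comes from the D3FEND entry: the bandit setting, the softmax policy, and the names `softmax`, `train_bandit`, `arm_means`, `lr`, and `noise_std` are assumptions chosen for illustration. It uses only the Python standard library.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(arm_means, steps=2000, lr=0.1, noise_std=0.5, seed=0):
    """Illustrative REINFORCE-style training on a Gaussian bandit.

    The policy is a softmax over one preference per arm (hypothetical setup).
    Returns the final action probabilities.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)  # policy parameters
    baseline = 0.0                  # running mean of observed rewards

    for t in range(1, steps + 1):
        probs = softmax(prefs)
        # Sample an action from the current policy.
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        # Observe a noisy reward for the chosen arm.
        reward = rng.gauss(arm_means[action], noise_std)

        # Subtracting a baseline reduces the variance of the update.
        advantage = reward - baseline
        baseline += (reward - baseline) / t

        # For a softmax policy, d log pi(action) / d prefs[a]
        # equals 1[a == action] - probs[a].
        for a in range(len(prefs)):
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad_log  # gradient ascent step

    return softmax(prefs)


if __name__ == "__main__":
    # Arm 1 pays more on average, so its probability should grow.
    print(train_bandit([0.0, 1.0]))
```

Each update moves the parameters in the direction of reward times the gradient of the log-probability of the sampled action, which is a sample-based estimate of the gradient of the expected reward. Full reinforcement learning problems apply the same idea to returns accumulated over whole trajectories rather than to single-step rewards.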