What are the Spinning Up tools used for?

leonhartdreyse · November 26, 2023, 4:11pm

Is it for trying out logic training? Anything else?
If you search for spinningup.openai.com you'll find the detailed engineering tutorials. But what is it for?
Or is it only for some specific functionality in specific fields?

_j · November 26, 2023, 5:37pm

It is about machine reinforcement learning software stacks from 2018.

Good place for conspiracy theorists to search for things they don’t understand, like words starting with Q.

github.com

openai/spinningup/blob/master/docs/spinningup/keypapers.rst

=====================
Key Papers in Deep RL
=====================

What follows is a list of papers in deep RL that are worth reading. This is *far* from comprehensive, but should provide a useful starting point for someone looking to do research in the field.

.. contents:: Table of Contents
    :depth: 2


1. Model-Free RL
================

a. Deep Q-Learning
------------------


.. [#] `Playing Atari with Deep Reinforcement Learning <https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf>`_, Mnih et al, 2013. **Algorithm: DQN.**

.. [#] `Deep Recurrent Q-Learning for Partially Observable MDPs <https://arxiv.org/abs/1507.06527>`_, Hausknecht and Stone, 2015. **Algorithm: Deep Recurrent Q-Learning.**

This file has been truncated.

leonhartdreyse · November 27, 2023, 6:12am

Thanks for all of these resources~
I've learned policy iteration and value iteration in lectures.
As I understand it, they are about creating an agent such as a game player or an autonomous driver.
Why would reinforcement learning be used for large language models in the NLP field? I can't see how it would improve the projection layers, such as attention or masked-skeleton attention.
I don't know the exact intuition behind the personalized user-feedback models. Does adding those MDP scores into the over-parameterized feed-forward sub-layers really matter? Or did I just guess wrong about what this repository is for?
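
For readers who have not met the value iteration mentioned in this post, here is a minimal sketch on a made-up four-state chain. The MDP, the names `step` and `value_iteration`, and the discount `GAMMA` are all assumptions for illustration; none of this is taken from the Spinning Up code.

```python
GAMMA = 0.9          # discount factor (assumed)
N_STATES = 4         # hypothetical chain: states 0..3, state 3 is terminal
ACTIONS = (-1, +1)   # move left or move right


def step(state, action):
    """Deterministic toy dynamics: returns (next_state, reward)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0  # reward only for reaching the goal
    return nxt, reward


def value_iteration(tol=1e-8):
    """Repeat the Bellman optimality backup until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # the terminal state's value stays 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[nxt] for nxt, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


if __name__ == "__main__":
    print(value_iteration())
```

States closer to the goal end up with higher values, and acting greedily with respect to those values gives the "agent" the post describes.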

_j · November 27, 2023, 7:04am

Reinforcement learning can also be a game played with token sequences, where a reward model provides the score used to optimize generation.

The documentation there dates from the infancy of transformer-based language models, with many pages older than Google's paper.
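
As a rough illustration of "a game played with token sequences", the sketch below runs a REINFORCE-style policy-gradient update on a toy sequence generator. Everything in it is an assumption made for illustration: the three-token vocabulary, the per-position softmax policy (a real language model conditions on earlier tokens), and `toy_reward`, which is a hand-written stand-in for a learned reward model. It is not how any production system is implemented.

```python
import math
import random

VOCAB = ["a", "b", "c"]  # toy vocabulary (assumed)
SEQ_LEN = 4              # length of each generated sequence (assumed)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(tokens):
    """Hypothetical stand-in for a reward model: counts 'c' tokens."""
    return float(sum(1 for t in tokens if t == "c"))


def sample_sequence(logits, rng):
    """Sample one token index per position from the per-position policy."""
    return [rng.choices(range(len(VOCAB)), weights=softmax(row))[0] for row in logits]


def greedy_decode(logits):
    """Pick the highest-scoring token at each position."""
    return [VOCAB[max(range(len(VOCAB)), key=lambda v: row[v])] for row in logits]


def train(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * len(VOCAB) for _ in range(SEQ_LEN)]
    baseline = 0.0  # running mean of rewards, used to reduce variance
    for _ in range(steps):
        actions = sample_sequence(logits, rng)
        reward = toy_reward([VOCAB[a] for a in actions])
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for t, a in enumerate(actions):
            probs = softmax(logits[t])
            for v in range(len(VOCAB)):
                # Gradient of log pi(a) w.r.t. logit v is 1[v == a] - pi(v).
                grad = (1.0 if v == a else 0.0) - probs[v]
                logits[t][v] += lr * advantage * grad
    return logits


if __name__ == "__main__":
    print(greedy_decode(train()))
```

The reward only scores whole sequences, yet the update still shifts probability toward the tokens the reward favors, which is the basic idea behind optimizing generation against a reward model.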

leonhartdreyse · November 27, 2023, 11:44am

So are there any recent influential papers that have advanced the field in the last year or two, since BERT was created?

vb · November 27, 2023, 2:05pm

Check out this thread which also covers newer papers.
