The modular design of SLM Lab implies that its components can also be used in other projects. In fact, SLM Lab can be installed as a pip module and used like a typical Python package. This means you can simply run pip install (or python setup.py install without pip) and initialize an RL agent for use in your Python project.
This is especially crucial for those who want to use these algorithms in an industrial application. Often the app is part of a massive industrial system, and it is difficult or impossible to wrap that system inside the lab. Instead, the agent must be made into an importable module so it can be used inside the app, either for training or for inference in deployment.
For demonstration, we have created a standalone script to show how to do this. The solution is very lightweight: given a properly formatted spec, the agent will initialize as usual, and as long as the proper agent APIs are called, all agent functionality will work. Of course, to make use of the lab's full potential, such as distributed training and parameter search, you would still need to use SLM Lab directly.
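To sketch what such a spec looks like, here is an abridged version of the dqn_cartpole spec referenced below, which lives in slm_lab/spec/demo.json in the repository. The field values shown are illustrative only; consult the file in the repository for the authoritative version.

```json
{
  "dqn_cartpole": {
    "agent": [{
      "name": "DQN",
      "algorithm": { "name": "DQN", "action_policy": "epsilon_greedy" },
      "memory": { "name": "Replay", "max_size": 10000 },
      "net": { "type": "MLPNet", "hid_layers": [64] }
    }],
    "env": [{ "name": "CartPole-v0", "max_frame": 10000 }],
    "body": { "product": "outer", "num": 1 },
    "meta": { "max_session": 1, "max_trial": 1 }
  }
}
```

The top-level key names the spec, and the agent, env, body, and meta sections configure the algorithm, environment, agent-environment pairing, and run settings respectively.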
The demo below uses a simplified form of SLM Lab's Session class. This shows that the main control loop and API methods are already generic.
# Installation:
# 1. Clone SLM-Lab
git clone https://github.com/kengz/SLM-Lab.git
cd SLM-Lab
# 2. Install SLM Lab as pip module
pip install -e .
Let's see how we can implement a Session in an external project.
import os
# NOTE increase if needed. Pytorch thread overusage https://github.com/pytorch/pytorch/issues/975
os.environ['OMP_NUM_THREADS'] = '1'
from slm_lab.spec import spec_util
from slm_lab.lib import logger, util
from slm_lab.experiment import analysis
from slm_lab.env.openai import OpenAIEnv
from slm_lab.agent import Agent, Body
import torch


class Session:
    '''A very simple Session that runs an RL loop'''

    def __init__(self, spec):
        self.spec = spec
        self.env = OpenAIEnv(self.spec)
        body = Body(self.env, self.spec)
        self.agent = Agent(self.spec, body=body)
        logger.info('Initialized session')

    def run_rl(self):
        clock = self.env.clock
        state = self.env.reset()
        done = False
        while clock.get('frame') <= self.env.max_frame:
            if done:  # reset when episode is done
                clock.tick('epi')
                state = self.env.reset()
                done = False
            clock.tick('t')
            with torch.no_grad():
                action = self.agent.act(state)
            next_state, reward, done, info = self.env.step(action)
            self.agent.update(state, action, reward, next_state, done)
            state = next_state
            if clock.get('frame') % self.env.log_frequency == 0:
                self.agent.body.ckpt(self.env, 'train')
                self.agent.body.log_summary('train')

    def close(self):
        self.agent.close()
        self.env.close()
        logger.info('Session done and closed.')

    def run(self):
        self.run_rl()
        # this will run SLM Lab's built-in analysis module and plot graphs
        self.data = analysis.analyze_session(self.spec, self.agent.body.train_df, 'train')
        self.close()
        return self.data


# This uses SLM-Lab's existing spec. Alternatively, you can write one yourself too (see documentation for detail).
spec = spec_util.get(spec_file='slm_lab/spec/demo.json', spec_name='dqn_cartpole')
os.environ['lab_mode'] = 'train'  # set to 'dev' for rendering

# update the tracking indices
spec_util.tick(spec, 'trial')
spec_util.tick(spec, 'session')

# initialize and run session
session = Session(spec)
session_metrics = session.run()
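To underline how generic this control loop is, here is a minimal sketch of the same loop shape with stub classes in place of the SLM Lab agent and environment. DummyEnv and DummyAgent are hypothetical stand-ins (not SLM Lab classes); the point is that the loop only assumes a Gym-like reset/step interface on the environment and the act/update API on the agent.

```python
import random


class DummyEnv:
    '''Stub environment with a Gym-like reset/step interface.'''

    def __init__(self, max_frame=10):
        self.max_frame = max_frame
        self.frame = 0

    def reset(self):
        return 0.0  # initial state

    def step(self, action):
        self.frame += 1
        state, reward = float(self.frame), 1.0
        done = (self.frame % 5 == 0)  # end an episode every 5 frames
        return state, reward, done, {}


class DummyAgent:
    '''Stub agent exposing the same act/update API the Session calls.'''

    def act(self, state):
        return random.choice([0, 1])

    def update(self, state, action, reward, next_state, done):
        self.last_reward = reward  # a real agent would learn from this transition


def run_rl(agent, env):
    '''The same loop shape as Session.run_rl, minus the lab bookkeeping.'''
    total_reward, state, done = 0.0, env.reset(), False
    while env.frame < env.max_frame:
        if done:  # reset when episode is done
            state, done = env.reset(), False
        action = agent.act(state)
        next_state, reward, done, info = env.step(action)
        agent.update(state, action, reward, next_state, done)
        state = next_state
        total_reward += reward
    return total_reward


print(run_rl(DummyAgent(), DummyEnv()))  # 10 frames at reward 1.0 each -> 10.0
```

Any agent that exposes act and update in this shape can be dropped into the loop; swapping the stubs for SLM Lab's Agent and OpenAIEnv recovers the demo above.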