Build and Train General AI in OpenAI Gym: Initial Setup by sykochica

ai · @sykochica · Jan 14 '17


The [OpenAI Universe](https://universe.openai.com/) project seeks to decentralize artificial intelligence by letting anybody build bots that, ideally, can learn to play thousands of video games well. Games ranging from classic Atari titles to Minecraft and Grand Theft Auto V are available in this environment, but actually building a bot (or agent) to act in them is left to us. This is where OpenAI Gym comes into play.
<center>http://i.imgur.com/Yt5Ojbo.jpg</center>
Instead of making a bot for each individual game (i.e. task-specific AI), the goal here is to make one bot that plays many games well (i.e. general AI) without knowing anything about them at the start. All we have available is a view of the screen (visual recognition) and the controls each game offers, such as an Atari paddle (moving left/right) or a joystick (left/right/up/down and a button).

When I first read up on this, it was stated that you only needed 9 lines of Python code (which is true) to get started! That sounded too awesome to not give it a try! 

However, getting the **environment** set up to run these 9 lines of code successfully was another matter. Having mostly been a Windows user, I found this took a little longer than I expected, since the project only supports Linux and OSX (Apple's operating system).

But as of yesterday I have a running version. While the image below may not look terribly exciting, it took me some time to get an Atari game to run and render. From here I can finally start the fun part of working on the actual intelligence of the bots to play these games.
<center>http://i.imgur.com/4AUF1Do.jpg</center>
<center>http://i.imgur.com/Xs4cJOi.png</center>

## <center>Reinforcement Learning</center>
OpenAI Universe sets up the environments that our bots play in. This can be any of the thousands of games already available, and several can run at the same time.
OpenAI Gym is where we build our game bots (or agents) before putting them into the game universe. From there our agents read the game screen and try to maximize their "score." This is how we close the loop for reinforcement learning.
<center>http://i.imgur.com/b2ssgOv.png</center>
Using the example of a Super Mario game, a simple game bot (agent) could just press right, earning a higher score the farther right it goes. As we all know, there are lots of ways to be killed (ending the game), such as getting hit by a monster or falling down a hole. So just pressing right will only get us so far. Over time we need to adapt that instruction to include things like jumping and shooting fireballs. This is where we will end up making use of reinforcement learning.
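
To make the "just press right" idea concrete, here is a minimal sketch of the agent, environment and reward loop. Everything in it is made up for illustration: `ToyLevel`, the hole positions and the reward values are assumptions, not anything from Universe or Gym. It also does no learning. It only compares two hand-written policies so you can see the kind of score difference a learning agent would be chasing.

```python
class ToyLevel:
    """Hypothetical one-dimensional level: walk right and jump over holes."""

    def __init__(self, length=10, holes=(3, 7)):
        self.length = length
        self.holes = set(holes)
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # "right" moves one tile; "jump" moves two tiles, clearing one hole.
        self.position += 2 if action == "jump" else 1
        if self.position in self.holes:
            return self.position, -1.0, True   # fell in a hole: game over
        if self.position >= self.length:
            return self.position, 10.0, True   # reached the end of the level
        return self.position, 1.0, False       # small reward for progress


def always_right(position, holes):
    """The naive agent: press right no matter what."""
    return "right"


def jump_over_holes(position, holes):
    """A slightly smarter agent: jump when the next tile is a hole."""
    return "jump" if position + 1 in holes else "right"


def play(env, policy):
    """Run one episode and return the total reward the policy collected."""
    position = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(position, env.holes)
        position, reward, done = env.step(action)
        total += reward
    return total


print(play(ToyLevel(), always_right))     # dies at the first hole
print(play(ToyLevel(), jump_over_holes))  # finishes the level
```

The second policy "cheats" by reading the hole positions directly. A real agent would only see the screen and would have to discover the jumping behaviour from the reward signal.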

The game world is loaded by OpenAI Universe (the environment), the game bot is loaded with OpenAI Gym (the agent), and over time we refine our actions to get the highest score (reward) possible by finishing the level.
Follow-up posts will cover the actual bot training; this one solely discusses the setup.

<center>http://i.imgur.com/Xs4cJOi.png</center>
## <center>How to Set Up OpenAI Universe and Gym</center>
**[NOTE: There are many ways you can get OpenAI running. This was just MY successful method.]**

For those who run Windows, download and install [VirtualBox](https://www.virtualbox.org/) to get a Linux environment. Next, download the Linux system we'll be using (in my case Ubuntu 16.04) from [here.](http://releases.ubuntu.com/16.04/) This file (.iso) is about 1.5 GB, so it may take a little while depending on your internet connection. Make sure to save it somewhere you can easily find it, such as your desktop.

Now we'll create your first virtual machine running Ubuntu by following [these steps.](http://askubuntu.com/questions/142549/how-to-install-ubuntu-on-virtualbox) Make sure to do the last step of removing the installation file (the .iso you downloaded) from the "optical drive." Otherwise when you restart your virtual machine it will try to start the installation process all over again. [If you follow the instructions, say YES to force unmount when prompted.]

From here I followed [this guide](https://alliseesolutions.wordpress.com/2016/12/08/openai-universe-installation-guide-ubuntu-16-04/), which uses [Anaconda](https://www.continuum.io/). Anaconda includes many of the libraries we are going to need, such as SciPy and NumPy. Make sure to follow the "For Ubuntu 16.04" instructions, not the 14.04 ones. Everything in this guide worked well for me, except for the TensorFlow 0.11 section. [**Pip didn't work for me with TensorFlow**]

If you get messages that the TensorFlow wheel isn't compatible or available, just go [here](https://www.tensorflow.org/get_started/os_setup#anaconda_installation) and follow the Anaconda installation, which has you run these commands:
`conda create -n tensorflow python=3.5`

`$ source activate tensorflow`
`(tensorflow)$  # Your prompt should change`

`# Linux/Mac OS X, Python 2.7/3.4/3.5, CPU only:`
`(tensorflow)$ conda install -c conda-forge tensorflow`
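
Before moving on, it can help to confirm that the libraries are actually visible from the activated environment. The following is a small sketch of my own, not part of either guide: the helper name `missing_packages` is made up, and the list of package names is an assumption you can change. It only checks whether each top-level package can be found, without importing it.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list: the libraries mentioned so far in this setup.
    wanted = ["numpy", "scipy", "tensorflow"]
    missing = missing_packages(wanted)
    if missing:
        print("Not found: {}".format(", ".join(missing)))
    else:
        print("All packages found.")
```

Run it with the `(tensorflow)` prompt active. If a package shows up as not found there, it was most likely installed into a different environment.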

After you have Tensorflow installed you can go back to [this Guide](https://alliseesolutions.wordpress.com/2016/12/08/openai-universe-installation-guide-ubuntu-16-04/) and pick back up at the heading of "Next we can get started by installing Docker:".

This will walk you through the setup of Docker, OpenAI Universe and OpenAI Gym.
At the end it will have you install a starter agent for playing Pong. To get the MsPacman game from my screenshot running, just save this code into a Python file (I used test.py):
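```python
import gym

env = gym.make('MsPacman-v0')  # create the Ms. Pac-Man Atari environment
env.reset()                    # start a new game
for _ in range(1000):
    env.render()               # draw the current frame in a window
    env.step(env.action_space.sample())  # take a random action
```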
This code came from [here.](https://gym.openai.com/docs)
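
The snippet above throws away everything `env.step` returns. As a possible next step, here is a sketch that keeps a running score and stops when the game ends. It assumes the Gym API of the time, where `step` returns four values (observation, reward, done, info). Later Gym releases changed this, so check your version. `run_random_episode` and `StubEnv` are names I made up, and the stub only exists so the example runs without Gym installed.

```python
import random


def run_random_episode(env, max_steps=1000):
    """Play one episode with random actions and return the total reward."""
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        # Assumes the four-value return of early Gym versions.
        _observation, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


class _StubSpace:
    """Hypothetical action space with two actions."""

    def sample(self):
        return random.choice([0, 1])


class StubEnv:
    """Stand-in environment: pays a reward of 1 per step and ends after 5 steps."""

    action_space = _StubSpace()

    def reset(self):
        self.steps = 0
        return 0

    def step(self, action):
        self.steps += 1
        return self.steps, 1.0, self.steps >= 5, {}


print(run_random_episode(StubEnv()))
```

With Gym installed, you could pass `gym.make('MsPacman-v0')` instead of `StubEnv()` to see what score a purely random agent reaches.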

Additional Resources:
[OpenAI Universe Documentation](https://github.com/openai/universe)
[OpenAI Gym Documentation](https://gym.openai.com/docs)
[OpenAI – Universe Installation Guide Ubuntu 16.04](https://alliseesolutions.wordpress.com/2016/12/08/openai-universe-installation-guide-ubuntu-16-04/)
[Tensorflow Guide](https://www.tensorflow.org/get_started/os_setup)

http://i.imgur.com/sKiCvWa.png
<center>http://i.imgur.com/Xs4cJOi.png</center>
## <center> @winstonwolfe's Crowdsourced Steemit Video </center>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/xg81u-24hE8" frameborder="0" allowfullscreen></iframe></center>
## <center>Are you new to Steemit and Looking for Answers? - Try https://www.steemithelp.net.</center>
<center>[![](http://i1280.photobucket.com/albums/a485/emailtooaj/JoinUs_BB_gif_zpsgyozsqu2.gif)
](https://steemit.com/beyondbitcoin/@sykochica/how-and-why-to-join-beyond-bitcoin-mumble)</center>
<center>http://i.imgur.com/tCAIqAB.png</center>
Image Sources:
[MultiGame Panel](https://universe.openai.com/)
[Screenshot is from me]
[Reinforcement Learning](https://github.com/nnrg/opennero/wiki/SystemOverview)
