Monday, April 4, 2016

Tutorial: Theano install on Windows 7, 8, 10

Hello Everyone,

This post is a step-by-step tutorial on installing Theano for Windows 7, 8, and 10, using CUDA and Anaconda.

Anaconda is a Python distribution and package manager that simplifies setting up Python environments and installing dependencies. If you really don't want to use Anaconda, check out my older post here.

Let's get to it:
  1. Make sure your computer has a compatible CUDA graphics card: https://developer.nvidia.com/cuda-gpus 
  2. Download CUDA 
    1. https://developer.nvidia.com/cuda-downloads (I downloaded Cuda 7.5)
  3. While that's downloading, head to https://www.visualstudio.com/en-us/downloads/download-visual-studio-vs.aspx and get Visual Studio 2013 (the community version).
    1. Download and install; this will install the needed C++ compilers
    2. A couple of notes here: my install needed 7GB of disk space and took ~20 minutes
  4. Install CUDA ~7 minutes
    1. Note: Nsight won't install for older versions of Visual Studio if you don't have them, no worries
    2. I restarted; this is Windows, after all...
  5. Check CUDA
    1. Navigate to C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.5\1_Utilities\deviceQuery (match the version folder to the CUDA version you installed) and open the vs2013.sln file
    2. Use CTRL+F5 to run the device check and keep the cmd window open
    3. Make sure the device check reports Pass; otherwise there is a problem with your CUDA install
  6. Download and setup Anaconda
    1. https://www.continuum.io/downloads. The Python 3.5 installer is fine
    2. Install it; it will take a while, ~5-10 minutes
  7. Download Theano
    1. https://github.com/Theano/Theano, Download Zip at the bottom right
    2. Extract
  8. Open CMD prompt
    1. Setup a new conda environment that uses python 3.4
      1. conda create -n name_of_your_environment python=3.4
    2. Activate your conda environment and install dependencies
      1. activate name_of_your_environment
      2. conda install numpy scipy mingw libpython
    3. Navigate to Theano extracted folder /Theano-master
    4. Use python setup.py install
      1. This automatically uses 2to3 conversion
  9. We need to add some system variables
    1. Right click Computer -> properties -> advanced system settings -> environment variables
    2. Add a new system variable
      1. Name = THEANO_FLAGS
      2. Value = floatX=float32,device=gpu,nvcc.fastmath=True
    3. Also add Visual Studio's c++ compiler to the path
      1. Add ;pathToYourVSInstallation\VC\bin\
  10. Final check
    1. Open another CMD prompt (you'll need to close the old one because it doesn't have the system variables)
    2. activate name_of_your_environment
    3. python
    4. import theano
    5. You should see something like
      1. Using gpu device 0: Quadro K1100M (CNMeM is disabled)
Now you'll be able to use Theano when you activate your conda environment.
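As an aside, the THEANO_FLAGS value set in step 9 is just a comma-separated list of key=value pairs (dotted names like nvcc.fastmath set nested options). A minimal sketch of that format, with a hypothetical parser written for illustration only (Theano has its own config machinery):

```python
# Hypothetical helper, for illustration only: split a THEANO_FLAGS-style
# string into a dict of option names to values.
def parse_flags(flags):
    result = {}
    for pair in flags.split(','):
        key, _, value = pair.partition('=')
        result[key.strip()] = value.strip()
    return result

flags = parse_flags('floatX=float32,device=gpu,nvcc.fastmath=True')
print(flags['device'])  # prints: gpu
```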

Note for PyCharm users: PyCharm does not automatically activate the conda environment for you (bug submitted here). What you can do is create a .bat file with these contents:
    call activate env_name
    path_to_pycharm\bin\pycharm64.exe

Let me know any questions or comments below.

Monday, September 21, 2015

Analyzing Deep Q-Learning

Deep Q-Learning is regular Q-Learning with a slight twist. The best explanation of Q-Learning is at StudyWolf. There are 4 articles there related to reinforcement learning, and I would suggest reading all of them; they are easy to understand and very hands-on. For those who don't want to read the articles, or are already familiar with reinforcement learning, the basic definition is:
Q-Learning is a lookup table: all states (S) the agent encounters are stored in this table. When the agent receives a reward (r), the table is updated: the value of the action (a) taken in that state is set to the reward. Then (as is common in reinforcement learning) it works backward and updates the previous state/action values with the discounted reward (q[s-1] = r*discount). Note: if you're like me and learn from code, I've written quick code at the bottom of this page to show a basic q-learning model.

DQN changes the basic algorithm a little. Instead of keeping a table of states, we treat the network as the table. At each timestep we ask the network to compute values for all actions, and the agent chooses the action with the highest value (according to an e-greedy policy; e-greedy selects a random action instead of the highest-valued one with probability E). When it comes time to train, the only signal the network is given is a positive or negative reward (it is not fed manually discounted values as in temporal difference learning). For all the states that don't have rewards, the network is asked to predict the reward values for the next state (S t+1). These predictions are then multiplied by the discount and used as training targets. Code for this is on GitHub in my python-dqn project under handlers.experienceHandler.train_exp(). So the signal actually given to the network is only the positive or negative reward, nothing else.
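The target construction described above can be sketched in a few lines. This is a simplified, hypothetical stand-in for what experienceHandler.train_exp() does; the function name and arguments here are made up for illustration:

```python
def dqn_targets(rewards, terminals, next_q_values, discount=0.95):
    # For each transition: terminal steps train toward the raw reward;
    # non-terminal steps train toward the discounted best predicted
    # value of the next state S(t+1).
    targets = []
    for reward, done, next_q in zip(rewards, terminals, next_q_values):
        targets.append(reward if done else discount * max(next_q))
    return targets

# one terminal transition with reward +1, two non-terminal ones
targets = dqn_targets([1, 0, 0], [True, False, False],
                      [[0.0, 0.0], [0.5, 0.2], [0.1, 0.8]], discount=0.9)
```

Note that the reward is the only externally supplied signal; everything else in the target comes from the network's own predictions, which is part of why training takes so long to stabilize.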

Given a lot of time to train, this algorithm will hopefully converge on a good solution: memorizing when it will be rewarded and inferring which actions will get it rewarded in the future.

The issue is the 'a lot of time to train' part. DeepMind ran 5,000,000 updates to get the results in their paper. With one update every 4 frames, that's 20,000,000 frames!
My implementation is slow, running at about 20FPS on a GTX 970 with an i7 (getting cuda_convnet working on Windows is still a problem I need to fix, which should give about a 20% speedup). That 20FPS is measured while the agent is losing most games quickly, so the average FPS would be lower. At that rate, it would take me about 23 days to get an agent that plays Breakout better than a human. David Silver said it took days for them to train, so my implementation may not be that bad.
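The back-of-the-envelope math behind those numbers (the 10FPS figure is my rough guess at the lower average rate):

```python
# Rough training-time estimate for replicating DeepMind's result.
updates = 5000000            # parameter updates in the paper
frames = updates * 4         # one update every 4 frames
seconds_per_day = 24 * 60 * 60

def days_to_train(fps):
    return frames / fps / seconds_per_day

print(frames)                        # prints: 20000000
print(round(days_to_train(20), 1))   # ~11.6 days at the peak 20FPS
print(round(days_to_train(10), 1))   # ~23.1 days at a 10FPS average
```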

The real question is what are we missing? Why does it take so long? In the next post I'll talk about some ways I believe will improve training and explore these issues in further detail.

import numpy as np

# world and update_q are assumed to be defined elsewhere; this just
# shows the shape of the basic q-learning loop
qtable = dict()  # maps each state to an array of action values

while world.is_not_goal_state():
    state = world.get_state()
    # the line below does not take into account an e-greedy policy
    action = np.argmax(qtable[state])
    world.take_action(action)
    rew = world.get_reward()
    if rew != 0:
        update_q(state, action, rew)
    
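If you want something you can actually run, here is a self-contained version on a toy chain world. The world, the random tie-breaking, and the learning-rate update are my own minimal stand-ins (a full Q-learning update with a learning rate, rather than the bare assignment sketched above):

```python
import random

# Toy chain world: states 0..4; action 0 moves left, action 1 moves right.
# Reaching state 4 ends the episode with reward 1; every other step gives 0.
GOAL, N_ACTIONS = 4, 2
DISCOUNT, LEARNING_RATE, EPSILON = 0.9, 0.5, 0.1

random.seed(0)
qtable = {s: [0.0, 0.0] for s in range(GOAL + 1)}

def best_action(state):
    # greedy action, breaking ties randomly so the untrained agent explores
    values = qtable[state]
    top = max(values)
    return random.choice([a for a, v in enumerate(values) if v == top])

for episode in range(500):
    state = 0
    while state != GOAL:
        # e-greedy: take a random action with probability EPSILON
        if random.random() < EPSILON:
            action = random.randrange(N_ACTIONS)
        else:
            action = best_action(state)
        next_state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == GOAL else 0.0
        # q-learning update: move toward reward + discounted best next value
        target = reward + DISCOUNT * max(qtable[next_state])
        qtable[state][action] += LEARNING_RATE * (target - qtable[state][action])
        state = next_state

print([best_action(s) for s in range(GOAL)])  # the learned greedy policy heads right
```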

Friday, July 31, 2015

Deep Q-Learning (DeepMind) Arcade Learning Environment

Last time I wrote about configuring Theano on Windows 7 & 8 64-bit. Now let's do something fun with it.

Recently, the Arcade Learning Environment (ALE) has become really popular in the machine learning field. Google's Deepmind group integrated reinforcement learning and convolutional neural networks into an AI that can learn to play Atari Games. Original article is here. There are some really cool YouTube videos of their approach learning to play games, I'd suggest checking them out.

This post will be the first in the series of replicating their approach. In this tutorial, we will get ALE running and get it set up with Python. Of course, it's not going to be as easy as using Linux, but we all use Windows for our own reasons.

Thankfully, a lot of the legwork for this integration has been done for us by various people and this tutorial should take much less time than the last.

Note: Almost all of the work on the Visual Studio project comes from Martin Brazdil (my code is forked from his library). The integration from C++ to Python (ale_python_interface) comes from Ben Goodrich at https://github.com/bbitmaster/ale_python_interface. My code just combines these and makes the modifications necessary for it to work on Windows (plus some bug fixes).
  1. Download the updated implementation of ALE 0.4.4 in Visual Studio 2013
    1. https://github.com/Islandman93/The-Arcade-Learning-Environment-0.4.4-Visual-Studio-2013-Windows-8.1-x64
    2. Extract, go into the src folder and load the ALE.sln file
      1. NOTE: You'll need visual studio 2013... should have it from the last tutorial.
    3.  Build with CTRL+SHIFT+B or under Build->Rebuild Solution
    4. This will create the needed .dlls under the src/x64/Debug folder
      1. We'll come back to this in Step 2.2
  2. Download the Python deep q-learning project. Note: this code is a work in progress; the files we will be looking at are fully functional, but others may be confusing or misleading.
    1. https://github.com/Islandman93/python-deep-q-learning
    2. Extract, then copy the .dll files from the Visual Studio folder (from src/x64/Debug) into the libs folder.
      1. You can put the .dlls anywhere you want; you'll just have to change the ale_python_interface.py file.
      2. Change ale_lib = cdll.LoadLibrary('ALE.dll') to the location of your .dlls
  3. Download the breakout.bin ROM file.
    1. Note: ROMs are a legal gray area, I don't condone breaking the law. Do some of your own research to determine what is legally acceptable for your uses.
    2. Once you choose your download location you'll need to load the rom using this code
      1. ale.loadROM(b'd:\\_code\\breakout.bin') # double up each \\ so the backslash isn't treated as the start of an escape sequence
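A quick aside on that double-backslash comment: in a Python string, \b is the backspace escape, so a single backslash before certain characters silently corrupts the path. A quick check (the paths here are just examples):

```python
# '\b' is the backspace escape, so this is NOT the path you want:
broken = b'd:\breakout.bin'
# double the backslash, or use a raw byte string:
fixed = b'd:\\breakout.bin'
raw = rb'd:\breakout.bin'

print(b'\b' == bytes([8]))  # prints: True (one backspace byte, not backslash + b)
print(fixed == raw)         # prints: True (both hold a literal backslash)
```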
This wraps up this tutorial. To run, start the breakout_dqn.py program and let it run (currently it's configured for 2000 games; it takes ~1 day to run on a GTX 970 and i7). I'm planning on doing a livecoding.tv walkthrough of the project soon.

Here are the current best results I've gotten after 4000 games. Note that losing a life is -1 reward, so add +5 (the total number of lives) to the score to get the number of bricks broken. The best is 17 bricks, which happens twice: once before 2000 games and once right after.
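As a quick check of that arithmetic (the raw score of 12 is implied by the 17-brick figure):

```python
LIVES = 5              # losing a life is -1 reward, so the raw score undercounts by 5
best_raw_score = 12    # implied by the best run
bricks_broken = best_raw_score + LIVES
print(bricks_broken)   # prints: 17
```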




Thursday, April 9, 2015

Tutorial: Python 3.4, Theano, and Windows 7&8-64bit

Hello Everyone,

Update: I wrote a new post that uses Anaconda instead of installing Mingw directly. This way you can keep the path variable clean and makes installing numpy/scipy much easier. http://www.islandman93.com/2016/04/tutorial-theano-install-on-windows-7-8.html

I'm in love with Python and I always use the latest version of everything, so I'm on 3.4. I use Windows for reasons... This presents a problem for machine learning: Theano is not easy to install on Windows.

I wrote this tutorial to help others get a working Python 3 version of Theano on a Windows 7 64-bit PC. It's not for the faint of heart and will probably take a decent amount of time to download and install everything, but in the end it's worth it.

Let's get to it:

  1. Make sure your computer has a compatible CUDA graphics card: https://developer.nvidia.com/cuda-gpus 
  2. Download CUDA 
    1. https://developer.nvidia.com/cuda-downloads (I downloaded Cuda 7.0.28)
  3. While that's downloading, head to https://www.visualstudio.com/en-us/downloads/download-visual-studio-vs.aspx and get Visual Studio 2013 (the community version).
    1. Download and install; this will install the needed C++ compilers
    2. A couple of notes here: my install needed 7GB of disk space and took ~20 minutes
    3. Restart
  4. Install CUDA ~7 minutes
    1. Note: Nsight won't install for older versions of Visual Studio if you don't have them, no worries
    2. I restarted; this is Windows, after all...
  5. Check CUDA
    1. Navigate to C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.0\1_Utilities\deviceQuery and open the vs2013.sln file
    2. Use CTRL+F5 to run the device check and keep the cmd window open
    3. Make sure the device check reports Pass; otherwise there is a problem with your CUDA install
  6. Download and install MinGW
    1. http://mingw-w64.yaxm.org/doku.php/download
      1. The Mingw-builds links are the easiest to use, link in next step.
    2. I got the Mingw-builds specifically
      1. NOTE if using the Mingw-builds install: Make sure you specify Architecture: x86_64, and Threads: win32
      2. Threads: posix may work but I didn't try it
      3. I also chose C:\mingw\ as my destination folder; otherwise it installs to Program Files, which has a space in the path (spaces can break build tools)
    3. You can use the MinGW of your choice; there are alternative steps to take under step 11.
  7. Download Theano
    1. https://github.com/Theano/Theano, Download Zip at the bottom right
    2. Extract
  8. Dependencies
    1. Download and install Numpy, Scipy, and LibPython
    2. The best place for these is 
      1. http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy
      2. http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy
      3. http://www.lfd.uci.edu/~gohlke/pythonlibs/#libpython
    3. Get the win_amd64.whl files
    4. Use pip install numpyOrScipy.whl to install
  9. Open CMD prompt
    1. Navigate to Theano extracted folder /Theano-master
    2. Use python setup.py install
      1. This automatically uses 2to3 conversion
  10. We need to add some system variables
    1. Right click Computer -> properties -> advanced system settings -> environment variables
    2. Add a new system variable
      1. Name = THEANO_FLAGS
      2. Value = floatX=float32,device=gpu,nvcc.fastmath=True
    3. Also add Visual Studio's c++ compiler to the path
      1. Add ;pathToYourVSInstallation\VC\bin\
  11. Almost there! Open the MinGW folder
    1. Run mingw-w64.bat; this will open a MinGW command prompt
      1. Alternative for a different MinGW: set MinGW first in the path variable.
    2. Type python and press enter
    3. Import Theano and you should be good to go!
      1. Alternative for a different MinGW: move MinGW back to last in the path variable
    4. Test theano with theano.test()
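The path shuffling in step 11 (MinGW first so its compiler is picked up, then back to last) boils down to reordering a semicolon-separated list. A sketch with made-up paths (you'd edit the Path variable in the same environment-variables dialog as step 10, not run this script):

```python
MINGW = r'C:\mingw\bin'

def move_to_front(path, entry):
    # put `entry` first in a semicolon-separated PATH-style string
    parts = [p for p in path.split(';') if p and p != entry]
    return ';'.join([entry] + parts)

def move_to_back(path, entry):
    parts = [p for p in path.split(';') if p and p != entry]
    return ';'.join(parts + [entry])

path = r'C:\Windows;C:\mingw\bin;C:\Python34'
print(move_to_front(path, MINGW))  # prints: C:\mingw\bin;C:\Windows;C:\Python34
print(move_to_back(path, MINGW))   # prints: C:\Windows;C:\Python34;C:\mingw\bin
```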
Now you'll be able to use Theano under any environment. Use PyCharm, it's the best.

Let me know any questions or comments below.