BigARTM Developer’s Guide¶
This document describes the development process of the BigARTM library.
You do not need to follow this guide if you are using the pre-built BigARTM library via the command-line interface or from a Python environment. In that case, refer to the Basic BigARTM tutorial for Windows users or the Basic BigARTM tutorial for Linux and Mac OS-X users, depending on your operating system.
Downloads (Windows)¶
Download and install the following tools:
- GitHub for Windows from https://windows.github.com/
- Visual Studio 2013 Express for Windows Desktop from https://www.visualstudio.com/en-us/products/visual-studio-express-vs.aspx
- Prebuilt Boost binaries from http://sourceforge.net/projects/boost/files/boost-binaries/, for example these two:
- (optional) If you plan to build documentation, download and install sphinx-doc as described here: http://sphinx-doc.org/latest/index.html
- (optional) 7-zip: http://www.7-zip.org/a/7z920-x64.msi
- (optional) Putty: http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe
All explicit links are given just for convenience if you are setting up a new environment. You are free to choose other versions or tools, and most likely they will work just fine for BigARTM. Remember to match the following:
- The Visual Studio version must match the version of the Boost binaries, unless you build Boost yourself.
- Use the same configuration (32 bit or 64 bit) for your Python and BigARTM binaries.
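The 32 bit versus 64 bit requirement is easy to get wrong, so a quick check of the interpreter can help. The sketch below is not part of BigARTM, and the function name `python_bitness` is made up for illustration. It reports the pointer size of the running Python interpreter, which should agree with the configuration chosen for the BigARTM binaries.

```python
import struct


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


print("This Python interpreter is %d-bit" % python_bitness())
```

If the reported value differs from the BigARTM configuration you plan to build, install the other Python build before going further.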
Source code¶
BigARTM is hosted in a public GitHub repository:
https://github.com/bigartm/bigartm
We maintain two branches: master and stable. The master branch holds the latest source code, potentially including some unfinished features. The stable branch lags behind master and is moved forward to master as soon as the maintainers decide that it is ready. Typically this happens at the end of each month. At the same point we introduce a new tag (something like v0.7.3) and produce a new release for Windows. In addition, the stable branch might also receive small urgent fixes between releases, typically to address critical issues reported by our users. Such fixes are also included in the master branch.
To contribute a fix, fork the repository, code your fix and submit a pull request against the master branch. All pull requests are regularly monitored by BigARTM maintainers and will be merged soon. Please keep monitoring the status of your pull request on Travis, the continuous integration system used by the BigARTM project.
Build C++ code on Windows¶
The following steps describe the procedure to build BigARTM’s C++ code on Windows.
Download and install GitHub for Windows.
Clone the https://github.com/bigartm/bigartm/ repository to any location on your computer. This location is further referred to as $(BIGARTM_ROOT).
Download and install Visual Studio 2012 or any newer version. BigARTM will compile just fine with any edition, including any Visual Studio Express edition (available at www.visualstudio.com).
Install CMake (tested with cmake-3.0.1, Win32 Installer).
Make sure that the CMake executable is added to the PATH environment variable. To achieve this, either select the option “Add CMake to the system PATH for all users” during installation of CMake, or add it to the PATH manually.
Download and install Boost 1.55 or any newer version.
We suggest using the Prebuilt Windows Binaries. Make sure to select the version that matches your version of Visual Studio. You may choose to work with either the x64 or the Win32 configuration; both are supported.
Configure the system variables BOOST_ROOT and BOOST_LIBRARYDIR.
If you installed Boost from the link above and used the default location, the settings should look similar to this:
    setx BOOST_ROOT C:\local\boost_1_56_0
    setx BOOST_LIBRARYDIR C:\local\boost_1_56_0\lib32-msvc-12.0
For all further details please refer to the documentation of the FindBoost module. We also encourage new CMake users to step through the CMake tutorial.
Install Python 2.7 (tested with Python 2.7.6).
You may choose to work with either the x64 or the Win32 version of Python, but make sure this matches the configuration of BigARTM you have chosen earlier. The x64 installation of Python will be incompatible with 32 bit BigARTM, and vice versa.
Use CMake to generate Visual Studio project and solution files. To do so, open a command prompt, change the working directory to $(BIGARTM_ROOT) and execute the following commands:
    mkdir build
    cd build
    cmake ..
You might have to explicitly specify the CMake generator, especially if you are working with the x64 configuration. To do so, use the following syntax:
    cmake .. -G"Visual Studio 12 Win64"
CMake will generate the Visual Studio solution under $(BIGARTM_ROOT)/build/.
Open the generated solution in Visual Studio and build it as you would usually build any other Visual Studio solution. You may also use MSBuild from the Visual Studio command prompt.
The build will output its results into the following folders:
- $(BIGARTM_ROOT)/build/bin/[Debug|Release]: binaries (.dll and .exe)
- $(BIGARTM_ROOT)/build/lib/[Debug|Release]: static libraries
At this point you should be able to run the BigARTM tests, located here: $(BIGARTM_ROOT)/build/bin/*/artm_tests.exe.
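Two of the steps above fail silently when skipped: setting the Boost variables and picking the right CMake generator. The following sketch is not part of the BigARTM repository, and the helper names `missing_boost_variables` and `cmake_configure_command` are hypothetical. It only checks that the two variables point at existing directories and assembles the `cmake` command line from the steps above; it does not run CMake.

```python
import os


def missing_boost_variables(environ=None):
    """Return names of the Boost variables that are unset or not directories."""
    environ = os.environ if environ is None else environ
    missing = []
    for name in ("BOOST_ROOT", "BOOST_LIBRARYDIR"):
        path = environ.get(name)
        if not path or not os.path.isdir(path):
            missing.append(name)
    return missing


def cmake_configure_command(use_x64):
    """Build the CMake command line used in the steps above."""
    command = ["cmake", ".."]
    if use_x64:
        # Generator name copied from the example above (x64 configuration).
        command += ["-G", "Visual Studio 12 Win64"]
    return command


print("Missing Boost variables: %s" % missing_boost_variables())
print("Configure command: %s" % cmake_configure_command(use_x64=True))
```

Note that variables set with `setx` become visible only in command prompts opened afterwards, so run the check from a new prompt.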
Python code on Windows¶
Install Python 2.7 (this step is already done if you are following the instructions above).
Add Python to the PATH environment variable: http://stackoverflow.com/questions/6318156/adding-python-path-on-windows-7
Follow the instructions in the README file in the directory $(BIGARTM_ROOT)/3rdparty/protobuf/python/. In brief, these instructions ask you to run the following commands:
    python setup.py build
    python setup.py test
    python setup.py install
On the second step you will see two failing tests:
    Ran 216 tests in 1.252s
    FAILED (failures=2)
These 2 failures are OK to ignore.
At this point you should be able to run the BigARTM tests for Python, located under $(BIGARTM_ROOT)/python/tests/.
- [Optional] Download Python Tools 2.0 and add it to Visual Studio.
All necessary instructions can be found at https://pytools.codeplex.com/.
This will allow you to debug your Python scripts using Visual Studio.
You may start with the following solution: $(BIGARTM_ROOT)/src/artm_vs2012.sln.
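After the `setup.py` steps above, it can be useful to confirm that the protobuf package is importable before running the Python tests. This is only a sketch: `protobuf_version` is a made-up helper name, and the fallback to "unknown" assumes that some protobuf releases may not define `__version__`.

```python
def protobuf_version():
    """Return the installed protobuf version string, or None if unavailable."""
    try:
        import google.protobuf
    except ImportError:
        return None
    # Fall back to "unknown" if this release does not define __version__.
    return getattr(google.protobuf, "__version__", "unknown")


version = protobuf_version()
if version is None:
    print("protobuf is not importable; re-run the setup.py steps")
else:
    print("protobuf %s is installed" % version)
```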
Build C++ code on Linux¶
Refer to Basic BigARTM tutorial for Linux and Mac OS-X users.
Working with iPython notebooks remotely¶
It turned out to be a common scenario to run BigARTM on a Linux server (for example on Amazon EC2) while connecting to it from Windows through putty.
Here is a convenient way to use ipython notebook in this scenario:
Connect to the Linux machine via putty. Putty needs to be configured with a dynamic tunnel for port 8888, as described on this page (port 8888 is the default port for ipython notebook). The same page describes how to configure internet properties:
Clicking on Settings in Internet Explorer, or Proxy Settings in Google Chrome, should open this dialogue. Navigate through to the Advanced Proxy section and add localhost:9090 as a SOCKS Proxy.
Start ipython notebook in your putty terminal.
Open your favourite browser on Windows, and go to http://localhost:8888. Enjoy your notebook while the engine runs remotely :)
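If the browser cannot reach the notebook, a first thing to check is whether anything is listening on the local port at all. The sketch below is an illustration, not part of BigARTM; the helper name `port_is_open` is hypothetical, and the default of port 8888 on localhost is taken from the steps above.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        connection = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable or timed out: nothing usable is listening.
        return False
    connection.close()
    return True


print("Local port 8888 reachable: %s" % port_is_open())
```

A result of False on the Windows machine suggests that the putty tunnel is not active; it says nothing about whether the notebook itself is running on the server.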
Compiling .proto files on Windows¶
Open a new command prompt.
Copy the following file into $(BIGARTM_ROOT)/src/:
    $(BIGARTM_ROOT)/build/bin/CONFIG/protoc.exe
Here CONFIG can be either Debug or Release (both options will work equally well).
Change the working directory to $(BIGARTM_ROOT)/src/.
Run the following commands:
    .\protoc.exe --cpp_out=. --python_out=. .\artm\messages.proto
    .\protoc.exe --cpp_out=. .\artm\core\internals.proto
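When the .proto files change often, the two commands above can be wrapped in a small script. This is a sketch under assumptions rather than a tool shipped with BigARTM: the names `protoc_commands` and `compile_protos` are hypothetical, and `src_dir` stands for $(BIGARTM_ROOT)/src/ with protoc.exe already copied into it. The `runner` parameter exists only so the logic can be exercised without protoc installed.

```python
import os
import subprocess


def protoc_commands(src_dir):
    """Return the two protoc invocations from the steps above."""
    # Absolute path, so that finding protoc.exe does not depend on the cwd.
    protoc = os.path.join(os.path.abspath(src_dir), "protoc.exe")
    messages = os.path.join("artm", "messages.proto")
    internals = os.path.join("artm", "core", "internals.proto")
    return [
        [protoc, "--cpp_out=.", "--python_out=.", messages],
        [protoc, "--cpp_out=.", internals],
    ]


def compile_protos(src_dir, runner=subprocess.check_call):
    """Run both protoc commands with src_dir as the working directory."""
    for command in protoc_commands(src_dir):
        # check_call raises CalledProcessError if protoc reports an error.
        runner(command, cwd=src_dir)


# Example (Windows only; the path below is a placeholder):
# compile_protos(r"C:\path\to\bigartm\src")
```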
Code style¶
In the code we follow the Google code style with the following changes:
- Exceptions are allowed
- Indentation must be 2 spaces. Tabs are not allowed.
- No lines should exceed 120 characters.
All .h and .cpp files under $(BIGARTM_ROOT)/src/artm/ must be verified for code style with the cpplint.py script. Files generated by the protobuf compiler are the only exceptions to this rule.
To run the script you need some version of Python installed on your machine. Then execute the script like this:
python cpplint.py --linelength=120 <filename>
On Windows you may run this master-script to check all required files: $(BIGARTM_ROOT)/utils/cpplint_all.bat.
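On other platforms, the same selection rule can be sketched in Python. The code below is not a port of cpplint_all.bat, whose contents are not shown here; it simply applies the rule stated above (all .h and .cpp files, skipping protobuf output) and prints one cpplint command per file. The helper names and the relative path `src/artm` are assumptions for illustration.

```python
import os


def files_to_lint(root):
    """Collect .h and .cpp files under root, skipping protobuf output."""
    selected = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            # protoc writes its C++ output as *.pb.h and *.pb.cc files.
            if name.endswith(".pb.h"):
                continue
            if name.endswith((".h", ".cpp")):
                selected.append(os.path.join(directory, name))
    return sorted(selected)


def cpplint_command(filename):
    """Build the cpplint invocation shown above for one file."""
    return ["python", "cpplint.py", "--linelength=120", filename]


# Assumes the script is started from $(BIGARTM_ROOT).
for path in files_to_lint(os.path.join("src", "artm")):
    print(" ".join(cpplint_command(path)))
```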