Contributing to freud

Code Conventions

Python

Python (and Cython) code in freud should follow PEP 8.

During continuous integration (CI), all Python and Cython code in freud is tested with flake8 to ensure PEP 8 compliance. Additionally, all CMake code is tested using cmakelang’s cmake-format. It is strongly recommended to set up a pre-commit hook to ensure code is compliant before pushing to the repository:

pip install -r requirements-precommit.txt
pre-commit install

To manually run pre-commit for all the files present in the repository, run the following command:

pre-commit run --all-files --show-diff-on-failure

Documentation is written in reStructuredText and generated using Sphinx. It should be written according to the Google Python Style Guide. A few specific notes:

  • The shapes of NumPy arrays should be documented as part of the type in the following manner:

    points ((:math:`N_{points}`, 3) :class:`numpy.ndarray`):
    
  • Optional arguments should be documented as such within the type after the actual type, and the default value should be included within the description:

    box (:class:`freud.box.Box`, optional): Simulation box (Default value = None).
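
As an illustration of the two notes above, here is a minimal sketch of a complete docstring. The function count_points_in_box and its box_length argument are hypothetical and are not part of freud; only the way the array shape and the optional argument are documented is the point.

```python
import numpy as np


def count_points_in_box(points, box_length=None):
    r"""Count the points inside a cube centered at the origin.

    Hypothetical helper used only to illustrate the docstring style.

    Args:
        points ((:math:`N_{points}`, 3) :class:`numpy.ndarray`):
            Positions to test.
        box_length (float, optional):
            Edge length of the cube. If :code:`None`, every point is
            counted (Default value = None).

    Returns:
        int: Number of points inside the cube.
    """
    # Coerce the input to an (N, 3) array so the shape matches the docstring.
    points = np.asarray(points, dtype=np.float32).reshape(-1, 3)
    if box_length is None:
        return len(points)
    # A point is inside when every coordinate is within half an edge length.
    inside = np.all(np.abs(points) <= box_length / 2, axis=1)
    return int(np.count_nonzero(inside))
```

The raw-string prefix (r""") keeps backslashes in any math markup from being interpreted as Python escape sequences.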
    

C++

C++ code should follow the result of running clang-format-6.0 with the style specified in the file .clang-format. Please refer to Clang Format 6 for details.

When in doubt, run clang-format -style=file FILE_WITH_YOUR_CODE in the top directory of the freud repository. If installing clang-format is not a viable option, the check-style step of continuous integration (CI) reports whether the style of your code is correct.

Doxygen docstrings should be used for classes, functions, etc.

Code Organization

The code in freud is a mix of Python, Cython, and C++. From a user’s perspective, methods in freud correspond to Compute classes, which are contained in Python modules that group methods by topic. To keep modules well-organized, freud implements the following structure:

  • All C++ code is stored in the cpp folder at the root of the repository, with subdirectories corresponding to each module (e.g. cpp/locality).

  • Python code is stored in the freud folder at the root of the repository.

  • C++ code is exposed to Python using Cython code contained in pxd files with the following convention: freud/_MODULENAME.pxd (note the preceding underscore).

  • The core Cython code for modules is contained in freud/MODULENAME.pyx (no underscore).

  • Generated Cython C++ code (e.g. freud/MODULENAME.cxx) should not be committed during development. These files are generated using Cython when building from source, and are unnecessary when installing compiled binaries.

  • If a Cython module contains code that must be imported into other Cython modules (such as the freud.box.Box class), the pyx file must be accompanied by a pxd file with the same name: freud/MODULENAME.pxd (distinguished from pxd files used to expose C++ code by the lack of a preceding underscore). For more information on how pxd files work, see the Cython documentation.

  • All tests in freud are based on the Python standard unittest library and are contained in the tests folder. Test files are named by the convention tests/test_MODULENAME_CLASSNAME.py.

  • Benchmarks for freud are contained in the benchmarks directory and are named analogously to tests: benchmarks/benchmark_MODULENAME_CLASSNAME.py.
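
The naming conventions above can be summarized in a short sketch. The helper expected_paths and its arguments are hypothetical (freud does not ship such a function); it only assembles the file names that the list describes for a given module and class.

```python
from pathlib import PurePosixPath


def expected_paths(module, classname, has_cpp=True, shared_cython=False):
    """Return the repository-relative paths implied by the conventions.

    Hypothetical helper for illustration; not part of freud.
    """
    paths = {
        # Core Cython code for the module (no underscore).
        "cython": PurePosixPath("freud") / f"{module}.pyx",
        # Tests and benchmarks are named per module and class.
        "test": PurePosixPath("tests") / f"test_{module}_{classname}.py",
        "benchmark": PurePosixPath("benchmarks")
        / f"benchmark_{module}_{classname}.py",
    }
    if has_cpp:
        # C++ sources, plus the underscore-prefixed pxd that exposes them.
        paths["cpp_dir"] = PurePosixPath("cpp") / module
        paths["cpp_declarations"] = PurePosixPath("freud") / f"_{module}.pxd"
    if shared_cython:
        # pxd with the same name as the pyx, for use by other Cython modules.
        paths["cython_declarations"] = PurePosixPath("freud") / f"{module}.pxd"
    return paths
```

For example, expected_paths("density", "RDF") maps "test" to tests/test_density_RDF.py and "cpp_declarations" to freud/_density.pxd.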

Benchmarks

Benchmarking in freud is performed by running the benchmarks/benchmarker.py script. This script finds all benchmarks (using the above naming convention) and attempts to run them. Each benchmark is defined by extending the Benchmark class defined in benchmarks/benchmark.py, which provides the standard benchmarking utilities used in freud. Subclasses only need to define a few methods that parameterize the benchmark, construct the freud object being benchmarked, and call the relevant compute method. Rather than describing this process in detail, we consider the benchmark for the freud.density.RDF class as an example.

import numpy as np
import freud
from benchmark import Benchmark
from benchmarker import run_benchmarks


class BenchmarkDensityRDF(Benchmark):
    def __init__(self, r_max, bins, r_min):
        # Parameters of the run, supplied as keyword arguments in run().
        self.r_max = r_max
        self.bins = bins
        self.r_min = r_min

    def bench_setup(self, N):
        # Build a cubic box and N random points centered on the origin.
        self.box_size = self.r_max*3.1
        np.random.seed(0)
        self.points = np.random.random_sample((N, 3)).astype(np.float32) \
            * self.box_size - self.box_size/2
        self.rdf = freud.density.RDF(self.bins, self.r_max, r_min=self.r_min)
        self.box = freud.box.Box.cube(self.box_size)

    def bench_run(self, N):
        # Timed section: one call with reset=False, then one default call.
        self.rdf.compute((self.box, self.points), reset=False)
        self.rdf.compute((self.box, self.points))


def run():
    Ns = [1000, 10000]
    r_max = 10.0
    bins = 10
    r_min = 0
    number = 100
    name = 'freud.density.RDF'
    classobj = BenchmarkDensityRDF

    return run_benchmarks(name, Ns, number, classobj,
                          r_max=r_max, bins=bins, r_min=r_min)


if __name__ == '__main__':
    run()

The __init__ method defines the basic parameters of the run, the bench_setup method builds the RDF object and its inputs, and the bench_run method calls compute and is the part that is timed. More examples can be found in the benchmarks directory. BenchmarkDensityRDF.bench_run is timed over number repetitions for each input size in Ns. Its runtime as a function of the number of threads is also measured. Benchmarks are run as part of continuous integration, with performance compared between the current commit and the master branch.
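
To make the roles of Ns and number concrete, the following is a minimal sketch of a timing loop built on the standard timeit module. It is not the implementation in benchmarks/benchmarker.py: ToyBenchmark and time_benchmark are hypothetical stand-ins, and calling bench_setup once per input size is an assumption made for this illustration.

```python
import timeit

import numpy as np


class ToyBenchmark:
    """Hypothetical stand-in with the same bench_setup/bench_run split."""

    def bench_setup(self, N):
        # Untimed: build the inputs for a system of N points.
        rng = np.random.default_rng(0)
        self.points = rng.random((N, 3), dtype=np.float32)

    def bench_run(self, N):
        # Timed: the operation under test (here, a simple reduction).
        self.result = self.points.sum(axis=0)


def time_benchmark(bench, Ns, number):
    """Return {N: average seconds per bench_run call} for each N in Ns."""
    timings = {}
    for N in Ns:
        # Assumption: setup runs once per input size, outside the timer.
        bench.bench_setup(N)
        total = timeit.timeit(lambda: bench.bench_run(N), number=number)
        timings[N] = total / number
    return timings


timings = time_benchmark(ToyBenchmark(), Ns=[1000, 10000], number=100)
```

Keeping setup outside the timed call means the reported numbers reflect only the cost of the computation itself, not the cost of generating inputs.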

Steps for Adding New Code

Once you’ve decided to add new code to freud, the first step is to create a new branch off of master. The process of adding code differs based on whether or not you are editing an existing module in freud. Adding new methods to an existing module requires creating the new C++ files in the cpp directory, modifying the corresponding _MODULENAME.pxd file in the freud directory, and creating a wrapper class in freud/MODULENAME.pyx. If the new methods belong in a new module, you must create the corresponding cpp directory and the pxd and pyx files accordingly.

For the code to compile, it must be added to the relevant CMakeLists.txt file. New C++ files for existing modules must be added to the corresponding cpp/MODULENAME/CMakeLists.txt file. For new modules, a cpp/NEWMODULENAME/CMakeLists.txt file must be created, and the new module must be registered in cpp/CMakeLists.txt in two places: an add_subdirectory command, and an additional source in the add_library command for the libfreud library. Similarly, new Cython modules must be added to the appropriate list in freud/CMakeLists.txt, depending on whether or not the module has associated C++ code. Finally, import the new module in freud/__init__.py by adding from . import MODULENAME so that it is usable as freud.MODULENAME.

Once the code is added, appropriate tests should be added to the tests folder. Test files are named by the convention tests/test_MODULENAME_CLASSNAME.py. The final step is updating documentation, which is contained in rst files named with the convention doc/source/modules/MODULENAME.rst. If you have added a class to an existing module, all you have to do is add that same class to the autosummary section of the corresponding rst file. If you have created a new module, you will have to create the corresponding rst file with the summary section listing classes and functions in the module followed by a more detailed description of all classes. All classes and functions should be documented inline in the code, which allows automatic generation of the detailed section using the automodule directive (see any of the module rst files for an example). Finally, the new file needs to be added to doc/source/index.rst in the API section.
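
As a starting point for the testing step, here is a minimal sketch of a unittest-based test file, which under the naming convention would live at tests/test_box_Box.py. The test case and its assertions are illustrative only and are not copied from the freud test suite. The guarded import is an assumption added so the skeleton loads (and skips) where freud has not been built.

```python
import unittest

try:
    import freud
except ImportError:
    # Assumption for this sketch: skip rather than fail without freud.
    freud = None


@unittest.skipIf(freud is None, "freud is not installed")
class TestBox(unittest.TestCase):
    def test_cube_has_equal_edges(self):
        # Box.cube should produce a box with the same length on every axis.
        box = freud.box.Box.cube(10.0)
        self.assertAlmostEqual(box.Lx, 10.0)
        self.assertAlmostEqual(box.Ly, 10.0)
        self.assertAlmostEqual(box.Lz, 10.0)
```

Each new class would typically get its own file of this form, with one test method per behavior being checked.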