Server memory optimization

Please note that this document is a work in progress. The functionality and REST API endpoints described here might change or be removed in future releases.

A typical use case of MadFast is to serve similarity searches for as many molecules as possible. Some of the JVM settings (like -Xmx...) allow adjusting memory and garbage collector parameters, which are crucial for memory-intensive use cases.

Relevant JVM parameters

Please see Oracle's relevant documentation:

Garbage Collector Ergonomics
Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide
[http://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html]
Garbage First Garbage Collector Tuning

We will discuss hints on using the following parameters:

-Xmx Maximum heap size
-XX:NewRatio Ratio of old generation size to young generation size
-XX:+UseG1GC Use Garbage-First Garbage Collector

Optimization strategy

As a rule of thumb for large datasets, around 350 MB of heap memory per million structures is a recommended starting point when a single 1024-bit fingerprint is used. This memory requirement can be lowered with the various GC settings detailed below.
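The rule of thumb above can be turned into a quick estimate. The following is a minimal sketch, not part of MadFast: the function names are hypothetical, and it assumes that the quoted "MB" can be passed directly to the JVM's `m` suffix (which the JVM interprets as 2^20 bytes).

```python
import math

# Figure quoted in the text: heap per million structures with a single 1024-bit fingerprint.
MB_PER_MILLION_STRUCTURES = 350


def estimate_heap_mb(structure_count, mb_per_million=MB_PER_MILLION_STRUCTURES):
    """Starting-point heap estimate in MB for the given number of structures."""
    return structure_count / 1_000_000 * mb_per_million


def to_xmx_flag(heap_mb):
    """Round up to whole megabytes and format the value as a -Xmx option."""
    return f"-Xmx{math.ceil(heap_mb)}m"


# Example: a hypothetical dataset of 10 million structures.
print(to_xmx_flag(estimate_heap_mb(10_000_000)))
```

The result is only a starting point; the measurements described later in this document show how the actual requirement can be assessed and reduced.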

Preprocessing

Usually a single molecular descriptor or molecule storage is kept in memory during a preprocessing step, so a generous heap size can be allowed. To track memory usage during preprocessing, use the profiling data collection described in Profiling and execution statistics. The goal is to provide a heap large enough to fit all storages into the old generation while making sure that the young generation memory pool can be emptied during GC. With the default ParallelGC in Java 7 / 8 this can be ensured by a combination of the -Xmx and -XX:NewRatio options. It is also possible to use the G1 garbage collector for the preprocessing steps.
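To illustrate how -Xmx and -XX:NewRatio interact, here is a small idealized sketch. It relies on the HotSpot definition that NewRatio=N sets the old generation to N times the size of the young generation, so the young generation gets 1/(N+1) of the heap. The function names are hypothetical, and a running JVM may align or adapt the generation sizes, so treat the numbers as approximations.

```python
def generation_sizes(heap_gib, new_ratio):
    """Return (young, old) generation sizes in GiB for a given heap and NewRatio."""
    young = heap_gib / (new_ratio + 1)
    old = heap_gib - young
    return young, old


def min_heap_for_old_gen(required_old_gib, new_ratio):
    """Smallest heap (GiB) whose old generation can hold the given amount of data."""
    return required_old_gib * (new_ratio + 1) / new_ratio


# Example with hypothetical numbers: storages needing 8 GiB in the old generation.
for ratio in (2, 4, 8):
    heap = min_heap_for_old_gen(8, ratio)
    young, old = generation_sizes(heap, ratio)
    print(f"NewRatio={ratio}: heap {heap:.1f} GiB, young {young:.1f} GiB, old {old:.1f} GiB")
```

A larger NewRatio leaves more of the heap for long-lived storages at the cost of a smaller young generation, which is the trade-off the paragraph above describes.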

Server setup

Starting with version 0.3.2, the embedded server in gui.sh can perform more precise measurements of allocated memory. Information on the state of the garbage collectors and memory pools is also available over the REST API and the Web UI. One can launch the server with a generous amount of heap memory (possibly on a subset of the exposed resources) and use this functionality to assess the room for optimization.

Choosing garbage collector

The section Available Collectors of the Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide gives an overview of the garbage collector selection criteria.

Example results

Results from a measurement with 844 million molecules and a single 1024 bit fingerprint on Amazon EC2 are discussed.

Instance details:

Type: Memory optimized, r4.16xlarge (See https://aws.amazon.com/ec2/pricing/on-demand/)
vCPU: 64 cores
ECU: 195 units
Memory: 488 GB
Storage: 500 GB EBS
Price: $4.742 per hour

Time measurements

Step	Time	Per Molecule	Speed
Master molecule storage creation (844 M molecules)	5685 s, 1h 34 min	6.7 us	8.9 M/min
CFP-7 fingerprint calculation (844 M mol)	7367 s, 2h 03 min	8.7 us	6.8 M/min
Read Master Molecule Storage (40 GB file)	289 s, 4 min 49 s	342 ns	141 MB/s
Read Master ID Storage (16 GB file)	113 s, 1 min 53 s	133 ns	141 MB/s
Read CFP7 fingerprint (109 GB file)	930 s, 15 min 30 s	1.1 us	117 MB/s
Real time search from WebUI	4 s	4.7 ns	211 M/s
Dissimilarity distribution	1 min	71.1 ns	14 M/s
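The "Per Molecule" and "Speed" columns follow directly from the elapsed time and the molecule count. The sketch below reproduces that arithmetic for a few rows; the helper names are hypothetical, and the molecule count is the one reported in the size measurements of this document.

```python
MOLECULE_COUNT = 844_879_022  # molecule count reported in the size measurements


def per_molecule_ns(seconds, count=MOLECULE_COUNT):
    """Average time spent on one molecule, in nanoseconds."""
    return seconds / count * 1e9


def millions_per_minute(seconds, count=MOLECULE_COUNT):
    """Throughput in millions of molecules per minute."""
    return count / 1e6 / (seconds / 60)


def millions_per_second(seconds, count=MOLECULE_COUNT):
    """Throughput in millions of molecules per second."""
    return count / 1e6 / seconds


print(f"Storage creation: {per_molecule_ns(5685) / 1000:.1f} us, {millions_per_minute(5685):.1f} M/min")
print(f"Fingerprint calculation: {per_molecule_ns(7367) / 1000:.1f} us, {millions_per_minute(7367):.1f} M/min")
print(f"Search from Web UI: {per_molecule_ns(4):.1f} ns, {millions_per_second(4):.0f} M/s")
```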

Size measurements

Size measurements were done by loading the storages into the embedded server (gui.sh) and using the experimental size measurement functions of the Web UI.

Item	Size
Input file size	844879022 molecules, 55195139329 characters (including separator), approximately 51 GiB raw
MMS + MID memory usage (sizeinfo)	55.3 GiB, 39.5 GiB for molecules, 15.7 GiB for IDs
CFP-7 memory usage (sizeinfo)	120.3 GiB

Based on the size measurements, the molecule storage (molecules and IDs) has approximately 10% direct memory overhead over the uncompressed original SMILES size. The fingerprint storage has approximately 20% overhead over the raw size of the stored fingerprint bits (844.8e6 fingerprints * 1024 bits = 844.8e6 * 128 bytes ~ 100.7 GiB is the raw storage needed for the fingerprint bits).
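The overhead estimates can be checked with a few lines of arithmetic. This is a sketch using only the figures from the size measurements table; the helper names are hypothetical. With unrounded inputs it gives roughly 8% for the molecule storage and roughly 19% for the fingerprints, which the text rounds to approximately 10% and 20%.

```python
GIB = 2 ** 30  # bytes per GiB

MOLECULE_COUNT = 844_879_022       # from the size measurements table
INPUT_CHARACTERS = 55_195_139_329  # raw SMILES input size, including separators


def raw_fingerprint_gib(count, bits_per_fingerprint=1024):
    """Raw size of the fingerprint bits in GiB (8 bits per byte)."""
    return count * bits_per_fingerprint / 8 / GIB


def overhead_percent(measured_gib, raw_gib):
    """Memory overhead of the measured size relative to the raw size, in percent."""
    return (measured_gib / raw_gib - 1) * 100


raw_smiles_gib = INPUT_CHARACTERS / GIB
raw_fp_gib = raw_fingerprint_gib(MOLECULE_COUNT)

print(f"Raw SMILES: {raw_smiles_gib:.1f} GiB, overhead {overhead_percent(55.3, raw_smiles_gib):.1f}%")
print(f"Raw fingerprints: {raw_fp_gib:.1f} GiB, overhead {overhead_percent(120.3, raw_fp_gib):.1f}%")
```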

The memory required for this setup is 175.6 GiB (55.3 GiB for the molecule storage plus 120.3 GiB for the fingerprints). Allowing around 10% overhead for the running JVM gives 193 GiB, which was the final heap size used with the G1 garbage collector (-XX:+UseG1GC -Xmx193g). This is approximately 234 MiB (about 245 MB) of memory per million molecules after optimization.
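The sizing steps above can be summarized as a short calculation. The sketch below is illustrative only: the helper name is hypothetical, and the 10% headroom is the value used in this example rather than a general recommendation. Note that the per-million figure depends on the unit: 193 GiB spread over 844.9 million molecules is about 234 MiB, or about 245 MB in decimal units.

```python
import math

MOLECULE_COUNT = 844_879_022  # from the size measurements table


def heap_with_headroom_gib(storage_sizes_gib, headroom=0.10):
    """Sum the measured storage sizes and add headroom for the running JVM."""
    return sum(storage_sizes_gib) * (1 + headroom)


# Measured sizes from the table: MMS + MID and CFP-7, in GiB.
heap_gib = math.floor(heap_with_headroom_gib([55.3, 120.3]))
jvm_flags = f"-XX:+UseG1GC -Xmx{heap_gib}g"

per_million_mib = heap_gib * 1024 / (MOLECULE_COUNT / 1e6)
per_million_mb = heap_gib * 2 ** 30 / 1e6 / (MOLECULE_COUNT / 1e6)

print(jvm_flags)
print(f"{per_million_mib:.1f} MiB ({per_million_mb:.1f} MB) per million molecules")
```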

Expected changes

Creating and retrieving garbage collector statistics and calculating resource memory usage are currently unprivileged operations: all REST API clients are allowed to issue such requests. With the introduction of authorization capabilities in the near future this will be configurable.

Currently garbage collection cannot be invoked from the REST API. A privileged endpoint for this will possibly be provided.