[chemfp] FPSFormat in OpenBabel

Andrew Dalke dalke at dalkescientific.com
Fri Oct 21 06:36:26 EDT 2011


Hi Chris,

On Oct 20, 2011, at 10:00 PM, Chris Morley wrote:

> In Openbabel's development code there is now an output format which writes OpenBabel fingerprints to an fps file.


Excellent news! World domination proceeds apace. :)


> I've taken the form in http://code.google.com/p/chem-fingerprints/wiki/ob2fps as the standard. The fingerprint strings seem to be required to have an even number of characters, otherwise FP4 with 307 bits would have 77 hex characters rather than the 78 it has on this page. This will not affect the fingerprint but might matter in how the reading and writing of fps files are implemented.


That is correct. Chemfp fingerprints are essentially 8-bit byte strings, and the FPS format uses the hex encoding of the corresponding byte string. A hex character encodes only 4 bits of data, so the hex encoding is always twice as long as the fingerprint's byte length, and therefore always even. For FP4, 307 bits round up to 39 bytes, which gives 78 hex characters.

This makes it much easier to convert between the byte and hex versions. For example, a Python conversion might just be:

byte_fp = hex_fp.decode("hex")
hex_fp = byte_fp.encode("hex")
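
The "hex" string codec used above exists only in Python 2. As a rough sketch of the same round trip in Python 3, assuming the built-in bytes.fromhex() and bytes.hex() methods, something like the following should work. The helper names are made up for illustration and are not part of chemfp or OpenBabel:

```python
def hex_length_for_bits(num_bits):
    """Number of hex characters for a fingerprint with num_bits bits."""
    # Round up to whole bytes first; each byte then becomes two hex characters.
    num_bytes = (num_bits + 7) // 8
    return 2 * num_bytes

def hex_to_bytes(hex_fp):
    """Decode a hex fingerprint string into its byte string."""
    # bytes.fromhex() raises ValueError if the string has an odd number of hex digits.
    return bytes.fromhex(hex_fp)

def bytes_to_hex(byte_fp):
    """Encode a fingerprint byte string as lowercase hex."""
    return byte_fp.hex()

# FP4 has 307 bits: 39 bytes, so 78 hex characters.
fp4_hex_length = hex_length_for_bits(307)

# Round trip on a tiny two-byte example fingerprint (illustrative data only).
example_bytes = hex_to_bytes("0f80")
example_hex = bytes_to_hex(example_bytes)
```

Padding to a whole number of bytes is what keeps the decode step this simple: a string with an odd number of hex digits cannot be decoded this way at all.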

Cheers,

Andrew
dalke at dalkescientific.com



