Certain types of data analysis, such as filtering by specific parameters, diversity selection, virtual screening, or docking, require you to have the full data set locally. Downloading the full data set and preparing it for use takes some effort on the user’s side. In response to requests from MolPort users, we created two example workflows that automate this process using KNIME. Now you can get MolPort data into KNIME more easily.
The first workflow, named “LoadAllMolPortMolecules”, does not require data access credentials. With this workflow you get the chemical structures of all available compounds, their MolPort IDs, and website links for further information. You can easily share this example with colleagues to give them a fast and easy way to access MolPort data. Downloading the complete data set (100 MB) and loading it into KNIME can take less than 5 minutes and requires 2 GB of space on your hard drive. All you need to do is set up a folder for local file storage.
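If you want to confirm that the storage folder is ready before running the workflow, a minimal Python sketch such as the one below may help. It is not part of the MolPort workflows; the function name `prepare_storage_folder` is a hypothetical helper, and the only figure taken from the text above is the 2 GB space requirement.

```python
import shutil
from pathlib import Path

# 2 GB, the disk space requirement quoted in the text above.
REQUIRED_BYTES = 2 * 1024**3


def prepare_storage_folder(folder, required_bytes=REQUIRED_BYTES):
    """Create the folder if it is missing and report whether the drive
    that holds it has at least `required_bytes` of free space.

    Hypothetical helper for illustration only.
    """
    path = Path(folder).expanduser()
    # Create the folder (and any missing parents); do nothing if it exists.
    path.mkdir(parents=True, exist_ok=True)
    # disk_usage reports total, used, and free bytes for the containing drive.
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes
```

Calling `prepare_storage_folder("molport_data")`, where `molport_data` is a placeholder folder name, returns `True` when at least 2 GB is free on that drive.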
The second workflow, named “LoadMolPortExtendedData”, gives you access to the extended version of the data. In this case there are multiple downloadable files. For each compound, the additional data includes the largest available stock amount, the fastest shipping time, price ranges for 1 mg/5 mg/50 mg, whether it is available as a Screening Compound or a Building Block, the QC type, the InChI, and the InChIKey. Access to this data is also free, but it requires credentials. To get yours, simply send a message from your corporate email address to firstname.lastname@example.org. Then set up the second workflow example to get the full version of the data.
Some companies restrict FTP access. You may need to work with your system administrator to add the MolPort FTP location to the allowed list. Alternatively, contact us to get access to the files through HTTPS.
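To find out whether your network blocks FTP before contacting your administrator, you could run a quick reachability check like the sketch below. It is an illustration only: `ftp_port_reachable` is a hypothetical helper, and the host name you pass in must be the FTP location given in the workflow description, which is not reproduced here. The check only tests whether a TCP connection to the standard FTP control port (21) can be opened; a firewall may still block the data connections that FTP opens afterwards.

```python
import socket


def ftp_port_reachable(host, port=21, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; port 21 is the standard
    FTP control port.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

If the function returns `False` for the MolPort FTP host while other sites are reachable, a network restriction is a likely cause, and the HTTPS option mentioned above may be the simpler route.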
Both workflows contain descriptions and are compatible with KNIME version 3.3 or later.
Files for download: