Data in tables can be extracted from a PDF, but you will need a PDF converter tool that converts the data from PDFs into Excel (Word may occasionally be helpful too). Many PDF converter tools extract data and text from PDFs. The NIF currently uses “Jade PDF Data Capture”, but this product is a plug-in for Adobe Acrobat 8 and has been discontinued.
Able2Extract PDF Converter 6 (http://www.investintech.com/?gclid=CMqog5nIvqMCFRJNagod-H6-fA) is an option available online that has been tested by NIF and offers a 7-day free trial. You may find additional options by typing “PDF converter” into your search engine.
Able2Extract allows users to view and convert PDF, HTML and text files.
Instructions for use of Able2Extract
To open a PDF, HTML or text file in Able2Extract:
- Select the Open command from the File Menu
- Select the file you wish to open by typing or clicking on its name
- Able2Extract will open the selected file.
Use your mouse to select the areas of the page that you wish to convert.
First point the mouse to where you would like to begin your selection, press the mouse button to start the selection, and then drag the mouse over the area that you wish to convert.
Once the area is selected (it should appear shaded in black), you are ready to convert.
You can select large, wide areas of the PDF document, or you can select specific columns or tables, which often improves the accuracy of the conversion output.
Selecting Data to be converted
- Press the left mouse button at the position where you want the selection to begin.
- Drag the mouse pointer (while still holding the left button down) over the portion of the text you want to select.
- Once you have made the desired selection, release the left mouse button.
Converting the selected data to destination format:
- Click on the Excel icon, denoted with an “E”, along the top menu bar.
After extraction, the data will need to be checked for accuracy, as symbols, Greek letters, the number of cells each gene occupies, etc., often do not get transferred correctly.
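Part of this check can be automated as a first pass before reviewing the data by hand. The Python sketch below is only an illustration and is not part of Able2Extract or the NIF workflow: it assumes the converted sheet has been saved from Excel as CSV text, the function name find_suspect_cells is hypothetical, and the list of suspect characters is an assumption that you should adjust to your own data. It flags rows with an unexpected number of cells and cells containing characters that often stand in for a lost symbol or Greek letter.

```python
import csv
import io

# Characters that often appear where a symbol or Greek letter was lost.
# This list is an assumption; extend it to suit your own documents.
SUSPECT_CHARS = {"\ufffd", "?", "\u25a1"}


def find_suspect_cells(csv_text, expected_columns):
    """Return (row, column, reason) tuples for cells worth checking by hand.

    Row and column numbers start at 1. The column is None when the
    problem concerns the whole row (wrong number of cells).
    """
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # A row with too few or too many cells suggests merged or split cells.
        if len(row) != expected_columns:
            reason = f"expected {expected_columns} cells, found {len(row)}"
            problems.append((row_number, None, reason))
        for column_number, cell in enumerate(row, start=1):
            bad = sorted(ch for ch in set(cell) if ch in SUSPECT_CHARS)
            if bad:
                reason = "suspect character(s): " + " ".join(bad)
                problems.append((row_number, column_number, reason))
    return problems


# Hypothetical sample: a Greek letter became "?" and one row lost a cell.
sample = "gene,ligand,fold change\nTGFB1,TGF-?,2.4\nIL6,IL-6\n"
for problem in find_suspect_cells(sample, expected_columns=3):
    print(problem)
```

A clean result from a script like this does not prove the conversion is correct; it only narrows down where to look first, so the extracted table should still be compared against the original PDF.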