Convert XML to PDF (Adobe Forms)
Hello, I am working on interactive Adobe Forms, and the form FM returns the PDF in XML format (xstring) through the parameter FPFORMOUTPUT-XML. We now have a requirement to send this XML (xstring) data to another system and create a PDF there from the same data.
Why A-PDF Form Data Extractor
No copy and paste forms again
You do not need to copy form data out of hundreds of submitted PDF files by hand again. With it, you can batch process PDF forms in one pass.
Visual form fields extraction rule editor
A-PDF Form Data Extractor provides a visual rule editor that lets you define the output fields, default values, order, etc. See below for a quick impression.
You can also import and export the rules for use elsewhere.
Output to CSV or XML files
Create one single CSV/XML file from all PDF files. That means, you can easily use your rules anywhere.
Save Money
A-PDF Form Data Extractor is a standalone program costing only $39. It does NOT require Adobe Acrobat Pro, which costs hundreds of dollars.
A-PDF Form Data Extractor Command Line
A-PDF Form Data Extractor Command Line (PFDECMD.exe) can be used as a Windows console utility that silently extracts PDF form data to a CSV or XML file.
USAGE
See also
What's Form Data About?
Traditionally, filled paper forms are transferred and stored as a kind of all-inclusive data unit. Breaking the data out of a paper form means manually copying it to another location. So, it is very common to think of a form and the data entered into the form as a single entity, i.e., that the form is the data. But of course, this is not true. The form is a user interface for collecting the data, and in many cases for presenting the data. In the end, it is the data in which we are interested. The form is just a way to get the data. In practice, this idea of form/data separation is much more evident in electronic forms than it is in paper forms. This is mainly because in electronic forms it is extremely easy to move data into and out of a form.
But now we have to ask the question, 'how exactly is data moved in and out of forms?', and 'what does the data look like when it is not in the form?'. These are the two critical issues when it comes to creating real world form/data solutions: the transfer mechanism and the data format, which are the topics covered here.
For PDF forms in Acrobat/Reader there are many different ways to move the data in a variety of data formats. It is important to have this wide range of options in order to satisfy a wide range of workflows. Any particular solution is heavily dependent on exactly how both the form and the data are used.
Form Data Primer
Form data, simply put, is a linear set of Name/Value pairs. This is true for form data in all its different incarnations, from the form itself, to the transfer mechanisms, to all the data formats.
Names
Names are the primary way that data is identified, so they need to be unique within their context. On a PDF form, the data name is the name of the form field. When data is extracted to a data format, it is the field name that is used as the name in the format. For example, if PDF form data is extracted to an Excel spreadsheet, then field names become the column names on the spreadsheet. Names pervade the entire data workflow process. It is important to start out with names that make sense so the data can be easily identified and handled later in the process.
There are a couple basic best practices to follow when creating names.
Values
Within the context of a PDF form, data is typically a value representing a number, text, a date, or true/false. All these types of values are relatively small (in data size) and can be easily converted into a text string. Text representation is important because most data formats and data handling mechanisms are intended to handle text. However, it is possible with a PDF to submit both image and raw file data, which is neither text nor small. These are special cases that require specific types of data formats and data handling, and will be covered separately. Most of the discussion and other articles on this topic will only be about handling typical form data values, for the standard form fields.
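The Name/Value idea can be sketched in plain JavaScript. This is a minimal illustration, not Acrobat's own API: the field names, the sample values, and the helper names `valueToText` and `toNameValuePairs` are all hypothetical, and the date layout is an assumed convention.

```javascript
// Hypothetical field values, as a script might collect them from a form.
const fields = [
  { name: "Customer.Name", value: "Ada Lovelace" },
  { name: "Order.Quantity", value: 3 },
  { name: "Order.Date", value: new Date(Date.UTC(2022, 11, 5)) },
  { name: "Order.Rush", value: true },
];

// Convert one typed value to the text form most data formats expect.
function valueToText(value) {
  if (value instanceof Date) {
    // YYYY-MM-DD in UTC; an assumed convention, not an Acrobat rule.
    return value.toISOString().slice(0, 10);
  }
  return String(value);
}

// Build a linear list of [name, text] pairs and enforce unique names.
function toNameValuePairs(fieldList) {
  const seen = new Set();
  return fieldList.map(({ name, value }) => {
    if (seen.has(name)) {
      throw new Error("Duplicate field name: " + name);
    }
    seen.add(name);
    return [name, valueToText(value)];
  });
}

const pairs = toNameValuePairs(fields);
```

Rejecting duplicate names early mirrors the point made above: a name only identifies data if it is unique within its context.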
PDF form fields are enshrined in the ISO 32000 specification, so all compliant PDF viewers implement them in exactly the same way. But data handling is largely dependent on the PDF viewer and/or tool set used to manipulate the PDF form. PDF viewer/tool vendors are not required to implement JavaScript or data handling in the same ways as Acrobat. However, there are practices, formats, and standards that are common for all data handling, and all the good PDF viewers/tools also follow Adobe's lead in one way or another. So, while the techniques discussed here are specific to Adobe Acrobat/Reader, many may also work/integrate with other applications. Compatibility is not guaranteed or implied.
Moving Data in Acrobat
In most cases the functionality for importing and exporting is complementary, i.e., data moves in both directions with the same mechanisms. Where Acrobat provides a function/mechanism for exporting data to a particular format, it also provides a function/mechanism for importing from that format. But not all of these functions/mechanisms are equal. For example, anything that touches the user's local file system requires privilege. Also, some are not available to Acrobat Reader without special Reader Extensions, or not at all. When developing a data workflow solution it is important to understand these limitations. Below is a table that lists each mechanism and the associated restrictions. The first two are manual operations the user performs from the Acrobat user interface, which is why 'Privilege' is set to 'Not Applicable': the user always has privilege for manual operations. Each mechanism is discussed in more detail in the associated articles.
Methods for Moving Data
Exporting Data (moving data out of a PDF)
In the standard form usage model the user fills out the form and then submits the form/data back to the form owner. The first 3 export mechanisms listed in the table above (skipping the very first entry, which is import only) handle this standard model. Each one saves formatted data to a file on the user's file system, emails a data file, or submits a data file to a server. All of these mechanisms use industry standard data formats and treat the form data as a single, flat block (except in the case of the XML formats, which are hierarchical by nature) of data.
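The difference between a flat block of data and a hierarchical one can be sketched as follows. The example assumes field names that use the period-separated group convention (for example `Customer.Name`); the function name `nestFields` and the sample data are hypothetical, and the result is a plain object rather than any particular XML grammar.

```javascript
// Turn a flat set of name/value pairs into a nested structure,
// using the period in each field name as the group separator.
function nestFields(flat) {
  const root = {};
  for (const [name, value] of Object.entries(flat)) {
    const parts = name.split(".");
    let node = root;
    // Walk (and create) one level per group name.
    for (const part of parts.slice(0, -1)) {
      if (typeof node[part] !== "object" || node[part] === null) {
        node[part] = {};
      }
      node = node[part];
    }
    node[parts[parts.length - 1]] = value;
  }
  return root;
}

// Hypothetical flat form data.
const flatData = {
  "Customer.Name": "Ada",
  "Customer.City": "London",
  "Total": "42.00",
};

const nested = nestFields(flatData);
console.log(JSON.stringify(nested));
```

A flat format would carry the three names unchanged, while a hierarchical format can group `Name` and `City` under a single `Customer` element.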
The bottom three options in the table are not straightforward, but provide greater data handling flexibility.
Importing Data (moving data into a PDF)
Most of the Import functionality in Acrobat parallels the export functions, but there are some interesting and useful variations. For example, the first option in the table is 'Drag and Drop'. Both the FDF and XFDF data formats are PDF specific formats, so Acrobat immediately recognizes them as form data. This means that they can be dragged and dropped directly onto a file open in Acrobat. Both also contain links to the original PDF form. In most cases, simply opening one of these data files will open the original form and populate it.
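To make the XFDF description concrete, here is a minimal sketch that builds an XFDF string by hand. It assumes flat (non-hierarchical) field names; the file name `OrderForm.pdf`, the field values, and the helper names `escapeXml` and `buildXfdf` are hypothetical. The `f` element carries the link back to the original form mentioned above.

```javascript
// Escape the characters that are special in XML text and attributes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a simple XFDF document from [name, value] pairs.
// Assumption: flat field names only; grouped fields would need nested <field> elements.
function buildXfdf(pdfHref, pairs) {
  const fieldLines = pairs.map(
    ([name, value]) =>
      `    <field name="${escapeXml(name)}"><value>${escapeXml(value)}</value></field>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">',
    `  <f href="${escapeXml(pdfHref)}"/>`,
    "  <fields>",
    ...fieldLines,
    "  </fields>",
    "</xfdf>",
  ].join("\n");
}

const xfdf = buildXfdf("OrderForm.pdf", [
  ["Name", "Smith & Sons"],
  ["Quantity", "3"],
]);
console.log(xfdf);
```

Saving this text with an `.xfdf` extension next to the referenced PDF is one way to try the behavior described above, although the exact result depends on the viewer.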
Another example is the 5th option in the table 'JavaScript Read File'. The Acrobat scripting model provides a couple functions that read raw file data. A script can literally open any file on the user's system (or in a file attachment) and parse data out of it. For JavaScript, the ability to parse data is usually limited to plain text files.
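Once a script has the file contents as a string, parsing plain text is ordinary JavaScript. The sketch below parses tab-delimited text into one record per row. In Acrobat the string would typically come from the file-reading functions (for example `util.readFileIntoStream` followed by `util.stringFromStream`, in a privileged context); here a literal stands in for the file, and `parseTabDelimited` and the sample data are hypothetical.

```javascript
// Parse tab-delimited text: first line holds the field names,
// each following line is one data set.
function parseTabDelimited(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) {
    return [];
  }
  const names = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const values = line.split("\t");
    const record = {};
    names.forEach((name, i) => {
      // Missing trailing cells become empty strings.
      record[name] = values[i] !== undefined ? values[i] : "";
    });
    return record;
  });
}

// Stand-in for the contents of a file read from disk or an attachment.
const sample = "Name\tCity\tQuantity\nAda\tLondon\t3\nGrace\tNew York\t5\n";
const records = parseTabDelimited(sample);
```

This simple splitter assumes that values contain no tabs or line breaks, which is the usual limitation of the tab-delimited format.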
There are three main uses for importing data into a form.
Presenting Data
It's often the case that data needs to be presented in a different way than how it was collected. To do this with PDF, the form data is exported using any of the standard methods and then imported into a different form that uses the same form field names. The form field names provide the data mapping, from one form into another. There are many variations on this idea, such as using data from several different forms, and custom scripts that perform special data handling when fields have complex configurations and/or don't have the same names as the original data. One popular variation on this is 'Variable Data'.
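The name-based mapping described above, including the case where some names differ between forms, can be sketched like this. The function `mapFields`, the rename table, and the field names are hypothetical; a real script would then write each mapped value into the matching field of the target form.

```javascript
// Map exported data onto the fields of a different form.
// Fields match by name; renameMap covers names that differ between the forms.
function mapFields(sourceData, targetFieldNames, renameMap = {}) {
  const result = {};
  const skipped = [];
  for (const [sourceName, value] of Object.entries(sourceData)) {
    const targetName = Object.prototype.hasOwnProperty.call(renameMap, sourceName)
      ? renameMap[sourceName]
      : sourceName;
    if (targetFieldNames.includes(targetName)) {
      result[targetName] = value;
    } else {
      // No field with this name on the target form.
      skipped.push(sourceName);
    }
  }
  return { result, skipped };
}

// Hypothetical data exported from the collection form.
const exported = { Name: "Ada", Qty: "3", Notes: "n/a" };
// Hypothetical field names on the presentation form.
const presentationFields = ["CustomerName", "Qty"];

const mapped = mapFields(exported, presentationFields, { Name: "CustomerName" });
```

Returning the skipped names makes it easy to spot data that has no home on the presentation form.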
Variable Data (Mail Merge)
Variable data means consecutively loading different data sets into the same form, where a data set could be a row in a spreadsheet or database. Each data set import is saved to a different name, printed, or emailed. This type of operation is used to create form letters, invoices, receipts, and many other types of documents. This technique is also commonly called a 'Mail Merge' and there are many 3rd party tools for doing it inside and outside of Acrobat. It can also be done in Acrobat using a custom automation script. Any of the programmed techniques from the table above could be used to create a Variable Data solution.
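The looping structure of a variable data operation can be sketched without any Acrobat calls. The example below only plans the merge: for each data set it works out the field values and an output file name. The function `mergeDataSets`, the `Invoice_` prefix, and the sample rows are hypothetical; in a real automation script each planned entry would be followed by filling the form and saving, printing, or emailing it.

```javascript
// Plan a mail merge: one output document per data set.
function mergeDataSets(templateFieldNames, dataSets, nameField) {
  return dataSets.map((dataSet, index) => {
    const filled = {};
    for (const name of templateFieldNames) {
      // Fields with no data in this row are cleared rather than left stale.
      filled[name] = dataSet[name] !== undefined ? String(dataSet[name]) : "";
    }
    // Build a file-system friendly label from the chosen field, or the row number.
    const label = String(dataSet[nameField] || index + 1).replace(/[^A-Za-z0-9_-]+/g, "_");
    return { fileName: `Invoice_${label}.pdf`, fields: filled };
  });
}

// Hypothetical rows, as they might come from a spreadsheet.
const rows = [
  { Customer: "Ada Lovelace", Amount: 120 },
  { Amount: 80 },
];

const plan = mergeDataSets(["Customer", "Amount", "DueDate"], rows, "Customer");
```

Clearing fields that a row does not supply matters in a loop, because otherwise a value from the previous data set would carry over into the next document.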
Pre-Filling Form Fields
There are many reasons for pre-filling a subset of fields on a form. As an example, consider an order form, such as the one shown below. This screenshot shows the form open in Acrobat Professional, with the Attachments panel on the left and the Add-ons panel on the right. There are 3 sets of fields on this form that use some type of automatic 'pre-fill'. Each one is done in a different way, representing the 3 general locations from which data can be imported/acquired.
Data Storage Locations
There are a large variety of locations where data can be exported to and imported from. The general categories of these locations are outlined in the 'Pre-Filling Form Fields' section above, essentially external to Acrobat/PDF and internal to Acrobat/PDF. The range and flexibility of how these locations can be used depends on the particular mechanism used.
As noted in the 'Exporting Data' section above, the last two entries on the table, IAC and plug-in, are both completely custom solutions, so they have the greatest flexibility/capabilities of all the data transfer mechanisms. But, they also cost the most to develop, and solutions using these mechanisms are generally restricted to Acrobat Professional/Standard. Many solutions from 3rd party vendors will use one of these mechanisms.
The manual methodologies at the top of the table are restricted to accessing data files on the user's local file system. However, the local file system could include networked drives as well as remote (virtual) folders that are mapped to the local file system.
The JavaScript model has functionality for accessing the complete range of data storage options (as discussed in the pre-filling example), but any one function/method has limitations, and this is where the discussion is focused.
Data Formats
Not all data formats are equal. Each one has different features that will determine its suitability for a particular solution. The table below provides a brief description of the formats most commonly used with Acrobat and PDF forms.
The second column 'Native Format' is marked as 'Yes' when there is a JavaScript function for importing and exporting data in this format.
The third column 'Data-Sets' indicates the number of individual data sets that a file in the format can hold.

This is a common text based format similar to CSV. Each row in the file is a different data set. Recognized by most applications that handle data. This is the only format where the JavaScript function allows data to be imported from any data set in the file.

| Format | Native Format | Data-Sets | Description |
|---|---|---|---|
| CSV | No | Many | Common and very old text based format, where each row is a different data set. Recognized by most applications that handle data. Acrobat does not provide specific JavaScript functions for handling this format, so it requires custom scripting. However, one of the data export menu items writes to this format for merging data from several forms. CSV is automatically recognized by Excel as a single page in a spreadsheet, so it is an excellent format for importing form data into Excel. |
| XML | Yes | 1 | XML is a general purpose, text based data format. Unlike the other data formats discussed above, it is capable of representing complex hierarchical data structures. This format is capable of holding more than one data set, but when specifically selecting 'XML' as the format, Acrobat exports a single data set with an '.xml' file extension and uses a simple hierarchical grammar based on field group names. Other formats listed here are XML based, but use a more complex grammar and are saved with a different file extension. |
| XFD | Yes | 1 | XML forms format that can also contain data. Created for what became Adobe LiveCycle Forms, now AEM forms. It is a proprietary Adobe XML based form sold into the enterprise market. Looks like PDF on the outside, but isn't PDF on the inside. Acrobat will import/export this format with regular PDF forms, but it's only really useful for AEM forms on the AEM server tools. |
| XDP | Yes | 1 | Another XML data/form format for AEM forms. Adobe created this one primarily as a data format that can contain the original XML form (not PDF form). Acrobat will also import/export this format with regular PDFs, but like XFD it is not very useful in this context. |
| JSON | No | Many | JavaScript Object Notation. Quite literally a text string of the JavaScript code for creating a JavaScript Object. Not very efficient with size, but very easy to transmit, store, create and evaluate in JavaScript. Popularized in web browser scripting, it's now used everywhere. In Acrobat the official JSON toolset was added in the DC version. To use this in previous versions use the object.toSource() and eval() functions. The toSource() function does not create strict JSON format, but it works if the JSON is only parsed with the eval() function. |
| Excel | No | Many | The Excel file format is proprietary to Microsoft, although the specification is public and anyone can create Excel files. Acrobat does not currently provide any way to interact directly with Excel or Excel files. But there are three indirect ways to get data into Excel: 1) Export to CSV, TXT, or XML and import this file in the Excel app. 2) Write a custom Excel Add-in with VBA that uses the Acrobat IAC to acquire form data. 3) Write an Acrobat plug-in that either writes Excel format directly, or interacts with the Excel app. |
| HTML | Yes | 1 | This is the data format in an HTTP Post when an HTML form submits to a web server. It's a very simple name/value pair text format. Acrobat provides the ability to use this format on a form submit so that a PDF form can be submitted to the same server script that would be used for a web form. Unfortunately, the return data needs to be something Acrobat understands. This is where the scheme usually fails because Acrobat does not understand HTML, except to convert it to a PDF. |
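Because CSV needs custom scripting, a short sketch of the serialization step may help. It follows the common quoting convention (cells containing a comma, a double quote, or a line break are wrapped in double quotes, and embedded quotes are doubled). The function names `csvCell` and `toCsv` and the sample data are hypothetical; writing the resulting string to a file is a separate, privileged step that is not shown.

```javascript
// Quote a single cell only when it contains a comma, quote, or line break.
function csvCell(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Serialize many data sets: field names become the header row,
// each data set becomes one row.
function toCsv(fieldNames, dataSets) {
  const rows = [fieldNames.map(csvCell).join(",")];
  for (const dataSet of dataSets) {
    rows.push(
      fieldNames
        .map((name) => csvCell(dataSet[name] !== undefined ? dataSet[name] : ""))
        .join(",")
    );
  }
  return rows.join("\r\n") + "\r\n";
}

// Hypothetical data collected from two filled forms.
const csv = toCsv(
  ["Name", "Comment"],
  [{ Name: "Ada", Comment: 'Said "hi", twice' }, { Name: "Grace" }]
);
```

Using the field names as the header row keeps the column names in the spreadsheet identical to the field names on the form.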
Download Tools for Data Import/Export
*Import/Export Excel Data as Text* Acrobat can import and export data to/from a Tab Delimited Text file, which is one of the formats recognized by Microsoft Excel. This package demonstrates the import/export process.
*Fill Form from external XML file* This Automation tool uses data from an external XML file to populate fields on a LiveCycle or AcroForm PDF.
*Using the Global Object Sample* A set of scripts that demonstrate using the Global Object, which is used to share and persist data.
*HTTP Request Tester* Demonstrates the use of the NET.HTTP object, which was added to Acrobat JavaScript in Acrobat 9. Use this object to connect your automation scripts to the Web. Provides raw data access to all major HTTP operations, Get, Post, etc.
*Load Drop Down From CSV* This tool sets up a drop list (ComboBox) field with entries from a CSV file. In addition, it also sets up a data structure in the form for populating other fields from the list selection.
*File Folder Access Tester* This script provides a dialog for discovering the correct path to any folder. It also tests the ability to access a file in that folder from Acrobat JavaScript.
*Insert/Edit PDF Metadata* Shortcut tool for quickly entering and modifying PDF Document Level Metadata. Uses a toolbar button to activate a custom metadata dialog.
*Auto-Populate From Drop-Down* Form Pre-Filling Example: Demonstrates various techniques, from simple to advanced, for populating form fields from selections on a Drop-Down list (ComboBox).
Articles and Scripts for Data Import/Export
Acrobat Forms Training
Forms
This video tutorial series covers everything you need to know about how to create forms and form workflow processes with Acrobat. It starts out with a set of no holds barred videos that introduce conc...
Auto-Populating Form Fields from a Drop-Down List (ComboBox)
AcroForm, Field, Event, Data, List
Scripts and techniques for setting the values of form fields automatically from a selection on a drop-down list.
Auto-Filling a Drop List with a Drop List
form, list
This article presents techniques and scripts for automatically setting the list entries in a drop-down (ComboBox) field from a selection in another drop-down field. Includes sample files.
Database, Data Handling, XFA, Automation
LiveCycle PDF forms have a built-in ability to connect to a database through OLE DB drivers. While it can be extremely useful to have a direct connection between a form and an associated database, th...
Acquiring Raw File Data
External data, i.e., data outside of Acrobat or a PDF file, is often a very important part of a workflow process. For example, information on customers, products, employees, etc. is typically stored in Excel files, databases or on a server. One of the most common issues with automating such a workflow process is getting the data from the external file or data source into the automation script. This article provides techniques and script examples for acquiring external data.
Importing and Exporting Excel Data
Automation, Excel, Database, AcroForm
This article explains exactly how to transfer data, in both directions, between an Excel file and Acrobat. Scripts are provided for importing and exporting in a variety of scenarios, including a looping scenario for performing variable data operations and mail merge.
Free Sample PDF Files with scripts
These free sample PDF files contain scripts for common, complex, and interesting scripting tasks in Acrobat. Many more are available in the Members Only Download Library. Feel free to browse through...
December 2022