Drupal 7: Recipe to Save Webform2PDF PDFs to a Server Directory

David G - Drupal

Problem:

A website blew up on me (I think due to human error). The site was built quickly, in about 2 weeks, and uses the Webform module to collect academic references for incoming students of research programs. Even though the build was rushed, I use the Backup and Migrate module to back up the website at least daily. But the site got hosed, and there were 150+ webform submissions for which I did not know whether the client had successfully used Webform2PDF’s default PDF mechanism to archive them. I wanted to generate every single webform submission as a PDF, store it on the server, and email all these PDFs to the client as insurance while we triaged the site problems and possibly reset the webapp to an initial known state. Read on to see how I saved all webform submissions as PDFs, using contributed modules to automate the process.

Initial Brainstorming and Design of Solution:

As I mentioned, the site uses Webform and Webform2PDF to create PDFs of submissions. Out of the box, these PDFs are built dynamically when a URL is visited and are streamed to the client; by design, the Webform2PDF module does not store generated PDF files on the server.

Being a regular contributor to Drupal StackExchange, I’ve actually seen this question asked before: “Where can I download all the submission PDFs of a Webform? … sorry, you can’t.”

As the reply to that post hints at, perhaps there is an automated solution …

 Automating PDF Generation to Server Disk

This is the solution I created in about 2 hours for a client (including debugging and running it all, etc.). I consider it a proof-of-concept and by no means a fully polished product. In a nutshell:

  1. I want to put a URL link on a staff administrative dashboard to generate the PDFs.
  2. When clicked, this link should trigger a rule that fetches a listing of the default PDF download URLs that the webform2pdf module provides for all webform submissions.
  3. The generated list of links should in turn be processed by the (Drupal) system, which saves the PDFs to disk.

So here is what the client sees for these steps in the final product. They see an admin panel with a link to generate PDFs:

Staff are presented a Link to export webform submissions as PDFs to a server folder.

Upon clicking the link the PDFs are saved to the server disk. Awesome! This beats clicking a Download link approximately 165 times!

Final pdf files sitting on server folder. They are given a default name from the webform2pdf module of WebformID and then Submission ID dot pdf.

The Recipe

  • I created a Dashboard using the Panels module. This was a component released with the site originally to be used by staff. I added a custom content item to the Panel Content with a link for PDF generation at /generate-webform-submission-pdfs.
  • This custom URL is connected to a Rule using the Rules Link Event module. The Rules Link Event module allows arbitrary internal site URLs to be tied to events that Rules can respond to with actions. Internally, Rules Link Event accomplishes this by providing a hook_menu implementation for each actionable URL. My custom URL in Rules Link Event has the following configuration at /admin/config/workflow/rules/links: save_all_pdfs|staff-dashboard/save-all-pdfs|Save All Submitted Webforms to Disk (see config screen):

    Rules Link Event module configuration.

  • The Rule triggered when the URL is clicked does 2 things:
    • Loads the result of a Views listing of webform submissions (PDF URLs).
    • Processes that result set to save the files to disk.
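To make the hook_menu idea above more concrete, here is a minimal sketch of how a module could register an actionable path and fire a Rules event from it. This is not the Rules Link Event source. The module name mymodule, the callback mymodule_fire_link_event, the event name prefix and the chosen permission are all assumptions made for illustration, and a real module would also need to declare the event through hook_rules_event_info.

```php
<?php
// Drupal defines MENU_CALLBACK itself; this guard only lets the sketch
// run outside of Drupal as plain PHP.
if (!defined('MENU_CALLBACK')) {
  define('MENU_CALLBACK', 0x0000);
}

/**
 * Implements hook_menu() for a hypothetical module named "mymodule".
 */
function mymodule_menu() {
  $items = array();
  $items['staff-dashboard/save-all-pdfs'] = array(
    'title' => 'Save All Submitted Webforms to Disk',
    'page callback' => 'mymodule_fire_link_event',
    'page arguments' => array('save_all_pdfs'),
    // Assumption: reuse a Webform permission so only staff reach the path.
    'access arguments' => array('access all webform results'),
    'type' => MENU_CALLBACK,
  );
  return $items;
}

/**
 * Page callback: fires a (hypothetical) Rules event named after the link.
 */
function mymodule_fire_link_event($link_name) {
  $event = 'mymodule_link_' . $link_name;
  // rules_invoke_event() is provided by the Rules module; skip it when the
  // sketch runs without Drupal.
  if (function_exists('rules_invoke_event')) {
    rules_invoke_event($event);
  }
  return 'Fired event: ' . $event;
}
```

The point of the sketch is the access check: because the path is an ordinary menu item, it can be restricted with 'access arguments' rather than relying on where the link is displayed.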

So this rule has 3 major pieces of configuration:

The view to output the url listing:

View to generate URL listing. A view Page generates the View, a "Rule" view exposes the output to be usable by Rules.

The Rule Configuration to do the processing logic which has 2 parts, some Rules configuration and some custom code:

Rule Action configuration (gui version).

Since the above image may not be perfectly clear, here is a Rules Export of my rule configuration:

{ "rules_rmp_respond_to_saveall_webforms_link_click" : {
    "LABEL" : "Respond to SaveAll Webforms Link Click",
    "PLUGIN" : "reaction rule",
    "OWNER" : "rules",
    "TAGS" : [ "rmp" ],
    "REQUIRES" : [ "php", "rules", "rules_linkevent" ],
    "ON" : { "rules_linkevent_save_all_pdfs" : [] },
    "IF" : [
      { "user_has_role" : {
          "account" : [ "site:current-user" ],
          "roles" : { "value" : { "2" : "2" } }
        }
      }
    ],
    "DO" : [
      { "php_eval" : { "code" : "$filedir = \u0027private:\/\/archived-pdfs\u0027;\r\n$prepared = file_prepare_directory($filedir, FILE_MODIFY_PERMISSIONS | FILE_CREATE_DIRECTORY);" } },
      { "variable_add" : {
          "USING" : { "type" : "integer", "value" : "0" },
          "PROVIDE" : { "variable_added" : { "url_counter" : "url counter" } }
        }
      },
      { "VIEW LOOP" : {
          "VIEW" : "rmp_webform_pdf_submission_links",
          "DISPLAY" : "views_rules_1",
          "ROW VARIABLES" : { "sid" : { "sid" : "Webform submissions: Sid" } },
          "DO" : [
            { "php_eval" : { "code" : "global $user;\r\nglobal $base_url;\r\n$filedir = \u0027private:\/\/archived-pdfs\u0027;\r\n\r\n$url = $base_url . $sid;\r\n\/\/ drupal session cookie. eg, we\u0027re the logged in validated user calling this.\r\n\/\/ www.drupal.org\/node\/80675#comment-7079954\r\n\/\/ and Corey was helpful with php.net\/session_name as SESSION wasn\u0027t happy.\r\n$options = array();\r\n$options[\u0027headers\u0027][\u0027Cookie\u0027] = session_name() . \u0027=\u0027 . $user-\u003Esid;\r\n$options[\u0027timeout\u0027] = 60;\r\ndrupal_set_message(\u0027fetched url: \u0027 . $url);\r\n\r\n$http_return = drupal_http_request($url, $options);\r\n\r\nif ($http_return-\u003Ecode == \u0027200\u0027) {\r\n  $filename = preg_match(\u0027\/\u0022([^\u0022]*)\u0022\/\u0027, $http_return-\u003Eheaders[\u0027content-disposition\u0027], $fnames);\r\n  file_unmanaged_save_data($http_return-\u003Edata, $filedir . \u0027\/\u0027 . $fnames[1] , FILE_EXISTS_REPLACE);\r\n}" } }
          ]
        }
      },
      { "redirect" : { "url" : "staff-dashboard" } }
    ]
  }
}

I will say the embedded PHP logic in this rule requires some explanation. It’s only a few lines of code — but it’s a little tricky!


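// Current user (for the session ID) and the site's base URL.
global $user;
global $base_url;
$filedir = 'private://archived-pdfs';

// $sid is the row variable from the view; the view is expected to output
// the PDF download path in it.
$url = $base_url . $sid;
// Send the current user's session cookie so the request is authenticated.
$options = array();
$options['headers']['Cookie'] = session_name() . '=' . $user->sid;
$options['timeout'] = 60;
drupal_set_message('fetched url: ' . $url);

$http_return = drupal_http_request($url, $options);

if ($http_return->code == '200') {
  // The quoted file name in the Content-Disposition header lands in $fnames[1].
  $filename = preg_match('/"([^"]*)"/', $http_return->headers['content-disposition'], $fnames);
  // Save the PDF bytes, replacing any existing file with the same name.
  file_unmanaged_save_data($http_return->data, $filedir . '/' . $fnames[1] , FILE_EXISTS_REPLACE);
}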

A brief breakdown of the logical steps in this code:

  • We take the logged-in user’s session name and session ID and put them in a Cookie header.
  • We request the PDF URL using Drupal 7’s core API function drupal_http_request. It performs an HTTP request and returns the response. In this case we’re actually requesting a page on our own site, passing the current user’s session cookie so the request is authenticated as a valid user of the system.
  • If the response came back OK (HTTP 200), the returned data is saved to a private folder on the system under the file name taken from the Content-Disposition header, overwriting any existing file with the same name.
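The rule's code assumes the Content-Disposition header always contains a quoted file name. A slightly more defensive variant of that step might look like the sketch below. Both function names are hypothetical helpers, not part of Drupal or Webform2PDF, and the header values in the usage note are made-up examples.

```php
<?php
/**
 * Extracts a safe file name from a Content-Disposition header value.
 *
 * Hypothetical helper. Returns NULL when no quoted file name is present.
 */
function example_pdf_filename_from_header($content_disposition) {
  // Same pattern as the rule: capture the text between double quotes.
  if (!preg_match('/"([^"]*)"/', (string) $content_disposition, $matches)) {
    return NULL;
  }
  // basename() drops any directory part so the name stays inside the folder.
  $name = basename($matches[1]);
  return $name === '' ? NULL : $name;
}

/**
 * Builds the Cookie header value that identifies an existing session.
 *
 * Hypothetical helper mirroring session_name() . '=' . $user->sid.
 */
function example_session_cookie($session_name, $session_id) {
  return $session_name . '=' . $session_id;
}
```

With this in place, the rule would only call file_unmanaged_save_data when example_pdf_filename_from_header returns something other than NULL. For instance, a header of attachment; filename="example-12-345.pdf" yields example-12-345.pdf, while a header without a quoted name yields NULL and the save is skipped.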

All these pieces working together provide the functionality we want from Drupal in an automated fashion.

Pros/Cons and Additional Ideas on This Approach

  • URL placement for the action link is really flexible: it can be placed anywhere in the system (unlike Rules Link, which only embeds links into content type pages). This has one drawback I can see: without custom coding, or being smart about where you place the link, the link can be insecure. My dashboard can only be visited by users with the proper role. Consider this when you embed the link on your site.
  • Similarly, my View that creates the URL listing is locked down by requiring a Staff role to access the page. Since we’re gluing Drupal components together to build this automation, we need to ensure the pieces aren’t exposing private information from the underlying system to the public.
  • I’ve not seen this use of drupal_http_request blogged about much in the Drupalsphere. My site serves 100% of its pages over SSL. If you’re not running SSL, the session cookie in the request header travels in clear text, and a MITM attacker could capture it and hijack the session.
  • My proof-of-concept handles 165 generated PDFs fast enough to beat webserver timeout issues. In a more full-fledged version of this tool, you’d likely want to queue the work and complete it in a Batch process.
  • I may possibly seek to codify this as a new feature for Webform2PDF or create a sub-module to meet these needs.
  • I don’t like embedded PHP in Rules or Views. The embedded PHP could be abstracted into a custom module function called by the Rule. Also, the impersonation of a user session in drupal_http_request could be wrapped as a helper function in a custom module, for example call_internal_drupal_url_as_user($url, $options, $uid), and then reused in future projects.
  • In this case I just want the files on disk somewhere. My sysadmin will create an FTP account for the client to access the folder on the webserver if they need it. An additional task could be to zip these files on the server and send that ZIP file from another Rule component, but I don’t need that functionality today!
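Two of the ideas above, the reusable helper and the Batch process, can be sketched in plain PHP. This is only an illustration under stated assumptions: the function names are hypothetical, the helper takes the session name and ID directly instead of a $uid, and the HTTP call is injectable so the sketch can be exercised without Drupal (inside Drupal it defaults to drupal_http_request). The operation callback example_save_pdf_chunk is likewise a made-up name that a real module would have to implement.

```php
<?php
/**
 * Fetches an internal URL while presenting an existing session cookie.
 *
 * Hypothetical helper. $requester is any callable taking ($url, $options)
 * and returning an object with ->code and ->data.
 */
function example_fetch_url_with_session($url, $session_name, $session_id, array $options = array(), $requester = 'drupal_http_request') {
  // Keep a caller-supplied timeout, otherwise use 60 seconds as in the rule.
  $options += array('timeout' => 60);
  $options['headers']['Cookie'] = $session_name . '=' . $session_id;
  $response = call_user_func($requester, $url, $options);
  // Compare as integers so both 200 and '200' are accepted.
  if (isset($response->code) && (int) $response->code === 200) {
    return $response->data;
  }
  return FALSE;
}

/**
 * Splits a URL list into Batch API style operations.
 *
 * Each operation is array(callback name, array of arguments), which is the
 * shape Drupal 7 expects in the 'operations' key passed to batch_set().
 */
function example_batch_operations(array $urls, $chunk_size = 25) {
  $operations = array();
  foreach (array_chunk($urls, $chunk_size) as $chunk) {
    $operations[] = array('example_save_pdf_chunk', array($chunk));
  }
  return $operations;
}
```

Chunking keeps each request short, so the number of submissions no longer has to fit inside a single webserver timeout.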

I hope you find this blog post really useful. I know I learned a lot and had fun piecing this together!


Author Spotlight

David Gurba

I am a web programmer currently employed at UCSB. I have been developing web applications professionally for 8+ years now. For the last 5 years I’ve been actively developing websites primarily in PHP using Drupal. I have experience using LAMP and developing data driven websites for clients in aviation, higher education and e-commerce. If you’d like to contact me I can be reached at david.gurba@arvixe.com
